Reliability engineering



 
 
Reliability engineering is an engineering field that deals with the study of reliability: the ability of a system or component to perform its required functions under stated conditions for a specified period of time. It is often reported as a probability.

Overview

[Figure: Reliability block diagram]
Reliability may be defined in several ways:
  • The idea that something is fit for purpose with respect to time;
  • The capacity of a device or system to perform as designed;
  • The resistance to failure of a device or system;
  • The ability of a device or system to perform a required function under stated conditions for a specified period of time;
  • The probability that a functional unit will perform its required function for a specified interval under stated conditions;
  • The ability of something to "fail well" (fail without catastrophic consequences).


Reliability engineers rely heavily on statistics, probability theory, and reliability theory. Many engineering techniques are used in reliability engineering, such as reliability prediction, Weibull analysis, thermal management, reliability testing, and accelerated life testing. Because of the large number of reliability techniques, their expense, and the varying degrees of reliability required for different situations, most projects develop a reliability program plan to specify the reliability tasks that will be performed for that specific system.

The function of reliability engineering is to develop the reliability requirements for the product, establish an adequate reliability program, and perform appropriate analyses and tasks to ensure the product will meet its requirements. These tasks are managed by a reliability engineer, who usually holds an accredited engineering degree and has additional reliability-specific education and training. Reliability engineering is closely associated with maintainability engineering and logistics engineering. Many problems from other fields, such as security engineering, can also be approached using reliability engineering techniques. This article provides an overview of some of the most common reliability engineering tasks. Please see the references for a more comprehensive treatment.

Many types of engineering employ reliability engineers and use the tools and methodology of reliability engineering. For example:
  • System engineers design complex systems having a specified reliability
  • Mechanical engineers may have to design a machine or system with a specified reliability
  • Automotive engineers have reliability requirements for the automobiles (and components) which they design
  • Electronics engineers must design and test their products for reliability requirements.
  • In software engineering and systems engineering, reliability engineering is the subdiscipline of ensuring that a system (or a device in general) will perform its intended function(s) when operated in a specified manner for a specified length of time. Reliability engineering is performed throughout the entire life cycle of a system, including development, test, production, and operation.


Reliability theory

Main articles: reliability theory, failure rate.


Reliability theory is the foundation of reliability engineering. For engineering purposes, reliability is defined as the probability that a device will perform its intended function during a specified period of time under stated conditions.


Mathematically, this may be expressed as

$$R(t) = \Pr\{T > t\} = \int_t^{\infty} f(x)\,dx,$$

where $T$ is the time to failure, $f(x)$ is the failure probability density function, and $t$ is the length of the period of time (which is assumed to start from time zero).
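
As a minimal sketch of this definition, assume an exponential failure density f(x) = λ·exp(−λx). That model is an illustrative choice, not one the text prescribes, and the function names and numeric values below are hypothetical. The integral from t to infinity is approximated with the trapezoidal rule and can be compared with the closed form exp(−λt) that holds for this particular density.

```python
import math

def exponential_pdf(x, failure_rate):
    """Failure probability density f(x) for an assumed exponential model."""
    return failure_rate * math.exp(-failure_rate * x)

def reliability_numeric(t, failure_rate, steps=100_000):
    """Approximate R(t), the integral of f(x) from t to infinity.

    The infinite upper limit is truncated 50 mean lifetimes past t,
    where the remaining exponential tail is negligible.
    """
    upper = t + 50.0 / failure_rate
    h = (upper - t) / steps
    total = 0.5 * (exponential_pdf(t, failure_rate)
                   + exponential_pdf(upper, failure_rate))
    for i in range(1, steps):
        total += exponential_pdf(t + i * h, failure_rate)
    return total * h

def reliability_closed_form(t, failure_rate):
    """Closed-form R(t) = exp(-failure_rate * t) for the exponential model."""
    return math.exp(-failure_rate * t)

if __name__ == "__main__":
    rate = 0.001   # hypothetical failure rate, failures per hour
    period = 100.0 # hypothetical period length, hours
    print(reliability_numeric(period, rate))
    print(reliability_closed_form(period, rate))
```

For other densities only `exponential_pdf` and the truncation point would need to change; the integral itself is the definition given above.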

Reliability engineering is concerned with four key elements of this definition:

  • First, reliability is a probability. This means that failure is regarded as a random phenomenon: it is a recurring event, and we do not express any information on individual failures, the causes of failures, or relationships between failures, except that the likelihood for failures to occur varies over time according to the given probability function. Reliability engineering is concerned with meeting the specified probability of success, at a specified statistical confidence level.
  • Second, reliability is predicated on "intended function". Generally, this is taken to mean operation without failure. However, if no individual part of the system fails but the system as a whole does not do what was intended, the shortfall is still charged against the system reliability. The system requirements specification is the criterion against which reliability is measured.
  • Third, reliability applies to a specified period of time. In practical terms, this means that a system has a specified chance that it will operate without failure before time $t$. Reliability engineering ensures that components and materials will meet the requirements during the specified time. Units other than time may sometimes be used. The automotive industry might specify reliability in terms of miles, and the military might specify reliability of a gun for a certain number of rounds fired. A piece of mechanical equipment may have a reliability rating in terms of cycles of use.
  • Fourth, reliability is restricted to operation under stated conditions. This constraint is necessary because it is impossible to design a system for unlimited conditions. A Mars rover will have different specified conditions than the family car. The operating environment must be addressed during design and testing.

Reliability program plan

Many tasks, methods, and tools can be used to achieve reliability. Every system requires a different level of reliability. A commercial airliner must operate under a wide range of conditions. The consequences of failure are grave, but there is a correspondingly higher budget. A pencil sharpener may be more reliable than an airliner, but it has a very different set of operational conditions, insignificant consequences of failure, and a much lower budget.

A reliability program plan is used to document exactly what tasks, methods, tools, analyses, and tests are required for a particular system. For complex systems, the reliability program plan is a separate document. For simple systems, it may be combined with the systems engineering management plan. The reliability program plan is essential for a successful reliability program and is developed early during system development. It specifies not only what the reliability engineer does, but also the tasks performed by others. The reliability program plan is approved by top program management.

Reliability requirements

For any system, one of the first tasks of reliability engineering is to adequately specify the reliability requirements. Reliability requirements address the system itself, test and assessment requirements, and associated tasks and documentation. Reliability requirements are included in the appropriate system/subsystem requirements specifications, test plans, and contract statements.

System reliability parameters

Requirements are specified using reliability parameters. The most common reliability parameter is the mean time between failures (MTBF), which can also be specified as the failure rate or the number of failures during a given period. These parameters are very useful for systems that are operated on a regular basis, such as most vehicles, machinery, and electronic equipment. Reliability increases as the MTBF increases. The MTBF is usually specified in hours, but can also be used with other units of measurement such as miles or cycles.
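
The relationship between these parameters can be sketched as follows, under the assumption of a constant failure rate (an exponential model, which the text does not require). Under that assumption the failure rate is the reciprocal of the MTBF. The function names and figures are hypothetical.

```python
import math

def failure_rate_from_mtbf(mtbf_hours):
    """Constant failure rate (failures per hour) implied by an MTBF."""
    return 1.0 / mtbf_hours

def expected_failures(mtbf_hours, operating_hours):
    """Expected number of failures during a given operating period."""
    return operating_hours / mtbf_hours

def survival_probability(mtbf_hours, operating_hours):
    """Probability of no failure during the period (exponential assumption)."""
    return math.exp(-operating_hours / mtbf_hours)

if __name__ == "__main__":
    mtbf = 1000.0  # hypothetical MTBF in hours
    print(failure_rate_from_mtbf(mtbf))
    print(expected_failures(mtbf, 250.0))
    print(survival_probability(mtbf, 1000.0))
```

Under this assumption an item with an MTBF of 1000 hours has only about a 37% chance (exp(−1)) of running 1000 hours without failure, so an MTBF should not be read as a guaranteed failure-free life.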

In other cases, reliability is specified as the probability of mission success. For example, reliability of a scheduled aircraft flight can be specified as a dimensionless probability or a percentage.

A special case of mission success is the single-shot device or system. These are devices or systems that remain relatively dormant and only operate once. Examples include automobile airbags, thermal batteries, and missiles. Single-shot reliability is specified as a probability of success, or is subsumed into a related parameter. Single-shot missile reliability may be incorporated into a requirement for the probability of hit.

For such systems, the probability of failure on demand (PFD) is the reliability measure. For non-repairable systems, the PFD is derived from the failure rate and the mission time. For repairable systems, it is obtained from the failure rate, the mean time to repair (MTTR), and the test interval. This measure may not be unique for a given system, because it depends on the kind of demand. In addition to system-level requirements, reliability requirements may be specified for critical subsystems. In all cases, reliability parameters are specified with appropriate statistical confidence intervals.
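
A rough sketch of how a PFD might be obtained from these inputs is shown below. Both formulas are assumptions brought in for illustration: a constant failure rate for the non-repairable case, and for the repairable case a commonly used simplified average, failure rate × (test interval / 2 + MTTR), which is only reasonable when the failure rate times the test interval is much smaller than 1. The names and values are hypothetical.

```python
import math

def pfd_non_repairable(failure_rate, mission_time):
    """PFD at the end of the mission for a non-repairable item,
    assuming a constant failure rate."""
    return 1.0 - math.exp(-failure_rate * mission_time)

def pfd_avg_repairable(failure_rate, test_interval, mttr):
    """Simplified average PFD for a periodically tested, repairable item.

    Assumed approximation: failures go undetected for half a test
    interval on average, plus the repair time.
    """
    return failure_rate * (test_interval / 2.0 + mttr)

if __name__ == "__main__":
    rate = 1e-6  # hypothetical failure rate per hour
    print(pfd_non_repairable(rate, mission_time=1000.0))
    print(pfd_avg_repairable(rate, test_interval=8760.0, mttr=8.0))
```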

Reliability modelling

Reliability modelling is the process of predicting or understanding the reliability of a component or system. Two separate fields of investigation are common. The physics of failure approach uses an understanding of the failure mechanisms involved, such as crack propagation or chemical corrosion. The parts stress modelling approach is an empirical method for prediction based on counting the number and type of components of the system, and the stress they undergo during operation.

For systems with a clearly defined failure time (which is sometimes not given for systems with a drifting parameter), the empirical distribution function of these failure times can be determined. This is generally done in an accelerated experiment with increased stress. These experiments can be divided into two main categories:

Early failure rate studies determine the distribution with a decreasing failure rate over the first part of the bathtub curve. Here, in general, only moderate stress is necessary. The stress is applied for a limited period of time in what is called a censored test. Therefore, only the part of the distribution with early failures can be determined.

In so-called zero defect experiments, only limited information about the failure distribution is acquired. Here the stress, stress time, or the sample size is so low that not a single failure occurs. Due to the insufficient sample size, only an upper limit of the early failure rate can be determined. At any rate, it looks good for the customer if there are no failures.

In a study of the intrinsic failure distribution, which is often a material property, higher stresses are necessary to get failure in a reasonable period of time. Several degrees of stress have to be applied to determine an acceleration model. The empirical failure distribution is often parametrised with a Weibull or a log-normal model.
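
To illustrate the Weibull parametrisation, the sketch below uses the standard two-parameter form with shape β and scale η, where R(t) = exp(−(t/η)^β) and the failure rate is h(t) = (β/η)(t/η)^(β−1). A shape below 1 gives the decreasing failure rate of the early part of the bathtub curve, a shape of 1 gives a constant rate, and a shape above 1 gives an increasing (wear-out) rate. The parameter values are hypothetical.

```python
import math

def weibull_reliability(t, shape, scale):
    """R(t) for a two-parameter Weibull model."""
    return math.exp(-((t / scale) ** shape))

def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate h(t) for a two-parameter Weibull model (t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

if __name__ == "__main__":
    scale = 1000.0  # hypothetical characteristic life in hours
    for shape in (0.5, 1.0, 3.0):
        early = weibull_hazard(10.0, shape, scale)
        late = weibull_hazard(500.0, shape, scale)
        print(f"shape={shape}: h(10)={early:.3e}, h(500)={late:.3e}")
```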

It is common practice to model the early failure rate with an exponential distribution. This less complex model for the failure distribution has only one parameter: the constant failure rate. In such cases, the chi-square distribution can be used to find the goodness of fit for the estimated failure rate. Compared to a model with a decreasing failure rate, this is quite pessimistic. Combined with a zero-defect experiment, it becomes even more pessimistic. The effort is greatly reduced in this case: one does not have to determine a second model parameter (e.g. the shape parameter of a Weibull distribution) or its confidence interval (e.g. by a maximum likelihood estimation (MLE) approach), and the sample size is much smaller.
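
For the zero-defect case under the exponential model, the upper limit on the failure rate has a simple closed form. The sketch below assumes a time-terminated test with no failures, for which the chi-square bound with 2 degrees of freedom reduces to −ln(1 − confidence) divided by the accumulated device-hours. The function name and figures are hypothetical, and any acceleration factor is left out.

```python
import math

def zero_failure_rate_upper_bound(total_device_hours, confidence):
    """One-sided upper confidence bound on a constant failure rate
    when a test accumulates total_device_hours with zero failures."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    if total_device_hours <= 0.0:
        raise ValueError("total_device_hours must be positive")
    return -math.log(1.0 - confidence) / total_device_hours

if __name__ == "__main__":
    # Hypothetical test: 100 devices run for 1000 hours each, no failures.
    device_hours = 100 * 1000.0
    for level in (0.60, 0.90):
        bound = zero_failure_rate_upper_bound(device_hours, level)
        print(f"{level:.0%} upper bound: {bound:.3e} failures per hour")
```

The bound shrinks only in proportion to the accumulated device-hours, which is why a small zero-defect test can only support a fairly loose upper limit.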

Reliability test requirements

Because reliability is a probability, even highly reliable systems have some chance of failure. However, testing reliability requirements is problematic for several reasons. A single test is insufficient to generate enough statistical data. Multiple tests or long-duration tests are usually very expensive. Some tests are simply impractical. Reliability engineering is used to design a realistic and affordable test program that provides enough evidence that the system meets its requirement. Statistical confidence levels are used to address some of these concerns. A certain parameter is expressed along with a corresponding confidence level: for example, an MTBF of 1000 hours at 90% confidence level. From this specification, the reliability engineer can design a test with explicit criteria for the number of hours and number of failures until the requirement is met or failed.
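
One way such a test could be sized is sketched below, assuming a constant failure rate and a time-terminated test in which failed units are repaired or replaced. The test passes if no more than an allowed number of failures occur, and the required total test time is the value at which a system that only just met the MTBF requirement would pass with probability 1 − confidence. This is found from the Poisson distribution by bisection. The function names are hypothetical.

```python
import math

def poisson_cdf(max_failures, mean):
    """P(N <= max_failures) for a Poisson count with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, max_failures + 1):
        term *= mean / k
        total += term
    return total

def required_test_hours(mtbf_requirement, confidence, allowed_failures):
    """Total test time needed to demonstrate the MTBF at the given
    confidence if at most allowed_failures are observed."""
    target = 1.0 - confidence
    low, high = 0.0, 1.0
    # Grow the bracket until the pass probability drops below the target.
    while poisson_cdf(allowed_failures, high) > target:
        high *= 2.0
    # Bisect on the expected number of failures (test time / MTBF).
    for _ in range(100):
        mid = 0.5 * (low + high)
        if poisson_cdf(allowed_failures, mid) > target:
            low = mid
        else:
            high = mid
    return high * mtbf_requirement

if __name__ == "__main__":
    for failures in (0, 1, 2):
        hours = required_test_hours(1000.0, 0.90, failures)
        print(f"allow {failures} failure(s): {hours:.0f} test hours")
```

For the example in the text (MTBF of 1000 hours at 90% confidence), this gives roughly 2303 test hours if no failures are allowed and roughly 3890 hours if one failure is allowed, which shows how the pass criterion trades test length against the producer's risk of failing a good system.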

The combination of reliability parameter value and confidence level greatly affects the development cost and the risk to both the customer and producer. Care is needed to select the best combination of requirements. Reliability testing may be performed at various levels, such as component, subsystem, and system. Also, many factors must be addressed during testing, such as extreme temperature and humidity, shock, vibration, and heat. Reliability engineering determines an effective test strategy so that all parts are exercised in relevant environments. For systems that must last many years, reliability engineering may be used to design an accelerated life test.
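
As an illustration of how an accelerated life test relates stress time to use time, the sketch below uses the Arrhenius model for temperature acceleration. This model is an assumption chosen for the example; the text does not name a specific acceleration model, and the activation energy and temperatures are hypothetical values that would have to come from test data for a real failure mechanism.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def arrhenius_acceleration_factor(activation_energy_ev, use_temp_c, stress_temp_c):
    """Ratio of time-to-failure at use temperature to time-to-failure
    at stress temperature under the (assumed) Arrhenius model."""
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)

def equivalent_use_hours(stress_hours, acceleration_factor):
    """Use-condition hours represented by a given number of stress hours."""
    return stress_hours * acceleration_factor

if __name__ == "__main__":
    # Hypothetical values: 0.7 eV, 55 °C in use, 125 °C in the test chamber.
    factor = arrhenius_acceleration_factor(0.7, 55.0, 125.0)
    print(f"acceleration factor: {factor:.1f}")
    print(f"1000 stress hours ~ {equivalent_use_hours(1000.0, factor):.0f} use hours")
```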

Requirements for reliability tasks

Reliability engineering must also address requirements for various reliability tasks and documentation during system development, test, production, and operation. These requirements are generally specified in the contract statement of work and depend on how much leeway the customer wishes to provide to the contractor. Reliability tasks include various analyses, planning, and failure reporting. Task selection depends on the criticality of the system as well as cost. A critical system may require a formal failure reporting and review process throughout development, whereas a non-critical system may rely on final test reports. The most common reliability program tasks are documented in reliability program standards, such as MIL-STD-785 and IEEE 1332. A failure reporting, analysis, and corrective action system (FRACAS) is a common approach for product/process reliability monitoring.

Design for reliability

Design for reliability (DFR) is an emerging discipline that refers to the process of designing reliability into products. This process encompasses several tools and practices, and describes the order in which an organization needs to deploy them to drive reliability into its products. Typically, the first step in the DFR process is to set the system's reliability requirements. Reliability must be "designed in" to the system. During system design, the top-level reliability requirements are then allocated to subsystems by design engineers and reliability engineers working together.

Reliability design begins with the development of a model. Reliability models use block diagrams and fault trees to provide a graphical means of evaluating the relationships between different parts of the system. These models incorporate predictions based on parts-count failure rates taken from historical data. While the predictions are often not accurate in an absolute sense, they are valuable to assess relative differences in design alternatives.

[Figure: Fault tree]
One of the most important design techniques is redundancy. This means that if one part of the system fails, there is an alternate success path, such as a backup system. An automobile brake light might use two light bulbs. If one bulb fails, the brake light still operates using the other bulb. Redundancy significantly increases system reliability, and is often the only viable means of doing so. However, redundancy is difficult and expensive, and is therefore limited to critical parts of the system. Another design technique, physics of failure, relies on understanding the physical processes of stress, strength and failure at a very detailed level. The material or component can then be redesigned to reduce the probability of failure. Another common design technique is component derating: selecting components whose tolerance significantly exceeds the expected stress, such as using a heavier-gauge wire than the normal specification requires for the expected electrical current.
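
The effect of redundancy can be illustrated numerically. The following is a minimal sketch, assuming the redundant parts fail independently of one another (an assumption that common-cause failures can violate in practice); the function names and the bulb reliability of 0.9 are hypothetical.

```python
from math import prod


def series(reliabilities):
    """All parts must work: the system reliability is the product."""
    return prod(reliabilities)


def parallel(reliabilities):
    """At least one part must work: 1 minus the probability that all fail."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Brake light example: one bulb versus two redundant bulbs,
# each with an assumed reliability of 0.9 over the period of interest.
single_bulb = series([0.9])
two_bulbs = parallel([0.9, 0.9])
```

Under these assumptions the second bulb raises the reliability of the brake light from 0.9 to 0.99, whereas two parts that must both work would lower it to 0.81.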

Many tasks, techniques and analyses are specific to particular industries and applications. Commonly these include:

  • Built-in test (BIT)
  • Failure mode and effects analysis (FMEA)
  • Reliability simulation modeling
  • Thermal analysis
  • Reliability block diagram analysis
  • Fault tree analysis
  • Root cause analysis
  • Sneak circuit analysis
  • Accelerated testing
  • Reliability growth analysis
  • Weibull analysis
  • Electromagnetic analysis
  • Statistical interference
  • Avoidance of single points of failure


Results are presented during the system design reviews and logistics reviews. Reliability is just one requirement among many system requirements. Engineering trade studies are used to determine the optimum balance between reliability and other requirements and constraints.

Reliability testing

Reliability Sequential Test Plan
The purpose of reliability testing is to discover potential problems with the design as early as possible and, ultimately, provide confidence that the system meets its reliability requirements.

Reliability testing may be performed at several levels. Complex systems may be tested at component, circuit board, unit, assembly, subsystem and system levels. (The test level nomenclature varies among applications.) For example, performing environmental stress screening tests at lower levels, such as piece parts or small assemblies, catches problems before they cause failures at higher levels. Testing proceeds during each level of integration through full-up system testing, developmental testing, and operational testing, thereby reducing program risk. System reliability is calculated at each test level. Reliability growth techniques and failure reporting, analysis and corrective action systems (FRACAS) are often employed to improve reliability as testing progresses. The drawbacks to such extensive testing are time and expense. Customers may choose to accept more risk by eliminating some or all lower levels of testing.

It is not always feasible to test all system requirements. Some systems are prohibitively expensive to test; some failure modes may take years to observe; some complex interactions result in a huge number of possible test cases; and some tests require the use of limited test ranges or other resources. In such cases, different approaches to testing can be used, such as accelerated life testing, design of experiments, and simulations.

The desired level of statistical confidence also plays an important role in reliability testing. Statistical confidence is increased by increasing either the test time or the number of items tested. Reliability test plans are designed to achieve the specified reliability at the specified confidence level with the minimum number of test units and test time. Different test plans result in different levels of risk to the producer and consumer. The desired reliability, statistical confidence, and risk levels for each side influence the ultimate test plan. Good test requirements ensure that the customer and developer agree in advance on how reliability requirements will be tested.
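
The trade-off between confidence and number of test units can be illustrated with one simple plan that is not described in the text above: a pass/fail demonstration test in which every unit must survive (often called a success-run test). Under a binomial model, n units with zero failures demonstrate reliability R at confidence C when R^n <= 1 - C. The function name below is hypothetical, and the sketch assumes independent units tested for the full mission duration.

```python
import math


def zero_failure_sample_size(reliability, confidence):
    """Smallest n such that n failure-free units demonstrate `reliability`
    at `confidence`, from the binomial relation reliability**n <= 1 - confidence."""
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


n_90_90 = zero_failure_sample_size(0.90, 0.90)
n_95_90 = zero_failure_sample_size(0.95, 0.90)
```

With these assumptions, demonstrating 90% reliability at 90% confidence takes 22 failure-free units, and raising the reliability target to 95% roughly doubles the sample to 45, which shows why test plans are negotiated between customer and developer.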

A key aspect of reliability testing is to define "failure". Although this may seem obvious, there are many situations where it is not clear whether a failure is really the fault of the system. Variations in test conditions, operator differences, weather, and unexpected situations can lead to disagreements between the customer and the system developer. One strategy to address this issue is to use a scoring conference process. A scoring conference includes representatives from the customer, the developer, the test organization, the reliability organization, and sometimes independent observers. The scoring conference process is defined in the statement of work. Each test case is considered by the group and "scored" as a success or failure. This scoring is the official result used by the reliability engineer.

As part of the requirements phase, the reliability engineer develops a test strategy with the customer. The test strategy makes trade-offs between the needs of the reliability organization, which wants as much data as possible, and constraints such as cost, schedule, and available resources. Test plans and procedures are developed for each reliability test, and results are documented in official reports.

Accelerated testing

The purpose of accelerated life testing is to induce field failure in the laboratory at a much faster rate by providing a harsher, but nonetheless representative, environment. In such a test the product is expected to fail in the lab just as it would have failed in the field, but in much less time. The main objective of an accelerated test is one of the following:
  • To discover failure modes
  • To predict the normal field life from the high stress lab life


An accelerated testing program can be broken down into the following steps:
  • Define the objective and scope of the test
  • Collect the required information about the product
  • Identify the stress(es)
  • Determine the level of the stress(es)
  • Conduct the accelerated test and analyse the resulting data


Common models for determining a life-stress relationship are:
  • Arrhenius Model
  • Eyring Model
  • Inverse Power Law Model
  • Temperature-Humidity Model
  • Temperature Non-thermal Model
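
As an illustration of the first model in the list, the Arrhenius relationship gives an acceleration factor between a use temperature and a higher stress temperature: AF = exp[(Ea/k)(1/T_use - 1/T_stress)], with temperatures in kelvin. The sketch below assumes temperature is the only accelerating stress; the activation energy of 0.7 eV and the two temperatures are hypothetical values, since the real activation energy depends on the failure mechanism.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration_factor(activation_energy_ev, use_temp_c, stress_temp_c):
    """Ratio of life at use temperature to life at stress temperature."""
    t_use_k = use_temp_c + 273.15
    t_stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


# Hypothetical example: 0.7 eV mechanism, 55 C in the field, 125 C in the lab.
af = arrhenius_acceleration_factor(0.7, 55.0, 125.0)
# Field hours represented by a 1000-hour lab test under these assumptions.
equivalent_field_hours = 1000.0 * af
```

Because the factor depends exponentially on the assumed activation energy, small errors in that value change the predicted field life considerably.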

Software reliability

Software reliability is a special aspect of reliability engineering. System reliability, by definition, includes all parts of the system, including hardware, software, operators and procedures. Traditionally, reliability engineering focuses on critical hardware parts of the system. Since the widespread use of digital integrated circuit technology, software has become an increasingly critical part of most electronics and, hence, nearly all present day systems. There are significant differences, however, in how software and hardware behave. Most hardware unreliability is the result of a component or material failure that results in the system not performing its intended function. Repairing or replacing the hardware component restores the system to its original unfailed state. However, software does not fail in the same sense that hardware fails. Instead, software unreliability is the result of unanticipated results of software operations. Even relatively small software programs can have astronomically large combinations of inputs and states that are infeasible to exhaustively test. Restoring software to its original state only works until the same combination of inputs and states results in the same unintended result. Software reliability engineering must take this into account.

Despite this difference in the source of failure between software and hardware (software does not wear out), some in the software reliability engineering community believe that the statistical models used in hardware reliability are nevertheless useful as a measure of software reliability. These models describe what is experienced with software: the longer software is run, the higher the probability that it will eventually be used in an untested manner and expose a latent defect that results in a failure (Shooman 1987), (Musa 2005), (Denney 2005).

As with hardware, software reliability depends on good requirements, design and implementation. Software reliability engineering relies heavily on a disciplined software engineering process to anticipate and design against unintended consequences. There is more overlap between software quality engineering and software reliability engineering than between hardware quality and reliability. A good software development plan is a key aspect of the software reliability program. The software development plan describes the design and coding standards, peer reviews, unit tests, configuration management, software metrics and software models to be used during software development.

A common reliability metric is the number of software faults, usually expressed as faults per thousand lines of code. This metric, along with software execution time, is key to most software reliability models and estimates. The theory is that the software reliability increases as the number of faults (or fault density) goes down. Establishing a direct connection between fault density and mean-time-between-failure is difficult, however, because of the way software faults are distributed in the code, their severity, and the probability of the combination of inputs necessary to encounter the fault. Nevertheless, fault density serves as a useful indicator for the reliability engineer. Other software metrics, such as complexity, are also used.

Testing is even more important for software than hardware. Even the best software development process results in some software faults that are nearly undetectable until tested. As with hardware, software is tested at several levels, starting with individual units, through integration and full-up system testing. Unlike hardware, it is inadvisable to skip levels of software testing. During all phases of testing, software faults are discovered, corrected, and re-tested. Reliability estimates are updated based on the fault density and other metrics. At system level, mean-time-between-failure data is collected and used to estimate reliability. Unlike hardware, performing the exact same test on the exact same software configuration does not provide increased statistical confidence. Instead, software reliability uses different metrics such as test coverage.

Eventually, the software is integrated with the hardware in the top-level system, and software reliability is subsumed by system reliability. The Software Engineering Institute's Capability Maturity Model is a common means of assessing the overall software development process for reliability and quality purposes.

Reliability operational assessment

After a system is produced, reliability engineering monitors, assesses, and corrects deficiencies. Monitoring includes electronic and visual surveillance of critical parameters identified during the fault tree analysis design stage. The data is constantly analyzed using statistical techniques, such as Weibull analysis and linear regression, to ensure the system reliability meets requirements. Reliability data and estimates are also key inputs for system logistics. Data collection is highly dependent on the nature of the system. Most large organizations have quality control groups that collect failure data on vehicles, equipment, and machinery. Consumer product failures are often tracked by the number of returns. For systems in dormant storage or on standby, it is necessary to establish a formal surveillance program to inspect and test random samples. Any changes to the system, such as field upgrades or recall repairs, require additional reliability testing to ensure the reliability of the modification. Since it is not possible to anticipate all the failure modes of a given system, especially ones with a human element, failures will occur. The reliability program also includes a systematic root cause analysis that identifies the causal relationships involved in the failure such that effective corrective actions may be implemented. When possible, system failures and corrective actions are reported to the reliability engineering organization.

One of the most common methods for reliability operational assessment is a failure reporting, analysis and corrective action system (FRACAS): a systematic approach to reliability, safety and logistics assessment based on failure and incident reporting, management, analysis, and corrective and preventive actions. Organizations adopting this method often use commercial systems, such as a web-based FRACAS application, that enable an organization to create a failure and incident data repository from which statistics can be derived to assess actual reliability, safety and quality performance.

Common outputs from a FRACAS include field MTBF, MTTR, spares consumption, reliability growth, and the distribution of failures and incidents by type, location, part number, serial number, symptom, etc.
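
A few of these outputs can be derived from a failure repository with very little code. The sketch below is illustrative only: the record fields, part numbers and hours are hypothetical, and field MTBF is estimated simply as total operating hours divided by the number of recorded failures.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FailureRecord:
    part_no: str         # hypothetical part number of the failed item
    symptom: str         # reported symptom
    repair_hours: float  # time taken to restore the system


def field_mtbf(total_operating_hours, records):
    """Mean time between failures: operating hours per recorded failure."""
    if not records:
        raise ValueError("no failures recorded; MTBF cannot be estimated this way")
    return total_operating_hours / len(records)


def mttr(records):
    """Mean time to repair: average repair duration over all records."""
    if not records:
        raise ValueError("no repair records available")
    return sum(r.repair_hours for r in records) / len(records)


def failures_by_part(records):
    """Failure distribution by part number."""
    return Counter(r.part_no for r in records)


# Hypothetical repository contents for a fleet with 10,000 operating hours.
RECORDS = [
    FailureRecord("PN-100", "no output", 2.0),
    FailureRecord("PN-200", "overheating", 4.0),
    FailureRecord("PN-100", "intermittent", 1.0),
    FailureRecord("PN-300", "no output", 5.0),
]

fleet_mtbf = field_mtbf(10_000.0, RECORDS)
fleet_mttr = mttr(RECORDS)
by_part = failures_by_part(RECORDS)
```

The same grouping can be repeated by symptom, location or serial number to produce the other distributions mentioned above.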

Reliability organizations

Systems of any significant complexity are developed by organizations of people, such as a commercial company or a government agency. The reliability engineering organization must be consistent with the company's organizational structure. For small, non-critical systems, reliability engineering may be informal. As complexity grows, the need arises for a formal reliability function. Because reliability is important to the customer, the customer may even specify certain aspects of the reliability organization.

There are several common types of reliability organizations. The project manager or chief engineer may employ one or more reliability engineers directly. In larger organizations, there is usually a product assurance or specialty engineering organization, which may include reliability, maintainability, quality, safety, human factors, logistics, etc. In such cases, the reliability engineer reports to the product assurance manager or specialty engineering manager.

In some cases, a company may wish to establish an independent reliability organization. This is desirable to ensure that system reliability work, which is often expensive and time consuming, is not unduly slighted due to budget and schedule pressures. In such cases, the reliability engineer works for the project on a day-to-day basis, but is actually employed and paid by a separate organization within the company.

Because reliability engineering is critical to early system design, it has become common for reliability engineers, however the organization is structured, to work as part of an integrated product team.

Certification

The American Society for Quality has a program to become a Certified Reliability Engineer (CRE). Certification is based on education, experience, and a certification test; periodic recertification is required. The body of knowledge for the test includes reliability management, design evaluation, product safety, statistical tools, design and development, modeling, reliability testing, collecting and using data, etc.

Another highly respected certification program is the (Certified Reliability Professional). To achieve certification, candidates must complete a series of courses focused on important Reliability Engineering topics, successfully apply the learned body of knowledge in the workplace and publicly present this expertise in an industry conference or journal.

Reliability engineering education

Some universities offer graduate degrees in reliability engineering (e.g., see University of Maryland, College Park and Concordia University, Montreal, Canada). Other reliability engineers typically have an engineering degree, which can be in any field of engineering, from an accredited university or college program. Many engineering programs offer reliability courses, and some universities have entire reliability engineering programs. A reliability engineer may be registered as a Professional Engineer by the state, but this is not required by most employers. There are many professional conferences and industry training programs available for reliability engineers. Several professional organizations exist for reliability engineers, including the IEEE Reliability Society, the , and the .

See also

  • Bayesian inference
    Bayesian inference

    Bayesian inference is statistical inference in which evidence or observations are used to update or to newly infer the probability that a hypothesis may be true....
  • Burn in
    Burn in

    Burn-in is the process by which components of a system are exercised prior to being placed in service .The intention is to detect those particular components that would fail as a result of the initial, high-failure rate portion of the Bathtub curve of component reliability....
  • Failing badly
    Failing badly

    Failing badly and failing well are concepts in systems security and network security describing how a system reacts to failure. The terms have been popularized by Bruce Schneier, a cryptography and security consultant....
  • Failure rate
    Failure rate

    Failure rate is the frequency with which an engineered system or component failure, expressed for example in failures per hour. It is often denoted by the Greek alphabet ? and is important in reliability theory....
  • Human reliability
    Human reliability

    Human reliability is related to the field of human factors engineering, and refers to the reliability of humans in fields such as manufacturing, transportation, the military, or medicine....
  • Highly accelerated stress test
    Highly accelerated stress test

    The highly accelerated stress test method was invented by Nihal Sinnadurai while working as a Research Engineer at British Telecommunications Research Laboratories in 1968 in order to perform highly accelerated reliability testing of electronics components that are likely to encounter humid environments during normal operation....
  • Highly Accelerated Life Test
    Highly Accelerated Life Test

    A Highly Accelerated Life Test , is a stress testing methodology developed by Gregg K. Hobbs. It is commonly associated with electronics and is performed to obtain information about a product's reliability....
  • Logistic engineering
    Logistic engineering

    Logistic Engineering deals with the science of Logistics. Logistics is about the purchasing, transport, storage, distribution , warehousing of raw materials, semi-finished/work-in-process goods and finished goods....
  • Performance engineering
    Performance Engineering

    Within systems engineering, performance engineering encompasses the set of roles, skills, activities, practices, tools, and deliverables applied at every phase of the Systems Development Lifecycle which ensures that a solution will be designed, implemented, and operationally supported to meet the non-functional requirements defined for the so...
  • Professional engineer
    Professional Engineer

    Professional Engineer is the term for registered or licensed engineers in some countries who are permitted to offer their professional services directly to the public....
  • Product qualification
  • Quality engineering
    Quality Engineering

    Quality Engineering is a quarterly academic journal focusing on quality control and quality assurance management through use of physical technology, International standard information, and statistical tools....
  • Reliability
    Reliability

    In general, reliability is the ability of a person or system to perform and maintain its functions in routine circumstances, as well as hostile or unexpected circumstances....
  • Reliable system design
  • Reliability theory
    Reliability theory

    Reliability theory developed apart from the mainstream of probability and statistics. It was originally a tool to help nineteenth centuryMarine insurance and life insurance companies compute profitable rates to charge their customers....
  • Reliability theory of aging and longevity
    Reliability theory of aging and longevity

    Reliability theory of aging and longevity is a scientific approach aimed to gain theoretical insights into mechanisms of biological aging and species survival patterns by applying a general theory of systems failure, known as reliability theory....
  • Redundancy (total quality management)
    Redundancy (total quality management)

    In total quality management, TQM, redundancy in quality or redundant quality means quality which exceeds the required quality level. Engineering_tolerance may be too accurate, for example, creating unnecessarily high costs of production....
  • Security engineering
    Security engineering

    Security engineering is a specialized field of engineering that deals with the development of detailed engineering plans and designs for security features, controls and systems....
  • Single Point of Failure
    Single Point of Failure

    A Single Point of Failure, , is a part of a system which, if it fails, will stop the entire system from working. They are undesirable in any system whose goal is high availability, be it a network, software application or other industrial system....
  • Software engineering
    Software engineering

    Software engineering is the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software, and the study of these approaches....
  • Systems engineering
    Systems engineering is an interdisciplinary field of engineering that focuses on how complex engineering projects should be designed and managed....
  • Safety engineering
    Safety engineering is an applied science strongly related to systems engineering and its subset, system safety engineering. Safety engineering assures that a life-critical system behaves as needed even when pieces fail....
  • Statistics
    Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data. It also provides tools for prediction and forecasting based on data....
  • Temperature cycling
    Temperature cycling is the process of cycling through two temperature extremes, typically at relatively high rates of change. It is an environmental stress test used in evaluating product reliability as well as in manufacturing to catch early-term, latent defects by inducing failure through thermal fatigue....


Further reading

  • Blanchard, Benjamin S. (1992), Logistics Engineering and Management (Fourth Ed.), Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
  • Ebeling, Charles E., (1997), An Introduction to Reliability and Maintainability Engineering, McGraw-Hill Companies, Inc., Boston.
  • Denney, Richard (2005) Succeeding with Use Cases: Working Smart to Deliver Quality. Addison-Wesley Professional Publishing. ISBN . Discusses the use of software reliability engineering in use-case-driven software development.
  • Gano, Dean L. (2007), "Apollo Root Cause Analysis" (Third Edition), Apollonian Publications, LLC., Richland, Washington
  • Kapur, K.C., and Lamberson, L.R., (1977), Reliability in Engineering Design, John Wiley & Sons, New York.
  • Kececioglu, Dimitri, (1991) "Reliability Engineering Handbook", Prentice-Hall, Englewood Cliffs, New Jersey
  • Trevor Kletz (1998) Process Plants: A Handbook for Inherently Safer Design, CRC. ISBN 1560326190
  • Leemis, Lawrence, (1995) Reliability: Probabilistic Models and Statistical Methods, Prentice-Hall. ISBN 0-13-720517-1
  • MacDiarmid, Preston; Morris, Seymour; et al., (1995), Reliability Toolkit: Commercial Practices Edition, Reliability Analysis Center and Rome Laboratory, Rome, New York.
  • Modarres, Mohammad; Kaminskiy, Mark; Krivtsov, Vasiliy (1999), "Reliability Engineering and Risk Analysis: A Practical Guide", CRC Press, ISBN 0-8247-2000-8.
  • Musa, John (2005) Software Reliability Engineering: More Reliable Software Faster and Cheaper, 2nd Edition, AuthorHouse. ISBN
  • Neubeck, Ken (2004) "Practical Reliability Analysis", Prentice Hall, New Jersey
  • Neufelder, Ann Marie, (1993), Ensuring Software Reliability, Marcel Dekker, Inc., New York.
  • O'Connor, Patrick D. T. (2002), Practical Reliability Engineering (Fourth Ed.), John Wiley & Sons, New York.
  • Shooman, Martin, (1987), Software Engineering: Design, Reliability, and Management, McGraw-Hill, New York.
  • Tobias, Trindade, (1995), Applied Reliability, Chapman & Hall/CRC, ISBN 0-442-00469-9
  • Nelson, Wayne B., (2004), Accelerated Testing - Statistical Models, Test Plans, and Data Analysis, John Wiley & Sons, New York, ISBN 0-471-69736-2


US standards

  • MIL-STD-785, Reliability Program for Systems and Equipment Development and Production, U.S. Department of Defense.
  • MIL-HDBK-217, Reliability Prediction of Electronic Equipment, U.S. Department of Defense.
  • MIL-STD-2173, Reliability Centered Maintenance Requirements, U.S. Department of Defense (superseded by )
  • MIL-HDBK-338B, Electronic Reliability Design Handbook, U.S. Department of Defense.
  • MIL-STD-1629A, Procedures for Performing a Failure Mode, Effects and Criticality Analysis
  • MIL-HDBK-781A, Reliability Test Methods, Plans, and Environments for Engineering Development, Qualification, and Production, U.S. Department of Defense.
  • IEEE 1332, IEEE Standard Reliability Program for the Development and Production of Electronic Systems and Equipment, Institute of Electrical and Electronics Engineers.
  • Federal Standard 1037C in support of MIL-STD-188


UK standards

In the UK, there are more up-to-date standards maintained under the sponsorship of the UK MOD as Defence Standards. The relevant standards include:

DEF STAN 00-40 Reliability and Maintainability (R&M)
  • PART 1: Issue 5: Management Responsibilities and Requirements for Programmes and Plans
  • PART 4 (ARMP-4): Issue 2: Guidance for Writing NATO R&M Requirements Documents
  • PART 6: Issue 1: In-Service R&M
  • PART 7 (ARMP-7): Issue 1: NATO R&M Terminology Applicable to ARMPs
DEF STAN 00-41: Issue 3: Reliability and Maintainability MOD Guide to Practices and Procedures

DEF STAN 00-42 Reliability and Maintainability Assurance Guides
  • PART 1: Issue 1: One-Shot Devices/Systems
  • PART 2: Issue 1: Software
  • PART 3: Issue 2: R&M Case
  • PART 4: Issue 1: Testability
  • PART 5: Issue 1: In-Service Reliability Demonstrations
DEF STAN 00-43 Reliability and Maintainability Assurance Activity
  • PART 2: Issue 1: In-Service Maintainability Demonstrations
DEF STAN 00-44 Reliability and Maintainability Data Collection and Classification
  • PART 1: Issue 2: Maintenance Data & Defect Reporting in the Royal Navy, the Army and the Royal Air Force
  • PART 2: Issue 1: Data Classification and Incident Sentencing - General
  • PART 3: Issue 1: Incident Sentencing - Sea
  • PART 4: Issue 1: Incident Sentencing - Land
DEF STAN 00-45 Issue 1: Reliability Centered Maintenance

DEF STAN 00-49 Issue 1: Reliability and Maintainability MOD Guide to Terminology Definitions

These can be obtained from . There are also many commercial standards, produced by organisations including the SAE, MSG, ARP, and IEE.

External links

  • NIST/SEMATECH, "Engineering Statistics Handbook"
Professional
  • - Automated tools and electronic information for reliability engineering activities.