Failure Mode and Effects Analysis (FMEA)

“Without continual growth and progress, such words as improvement, achievement, and success have no meaning.” Benjamin Franklin.

FMEA is mainly used in the automotive, pharmaceutical and chemical industries. It is a powerful tool for detecting and preventing product or process failures, and it is part of a continuous improvement culture.

Three factors are analyzed for each failure mode:

Severity or Gravity

Determine all failure modes based on the functional requirements, along with their effects. Examples of failure modes are electrical short circuits, corrosion and deformation.

It is important to note that a failure in one component can lead to a failure in another component. Each failure mode should be listed in technical terms and by function, and the final effect of each failure mode must be taken into account. A failure effect is defined as the result of a failure mode on the system function, as perceived by the user. It is therefore necessary to record these effects in writing as the user will see or experience them. Examples of failure effects are low performance, noise and injury to the user.

Each effect receives a severity number (S) that ranges from 1 (no danger) to 10 (critical). These numbers help engineers prioritize failure modes and their effects. If the severity of an effect is rated 9 or 10, you should consider changing the design, either by eliminating the failure mode or by protecting the user from its effect. A rating of 9 or 10 is reserved for effects that would cause harm to the user.

Occurrence or Frequency

In this step it is necessary to look at the cause of the failure and determine how often it occurs. This can be done by examining similar products or processes and documenting their failures. The cause of a failure is regarded as a weak point of the design. All potential causes of a failure mode must be identified and documented using technical terminology. Examples of causes are erroneous algorithms, excessive voltage and inadequate operating conditions.

Each failure mode receives an occurrence number (O) that ranges from 1 to 10. Actions must be developed if the occurrence is high (> 4 for failures not related to safety, and > 1 when the severity number from the first step is 9 or 10). This step is known as the detailed development of the FMEA process. Occurrence can also be expressed as a percentage: a non-safety-related problem that occurs less than 1% of the time can be given a rating of 1, depending on the product and user specifications.
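The occurrence thresholds described above can be sketched as a small check. This is only an illustration: the function name `needs_action` is hypothetical, and it assumes that a severity of 9 or 10 is what marks a failure as safety-related, as the severity step suggests.

```python
def needs_action(severity: int, occurrence: int) -> bool:
    """Return True when the occurrence rating calls for corrective action.

    Rule taken from the text: act when O > 4 for failures not related
    to safety, and when O > 1 if the severity rating is 9 or 10.
    """
    for name, value in (("severity", severity), ("occurrence", occurrence)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")

    # Assumption: severity 9 or 10 is what makes a failure safety-related.
    safety_related = severity >= 9
    threshold = 1 if safety_related else 4
    return occurrence > threshold


print(needs_action(severity=5, occurrence=5))  # non-safety, O above 4
print(needs_action(severity=9, occurrence=2))  # safety-related, O above 1
```

Keeping the rule in one function makes it easy to adjust the thresholds if the product or user specifications call for different limits.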

Detection Capability

When the appropriate actions have been determined, it is necessary to check their effectiveness and carry out a design verification. The proper inspection method must be selected. First, an engineer must examine the current controls of the system that prevent failure modes or detect them before they reach the consumer.

Next, the testing, analysis and monitoring techniques that have been used in similar systems to detect failures must be identified. From these controls, an engineer can estimate how likely failures are to occur and how they can be detected. Each combination of the two previous steps receives a detection number (D). This number represents the ability of planned tests and inspections to eliminate defects and detect failure modes. As with S and O, a higher number means more risk: a high D indicates that the failure is unlikely to be detected.

After these three basic steps, the risk priority number (RPN) is calculated.

Risk priority numbers

Risk priority numbers are not in themselves the main criterion for selecting an action plan against failure modes; rather, they are an aid in evaluating these actions. After assessing severity, occurrence and detectability, the risk priority number is calculated by multiplying the three ratings: RPN = S × O × D. This must be done for the entire process or design. Once it is calculated, it is easy to identify the areas of greatest concern. Failure modes with the highest RPN should receive the highest priority for corrective action. This means that the failure modes with the highest severity numbers are not always the ones that must be solved first.
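As a minimal sketch of this calculation, the snippet below multiplies the three ratings and ranks the failure modes by the result. The `FailureMode` class, the `rank_by_rpn` helper and all ratings are hypothetical and chosen only for illustration; it assumes the usual convention that a higher D means the failure is harder to detect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    severity: int    # S, from 1 to 10
    occurrence: int  # O, from 1 to 10
    detection: int   # D, from 1 to 10 (assumed: higher = harder to detect)

    @property
    def rpn(self) -> int:
        """Risk priority number: S x O x D."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return the failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical ratings, not taken from any real analysis.
modes = [
    FailureMode("electrical short circuit", severity=9, occurrence=2, detection=2),
    FailureMode("corrosion", severity=5, occurrence=6, detection=7),
    FailureMode("deformation", severity=4, occurrence=3, detection=5),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.name}: RPN = {mode.rpn}")
# corrosion: RPN = 210
# deformation: RPN = 60
# electrical short circuit: RPN = 36
```

In this made-up data the short circuit has the highest severity but the lowest product, because it is rare and easy to detect.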

There may be failures that are less severe but occur more often and are harder to detect. After assigning these values, targeted actions are recommended, responsibilities are assigned and implementation dates are defined. These actions may include specific inspections, testing, quality checks, redesign, etc. After implementing the actions in the design or process, the risk priority number must be calculated again to confirm the improvements. These checks are usually represented graphically for easy viewing. Whenever changes are made to a process or design, the FMEA must be updated.
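The re-check after implementing an action can be sketched as a before-and-after comparison. The helper `rpn_reduction` and the ratings are hypothetical; the example assumes an action that lowers occurrence and improves detection while severity stays the same.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk priority number: S x O x D."""
    return severity * occurrence * detection


def rpn_reduction(before, after):
    """Return (old RPN, new RPN, percent reduction) for two (S, O, D) tuples."""
    old, new = rpn(*before), rpn(*after)
    return old, new, 100 * (old - new) / old


# Hypothetical ratings before and after a corrective action.
old, new, percent = rpn_reduction(before=(5, 6, 7), after=(5, 3, 4))
print(f"RPN before: {old}, after: {new}, reduction: {percent:.1f}%")
# RPN before: 210, after: 60, reduction: 71.4%
```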

Some obvious but important points should be kept in mind:

  • Try to eliminate the failure mode (some failures are more avoidable than others).
  • Reduce the incidence of the failure mode.
  • Improve detection.

Interpretation of results

When analyzing the results of the FMEA, action must be taken on the priority points in order to optimize the design of the product or service. These points are those with a high RPN and those with a high severity rating.
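A minimal sketch of how such priority points might be picked out: a failure mode is selected if its risk priority number is high or if its severity is high. The text does not state cut-off values, so the thresholds of 100 and 9 below are assumptions, as are the function name `priority_points` and the sample ratings.

```python
def priority_points(rows, rpn_threshold=100, severity_threshold=9):
    """Select failure modes with a high risk priority number or high severity.

    rows: iterable of (name, S, O, D) tuples.
    Both thresholds are assumed values for illustration only.
    """
    selected = []
    for name, severity, occurrence, detection in rows:
        risk_priority = severity * occurrence * detection
        if risk_priority >= rpn_threshold or severity >= severity_threshold:
            selected.append((name, risk_priority))
    return selected


# Hypothetical ratings, not taken from any real analysis.
rows = [
    ("electrical short circuit", 9, 2, 2),  # low product, but severity 9
    ("corrosion", 5, 6, 7),                 # high product
    ("deformation", 4, 3, 5),               # neither
]

print(priority_points(rows))
# [('electrical short circuit', 36), ('corrosion', 210)]
```

Checking severity separately keeps a rare but dangerous failure from being hidden by a low product.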

The actions carried out as a result of the FMEA can only aim to:

  • Reduce the probability of occurrence (preferable). The design of the process or the product must be changed.
  • Increase the probability of detection (implies a cost increase).

A misinterpretation can come from:

  • Not having identified all the functions or features of the object of study, or identifying functions that do not correspond to the needs and expectations of the user or client.
  • Not considering all potential failure modes because of a belief that one of them could never occur.