Specifications and Product Failure Definitions
Before life data can be analyzed or even collected, the definition of failure must be established. While this may seem like obvious advice, the lack of generally accepted definitions for performance-related failures can result in misunderstandings over validation and reliability specifications, wasted test time, and squandered resources. The process of characterizing the reliability and defining the failures for a product is directly related to the product's mission.
A textbook definition of reliability is:
The conditional probability, at a given confidence level, that the equipment will perform its intended functions satisfactorily or without failure, i.e., within specified performance limits, at a given age, for a specified length of time, function period, or mission time, when used in the manner and for the purpose intended while operating under the specified application and operation environments with their associated stress levels. 
With all of the conditions removed, this boils down to defining reliability as the probability that the product will perform its intended mission without failing. The definition of reliability springs directly from the product mission, in that product failure is the inability of the product to perform its defined mission.
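As a rough illustration, the simplified definition and the "at a given age" condition from the textbook definition can be written in symbols. The notation here is an assumption and does not come from the original text: T is the random time to failure, t is the mission time, and t_0 is the age of the product at the start of the mission. The confidence-level qualifier is not represented.

```latex
R(t) = P(T > t), \qquad
R(t \mid t_0) = P(T > t_0 + t \mid T > t_0) = \frac{R(t_0 + t)}{R(t_0)}
```

Either expression is only meaningful once "failure" has been defined, because the event T > t means that no failure, as defined for the mission, has occurred by time t.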
Universal Failure Definitions
One of the most important reasons for establishing universally agreed-upon failure definitions is that different groups within an organization may have different ideas about what sort of behavior actually constitutes a failure. This is often the case when comparing the practices of design and manufacturing engineering groups. Identical tests performed on the same product by these groups may produce radically different results simply because the two groups have different definitions of product failure. For a reliability program to be effective, there must be a commonly accepted definition of failure for the entire organization. Of course, this definition may require a little flexibility depending on the type of product, development phase, etc., but as long as everyone is familiar with the commonly accepted definition of failure, communication will be more effective and the reliability program will be easier to manage.
Another benefit of having universally agreed-upon failure definitions is that it will minimize the tendency to rationalize away failures on certain tests. This can be a problem, particularly during product development, because engineers and managers tend to overlook or diminish the importance of failure modes that are unfamiliar or not easily replicable. This tendency is only human, and a person who has spent a great deal of time developing a product is sometimes justified in writing off an oddball failure as a “glitch” or as being due to some other external error. However, this type of mentality also results in products being released into the field that have poorly defined but very real failure modes. Having a specific failure definition that applies to all or most types of tests will help alleviate this problem. At the same time, a degree of flexibility is called for in the definition of failure, particularly with complex products that may have a number of distinct failure modes. For this reason, it may be advisable to have a multi-tiered failure definition structure that can accommodate the behavioral vagaries of complex equipment. The following three-level list of failure categories is given as an example:
During testing, all of these occurrences should be logged with codes to separate the three failure types. Other test-process-related issues, such as deviations from test plans, should be logged in a separate test log. There should be a timely review of logged occurrences to ensure proper classification prior to metric calculation and reporting.
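The logging practice described above might be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: the codes `F1` to `F3`, the `Occurrence` fields, the serial numbers, and the `count_by_level` helper are all hypothetical names chosen for the example, and the three levels are placeholders for whatever categories the organization has agreed on.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class FailureLevel(Enum):
    # Hypothetical codes: placeholders for the three agreed failure categories.
    LEVEL_1 = "F1"
    LEVEL_2 = "F2"
    LEVEL_3 = "F3"


@dataclass
class Occurrence:
    unit_id: str           # hypothetical unit identifier
    test_hours: float      # test time at which the occurrence was observed
    description: str
    level: FailureLevel
    reviewed: bool = False  # set to True once the classification has been reviewed


def count_by_level(log):
    """Count occurrences per failure level, refusing logs with unreviewed entries."""
    pending = [o for o in log if not o.reviewed]
    if pending:
        raise ValueError(
            f"{len(pending)} occurrence(s) still awaiting classification review"
        )
    return Counter(o.level for o in log)


# Illustrative entries only; these are not real test data.
failure_log = [
    Occurrence("SN-001", 120.0, "unit reset during vibration step",
               FailureLevel.LEVEL_2, reviewed=True),
    Occurrence("SN-002", 310.5, "output drifted outside performance limits",
               FailureLevel.LEVEL_1, reviewed=True),
    Occurrence("SN-001", 402.0, "display flicker, self-recovered",
               FailureLevel.LEVEL_2, reviewed=True),
]

counts = count_by_level(failure_log)
for level in FailureLevel:
    print(level.value, counts[level])
```

Blocking the count until every entry has been reviewed mirrors the requirement that classification be checked before metrics are calculated and reported. Test-process issues such as plan deviations would go in a separate log rather than in `failure_log`.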
Dimitri, Reliability Engineering Handbook, Vol. 1, Prentice-Hall,
Copyright © 2001 ReliaSoft Corporation, ALL RIGHTS RESERVED