Reliability HotWire: eMagazine for the Reliability Professional

Issue 6, August 2001

Reliability Basics

Sources of Reliability Data
Part 2 - Field Data

Last month in "Part 1: Reliability Testing Basics," we discussed reliability testing as a source of reliability data. While reliability testing is vital to the implementation of a reliability program, it is not the sole source of product reliability performance data. Indeed, the data received from the field is the "true" measure of product performance, and is directly linked to the financial aspects of a product. In fact, a great deal of field data may be more finance-related than reliability-related. However, given the importance of the link between reliability and income, it is important to ensure that adequate reliability information can be gleaned from field performance data. In many cases, it is not too difficult to adapt current field data collection programs to include information that is directly applicable to reliability reporting.

Field Data Examples
Some of the most prevalent types of field data are discussed below. These discussions will tend towards generalizations, as every organization has different methods of monitoring the performance of its products once they are in the field. However, the descriptions below give a good general view of how different types of field data may be collected and put to use for a reliability program.

Sales and Forecasting Data
Sales and forecasting information is a general-use data type that serves as the basis for many other analyses of field data. Essentially, this information provides you with a figure for the population of products in the field. Knowing how many units are in use during any given time period is absolutely vital to performing any sort of reliability-oriented calculation. An accurate count of the failures in the field is of little use without a good figure for the total number of units in the field at that time.

Warranty Data
The warranty data type is something of a catch-all category that may or may not include the other types of field data listed below, and may not contain adequate information to track reliability-related data. Since most warranty systems are designed to track finances and not performance data, some types of warranty data may be of very little use for reliability purposes. However, it may be possible to garner adequate reliability information from the inputs to the warranty data, if not from the actual warranty data itself. This is of course a case of "garbage in, garbage out," and a poorly set-up warranty tracking system will yield poor or misleading data regarding the reliability of the product. At the very least, there should be a degree of confidence regarding the raw number of failures or warranty hits during a particular time period. This, coupled with accurate shipping data, will allow a crude approximation of reliability based on the number of units that failed versus the number of units operating in the field in any given time period.
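The crude approximation described above can be sketched in a few lines of Python. This is only an illustration: the function name and the monthly figures are hypothetical, and the calculation assumes that every warranty hit is a genuine failure and that the unit count comes from accurate shipping data.

```python
def crude_period_reliability(failures, units_in_field):
    """Fraction of fielded units that did not fail during the period."""
    if units_in_field <= 0:
        raise ValueError("units_in_field must be positive")
    if not 0 <= failures <= units_in_field:
        raise ValueError("failures must be between 0 and units_in_field")
    return 1.0 - failures / units_in_field


# Hypothetical figures: (period, units operating in the field, warranty hits)
monthly = [
    ("2001-05", 1200, 18),
    ("2001-06", 1500, 21),
    ("2001-07", 1850, 37),
]

estimates = {
    period: crude_period_reliability(hits, units)
    for period, units, hits in monthly
}

for period, value in estimates.items():
    print(f"{period}: crude reliability estimate = {value:.3f}")
```

An estimate of this kind says nothing about how long each unit operated before it failed, which is why the more detailed data types below are worth collecting.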

Field Service Data
The field service data type is connected with field service calls where a repair technician has to physically repair a failed product. This is a potentially powerful source of field reliability information, if a system is in place to gather the necessary data during the service call. However, the job of the service technician is to restore the customer's equipment to operating condition as quickly as possible, and not necessarily to perform a detailed failure analysis. This can lead to a number of problems. First, the service technician may not be recording information necessary to reliability analysis, such as how much time the product accumulated before it failed. Second, the technician may take a "shotgun" approach to repair. That is, based on the failure symptom, the technician will replace all of the parts whose failure could produce that particular symptom. It may be that only one of the replaced parts had actually failed, so it is necessary to perform a failure analysis on all of the parts to determine which one was actually the cause of the product failure. Unfortunately, this is not always done, and when it is, the parts with which no problem was found will often be returned to field service circulation. This leads to another potential source of error in field service data: used parts with unknown amounts of accumulated time and damage may be used as replacement parts on subsequent service calls. This makes tracking and characterizing field reliability very difficult. From a reliability perspective, it is always best to record the necessary failure information, avoid the "shotgun" approach to servicing failed equipment, and always use new units when making part replacements.

Customer Support Data
The customer support data type comes from phone-in customer support services. In many cases, it may be directly related to the field service data in that the customer with a failed product will call to inform the organization. In some circumstances, it may be possible to solve the customer's problem over the phone, or to diagnose the cause of the problem well enough that a replacement part can be sent directly to the customer without a service technician having to make an on-site visit. Ideally, the customer support and field service data would reside in the same database, but this is not always the case. Regardless of the location, customer support data must always be screened with care, as the data does not always reflect actual problems with the product. Many customer support calls may concern usability issues or other instances of the customer not being able to properly use the product. In cases such as this, there will be a cost to the organization or a warranty hit, even though there is no real fault or failure of the product. For example, a product that is very reliable but has a poorly written user manual may generate a great many customer support calls. Even though the product is working perfectly, the customers are having difficulty operating it. This is a good example of one of the sources of the "disconnect" between in-house and field reliability data.
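As a rough illustration of this kind of screening, the sketch below separates support calls that reflect an actual product failure from those that do not. The call categories and record layout are hypothetical; a real support database will have its own coding scheme, and deciding which codes count as failures is a judgment the organization has to make.

```python
# Hypothetical support-call records; category names are assumptions.
calls = [
    {"id": 1, "category": "hardware_failure"},
    {"id": 2, "category": "usability"},
    {"id": 3, "category": "usability"},
    {"id": 4, "category": "hardware_failure"},
    {"id": 5, "category": "documentation"},
]

# Categories treated as genuine product failures for reliability purposes.
FAILURE_CATEGORIES = {"hardware_failure"}


def split_calls(calls):
    """Return (failure_calls, other_calls) based on the call category."""
    failures = [c for c in calls if c["category"] in FAILURE_CATEGORIES]
    other = [c for c in calls if c["category"] not in FAILURE_CATEGORIES]
    return failures, other


failures, other = split_calls(calls)
print(f"{len(failures)} failure calls, {len(other)} non-failure calls")
```

The non-failure calls still represent a cost to the organization, so they are set aside for separate reporting here instead of being discarded.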

Returned Parts/Failure Analysis Data
As was mentioned earlier, failed parts or systems are sometimes returned for a more detailed failure analysis than can be provided by the field service technician. Data from this area is usually more detailed regarding the cause of failure, and is usually more useful to design or process engineers than to reliability engineers. However, it is still an important source of information regarding the reliability behavior of the product. This is especially true if the field service technicians are using the "shotgun" approach to servicing the failed product. If this is the case, it is necessary for all of the returned parts to be analyzed to determine the true cause of the failure. The results of the failure analysis should be linked to the field service records in order to provide a complete picture of the nature of the failure. In many cases, this does not occur, or the returned parts are not analyzed in a timely fashion. Even if they are, there tends to be a significant proportion of returned parts with which no problem can be found. This is another example of a potential cause of the disparity between lab and field reliability data. However, even if the failure analysis group is unable to assign a cause to the failure, a failure has taken place, and the organization has most likely taken a warranty hit. In the field, the performance the customer experiences is the final arbiter of the reliability of the product.

Field Data Collection
Depending on the circumstances, collection of field data for reliability analyses can be either a simple matter or a major headache. Even if there is not a formal field data collection system in place, odds are that much of the necessary general information is being collected already in order to track warranty costs, financial information, etc. The potential drawback is that the data collection system may not be set up to collect all of the types of data necessary to perform a thorough reliability analysis. As mentioned earlier, many field data collection methodologies focus on aspects of the field performance other than reliability. Usually, it is a small matter to modify data collection processes to gather the necessary reliability information.

For example, in one instance the field repair personnel were only collecting information specific to the failure of the system and what they did to correct the fault. No information was being collected on the time accumulated on the systems at the time of failure. Fortunately, it was a simple matter to have the service personnel access the usage information, which was stored on a computer chip in the system. This information was then included with the rest of the data collected by the service technician, which allowed for a much greater resolution in the failure times used in the calculation of field reliability. Previously, the failure time was calculated by subtracting the date the product was shipped from the failure date. This could cause problems in that the product could remain unused for months after it was shipped. By adding the relatively small step of requiring the service technicians to record the accumulated use time at failure, a much more accurate model of the field reliability of this unit could be made.
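The difference between the two failure-time measures can be seen in a short Python sketch. The record below is invented for illustration; the field names and the 640-hour usage reading are assumptions, not data from the case described above.

```python
from datetime import date


def calendar_days_to_failure(ship_date, failure_date):
    """Failure time estimated from dates alone: failure date minus ship date."""
    return (failure_date - ship_date).days


# Hypothetical service record, including the usage reading from the unit.
record = {
    "ship_date": date(2001, 1, 15),
    "failure_date": date(2001, 7, 14),
    "usage_hours": 640.0,
}

calendar_days = calendar_days_to_failure(
    record["ship_date"], record["failure_date"]
)
# Upper bound on operating time if the unit had run continuously since shipment.
calendar_hours = calendar_days * 24

print(f"Calendar-based estimate: {calendar_days} days ({calendar_hours} hours)")
print(f"Recorded usage at failure: {record['usage_hours']:.0f} hours")
```

In this made-up record the date-based figure overstates the operating time by a wide margin, which is the sort of distortion that results when a unit sits unused after shipment.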

Another difficulty in using field data to perform reliability analyses is that the data may reside in different places, and in very different forms. The field service data, customer support data, and failure analysis data may be in different databases, each of which may be tailored to the specific needs of the group recording the data. The challenge in this case is in developing a method of gathering all of the pertinent data from the various sources and databases and pulling it into one central location where it can easily be processed and analyzed.
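One possible way to pull such records together is sketched below, assuming that the sources share some common key. Here the field service and failure analysis records are assumed to share a service call identifier, and the customer support records are assumed to carry the unit serial number. All of the field names and records are hypothetical.

```python
from collections import defaultdict

# Hypothetical extracts from three separate databases.
field_service = [
    {"call_id": "FS-001", "serial": "A100", "symptom": "no power",
     "parts_replaced": ["PSU", "fuse"]},
    {"call_id": "FS-002", "serial": "A207", "symptom": "overheating",
     "parts_replaced": ["fan"]},
]
failure_analysis = [
    {"call_id": "FS-001", "part": "PSU", "finding": "failed capacitor"},
    {"call_id": "FS-001", "part": "fuse", "finding": "no problem found"},
]
customer_support = [
    {"serial": "A100", "issue": "unit will not turn on"},
    {"serial": "A350", "issue": "cannot find setup instructions"},
]


def consolidate(field_service, failure_analysis, customer_support):
    """Attach failure analysis findings and support calls to each service call."""
    findings = defaultdict(list)
    for fa in failure_analysis:
        findings[fa["call_id"]].append((fa["part"], fa["finding"]))

    support = defaultdict(list)
    for cs in customer_support:
        support[cs["serial"]].append(cs["issue"])

    merged = []
    for fs in field_service:
        merged.append({
            **fs,
            "analysis": findings.get(fs["call_id"], []),
            "support_calls": support.get(fs["serial"], []),
        })
    return merged


merged = consolidate(field_service, failure_analysis, customer_support)
for row in merged:
    print(row["call_id"], row["analysis"], row["support_calls"])
```

Service calls with no matching analysis keep an empty list, which makes it easy to see which returned parts have not yet been analyzed.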

The "Disconnect" Between In-House and Field Data
It should be noted at this point that there may be a "disconnect," or seeming lack of correlation, between the reliability performance of the products in the field and the results of in-house reliability testing. A typical rule of thumb is to expect the unreliability in the field to be twice what was observed in the lab. Some of the specific causes of this disparity have already been discussed, but in general the product will usually receive harsher treatment in the field than in the lab. Units being tested in the labs are often hand-built or carefully set up and adjusted by engineers prior to the beginning of the test. Furthermore, the tests are performed by trained technicians who are adept at operating the product being tested. Most end-use customers do not have the advantage of a fine-tuned unit and training and experience in its operation, thus leading to many more operator-induced failures than would be experienced during in-house testing. Also, final production units are subject to manufacturing variation and transportation damage that test units might not undergo, leading to yet more field failures that would not be experienced in the lab. Finally, the nature of the data that goes into the calculations will be different. In-house reliability data is usually a great deal more detailed than the catch-as-catch-can type of non-parametric data that characterizes a great deal of field data. As can be imagined, there are any number of sources for the variation between field reliability data and in-house reliability test results. However, with careful monitoring and analysis of both sources of data, it should be possible to model the relationship between the two, allowing for more accurate prediction of field performance based on reliability testing results.
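The rule of thumb mentioned above can be applied as a quick first estimate, as in the following Python sketch. The function name and the 97% lab reliability figure are hypothetical, and the factor of two is only the general rule of thumb; a relationship modeled from an organization's own lab and field data would replace it.

```python
def field_unreliability_rule_of_thumb(lab_unreliability, factor=2.0):
    """Scale lab unreliability by a rule-of-thumb factor, capped at 1.0."""
    if not 0.0 <= lab_unreliability <= 1.0:
        raise ValueError("lab_unreliability must be between 0 and 1")
    return min(factor * lab_unreliability, 1.0)


# Hypothetical in-house test result: 97% reliability at some mission time.
lab_reliability = 0.97
field_unreliability = field_unreliability_rule_of_thumb(1.0 - lab_reliability)
field_reliability = 1.0 - field_unreliability

print(f"Lab reliability: {lab_reliability:.2f}")
print(f"Rule-of-thumb field reliability: {field_reliability:.2f}")
```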


ReliaSoft Corporation

Copyright © 2001 ReliaSoft Corporation, ALL RIGHTS RESERVED