Determining Reliability for Complex Systems
Part 1 - Analytical Techniques
A complex system is one that cannot be broken down into groups of series
and parallel components. In many cases it is not easy to recognize which
components are in series and which are in parallel in a complex system.
The following network is a good example of such a complex system:
As the figure illustrates, this system cannot be broken down into a group
of series and parallel systems. This complicates the problem of
determining the system's reliability. If the system can be broken down
into series/parallel configurations, it is a relatively simple matter to
determine the mathematical or analytical formula that describes the
system's reliability. However, for a complex system, determination of the
system reliability becomes more involved.
In this article, we will look at some of the techniques that can be
employed to derive the mathematical expression for the reliability of the
system in terms of the reliabilities of its components. It is assumed
that the reliability values for the components have been determined using
standard (or accelerated) life data analysis techniques, so that the
reliability function for each component is known. With this
component-level reliability information available, it then becomes
necessary to determine how these component reliability values combine to
give the reliability function for the overall system.
There are a
number of advantages to using analytical techniques to determine system
reliability, as opposed to the more common method of using simulation. The
primary advantage of the analytical solution is that a mathematical
expression that describes the reliability of the system is obtained. Once
the system's reliability function has been determined, other calculations
on the system can be performed. Such calculations include:
- Determination of the system's pdf.
- Determination of the warranty period.
- Determination of the system's failure rate.
- Determination of the system's MTTF.
In addition,
optimization and reliability allocation techniques can be utilized to aid
engineers in their design improvement efforts. Another advantage of using
analytical techniques is the ability to perform static calculations and
analyze systems with a mixture of static and time-dependent components.
Finally, the reliability importance of components over time can be
calculated with this methodology.
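For reference, the calculations listed above follow from the system
reliability function through standard relations. The symbols f_s, lambda_s,
t_w, and R_target below are notation introduced here for illustration and
do not come from the article. The warranty-period line shows one common
reading (the time at which reliability falls to a chosen target), stated
as an assumption:

$$f_s(t) = -\frac{dR_s(t)}{dt}$$

$$\lambda_s(t) = \frac{f_s(t)}{R_s(t)}$$

$$\mathrm{MTTF} = \int_0^{\infty} R_s(t)\,dt$$

$$R_s(t_w) = R_{\text{target}} \quad \text{(solve for } t_w\text{)}$$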
Several methods exist for analytically obtaining the reliability of a
complex system:
- Decomposition method
- Event space method
- Path-tracing method
We will examine
each of these methods, illustrating the techniques involved with simple
system examples.
Decomposition Method
The decomposition method is an application of the law of total
probability. It involves choosing a "key" component and then calculating
the reliability of the system twice: once as if the key component failed
(R=0) and once as if the key component succeeded (R=1). These two
probabilities are then combined to obtain the reliability of the system,
since at any given time the key component will be failed or operating.
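Using probability theory, with s denoting the event of system success and
A the event that the key component succeeds, the equation is:

$$P(s) = P(s \mid A)\,P(A) + P(s \mid \bar{A})\,P(\bar{A})$$

Assuming that the components are statistically independent, this reduces to:

$$R_s = R_A \cdot R_s(R_A = 1) + (1 - R_A) \cdot R_s(R_A = 0)$$
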
Consider three units in series.
- A is the event of Unit 1 success
- B is the event of Unit 2 success
- C is the event of Unit 3 success
- s is the event of system success
First, select a "key" component for the system. Selecting Unit 1, the
probability of success of the system is:

$$P(s) = P(s \mid A)\,P(A) + P(s \mid \bar{A})\,P(\bar{A})$$

If Unit 1 survives, then:

$$P(s \mid A) = P(BC)$$

That is, if Unit 1 is operating, the probability of the success of the
system is the probability of Units 2 and 3 succeeding.
If Unit 1 fails, then:

$$P(s \mid \bar{A}) = 0$$

That is, if Unit 1 is not operating, the system has failed, since a series
system requires all of the components to be operating for the system to
operate.
Thus the reliability of the system is:

$$R_s = P(A)\,P(BC) = P(A)\,P(B)\,P(C) = R_1 R_2 R_3$$

Another Illustration of the Decomposition Method
Consider the following system, in which Units 1 and 2 are in series and
that pair is in parallel with Unit 3:

- A is the event of Unit 1 success
- B is the event of Unit 2 success
- C is the event of Unit 3 success
- s is the event of system success
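Selecting Unit 3 as the key, the system reliability is:

$$P(s) = P(s \mid C)\,P(C) + P(s \mid \bar{C})\,P(\bar{C})$$

If Unit 3 survives, then:

$$P(s \mid C) = 1$$

That is, since Unit 3 represents half of the parallel section of the
system, as long as it is operating, the entire system operates.
If Unit 3 fails, then the system is reduced to Units 1 and 2 in series, so:

$$P(s \mid \bar{C}) = P(AB)$$

The reliability of the system is given by:

$$P(s) = P(C) + P(AB)\,P(\bar{C})$$

or:

$$R_s = R_3 + R_1 R_2 (1 - R_3) = R_1 R_2 + R_3 - R_1 R_2 R_3$$
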
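The decomposition step can be applied repeatedly, choosing a new key
component at each level until every component state is fixed. The
following is a minimal Python sketch under that assumption. The function
`decomposition_reliability`, the structure functions `series` and
`series_parallel`, and the dictionary keys are hypothetical names
introduced for illustration, and the components are assumed to be
statistically independent.

```python
from typing import Callable, Dict, Optional

State = Dict[str, bool]  # component name -> True if operating


def decomposition_reliability(
    works: Callable[[State], bool],
    rel: Dict[str, float],
    fixed: Optional[State] = None,
) -> float:
    """Condition on one key component at a time (law of total probability)."""
    fixed = dict(fixed or {})
    remaining = [name for name in rel if name not in fixed]
    if not remaining:
        # Every component state is known, so the system is simply up or down.
        return 1.0 if works(fixed) else 0.0
    key = remaining[0]  # any undecided component can serve as the key
    # System reliability with the key treated as perfect (R = 1) ...
    r_up = decomposition_reliability(works, rel, {**fixed, key: True})
    # ... and with the key treated as failed (R = 0).
    r_down = decomposition_reliability(works, rel, {**fixed, key: False})
    return rel[key] * r_up + (1.0 - rel[key]) * r_down


def series(state: State) -> bool:
    # Three units in series: all must operate.
    return state["unit1"] and state["unit2"] and state["unit3"]


def series_parallel(state: State) -> bool:
    # Units 1 and 2 in series, in parallel with Unit 3.
    return (state["unit1"] and state["unit2"]) or state["unit3"]


# Illustrative reliabilities at a given time.
rel = {"unit1": 0.995, "unit2": 0.987, "unit3": 0.973}
r_series = decomposition_reliability(series, rel)
r_mixed = decomposition_reliability(series_parallel, rel)
print(f"series: {r_series:.6f}, series-parallel: {r_mixed:.6f}")
```

For the series case the sketch returns the product of the three
reliabilities. For the second system it returns the same value as
R3 + R1 R2 (1 - R3).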
Event Space Method
The event space method is an application of the mutually exclusive
events axiom. All mutually exclusive events are determined, and those
which result in system success are considered. The reliability of the
system is simply the probability of the union of all mutually exclusive
events that yield a system success. Similarly, the unreliability is the
probability of the union of all mutually exclusive events that yield a
system failure. This is illustrated in the following example.
Consider the following system, with reliabilities R1, R2, and R3 for a
given time:
- A is the event of Unit 1 success
- B is the event of Unit 2 success
- C is the event of Unit 3 success
The mutually exclusive system events are:
$X_1 = ABC$ - all units succeed
$X_2 = \bar{A}BC$ - only Unit 1 fails
$X_3 = A\bar{B}C$ - only Unit 2 fails
$X_4 = AB\bar{C}$ - only Unit 3 fails
$X_5 = \bar{A}\bar{B}C$ - Units 1 and 2 fail
$X_6 = \bar{A}B\bar{C}$ - Units 1 and 3 fail
$X_7 = A\bar{B}\bar{C}$ - Units 2 and 3 fail
$X_8 = \bar{A}\bar{B}\bar{C}$ - all units fail
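System events X6, X7, and X8 result in system failure. Thus the
probability of failure of the system is:

$$P_f = P(X_6 \cup X_7 \cup X_8)$$

Since events X6, X7, and X8 are mutually exclusive, then:

$$P_f = P(X_6) + P(X_7) + P(X_8)$$

And:

$$P(X_6) = (1 - R_1)\,R_2\,(1 - R_3)$$
$$P(X_7) = R_1\,(1 - R_2)\,(1 - R_3)$$
$$P(X_8) = (1 - R_1)\,(1 - R_2)\,(1 - R_3)$$

Combining terms yields:

$$P_f = (1 - R_3)(1 - R_1 R_2) = 1 - R_1 R_2 - R_3 + R_1 R_2 R_3$$

Since:

$$R_s = 1 - P_f$$

then:

$$R_s = R_1 R_2 + R_3 - R_1 R_2 R_3$$
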
This is of
course the same result as the one obtained previously using the
decomposition method.
If R1 = 99.5%, R2 = 98.7%, and R3 = 97.3%, then:

$$R_s = (0.995)(0.987) + 0.973 - (0.995)(0.987)(0.973) \approx 0.9995$$

or Rs = 99.95%.
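The event space bookkeeping can be sketched in a few lines of Python by
enumerating every combination of component states. This is only an
illustration under the assumption of statistically independent
components. The names `events`, `event_space_reliability`,
`event_space_unreliability`, and `series_parallel` are hypothetical, and
the structure function encodes Units 1 and 2 in series, in parallel with
Unit 3.

```python
from itertools import product
from typing import Callable, Dict, Iterator, Tuple

State = Dict[str, bool]  # component name -> True if operating


def events(rel: Dict[str, float]) -> Iterator[Tuple[State, float]]:
    """Yield every mutually exclusive system event with its probability."""
    names = list(rel)
    for flags in product([True, False], repeat=len(names)):
        state = dict(zip(names, flags))
        prob = 1.0
        for name in names:
            # R for an operating unit, (1 - R) for a failed unit.
            prob *= rel[name] if state[name] else 1.0 - rel[name]
        yield state, prob


def event_space_reliability(works: Callable[[State], bool],
                            rel: Dict[str, float]) -> float:
    # Sum the probabilities of the events that yield system success.
    return sum(p for state, p in events(rel) if works(state))


def event_space_unreliability(works: Callable[[State], bool],
                              rel: Dict[str, float]) -> float:
    # Sum the probabilities of the events that yield system failure.
    return sum(p for state, p in events(rel) if not works(state))


def series_parallel(state: State) -> bool:
    # Units 1 and 2 in series, in parallel with Unit 3.
    return (state["unit1"] and state["unit2"]) or state["unit3"]


rel = {"unit1": 0.995, "unit2": 0.987, "unit3": 0.973}
failing = [state for state, _ in events(rel) if not series_parallel(state)]
r_s = event_space_reliability(series_parallel, rel)
q_s = event_space_unreliability(series_parallel, rel)
print(f"failing events: {len(failing)}, R_s = {r_s:.6f}, 1 - R_s = {q_s:.6f}")
```

Because the number of events doubles with each added component, this
brute-force enumeration is practical only for small systems.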
Path-Tracing Method
With this method, every path from a starting point to an ending point
is considered. System success requires at least one available path from
one end of the Reliability Block Diagram (RBD) to the other: as long as
at least one path from the beginning to the end of the diagram is
available, the system has not failed. One could consider the RBD
to be a plumbing schematic. If a component in the system fails, the
"water" can no longer flow through it. As long as there is at least one
path for the "water" to flow from the start to the end of the system, the
system is successful. This method involves identifying all of the paths
the "water" could take and calculating the reliability of the path based
on the components that lie along that path. The reliability of the system
is simply the probability of the union of these paths. In order to
maintain consistency of the analysis, starting and ending blocks for the
system must be defined.
Consider the
following system:
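The successful paths for this system are X1 = ABD and X2 = ACD. The
reliability of the system is simply the probability of the union of these
paths.

$$R_s = P(X_1 \cup X_2) = P(ABD \cup ACD)$$

$$R_s = P(ABD) + P(ACD) - P(ABCD)$$

Thus:

$$R_s = R_A R_B R_D + R_A R_C R_D - R_A R_B R_C R_D$$
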
In the following system, a starting and an ending node must be defined.

Assume the following starting and ending nodes:

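The paths for this system are X1 = 1,2 and X2 = 3. The probability of
success for the system is given by:

$$P(s) = P(X_1 \cup X_2) = P(X_1) + P(X_2) - P(X_1 \cap X_2)$$

or:

$$R_s = R_1 R_2 + R_3 - R_1 R_2 R_3$$
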
A modified version of this method is used by ReliaSoft's BlockSim to
calculate the analytical solution to system reliability diagrams.
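The union-of-paths calculation can be sketched in Python with the
inclusion-exclusion principle. This is a plain textbook illustration
under the assumption of statistically independent components, not the
modified method used by BlockSim. The function name
`path_tracing_reliability` and the reliability values for blocks A to D
are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def path_tracing_reliability(paths: List[FrozenSet[str]],
                             rel: Dict[str, float]) -> float:
    """Probability that at least one path has all of its blocks operating."""
    total = 0.0
    for k in range(1, len(paths) + 1):
        sign = 1.0 if k % 2 == 1 else -1.0  # inclusion-exclusion sign
        for subset in combinations(paths, k):
            # All k paths are available only if every block on any of them
            # works, so a shared block is counted once.
            blocks = frozenset().union(*subset)
            prob = 1.0
            for block in blocks:
                prob *= rel[block]
            total += sign * prob
    return total


# Paths X1 = ABD and X2 = ACD, with made-up reliabilities.
rel_abcd = {"A": 0.90, "B": 0.80, "C": 0.70, "D": 0.95}
paths_abcd = [frozenset("ABD"), frozenset("ACD")]
r_abcd = path_tracing_reliability(paths_abcd, rel_abcd)

# Paths X1 = 1,2 and X2 = 3, with the reliabilities from the event space
# example.
rel_123 = {"1": 0.995, "2": 0.987, "3": 0.973}
paths_123 = [frozenset({"1", "2"}), frozenset({"3"})]
r_123 = path_tracing_reliability(paths_123, rel_123)

print(f"ABD/ACD system: {r_abcd:.4f}, 1,2/3 system: {r_123:.6f}")
```

With m paths the sum has 2^m - 1 terms, which is one reason the resulting
expressions grow quickly for larger diagrams.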
The examples above use fairly simple systems to keep the mathematics
manageable. The same techniques can be used to determine the reliability
of more complex systems. The expressions for the system reliability grow
larger as the number of components in the system increases. The way the
components are arranged reliability-wise will also have an effect on the
size of the final system reliability term. In fact, even moderately sized
complex systems can prove to be too unwieldy to solve by hand. Computer
programs can be employed to solve these large complex systems, but to the
best of our knowledge, ReliaSoft's BlockSim is the only software package
available that is capable of this type of analysis.
While these
analytical techniques for determining system reliability can yield results
not available with other techniques, there are also some drawbacks. The
biggest disadvantage of the analytical method is that formulations can
become very complicated. The more complicated a system is, the larger and
more difficult it will be to analytically formulate an expression for the
system's reliability. For particularly detailed systems, this process can
be quite time-consuming, even with the use of computers. Furthermore, when
the maintainability of the system or some of its components must be taken
into consideration, an analytical solution may be impossible to compute.
In these situations, the use of simulation methods may be more
advantageous than attempting to develop a solution analytically. We will
take a look at these simulation methods in next month's issue of
Reliability HotWire.