Determining Reliability for Complex Systems
Part 1 - Analytical Techniques
A complex system is one that cannot be broken down into groups of series
and parallel components. In many cases, it is not easy to recognize which
components of such a system are in series and which are in parallel. The
following network is a good example of a complex system:
As the figure illustrates, this system cannot be reduced to a combination
of series and parallel subsystems, which complicates the problem of
determining its reliability. When a system can be broken down into
series/parallel configurations, it is relatively simple to determine the
analytical formula that describes the system's reliability. For a complex
system, determining the system reliability is more involved.
In this article, we will look at some of the techniques that can be
employed to derive the mathematical expression for the reliability of the
system in terms of the reliabilities of its components. It is assumed that
the reliability values for the components have been determined using
standard (or accelerated) life data analysis techniques, so that the
reliability function for each component is known. With this
component-level information available, the remaining task is to determine
how the component reliabilities combine to give the reliability function
for the overall system.
There are a number of advantages to using
analytical techniques to determine system reliability, as opposed to the
more common method of using simulation. The primary advantage of the
analytical solution is that a mathematical expression that describes the
reliability of the system is obtained. Once the system's reliability
function has been determined, other calculations on the system can be
performed. Such calculations include:
- Determination of the system's pdf.
- Determination of the warranty period.
- Determination of the system's failure rate.
- Determination of the system's MTTF.
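For reference, the quantities in this list follow from the system reliability function through the standard definitions sketched below. The symbols f_s, lambda_s, t_w and R_goal are notation introduced here for illustration and do not appear in the original article. The warranty relation assumes the warranty period is chosen as the time at which reliability falls to a chosen target.

```latex
f_s(t) = -\frac{dR_s(t)}{dt}, \qquad
\lambda_s(t) = \frac{f_s(t)}{R_s(t)}, \qquad
\mathrm{MTTF} = \int_0^{\infty} R_s(t)\,dt, \qquad
R_s(t_w) = R_{\mathrm{goal}}
```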
In addition, optimization and reliability
allocation techniques can be utilized to aid engineers in their design
improvement efforts. Another advantage of using analytical
techniques is the ability to perform static calculations and analyze
systems with a mixture of static and time-dependent components. Finally, the reliability importance of components over time can be
calculated with this methodology.
Several
methods exist for analytically obtaining the reliability of a complex system:
- Decomposition method
- Event space method
- Path-tracing method
We will examine each of these methods,
illustrating the techniques involved with simple system examples.
Decomposition Method
The decomposition method is an application of the law of total probability.
It involves choosing a "key" component and then
calculating the reliability of the system twice: once as if the key
component failed (R=0) and once as if the key component succeeded (R=1). These two probabilities are then combined to obtain the
reliability of the system, since at any given time the key component will
be failed or operating. Using probability theory, the equation is:
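$$R_s = P(s \cap A) + P(s \cap \bar{A})$$
where s is the event of system success, A is the event that the key
component succeeds, and $\bar{A}$ is the event that it fails. Assuming
that the components are statistically independent, and applying the
definition of conditional probability, this becomes:
$$R_s = P(s \mid A)\,P(A) + P(s \mid \bar{A})\,P(\bar{A})$$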
Consider three units in series.
- A is the event of Unit 1 success
- B is the event of Unit 2 success
- C is the event of Unit 3 success
- s is the event of system success
First, select a "key" component for the system. Selecting Unit 1, the
probability of success of the system is:
$$R_s = P(s \mid A)\,P(A) + P(s \mid \bar{A})\,P(\bar{A})$$
If Unit 1
survives, then:
$$P(s \mid A) = R_2 \cdot R_3$$
That is, if Unit 1 is operating, the probability of the success of the
system is the probability of Units 2 and 3 succeeding.
If Unit 1 fails, then:
$$P(s \mid \bar{A}) = 0$$
That is, if Unit 1 is not operating, the system has failed since a
series system requires all of the components to be operating for the
system to operate.
Thus the reliability of the system is:
$$R_s = R_2 \cdot R_3 \cdot P(A) = R_1 \cdot R_2 \cdot R_3$$
Another Illustration of the Decomposition Method
Consider the following system, in which Units 1 and 2 are in series and
that pair is in parallel with Unit 3:

- A is the event of Unit 1 success
- B is the event of Unit 2 success
- C is the event of Unit 3 success
- s is the event of system success
Selecting Unit 3 as the key, the system reliability is:
$$R_s = P(s \mid C)\,P(C) + P(s \mid \bar{C})\,P(\bar{C})$$
If Unit 3 survives, then:
$$P(s \mid C) = 1$$
That is, since Unit 3 represents half of the parallel section of the
system, as long as it is operating, the entire system operates.
If Unit 3 fails, then the system is reduced to Units 1 and 2 in series:
$$P(s \mid \bar{C}) = R_1 \cdot R_2$$
The reliability of the system is given by:
$$R_s = P(C) + R_1 \cdot R_2 \cdot P(\bar{C}) = R_3 + R_1 \cdot R_2 \cdot (1 - R_3)$$
or,
$$R_s = R_3 + R_1 \cdot R_2 - R_1 \cdot R_2 \cdot R_3$$
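The conditioning step can be applied repeatedly until every unit's state is fixed. The sketch below is a minimal illustration of that idea in Python, not the article's own tooling. The function name `decomposition_reliability`, the `structure` callbacks and the numeric reliabilities are assumptions chosen for illustration, and the units are assumed to be statistically independent.

```python
def decomposition_reliability(structure, rel, fixed=None):
    """System reliability by repeatedly conditioning on a key component.

    structure: function mapping {unit: True/False} to True on system success.
    rel: {unit: reliability}; units are assumed statistically independent.
    fixed: states already assumed for previously chosen key components.
    """
    fixed = dict(fixed or {})
    remaining = [unit for unit in rel if unit not in fixed]
    if not remaining:
        # Every unit has a known state, so success is certain or impossible.
        return 1.0 if structure(fixed) else 0.0
    key = remaining[0]  # arbitrary choice of key component
    r_key_up = decomposition_reliability(structure, rel, {**fixed, key: True})
    r_key_down = decomposition_reliability(structure, rel, {**fixed, key: False})
    # Rs = P(s | key up) P(key up) + P(s | key down) P(key down)
    return r_key_up * rel[key] + r_key_down * (1.0 - rel[key])


# Illustrative reliabilities (assumed values).
rel = {1: 0.995, 2: 0.987, 3: 0.973}


def series(state):
    # Three units in series: all must operate.
    return state[1] and state[2] and state[3]


def series_parallel(state):
    # Units 1 and 2 in series, in parallel with Unit 3.
    return (state[1] and state[2]) or state[3]


rs_series = decomposition_reliability(series, rel)
rs_series_parallel = decomposition_reliability(series_parallel, rel)
print(rs_series, rs_series_parallel)
```

Conditioning on every unit in turn is the simplest choice but not the most efficient. In hand calculations, one well-chosen key component is usually enough to leave series/parallel subsystems.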
Event Space Method
The event space method is an application of the mutually exclusive
events axiom. All mutually exclusive events are determined, and those
which result in system success are considered. The reliability of the
system is simply the probability of the union of all mutually exclusive
events that yield a system success. Similarly, the unreliability is the
probability of the union of all mutually exclusive events that yield a
system failure. This is illustrated in the following example.
Consider the following system (Units 1 and 2 in series, in parallel with
Unit 3), with reliabilities R1, R2, and R3 for a given time:
- A is the event of Unit 1 success
- B is the event of Unit 2 success
- C is the event of Unit 3 success
The mutually exclusive system events are:
$X_1 = ABC$ - all units succeed
$X_2 = \bar{A}BC$ - only Unit 1 fails
$X_3 = A\bar{B}C$ - only Unit 2 fails
$X_4 = AB\bar{C}$ - only Unit 3 fails
$X_5 = \bar{A}\bar{B}C$ - Units 1 and 2 fail
$X_6 = \bar{A}B\bar{C}$ - Units 1 and 3 fail
$X_7 = A\bar{B}\bar{C}$ - Units 2 and 3 fail
$X_8 = \bar{A}\bar{B}\bar{C}$ - all units fail
System events X6, X7, and X8 result in
system failure. Thus the probability of failure of the system
is:
$$P_f = P(X_6 \cup X_7 \cup X_8)$$
Since events X6, X7, and X8 are
mutually exclusive, then:
$$P_f = P(X_6) + P(X_7) + P(X_8)$$
And:
$$P(X_6) = P(\bar{A}B\bar{C}) = (1 - R_1)(R_2)(1 - R_3)$$
$$P(X_7) = P(A\bar{B}\bar{C}) = (R_1)(1 - R_2)(1 - R_3)$$
$$P(X_8) = P(\bar{A}\bar{B}\bar{C}) = (1 - R_1)(1 - R_2)(1 - R_3)$$
Combining terms yields:
$$P_f = 1 - R_1 \cdot R_2 - R_3 + R_1 \cdot R_2 \cdot R_3$$
Since:
$$R_s = 1 - P_f$$
then:
$$R_s = R_1 \cdot R_2 + R_3 - R_1 \cdot R_2 \cdot R_3$$
This is of course the same result as the one obtained previously using
the decomposition method.
If R1 = 99.5%, R2 = 98.7%, and R3 = 97.3%, then:
$$R_s = 0.995 \cdot 0.987 + 0.973 - 0.995 \cdot 0.987 \cdot 0.973 = 0.999515755$$
or:
$$R_s \approx 99.95\%$$
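The enumeration of mutually exclusive events can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the article's own tooling. The function name `event_space_reliability` and the `structure` callback are hypothetical, the units are assumed to be statistically independent, and the success rule "(1 and 2) or 3" is the configuration implied by the result above.

```python
from itertools import product


def event_space_reliability(structure, rel):
    """Sum the probabilities of all mutually exclusive events that give success.

    structure: function mapping {unit: True/False} to True on system success.
    rel: {unit: reliability}; units are assumed statistically independent.
    """
    units = list(rel)
    total = 0.0
    # Each combination of unit states is one mutually exclusive event.
    for states in product([True, False], repeat=len(units)):
        event = dict(zip(units, states))
        if structure(event):
            p = 1.0
            for unit in units:
                p *= rel[unit] if event[unit] else 1.0 - rel[unit]
            total += p
    return total


def system_success(event):
    # Units 1 and 2 in series, in parallel with Unit 3 (assumed configuration).
    return (event[1] and event[2]) or event[3]


rel = {1: 0.995, 2: 0.987, 3: 0.973}
rs = event_space_reliability(system_success, rel)
# Unreliability from the complementary set of events.
pf = event_space_reliability(lambda event: not system_success(event), rel)
print(rs, pf)
```

The loop visits 2^n events for n units, which is why this method becomes impractical by hand as the system grows.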
Path-Tracing Method
With this method, every path from a starting point to an ending point is
considered. System success requires at least one path to be available from
one end of the Reliability Block Diagram (RBD) to the other; as long as one
such path exists, the system has not failed. One could consider the RBD to
be a plumbing schematic. If a component in the system fails, the "water"
can no longer flow through it. As long as there is at least one path
for the "water" to flow from the start to the end of the system,
the system is successful. This method involves identifying all of
the paths the "water" could take and calculating the reliability
of each path based on the components that lie along it. The
reliability of the system is simply the probability of the union of these
paths. In order to maintain consistency of the analysis, starting
and ending blocks for the system must be defined.
Consider the following system, in which Unit A is in series with the
parallel pair of Units B and C, followed by Unit D in series:
The successful paths for this system are X1 = ABD and X2 = ACD. The
reliability of the system is the probability of the union of these paths.
$$R_s = P(X_1 \cup X_2)$$
$$P(X_1 \cup X_2) = P(X_1) + P(X_2) - P(X_1 \cap X_2) = P(ABD) + P(ACD) - P(ABCD)$$
Thus:
$$R_s = R_A \cdot R_B \cdot R_D + R_A \cdot R_C \cdot R_D - R_A \cdot R_B \cdot R_C \cdot R_D$$
In the following system, a starting and an ending node
must be defined.

Assume the following starting and ending nodes:

The paths for this system are X1 = 1,2 and X2 = 3. The probability of
success for the system is given by:
$$P(X_1 \cup X_2) = P(X_1) + P(X_2) - P(X_1 \cap X_2) = P(1,2) + P(3) - P(1,2,3)$$
or:
$$R_s = R_1 \cdot R_2 + R_3 - R_1 \cdot R_2 \cdot R_3$$
A
modified version of this method is used by ReliaSoft's
BlockSim to calculate the analytical solution to system reliability
diagrams.
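The basic (unmodified) path-tracing calculation can be sketched in Python by applying inclusion-exclusion to the union of the success paths. This is an illustrative sketch only and makes no claim about how BlockSim works internally. The function name `path_tracing_reliability` is hypothetical, the list of paths is assumed to be supplied by the analyst, the units are assumed to be statistically independent, and the reliabilities for Units A through D are made-up values.

```python
from itertools import combinations


def path_tracing_reliability(paths, rel):
    """Probability of the union of success paths via inclusion-exclusion.

    paths: list of iterables, each naming the units along one success path.
    rel: {unit: reliability}; units are assumed statistically independent.
    """
    total = 0.0
    for k in range(1, len(paths) + 1):
        # Odd-sized intersections are added, even-sized ones subtracted.
        sign = 1.0 if k % 2 == 1 else -1.0
        for group in combinations(paths, k):
            # All paths in the group work only if every unit on them works;
            # shared units are counted once.
            units = set().union(*group)
            p = 1.0
            for unit in units:
                p *= rel[unit]
            total += sign * p
    return total


# Paths ABD and ACD, with assumed reliabilities.
rel_abcd = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.95}
rs_abcd = path_tracing_reliability(["ABD", "ACD"], rel_abcd)

# Paths (1, 2) and (3), with the reliabilities used in the event space example.
rel_123 = {1: 0.995, 2: 0.987, 3: 0.973}
rs_123 = path_tracing_reliability([(1, 2), (3,)], rel_123)
print(rs_abcd, rs_123)
```

With m paths, the sum contains 2^m - 1 terms, which illustrates how quickly the analytical expression grows.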
The examples in this article use fairly simple systems to keep the
mathematics manageable. The same techniques can be used to determine the
reliability of more complex systems. The expressions for the system
reliability grow larger as the number of components in the system
increases, and the way the components are arranged reliability-wise also
affects the size of the final system reliability expression. In fact, even
moderately sized complex systems can prove too unwieldy to solve by hand.
Computer programs can be employed to solve these large complex systems,
but to the best of our knowledge, ReliaSoft's
BlockSim is the only software package available that is capable of
this type of analysis.
While these
analytical techniques for determining system reliability can yield results
not available with other techniques, there are also some drawbacks. The biggest disadvantage of the analytical method is that formulations can
become very complicated. The more complicated a system is, the
larger and more difficult it will be to analytically formulate an
expression for the system's reliability. For particularly detailed
systems, this process can be quite time-consuming, even with the use of
computers. Furthermore, when the maintainability of the system or
some of its components must be taken into consideration, an analytical
solution may be impossible to compute. In these situations, the use
of simulation methods may be more advantageous than attempting to develop
a solution analytically. We will take a look at these simulation
methods in next month's issue of Reliability
HotWire.