17 System Reliability Modeling

To design, analyze, and evaluate the reliability and maintainability characteristics of a system, there must be an understanding of the system's relationships to all the subsystems, assemblies, and components. Many times, this can be accomplished through logical and mathematical models of the system that show the functional relationships among all the components, the subsystems, and the overall system. The reliability of the system is a function of the reliabilities of its components and building blocks.

17.1 Reliability Block Diagram

Engineering analysis of the system has to be conducted in order to develop a reliability model. The engineering analysis consists of the following steps:

Develop a functional block diagram of the system based on physical principles governing the operations of the system.
Develop the logical and topological relationships between functional elements of the system.
Determine the extent to which a system can operate in a degraded state, based on performance evaluation studies.
Define the spare and repair strategies (for maintenance systems).

Based on the preceding analysis, a reliability block diagram is developed, which can be used to calculate various measures of reliability and maintainability. The reliability block diagram (RBD) is a pictorial way of showing the success or failure combinations for a system. A system reliability block diagram presents a logical relationship of the system, subsystems, and components. Some of the guidelines for drawing these diagrams are as follows:

A group of components that are essential for the performance of the system and/or its mission are drawn in series (Figure 17.1).
Components that can substitute for other components are drawn in parallel (Figure 17.3).
Each block in the diagram is like a switch: it is closed when the component it represents is working and is opened when the component has failed. Any closed path through the diagram is a success path.

The failure behavior of all the redundant components must be specified. Some of the common types of redundancies are:

Active Redundancy or Hot Standby. The standby component has the same failure rate as it would if it were operating in the system.
Passive Redundancy, Spare, or Cold Standby. The standby component cannot fail. This is generally assumed for spare or shelf items.
Warm Standby. The standby component has a lower hazard rate than the operating component. This is usually a realistic assumption.

This chapter describes how to design, analyze, and evaluate the reliability of a system based on the parts, assemblies, and subsystems that compose a system. Most of the concepts in this chapter are explained using one level of the system hierarchical process. For example, we will illustrate how to compute system reliability if we know the reliabilities of the subsystems. Then the same methods and logic can be used to combine assemblies of the subsystem, and so on.

17.2 Series System

In a series system, all subsystems must operate successfully if the system is to function or operate successfully. This implies that the failure of any subsystem will cause the entire system to fail.

The reliability block diagram of a series system is shown in Figure 17.1. The reliability of each block is represented by R_i(t) and the times to failure are represented by TTF(i). The units need not be physically connected in series for the system to be called a series system.

Figure 17.1 Series system representation.

System reliability can be determined using the basic principles of probability theory. We make the assumption that all the subsystems are probabilistically independent. This means that whether or not one subsystem works does not depend on the success or failure of other subsystems.

Let us first consider the static case. Let R_i be the reliability of the ith subsystem, i = 1, 2, … , n. Let E_s be the event that the system functions successfully and E_i be the event that each subsystem i functions successfully (i = 1, 2, …, n). Then

(17.1) $E_s = E_1 \cap E_2 \cap \cdots \cap E_n$

because the system will function if and only if all the subsystems function. If all the events E_i, i = 1, 2, … , n, are probabilistically independent, then

(17.2) $R_s = P(E_s) = \prod_{i=1}^{n} P(E_i) = \prod_{i=1}^{n} R_i$

Equation 17.2 can be generalized for time-dependent or dynamic reliability models. Denote the time-to-failure random variable for the ith subsystem by T_i, i = 1, 2, … , n. Then, for the series system, the system reliability is given by

(17.3) $R_s(t) = P\left[(T_1 > t) \cap (T_2 > t) \cap \cdots \cap (T_n > t)\right]$

If we assume that all the random variables, T_i, i = 1, 2, … , n, are independent, then

(17.4) $R_s(t) = P(T_1 > t)\, P(T_2 > t) \cdots P(T_n > t)$

Hence, we can state the following equation:

(17.5) $R_s(t) = \prod_{i=1}^{n} R_i(t)$

From Equation 17.2, it is clear that the reliability of a series system decreases as the number of subsystems or components increases (see Figure 17.2).

Figure 17.2 Effects of part reliability and number of parts on system reliability in series configuration.
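As a minimal sketch of the product rule in Equation 17.2, the following snippet shows how quickly series reliability falls as parts are added. The function name `series_reliability` and the part reliability of 0.99 are illustrative assumptions, not values from the text.

```python
import math

def series_reliability(reliabilities):
    """Static reliability of independent subsystems in series (product rule)."""
    return math.prod(reliabilities)

# Hypothetical case: identical parts, each with reliability 0.99.
for n in (1, 10, 50, 100):
    print(n, round(series_reliability([0.99] * n), 4))
```

The printed values decrease monotonically with n, which is the trend Figure 17.2 illustrates.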

Assume that the time-to-failure distribution for each subsystem/component of a system is exponential and has a constant failure rate, λ_i. For the exponential distribution, the component reliability is

(17.6) $R_i(t) = e^{-\lambda_i t}$

Hence, the system reliability is given by:

(17.7) $R_s(t) = \prod_{i=1}^{n} e^{-\lambda_i t} = \exp\left(-t \sum_{i=1}^{n} \lambda_i\right)$

The system also has an exponential time-to-failure distribution, and the constant system failure rate is given by:

(17.8) $\lambda_s = \sum_{i=1}^{n} \lambda_i$

and the mean time between failures for the system is

(17.9) $\mathrm{MTBF}_s = \frac{1}{\lambda_s} = \frac{1}{\sum_{i=1}^{n} \lambda_i}$

The system hazard rate is constant if all the components of the system are in series and have constant hazard rates. The assumptions of a constant hazard rate and a series system make the mathematics simple, but this is rarely the case in practice.
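The constant-failure-rate series case can be sketched in a few lines. The failure rates below are hypothetical (per hour), and the helper name `series_exponential` is our own; the snippet only illustrates that the failure rates add and that the system MTBF is the reciprocal of their sum.

```python
import math

def series_exponential(failure_rates, t):
    """Return (system failure rate, system MTBF, R_s(t)) for a series system
    of independent components with constant failure rates."""
    lam_s = sum(failure_rates)          # failure rates add in series
    return lam_s, 1.0 / lam_s, math.exp(-lam_s * t)

rates = [1e-4, 2e-4, 5e-5]              # hypothetical failure rates, per hour
lam_s, mtbf, r_1000 = series_exponential(rates, 1000.0)

# Cross-check: product of the individual component reliabilities at 1000 hours.
r_product = math.prod(math.exp(-lam * 1000.0) for lam in rates)
print(lam_s, mtbf, r_1000, r_product)
```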

For the general case, taking the log of both sides of Equation 17.5, we have

(17.10) $\ln R_s(t) = \sum_{i=1}^{n} \ln R_i(t)$

Also recall that

(17.11) $c17-math-0011$

which means that

(17.12) $c17-math-0012$

(17.13) $c17-math-0013$

Applying this to Equation 17.10, we have

(17.14) $h_s(t) = \sum_{i=1}^{n} h_i(t)$

Thus, the hazard rate for the system is the sum of the hazard rates of the subsystems, provided that the time-to-failure random variables for all the subsystems are independent. This holds regardless of the form of the subsystems' time-to-failure pdfs.
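The additivity of hazard rates can be checked numerically. The sketch below assumes two independent Weibull subsystems with hypothetical shape and scale parameters, and compares the sum of the individual hazard rates with a finite-difference estimate of −d/dt ln R_s(t). All names are illustrative.

```python
import math

def weibull_reliability(t, beta, eta):
    return math.exp(-((t / eta) ** beta))

def weibull_hazard(t, beta, eta):
    return (beta / eta) * (t / eta) ** (beta - 1)

# Hypothetical (shape, scale in hours) pairs for two subsystems in series.
UNITS = [(1.5, 1000.0), (2.0, 800.0)]

def system_reliability(t):
    return math.prod(weibull_reliability(t, b, e) for b, e in UNITS)

def system_hazard_numeric(t, dt=1e-3):
    # h(t) = -d/dt ln R(t), estimated with a central difference.
    return -(math.log(system_reliability(t + dt))
             - math.log(system_reliability(t - dt))) / (2.0 * dt)

t = 200.0
h_sum = sum(weibull_hazard(t, b, e) for b, e in UNITS)
h_num = system_hazard_numeric(t)
print(h_sum, h_num)   # the two values should agree closely
```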

Example 17.2

Two subsystems of a system functionally operate in series and have the time to failure random variable with the pdfs given by

$c17-math-5020$

where η_i is the parameter for the pdf for the ith subsystem. Time is measured in hours. We want to answer the following five parts.

(a) Find the system hazard function, h_S(t).
(b) Find the system reliability function, R_S(t).
(c) Find the pdf, f_S(t), for the time to failure for the system.
(d) If η₁ = 300 hours and η₂ = 400 hours, find R_S(20 hours).
(e) For the values in (d), find t* such that R_S(t*) = 0.90.

Solution:

We can easily notice that f_i(t) is a Weibull distribution with

$c17-math-5021$

So, the reliability function and the hazard function for each subsystem are

$c17-math-5022$

$c17-math-5023$

(a) Find the system hazard function, h_S(t). Using Equation 17.14, we have

$c17-math-5024$
where 1/η = 1/η₁ + 1/η₂.
(b) Find the system reliability function, R_S(t).
From part (a), and using Equation 17.5, we have $c17-math-5025$
(c) Find the system pdf, f_S(t).
$c17-math-5027$
(d) If η₁ = 300 hours and η₂ = 400 hours, find R_S(20 hours). First, 1/η = 1/η₁ + 1/η₂ = 1/300 + 1/400, so η = 171.4 hours.
Now,

$c17-math-5029$
(e) For the values in (d), find t* such that R_S(t*) = 0.90:

$c17-math-5030$

17.3 Products with Redundancy

Redundancy exists when one or more of the parts of a system can fail and the system will still be able to function with the parts that remain operational. Two common types of redundancy are active and standby. In active redundancy, all the parts are energized and operational during the operation of a system. In active redundancy, the parts will consume life at the same rate as the individual components.

In standby redundancy, some parts do not contribute to the operation of the system, and they get switched on only when there are failures in the active parts. In standby redundancy, the parts in standby ideally should last longer than the parts in active redundancy.

There are three conceptual types of standby redundancy: cold, warm, and hot. In cold standby, the secondary parts are shut down until needed. This lowers the number of hours that the part is active and typically assumes negligible consumption of useful life, but the transient stresses on the parts during switching may be high. This transient stress can cause faster consumption of life during switching. In warm standby, the secondary parts are usually active, but are idling or unloaded. In hot standby, the secondary parts form an active parallel system. The life of the hot standby parts is assumed to be consumed at the same rate as that of the active parts.

17.3.1 Active Redundancy

An active redundant system is a standard “parallel” system that fails only when all components have failed. Sometimes, the parallel system is called a 1-out-of-n or (1, n) system, which implies that only one (or more) out of n subsystems has to operate for the system to be operational or functional. Thus, a series system is an n-out-of-n system. The reliability block diagram of a parallel system is given in Figure 17.3.

Figure 17.3 Active redundant system.

The units need not be physically connected in parallel for the system to be called a parallel system. The system will fail if all of the subsystems or all of the components fail by time t; equivalently, the system will survive the mission time, t, if at least one of the units survives to time t. Then, the system reliability can be expressed as

(17.15) $c17-math-0015$

where Q_s(t) is the probability of system failure, or

(17.16) $c17-math-0016$

under the assumption that the time to failure random variables for all the subsystems are probabilistically independent.

The system reliability for a mission time, t, is

(17.17) $R_s(t) = 1 - \prod_{i=1}^{n} \left[1 - R_i(t)\right]$

For the static situation or for an implied fixed value of t, we have an equation similar to Equation 17.2, which is given by

(17.18) $R_s = 1 - \prod_{i=1}^{n} (1 - R_i)$

Figure 17.4 shows the effect of component reliability on system reliability for an active parallel system for a static situation.

Figure 17.4 Effect of part reliability and number of parts on system reliability in an active redundant system.

We can use Equation 17.2 and Equation 17.18 to calculate the reliability of systems that have subsystems in series and in parallel. This is illustrated in Example 17.3.
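As a minimal sketch of how the series and parallel rules combine, consider a hypothetical system in which unit A is in series with a parallel pair (B, C). The reliabilities and the function names `series` and `parallel` are assumptions made for illustration only.

```python
import math

def series(reliabilities):
    """All blocks must work: product of reliabilities."""
    return math.prod(reliabilities)

def parallel(reliabilities):
    """At least one block must work: one minus the product of unreliabilities."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)

# Hypothetical system: A in series with (B parallel C).
r_a, r_b, r_c = 0.95, 0.90, 0.90
r_system = series([r_a, parallel([r_b, r_c])])
print(round(r_system, 4))
```

Nesting the two helpers in this way mirrors the reduction of a series-parallel block diagram from the innermost group outward.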

After we know the system reliability function from Equation 17.17, the system hazard rate is given by:

(17.19) $h_S(t) = \frac{f_S(t)}{R_S(t)}$

where f_S(t) is the system time-to-failure probability density function (pdf). The mean life, or the expected life, of the system is determined by:

(17.20) $c17-math-0020$

where T_S is the time to failure for the system.

For example, if the system consists of two units (n = 2) with an exponential failure distribution with constant failure rates λ₁ and λ₂, then the system mean life is given by Equation 17.21. Note that the system mean life is not equal to the reciprocal of the sum of the components' constant failure rates, and we can prove that the system hazard rate is not constant over time, although the individual unit failure rates are constant.

(17.21) $c17-math-0021$
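The two-unit exponential case can be explored with a short sketch. It uses the standard result for two independent exponential units in active parallel, R(t) = e^(−λ₁t) + e^(−λ₂t) − e^(−(λ₁+λ₂)t), whose integral gives a mean life of 1/λ₁ + 1/λ₂ − 1/(λ₁ + λ₂). The failure rates and function names are hypothetical.

```python
import math

def parallel2_reliability(t, l1, l2):
    return math.exp(-l1 * t) + math.exp(-l2 * t) - math.exp(-(l1 + l2) * t)

def parallel2_pdf(t, l1, l2):
    # f(t) = -dR/dt
    return (l1 * math.exp(-l1 * t) + l2 * math.exp(-l2 * t)
            - (l1 + l2) * math.exp(-(l1 + l2) * t))

def parallel2_hazard(t, l1, l2):
    return parallel2_pdf(t, l1, l2) / parallel2_reliability(t, l1, l2)

def parallel2_mean_life(l1, l2):
    return 1.0 / l1 + 1.0 / l2 - 1.0 / (l1 + l2)

l1, l2 = 1e-3, 2e-3                      # hypothetical failure rates, per hour
print(parallel2_mean_life(l1, l2))       # differs from 1 / (l1 + l2)
for t in (0.0, 10.0, 100.0, 1000.0):
    print(t, parallel2_hazard(t, l1, l2))  # changes with t, so it is not constant
```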

Example 17.4

Consider an electronics system consisting of two parts with constant failure rates as given below:

$c17-math-5034$

$c17-math-5035$

Assume that failures are governed by a constant failure rate λ_i for the ith part. Determine:

(a) The system reliability for a 1000-hour mission
(b) The system MTTF
(c) The failure probability density function
(d) The system “failure rate.”

Solution:

For a constant failure rate, the reliability R_i of the ith part has the form:

$c17-math-5036$

For a parallel system:

$c17-math-5037$

The failure probability density function is:

$c17-math-5038$

Substituting numbers in the equation for system reliability, we get the answer for part (a):

$c17-math-5039$

The MTTF (part b) for the parallel system is

$c17-math-5040$

The failure probability density function (part c) for the parallel system is

$c17-math-5041$

The system hazard rate for the parallel system is given by:

$c17-math-5042$

The system failure rate for the parallel system (part (d)) can be obtained by substituting the results in the equation stated above. We will find that h_S(t) is a function of time and is not constant over time.

If the time to failure for all n components is exponentially distributed with MTBF θ, then the MTBF for the system is given by

(17.22) $c17-math-0022$

Here, θ = MTBF for every component or subsystem. Thus, each additional component increases the expected life of the system but at a slower and slower rate. This motivates us to consider standby redundant systems in the next section.
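The diminishing return can be made concrete with a small sketch. It assumes the standard result for n identical exponential units in active parallel, in which the system MTBF is θ times the sum 1 + 1/2 + … + 1/n; the value θ = 1000 hours is hypothetical.

```python
def parallel_mtbf(theta, n):
    """MTBF of n identical exponential units in active parallel."""
    return theta * sum(1.0 / i for i in range(1, n + 1))

theta = 1000.0   # hypothetical component MTBF, hours
previous = 0.0
for n in range(1, 6):
    current = parallel_mtbf(theta, n)
    print(n, round(current, 1), "gain:", round(current - previous, 1))
    previous = current
```

Each added unit contributes θ/n hours to the expected life, so the gain shrinks as n grows.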

17.3.2 Standby Systems

A standby system consists of an active unit or subsystem and one or more inactive (standby) units that become active in the event of the failure of the functioning unit. The failures of active units are signaled by a sensing subsystem, and the standby unit is brought into action by a switching subsystem. The simplest standby configuration is a two-unit system, as shown in Figure 17.6. In general, there will be n units, with (n − 1) of them in standby.

Figure 17.6 Standby system.

Let us now develop the system reliability models for the standby situation with two subsystems. Let f_i(t) be the pdf for the time to failure random variable, T_i, for the ith unit, i = 1, 2, and f_S(t) be the pdf for the time to failure random variable, T_S, for the system. Let us first consider a situation with only two units under the assumption that the sensing and the switching mechanisms are perfect. Thus, the second unit is switched on when the first unit fails, so T_S = T₁ + T₂, and the pdf of T_S is the convolution of the pdfs of T₁ and T₂. Hence,

(17.23) $c17-math-0023$

Similarly, if we have a primary active component and two standby components, we have

(17.24) $c17-math-0024$

We can evaluate Equation 17.23 when both T₁ and T₂ have the exponential distribution as below:

(17.25) $c17-math-0025$

From Equation 17.25, we have

(17.26) $c17-math-0026$

The MTBF_S, θ_S, for the system is given by

(17.27) $c17-math-0027$

as is expected since T_S = T₁ + T₂ and E[T_S] = E[T₁] + E[T₂].

When the active and the standby units have equal constant failure rates, λ, and the switching and sensing units are perfect, the reliability function for such a system is given by

(17.28) $c17-math-0028$

We can rewrite Equation 17.26 in the form

(17.29) $c17-math-0029$

or as shown in Equation 17.30, where AR₍₂₎ is the contribution to the reliability value of the system by the second component

(17.30) $c17-math-0030$

This can easily be generalized to a situation where we have one primary component and two or more standby components. For example, if we have one primary component and (n − 1) standby components, and all have exponential time to failure with a constant failure rate of λ, then the system reliability function is given by

(17.31) $c17-math-0031$
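A minimal sketch of the n-unit standby case follows. It assumes the standard result for identical exponential units with perfect sensing and switching, in which the system survives to time t if fewer than n failures occur, giving R_S(t) = e^(−λt) Σ (λt)^i / i! for i = 0, …, n − 1. The function name and sample values are illustrative.

```python
import math

def standby_reliability(t, lam, n):
    """One active unit plus (n - 1) cold standby units, identical exponential
    units, perfect sensing and switching (assumed)."""
    x = lam * t
    return math.exp(-x) * sum(x ** i / math.factorial(i) for i in range(n))

lam = 1.0 / 4000.0    # hypothetical failure rate, per hour
for n in (1, 2, 3):
    print(n, round(standby_reliability(800.0, lam, n), 6))
```

With n = 1 the expression reduces to the single-unit reliability e^(−λt), and each additional standby unit adds one more term to the sum.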

17.3.3 Standby Systems with Imperfect Switching

Switching and sensing systems are not perfect. There are many ways these systems can fail. Let us look at a situation where the switching and sensing unit simply fails to operate when called upon to do its job. Let the probability that the switch works when required be p_SW. Then, the system reliability for one primary component and one standby is given by

(17.32) $c17-math-0032$

When the main and the standby units have exponential time-to-failure distributions, we can use Equation 17.30 to develop the following equation:

(17.33) $c17-math-0033$

Now, let us generalize Equation 17.32, where the switching and sensing unit is dynamic and the switching and sensing unit starts its life at the same time the active or the main unit starts its life. Let T_SW denote the time to failure for the switching and sensing unit, where its pdf and reliability functions are denoted by f_SW(t) and R_SW(t), respectively. Then the reliability of the system is given by

(17.34) $c17-math-0034$

If the time to failure of the switching and sensing unit follows an exponential distribution with a failure rate of λ_SW, then Equation 17.34 reduces to

(17.35) $c17-math-0035$

If we consider a special case where both the main unit and the standby units have exponential time-to-failure distributions with parameter λ, then Equation 17.35 reduces to

(17.36) $c17-math-0036$

Example 17.6

A B7XX plane has two similar computers onboard for flight control functions: one that is operational and the second as an active standby. The time to failure for each computer follows an exponential distribution with an MTBF of 4000 hours.

Find the reliability of the computer system (consisting of both computers) for 800 hours when the switching is perfect and the second computer is instantaneously switched on when the first computer fails. Also find the MTBF of the computer system.

Solution:

We have, using Equation 17.28,

$c17-math-5044$

Find the MTBF of the computer system when the switching and sensing unit is not perfect and the switching mechanism has a reliability of 0.98 when it is required to function.

Solution:

We have

$c17-math-5045$

Find the reliability of the computer system for 800 hours when the switching mechanism is not perfect and is dynamic. The time to failure for the switching mechanism also has exponential distribution with MTBF of 12,000 hours.

Solution:

We have

$c17-math-5046$
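The three parts of this example can be reproduced with a short sketch. The closed forms below are the standard results for two identical exponential units, which we assume correspond to Equations 17.28, 17.33, and 17.36; the function names are our own, while the numerical inputs come from the example statement.

```python
import math

THETA = 4000.0            # MTBF of each computer, hours
LAM = 1.0 / THETA         # failure rate of each computer
T = 800.0                 # mission time, hours

def r_perfect_switch(t, lam):
    # Perfect switching: R(t) = exp(-lam t) (1 + lam t)
    return math.exp(-lam * t) * (1.0 + lam * t)

def mtbf_static_switch(lam, p_sw):
    # Switch works on demand with probability p_sw: MTBF = (1 + p_sw) / lam
    return (1.0 + p_sw) / lam

def r_dynamic_switch(t, lam, lam_sw):
    # Switch ages from time zero with its own exponential time to failure.
    return math.exp(-lam * t) * (1.0 + (lam / lam_sw) * (1.0 - math.exp(-lam_sw * t)))

print(round(r_perfect_switch(T, LAM), 4))                  # perfect switching, 800 hours
print(mtbf_static_switch(LAM, 1.0))                        # MTBF with perfect switching
print(mtbf_static_switch(LAM, 0.98))                       # MTBF with switch reliability 0.98
print(round(r_dynamic_switch(T, LAM, 1.0 / 12000.0), 4))   # dynamic switch, 800 hours
```

Setting p_sw = 1 in `mtbf_static_switch` recovers the perfect-switching MTBF, which gives a quick consistency check between the two cases.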

17.3.4 Shared Load Parallel Models

A situation that is common in engineering systems and their design is called a shared load parallel model. In this case, the two parallel components/units share a load together. Thus, the load on each unit is half of the total load. When one of the units fails, the other unit must take the full load. An example of a shared load parallel configuration is one in which two bolts are used to hold a machine element, and if one of the bolts fails, the other bolt must take the full load. The stresses on the bolt now will be doubled, and this will result in an increased hazard rate for the surviving bolt.

Let f₁_h(t) and f₂_h(t) be pdfs for the time to failure for the two units under half or shared load, and f₁_F(t) and f₂_F(t) be the pdfs under the full load for each unit, respectively. In this case, we can prove that the pdf for the time to failure of the system is

(17.37) $c17-math-0037$

The reliability function for the system if both units are identical (such as identical bolts), where we have f₁_h(t) = f₂_h(t) = f_h(t) and f₁_F(t) = f₂_F(t) = f_F(t), can be shown as

(17.38) $c17-math-0038$

If both f_h(t) and f_F(t) follow exponential distributions with parameters λ_h and λ_F, respectively, then it can be shown that the reliability function for the system is

(17.39) $c17-math-0039$
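The exponential shared-load case can be checked numerically. The sketch below assumes the standard closed form for two identical units, R(t) = e^(−2λ_h t) + [2λ_h / (2λ_h − λ_F)] (e^(−λ_F t) − e^(−2λ_h t)), valid when 2λ_h ≠ λ_F, and compares it with a direct integration over the time of the first failure. The rates and names are hypothetical.

```python
import math

def shared_load_closed_form(t, lam_h, lam_f):
    """Two identical exponential units sharing a load (requires 2*lam_h != lam_f)."""
    a = 2.0 * lam_h       # rate of the first failure while both units share the load
    return math.exp(-a * t) + a / (a - lam_f) * (math.exp(-lam_f * t) - math.exp(-a * t))

def shared_load_numeric(t, lam_h, lam_f, steps=20000):
    """Same quantity by trapezoidal integration over the first-failure time x."""
    a = 2.0 * lam_h
    def integrand(x):
        # First failure at x, then the survivor lasts t - x under full load.
        return a * math.exp(-a * x) * math.exp(-lam_f * (t - x))
    dx = t / steps
    area = 0.5 * (integrand(0.0) + integrand(t))
    area += sum(integrand(i * dx) for i in range(1, steps))
    return math.exp(-a * t) + area * dx

lam_h, lam_f = 1e-4, 5e-4   # hypothetical half-load and full-load rates, per hour
print(shared_load_closed_form(1000.0, lam_h, lam_f))
print(shared_load_numeric(1000.0, lam_h, lam_f))
```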

17.3.5 (k, n) Systems

A system consisting of n components is called a k-out-of-n or (k, n) system if the system operates only when at least k components are in an operating state. The reliability block diagram (Figure 17.8) for the k-out-of-n system is drawn similar to the parallel system, but in this case at least k items need to be operating for the system to be functional.

Figure 17.8 k-out-of-n system.

In this configuration, the system works if and only if at least k components out of the n components work, 1 ≤ k ≤ n. When R_i = R(t) for all i, with the assumption that the time to failure random variables are independent, we have

(17.40) $c17-math-0040$

and the probability of system failure, where Q(t) = 1 − R(t), is

(17.41) $c17-math-0041$

The probability density function can be determined by

(17.42) $c17-math-0042$

and the system hazard rate is given by

(17.43) $c17-math-0043$

If R(t) = e^(−t/θ), for an exponential case, the MTBF for the system is given by

(17.44) $c17-math-0044$

The reliability function for the system is mathematically complex to compute in a closed form when the components have different failure distributions. We will present the methodology later on in this chapter to solve this problem.
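For identical, independent components, the k-out-of-n reliability is a binomial tail sum, and for exponential components the MTBF is θ times the sum of 1/i for i = k, …, n. A minimal sketch, with illustrative function names and values:

```python
import math

def k_out_of_n_reliability(k, n, r):
    """Probability that at least k of n independent, identical components work."""
    return sum(math.comb(n, i) * r ** i * (1.0 - r) ** (n - i)
               for i in range(k, n + 1))

def k_out_of_n_mtbf(k, n, theta):
    """MTBF for identical exponential components, each with MTBF theta."""
    return theta * sum(1.0 / i for i in range(k, n + 1))

r = 0.9   # hypothetical component reliability
print(k_out_of_n_reliability(1, 3, r))   # k = 1 is the active parallel case
print(k_out_of_n_reliability(2, 3, r))   # 2-out-of-3
print(k_out_of_n_reliability(3, 3, r))   # k = n is the series case
print(k_out_of_n_mtbf(2, 3, 1000.0))
```

The k = 1 and k = n cases reproduce the parallel and series results, which is a convenient sanity check.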

17.3.6 Limits of Redundancy

It is often difficult to realize the benefits of redundancy if there are common mode failures, load sharing, and switching and standby failures. Common mode failures are caused by phenomena that create dependencies between two or more redundant parts and which then cause them to fail “simultaneously.” Common mode failures can be caused by many things, such as common electric connections, shared environmental stresses, and common maintenance problems.

Load sharing failures occur when the failure of one part increases the stress level of other parts. This increased stress level can affect the life of the active parts. For redundant engines, motors, pumps, structures, and many other systems and devices in active parallel setup, the failure of one part may increase the load on the other parts and decrease their times to failure (or increase their hazard rates).

Several common assumptions are made regarding the switching and sensing of a standby system. Regarding switching, it is often assumed that switching is in one direction only, that switching devices respond only when directed to switch by the monitor, and that switching devices do not fail if not energized. Regarding standby, the general assumption is that standby nonoperating units cannot fail if not energized. When any of these idealizations are not met, switching and standby failures occur. Monitor or sensing failures include both dynamic (failure to switch when active path fails) and static (switching when not required) failures.

17.4 Complex System Reliability

If the system architecture cannot be decomposed into some combination of series-parallel structures, it is deemed a complex system. Three methods for the reliability analysis of a complex system are described below, using the system in Figure 17.9 as an example.

Figure 17.9 A complex system.

17.4.1 Complete Enumeration Method

The complete enumeration method is based on a list of all possible combinations of states of the subsystems. Table 17.1 lists 2⁵ = 32 system states, which are all the possible states of the system given in Figure 17.9 based on the states of the subsystems. The symbol O stands for “system in operating state,” and F stands for “system in failed state.” Letters in uppercase denote a unit in an operating state, and lowercase letters denote a unit in a failed state.

Table 17.1 Complete enumeration example

System description	System condition	System status
All components operable	ABCDE	O
One unit in failed state	aBCDE	O
	AbCDE	O
	ABcDE	O
	ABCdE	O
	ABCDe	O
Two units in failed state	abCDE	F
	aBcDE	O
	aBCdE	O
	aBCDe	O
	AbcDE	F
	AbCdE	O
	AbCDe	O
	ABcdE	O
	ABcDe	O
	ABCde	O
Three units in failed state	ABcde	F
	AbCde	O
	AbcDe	F
	AbcdE	F
	aBCde	O
	aBcDe	O
	aBcdE	O
	abCDe	F
	abCdE	F
	abcDE	F
Four units in failed state	Abcde	F
	aBcde	F
	abCde	F
	abcDe	F
	abcdE	F
All five units in failed state	abcde	F

Each combination representing the system status can be written as a product of the probabilities of units being in a given state; for example, the second combination in Table 17.1 can be written as (1 − R_A)R_BR_CR_DR_E, where (1 − R_A) denotes the probability of failure of unit A by time t. The system reliability can be written as the sum of all the combinations for which the system is in operating state, O, that is,

(17.45) $c17-math-0045$

After simplification, the system reliability can be represented as

(17.46) $c17-math-0046$
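The enumeration can be sketched directly from Table 17.1. The list below contains the 18 states marked O in the table; the function names and the component reliability of 0.9 are illustrative assumptions.

```python
# States marked O (operating) in Table 17.1.
# Uppercase letter: unit operating; lowercase letter: unit failed.
OPERATING_STATES = [
    "ABCDE",
    "aBCDE", "AbCDE", "ABcDE", "ABCdE", "ABCDe",
    "aBcDE", "aBCdE", "aBCDe", "AbCdE", "AbCDe", "ABcdE", "ABcDe", "ABCde",
    "AbCde", "aBCde", "aBcDe", "aBcdE",
]

def state_probability(state, rel):
    """Product of R for operating units and (1 - R) for failed units."""
    p = 1.0
    for ch in state:
        r = rel[ch.upper()]
        p *= r if ch.isupper() else 1.0 - r
    return p

def system_reliability(rel):
    """Sum of the probabilities of all operating states."""
    return sum(state_probability(s, rel) for s in OPERATING_STATES)

rel = {name: 0.9 for name in "ABCDE"}   # hypothetical unit reliabilities
print(round(system_reliability(rel), 4))
```

Because the states are mutually exclusive, their probabilities can be added without any correction terms, at the cost of examining all 2⁵ combinations.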

17.4.2 Conditional Probability Method

The conditional probability method is based on the law of total probability, which allows system decomposition by a selected unit and its state at time t. For example, system reliability is equal to the reliability of the system given that unit A is in its operating state at time t, denoted by R_S|A_S, times the reliability of unit A, plus the reliability of the system, given that unit A is in a failed state at time t, R_S|A_F, times the unreliability of unit A, or

(17.47) $c17-math-0047$

This decomposition process continues until each term is written in terms of the reliability and unreliability of each of the units.

As an example, consider the system given in Figure 17.9 and decompose the system using unit C. Then, the system reliability can be written as

(17.48) $c17-math-0048$

If the unit C is in the operating state at time t, the system reduces to the configuration shown in Figure 17.10. Therefore, the system reliability, given that unit C is in its operating state at time t, is equal to the series-parallel combination as shown above, or

(17.49) $c17-math-0049$

Figure 17.10 System reduction when unit C is operating.

Figure 17.11 System reduction when unit C fails.

If unit C is in a failed state at time t, the system reduces to the configuration given in Figure 17.11. Then the system reliability, given that unit C is in a failed state, is given by

(17.50) $c17-math-0050$

The system reliability is obtained by substituting Equation 17.49 and Equation 17.50 into Equation 17.48:

(17.51) $c17-math-0051$

The system reliability is expressed in terms of the reliabilities of its components. Simplification of Equation 17.51 gives the same expression as Equation 17.46.
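The decomposition on unit C can be sketched as follows. The reduced structures are inferred from the operating states in Table 17.1 and should be read as an assumption: with C operating, the system needs A or B; with C failed, it needs B together with at least one of D and E. A brute-force enumeration over the same inferred structure is included as a cross-check.

```python
from itertools import product

def reliability_by_conditioning(r):
    """Law of total probability, conditioning on the state of unit C."""
    given_c_up = 1.0 - (1.0 - r["A"]) * (1.0 - r["B"])               # A parallel B
    given_c_down = r["B"] * (1.0 - (1.0 - r["D"]) * (1.0 - r["E"]))  # B in series with (D parallel E)
    return r["C"] * given_c_up + (1.0 - r["C"]) * given_c_down

def works(x):
    """Structure inferred from Table 17.1 (assumption): True if the system operates."""
    return (x["B"] and (x["C"] or x["D"] or x["E"])) or (x["A"] and x["C"])

def reliability_by_enumeration(r):
    names = sorted(r)
    total = 0.0
    for bits in product((True, False), repeat=len(names)):
        x = dict(zip(names, bits))
        if works(x):
            p = 1.0
            for name in names:
                p *= r[name] if x[name] else 1.0 - r[name]
            total += p
    return total

r = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6, "E": 0.5}   # hypothetical values
print(reliability_by_conditioning(r), reliability_by_enumeration(r))
```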

17.4.3 Concept of Coherent Structures

In general, the concept of coherent systems can be used to determine the reliability of any system (Barlow and Proschan 1975; Leemis 1995; Rausand and Hoyland 2003). The performance of each of the n components in the system is represented by a binary indicator variable, x_i, which takes the value 1 if the ith component functions and 0 if the ith component fails. Similarly, the binary variable ϕ indicates the state of the system, and ϕ is a function of x = (x₁, … , x_n).

The function ϕ(x) is called the structure function of the system. The structure function is represented by using the concept of minimal paths and minimal cuts. A minimal path is the minimal set of components whose functioning ensures the functioning of the system. A minimal cut is the minimal set of components whose failures would cause the system to fail. Let α_j(x) be the jth minimal path series structure for path A_j, j = 1, … , p, and β_k(x) be the kth minimal parallel cut structure for cut B_k, k = 1, … , s. Then we have

(17.52) $c17-math-0052$

and

(17.53) $c17-math-0053$

The structure function of the system using minimum paths is given by Equation 17.54, and the structure function using minimum cuts is given by Equation 17.55, as follows:

(17.54) $c17-math-0054$

(17.55) $c17-math-0055$

Let us consider the bridge structure given in Figure 17.12. For this structure, we have four minimal paths and four minimal cuts, and their structure functions are given below:

$c17-math-5001$

Figure 17.12 Reliability block diagram of a bridge structure.

Then the reliability of the system is given by

(17.56) $c17-math-0056$

where X is the random vector of the states of the components (X₁, … , X_n).

We can develop the structure function by putting structure functions of minimum paths and minimum cuts in Equation 17.54 and Equation 17.55, respectively. When we do the expansion, we should remember that each x_i is a binary variable that takes values of 0 or 1, and hence, $c17-math-5002$ for any positive integer n is also a binary variable and takes the value of 0 or 1. If we do the expansion using Equation 17.54 or Equation 17.55, we can prove that the structure function for the system in Figure 17.12 is

(17.57) $c17-math-0057$

If R_i is the reliability of the ith component, then we know that

(17.58) $c17-math-0058$

and the system reliability for the bridge structure is given by

(17.59) $c17-math-0059$

If all R_i = R = 0.9, we have

(17.60) $c17-math-0060$

The exact calculations for R_S are generally very tedious because the paths and the cuts are dependent, since they may contain the same component. Bounds on system reliability are given by

(17.61) $c17-math-0061$

Using these bounds for the bridge structure, we have, when R_i = R = 0.9, the upper bound, R_U, on system reliability, R_S, is

(17.62) $c17-math-0062$

and the lower bound, R_L, is

(17.63) $c17-math-0063$

The bounds on system reliability using the concepts of minimum paths and cuts can be improved.
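The bridge calculation can be sketched with minimal paths and cuts. The component numbering below is an assumption, since Figure 17.12 is not reproduced here: components 1 and 4 form one branch, 2 and 5 the other, and 3 is the bridging element. The exact value comes from enumerating all component states, and the bounds use the minimal-cut product (lower) and the minimal-path product (upper).

```python
import math
from itertools import product

# Assumed numbering: 1-4 and 2-5 are the two branches, 3 is the bridge.
MIN_PATHS = [{1, 4}, {2, 5}, {1, 3, 5}, {2, 3, 4}]
MIN_CUTS = [{1, 2}, {4, 5}, {1, 3, 5}, {2, 3, 4}]

def phi(x):
    """Structure function from minimal paths: 1 if any path is fully working."""
    return 1 - math.prod(1 - math.prod(x[i] for i in path) for path in MIN_PATHS)

def exact_reliability(rel):
    comps = sorted(rel)
    total = 0.0
    for states in product((0, 1), repeat=len(comps)):
        x = dict(zip(comps, states))
        p = math.prod(rel[c] if x[c] else 1.0 - rel[c] for c in comps)
        total += p * phi(x)
    return total

def reliability_bounds(rel):
    lower = math.prod(1.0 - math.prod(1.0 - rel[i] for i in cut) for cut in MIN_CUTS)
    upper = 1.0 - math.prod(1.0 - math.prod(rel[i] for i in path) for path in MIN_PATHS)
    return lower, upper

rel = {i: 0.9 for i in range(1, 6)}
lower, upper = reliability_bounds(rel)
exact = exact_reliability(rel)
print(round(lower, 6), round(exact, 6), round(upper, 6))
```

With equal component reliabilities the result does not depend on how the components are numbered, so the assumed labeling affects only the unequal-reliability case.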

Example 17.11

Consider a system, shown in Figure 17.13, with six components, which has the following reliability block diagram.

The reliabilities of the components are as follows:

$c17-math-5056$

$c17-math-5057$

$c17-math-5058$

$c17-math-5059$

$c17-math-5060$

$c17-math-5061$

Find the exact reliability of the system using the series-parallel model.

Figure 17.13 Six component series-parallel model.

Solution:

$c17-math-5062$

Find all the minimum paths and minimum cuts for the above system.

Solution:

Components for minimal paths	Components for minimal cuts
1, 2, 5	5, 6
1, 2, 6	1, 3, 4
3, 5	2, 3, 4
3, 6
4, 5
4, 6
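The minimal cuts in the table can be cross-checked from the minimal paths: a cut is a set of components that intersects every minimal path, and it is minimal if no proper subset is also a cut. The brute-force search below is a sketch suited to small systems only; the function name is our own.

```python
from itertools import combinations

# Minimal paths as listed in the table above.
MIN_PATHS = [{1, 2, 5}, {1, 2, 6}, {3, 5}, {3, 6}, {4, 5}, {4, 6}]

def minimal_cuts(min_paths):
    """Find all minimal cut sets by searching candidate sets in increasing size."""
    components = sorted(set().union(*min_paths))
    cuts = []
    for size in range(1, len(components) + 1):
        for candidate in combinations(components, size):
            c = set(candidate)
            if any(found <= c for found in cuts):
                continue                      # contains a smaller cut, so not minimal
            if all(c & path for path in min_paths):
                cuts.append(c)                # breaks every minimal path
    return cuts

print(minimal_cuts(MIN_PATHS))
```

Searching by increasing size guarantees that every set kept is minimal, because any smaller cut it contained would already have been found.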

Find the lower bound and the upper bound on the system reliability using the equations for the bounds on system reliability, which uses the minimum paths and minimum cuts.

Solution:

Using Equation 17.61, we have

$c17-math-5063$

and

$c17-math-5064$

Thus, the reliability bounds are 0.970617 ≤ R_S ≤ 0.999211.

The lower bound is much tighter because there is less dependency among the minimum cuts (fewer components are shared between different minimum cuts) than among the minimum paths (where some components are part of several minimum paths).

17.5 Summary

The reliability of the system is a function of the reliabilities of its components and building blocks. To design, analyze, and evaluate the reliability and maintainability characteristics of a system, there must be an understanding of the system's relationships to all the subsystems, assemblies, and components. Many times this can be accomplished through logical and mathematical models. Engineering analysis of a system has to be conducted in order to develop a reliability model. Based on this analysis, a reliability block diagram is developed, which can be used to calculate various measures of reliability and maintainability. A reliability block diagram is a pictorial way of showing the success or failure combinations for a system. A system reliability block diagram presents a logical relationship of the system, subsystems, and components.

In a series system, all subsystems must operate successfully if the system is to function or operate successfully. This implies that the failure of any subsystem will cause the entire system to fail. Redundancy is a strategy to resolve this problem. Redundancy exists when one or more of the parts of a system can fail and the system will still be able to function with the parts that remain operational. Two common types of redundancy are active and standby. In active redundancy, all the parts are energized and operational during the operation of a system. In standby redundancy, some parts do not contribute to the operation of the system, and they get switched on only when there are failures in the active parts. In standby redundancy, the parts in standby ideally should last longer than the parts in active redundancy. It is often difficult to realize the benefits of redundancy if there are common mode failures, load sharing, and switching and standby failures. In addition to series systems, there are complex systems. If the system architecture cannot be decomposed into some combination of series-parallel structures, it is deemed a complex system. These two types of systems, series-parallel and complex, require different strategies for monitoring and evaluating system reliability.

Problems

17.1 The reliability block diagram of a system is given below. The number in each box is the reliability of the component. Find the reliability of the system.

Thus, A, B, and C are three subsystems that are in parallel.

17.2 The reliability block diagram of a system is given below. The number in each box is the reliability of the component. Find the reliability of the system.

17.3 There are three components, A, B, and C, and they are represented by different blocks in the following two reliability block diagrams. Both reliability block diagrams use the same component twice. Let the reliabilities of the components be denoted by R_A, R_B, and R_C.

(a) Is there a difference in reliability between the two configurations when the failures or success of all the components are independent of each other? Which system configuration or reliability block diagram has higher reliability? Explain your answer.

(b) Which configuration is more susceptible to common mode failure and why? Assume that each component (A, B, and C) can fail primarily by different mechanisms and those mechanisms are affected by different loads.

17.4 The reliability block diagram shown below is a complex system that cannot be decomposed into a “series-parallel” configuration. We want to determine the reliability equation for the system using the conditional probability method. We have decided to use the component B for the decomposition. Draw the two reliability block diagrams that result from “B operating” and “B failed” conditions.

17.5 Consider the system shown in the block diagram and derive an equation for the reliability of the system. R_X denotes the reliability of each component in the system, where X is the name of the component. Stage 3 (four C components in parallel) is a two-out-of-four system; that is, at least two of the four C components need to operate for the stage to operate.

17.6 Derive (manually) the reliability equation of the system shown below. This is a complex dynamic system and the failure distribution for each component is shown in the table.

Component	Failure Distribution	Parameter (in Hour or Equivalent)
A	Weibull 3 parameter	β = 3, η = 1000, γ = 100
B	Exponential	MTBF = 1000
C	Lognormal	Mean = 6, standard deviation = 0.5
D	Weibull 3 parameter	β = 0.7, η = 150, γ = −100
E	Normal	Mean = 250, standard deviation = 15

Find the following for this complex system:

(a) System reliability at 100 hours

(b) System reliability at 0 hours

(d) Time when wearout region begins (use the graph)

(e) How long does it take for 75% of the system to fail?

What happens to the results if you switch the properties of components C and D?

17.7 Consider a series system composed of two subsystems where the first subsystem has a Weibull time to failure distribution with parameters η = 2 and θ = 200 hours. The second subsystem has an exponential time to failure distribution with θ = 300 hours. Develop the following functions for the system:

(a) Find the hazard rate function.

(b) Find the reliability function.

17.8 Consider a parallel system composed of two identical subsystems where the subsystem failure rate is λ, a constant.

(a) Assume a pure parallel arrangement and plot the reliability function using a normalized time scale for the abscissa as

$c17-math-5003$

(b) Assume a standby system with perfect switching and plot this reliability function on the same graph.

(c) Assume that the standby system has a switch with a probability of failure of 0.2, and plot this reliability function on the same graph.

(d) Compare the three systems.

17.9 A system consists of a basic unit and two standby units. All units (basic and the two standby) have an exponential distribution for time to failure with a failure rate of λ = 0.02 failures per hour. The probability that the switch will perform when required is 0.98.

(a) What is the reliability of the system at 50 hours?

(b) What is the expected life or MTTF for the system?

17.10 Consider a two-unit pure parallel arrangement where each subsystem has a constant failure rate of λ, and compare this to a standby redundant arrangement that has a constant switch failure rate of λ_SW. Specifically, what is the maximum permissible value of λ_SW such that the pure parallel arrangement is superior to the standby arrangement?

17.11 Consider a system that has seven components and the system will work if any five of the seven components work (5-out-of-7 system). Each component has a reliability of 0.92 for a given period. Find the reliability of the system.

17.12 Consider the following system, which consists of five components. The reliabilities of the components are as follows:

$c17-math-5004$

$c17-math-5005$

$c17-math-5006$

$c17-math-5007$

$c17-math-5008$

(a) Find the exact reliability of the system using the concepts of series and parallel models.

(b) Find all the minimum paths and minimum cuts for the system.

(d) Find the structure function ϕ(x) using minimum cuts and show that you get the same answer as in part (c).

(e) Find an expression for the reliability of the system based on the structure function developed in part (c). Find the reliability using this equation and show that you get the same answer as you get in part (a).

(f) Find the lower bound, R_L, and the upper bound, R_U, on the system reliability using minimum paths and minimum cuts.

17.13 A system has four components with the following reliability block diagram:

The reliability of the four components is as follows:

$c17-math-5009$

$c17-math-5010$

$c17-math-5011$

$c17-math-5012$

(a) Find the exact reliability of the above system using the concepts of series and parallel models.

(b) Find all the minimum paths and minimum cuts for the above system.

(c) Find the structure function, ϕ(x), of the system using (1) minimum paths and (2) minimum cuts. Show that you get the same answer in both cases. Use the structure function to find the exact value of the system reliability.

(d) Find the lower bound and upper bound on system reliability with the above reliability numbers of the components using all the minimum paths and minimum cuts.
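Hand calculations for the k-out-of-n stages that appear in several of the problems above can be checked numerically. The following is a minimal sketch, assuming identical components that fail independently, so that the number of working components follows a binomial distribution. The function name and the example values are hypothetical and are not taken from any of the problems.

```python
from math import comb


def k_out_of_n_reliability(k, n, r):
    """Reliability of a k-out-of-n system of identical, independent components.

    The system works if at least k of the n components work, so
    R_s = sum_{i=k}^{n} C(n, i) * r^i * (1 - r)^(n - i).
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return sum(comb(n, i) * r**i * (1.0 - r) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Illustrative values only: a 2-out-of-3 system with component reliability 0.9.
    print(k_out_of_n_reliability(2, 3, 0.9))
```

Setting k = n recovers the series result and k = 1 recovers the active parallel result, which gives a quick sanity check on the function.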
