Chapter 8

Monitoring changes in covariance structure

Over the past decades, many successful MSPC application studies have been reported in the literature, for example Al-Ghazzawi and Lennox (2008); Aparisi (1998); Duchesne et al. (2002); Knutson (1988), Kourti and MacGregor (1995, 1996), Kruger et al. (2001), MacGregor et al. (1991), Marcon et al. (2005), Piovoso and Kosanovich (1992), Raich and Çinar (1996), Sohn et al. (2005), Tates et al. (1999), Veltkamp (1993), Wilson (2001). This chapter shows that the conventional MSPC framework, however, may be insensitive to certain fault conditions that affect the underlying geometric relationships of the LV sets. Section 8.1 demonstrates that even substantial alterations in the geometry of the sample projections may not yield acceptance of the alternative hypothesis that the process is out-of-statistical-control.

As the construction of the model and residuals subspaces as well as the control ellipses/ellipsoid for PCA/PLS models originate from data covariance and cross-covariance matrices, this problem is referred to as a change in covariance structure. Any change in these matrices consequently affects the orientation of these subspaces. Thus, in order to detect such alterations, it is imperative to monitor changes in the underlying data covariance structure, which Section 8.2 highlights. This section also presents preliminaries of the statistical local approach that allows constructing non-negative squared statistics that directly relate to the orientation of the model and residual subspaces and the control ellipses/ellipsoid.

This problem has been addressed by Ge et al. (2010, 2011), Kruger and Dimitriadis (2008); Kruger et al. (2007), and Kumar et al. (2002) by introducing a different paradigm to the MSPC-based framework. Blending the determination of the LV sets into the statistical local approach gives rise to the construction of statistics, which Section 8.3 introduces for PCA. These statistics are referred to as primary residuals that follow an unknown but non-Gaussian distribution.

It follows from the central limit theorem that a sum of random variables follow asymptotically a Gaussian distribution. This is taken advantage of in defining improved residuals that are based on the primary residuals. Section 8.4 revisits the simulation examples in Section 8.1 and shows that the deficiency of conventional MSPC can be overcome by deriving monitoring charts from the improved residuals.

Sections 8.5 introduces a fault diagnosis scheme to extract fault signatures for determining potential root causes of abnormal events. Section 8.6 applies the introduced monitoring approach to experimental data from a gearbox system. As in Section 8.4, the application study of the gearbox system highlights that the improved residuals are more sensitive in detecting abnormal process behavior when compared to conventional score variables.

Section 8.7 then discusses some theoretical aspects that stem from blending the statistical local approach into the conventional MSPC framework. This includes a direct comparison between the monitoring functions derived in Sections 8.3 and the score variables obtained by the PCA models and provides a detailed analysis the Hotelling's T2 and Q statistics derived from the improved residuals. The chapter concludes in Section 8.8 with a tutorial session concerning the material covered, including questions as well as homework and project assignments.

8.1 Problem analysis

This section presents examples demonstrating that conventional MSPC-based process monitoring maybe insensitive to changes in the covariance structure of the process variables. A statistic, developed here, describes under which conditions traditional fault detection charts are insensitive to such changes. All stochastic variables in this section are assumed to be of zero mean, which, according to (2.2), implies that z = z0. For simplicity, this section uses the data vector z instead of z0.

8.1.1 First intuitive example

This example involves two process variables constructed from two i.d. source variables of zero mean, s1 and s2, which have a variance of σ12 = 10 and σ22 = 2. The following transformation describes the construction of the process variables

8.1 8.1

Here, T(0) is a transformation matrix and the index (0) refers to the reference covariance structure. Equation (8.1) is an anticlockwise rotation of the original axes by 30°. Thus, 3934 and 3935 are coordinates of the rotated base, while s1 and s2 are coordinates of the original base. The covariance matrix of 3938 is

8.2 8.2

From (8.1), a total of 100 samples for 3939 and 3940 are generated. The plots in column (a) of Figure 8.1 show the corresponding scatter diagram (upper plot) and the Hotelling's T2 statistic. The anticlockwise rotation can be noticed from the orientation of the ellipse. Moreover, the rotation does not affect the length of the semimajor and semiminor. For α = 0.01, 3943, and the values of semimajor and semiminor are 3944 and 3945, respectively. A detailed discussion on how to construct control ellipses is given in Subsection 1.2.3. Specifically designed changes in the covariance structure of 3946 and 3947 are carried out next in order to demonstrate that conventional MSPC may not be able to detect them.

Figure 8.1 Detectable and undetectable changes in covariance structure.

8.1

8.1.1.1 First change in covariance structure

The following transformation changes the covariance structure between 3948 and 3949

8.3 8.3

where T(1) describes an anticlockwise rotation by 45° and the index (1) refers to the first change. Consequently, T(1)T(0) first represents an anticlockwise rotation by 30° to produce z(0) and a subsequent rotation by 45° to determine z(1). The variables z(1) are the coordinates to a base that is rotated by 75° relative to the original cartesian base. The covariance matrix for z(1), 3961, is

8.4 8.4

Using (8.3), a total of 100 samples are generated for z(1). From this set, the column plots associated with (b) in Figure 8.1 show the scatter diagram (upper plot) and the Hotelling's T2 statistic (lower plot). For the scatter diagram, the dashed and solid lines represent the control ellipse for the variable sets z(1) and z(0), respectively. Furthermore, the Hotelling's T2 statistic for each sample is computed with respect to 3967. Since eight points fall outside the confidence regions for the scatter diagram and the control limit of the Hotelling's T2 statistic, the charts correctly indicate an out-of-statistical-control situation. Consequently, this change in covariance structure between 3969 and 3970 is identifiable.

8.1.1.2 Second change in covariance structure

The same experiment is now repeated, but this time the variance of the i.d. sequences s1 and s2 is σ1 = 3 and 3974, respectively. Applying (8.3) to first produce 3975 and 3976 and subsequently 3977 and 3978 gives rise to the covariance matrix

8.5 8.5

With the reduced variance for s1 and s2, 100 samples are generated using (8.3). The plots in column (c) of Figure 8.1 show the scatter diagram of 3981 and 3982 and the Hotelling's T2 statistic based on 3984. The dashed control ellipse corresponds to 3985 and 3986 and the solid one refers to 3987 and 3988. Despite significant alterations to the covariance structure of 3989 and 3990 these changes are undetected since the dashed control ellipse is inside the solid one. Therefore, the alteration renders the scatter diagrams and the Hotelling's T2 statistic blind.

In essence, if changes to the covariance structure arise that lead to small alterations in the geometry of statistical confidence regions and limits, such events may not be detectable. Next, a more detailed statistical analysis is presented to formulate conditions which render conventional multivariate analysis insensitive.

8.1.2 Generic statistical analysis

The intuitive analysis in the previous subsection suggested that changes in the covariance structure manifest themselves in alterations of the eigenvalues and eigenvectors of the covariance matrix. This follows from (8.2), (8.4) and (8.5). However, this analysis was restricted to rotations of the control ellipse and is therefore limited in a multivariate context. More precisely, since MSPC techniques decompose the data space(s) into model and residual subspaces, a more generic condition must to be developed to investigate whether the above insensitivity can generally arise.

Concentrating on the non-negative quadratic Hotelling's T2 and Q statistics, violations of their control limits are indicative of such changes. This postulates the following condition for changes in the covariance structure to be undetectable.


Condition 8.1.1
A change in the covariance structure of the process variables is undetectable if and only if the Type I error with respect to the control limits or region of the original covariance structure does not exceed the significance α.

This represents a condition that can be satisfied by examining the control limits of the non-negative quadratic statistics. Subsection 3.1.2 showed that the control limit of the Hotelling's T2 statistic is, asymptotically, the critical value of a χ2 distribution for the significance α. On the other hand, the control limit of the Q statistic can be approximated by a χ2 distribution (Box 1954; Jackson and Mudholkar 1979; MacGregor and Kourti 1995; Satterthwaite 1941). With this in mind, it follows that

8.6 8.6

where η and θ are a weight factor and the number of degrees of freedom of a χ2 distribution, respectively. It should be noted that the approximation in (8.6) is also applicable to the Hotelling's T2 statistic.

In the case of PCA,

  • 4004 is the ith largest eigenvalue of Szz, and m0 and m1 are 1 and n, respectively, for the Hotelling's T2 statistic; and
  • 4011, and m0 and m1 are n + 1 and nz, respectively, for the Q statistic.

For PLS,

  • 4017 is 4018, and m0 and m1 are 1 and n, respectively, for the Hotelling's T2 statistic; and
  • 4024, and m0 and m1 are n + 1 and nx, respectively, for the Qe statistic.

Although the relationship below is also applicable to the Qf statistic for PLS, this analysis is not considered here.

Estimating the sample mean and variance of the sequence 4031, 4032, ···  , 4034, (3.30) and (3.31) show that

8.7 8.7

if K0 is sufficiently large. Here, the sub- and superscript (0) refer, as before, to the reference condition, and 4037 and 4038 are the estimated mean and variance, respectively. For a second sequence, 4039, which contains a total of K1 samples 4041, 4042, ···  , 4044, describing a change in the variable covariance structure, the parameters 4045 and 4046 can be obtained. Here, the sub- and superscript (1) refer to the second operating condition. Using the estimates 4048, 4049, 4050 and 4051 allows formulating the following condition for detecting the second and abnormal operating condition.


Condition 8.1.2
If the control limit for 4052, obtained for a significance α, is approximated by 4054, where η1 and θ1 are values for η and θ estimated from 4059, 4060, ···  , 4062 for a sufficiently large K1, is smaller or equal to the control limit for 4064, approximated by 4065, this change is undetectable.

Under the application of above condition, score-based process monitoring using conventional MSPC may be insensitive to changes in the variable covariance structure, which the next subsection illustrates using a three-variable example.

8.1.3 Second intuitive example

The three variables are defined by a linear combination of the two zero mean i.d. source signals 4066 and 4067, which have a variance of 4068 and 4069. As before, the superscript (0) refers to the original covariance structure. According to (2.2), the zero mean error vector 4071, augmented to the common cause variation Ξs(0), has an error covariance matrix 4073. Furthermore, (8.8) defines the score and loading vectors for the data vector 4074.

8.8 8.8

The matrix T(0) stores the eigenvectors of 4076 and 4077 is a vector storing the score variables. Under the assumption that 4078, the covariance matrix of z(0), 4080, is equal to

8.9 8.9

which follows from (6.5). Moreover, the column space of Ξ is equal to the first two column vectors of T(0). For simplicity, is assumed here that Ξ contains these column vectors, implying that the orthogonal complement, 4084, is the transpose of the third column vector and the generalized inverse, Ξ, is the transpose of Ξ.

The contribution of the first, second and third principal components to the sum of the variances of the three process variables are 4087, 4088 and 4089, which follows from (2.116) to (2.122). Equation (6.73) highlights that the first two score variables mainly describe the two source variables, which contribute 97.94% to this sum of variances, whilst the contribution of the third score variable is 2.04% and, according to (3.7), relates to 4092.

The eigenvectors 4093 and 4094 span the model subspace and 4095 spans the residual subspace. As the data space corresponding to z1, z2 and z3 is a Cartesian space, the minimum angles of the axes z1, z2 and z3 to the third eigenvector are 54.74°, 54.74° and 125.26°, respectively. The critical value of a χ2 distribution for two degrees of freedom and α = 0.01 is 4107. The lengths of the semimajor and semiminor of the control ellipse (first two score variables) are, therefore, 4108 and 4109, respectively, λ1 = 16 and λ2 = 8.

To introduce alterations to this data covariance structure and to examine whether these alterations are detectable, a total of four changes are considered. Each of these changes relates to an anticlockwise rotation of the original variable set by 30°. Equation (8.10) shows the corresponding rotation matrix T(1)

8.10 8.10

The first change is a simple rotation of the first two variables

8.11 8.11

where 4114. The remaining three changes also alter the variance of the score variables, listed in Table 8.1, which produces the data vectors z(2), z(3) and z(4)

8.12 8.12

There are now the following five variable sets:

1. the reference set z(0) yielding the loading vectors stored in T(0) and score variances of 16, 8 and 4120;
2. the variable set z(1) representing the loading vectors T(1)T(0) and score variance of 16, 8 and 4123;
3. the variable set z(2) producing the same loading vectors as z(1) but yields score variance of 4, 2 and 0.15;
4. variable set z(3) which produces the same loading vectors as z(2) but the variance of the third score variable is 0.125; and
5. finally variable set z(4) which again yields the same loading vectors as z(2) but the variance of the third score variable is 0.1.

To demonstrate how different these five variable sets are requires the inspection of the corresponding covariance matrices for z(0), z(1), z(2), z(3) and z(4)

8.13 8.13

Table 8.1 Variance of score variables 4872, 4873 and 4874.

NumberTable

The next step is to perform a total of 1000 Monte Carlo simulations for each of the five variable sets, z(0), ···  , z(4). According to Condition 8.1.2, the changes in the covariance structure cannot be detected if the control limits associated with the variable sets representing z(1), z(2), z(3) and z(4) are smaller or equal to the control limit corresponding to z(0). It is important to note, however, that the non-negative quadratic statistics must be constructed from the PCA model related to the variable set z(0). The calculation of the score variables for each of the five variable sets is

8.14

8.14 8.14

Based on (8.14), the five Hotelling's T2 statistics are now constructed from the first two elements of the score vectors 4145, ···  , 4147 and the score covariance matrix 4148. The Q statistics are simply the squared values of the third elements of 4150, ···  , 4152. Each of the 1000 Monte-Carlo simulation experiments include a total of K = 100 samples. This gives rise to a total of 1000 estimates for the control limits of the Hotelling's T2 and Q statistics for z(0), …  , z(4). To assess the sensitivity in detecting each of the four changes, the 2.5 and the 97.5 percentiles as well as the median can be utilized.

Figure 8.2 (a) shows the range limited by the 2.5 and 97.5 percentiles of the control limit for each of the five Hotelling's T2 statistics 4160, ···  , 4162. Plot (b) in this figure shows the ranges for the control limits of 4163, ···  , 4165. The circle inside each of the ranges represents the median. Examining the range for the Hotelling's T2 statistic in relation to Condition 8.1.2, it is clear that the Hotelling's T2 statistic is insensitive to any of the changes introduced to the original covariance structure.

Figure 8.2 Analysis of detectability for different covariance structures.

8.2

A different picture emerges when making the same comparison for the Q statistic. While the range for 4169 covers values between 2.2 and 4 (roughly), the values for 4170 range between around 9.5 and 15.5.1 According to Condition 8.1.2, this implies that this first alteration is detectable by the Q statistic. In contrast, the remaining three changes may not be detectable as the ranges for 4172, 4173 and 4174 have a significant overlap with the range for 4175. More precisely, the 2.5 and 97.5 percentiles for 4176 are larger than those for 4177, 4178 and 4179. Consequently, the second to fourth alterations are not detectable by the Hotelling's T2 and may not be detectable by the Q statistic either.

To graphically illustrate the above findings, a total of 100 samples are generated for variable sets z(0), z(1) and z(2). Referring to these sets as data set 1, data set 2 and data set 3, corresponding to z(0), z(1) and z(2), respectively, Figure 8.3 shows the results of analyzing them using a PCA model established from data set 1. In this figure, the column in rows (a), (b) and (c) represent the analysis of data set 1, data set 2 and data set 3, respectively. The upper plots show the control ellipse and the scatter plots of data sets 1 to 3. The plots in the middle and lower row of Figure 8.3 present the Hotelling's T2 and the Q statistics, respectively.

Figure 8.3 Detectable and undetectable changes in covariance structure.

8.3

The plots associated with index (a) indicate that the projection of each of the 100 samples of data set 1 onto the model subspace fall inside the control ellipse. This, in turn, implies that none of the samples results in a violation of the control limit of the Hotelling's T2 statistic. Also the residual Q statistic has not violated its control limit, Q0.01 = 3.2929, for any of the 100 samples of data set 1. Hence, the hypothesis that the process is in-statistical-control must be accepted.

A different result emerges when inspecting the plots associated with data set 2, representing an anticlockwise rotation of z1z2 axis by 30°. Although projecting the samples onto the model subspace shows no projections outside the control ellipse, the Q statistic highlights that the squared distance of a total of 16 samples from the model subspace is larger than 3.2929. This change is therefore detectable.

Finally, the plots corresponding to data set 3 point out that the projected samples onto the model subspace fall inside the control ellipse and that the squared distance of each sample from the model subspace is less than 3.2929. Consequently, this change remains undetected, which is undesirable. The remainder of this chapter describes the incorporation of the statistical local approach into the MSPC framework to detect such changes.

With regards to the second, third and fourth alterations, one could justifiably argue that if the third eigenvalue is not changed from 0.5 to 0.15, 0.125 and 0.1, respectively, any of these changes are detectable by the Q statistic. This follows from (6.4) and (6.5), which highlight that λ3 corresponds to the noise variance. According to Figure 8.3, the rotation of the control ellipse changes its orientation relative to the original model subspace. Thus, samples that are further away from the center of the ellipse but still inside produce a larger distance to the original model subspace.

If the axes of the rotated control ellipse are linear combinations of the eigenvectors spanning the model subspace, the rotated ellipse remains inside the model subspace. Hence, such an alteration of the covariance structure has no effect on the residual subspace and hence the Q statistic. Revisiting the geometric analysis in Figure 8.1, a change in the orientation and dimension may yield a control ellipse that lies within the original ellipse and is on the model subspace. Equations (6.7) to (6.11) outline that such an alteration results from a change in the covariance matrix of the source signals and may, consequently, remain undetected.

8.2 Preliminary discussion of related techniques

After outlining that the basic MSPC monitoring framework may not detect certain changes in the data covariance structure, a different paradigm is required to address this issue. Revisiting the analysis in Figure 8.1, the exact shape and orientation of a control ellipse is defined by the eigenvectors and eigenvalues of 4201. In other words, if the orientation of the eigenvectors and the eigenvalues could be monitored on-line, any change in the covariance structure can consequently not go unnoticed. It is therefore required to formulate monitoring functions that directly relate to the eigen decomposition of 4202.

Basseville (1988) described a statistical theory, known as the statistical local approach, that can be readily utilized to define vector-valued monitoring functions, referred to as primary residual vectors 4203, of the form

8.15 8.15

where 4204 is a vector of model parameters and 4205. For simplicity, the distribution function of 4206 is assumed to be unknown at this point.

The parameter vectors includes the eigendecomposition of 4207 for PCA and 4208 and 4209 for PLS. The construction of the primary residuals for PCA is discussed in Sections 8.3. For a statistical inference based on 4210, however, the following problem arises. How to construct a monitoring framework if 4211 cannot be assumed to be Gaussian or is unknown, as assumed thus far?

This question can be answered by assuming that z0 stores i.i.d. sequences, that is, 4213, where k and l are sample indices. As the distribution function of 4216 depends on the distribution function of z0, instances of 4218 are also i.i.d. Under these conditions, the following sum of the primary residual vectors

8.16 8.16

follows, asymptotically, a Gaussian distribution function, which is a result of the CLT. Subsection 8.7.1 provides a detailed discussion and a proof of the CLT. The sum in (8.16) is defined as the improved residual vector and is, asymptotically, Gaussian distribution. If 4219 and 4220, 4221 and can be utilized to construct scatter diagrams as well as a Hotelling's T2 statistic as discussed in Subsection 3.1.2.

For PCA, it is sufficient to develop primary residuals related to the eigenvalues and the eigenvectors of 4223, as they determine the orientation of the model and the residual subspaces, and the size and orientation of the control ellipse. For PLS, however, there are two interrelated data spaces. Project 2 in the tutorial session of this chapter extends the development of improved and primary residuals for PLS.

For PCA, the next section discuss the construction of primary and improved residuals describing changes in the geometry of the model and residual subspaces and summarizes their basic statistical properties.

8.3 Definition of primary and improved residuals

Sections 2.1 and 9.3 outline that a PCA monitoring model is completely described by the eigendecomposition of 4224. This includes the orientation of the model and residual subspaces as well as the orientation and size of the n dimensional control ellipsoid. Consequently, the primary residuals rely on the eigendecomposition of 4226, and are derived in Subsection 8.3.1 using the definition of the ith eigenvector 4228. Subsection 8.3.2 shows that primary residuals can also be obtained from 4229. Subsections 8.3.3 and 8.3.4 contrast both types of primary residuals and determine their statistical properties. Finally, Subsection 8.3.5 shows the construction of improved residuals.

8.3.1 Primary residuals for eigenvectors

Starting with the definition of the objective function for obtaining the ith eigenvector

8.17 8.17

the partial derivative of (8.17) allows determining the optimal solution

8.18 8.18

which is given by

8.19 8.19

The above equation relies on the fact that 4231. Now, defining

8.20 8.20

allows simplifying Equation (8.19) to become

8.21 8.21

and consequently

8.22 8.22

It follows from (8.22) that in the vicinity of pi, defined by 4233 for which 4234, the following holds true

8.23 8.23

Equations (8.22) and (8.23) imply that each loading vector pi produces a corresponding statistic 4236 such that 4237, when p is the equal to the ith eigenvector of 4240. In contrast, any deviation from zero indicates that pi is no longer the eigenvector associated with the ith eigenvalue.

The next step is to define two parameter vectors that store the eigenvectors spanning the model and residual subspaces. The vector for the model subspace, 4243, is

8.24 8.24

and that of the residual subspace, 4244, is defined as

8.25 8.25

This gives rise to the following two primary residual vectors for the model subspace

8.26 8.26

and the residual subspace

8.27 8.27

The next subsection develops primary residual vectors for the eigenvalues of 4245.

8.3.2 Primary residuals for eigenvalues

Pre-multiplying (8.20) by 4246 gives rise to

8.28 8.28

The expectation of 4247 directly follows from (8.22)

8.29 8.29

As before, 4248 defines the neighborhood of λi, where 4250. This implies that 4251 holds true if and only if λ is the ith largest eigenvalue of 4254. In a similar fashion to the 4255 and 4256, and 4257 and 4258, 4259 and 4260, and 4261 and 4262, for the retained and discarded eigenvalues can be defined as follows

8.30 8.30

The next subsection provides a detailed examination of the primary residuals.

8.3.3 Comparing both types of primary residuals

The analysis concentrates first on the primary residual vectors 4263 and 4264, which have the dimension nzn and nz(nzn), respectively. These dimensions, therefore, depend on the ratio 4267. If n is close to nz or if n is small compared to nz, the size of 4272 or 4273 can be substantial. This subsection then compares the sensitivity of 4274 and 4275, with 4276 and 4277 for detecting changes in the covariance structure.

8.3.3.1 Degrees of freedom for primary residuals 4278 and 4279

A closer inspection of the primary residuals 4280 and 4281 reveals that its elements may be linearly dependent. This is best demonstrated by a joint analysis

8.31 8.31

which can alternatively be written as

8.32 8.32

In matrix-vector form, (8.32) becomes

8.33 8.33

Since 4282 has full column rank, its rank is equal to 4283. More precisely, a total of 4284 elements in the combined primary residual vector are linearly dependent upon the remaining 4285 ones.

For the primary residual vectors 4286 and 4287, this has the following consequence: if the number of the elements in:

  • 4288 and
  • 4289

is larger than or equal to

images/c08_I0034.gif

there is a linear dependency between these primary residuals. This gives rise to linear dependency among the elements in 4290 and 4291 under the following conditions

NumberTable

and leads to the following criteria for 4304

8.34 8.34

and 4305

8.35 8.35

From the above relationships, it follows that

8.36 8.36

which can only be satisfied if 4306 if nz is even and 4308 if nz is odd. Figure 8.4 summarizes the above findings and shows graphically which condition leads to a linear dependency of the primary residuals in 4310 and 4311.

Figure 8.4 Linear dependency among elements in 4863 and 4864.

8.4

The importance of these findings relates to the construction of the primary residual vectors, since the number of source signals is determined as part of the identification of a principal component model. In other words, the original size of 4312 and 4313 is nzn and nz(nzn), respectively, and known a priori. If the analysis summarized in Figure 8.4 reveals that elements stored in the primary residual vectors are linearly dependent, the redundancy can be removed by eliminating redundant elements in 4316 or 4317, such that this number is smaller or equal to 4318 in both vectors.

8.3.3.2 Sensitivity analysis for 4319, 4320, 4321 and 4322

To investigate whether the primary residuals 4323 and 4324 can both detect changes in the eigenvalues and the eigenvectors of 4325, the examination focuses on:

  • the primary residuals 4326 and 4327 to evaluate their sensitivity in detecting changes in the eigenvectors and eigenvalues associated with the orientation of the model subspace and the orientation and size of the control ellipsoid; and
  • the primary residuals 4328 and 4329 to examine their sensitivity in detecting changes in the eigenvectors and eigenvalues related to the orientation of the residual subspace and, according to (3.16), the approximation of the distribution function of the sum of squared residuals.

The resultant analysis yields the following two lemmas, which are proved below.


Lemma 8.3.1
For a change in the orientation of the model subspace and/or the orientation/size of the control ellipsoid, the primary residual vectors 4330 and 4331 are sensitive in detecting this change, as their expectation differs from zero.

 


Lemma 8.3.2
For a change associated with the orientation of the residual subspace and/or, the approximation of the distribution function of the sum of the squared residuals, the primary residual vectors 4332 and 4333 can both detect this change by producing an expectation that is different from zero.

 


Proof.
The proof commences by rewriting (8.17) as follows

8.37 8.37

and investigating the impact of a change in pi, that is, 4335, ||Δpi|| 7.1 ||pi||, and λi, i.e. 4338, |Δλi| 7.1 λi.

Directional changes of the ith eigenvector

Assuming that λi remains unchanged, (8.19) can be rewritten on the basis of (8.37)

8.38 8.38

Knowing that a change in the covariance structure between the recorded process variables produces a different 4342, denoted here by 4343, (8.38) becomes

8.39 8.39

The expectation of the primary residual vector 4344 and given by

8.40 8.40

It follows that 4345 depends on the changes of the elements in 4346. Equation (8.40) shows that the condition 4347 only arises if and only if pi is also an eigenvector of 4349 associated with λi. This situation, however, cannot arise for all 1 ≤ i ≤ nz unless 4352. An important question is whether the primary residual 4353 also reflect a directional changes of pi. This can be examined by subtracting 4355 from (8.39), where 4356 is the eigenvector of 4357 associated with λi, which yields

8.41 8.41

Pre-multiplying the above equation by 4359 produces

8.42 8.42

It is important to note that if the pre-multiplication is carried out by the transpose of 4360, (8.42) becomes zero, since 4361. Consequently, any directional change of pi manifests itself in 4363. This, in turn, implies that both primary residual vectors, 4364 and 4365, are sufficient in detecting any directional change in pi by a mean different from zero. It should also be noted that if 4367 if both vectors are orthogonal to each other. A closer inspection of (8.42), however, yields that only the trivial case of 4368 can produce ϵi = 0.

Changes in the ith eigenvalue

Now, λi changes under the assumption that pi remains constant. For this change, (8.39) becomes

8.43 8.43

Subtracting 4373, based on the correct eigenvalue 4374, from Equation (8.43) gives rise to

8.44 8.44

and hence, 4375, which implies that 4376 is sensitive to the change in λi. Finally, pre-multiplication of (8.44) by 4378 yields

8.45 8.45

where 4379. Thus, 4380. This analysis highlights that both primary residual vectors, 4381 and 4382, can detect the change in λi.

The above lemmas outline that any change in the covariance structure of z0 can be detected by 4385 and ϕi. Given that:

  • the dimensions of the primary residuals 4387 and 4388 are significantly smaller than those of 4389 and 4390, respectively;
  • the primary residuals for the eigenvectors and the eigenvalues, 4391 and 4392, can detect a change in the covariance structure of z0; and
  • the elements in the primary residual vectors 4394 and 4395 cannot generally be assumed to be linearly independent,

it is advisable to utilize the primary residual vectors 4396 and 4397 for process monitoring. For simplicity, the parameter vectors are now denoted as follows 4398 and 4399. Moreover, the tilde used to discriminate between 4400 and its scaled sum 4401 is no longer required and can be omitted. The next subsection analyzes the statistical properties of 4402 and 4403.

8.3.4 Statistical properties of primary residuals

According to (8.29), the expectation of both primary residual vectors, 4404 and 4405, is equal to zero. The remaining statistical properties of ϕi include its variance, the covariance of ϕi and ϕj, the distribution function of ϕi and the central moments of ϕi. This allows constructing the covariance matrices for 4411 and 4412, 4413 and 4414, respectively.

Variance of ϕi

The variance of ϕi can be obtained as follows:

8.46 8.46

which can be simplified to

8.47 8.47

Given that:

  • 4417; and
  • 4418

it follows that 4419. As ti is Gaussian distributed, central moments of 4421 are 0 if m is odd and 4423. If m is even.2 For m = 2, 4426 and for m = 4, 4428. Substituting this into (8.47) gives rise to

8.48 8.48

Covariance of ϕi and ϕj

The covariance between two primary residuals is

8.49 8.49

and can be simplified to

8.50 8.50

Now, substituting 4431, 4432 and 4433, which follows from the Isserlis theorem (Isserlis 1918) and the fact that ti and tj are statistically independent and Gaussian distributed, (8.50) reduces to

8.51 8.51

Consequently, there is no covariance between ϕi and ϕj, implying that the covariance matrices for 4438 and 4439 reduce to diagonal matrices.

Distribution function of ϕi

The random variable

8.52 8.52

yields the following distribution function

8.53 8.53

since 4441. In other words, the distribution function of ϕi can be obtained by substituting the transformation in (8.52) into the distribution function of a χ2 distribution with one degree of freedom

8.54 8.54

which gives rise to

8.55 8.55

With respect to (8.55), the PDF fi) > 0 within the interval ( − 2λi, ∞), which follows from the fact that 4446. In (8.54) and (8.55), Γ(1/2) is the gamma function, defined by the improper integral 4448. Figure 8.5 shows the probability density function of the primary residuals for various values of λi. The vertical lines in this figure represent the asymptotes at − 2λi.

Figure 8.5 Probability density function of ϕi for different values of λi.

8.5

Central moments of ϕi

The determination of the central moments of ϕi relies on evaluating the definition for central moments, which is given by

8.56 8.56

According to (8.56), the central moments can be obtained directly by evaluating the expectation 4453, which gives rise to

8.57 8.57

Isolating the terms in (8.57) that are associated with 4454 and substituting the central moments for 4455 yields

8.58 8.58

where

8.59 8.59

are binomial coefficients and m! = 1 · 2 · 3 ··· (m − 1) · m. Table 8.2 summarizes the first seven central moments of ϕi.

Table 8.2 First seven central moments of ϕi.

Order m Central moment 4885
1 0
2 4886
3 4887
4 4888
5 4889
6 4890
7 4891

8.3.5 Improved residuals for eigenvalues

Equation (8.16) shows that the improved residuals are time-based sums of the primary residuals and asymptotically Gaussian distributed, given that the primary residuals are i.i.d. sequences. Following from the geometric analysis of the data structure 4458 and its assumptions, discussed in Subsection 2.1.1, the model and residual subspaces are spanned by the n dominant and the remaining nzn eigenvectors of 4461, respectively.

Using the definition of the primary residuals for the eigenvalues, the improved residuals become

8.60 8.60

As the eigenvectors and eigenvalues are functions of 4462, the dependencies on these parameters can be removed from (8.16) and hence, θi = θi(z0, K) with K being the number of samples and ϕi = ϕi(z0(k)). The first and second order moments of θi(z0, K) are as follows

8.61 8.61

and

8.62 8.62

respectively. Note that the factor 2 in (8.28) has been removed, as it is only a scaling factor. The variance of ϕi is therefore 4468. That 4469 follows from the Isserlis theorem (Isserlis 1918). The improved residuals can now be utilized in defining non-negative quadratic statistics.

The separation of the data space into the model and residual subspaces yielded two non-negative quadratic statistics. These describe the variation of the sample projections onto the model subspace (Hotelling's T2 statistic) and onto the residual subspace (Q statistic). With this in mind, the primary residuals associated with the n largest eigenvalues and remaining nzn identical eigenvalues can be used to construct the Hotelling's T2 and residual Q statistics, respectively.

Intuitively, the definition of these statistics is given by

8.63 8.63

and follows the definition of the conventional Hotelling's T2 and Q statistics in (3.8) and (3.15), respectively.

As the number of recorded samples, K, grows so does the upper summation index in (8.60). This, however, presents the following problem. A large K may dilute the impact of a fault upon the sum in (8.60) if only the last few samples describe the abnormal condition. As advocated in Chapter 7, however, this issue can be addressed by considering samples that are inside a sliding window only. Defining the window size by k0, the incorporation of a moving window yields the following formulation of (8.60)

8.64 8.64

The selection of k0 is a trade-off between accuracy and sensitivity. The improved residuals converge asymptotically to a Gaussian distribution, which demands larger values for k0. On the other hand, a large k0 value may dilute the impact of a fault condition and yield a larger average run length, which is the time it takes to detect a fault from its first occurrence. The selection of k0 is discussed in the next section, which revisits the simulation examples in Section 8.1.

8.4 Revisiting the simulation examples of Section 8.1

This section revisits both examples in Section 8.1, which were used to demonstrate that the conventional MSPC framework may not detect changes in the underlying covariance structure.

8.4.1 First simulation example

Figure 8.1 showed that the scatter diagram and the Hotelling's T2 statistic only detected the first change but not the second one. Recall that both changes resulted in a rotation of the control ellipse for 4486 and 4487 by 45°. Whilst the variance of both score variables remained unchanged, the variances for the second change were significantly reduced such that the rotated control ellipse was inside the original one.

Given that both changes yield a different eigendecomposition for the variable pairs 4489, 4490 and 4491, 4492, the primary residuals are expected to have a mean different from zero. Before determining improved residuals, however, k0 needs to be determined. If k0 is too small the improved residuals may not follow a Gaussian distribution accurately, and a too large k0 may compromise the sensitivity in detecting slowly developing faults (Kruger and Dimitriadis 2008; Kruger et al. 2007).

Although the transformation matrix T(0) and the variances of the i.d. score variables 4497 and 4498 are known here, the covariance matrix 4499 and its eigendecomposition would need to be estimated in practice. Table 8.3 summarizes the results of estimating the covariance of both improved residual variables for a variety of sample sizes and window lengths.

Table 8.3 Estimated means and variances of improved residuals

NumberTable

As per their definition, the improved residuals asymptotically follow a Gaussian distribution of zero mean and variance 4500 if the constant term in (8.28) is not considered. The mean and variance for θ1 and θ2 are 2 × 102 = 200 and 2 × 22 = 8, respectively. The covariance E1θ2} = 0 is also estimated in Table 8.3.

The entries in this table are averaged values for 1000 Monte Carlo simulations. In other words, for each combination of K and k0 a total of 1000 data sets are simulated and the mean, variance and covariance values for each set are the averaged estimates. The averages of each combination indicate that the main effect for an accurate estimation is K, the number of reference samples of θ1 and θ2. Particularly window sizes above 50 require sample sizes of 2000 or above to be accurate.

This is in line with expectation, following the discussion in Section 6.4. The entries in Table 8.3 suggest that the number of reference samples for θ1 and θ2, K, need to be at least 50 times larger then the window size k0. Another important issue is to determine how large k0 needs to be to accurately follow a Gaussian distribution. Figure 8.6 shows Gaussian distribution functions in comparison with the estimated distribution functions of ϕ1 and ϕ2, and θ1 and θ2 for k0 = 10, 50 and 200.

Figure 8.6 Distribution functions of primary and improved residuals.

8.6

As expected, the upper plot in this figure shows that the distribution function of primary residuals depart substantially from a Gaussian distribution (straight line). In fact, (8.55) and Figure 8.5 outline that they follow a central χ2 distribution. The plots in the second, third and bottom row, however, confirm that the sum of the primary residuals converge to a Gaussian distribution.

Whilst the smaller window sizes of k0 = 10 and k0 = 50 still resulted in significant departures from the Gaussian distribution, k0 = 200 produced a close approximation of the Gaussian distribution. Together with the analysis of Table 8.3, a window size of k0 = 200 would require a total of K = 200 × 50 = 10 000 reference samples to ensure that the variance of θ1 and θ2 are close to 2 λ12 and 2 λ22, respectively.

Using the same 1000 Monte Carlo simulations to obtain the average values in Table 8.3 yields an average of 200.28 and 7.865 for 4533 and 4534, respectively, and − 0.243 for 4536. After determining an appropriate value for k0, the Hotelling's 4538 statistics can now be be computed as shown in (8.60).

Figure 8.7 compares the conventional Hotelling's T2 statistic with the one generated by the statistical local approach. For k0 = 200, both plots in this figure show a total of 100 samples obtained from the original covariance structure (left portion), the first change (middle portion) and the second change (right portion of the plots).

Figure 8.7 First simulation example revisited.

8.7

As observed in Figure 8.1, the conventional Hotelling's T2 statistic could only detect the first change but not the second one. In contrast, the non-negative quadratic statistic based on the statistical local approach is capable of detecting both changes. More precisely, the change in the direction of both eigenvectors (first change) and both eigenvectors and eigenvalues (second change) yields an expectation for both primary residual function that is different from 0.

8.4.2 Second simulation example

Figures 8.2 and 8.3 highlight that conventional MSPC can only detect one out of the four changes of the original covariance structure. The remaining ones, although major, may not be detectable. Each of these changes alter the orientation of the model and residual subspaces as well as the orientation of the control ellipse. This, in turn, also yields a different eigendecomposition in each of the four cases compared to the eigendecomposition of the original covariance structure.

The primary residuals are therefore expected to have mean values that differ from zero. The first step is to determine an appropriate value for k0. Assuming that the variances for each of the improved residuals, 4543, 4544 and 4545, need to be estimated, the same analysis as in Table 8.3 yields that K should be 100 times larger than k0.

Figure 8.8 compares the estimated distribution function of the improved residuals with a Gaussian distribution function (straight lines) for different values of k0. The estimation of each distribution function was based on K = 100 × 200 = 20 000 samples. As the primary residuals are χ2 distributed the approximated distribution function, consequently, showed no resemblance to a Gaussian one. For k0 = 10 and k0 = 50, the estimated distribution function still showed significant departures from a Gaussian distribution. Selection k0 = 200, however, produced a distribution function that is close to a Gaussian one.

Figure 8.8 Distribution functions of primary and improved residuals.

8.8

This is expected, as the improved residuals are asymptotically Gaussian distributed. In other words, the larger k0 the closer the distribution function is to a Gaussian one. It is important to note, however, that if k0 is selected too large it may dilute the impact of a fault condition and render it more difficult to detect. With this in mind, the selection of k0 = 200 presents a compromise between accuracy of the improved residuals and the average run length for detecting an incipient fault condition.

Figure 8.9 contrasts the conventional non-negative quadratic statistics (upper plots) with those based on the statistical local approach (lower plots) for a total of 100 simulated samples. This comparison confirms that the Hotelling's T2 and Q statistics can only detect the first change but are insensitive to the remaining three alterations.

Figure 8.9 Second simulation example revisited.

8.9

The non-negative quadratic statistics relating to the statistical local approach, however, detect each change. It is interesting to note that the first change only affected the Qθ statistic, whilst the impact of the remaining three changes manifested themselves in the Hotelling's 4560 statistic. This is not surprising, however, given that the primary residuals are a centered measure of variance, which follows from (8.28).

To explain this, the variance of the three score variables can be estimated for each covariance structure. Determining the score variables as 4561, where P stores the eigenvectors of 4563, allows us to estimate these variances. Using a Monte Carlo simulation including 1000 runs, Table 8.4 lists the average values of the estimated variances. The Monte Carlo simulations for each of the five covariance structures were based on a sample size of K = 1000.

Table 8.4 Estimated variances of 4923, 4924 and 4925.

NumberTable

The sensitivity of the Hotelling's 4565 and Qθ statistics for each alternation follows from the estimated averages in this table. The initial 30° rotation produces slightly similar variances for the first and second principal component. The variance of the third principal component, however, is about three and a half times larger after the rotation. Consequently, the Hotelling's 4568 statistic is only marginally affected by the rotation, whereas a very significant significant impact arises for the Qθ statistic.

In contrast, the average eigenvalue for the second, third and fourth alteration produced averaged first and second eigenvalues that are around one quarter of the original ones. The averaged third eigenvalue, however, is very similar to the original one. This explains why these alterations are detectable by the Hotelling's 4570 statistic, while the Qθ statistic does not show any significant response.

Plotting the improved residuals for each covariance structure and K = 1000, which Figure 8.10 shows, also confirms these findings. For a significance of 0.01, the control limits for each improved residual are 4573. The larger variance of the third score variable yielded a positive primary residual for the first alteration. Moreover, the smaller variances of the first and second score variables produced negative primary residuals for the remaining changes.

Figure 8.10 Plots of the three improved residuals for each of the five covariance structures.

8.10

8.5 Fault isolation and identification

For describing a fault condition, Kruger and Dimitriadis (2008) introduced a fault diagnosis approach that extracts the fault signature from the primary residuals. The fault signature can take the form of a simple step-type fault, such as a sensor bias that produces a constant offset, or can have a general deterministic function. For simplicity, the relationship of this diagnosis scheme concentrate first on step-type faults in Subsection 8.5.1. Subsection 8.5.2 then expands this concept to approximate a general deterministic fault signature.

8.5.1 Diagnosis of step-type fault conditions

The augmented data structure to describe a step-type follows from (3.68)

8.65 8.65

where 4574 represents an offset term that describes the fault condition. In analogy to the projection-based variable reconstruction approach, the offset can be expressed as follows

8.66 8.66

Here, 4575 is the fault direction and μ is the fault magnitude. With respect to the convention introduced by Isermann and Ballé (1997), the detection of a fault condition and the estimation of 4577 refers to fault isolation. As μ describes the size of the fault, the estimation of the fault magnitude represents the fault identification step.

Equation (8.67) describes the impact of the offset term upon the primary residual vector for the ith eigenvector

8.67 8.67

for omitting the constant of 2 in (8.20). Substituting (8.65) into (8.67) yields

8.68 8.68

Given that E{ϕi} = 0, E{z0} = 0 and E{ti} = 0, taking the expectation of (8.86) gives rise to

8.69 8.69

Here ⊗ refers to the Kronecker product of two matrices. The results of the two Kronecker products are as follows

8.70a 8.70a

8.70b 8.70b

With 4584, (8.69) has a total of 4585 unknowns but only nz linearly independent equations and is hence an underdetermined system. However, there are a total of nz equations for 1 ≤ i ≤ nz. Hence, (8.69) in augmented form becomes

8.71 8.71

It is interesting to note that the linear dependency in (8.69) and (8.71) follows from the analysis in Subsection 8.3.3 and particularly (8.33). It is therefore possible to remove the redundant 4589 column vectors of Ψ and 4591 elements of the vector ζ, which gives rise to

8.72 8.72

where 4593 and 4594. The expectation on the left hand side of (8.72) can be estimated from the recorded data and the matrix Ψred is made up of the elements of loading vectors and hence known. The elements of the vector ζred are consequently the only unknown and can be estimated by the generalized inverse of Ψred, i.e. 4598

8.73 8.73

For estimating 4599, however, it is possible to rely on the improved residuals, since

8.74 8.74

Here, 4600 and Φf(l) = Φ(z0(l) + Δz0). In other words, the fault condition can be obtained directly from the improved residuals.

From the estimation of 4603, only the terms 4604, 4605, …  , 4607 are of interest, as these allow estimation of υ and μ. The estimate of the fault magnitude is given by

8.75 8.75

For estimating the fault direction, however, only the absolute value for each element of 4610 is available. For determining the sign for each element, the data model of the fault condition can be revisited, which yields

8.76 8.76

and leads to the following test

8.77 8.77

After determining all signs using (8.77), the estimation of the fault direction, 4611, is completed.

It should be noted that the above fault diagnosis scheme is beneficial, as the traditional MSPC approach may be unable to detect changes in the data covariance structure. Moreover, the primary residuals are readily available and the matrix 4612 is predetermined, thus allowing us to estimate the fault signature in a simple and straightforward manner. It should also be noted that 4613 provides a visual aid to demonstrate how the fault signature affects different variable combinations. For this, the individual elements in 4614 can be plotted in a bar chart. The next subsection discusses how to utilize this scheme for general deterministic fault conditions.

8.5.2 Diagnosis of general deterministic fault conditions

The data structure for a general deterministic fault condition is the following extension of (8.65)

8.78 8.78

where Δz0(k) is some deterministic function representing the impact of a fault condition. Utilizing the fault diagnosis scheme derived in (8.67) to (8.73), the fault signature can be estimated, or to be more precise, approximated by a following moving window implementation of (8.73)

8.79 8.79

As in Chapter 7, 4616 is the size of the moving window. The accuracy of approximating the fault signature depends on the selection of 4617 but also the nature of the deterministic function. Significant gradients or perhaps abrupt changes require smaller window sizes in order to produce accurate approximations. A small sample set, however, has the tendency to produce a less accurate estimation of a parameter, which follows from the discussion in Sections 6.4. To guarantee an accurate estimation of the fault signature, it must be assumed that the deterministic function is smooth and does not contain significant gradients or high frequency oscillation. The fault diagnosis scheme can therefore be applied in the presence of gradual drifts, for example unexpected performance deteriorations as simulated for the FCCU application study in Section 7.5 or unmeasured disturbances that have a gradual and undesired impact upon the process behavior.

One could argue that the average of the recorded process variables within a moving window can also be displayed, which is conceptually simpler than extracting the fault signature from the primary or improved residual vectors. The use of the proposed approach, however, offers one significant advantage. The extracted fault signature approximates the fault signature as a squared curve. In other words, it suppresses values that are close to zero and magnifies values that are larger than one. Hence, the proposed fault diagnosis scheme allows a better discrimination between normal operating conditions and the presence of a fault condition. This is exemplified by a simulation example in the next subsection.

8.5.3 A simulation example

This simulation example follows from the data model of the first intuitive example in Subsection 8.1.1. The two variables have the data and covariance structure described in (8.1) and (8.2), respectively. To construct a suitable deterministic fault condition, the three different covariance structures that were initially used to demonstrate that changes in the covariance structure may not be detectable using conventional MSPC have been revisited as follows. Each of the three covariance structures are identical and equal to that of (8.2). The three variable sets containing a total of 5000 samples each are generated as follows

8.80a 8.80a

8.80b 8.80b

8.80c 8.80c

where 1 ≤ k ≤ 5000 is the sample index. It should also be noted that the samples for 4619, 4620 and 4621 are statistically independent of each other. Moreover, each of the source variables has a mean of zero. The properties of the source signals for each of the data sets are therefore

8.81 8.81

8.82 8.82

8.82

Concatenating the three data sets then produced a combined data set of 15 000 samples. The fault diagnosis scheme introduced in Subsections 8.5.1 and 8.5.2, was now applied to the combined data set for a window size of 4622. Figure 8.11 shows the approximated fault signature each of the data sets. As expected, the estimated fault signature for 4623, 4624 and 4625 show negligible departures from zero for the first two data sets. For the third data set, an accurate approximation of the squared fault signature 4626 and 4627 as well as the cross-product term 4628 (dashed line) can be seen at first glance.

Figure 8.11 Approximated fault signature for 4867, 4868 and 4869.

8.11

A closer inspection, however, shows a slight delay with which the original fault signature is approximated, particularly for higher frequency fault signatures in the middle and lower plots in Figure 8.11. According to (8.79), this follows from the moving window approach, which produces an average value for the window. Consequently, for sharply increasing or reducing slopes, like in the case of the sinusoidal signal, the use of the moving window compromises the accuracy of the approximation. The accuracy, however, can be improved by reducing in the window size. This, in turn, has a detrimental effect on the smoothness of the approximation.

The last paragraph in Subsection 8.5.2 raised the question of the benefit of the proposed fault diagnosis scheme over a simple moving window average of the process variables. To substantiate the advantage of extracting the squared fault signature from the primary residuals, Figure 8.12 shows the approximation of the fault signature using a moving window average of the process variables. To allow a fair comparison, the window size for producing the fault signatures in Figure 8.12 was set to the same value.

Figure 8.12 Approximated fault signatures based on moving window averages of z1 and z2.


It is interesting to note, when directly comparing Figures 8.11 and 8.12, that the estimated fault signature for the first two data sets shows a significantly larger variance relative to the signal for the moving window average than for the proposed approach. This is because the amplitude of the sinusoidal signal is squared by the proposed approach, whereas the moving window average retains the original scale of the recorded process variables. The accuracy of estimating the fault signature, on the other hand, is comparable in both cases.

Based on the results of this comparison, the benefit of the proposed fault diagnosis scheme over a simple moving window average becomes clear if the amplitude of the sinusoid is reduced, for example from five to three. In this case, the variance of the estimated fault signature over the first 10 000 samples can be expected to increase more substantially relative to the reduced fault signature. For the moving window average, this may compromise a clear and distinctive discrimination between the fault signature and normal operating conditions, particularly for smaller window sizes.

8.6 Application study of a gearbox system

This section extends the comparison between the non-negative quadratic statistics constructed from the improved residuals and those based on the score variables using an application study of a gearbox system. The system is mounted on an experimental test rig to record normal operating conditions as well as a number of fault conditions.

The next subsection gives a detailed description of the gearbox system and Subsection 8.6.2 explains how the fault condition was injected into the system. Subsection 8.6.3 then summarizes the identification of a PCA-based monitoring model and the construction of improved residuals. Subsection 8.6.4 finally contrasts the performance of the non-negative quadratic statistics based on the improved residuals with those relying on the score variables.

8.6.1 Process description

Given the widespread use of gearbox systems, the performance monitoring of such systems is an important research area in a general engineering context, for example in mechanical and power engineering applications. A gearbox is an arrangement involving a train of gears that transmit power and regulate rotational speed, for example, from an engine to the axle of a car.

Figure 8.13 presents a schematic diagram of the two-stage helical gearbox system (upper plot) and a gearbox similar to that used to generate the recorded vibration data (lower plot). Table 8.5 provides details of the gearbox, which was operated under full-load conditions of 260 Nm.

Figure 8.13 Schematic diagram of the gearbox system.


Table 8.5 Specification of gearbox system under study


Figure 8.13 shows that a total of four accelerometers are mounted on this gearbox system, which record the vibration signals simultaneously at a frequency of 6.4 kHz. Each recorded data set includes a total of 32 768 samples. Two data sets were recorded that describe a normal operating condition and a further six data sets that represent a gradually increasing fault condition. Figure 8.14 shows the first 5000 samples of each of the four vibration sensors, z1 to z4, for one of the reference sets.

Figure 8.14 First 5000 samples of reference data.


8.6.2 Fault description

For a gearbox system, a tooth breakage is a serious localized fault. Such a fault was simulated here by removing a certain percentage of one tooth in the pinion gear. This tooth removal enabled an experimental representation of a gradual fault advancement under predefined conditions.

The simulated tooth breakage represents the chipping of small parts of one tooth, which is one of the common fault conditions in gearbox systems. For this, the total length of one tooth was gradually removed in increments of 10%. The recorded fault conditions included removals of 10% to 50% and of 100% of the tooth, that is, a total of six additional data sets describing the removal of one tooth in various stages. These data sets are referred to here as 10% Fault, 20% Fault to 100% Fault.

Consequences of being unable to detect such faults at an early stage include productivity decreases in manufacturing processes, reduced engine efficiency, and equipment damage or even failure. Early detection can significantly reduce operational and maintenance costs as well as system down-time, and lead to increased levels of safety, which is an ever-growing concern.

An incipient fault in a mechanical system usually affects certain parameters, such as vibration, noise and temperature. Analyzing these external variables allows the performance monitoring of gears, which are usually inaccessible without dismantling the system. Extracting relevant information from the recorded signals is hence essential for detecting any irregularity that could be caused by tooth breakage or wear.

Baydar and Ball (2001), Baydar et al. (1999, 2001) and Ge et al. (2010) analyzed recorded vibration data from this system using a variety of different methods. Other research work on detecting abnormal operating conditions in gearbox systems include Bartelmus and Zimroz (2009), Hu et al. (2007), Stander et al. (2002), Staszewski and Tomlinson (1997), Tan and Mba (2005), Wang (2008) and Wu et al. (2008).

Since there may be more than one vibration sensor, Baydar and Ball (2001) and Baydar et al. (1999, 2001) introduced the application of MSPC to successfully detect tooth defects. In a comprehensive comparison, Baydar et al. (2001) demonstrated that MSPC is equally as sensitive in detecting tooth defects as signal-based approaches but easier to implement in practice. More recent applications of MSPC methods in monitoring mechanical systems are given by He et al. (2007, 2009) and Malhi and Gao (2004).

8.6.3 Identification of a monitoring model

This section utilizes PCA to identify a data model according to (2.2). Using one of the reference data sets describing a fault-free operating condition, where no portion of the tooth has been removed, the first steps include the centering and scaling of the data and the estimation of the data correlation matrix.

It follows from Figure 8.14 that the mean value of each vibration signal is close to zero, which estimating the means confirmed. The estimated variances, however, differ considerably between the four sensors.

By inspecting the variances, it is apparent that one of them is five times larger than another and also significantly larger than the remaining two. Chapter 4 highlighted that such significant differences may lead process variables with larger variances to dominate the computed score variables.

Jackson (2003) advocated the use of the correlation matrix (i) to ensure that the variables are dimensionless, that is, units such as °C or bar cancel, and (ii) to give each process variable unit variance, which circumvents dominant contributions of variables with large variances. Using the estimated means and variances, the estimated correlation matrix of the four sensor readings is given by

(8.83)

The elements of the estimated correlation matrix suggest that there is some correlation between sensor readings z1 and z2 and between z2 and z4, but z3 shows hardly any correlation with the other sensor readings. In contrast to the previous application studies in this book, the process variables cannot be regarded as highly correlated. Equations (8.84a) and (8.84b) show the eigendecomposition of this matrix

(8.84a)

(8.84b)

At first glance, the first two eigenvalues are above one, whilst the fourth is significantly below one and also significantly smaller than the third. Among the stopping rules for PCA models in Subsection 2.3.1, those that assume a high degree of correlation and a significant contribution of the source signals to the process variables are not applicable here. This, for example, eliminates the VPC and VRE techniques.

An alternative is based on (2.122), which states that the sum of the eigenvalues is equal to the sum of the variances of the process variables. In percentage terms, the inclusion of n = 1, 2, 3 and 4 latent components yields variance contributions of 38.66%, 63.91%, 86.45% and 100%, respectively. This suggests retaining three latent components in the PCA model, as over 86% of the variance of the scaled sensor readings can then be recovered, and yields the following data model

(8.85)

Here, estimates of the model and residual subspaces are spanned by the first three eigenvectors and by the last eigenvector of the estimated correlation matrix, respectively, and can only be recovered up to a similarity transformation. Moreover, Ξ and Sss are assumed to be unknown. A minimal sketch of the variance-based retention rule is given below.
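The variance-based retention rule can be sketched as follows; the function name is hypothetical and the threshold of 85% is an assumption chosen to reproduce the three-component decision above.

```python
import numpy as np

def retain_components(Z, threshold=0.85):
    """Select the number of latent components whose cumulative
    eigenvalue sum first reaches `threshold` of the total variance.

    Z : (N, nz) raw sensor readings; centred and scaled internally,
        so the eigendecomposition acts on the correlation matrix.
    """
    Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)
    R = np.cov(Zs, rowvar=False)            # correlation matrix of Z
    eigval, eigvec = np.linalg.eigh(R)
    order = np.argsort(eigval)[::-1]        # descending eigenvalues
    eigval, eigvec = eigval[order], eigvec[:, order]
    cumulative = np.cumsum(eigval) / eigval.sum()
    n = int(np.searchsorted(cumulative, threshold) + 1)
    return n, eigval, eigvec
```

Applied to data with the variance contributions quoted above (38.66%, 63.91%, 86.45%, 100%), the function returns n = 3.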

Figure 8.15 compares the estimated distribution function (solid line) with a Gaussian distribution of the same mean and variance (dash-dot line). The good agreement confirms the validity of the underlying assumptions of the data model in (8.85).

Figure 8.15 Distribution function of sensor readings.


After establishing a PCA model from one of the reference sets, the next step is to determine the window size k0. The performance of a variety of window sizes was tested using the second reference set, which guarantees statistical independence from the data used for model identification. This entails computing the score variables based on the PCA model established from the first reference set, calculating the four primary and improved residuals, and estimating the distribution function of each improved residual.

Figure 8.16 contrasts the four estimated distribution functions for the primary residuals and the improved residuals for k0 = 10, k0 = 100 and k0 = 400 with Gaussian ones of the same mean and variance. As expected, very substantial departures from a Gaussian distribution arise for the primary residuals and the improved residuals for k0 = 10. A closer but still inaccurate approximation emerges for k0 = 100.

Figure 8.16 Distribution function of primary and improved residuals.


Increasing k0 to 400 gives rise to a substantially more accurate approximation of a Gaussian distribution. As increasing this number further showed insignificant differences and reducing it decreased the accuracy, the selection of k0 = 400 presents a suitable trade-off between accuracy and sensitivity. The sketch below illustrates how such a comparison can be automated.
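One way to automate this comparison is to measure, for each candidate k0, the distance between the empirical distribution of the improved residuals and a fitted Gaussian. The sketch below uses a Kolmogorov-Smirnov distance and the eigenvalue-based primary residuals ti² − λi; both choices, and all names, are assumptions rather than the book's procedure.

```python
import numpy as np
from scipy import stats

def window_normality(T, lam, k0):
    """Kolmogorov-Smirnov distance between the empirical distribution
    of the improved residuals and a fitted Gaussian, for one window
    size k0 (hypothetical helper).

    T   : (N, n) score variables of the validation data
    lam : (n,) eigenvalues of the reference correlation matrix
    """
    phi = T**2 - lam                        # eigenvalue-based primary residuals
    N, n = phi.shape
    distances = []
    for j in range(n):
        # non-overlapping windows keep the improved residuals independent
        blocks = phi[: (N // k0) * k0, j].reshape(-1, k0)
        theta = blocks.sum(axis=1) / np.sqrt(k0)
        d, _ = stats.kstest(theta, 'norm',
                            args=(theta.mean(), theta.std(ddof=1)))
        distances.append(d)
    return distances

# smaller KS distances for k0 = 400 than for k0 = 10 or 100 would
# reproduce the trend visible in Figure 8.16
```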

8.6.4 Detecting a fault condition

The PCA monitoring model determined in the previous subsection is now applied to each of the recorded data sets, that is, the two reference sets and the six data sets describing the fault condition. It allows establishing a total of four non-negative quadratic monitoring statistics:

1. the conventional Hotelling's T2 described in (3.8);
2. the residual Q statistic defined in (3.15);
3. the Hotelling's T2θ statistic based on the first three improved residual variables and defined in (8.63); and
4. the residual Qθ statistic that is constructed from the remaining improved residual variable and also defined in (8.63).

Given that the correlation matrix and the mean vector of the vibration signals are estimated, the control limits for the Hotelling's T2 statistics are obtained by applying (3.5). Applying (3.16) and (3.29) produced the control limits for the Q and Qθ statistics, respectively. The significance level for each control limit is 0.05. The sketch below shows one common realization of such limits.
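Equations (3.5), (3.16) and (3.29) are not reproduced in this chapter; the sketch below therefore shows commonly used forms of these limits, namely an F-distribution-based limit for the Hotelling's T2 statistic with estimated parameters and the Jackson-Mudholkar limit for the residual Q statistic. These may differ in detail from the book's expressions.

```python
import numpy as np
from scipy import stats

def t2_limit(n, K, alpha=0.05):
    """F-distribution-based control limit for Hotelling's T2 when the
    mean and covariance are estimated from K reference samples (a
    common form; the book's (3.5) may differ in detail)."""
    c = n * (K - 1) * (K + 1) / (K * (K - n))
    return c * stats.f.ppf(1 - alpha, n, K - n)

def q_limit(residual_eigvals, alpha=0.05):
    """Jackson-Mudholkar control limit for the residual Q statistic,
    based on the discarded eigenvalues (again a common form)."""
    th1, th2, th3 = (np.sum(residual_eigvals**m) for m in (1, 2, 3))
    h0 = 1.0 - 2.0 * th1 * th3 / (3.0 * th2**2)
    ca = stats.norm.ppf(1 - alpha)
    return th1 * (ca * np.sqrt(2.0 * th2 * h0**2) / th1
                  + 1.0 + th2 * h0 * (h0 - 1.0) / th1**2) ** (1.0 / h0)
```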

Figure 8.17 shows the performance of the monitoring statistics for each of the eight conditions. As expected, the first two reference conditions did not yield a statistically significant number of violations of the control limit. For the remaining data sets, referring to the removal of 10%, 20%, 30%, 40%, 50% and 100% of one tooth of the pinion gear, a different picture emerged. Each of these conditions led to a significant number of violations for each statistic.

Figure 8.17 Non-negative quadratic statistics for conventional PCA (upper two plots) and the statistical local approach (lower two plots).


Table 8.6 lists the percentage of samples for which each statistic violates its control limit. This analysis confirms that the percentages for the two reference conditions are 5% or below, indicating in-statistical-control behavior of the gearbox system. For each of the fault conditions, however, the percentage of violations exceeds 5%, from which it follows that the performance of the gearbox is out-of-statistical-control.

Table 8.6 Violations of control limits by monitoring statistic in [%].


Although each monitoring statistic correctly rejected the null hypothesis for each of the fault conditions, it is important to note that the Hotelling's T2θ and Qθ statistics showed a significantly stronger response to the recorded data involving the manipulated pinion gear. This is in line with the observations in Section 8.5 and confirms the sensitivity of the improved residuals in detecting small alterations in the orientation of the model and residual subspaces and the control ellipsoid.

Identifying PCA models on the basis of each of the eight data sets allows the sensitivity issue to be examined in more detail. This relies on benchmarking the second reference set and the six fault data sets against the first reference set. More precisely, the departures of the eigenvectors and the differences in each of the four eigenvalues enable assessing the changes in the orientation of the model and residual subspaces and of the control ellipsoid. These changes can be described as follows

(8.86)

Here, the indices i = 1, … , 4 and m = 1, … , 7 represent the latent component and the data set, respectively, where m = 1 symbolizes the second reference data set and m = 2, … , 7 corresponds to the data sets 10% Fault, 20% Fault, 30% Fault, 40% Fault, 50% Fault and 100% Fault, respectively. Figure 8.18 summarizes the results of applying (8.86); one plausible realization of this comparison is sketched below. Whilst the eigenvectors and eigenvalues of the correlation matrices of both reference sets are very close to each other, very significant differences can be noticed for the data sets 10% Fault to 100% Fault.
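A plausible realization of (8.86), assuming that the eigenvector departure is measured as the angle between corresponding eigenvectors and the eigenvalue departure as their difference, is the following sketch; the sign ambiguity of eigenvectors is removed by taking absolute inner products, and the function name is hypothetical.

```python
import numpy as np

def subspace_departures(R_ref, R_m):
    """Angles between corresponding eigenvectors and differences
    between corresponding eigenvalues of two correlation matrices."""
    def sorted_eig(R):
        w, V = np.linalg.eigh(R)
        idx = np.argsort(w)[::-1]           # descending eigenvalues
        return w[idx], V[:, idx]
    w0, V0 = sorted_eig(R_ref)
    wm, Vm = sorted_eig(R_m)
    # eigenvectors are sign-ambiguous, hence the absolute value
    cosines = np.abs(np.sum(V0 * Vm, axis=0)).clip(max=1.0)
    angles = np.degrees(np.arccos(cosines))
    return angles, wm - w0
```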

Figure 8.18 Departures of eigenvectors (circles) and eigenvalues (dots) for each set.


It is interesting to note that the first alteration of the tooth, although only 10% of its length was removed, rotated the first eigenvector by around 45°. Apart from the impact of a complete removal of the tooth upon the orientation of the fourth eigenvector, this is the strongest single response.

Similar to the application studies in Subsections 2.1.3 and 6.1.2, the orientation of the model subspace can be assessed by the last eigenvector, which spans the residual subspace. The lower right plot shows a trend that the greater the percentage of the tooth is removed, the larger the rotation of the fourth eigenvector and hence the residual subspace. Given that the model and residual subspaces are complementary subspaces, any rotation of the residual subspace will affect the model subspace too.

With this in mind, it can be concluded that the more severe the fault condition, the more significant the impact upon the orientation of the model and residual subspaces. A similar trend, however, cannot be observed for the fourth eigenvalue. Whilst an initial increase can be noticed, this increase levels off at around 20% as larger portions of the tooth are removed.

Apart from the first eigenpair, the second and third pairs show an approximately proportionate response to the severity of the fault condition. The more of the tooth is removed, the larger the variance of the third score variable becomes. The removal of 20% and 30% produces a reduction in the variance of the second score variable relative to the 10% removal. An increasing trend, however, can also be noticed for the variance of the second score variable as larger portions are removed.

Based on the experimental data obtained, it can be concluded that the presence of a tooth defect increases the variance of the score variables and with it the variance of the vibration signals, which follows from (2.122). It is interesting to note that the relatively small 10% removal of the tooth has a significantly more pronounced effect on the orientation of the control ellipsoid than the removal of 20% and 30%. It is also interesting to note that the removal of 40% has a similar effect upon the monitoring statistics compared to a complete removal of the tooth.

In contrast, removing 50% of the tooth has a less significant effect on the monitoring model than removing only 40%. As stated above, the only direct relationship between the severity of the fault condition and the impact on the geometry of the monitoring model is the orientation of the residual subspace.

Finally, Table 8.7 summarizes the percentage changes in the variance of the score variables resulting from the fault condition. The score variances are computed with respect to the PCA model established from the first reference set, producing the following percentage changes

(8.87)

Table 8.7 Changes in variance of score variables


The entries in Table 8.7 confirm the analysis of the individual conditions in Figure 8.18. The most significant impact upon the variance of the first score variable results from the 10% removal of the tooth. For the remaining score variables, the most significant effects result from the 40% and 100% removals. Moreover, the 10% removal has a considerably stronger impact upon the first three score variances than the 20% and 30% removals. The results in Table 8.7 therefore reflect the observations in the upper two plots of Figure 8.17.
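A plausible reading of the computation behind (8.87) and Table 8.7 is sketched below: each data set is projected onto the loading matrix of the reference PCA model and the score variances are compared in percentage terms. The function name and the assumption that the data are centred and scaled with the reference statistics are illustrative.

```python
import numpy as np

def score_variance_changes(Z_ref, Z_m, P):
    """Percentage change of the score variances of data set Z_m with
    respect to the reference set Z_ref, both projected onto the
    loading matrix P of the reference PCA model.

    Z_ref, Z_m : (N, nz) arrays, centred and scaled with the
                 statistics of the reference set
    P          : (nz, n) loading matrix of the reference model
    """
    var_ref = np.var(Z_ref @ P, axis=0, ddof=1)
    var_m = np.var(Z_m @ P, axis=0, ddof=1)
    return 100.0 * (var_m - var_ref) / var_ref
```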

8.7 Analysis of primary and improved residuals

As Subsection 8.3.4 gives a detailed analysis of the statistical properties of the primary residuals, this section presents more theoretical aspects of the primary and improved residuals. After presenting a detailed proof of the CLT for a sum of i.i.d. random variables, the section investigates the first and second moments of the primary residuals associated with the eigenvectors. This is followed by an examination of the sensitivity of the covariance matrices of the primary residuals to fault conditions. Finally, Subsection 8.7.3 outlines and proves that the non-negative quadratic statistics obtained from the improved residuals are more sensitive than those computed from the score variables.

8.7.1 Central limit theorem

According to (8.16), the statistical local approach relies on a vector-valued function of improved residuals which is, theoretically, an infinite sum of primary residuals. This subsection shows that

$$\theta(K) = \frac{1}{\sqrt{K}}\sum_{k=1}^{K}\varphi(k) \qquad (8.88)$$

converges in distribution to a zero-mean Gaussian vector when K → ∞. To simplify the analysis, we examine the jth element of (8.88), which involves the primary residual ϕj. The definition of the primary residuals in the preceding sections outlined that they have a mean of zero, so we can safely assume here that E{ϕj} = 0. Moreover, for the variance of ϕj we write E{ϕj²} = σ²ϕj. In summary, the first and second moments of ϕj are

$$E\{\varphi_j\} = 0 \qquad E\{\varphi_j^2\} = \sigma_{\varphi_j}^2 \qquad (8.89)$$

With this in mind, the jth element in (8.88) can be reformulated as follows

$$\theta_j(K) = \frac{1}{\sqrt{K}}\sum_{k=1}^{K}\varphi_j(k) \qquad (8.90)$$

Given that the samples z0(k) are drawn independently from the same distribution function, the instances of the primary residuals ϕj(k) are also drawn independently from the distribution function F(ϕj) in (8.55). Moreover, as pointed out before, each instance is drawn from the same distribution function F(ϕj), so these instances are i.i.d. With respect to the preceding discussion, we can now formulate a simplified version of the Lindeberg-Lévy theorem, defining the CLT for the infinite i.i.d. sequence of ϕj.


Theorem 8.7.1
The scaled sum of ϕj(1), ϕj(2), …, ϕj(K), which have the same distribution function F(ϕj), is asymptotically Gaussian distributed, that is, θj(K) converges in distribution to N(0, σ²ϕj) as K → ∞, given that E{ϕj} = 0 and σ²ϕj = E{ϕj²} < ∞.

 


Proof.
The proof of Theorem 8.7.1 commences by defining the characteristic function of the improved residuals

$$\chi_{\theta_j}(c) = E\left\{e^{\,\mathrm{i}c\,\theta_j}\right\} \qquad (8.91)$$

where i = √−1 is the imaginary unit and c ∈ ℝ. It is important to note that the characteristic function provides an equivalent basis to the cumulative distribution function for defining and describing the statistical properties of a random variable. For example, if two cumulative distribution functions are identical, so are their characteristic functions. This is taken advantage of here in order to prove that the scaled sum in (8.90) asymptotically follows a Gaussian distribution.

The first step is to substitute the definition of θj into (8.91). With respect to the definition of the characteristic function, this gives rise to

$$\chi_{\theta_j}(c) = E\left\{\exp\left(\frac{\mathrm{i}c}{\sqrt{K}}\sum_{k=1}^{K}\varphi_j(k)\right)\right\} = \prod_{k=1}^{K} E\left\{\exp\left(\frac{\mathrm{i}c}{\sqrt{K}}\,\varphi_j(k)\right)\right\} \qquad (8.92)$$

The fact that the random variables ϕj(1), ϕj(2), …  , ϕj(K) are i.i.d. implies that their distribution and characteristic functions are identical. The product in (8.92) can therefore be simplified to

$$\chi_{\theta_j}(c) = \left[\chi_{\varphi_j}\!\left(\frac{c}{\sqrt{K}}\right)\right]^{K} \qquad (8.93)$$

Here, χϕj(c/√K) = [χθj(c)]^(1/K), that is, the characteristic function of ϕj is the Kth root of the characteristic function of θj. The next step is to develop a Taylor series of χϕj(c/√K) around c = 0

$$\chi_{\varphi_j}\!\left(\frac{c}{\sqrt{K}}\right) = 1 + \frac{\mathrm{i}c}{\sqrt{K}}\,E\{\varphi_j\} - \frac{c^2}{2K}\,E\{\varphi_j^2\} + R_3 \qquad (8.94)$$

where the last term, R3, is the Lagrangian remainder. The required relationships can be obtained from the definition of the characteristic function

$$\left.\frac{d^{\,m}\chi_{\varphi_j}(c)}{dc^{\,m}}\right|_{c=0} = \mathrm{i}^{\,m}\, E\{\varphi_j^m\} \qquad (8.95)$$

According to Equation (8.89), the relationships up to order three are therefore:

  • χϕj(0) = 1;
  • dχϕj(c)/dc|c=0 = iE{ϕj} = 0;
  • d²χϕj(c)/dc²|c=0 = −E{ϕj²} = −σ²ϕj;
  • d³χϕj(c)/dc³|c=0 = −iE{ϕj³}.

Here, the Lagrangian remainder R3 is a small correction term involving E{ϕj³}, the third moment of ϕj. Substituting these relationships into (8.94) yields

$$\chi_{\varphi_j}\!\left(\frac{c}{\sqrt{K}}\right) = 1 - \frac{c^2\,\sigma_{\varphi_j}^2}{2K} + O\!\left(K^{-3/2}\right) \qquad (8.96)$$

Substituting (8.96) into (8.93) gives rise to

$$\chi_{\theta_j}(c) = \left[1 - \frac{c^2\,\sigma_{\varphi_j}^2}{2K} + O\!\left(K^{-3/2}\right)\right]^{K} \qquad (8.97)$$

Note that the characteristic function of the standard Gaussian distribution is e^(−c²/2) and that of N(0, σ²ϕj) is e^(−σ²ϕj c²/2). Therefore, the expression in (8.97) asymptotically converges to

$$\lim_{K\to\infty} \chi_{\theta_j}(c) = e^{-\sigma_{\varphi_j}^2 c^2/2} \qquad (8.98)$$

This follows from lim K→∞ (1 + x/K)^K = e^x, which completes the proof.
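The convergence established by Theorem 8.7.1 is easy to visualize numerically. The sketch below draws eigenvalue-based primary residuals ti² − λi, which are markedly skewed, and verifies that their scaled sum in (8.90) has approximately the mean and variance predicted by the theorem; all numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

lam = 2.0                                   # score variance (illustrative)
K = 400                                     # number of summed residuals
M = 20_000                                  # Monte Carlo repetitions

t = rng.normal(scale=np.sqrt(lam), size=(M, K))
phi = t**2 - lam                            # primary residuals, chi-square shaped
theta = phi.sum(axis=1) / np.sqrt(K)        # improved residuals, eq. (8.90)

print(theta.mean())                         # close to 0
print(theta.var(ddof=1))                    # close to 2 * lam**2 = 8
# a histogram of theta is visually indistinguishable from a Gaussian,
# although each individual phi is markedly skewed
```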

8.7.2 Further statistical properties of primary residuals

After proving the CLT, we now focus on further properties of the primary residuals. Subsection 8.3.4 gave the first two moments of the eigenvalue-based primary residuals. This subsection determines the covariance and cross-covariance matrices of the primary residuals tiz0 − λipi. It also discusses how the covariance matrices of the eigenvalue-based residuals and of tiz0 − λipi are affected by a change in the eigenvalues and eigenvectors.

8.7.2.1 Covariance matrices of primary residuals in equation (8.20)

The determination of the covariance and cross-covariance matrices of the primary residual vectors ϕi = tiz0 − λipi and ϕj = tjz0 − λjpj, i ≠ j, requires the substitution of the data model z0 = Σm tmpm into the expectations E{ϕiϕiT} and E{ϕiϕjT}, respectively.

Covariance matrix

Starting with the covariance matrix of ϕi

$$E\{\varphi_i\varphi_i^T\} = E\left\{(t_i z_0 - \lambda_i p_i)(t_i z_0 - \lambda_i p_i)^T\right\} = E\{t_i^2 z_0 z_0^T\} - \lambda_i^2\, p_i p_i^T \qquad (8.99)$$

the expectation on the right-hand side of (8.99) can be analyzed separately by substituting z0 = Σm tmpm, which yields

$$E\{t_i^2 z_0 z_0^T\} = E\left\{t_i^2 \left(\sum_{j=1}^{n_z} t_j p_j\right)\left(\sum_{m=1}^{n_z} t_m p_m\right)^T\right\} \qquad (8.100a)$$

$$= \sum_{j=1}^{n_z}\sum_{m=1}^{n_z} E\{t_i^2 t_j t_m\}\, p_j p_m^T \qquad (8.100b)$$

$$= \sum_{m=1}^{n_z} E\{t_i^2 t_m^2\}\, p_m p_m^T \qquad (8.100c)$$

The above simplification follows from E{titj} = 0 for all i ≠ j. The fact that the double sum in (8.100b) contains a total of nz² expectation terms gives rise to the following separation.

  • For j ≠ m, covering a total of nz² − nz terms, each expectation vanishes

$$E\{t_i^2 t_j t_m\} = E\{t_i^2\}E\{t_j t_m\} + 2E\{t_i t_j\}E\{t_i t_m\} = 0$$

which follows from the Isserlis theorem (Isserlis 1918).
  • There are nz − 1 cases of i ≠ j = m, which yield the general expression

$$E\{t_i^2 t_j^2\} = E\{t_i^2\}E\{t_j^2\} + 2E^2\{t_i t_j\} = \lambda_i\lambda_j$$

  • The remaining term, i = j = m, is equal to

$$E\{t_i^4\} = 3E^2\{t_i^2\} = 3\lambda_i^2$$

Putting this all together, the expectation in (8.100c) reduces to

$$E\{t_i^2 z_0 z_0^T\} = \lambda_i \sum_{j \neq i} \lambda_j\, p_j p_j^T + 3\lambda_i^2\, p_i p_i^T = \lambda_i S_{z_0 z_0} + 2\lambda_i^2\, p_i p_i^T \qquad (8.101)$$

Substituting (8.101) into (8.99) finally yields

$$E\{\varphi_i\varphi_i^T\} = \lambda_i S_{z_0 z_0} + \lambda_i^2\, p_i p_i^T \qquad (8.102)$$

Cross-covariance matrix

The cross-covariance matrix E{ϕiϕjT}, i ≠ j and i, j = 1, 2, … , nz, is obtained in a similar fashion to (8.99) and is given by

$$E\{\varphi_i\varphi_j^T\} = E\left\{(t_i z_0 - \lambda_i p_i)(t_j z_0 - \lambda_j p_j)^T\right\} = E\{t_i t_j z_0 z_0^T\} - \lambda_i\lambda_j\, p_i p_j^T \qquad (8.103)$$

Using the simplifications applied to (8.99), (8.103) reduces to

$$E\{t_i t_j z_0 z_0^T\} = E\left\{t_i t_j \left(\sum_{m=1}^{n_z} t_m p_m\right)\left(\sum_{l=1}^{n_z} t_l p_l\right)^T\right\} \qquad (8.104a)$$

$$= \sum_{m=1}^{n_z}\sum_{l=1}^{n_z} E\{t_i t_j t_m t_l\}\, p_m p_l^T \qquad (8.104b)$$

$$= E\{t_i^2 t_j^2\}\left(p_i p_j^T + p_j p_i^T\right) \qquad (8.104c)$$

Given that E{ti tj tm tl} = 0 unless (m, l) = (i, j) or (m, l) = (j, i), (8.104b) reduces to

$$E\{t_i t_j z_0 z_0^T\} = \lambda_i\lambda_j\left(p_i p_j^T + p_j p_i^T\right) \qquad (8.105)$$

Putting it all together, (8.103) finally yields

$$E\{\varphi_i\varphi_j^T\} = \lambda_i\lambda_j\, p_j p_i^T \qquad (8.106)$$
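The expressions reconstructed above for (8.102) and (8.106) can be checked by simulation. The following sketch draws Gaussian scores with illustrative eigenvalues and compares the empirical covariance and cross-covariance of the primary residuals with the closed-form results; all names and numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

lam = np.array([3.0, 1.5, 0.5])                  # illustrative eigenvalues
P = np.linalg.qr(rng.standard_normal((3, 3)))[0] # orthonormal eigenvectors
S = P @ np.diag(lam) @ P.T                       # data covariance matrix
N = 500_000

t = rng.standard_normal((N, 3)) * np.sqrt(lam)   # independent Gaussian scores
Z = t @ P.T                                      # z0 = sum_m t_m p_m

i, j = 0, 1
phi_i = t[:, [i]] * Z - lam[i] * P[:, i]         # t_i z0 - lam_i p_i
phi_j = t[:, [j]] * Z - lam[j] * P[:, j]

emp_cov = phi_i.T @ phi_i / N
emp_cross = phi_i.T @ phi_j / N

pi, pj = P[:, [i]], P[:, [j]]
print(np.allclose(emp_cov, lam[i] * S + lam[i]**2 * pi @ pi.T, atol=0.1))  # (8.102)
print(np.allclose(emp_cross, lam[i] * lam[j] * pj @ pi.T, atol=0.1))       # (8.106)
```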

8.7.2.2 Covariance matrix of ϕi for a change in λi

Under the assumption that the eigenvectors remain constant, changing the variance of the ith score variable to λi + Δλi results in the following alteration of the primary residual ϕi

(8.107)

The expectation of this altered primary residual is

(8.108)

which implies that the covariance matrix becomes

(8.109)

Thus, a change in the eigenvalue yields the same covariance matrix for the original and the altered primary residuals.

8.7.2.3 Covariance matrix of ϕi for a change in pi

An alteration of the eigenvector, pi → pi + Δpi, does not have the same isolated impact upon the primary residual ϕi as a change in the eigenvalue. This is because a rotation of the ith eigenvector affects more than just this eigenvector, since the eigenvectors are mutually orthonormal. If we restrict this examination by assuming that only the ith eigenvector is altered and that the remaining nz − 1 eigenvectors and the score covariance matrix remain unchanged, the altered primary residual becomes

(8.110)

Note that the altered eigenvector is, in general, no longer orthogonal to pj for j ≠ i. The expectation of the altered primary residual is

(8.111)

which gives rise to the following covariance matrix

(8.112)

The difference between the covariance matrices of the original and the altered primary residuals is therefore

(8.113)

8.7.2.4 Covariance of ti² − λi for a change in λi

Changing the variance of the ith score variable to λi + Δλi alters the eigenvalue-based primary residual to

(8.114)

and has the following expectation

(8.115)

which gives rise to the following covariance

(8.116)

which follows from (8.48). The difference between the original and the altered covariances is therefore

(8.117)

8.7.2.5 Covariance of ti² − λi for a change in pi

Changing pi to pi + Δpi implies that it is no longer orthogonal to the remaining nz − 1 eigenvectors. Assuming that ‖pi + Δpi‖ = 1 and defining the altered set of loading vectors by pi′ = pi + Δpi and pj′ = pj for all j ≠ i, the expectation of the altered primary residual becomes

(8.118)

Next, substituting z0 = Σm tmpm into (8.118) gives rise to

(8.119)

That this expectation does not vanish is interesting and follows from the assumption that ‖pi′‖ = ‖pi + Δpi‖ = 1. Using this mean value, the covariance of the altered primary residual is equal to

(8.120)

It follows that this covariance differs from that in (8.116), since the variance of the altered score variable departs from λi.

8.7.3 Sensitivity of statistics based on improved residuals

The previous sections showed that the eigenvalue-based primary residuals ti² − λi are sufficient for detecting changes in the underlying geometry of the data model in (2.2). Furthermore, the mean and variance of these residuals are 0 and 2λi², respectively, and there is no covariance between the ith and jth primary residuals, that is, E{(ti² − λi)(tj² − λj)} = 0 for i ≠ j.

Furthermore, the primary residuals are i.i.d., implying that the improved residuals are asymptotically Gaussian distributed and have the same mean, variance and covariance as the primary residuals. The covariance matrix of the vector containing the improved residuals of the first n eigenpairs is consequently 2Λ², where Λ is the diagonal matrix storing the n dominant eigenvalues.

The sensitivity in detecting various fault conditions is now examined for a simple sensor or actuator bias. The scope of Project 3 in the tutorial section covers more complex process faults that alter the eigenvectors and eigenvalues, contrasting the sensitivity of the non-negative quadratic statistics constructed from the improved residuals with those based on the score variables.

8.7.3.1 Sensitivity for detecting sensor or actuator bias

The data model describing a sensor fault is as follows

(8.121)

According to Table 2.1, the effect of a sensor or actuator fault can be described by an offset term (a bias) and a stochastic term (a precision degradation). Both terms can also describe a complete failure, or a drift if the offset term is assumed to be time-varying and deterministic. For a sensor bias, described by a constant bias term Δz0, the ith improved residual becomes

(8.122)

Expectation of the biased improved residual

The expectation follows from

(8.123)

Variance of the biased improved residual

Defining the mean-centered improved residual by

(8.124)

simplifies the determination of the variance of the improved residual

(8.125)

which yields the variance of the improved residual under the fault condition. The next step is to examine the impact of Δz0 upon the non-negative quadratic monitoring statistics.

Effect of Δz0 upon the Hotelling's T2θ statistic

This impact can be described by the expectation of the T2θ statistic under the bias

(8.126)

which requires examining the following expectation

(8.127)

from which it follows that

(8.128)

Equation (8.126) can now be evaluated, which yields

(8.129)

This compares favorably to the sensitivity of the conventional Hotelling's T2 statistic

(8.130)

The difference between (8.129) and (8.130) is then a measure of the sensitivity of the Hotelling's T2 statistics constructed from the improved residuals and from the score variables

(8.131)

For the Hotelling's T2 statistics, (8.131) outlines that, irrespective of the window length k0, the non-negative quadratic statistic constructed from the improved residuals is more sensitive to a set of sensor or actuator biases. Moreover, the equation also highlights that the larger the value of k0, the more significant this increase in sensitivity becomes.

Constructing the residual Q statistics on the basis of (3.19) yields the same conclusion. Since the discarded eigenvalues are smaller than the dominant ones, (8.131) also confirms that the increase in sensitivity is more pronounced for the Q statistic.

The application studies in Sections 8.4 and 8.6 confirm the above findings. More precisely, Figures 8.7, 8.9 and 8.17 illustrate that the non-negative quadratic statistics based on the score variables show sporadic violations of their control limits, which nonetheless exceed the significance level in number. In contrast, the statistics constructed from the improved residuals show a considerably stronger response and produce, in almost every case, a constant violation of their control limits.
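The sensitivity argument can be illustrated with a short Monte Carlo sketch. Under the assumptions that the improved residuals are the scaled window sums of the eigenvalue-based primary residuals ti² − λi with fault-free covariance 2Λ², and that a sensor bias maps to a constant offset in score space, the statistic built from the improved residuals departs from its fault-free mean by an amount that grows with k0, whereas the conventional T2 does not; all numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

lam = np.array([3.0, 1.5, 0.5])             # illustrative score variances
n = lam.size
k0 = 400                                    # window length
M = 2000                                    # number of evaluated windows
dt = np.array([0.4, 0.0, 0.0])              # sensor bias mapped into score space

t = rng.standard_normal((M, k0, n)) * np.sqrt(lam) + dt

# conventional Hotelling's T2, averaged over all biased samples
T2 = np.mean(np.sum(t**2 / lam, axis=2))

# statistic built from the eigenvalue-based improved residuals
phi = t**2 - lam                            # primary residuals
theta = phi.sum(axis=1) / np.sqrt(k0)       # improved residuals per window
T2_theta = np.mean(np.sum(theta**2 / (2.0 * lam**2), axis=1))

print(T2, T2_theta)
# Both statistics have mean n = 3 in the fault-free case. The shift of
# T2 is independent of k0, whereas the departure of T2_theta from its
# fault-free mean grows with k0, in line with the reading of (8.131).
```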

8.8 Tutorial session

Question 1:

Describe under which conditions conventional scatter diagrams, the Hotelling's T2 statistic and the Q statistic are insensitive to fault conditions. What is the effect of such changes upon Type II errors?

Question 2:

Referring to Question 1, analyze how these changes can be detected.

Question 3:

Explain why the primary residuals are difficult to use for constructing monitoring charts. How can the associated problems be overcome?

Question 4:

What is the problem of using primary residuals that are based on the eigenvectors of the data covariance matrix? Are the primary residuals related to the eigenvalues of the data covariance matrix also affected by this problem? Are both types of primary residuals sensitive to geometric changes in the model and residual subspaces as well as changes in the variances of the source and error variables?

Question 5:

Describe the properties of the primary and improved residuals based on the eigenvalues of the covariance matrix.

Question 6:

Discuss the assumptions under which the central limit theorem holds true. What are the conditions under which the central limit theorem does not hold true?

Project 1:

Using a Monte Carlo simulation based on the second intuitive example, described in (8.8), augment the stochastic vector z0 by a time-varying mean vector and comment on the Type I and II errors. For PCA, discuss how to develop an adaptive monitoring approach to construct primary residuals. How can such an adaptive approach be utilized to determine improved residuals?

Project 2:

Develop primary and improved residuals on the basis of the PLS objective functions for determining the weight and loading vectors. Can a fault that (i) only affects the input variables, or (ii) only affects the output variables, or (iii) affects the input and the output variables, be detected? Discuss the results and propose a reliable monitoring scheme for detecting geometric changes in the weight and loading vectors and the variance of the score variables.

Project 3:

Assume that a fault condition affects the ith eigenvector and/or eigenvalue of the data covariance matrix. Similar to the analysis in Subsection 8.7.3, develop and contrast the relationships describing the impact of such a change upon the Hotelling's T2 and Q statistics constructed from the score variables and from the improved residuals.

 

 

1 This relatively large range outlines, again, the problem of selecting an appropriate size for the reference data set, discussed in Section 6.4 and Subsection 7.3.6.

2 !! denotes the double factorial, the product of the odd numbers only, e.g. 9!! = 1 · 3 · 5 · 7 · 9 = 945.
