Chapter 3

Process monitoring charts

The aim of this chapter is:

  • to design monitoring charts on the basis of the extracted LV sets and the residuals;
  • to show how to utilize these charts for evaluating the performance of the process and for assessing product quality on-line; and
  • to outline how to diagnose behavior that is identified as abnormal by these monitoring charts.

For monitoring a complex process on-line, the set of score and residual variables gives rise to the construction of a statistical fingerprint of the process. This fingerprint serves as a benchmark for assessing whether the process is in-statistical-control or out-of-statistical-control. Based on the discussion in Chapter 2, the construction of this fingerprint relies on the following assumptions for identifying PCA/PLS data models:

  • the error vectors associated with the PCA/PLS data models follow a zero mean Gaussian distribution that is described by full rank covariance matrices;
  • the score variables, describing common cause variation of the process, follow a zero mean Gaussian distribution that is described by a full rank covariance matrix;
  • for any recorded process variable, the variance contribution of the source signals (common cause variation) is significantly larger than the variance contribution of the corresponding error signal;
  • the number of source variables is smaller than the number of recorded process (PCA) or input variables (PLS);
  • recorded variable sets have constant mean and covariance matrices over time;
  • the process is a representation of the data models in either 2.2, 2.24 or 2.51;
  • none of the process variables possess any autocorrelation; and
  • the cross-correlation function of any pair of process variables is zero for two different instances of time, as described in Subsection 2.1.1.

Part III of this book presents extensions of conventional MSPC which allow relaxing the above assumptions, particularly the assumption of Gaussian distributed source signals and time-invariant (steady state) process behavior.

The statistical fingerprint includes scatter diagrams, which Section 1.2 briefly touched upon, and non-negative squared statistics involving the t-score variables and the residuals of the PCA and PLS models. For the construction of monitoring models, this chapter assumes the availability of the data covariance and cross-covariance matrices. In this regard, the weight, loading and score variables do not need to be estimated from a reference data set. This simplifies the presentation of the equations derived, as the hat notation is not required.

Section 3.1 introduces the tools for constructing the statistical fingerprint for on-line process monitoring and detecting abnormal process conditions that are indicative of a fault condition. Fault conditions could range from simple sensor or actuator faults to complex process faults. Section 3.2 then summarizes tools for diagnosing abnormal conditions to assist experienced plant personnel in narrowing down potential root causes. Such causes could include, among many other possible scenarios, open bypass lines, a deteriorating performance of a heat exchanger, a tray or a pump, a pressure drop in a feed stream, a change in the composition of input feeds, abnormal variation in the temperature of input or feed streams, deterioration of a catalyst, partial or complete blockage of pipes, and valve stiction.

The diagnosis offered in Section 3.2 identifies to what extent a recorded variable is affected by an abnormal event. This section also shows how to extract time-based signatures for process variables if the effect of a fault condition deteriorates the performance of the process over time. Section 3.3 finally presents (i) a geometric analysis of the PCA and PLS projections to demonstrate that fault diagnosis based on the projection of a single sample along predefined directions may lead to erroneous diagnosis results in the presence of complex fault conditions and (ii) a discussion of how to overcome this issue. A tutorial session concerning the material covered in this chapter is given in Section 3.4.

3.1 Fault detection

Following from the discussion in Chapter 2, PCA and PLS extract latent information in the form of latent score variables and residuals from the recorded variables. According to the data models for PCA in 2.2 and 2.6 and PLS in 2.23, 2.24 and 2.51, the t-score variables describe common cause variation that is introduced by the source vector s. Given that the number of t-score variables is typically significantly smaller than the number of recorded variables, MSPC allows process monitoring on the basis of a reduced set of score variables rather than relying on charting a larger number of recorded process variables.

With respect to the assumptions made for the data structures for PCA and PLS in Chapter 2, the variation described by the t-score variables recovers the variation of the source variables. Hence, the variation encapsulated in the t-score variables recovers significant information from recorded variables, whilst the elements in the error vector have an insignificant variance contribution to the process variable set. Another fundamental advantage of the t-score variables is that they are statistically independent, which follows from the analysis of PCA and PLS in Chapters 9 and 10, respectively.

The t-score variables can be plotted in scatter diagrams for which the confidence regions are the control ellipses discussed in Subsection 1.2.3. For a time-based analysis, MSPC relies on non-negative quadratics that include Hotelling's T2 statistics and residual-based squared prediction error statistics, referred to here as Q statistics. Scatter diagrams are not time-based but allow the monitoring of pairs or triples of t-score variables. In contrast, the time-based Hotelling's T2 statistics present an overall measure of the variation within the process.

The next two subsections provide a detailed discussion of scatter diagrams and the Hotelling's T2 statistic. It is important to note that MRPLS may generate two Hotelling's T2 statistics, one for the common cause variation in the predictor and response variable sets and one for variation that is only manifested in the input variables and is not predictive for the output variables. This is discussed in more detail in the next paragraph and Subsection 3.1.2.

The model residuals describe the mismatch between the recorded variables and what the t-score variables, or source variables, can recover from the original variables. Depending on the variance of the discarded t-score variables (PLS) or the variance of the t′-scores (MRPLS), these score variables may be used to construct a Hotelling's T2 or a residual Q statistic. Whilst Hotelling's T2 statistics present a measure that relates to the source signals, that is, the significant variation for recovering the input variables, the residual Q statistic is a measure that relates to the model residuals.

Loosely speaking, a Q statistic is a measure of how well the reduced dimensional data representation in 2.2, 2.24 or 2.51 describes the recorded data. Figure 1.7 presents an illustration of perfect correlation, where the sample projections fall onto the line describing the relationship between both variables. In this extreme case, the residual vector is of course zero, as the values of both variables can be recovered without an error from the projection of the associated sample.

If, however, the projection of a sequence of samples does not fall onto this line, an error for the recovery of the original variables has occurred, which is indicative of abnormal process behavior. For a high degree of correlation, Figure 1.9 shows that the recovered values of each sample, using the projection of the samples onto the semimajor axis of the control ellipse, are close to the recorded values. The perception of ‘close’ can be statistically described by the residual variables, their variances and the control limit of the residual-based monitoring statistic.

3.1.1 Scatter diagrams

Figures 1.6, 1.7 and 1.9 show that the shape of the scatter diagram relates to the correlation between two variables. Extensions of the 2D scatter diagrams to 3D scatter diagrams are possible, although it is difficult to graphically display a 3D control ellipsoid. For PCA and PLS, the t-score variables, t = Pᵀz0 for PCA and t = Rᵀx0 for PLS, are uncorrelated and have the following covariance matrices

E{ttᵀ} = Λ = diag{λ1, λ2, … , λn}   (3.1)

and

E{ttᵀ} = RᵀSx0x0R   (3.2)

The construction of the weight matrix R is shown in Chapter 10. Under the assumption that the exact data covariance matrix is known a priori, the control ellipse for i ≠ j has the following mathematical description

ti²/λi + tj²/λj ≤ χ²α(2)   (3.3)

where χ²α(2) is the critical value of a χ2 distribution with two degrees of freedom and a significance of α. The length of both axes depends on the variance of the ith and jth t-score variable, denoted here by λi and λj. These variances correspond to the diagonal elements of the matrices in (3.1) and (3.2), noting that 1 ≤ i, j ≤ n.
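The membership test for the control ellipse can be sketched in a few lines; this is an illustrative example (the function and variable names are not from the text), and it exploits the fact that the χ2 critical value for two degrees of freedom has the closed form −2 ln α:

```python
import math

def in_control_ellipse(t_i, t_j, lam_i, lam_j, alpha=0.01):
    """Return True if the score pair lies inside the control ellipse,
    i.e. t_i^2/lam_i + t_j^2/lam_j <= chi2_alpha(2)."""
    crit = -2.0 * math.log(alpha)   # chi-squared critical value, 2 dof
    return t_i**2 / lam_i + t_j**2 / lam_j <= crit
```

A sample for which the function returns False falls outside the 100(1 − α)% confidence region of the scatter diagram.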

It is straightforward to generate a control ellipse for any combination of score variables for n > 2. This, however, raises the following question: how can such scatter plots be adequately depicted? A naive solution would be to extend the 2D concept into an nD concept, where the 2D control ellipse becomes an nD-ellipsoid

tᵀΛ⁻¹t = Σ_{i=1}^{n} ti²/λi ≤ χ²α(n)   (3.4)

where χ²α(n) is the critical value of a χ2 distribution with n degrees of freedom. While it is still possible to depict a control ellipsoid that encompasses the orthogonal projections of the data points onto the n = 3-dimensional model subspace, this is not the case for n > 3. A pragmatic solution could be to display pairs of score variables, an example of which is given in Chapter 5.

It is important to note that (3.3) and (3.4) only hold true if the exact covariance matrix of the recorded process variables is known (Tracey et al. 1992). If the covariance matrix must be estimated from the reference data, as shown in Sections 2.1 and 2.2, the approximation by a χ2-distribution may be inaccurate if few reference samples, K, are available. In this practically important case, (3.4) follows an F-distribution under the assumption that the covariance matrix of the score variables has been estimated independently from the score variables. For a detailed discussion of this, refer to Theorem 5.2.2 in Anderson (2003). The critical value T²α in this case is given by (MacGregor and Kourti 1995; Tracey et al. 1992)

T²α = [n(K² − 1)/(K(K − n))] Fα(n, K − n)   (3.5)

where Fα(n, K − n) is the critical value of an F-distribution for n and K − n degrees of freedom, and a significance of α. It should be noted that the value of T²α converges to χ²α(n) as K → ∞ (Tracey et al. 1992) and, if the variable mean is known a priori, T²α becomes (Jackson 1980)

(3.6)
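As a numerical illustration of the convergence noted above, the following sketch computes the control limit of (3.5) and compares it with the χ2 critical value for a large K; it assumes SciPy is available, and the function name is illustrative:

```python
from scipy.stats import chi2, f

def t2_limit(n, K, alpha=0.01):
    """Control limit of the Hotelling's T^2 statistic for n retained
    score variables when the covariance matrix is estimated from K
    reference samples (scaled F-distribution)."""
    scale = n * (K**2 - 1) / (K * (K - n))
    return scale * f.ppf(1.0 - alpha, n, K - n)

# For a small reference set the limit is noticeably wider than the
# chi-squared value; as K grows the two coincide.
limit_small = t2_limit(3, 50)
limit_large = t2_limit(3, 10**6)
asymptote = chi2.ppf(0.99, 3)
```

This illustrates why the χ2 approximation is only adequate for large reference sets.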

3.1.2 Non-negative quadratic monitoring statistics

Non-negative quadratic statistics can be interpreted as a kinetic energy measure that condenses the variation of a set of n score variables or the model residuals into single values. The term non-negative quadratic goes back to Box (1954) and implies that such a statistic relies on the sum of squared values of a given set of stochastic variables. For PCA, the t-score variables and the residual variables can be used for such statistics. In the case of PLS, however, a total of three univariate statistics can be established: one that relates to the t-score variables and two further statistics that correspond to the residuals of the output variables and the remaining variation of the input variables.

The next two paragraphs present the definition of non-negative quadratics for the t-score variables and detail the construction of the residual-based ones for PCA and PLS. For the remainder of this book, the loading matrix for PCA and PLS is denoted by P and only contains the first n column vectors, that is, the ones referring to common-cause variation. For PCA, this matrix has nz rows and n columns; for PLS, it has nx rows and n columns. The discarded loading vectors are stored in a second matrix, defined as Pd. Moreover, the computed score vector, t = Pᵀz0 for PCA and t = Rᵀx0 for PLS, is of dimension n.

3.1.2.1 PCA monitoring models

The PCA data model includes the estimation of:

  • the model subspace;
  • the residual subspace;
  • the error covariance matrix;
  • the variance of the orthogonal projection of the samples onto the loading vectors; and
  • the control ellipse/ellipsoid.

The model and residual subspaces are spanned by the column vectors of P and Pd, respectively. Sections 6.1 and 9.1 provide a detailed analysis of PCA, where these geometric aspects are analyzed in more detail.

According to 2.2, the number of source signals determines the dimension of the model subspace. The projection of the samples onto the model subspace therefore yields the source variables that are corrupted by the error variables, which the relationship in 2.8 shows. Moreover, the mismatch between the data vector z0 and the orthogonal projection of z0 onto the model subspace, g, does not include any information of the source signals, which follows from

g = z0 − PPᵀz0 = (I − PPᵀ)z0 = PdPdᵀz0   (3.7)

The above relationship relies on the fact, outlined in 2.7, that the eigenvectors of the data covariance matrix are mutually orthonormal. The score vector t = Pᵀz0, approximating the variation of the source vector s, and the residual vector g give rise to the construction of two non-negative squared monitoring statistics, the Hotelling's T2 and Q statistics that are introduced below.
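The decomposition into a score vector and a residual vector can be illustrated numerically; the snippet below builds a hypothetical orthonormal loading matrix and verifies that the residual carries no model-subspace information (all data here are synthetic and the names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: nz = 5 recorded variables, n = 2 retained loadings.
# QR factorization yields orthonormal columns, mimicking eigenvectors.
P = np.linalg.qr(rng.standard_normal((5, 5)))[0][:, :2]

z0 = rng.standard_normal(5)   # one mean-centered sample
t = P.T @ z0                  # score vector
g = z0 - P @ t                # residual vector, g = (I - P P^T) z0

# The residual is orthogonal to the model subspace, so P^T g = 0: the
# residuals hold no information recoverable by the retained loadings.
orthogonal = np.allclose(P.T @ g, 0.0)
```

The sample is recovered exactly as the sum of its model-subspace and residual-subspace parts, z0 = P t + g.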

Hotelling's T2 statistic

The univariate statistic for the t-score variables, T2, is defined as follows

T2 = tᵀΛ⁻¹t   (3.8)

The matrix Λ includes the largest n eigenvalues of the data covariance matrix as diagonal elements. For a significance α, the control limit for the above statistic, T²α, is equal to χ²α(n) if the covariance matrix of the recorded process variables is known. If this is not the case, the control limit can be obtained as shown in (3.5) or (3.6). The above non-negative quadratic is also referred to as a Hotelling's T2 statistic. The null hypothesis for testing whether the process is in-statistical-control, H0, is as follows

H0: T2 ≤ T²α   (3.9)

and the hypothesis H0 is rejected if

T2 > T²α   (3.10)

The alternative hypothesis H1, the process is out-of-statistical-control, is accepted if H0 is rejected.
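The statistic and the test in (3.8) to (3.10) can be sketched as follows; the function names are illustrative and the control limit is supplied by the caller, e.g. the χ2 critical value when the covariance matrix is known or the value from (3.5) otherwise:

```python
import numpy as np

def hotelling_t2(t, lam):
    """T^2 = t' diag(lam)^(-1) t for a score vector t whose elements
    have the variances lam (the retained eigenvalues)."""
    t, lam = np.asarray(t, float), np.asarray(lam, float)
    return float(np.sum(t**2 / lam))

def out_of_control(t, lam, limit):
    # Reject H0 (in-statistical-control) when T^2 exceeds its limit.
    return hotelling_t2(t, lam) > limit
```

A single scalar per sample is charted against the limit, instead of n separate score trends.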

Assume that the fault condition, representing the alternative hypothesis H1, describes a bias of the mth sensor, denoted by zf = z0 + Δz, where the mth element of Δz is nonzero and the remaining entries are zero. The score variables then become tf = Pᵀzf = t + PᵀΔz. This yields the following impact upon the Hotelling's T2 statistic, denoted here by T²f, where the subscript f refers to the fault condition

T²f = (t + Δt)ᵀΛ⁻¹(t + Δt) = tᵀΛ⁻¹t + 2tᵀΛ⁻¹Δt + ΔtᵀΛ⁻¹Δt   (3.11)

The above equation uses Δt = PᵀΔz. The alternative hypothesis, H1, is therefore accepted if

T²f > T²α   (3.12)

and rejected if T²f ≤ T²α. A more detailed analysis of the individual terms in (3.11) yields that

  • 1255;
  • 1256;
  • 1257;
  • 1258; and
  • 1259.

If the term 1260 is hypothetically set to zero, 1261. The larger the fault magnitude, Δzm the more the original T2 statistic is shifted, which follows from

(3.13)

which is equal to

(3.14)

The impact of the term 1264 upon 1265 is interesting since it represents a Gaussian distributed contribution. This, in turn, implies that the PDF describing 1266 is not only a shift of T2 by 1268 but also has a different shape.

Figure 3.1 presents the PDFs that describe T2 and T²f and illustrates the impact of Type I and II errors for the hypothesis testing. It follows from Subsections 1.1.3 and 1.2.4 that a Type I error is a rejection of the null hypothesis although it is true and a Type II error is the acceptance of the null hypothesis although it is false. Figure 3.1 shows that the significance level for the Type II error, β, depends on the exact PDF for a fault condition, which, even for the simple sensor fault discussed above, is difficult to determine. The preceding discussion, however, highlights that the larger the magnitude of the fault condition, the more the PDF will be shifted and, hence, the smaller β becomes. In other words, incipient fault conditions are more difficult to detect than faults that have a profound impact upon the process.

Figure 3.1 Illustration of Type I and II errors for testing null hypothesis.


Q statistic

The second non-negative quadratic statistic relates to the PCA model residuals g and is given by

Q = gᵀg = Σ_{j=1}^{nz} gj²   (3.15)

The control limit for the Q statistic is difficult to obtain, although the above expression appears to be a simple sum of squared values. More precisely, Subsection 3.3.1 highlights that the PCA residuals are linearly dependent and are therefore not statistically independent. Approximate distributions for such quadratic forms were derived in Box (1954) and Jackson and Mudholkar (1979). Appendix B in Nomikos and MacGregor (1995) showed that both approximations are close. Using the method by Jackson and Mudholkar (1979), the control limit for the Q statistic is as follows

Qα = θ1[cα√(2θ2h0²)/θ1 + 1 + θ2h0(h0 − 1)/θ1²]^{1/h0}   (3.16)

where θi = Σ_{j=n+1}^{nz} λj^i for i = 1, 2, 3, h0 = 1 − 2θ1θ3/(3θ2²) and the variable cα is the normal deviate evaluated for the significance α. Defining the matrix product C = PdPdᵀ, (3.15) can be rewritten as follows

Q = Σ_{j=1}^{nz} Σ_{i=1}^{nz} cji z0i z0j   (3.17)

where cji is the element of C stored in the jth row and the ith column. Given that C is symmetric, i.e. cji = cij, (3.17) becomes

Q = Σ_{j=1}^{nz} cjj z0j² + 2Σ_{j=2}^{nz} Σ_{i=1}^{j−1} cji z0i z0j   (3.18)
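The control limit in (3.16) is straightforward to code from the discarded eigenvalues, with θi the sums of their ith powers; the sketch below follows the Jackson and Mudholkar (1979) approximation and uses an illustrative function name:

```python
import math
from statistics import NormalDist

def q_limit(discarded_eigenvalues, alpha=0.01):
    """Q statistic control limit from the discarded eigenvalues of the
    data covariance matrix (Jackson and Mudholkar 1979)."""
    th1 = sum(discarded_eigenvalues)
    th2 = sum(l**2 for l in discarded_eigenvalues)
    th3 = sum(l**3 for l in discarded_eigenvalues)
    h0 = 1.0 - 2.0 * th1 * th3 / (3.0 * th2**2)
    c_alpha = NormalDist().inv_cdf(1.0 - alpha)   # normal deviate
    inner = (c_alpha * math.sqrt(2.0 * th2 * h0**2) / th1
             + 1.0
             + th2 * h0 * (h0 - 1.0) / th1**2)
    return th1 * inner**(1.0 / h0)
```

A smaller significance α produces a wider limit, as expected for a more conservative test.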

A modified version of the Q statistic in (3.15) was proposed by Hawkins (1974) and entails a scaling of each discarded score variable by its variance

QH = Σ_{i=n+1}^{nz} ti²/λi   (3.19)

and follows a χ2 distribution with the control limit χ²α(nz − n). If the data covariance matrix needs to be estimated, the control limit is given by

(3.20)

The diagonal matrix Λd contains the discarded eigenvalues λn+1, … , λnz of the data covariance matrix. For process monitoring applications, however, a potential drawback of the residual-based statistic in (3.19) is that some of the discarded eigenvalues may be very close or equal to zero. This issue, however, does not affect the construction of the Q statistic in (3.15).

Using the Q statistic for process monitoring, testing whether the process is in-statistical-control relies on the null hypothesis H0, which is accepted if

Q ≤ Qα   (3.21)

and rejected if Q > Qα. On the other hand, the alternative hypothesis H1, describing the out-of-statistical-control situation, is accepted if the null hypothesis H0 is rejected

Q > Qα   (3.22)

Assuming that the fault condition is a bias of the mth sensor that has the form of a step, (3.15) becomes

(3.23)

Similar to the Hotelling's T2 statistic, the step-type fault yields a Q statistic Qf that includes a deterministic offset term, where Δzm is the magnitude of the bias, and a Gaussian distributed term. Figure 3.1 and (3.23) highlight that a larger bias leads to a more significant shift of the PDF for Qf relative to the PDF for Q, and therefore a smaller Type II error β. In contrast, a smaller and incipient sensor bias leads to a large Type II error and is therefore more difficult to detect.

3.1.2.2 PLS monitoring models

PLS and MRPLS models give rise to the generation of three univariate statistics. The ones for PLS models are presented first, followed by those for MRPLS models.

Monitoring statistics for PLS models

Similar to PCA, the retained t-score variables allow constructing a Hotelling's T2 statistic, which according to 2.24 describes common cause variation

T2 = tᵀ(RᵀSx0x0R)⁻¹t   (3.24)

Here, t = Rᵀx0 and the covariance matrix of the t-score variables is given in (3.2). Equations (3.5) or (3.6) show how to calculate the control limit for this statistic if the covariance matrix is not known a priori. If the covariance and cross-covariance matrices are available, the control limit is χ²α(n). The Q statistic for the residual of the output variables is given by

Qf = fᵀf   (3.25)

Here, f is the residual vector of the output variables.

The residuals of the input variables can either be used to construct a Hotelling's T2 or a Q statistic, depending upon their variances. This follows from the discussion concerning the residual statistic proposed by Hawkins (1974). Very small residual variances can yield numerical problems in determining the inverse of the residual covariance matrix. If this is the case, it is advisable to construct a residual Q statistic

Qe = eᵀe   (3.26)

where e = (I − PRᵀ)x0. In a similar fashion to PCA, the elements of the residual vector e are linear combinations of the input variables, computed from the discarded r-weight vectors stored in Rd

(3.27)

Using the relationship in (3.27), equation (3.26) can be rewritten as follows

(3.28)

For determining the control limits of the Qe and Qf statistics, it is possible to approximate the distribution functions of both non-negative quadratics by central χ2 distributions. Theorem 3.1 in Box (1954) describes this approximation, which allows the determination of control limits for a significance α

Qeα = geχ²α(he)   Qfα = gfχ²α(hf)   (3.29)

where the g and h parameters are obtained such that the approximated distributions have the same first two moments as those of Qe and Qf in (3.25) and (3.28). In other words, matching the mean and variance of Qe and Qf with those of geχ2(he) and gfχ2(hf) implies that ge and gf are

ge = var{Qe}/(2E{Qe})   gf = var{Qf}/(2E{Qf})   (3.30)

and he and hf are

he = 2E{Qe}²/var{Qe}   hf = 2E{Qf}²/var{Qf}   (3.31)
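The moment matching in (3.30) and (3.31) reduces to two lines once the mean and variance of the quadratic form are available; for a residual of the form Σ λi·χ2(1) the mean is Σ λi and the variance is 2 Σ λi². The sketch below makes that assumption, and the names are illustrative:

```python
def box_g_h(eigenvalues):
    """Box (1954) approximation Q ~ g * chi2(h): match the first two
    moments of Q = sum_i lam_i * chi2(1), whose mean is sum(lam) and
    whose variance is 2 * sum(lam^2)."""
    mean = sum(eigenvalues)
    var = 2.0 * sum(l**2 for l in eigenvalues)
    g = var / (2.0 * mean)     # g = variance / (2 * mean)
    h = 2.0 * mean**2 / var    # h = 2 * mean^2 / variance
    return g, h
```

The control limit then follows as g·χ²α(h), e.g. by scaling a χ2 quantile with (possibly non-integer) h degrees of freedom.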

For larger variances of the discarded t-score variables, although they do not contribute significantly to the prediction of the output variables, it is advisable to construct a second Hotelling's T2 statistic instead of the Qe statistic

(3.32)

Here, the discarded t-score variables are scaled by their covariance matrix. The statistic in (3.32) follows a central χ2 distribution with nx − n degrees of freedom if the covariance and cross-covariance matrices are known a priori, or a scaled F distribution if not. As before, the estimate of the score covariance matrix has to be obtained from a different reference set that has not been used to estimate the weight and loading matrices (Tracey et al. 1992).

The advantage of the statistics in (3.24) and (3.32) is that each score variable is scaled to unity variance and therefore has the same contribution to the Hotelling's T2 statistics. Hence, those score variables with smaller variances, usually the last few ones, are not overshadowed by those with significantly larger variances, typically the first few ones. This is of concern if a fault condition has a more profound effect upon score variables with a smaller variance. In this case, the residual Qe statistic may yield a larger Type II error compared to the second Hotelling's T2 statistic. Conversely, if the variances of the last few score variables are very close to zero, numerical problems and an increase in the Type I error, particularly for small reference sets, may arise. In this case, it is not advisable to rely on the second Hotelling's T2 statistic.

Monitoring statistics for MRPLS models

With respect to 2.51, a Hotelling's T2 statistic to monitor common cause variation relies on the t-score variables that are linear combinations of the source variables corrupted by error variables

(3.33)

The covariance matrix of the score variables is diagonal and represents, in fact, the length constraint of the MRPLS objective function in 2.66. The Hotelling's T2 statistic defined in (3.33) follows a χ2 distribution with n degrees of freedom and its control limit is χ²α(n).

The data model in 2.51 highlights that the residuals of the input variables that are not correlated with the output variables may still be significant and can also be seen as common cause variation but only for the input variable set. The vector of source variables s′ describes this variation and allows the construction of a second Hotelling's T2 statistic, denoted here as the Hotelling's T′2 statistic

(3.34)

Here, R′ is the r-loading matrix containing the nx − n r-loading vectors for determining the t′-score variables. The t′-score variables are equal to the s′ source variables up to a similarity transformation and, similar to the t-score variables, their covariance matrix is diagonal. If the score covariance matrices need to be estimated, Tracey et al. (1992) outlined that this has to be done from a different reference set that was not used to estimate the weight and loading matrices. If this is guaranteed, the Hotelling's T2 statistic follows a scaled F-distribution with n and K − n degrees of freedom and the Hotelling's T′2 statistic follows a scaled F-distribution with nx − n and K − n degrees of freedom.

If the variance of the last few t′-score variables is very close to zero, it is advisable to utilize a Q statistic rather than the Hotelling's T′2 statistic. Assuming that each of the t′-score variables has a small variance, a Q statistic can be obtained that includes each score variable

(3.35)

If there are larger differences between the variances of the t′-score variables, it is advisable to utilize the Qe statistic or to divide the t′-score variables into two sets, one that includes those with larger variances and the remaining ones with a small variance. This would enable the construction of two non-negative quadratic statistics. Finally, the residuals of the output variables form the Qf statistic in (3.25) along with its control limit in (3.29) to (3.31).

3.2 Fault isolation and identification

After detecting abnormal process behavior, the next step is to determine what has caused this event and what its root cause is. Other issues are how significantly this event affects product quality and what impact it has on the general process operation. Another important question is whether the process can continue to run while the abnormal condition is removed or its impact minimized, or whether it is necessary to shut down the process immediately to remove the fault condition. The diagnosis of abnormal behavior, however, is difficult (Jackson 2003) and often requires substantial process knowledge, particularly about the interaction between individual operating units. It is therefore an issue that needs to be addressed by experienced process operators.

To assist plant personnel in identifying potential causes of abnormal behavior, MSPC offers charts that describe to what extent a particular process variable is affected by such an event. It can also offer time-based trends that estimate the effect of a fault condition upon a particular process variable. These trends are particularly useful if the impact of a fault condition becomes more significant over time. For a sensor or actuator bias or precision degradation, such charts provide useful information that can easily be interpreted by a plant operator. For more complex process faults, such as the performance deterioration of units, or the presence of unmeasured disturbances, these charts offer diagnostic information allowing experienced plant operators to narrow down potential root causes for a more detailed examination.

It is important to note, however, that such charts examine changes in the correlation between the recorded process variables but do not present direct causal information (MacGregor and Kourti 1995; MacGregor 1997; Yoon and MacGregor 2001). Section 3.3 analyzes associated problems of the charts discussed in this section. Before developing and discussing such diagnosis charts, we first need to introduce the terminology for diagnosing fault conditions in technical systems. Given that there are a number of competing definitions concerning fault diagnosis, this book uses the definitions introduced by Isermann and Ballé (1997), which are:

Fault isolation:

Determination of the kind, location and time of detection of a fault. Follows fault detection.

Fault identification:

Determination of the size and time-variant behavior of a fault. Follows fault isolation.

Fault diagnosis:

Determination of the kind, size, location and time of detection of a fault. Follows fault detection. Includes fault isolation and identification.

The literature introduced different fault diagnosis charts and methods, including:

  • contribution charts;
  • charting the results of residual-based tests; and
  • variable reconstruction.

Contribution charts, for example discussed by Kourti and MacGregor (1996) and Miller et al. (1988), indicate to what extent a certain variable is affected by a fault condition. Residual-based tests (Wise and Gallagher 1996; Wise et al. 1989a) examine changes in the residual variables of a sufficiently large data set describing an abnormal event, and variable reconstruction removes the fault condition from a set of variables (Dunia and Qin 1998; Lieftucht et al. 2009).

3.2.1 Contribution charts

Contribution charts reveal which of the recorded variables has (or have) changed the correlation structure among them. More precisely, these charts reveal how each of the recorded variables affects the computation of particular t-score variables. This, in turn, allows computing the effect of a particular process variable upon the Hotelling's T2 and Q statistics if at least one of them detects an out-of-statistical-control situation. Contribution charts are introduced here for PCA. The tutorial session at the end of this chapter offers a project for developing contribution charts for PLS and for contrasting them with the PCA ones.

3.2.1.1 Variable contribution to the Hotelling's T2 statistic

The contribution of the recorded variable set z0 upon the ith t-score variable that forms part of the Hotelling's T2 statistic is computed as follows (Kourti and MacGregor 1996):

1. Determine which score variables are significantly affected by the out-of-statistical-control situation by testing the alternative hypothesis of the normalized score variables

(3.36)

which is as follows

ti²/λi > T²α/n   (3.37)

This yields n* ≤ n score variables that are affected and earmarked for further inspection. Moreover, the index set 1, 2, … , n can be divided into a subset of indices containing the n* affected score variables and a subset storing the remaining n − n* elements. The union of both subsets is the index set 1, 2, … , n and the intersection of both subsets contains no element.
2. Next, compute the contribution of each process variable z0j, j ∈ {1, 2, … , nz}, for each of the violating score variables ti

contij = (ti/λi)pjiz0j   (3.38)

3. Should a computed contribution be negative, set it equal to zero.
4. The contribution of the jth process variable on the Hotelling's T2 statistic is

cj = Σ_{i∈violating} contij   (3.39)

5. The final step is to plot the contribution values for each of the nz variables in the form of a 2D bar chart for a specific sample or a 3D bar chart describing a time-based trend of the contribution values.

For this procedure, pji is the entry of the loading matrix P stored in the jth row and the ith column.
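The five steps can be sketched as follows; the code assumes an orthonormal loading matrix P with the retained vectors as columns and the retained eigenvalues lam, and all names are illustrative:

```python
import numpy as np

def t2_contributions(z0, P, lam, t2_limit):
    """Variable contributions to the Hotelling's T^2 statistic following
    the procedure above (a Kourti and MacGregor 1996 style sketch)."""
    z0, lam = np.asarray(z0, float), np.asarray(lam, float)
    t = P.T @ z0                                              # score vector
    violating = np.where(t**2 / lam > t2_limit / len(lam))[0]  # step 1
    contrib = np.zeros(len(z0))
    for i in violating:
        c = (t[i] / lam[i]) * P[:, i] * z0                    # step 2
        c[c < 0.0] = 0.0                                      # step 3
        contrib += c                                          # step 4
    return contrib    # step 5: chart these values per variable
```

The returned vector is what would be charted as a 2D bar chart for the inspected sample.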

The above procedure invites the following two questions. Why is the critical value for the hypothesis test in (3.37) equal to T²α/n, and why do we remove negative contribution values? The answer to the first question lies in the construction of the Hotelling's T2 statistic, which asymptotically follows a χ2 distribution (Tracey et al. 1992). Assuming that n* < n, the Hotelling's T2 statistic can be divided into a part that is affected and a part that is unaffected by a fault condition

T2 = Σ_{i∈affected} ti²/λi + Σ_{i∈unaffected} ti²/λi   (3.40)

The definition of the χ2 PDF, however, describes the sum of statistically independent Gaussian distributed variables of zero mean and unity variance. In this regard, each element of this sum has the same contribution to the overall statistic. Consequently, the critical contribution of a particular element is the ratio of the control limit over the number of sum elements. On the other hand, testing the alternative hypothesis for a single t-score variable, which follows a Gaussian distribution of zero mean and unity variance and whose squared value asymptotically follows a χ2 distribution with one degree of freedom, against the control limit of just this one variable can yield a significant Type II error, which Figure 3.2 shows.

Figure 3.2 Type II error for incorrectly applying hypothesis test in Equation (3.37).


The answer to the second question lies in revisiting the term in (3.40)

(3.41)

From the above equation, it follows that the contribution of the jth process variable upon the Hotelling's T2 statistic detecting an abnormal condition is equal to

(3.42)

Now, including terms in the above sum that have a different sign reduces the overall value of the variable contribution. Consequently, identifying the main contributors to the absolute value of the contribution requires the removal of negative sum elements.

3.2.1.2 Variable contribution to the Q statistic

Given that the Q statistic relies on the sum of the residuals for each variable, the variable contribution of the jth variable is simply (Yoon and MacGregor 2001):

(3.43)

Alternative forms are discussed in Kourti (2005), 1430, or Chiang et al. (2001), 1431. Since the variances of these residuals may differ, it is difficult to compare them without scaling. Furthermore, using squared values does not offer the possibility of evaluating whether a temperature reading, for example, is larger or smaller than expected. This suggests that (3.43) should be

(3.44)
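The scaled, signed form of the residual contributions can be sketched as follows. In this illustrative NumPy fragment (names and the NOC-based variance estimate are assumptions of the sketch), each residual is divided by its standard deviation estimated from normal operating condition (NOC) data, so that the sign indicates whether a reading is larger or smaller than expected.

```python
import numpy as np

def q_contributions(Z_noc, z, P):
    """Signed, variance-scaled residual contributions to the Q statistic,
    in the spirit of (3.44); sigma_j is estimated from NOC residuals."""
    C = np.eye(P.shape[0]) - P @ P.T            # residual projection matrix
    sigma = (Z_noc @ C.T).std(axis=0, ddof=1)   # per-variable residual std.
    return (C @ z) / sigma                      # sign = direction of deviation
```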

3.2.1.3 Degree of reliability of contribution charts

Although successful diagnoses using contribution charts have been reported (Martin et al. 2002; Pranatyasto and Qin 2001; Vedam and Venkatasubramanian 1999; Yoon and MacGregor 2001), Subsection 3.3.1 shows that the PCA residuals of the process variables are linearly dependent. The same analysis can also be applied to show that the recorded process variables have a linearly dependent contribution to the Hotelling's T2 statistic. Moreover, Yoon and MacGregor (2001) pointed out that contribution charts generally stem from an underlying correlation model of the recorded process variables, which may not possess a causal relationship.

However, contribution charts can identify which group(s) of variables are most affected by a fault condition. Lieftucht et al. (2006a) highlighted that the number of degrees of freedom in the residual subspace is an important factor for assessing the reliability of Q contribution charts. The ratio 1434 is therefore an important index for determining the reliability of these charts. The smaller this ratio, the larger the dimension of the residual subspace and the weaker the linear dependency among the variable contributions to the Q statistic.

Chapters 4 and 5 demonstrate how contribution charts can assist the diagnosis of fault conditions ranging from simple sensor or actuator faults to more complex process faults. It is important to note, however, that the magnitude of the fault condition can generally not be estimated through the use of contribution charts. Subsection 3.2.3 introduces the projection- and regression-based variable reconstruction that allows determining the kind and size of complex fault scenarios. The next subsection describes residual-based tests to diagnose abnormal process conditions.

3.2.2 Residual-based tests

Wise et al. (1989a) and Wise and Gallagher (1996) introduced residual-based tests that relate to the residuals of a PCA model. Residual-based tests for PLS models can be developed as a project in the tutorial session at the end of the chapter. Preliminaries for calculating the error variance are followed here by outlining the hypothesis tests for identifying which variable is affected by an abnormal operating condition.

3.2.2.1 Preliminaries

The residual variance for recovering the jth process variable, 1437, is

(3.45)

where 1438 is the jth row vector stored in 1440. Equation (3.45) follows from the fact that 1441, that the t-score variables are statistically independent and that each error variable has the variance 1442. If 1443 is unknown, it can be estimated from 1444,

(3.46)

That 1445 follows from 1446 and the fact that the p-loading vectors are mutually orthonormal. Equation (2.122) points out that 1447 is equal to the sum of the diagonal elements of 1448, which yields

(3.47)

Equation (3.47) only requires the first n eigenvalues and eigenvectors of 1450.
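The idea behind (3.47), estimating the error variance from the difference between the trace of the data covariance matrix and the n retained eigenvalues, can be sketched as follows. This fragment is an illustration under the assumption of identical (isotropic) error variances; the function name is not from the book.

```python
import numpy as np

def error_variance(S, eigvals_n):
    """Estimate the error variance in the spirit of (3.46)-(3.47): the
    trace of the data covariance matrix minus the sum of the n retained
    eigenvalues, divided by nz - n. Only the first n eigenvalues are
    required."""
    nz = S.shape[0]
    n = eigvals_n.size
    return (np.trace(S) - eigvals_n.sum()) / (nz - n)
```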

Assuming the availability of a data set of Kf samples describing an abnormal operating condition and a second data set of K samples describing normal process behavior, it is possible to calculate the following ratio, which follows an F-distribution

(3.48)

where both variances can be computed from their respective data sets. This allows testing the null hypothesis that the jth variable is not affected by a fault condition

(3.49)

or the alternative hypothesis that this variable is affected

(3.50)

Testing the null hypothesis can only be performed if the last nz − n eigenvalues are identical, that is, if the variances of the error variables are identical. If this is not the case, the simplification leading to (3.47) cannot be applied and the variance of each residual must be estimated from computed residuals. A procedure for determining which variable is affected by a fault condition is given next.
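The variance-ratio test (3.48) to (3.50) can be sketched as below. Since this is a NumPy-only illustration, the F critical value is approximated by Monte Carlo sampling of the F distribution instead of a table lookup; this approximation, and all names, are assumptions of the sketch rather than the book's procedure.

```python
import numpy as np

def residual_variance_f_test(g_fault, g_noc, alpha=0.05, n_mc=200000, seed=0):
    """Test H0: the residual variance under the fault data equals that
    under NOC data, cf. (3.48)-(3.50). The critical value of the F
    distribution is obtained by Monte Carlo sampling."""
    Kf, K = g_fault.size, g_noc.size
    ratio = g_fault.var(ddof=1) / g_noc.var(ddof=1)
    rng = np.random.default_rng(seed)
    crit = np.quantile(rng.f(Kf - 1, K - 1, n_mc), 1 - alpha)
    return ratio, ratio > crit          # True -> reject H0, variable affected
```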

3.2.2.2 Variable contribution to detected fault condition

The mean for each of the residuals is zero

(3.51)

which can be taken advantage of in testing whether a statistically significant departure in mean occurred. This is a standard test that is based on the following statistic

(3.52)

where 1455 is the mean estimated from the Kf samples describing a fault condition. This statistic follows a t-distribution with the following upper and lower control limit

(3.53)

for a significance α. The null hypothesis, H0, is therefore

(3.54)

The alternative hypothesis, H1 is accepted if

(3.55)
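The mean-shift test (3.52) to (3.55) can be sketched in the same spirit. Again, the t-quantile is approximated by Monte Carlo sampling rather than a table, and the names are illustrative assumptions of this NumPy-only sketch.

```python
import numpy as np

def residual_mean_t_test(g_fault, alpha=0.01, n_mc=200000, seed=0):
    """Two-sided test of H0: the residual mean is zero, cf. (3.52)-(3.55).
    The t critical value is obtained by Monte Carlo sampling."""
    Kf = g_fault.size
    t_stat = g_fault.mean() / (g_fault.std(ddof=1) / np.sqrt(Kf))
    rng = np.random.default_rng(seed)
    t_crit = np.quantile(np.abs(rng.standard_t(Kf - 1, n_mc)), 1 - alpha)
    return t_stat, abs(t_stat) > t_crit   # True -> reject H0
```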

By inspecting the above hypothesis tests in (3.49) and (3.55) for changes in the error variance and the mean of the error variables for recovering the recorded process variables, it is apparent that they require a set of recorded data that describes the fault condition. This, however, hampers the practical usefulness of these residual-based tests given that a root cause analysis should be conducted upon detection of a fault condition. Such immediate analysis can be offered by contribution charts and variable reconstruction charts that are discussed next.

3.2.3 Variable reconstruction

Variable reconstruction exploits the correlation among the recorded process variables and hence, variable interrelationships. This approach is closely related to the handling of missing data in multivariate data analysis (Arteaga and Ferrer 2002; Nelson et al. 1996; Nelson et al. 2006) and relies on recovering the variable set z0 and x0 based on the n source variables of the PCA and PLS data structures, respectively

(3.56)

According to 2.2 and 2.24, 1463 and 1464 mainly describe the impact of the common-cause variation, Ξs and 1466, upon z0 and x0, respectively.

3.2.3.1 Principle of variable reconstruction

For PCA, the derivation below shows how to reconstruct a subset of variables in z0. A self-study project in the tutorial section aims to develop a reconstruction scheme for PLS. Under the assumption that n < nz, it follows from (3.56) that

(3.57)

which can be rewritten to become

(3.58)

where 1471, 1472, 1473 and 1474. The linear dependency between the elements of 1475 follows from

(3.59)

The above partition of the loading matrix into the first n rows and the remaining nz − n rows is arbitrary. By rearranging the row vectors of P along with the elements of 1479, any nz − n elements in 1481 are linearly dependent on the remaining n elements. Subsection 3.3.1 discusses this in more detail.

Nelson et al. (1996) described three different techniques to handle missing data. The analysis in Arteaga and Ferrer (2002) showed that two of them are projection-based while the third one uses regression. Variable reconstruction originates from the projection-based approach for missing data, which is outlined next. The regression-based reconstruction is then presented, which isolates deterministic fault signatures from the t-score variables and removes such signatures from the recorded data.

3.2.3.2 Projection-based variable reconstruction

The principle of projection-based variable reconstruction is best explained here by a simple example. This involves two highly correlated process variables, z1 and z2. For a high degree of correlation, Figure 2.1 outlines that a single component can recover the value of both variables without a significant loss of accuracy. The presentation of the simple introductory example is followed by a regression-based formulation of projecting samples optimally along predefined directions. This is then developed further to directly describe the effect of the sample projection upon the Q statistic. Next, the concept of reconstructing single variables is extended to include general directions that describe a fault subspace. Finally, the impact of reconstructing a particular variable upon the Hotelling's T2 and Q statistics is analyzed and a regression-based reconstruction approach is introduced for isolating deterministic fault conditions.

An introductory example

Assuming that the sensor measuring variable z1 produces a bias that takes the form of a step (sensor bias), the projection of the data vector 1489 onto the semimajor axis of the control ellipse yields

(3.60)

with Δz1 being the magnitude of the sensor bias. Using the score variable tf to recover the sensor readings gives rise to

(3.61)

The recovered values are therefore

(3.62)

This shows that the recovery of the value of both variables is affected by the sensor bias of z1. The residuals of both variables are also affected by this fault

(3.63)

since

(3.64)

Subsection 3.3.1 examines dependencies in variable contributions to fault conditions, a general problem for constructing contribution charts (Lieftucht et al. 2006a; Lieftucht et al. 2006b).

The application of projection-based variable reconstruction, however, can overcome this problem if the fault direction is known a priori (Dunia et al. 1996; Dunia and Qin 1998) or can be estimated (Yue and Qin 2001) using a singular value decomposition. The principle is to remove the faulty sensor reading for z1 by an estimate, producing the following iterative algorithm

(3.65)

The iteration converges for j → ∞ and yields the following estimate for 1495

(3.66)

On the other hand, if a sensor bias affects z2, the estimate 1497 is then

(3.67)

Figure 3.3 presents an example where the model subspace is spanned by 1498 and a data vector 1499 describes a sensor bias affecting the first variable, 1500. The series of arrows shows the convergence of this iterative algorithm. Applying this scheme for 1501, Table 3.1 shows how 1502, 1503, 1504 and the convergence criterion 1505 change over the first few iteration steps. According to Figure 3.3 and Table 3.1, the ‘corrected’ or reconstructed data vector is 1506.

Figure 3.3 Illustrative example of projection-based variable reconstruction.


Table 3.1 Results of reconstructing z10 using the iterative projection-based method

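The iterative scheme illustrated by Figure 3.3 and Table 3.1 can be sketched as follows. The function name, convergence tolerance and iteration cap are assumptions of this illustration; the faulty entries are repeatedly replaced by their PCA re-estimates while the healthy readings are left untouched, in the spirit of (3.65).

```python
import numpy as np

def reconstruct_iterative(z, P, faulty, tol=1e-10, max_iter=1000):
    """Iterative projection-based reconstruction, cf. (3.65)."""
    z_rec = np.array(z, dtype=float)
    for _ in range(max_iter):
        z_hat = P @ (P.T @ z_rec)             # projection onto model plane
        step = np.max(np.abs(z_hat[faulty] - z_rec[faulty]))
        z_rec[faulty] = z_hat[faulty]         # update faulty entries only
        if step < tol:                        # convergence criterion
            break
    return z_rec
```

Convergence is guaranteed because the relevant diagonal entries of the projection matrix lie strictly between 0 and 1, as discussed in Subsection 3.3.2.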

Regression-based formulation of projection-based variable reconstruction

In a similar fashion to the residual-based tests in Subsection 3.2.2, it is possible to formulate the projection-based variable reconstruction scheme as a least squares problem under the assumptions that:

  • the fault condition affects at most n sensors;
  • the fault signature manifests itself as a step-type fault for the affected variables; and
  • a recorded data set contains a sufficiently large set of samples representing the fault condition.

A sensor bias that affects the mth variable can be modeled as follows

(3.68)

where 1509 is a Euclidian vector for which the mth element is 1 and the remaining ones are 0, and Δzm is the magnitude of the sensor bias. The difference between the measured data vector 1512 and 1513 is the Gaussian distributed vector 1514, whilst 1515 is Gaussian distributed with the same covariance matrix, that is, 1516, which follows from the data model in 2.2 and Table 2.1. This implies that the fault magnitude, Δzm, can be estimated by the following least squares formulation if the fault direction 1518 is known

(3.69)

The solution of (3.69) is the estimate of the mean value for 1519

(3.70)

which converges to the true fault magnitude as 1520.
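For zero-mean NOC data, the least squares solution (3.70) reduces to averaging the projection of the fault records onto the unit fault direction. A minimal sketch (names and the simulation are illustrative assumptions):

```python
import numpy as np

def bias_magnitude(Zf, xi):
    """LS estimate of a step-type sensor bias along the unit Euclidian
    direction xi, cf. (3.69)-(3.70): the sample mean of xi^T z over the
    Kf fault records."""
    return (Zf @ xi).mean()
```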

Impact of fault condition upon Q statistic

Dunia and Qin (1998) showed that projecting samples along 1522 has the following impact on the Q statistic

(3.71)

Knowing that 1524, (3.71) becomes

(3.72)

where 1525. It follows from (3.72) that if

(3.73)

Δzm asymptotically converges to the true fault magnitude allowing a complete isolation between the step-type fault and normal stochastic process variation, hence

(3.74)

The optimal solution of the objective function in (3.71) is given by

(3.75)

The estimates for Δzm in (3.70) and (3.75) are identical. To see this, note that the sensor bias has the following impact on the residuals

(3.76)

which we can substitute into the expectation of (3.75)

(3.77)

Geometrically, the fault condition moves the samples along the direction 1528 and this shift can be identified and removed from the recorded data.
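The Q-optimal projection along a known unit fault direction admits a compact closed form. In the sketch below (an illustration, not the book's code), C denotes the residual projection matrix I − PPᵀ; the fault magnitude is estimated and the sample is shifted back along the direction, in the spirit of (3.71) to (3.75).

```python
import numpy as np

def reconstruct_along(z, P, xi):
    """Reconstruction minimizing the Q statistic along a known unit
    fault direction xi, cf. (3.71)-(3.75)."""
    C = np.eye(P.shape[0]) - P @ P.T         # residual projection matrix
    delta = (xi @ C @ z) / (xi @ C @ xi)     # estimated fault magnitude
    return z - delta * xi, delta
```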

Extension to multivariate fault directions

A straightforward extension is the assumption that 1529 sensor faults arise at the same time. The limitation 1530 follows from the fact that the rank of P is n and is further elaborated in Subsection 3.3.2. The extension of reconstructing up to n variables would enable the description of more complex fault scenarios. This extension, however, is still restricted by the fact that the fault subspace, denoted here by 1534, is spanned by Euclidian vectors

(3.78)

where 1535 is an index set storing the variables to be reconstructed, for example 1536, 1537, ··· , 1539. The case of 1540 storing non-Euclidian vectors is elaborated below. Using the fault subspace 1541, projection-based variable reconstruction identifies the fault magnitude for each Euclidian base vector from the Q statistic

(3.79)

Here, 1543. The solution for the least squares objective function is

(3.80)

with 1544 being the generalized inverse of 1545. If 1546 is the correct fault subspace, the above sum estimates the fault magnitude 1547 consistently

(3.81)

since

(3.82)

This, in turn, implies that the correct fault subspace allows identifying the correct fault magnitude and removing the fault information from the Q statistic

(3.83)
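The multivariate estimate with the generalized inverse can be sketched as follows. This illustrative fragment (names assumed, not from the book) accepts any full-column-rank fault subspace; the test uses Euclidian directions.

```python
import numpy as np

def reconstruct_subspace(z, P, Upsilon):
    """Multivariate reconstruction, cf. (3.79)-(3.80): fault magnitudes
    are obtained via the generalized inverse of the residual-projected
    fault subspace."""
    C = np.eye(P.shape[0]) - P @ P.T
    f = np.linalg.pinv(C @ Upsilon) @ (C @ z)   # generalized inverse
    return z - Upsilon @ f, f
```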

So far, we assumed that the fault subspace is spanned by Euclidian vectors that represent individual variables. This restriction can be removed by defining a set of up to 1549 linearly independent vectors 1550, 1551, … , 1553 of unit length, which yields the following model for describing fault conditions

(3.84)

The fault magnitude in the direction of vector 1554 is 1555. The 1556 vectors can be obtained by applying a singular value decomposition on the data set containing 1557 samples describing the fault condition (Yue and Qin 2001). Figure 3.4 gives a graphical interpretation of this regression problem. Equation (3.79) presents the associated objective function for estimating 1558. Different from the projection along Euclidian directions, if the fault subspace is constructed from non-Euclidian vectors, the projection-based variable reconstruction constitutes an oblique projection.

Figure 3.4 Graphical illustration of multivariate projection-based variable reconstruction along a predefined fault axis.


Projection-based variable reconstruction for a single sample

As for residual-based tests, the above projection-based approach requires a sufficient number of samples describing the fault condition, which hampers its practical usefulness. If an abnormal event is detected, the identification of potential root causes is vital in order to support process operators in carrying out appropriate responses. Despite the limitations of contribution charts, discussed in (Lieftucht et al. 2006a; Yoon and MacGregor 2001) and Subsection 3.3.1, they should be used as a first instrument.

Equations (3.60) to (3.66) show that a sensor bias can be removed if it is known which one is faulty. Dunia and Qin (1998), and Lieftucht et al. (2006b) pointed out that projection-based reconstruction and contribution charts can be applied together to estimate the impact of a fault condition upon individual process variables. A measure for assessing this impact is how much the reconstructed variable reduces the value of both non-negative quadratic statistics. For the Q statistic, this follows from (3.72) for 1560 and 1561, and (3.79) for 1562.

As highlighted in Lieftucht et al. (2006a), however, the projection-based reconstruction impacts the geometry of the PCA model and therefore the non-negative quadratic statistics. Subsection 3.3.2 describes the effect of reconstructing a variable set upon the geometry of the PCA model. In summary, the procedure to incorporate the effect of projecting a set of 1563 process variables is as follows.

1. Reconstruct the data covariance matrix by applying the following equation

(3.85)

where:
  • 1564;
  • 1565;
  • 1566 is a data vector that stores the variables to be reconstructed as the first 1567 elements, 1568, and the variables that are not reconstructed as the remaining 1569 elements, 1570;
  • the matrices denoted by the superscript * refer to the variables stored in the rearranged vector 1572;
  • the symbol 1573 refers to the reconstructed portions of the data covariance matrix describing the impact of reconstructing the set of 1574 process variables;
  • 1575 is the partition of the data covariance matrix 1576 that relates to the variable set that is not reconstructed; and
  • the indices 1, 2 and 3 are associated with the covariance matrix of the reconstructed variables, the cross covariance matrix that includes the reconstructed and the remaining variables and the covariance matrix of the unreconstructed variables, respectively.
2. Calculate the eigendecomposition of 1577, that is, calculate 1578 and 1579.
3. Compute the T2 statistic from the retained score variables 1581 with 1582, i.e. 1583.
4. Determine the Q statistic, 1585, and recompute the control limit by reapplying (3.16) using the discarded eigenvalues of 1586.
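The four steps above can be mimicked empirically. Instead of the analytic covariance update (3.85), the sketch below reconstructs a reference data set sample by sample and re-estimates the covariance matrix from the reconstructed data; this empirical shortcut, and all names, are approximations introduced for illustration, not the book's procedure.

```python
import numpy as np

def stats_after_reconstruction(Z, P, faulty, n):
    """Steps 1-4 in the spirit of the procedure above: reconstruct the
    faulty variables of every sample, re-estimate the covariance matrix
    empirically, redo the eigendecomposition and return T^2, Q and the
    eigenvalues of the reconstructed covariance matrix."""
    nz = P.shape[0]
    C = np.eye(nz) - P @ P.T
    U = np.eye(nz)[:, faulty]                  # Euclidian fault directions
    F = np.linalg.pinv(C @ U) @ (C @ Z.T)      # per-sample fault magnitudes
    Z_rec = Z - (U @ F).T
    S_rec = np.cov(Z_rec, rowvar=False)        # step 1 (empirical stand-in)
    lam, V = np.linalg.eigh(S_rec)
    lam, V = lam[::-1], V[:, ::-1]             # step 2: eigendecomposition
    T = Z_rec @ V[:, :n]
    T2 = np.sum(T**2 / lam[:n], axis=1)        # step 3: T^2 statistic
    Q = np.sum((Z_rec - T @ V[:, :n].T)**2, axis=1)   # step 4: Q statistic
    return T2, Q, lam
```

Consistent with Corollary 3.3.4, the smallest eigenvalues of the reconstructed covariance matrix (one per reconstructed variable) vanish, so the Q control limit must be recomputed from the remaining discarded eigenvalues.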

Chapter 4 demonstrates the consequences of ignoring the effect that the reconstruction process imposes upon the underlying PCA model.

Limitations of projection-based variable reconstruction

Despite the simplicity and robustness of the projection-based variable reconstruction approach, it is prone to the following problems (Lieftucht et al. 2009).


Problem 3.2.1
The maximum number of variables to be reconstructed equals the number of retained principal components (Dunia and Qin 1998). This limitation renders the technique difficult to apply for diagnosing complex process faults, which typically affect a larger number of variables.

 


Problem 3.2.2
The reconstruction process reduces the dimension of the residual subspace and, if 1587, the Q statistic does not exist (Lieftucht et al. 2006b). This is a typical scenario for cases where the ratio n/nz is close to 1.

 


Problem 3.2.3
The fault condition is assumed to be described by the linear subspace 1590. Therefore, a fault path that represents a curve or trajectory is not fully reconstructible.

These problems are demonstrated in Lieftucht et al. (2009) through the analysis of experimental data from a reactive distillation column. A regression-based variable reconstruction is introduced next to address these problems.

3.2.3.3 Regression-based variable reconstruction

This approach estimates and separates the fault signature from the recorded variables (Lieftucht et al. 2006a; Lieftucht et al. 2009). Different from projection-based reconstruction, the regression-based approach relies on the following assumptions:

  • The fault signature in the score space, 1591, is deterministic, that is, the signature for a particular process variable is a function of the sample index k; and
  • The fault is superimposed onto the process variables, that is, it is added to the complete set of score variables

(3.86)

where the subscript f refers to the corrupted vectors,
  • 1594; and
  • 1595.

In contrast to the regression-based technique for missing data, different assumptions are applied to the score variables leading to a method for variable reconstruction according to Figure 3.5. Based on the above assumptions, the fault signature can be described by a parametric curve that can be identified using various techniques, such as polynomials, principal curves and artificial neural networks (Walczak and Massart 1996). For simplicity, radial basis function networks (RBFNs) are utilized here.

Figure 3.5 Structure of the regression based variable reconstruction technique.


According to Figure 3.5, 1596 is subtracted from 1597. The benefit of using the score variables follows from Theorem 9.3.2: the score variables are statistically independent, whereas the original process variables are highly correlated. Equations (3.61) and (3.63) describe the negative impact of variable correlation upon the PCA model, which may not only identify the faulty sensor but also suggest that other variables are affected by the sensor bias. Separating the fault signature from the corrupted samples on the basis of the score variables, however, circumvents this correlation issue.

The first block in Figure 3.5 produces the complete set of nz score variables f(k) from the corrupted samples of the process variables 1600. The score variable set is then used to estimate the fault signature 1601 as above which is subtracted from the score variables to estimate 1602. Since a fault signature may affect each of the score variables, all score variables need to be included to estimate the fault signature.

The output from a radial basis function network is defined by

(3.87)

Here, R is the number of network nodes, 1604, 1 ≤ j ≤ nz, 1 ≤ i ≤ R is a Gaussian basis function for which ci and σi are the center and the width, respectively, and 1609 are the network weights. By applying (3.87), the separation of the fault signature becomes

(3.88)

where 1610 is a vector storing the values of each network node for the kth sample, A is a parameter matrix storing the network weights and 1613 is the isolated stochastic part of the computed score variables tf(k). For simplicity, the centers of the Gaussian basis functions are defined by a grid, that is, the distance between any two neighboring basis functions is the same, and their widths are assumed to be predefined. The training of the network therefore reduces to a least squares problem.
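With grid centers and fixed widths, the training can be sketched as a linear least squares fit, cf. (3.87) and (3.88). In this illustrative NumPy fragment, all names and the tanh-shaped test signature are assumptions of the sketch.

```python
import numpy as np

def fit_rbfn(k, T_f, centers, width):
    """Least squares fit of the RBFN (3.87): the Gaussian basis responses
    form the design matrix, so the weight matrix A is the only unknown."""
    Phi = np.exp(-(k[:, None] - centers[None, :])**2 / (2.0 * width**2))
    A, *_ = np.linalg.lstsq(Phi, T_f, rcond=None)
    return A, Phi @ A        # network weights and fault-signature estimate
```

Subtracting the returned signature estimate from the corrupted scores yields the estimate of the stochastic part, as in Figure 3.5.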

Chapter 4 shows that the regression-based reconstruction has the potential to provide a clear picture of the impact of a fault condition. Lieftucht et al. (2009) present another example that involves recorded data from an experimental reactive distillation unit at the University of Dortmund, Germany, where this technique could offer a good isolation of the fault signatures for a failure in the reflux preheater and multiple faults in cooling water and acid feed supplies.

3.3 Geometry of variable projections

For the projection-based reconstruction of a single sample, this section analyzes the geometry of variable projections, which involves orthogonal projections from the original variable space onto lower dimensional reconstruction subspaces that capture significant variation in the original variable set. Subsection 3.3.1 examines the impact of such projections upon the contribution charts of the Q statistic for PCA and PLS and shows that the variable contributions are linearly dependent. This is particularly pronounced if the number of retained components is close to the size of the original variable set. Subsection 3.3.2 then studies the impact of variable reconstruction upon the geometry of the PCA model.

3.3.1 Linear dependency of projection residuals

Given the PCA residual vector 1616 and the centered data vector 1617, it is straightforward to show that the residual vector is orthogonal to the model subspace

(3.89)

Here, 1618 is the loading matrix, storing the eigenvectors of the data covariance matrix as column vectors, which span the model plane. The above relationship holds true since the eigenvectors of a symmetric matrix are mutually orthonormal. Equation (3.89), however, can be elaborated further

(3.90)

and partitioning it as follows

(3.91)

which gives rise to

(3.92)

Ψ1 is a square matrix and can be inverted, which yields

(3.93)

The above relationship does not depend upon the number of source signals n. In any case, the variable contributions to the Q statistic are linearly dependent.
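This linear dependency can be verified numerically: the residual projection matrix has rank nz − n, so any n of the nz residuals determine the rest. A small sketch (dimensions chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(11)
nz, n = 5, 2
P, _ = np.linalg.qr(rng.standard_normal((nz, n)))  # orthonormal loadings
C = np.eye(nz) - P @ P.T      # maps z0 onto the residual vector g
print(np.linalg.matrix_rank(C))  # nz - n = 3: residuals are dependent
```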

3.3.2 Geometric analysis of variable reconstruction

The geometrical properties of the multidimensional reconstruction technique for a single sample are now analyzed. Some preliminary results for one-dimensional sensor faults were given in Lieftucht et al. (2004) and a more detailed analysis is given in Lieftucht et al. (2006a). For simplicity, the data vector z0 is rearranged such that the reconstructed variables are stored as the first 1622 elements. The rearranged vector and the corresponding covariance and loading matrix are denoted by 1623, 1624 and P*, respectively. Furthermore, the reconstructed vector and the reconstructed data covariance and loading matrices are referred to as 1626, 1627 and 1628, respectively.

3.3.2.1 Optimality of projection-based variable reconstruction

The following holds true for the reconstructed data vector and the model subspace.


Theorem 3.3.1
After reconstructing 1629 variables, the reconstructed data vector lies in an 1630 dimensional subspace, such that the orthogonal distance between the reconstructed vector and the model subspace is minimized.

 


Proof.
The residual vector 1631 is equal to

(3.94)

where,

(3.95)

Here Ω is a projection matrix, which is defined below

(3.96)

The squared distance between the reconstructed point and the model subspace is

(3.97)

Here 1633, 1634 and 1635. Defining the objective function

(3.98)

yields the following minimum

(3.99)

The resulting expression for Ω is equivalent to the projection matrix obtained from reconstructing the variable set 1637 using projection-based variable reconstruction. This follows from an extended version of (3.65) for j → ∞

(3.100)

which can be simplified to

(3.101)


3.3.2.2 Reconstruction subspace

The data vector 1639, but the reconstruction of 1640 to form 1641 results in a projection of 1642 onto a 1643-dimensional subspace, which Theorem 3.3.2 formulates. Any sample 1644 is projected onto this reconstruction subspace along the 1645 directions defined in ϒ. As Theorem 3.3.1 points out, the distance between the reconstructed sample and the model subspace is minimal.


Theorem 3.3.2
The reconstruction of 1647 is equivalent to the projection of 1648 onto a 1649-dimensional subspace Π. This subspace is spanned by the column vectors of the matrix Π

(3.102)

which includes the model and the residual subspace.

 


Proof.
After reconstructing 1652, the subspace in which the projected samples lie is given by the following 1653 base vectors. These can be extracted from 1654

(3.103)

and are consequently given by:

(3.104)

Note that the base vectors defined in (3.104) are not of unit length.

3.3.2.3 Maximum dimension of fault subspace

Theorem 3.3.3 discusses the maximum dimension of the fault subspace.


Theorem 3.3.3
The maximum dimension of the fault subspace ϒ is n, irrespective of whether its columns are made up of Euclidian or non-Euclidian vectors.

 


Proof.
ϒ contains Euclidian vectors. Following from (3.101), rewriting the squared matrix 1658 using the matrix inversion lemma yields

(3.105)

and produces the following equivalent projection matrices

(3.106)

Both sides, however, can only be equivalent if the inverse of 1659 exists. Given that the rank of 1660 is equal to n and assuming that any combination of n column vectors in 1663 spans a subspace of dimension n, the maximum partition in the upper left corner of 1665 can be an n by n matrix. This result can also be obtained by reexamining the iteration procedure in (3.65)

(3.107)

If the iteration is initiated by selecting 1668 and the iterative formulation in (3.107) is used to work backwards from 1669 to 1670, it becomes

(3.108)

Asymptotically, however,

(3.109)

The above relationships yield some interesting facts. First, the block matrix 1671 is symmetric and positive definite if 1672. The symmetry follows from the symmetry of 1673 and the positive definiteness results from the partitioning of

(3.110)

That 1674 has positive eigenvalues follows from a singular value decomposition of 1675, which is shown in Section 9.1. Secondly, the eigenvalues of 1676 must be between 0 and 1. More precisely, eigenvalues > 0 follow from the positive definiteness of 1678 and eigenvalues > 1 would not yield convergence for 1680.

 


Proof.
ϒ contains non-Euclidian vectors. If the model of the fault condition is 1682, an objective function similar to (3.79) can be defined as follows

(3.111)

which yields the following estimate of the fault magnitude

(3.112)

Only 1683 guarantees that 1684 is invertible. If this is not the case, the projection-based reconstruction process does not yield a unique analytical solution.

3.3.2.4 Influence on the data covariance matrix

For simplicity, the analysis in the remaining part of this subsection assumes that ϒ stores Euclidian column vectors. It is, however, straightforward to describe the impact of the projection-based reconstruction upon the data covariance matrix if ϒ includes non-Euclidian column vectors, which is a project in the tutorial session. Variable reconstruction changes the data covariance matrix, which must therefore be reconstructed. Partitioning the rearranged data covariance matrix 1687

(3.113)

where 1688, 1689 and 1690, the reconstruction of 1691 affects the first two matrices, 1692 and 1693

(3.114)

and

(3.115)

where 1694, and 1695 and 1696 are the covariance and cross-covariance matrices involving 1697. Replacing 1698 by 1699 and 1700 by 1701

(3.116)

yields the following corollary.


Corollary 3.3.4
The rank of 1702 is 1703, as the block matrices 1704 and 1705 are linearly dependent on 1706.

The effect of variable reconstruction upon the model and residual subspaces, which are spanned by eigenvectors of 1707, is analyzed in the following subsections.

3.3.2.5 Influence on the model plane

Pearson (1901) showed that the squared length of the residual vector between K mean-centered and scaled data points of dimension nz and a given model subspace of dimension n is minimized if the model subspace is spanned by the first n dominant eigenvectors of the data covariance matrix. Theorems 3.3.1 and 3.3.2 highlight that projecting samples onto the subspace Π leads to a minimum distance between the projected points and the model subspace. This gives rise to the following theorem.


Theorem 3.3.5
The reconstruction of the 1713 variables does not influence the orientation of the model subspace.

The above theorem follows from the work of Pearson (1901), given that the reconstructed samples have a minimum distance from the original model subspace.


Corollary 3.3.6
That there is no effect upon the orientation of the model subspace does not imply that the n dominant eigenvectors of 1715 and 1716 are identical.

The above corollary is a result of the changes that the reconstruction procedure imposes on 1717, which may affect the eigenvectors and the eigenvalues.


Corollary 3.3.7
The dominant eigenvectors and eigenvalues of 1718 may differ from those of 1719, which implies that the directions for which the score variables have a maximum variance and the variance of each score variable may change.

The influence of the projection onto the residual subspace is discussed next.

3.3.2.6 Influence on the residual subspace

Following from the preceding discussion, the reconstruction results in a shift of a sample in the direction of the fault subspace, such that the squared length of the residual vector is minimal (Theorem 3.3.1). Since the reconstruction procedure is, in fact, a projection of z0 onto Π, which is of dimension 1722 (Theorem 3.3.2), it follows that the dimension of the residual subspace is 1723, because the dimension of the model subspace remains unchanged.

Since the model subspace is assumed to describe the linear relationships between the recorded and source variables, which follows from Equation (2.2), the nz − n discarded eigenvalues represent the cumulative variance of the residual vector. Moreover, given that the reconstructed data covariance matrix has a rank of nz − k, k of its eigenvalues are equal to zero.
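The rank deficiency and the accompanying reduction of the residual variance can be verified numerically. The sketch below (NumPy; the dimensions and the choice of reconstructing the first variable, k = 1, are illustrative assumptions) implements projection-based reconstruction as the oblique projection z* = [I − ξ(ξᵀCξ)⁻¹ξᵀC]z, where C = I − PnPnᵀ is the projection onto the residual subspace and ξ holds the Euclidean directions of the reconstructed variables:

```python
import numpy as np

rng = np.random.default_rng(1)
K, nz, n, k = 2000, 4, 2, 1               # samples, variables, model dim., reconstructed vars.

Z = rng.normal(size=(K, nz)) @ rng.normal(size=(nz, nz))
Z -= Z.mean(axis=0)                       # mean-centered data
S = Z.T @ Z / (K - 1)
lam, P = np.linalg.eigh(S)
lam, P = lam[::-1], P[:, ::-1]
Pn = P[:, :n]                             # model subspace
C = np.eye(nz) - Pn @ Pn.T                # projection onto the residual subspace

xi = np.eye(nz)[:, :k]                    # Euclidean direction(s) of the reconstructed variable(s)
G = np.eye(nz) - xi @ np.linalg.inv(xi.T @ C @ xi) @ xi.T @ C
Zs = Z @ G.T                              # reconstructed samples
Ss = Zs.T @ Zs / (K - 1)                  # reconstructed data covariance matrix

# k eigenvalues of the reconstructed covariance matrix are (numerically) zero
ev = np.linalg.eigvalsh(Ss)
assert ev[0] < 1e-10 * ev[-1]

# the cumulative residual variance cannot increase under reconstruction
assert np.sum((Zs @ C) ** 2) <= np.sum((Z @ C) ** 2)
```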


Corollary 3.3.8
If k ≥ 1, the cumulative variance of g*, the residual vector of the reconstructed samples, is the sum of the nz − n − k nonzero discarded eigenvalues of the reconstructed data covariance matrix. In contrast, the cumulative variance of g is the sum of all nz − n discarded eigenvalues and hence, the cumulative variance of g* cannot exceed that of g.

 


Corollary 3.3.9
Variable reconstruction has a minor effect on the data covariance matrix if the ratio of the variance removed by the reconstruction to the cumulative residual variance is small. Conversely, if this ratio is close to 1, the squared length of the residual vector can be significantly affected by the reconstruction process.

An example to illustrate the effect of the above corollaries is given in Chapter 4. It is important to note that even if the projection-based variable reconstruction has only a minor effect on the data covariance matrix, this influence will lead to changes in the monitoring statistics and their confidence limits, which is examined next.

3.3.2.7 Influence on the monitoring statistics

The impact of variable reconstruction manifests itself in the construction of the reconstructed data covariance matrix, which yields a different eigendecomposition. Since the Hotelling's T2 and the Q statistic are based on this eigendecomposition, it is necessary to account for such changes when applying projection-based variable reconstruction. For reconstructing a total of k variables, this requires the following steps to be carried out:

1. Reconstruct the covariance matrix by applying (3.114) to (3.116).
2. Compute the eigenvalues and eigenvectors of the reconstructed data covariance matrix.
3. Calculate the Hotelling's T2 statistic using the n dominant eigenvectors and eigenvalues of the reconstructed data covariance matrix, with the score vector defined in (3.95).
4. Compute the Q statistic and recalculate its confidence limits by applying (3.16) using the discarded eigenvalues of the reconstructed data covariance matrix.

This procedure allows establishing reliable monitoring statistics when using projection-based variable reconstruction. If the above procedure is not followed, the variance of the score variables, the loading vectors and the variance of the residuals are incorrect, which yields erroneous results (Lieftucht et al. 2006a).
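Steps 2 to 4 can be sketched as follows (NumPy). The sketch assumes that step 1 has already produced the reconstructed covariance matrix and samples, and that (3.16) refers to the Jackson–Mudholkar approximation of the Q confidence limit; the function name, the synthetic data and the 95% normal quantile are illustrative assumptions, not the book's notation:

```python
import numpy as np

def monitoring_statistics(S_star, Z_star, n, c_alpha=1.6449):
    """Steps 2-4: eigendecomposition of the reconstructed covariance
    matrix, Hotelling's T^2, Q, and the Q confidence limit."""
    lam, P = np.linalg.eigh(S_star)
    lam, P = np.clip(lam[::-1], 0.0, None), P[:, ::-1]   # descending, non-negative
    T = Z_star @ P[:, :n]                                # score vectors
    t2 = np.sum(T ** 2 / lam[:n], axis=1)                # Hotelling's T^2 statistic
    E = Z_star - T @ P[:, :n].T                          # residual vectors
    q = np.sum(E ** 2, axis=1)                           # Q statistic
    # Confidence limit from the discarded eigenvalues (Jackson-Mudholkar)
    th1, th2, th3 = (np.sum(lam[n:] ** i) for i in (1, 2, 3))
    h0 = 1.0 - 2.0 * th1 * th3 / (3.0 * th2 ** 2)
    q_lim = th1 * (c_alpha * np.sqrt(2.0 * th2 * h0 ** 2) / th1
                   + 1.0 + th2 * h0 * (h0 - 1.0) / th1 ** 2) ** (1.0 / h0)
    return t2, q, q_lim

# Illustrative use with synthetic data standing in for the reconstructed set
rng = np.random.default_rng(2)
Z_star = rng.normal(size=(2000, 4)) @ rng.normal(size=(4, 4))
Z_star -= Z_star.mean(axis=0)
S_star = Z_star.T @ Z_star / (Z_star.shape[0] - 1)
t2, q, q_lim = monitoring_statistics(S_star, Z_star, n=2)
```

For in-statistical-control data, the mean of the T2 statistic is close to n and roughly 95% of the Q values fall below the recomputed limit.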

3.4 Tutorial session

Question 1:

What is the advantage of assuming a Gaussian distribution of the process variables for constructing monitoring charts?

Question 2:

With respect to the data structure in 2.2, develop fault conditions that do not affect (i) the Hotelling's T2 statistic and (ii) the residual Q statistic. Provide general fault conditions to which neither statistic is sensitive.

Question 3:

Considering that the Hotelling's T2 and Q statistics are established on the basis of the eigendecomposition of the data covariance matrix, is it possible to construct a fault condition that affects neither the Hotelling's T2 statistic nor the Q statistic?

Question 4:

Provide a proof of (3.27).

Question 5:

Provide proofs for (3.72) and (3.75).

Question 6:

For projections along predefined Euclidean axes for single samples, why does variable reconstruction affect the underlying geometry of the model and residual subspaces?

Question 7:

Following from Question 6, why is the projection-based approach for multiple samples not affected by variable reconstruction to the same extent as the single-sample approach? Analyze the asymptotic properties of variable reconstruction.

Question 8:

Excluding the single-sample projection-based approach, what is the disadvantage of projection- and regression-based variable reconstruction over contribution charts?

Project 1:

With respect to Subsection 3.2.2, develop a set of residual-based tests for PLS.

Project 2:

For PCA, design an example that describes a sensor fault for the first variable but for which the contribution chart for the Q statistic identifies the second variable as the dominant contributor to this fault. Can the same problem also arise with the use of the residual-based tests in Subsection 3.2.2? Check your design by a simulation example.

Project 3:

With respect to Subsection 3.2.3, develop a variable reconstruction scheme for the input variable set of a PLS model. How can a fault condition for the output variable set be diagnosed?

Project 4:

Simulate an example involving 3 process variables that is superimposed by a fault condition in a single fault direction, and develop a reconstruction scheme to estimate the fault direction as well as the fault magnitude Δzυ.

Project 5:

Using a simulation example that involves 3 process variables, analyze why projection-based variable reconstruction is not capable of estimating the fault signature if the fault is not a constant offset, i.e. if the fault magnitude Δz(k) varies with time as a deterministic sequence. Contrast the performance of the regression-based variable reconstruction scheme with that of the projection-based scheme. How can a fault condition be diagnosed if it is of the form Δz(k) but stochastic in nature?

 

 

Notes

1 For a central χ2 distribution, the mean of each contributing element is assumed to be zero, unlike for a noncentral χ2 distribution, where this assumption is relaxed.

2 The mean and variance of χ2(h), E{χ2(h)} and E{(χ2(h) − h)2}, are h and 2h, respectively.
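These two moments are easily confirmed by simulation (NumPy; the degrees of freedom h = 5 and the sample size are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
h, N = 5, 200_000
x = rng.chisquare(h, size=N)          # draws from a central chi^2(h) distribution
assert abs(x.mean() - h) < 0.05       # E{chi^2(h)} = h
assert abs(x.var() - 2 * h) < 0.2     # E{(chi^2(h) - h)^2} = 2h
```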
