Abstract

Quality-related fault detection in industrial processes has attracted much attention in recent years. Partial least squares (PLS) is considered an efficient tool for prediction and monitoring. Modified partial least squares (MPLS) is an extended algorithm that addresses the oblique decomposition of PLS; however, studies indicate that quality variable information may be lost during the MPLS decomposition, which can degrade the prediction of quality information. Furthermore, the detection rate of traditional statistics with static control limits is low, and existing dynamic control limits have certain limitations. Therefore, a new PLS space-decomposition algorithm called advanced partial least squares (APLS) is proposed. APLS avoids the loss of quality information by orthogonally decomposing the process variables according to their relationship with quality. APLS predicts quality more accurately when the process variables contain more noise, and the false alarm rates (FAR) of quality-related faults are reduced by using new statistics and thresholds combined with local information increment technology in the principal component subspace of the process variables. Finally, the effectiveness of the proposed approach is verified by a numerical example and an industrial benchmark problem.

Introduction

The issue of quality-related fault detection^1,2 has attracted much attention in recent years. Partial least squares^3,4 (PLS) is a popular method for multivariate statistical process monitoring.^5,6 Thanks to its efficiency in processing huge amounts of highly correlated plant data, PLS is recognized as a powerful tool for data-driven^7–10 model establishment, fault detection, and diagnosis.¹¹ Since technicians pay most attention to the final quality of the product, PLS is suitable for monitoring and predicting critical performance indicators in industrial production. PLS decomposes the process variables into a space that is related to product quality and a space that is unrelated to it. By monitoring the former, faults in the production process can be detected and changes in product quality can be tracked indirectly. Because the quality-unrelated space has no impact on product quality, it does not need to be monitored for this purpose. Therefore, PLS is an efficient method for improving product quality and production efficiency by reducing the false alarm rate (FAR) and increasing the effective alarm rate (EAR)¹² in quality-related fault detection. Quality-unrelated faults, while having a minimal impact on product quality, remain important for the overall health monitoring of the system. Doostmohammadian et al.^13,14 discuss fault detection and isolation via networked estimation for both full-rank and rank-deficient dynamical systems in the presence of system and measurement noise. These studies provide crucial guidance for monitoring quality-unrelated faults and hold significant research value for observing the system's health status.

In the classic PLS algorithm, a model is first developed from normal data, and the resulting model parameters are then used to perform spatial decomposition on the testing data. Finally, the corresponding statistics and thresholds are calculated in the decomposed spaces and compared under the fault criterion for fault detection. However, Li et al.¹⁵ showed that the classic PLS performs an oblique decomposition of the process space, which leaves significant process variable information in the residual subspace. To achieve complete monitoring of quality-related information, Zhou et al.¹⁶ proposed an approach called total projection to latent structures (T-PLS), which further decomposes the score and loading matrices of the classic PLS. However, T-PLS decomposes the process variables into four spaces, which makes the model more complicated than necessary. Therefore, Yin et al.¹⁷ proposed a modified PLS (MPLS) to solve the oblique decomposition problem. The MPLS algorithm decomposes the quality variable into two orthogonal parts directly according to the relationship between the quality variable and the process variable, and then uses the resulting relational matrix to decompose the process variables. Soon after, Wang and Yin¹⁸ developed an enhanced method that addresses the same problem by combining orthogonal signal correction with MPLS. To improve the fault detection rate in practical applications, a new spatial decomposition algorithm should be designed. Besides, the fault detection rate can be improved by designing new statistics and thresholds in the corresponding subspace, and the local information increment technology^19,20 is an effective method to achieve this goal. This technology calculates the local information increment mean and a local dynamic threshold by defining a local covariance matrix, and reduces the data involved when updating the covariance matrix by using a fixed threshold window. The statistics obtained by APLS are calculated from data close to the current time, which can eliminate the influence of correlation between two adjacent samples, so they better reflect the actual changing characteristics of the data.

Note that the space decomposition of MPLS may cause the loss of output information and cannot correctly decompose the input orthogonally according to the relationship between the input and the output. Besides, the aforementioned methods generally use traditional statistics and threshold design methods, which may result in a high FAR and a low EAR. To tackle these problems, the key contributions of this work are summarized as follows: (1) A new multi-space algorithm called advanced partial least squares (APLS) is proposed to solve the problem of quality information loss, where the desired decomposition form of the process variables is given first. The process variable is orthogonally decomposed into two parts: one is related only to the quality (called the principal subspace, PS), and the other is unrelated to the quality (called the residual subspace, RS). (2) To solve the problem of a low quality-related fault detection rate, new statistics and thresholds are calculated in the PS in combination with local information increment technology, which improves the detection performance of the algorithm. (3) To remove the interference of noise variation in the process variables and improve the prediction accuracy of the quality-related information, the quality variables are orthogonally decomposed into a predictable quality information subspace and an unpredictable quality information subspace according to the coefficient matrix between the quality variables and the PS.

The rest of this paper is organized as follows. In Section “Related work,” the MPLS-based fault-detection method is reviewed and the problem is formulated. Section “Proposed fault detection approach” describes the proposed approach in detail. In Section “Numerical example and case study,” a numerical example and a case study are carried out to test the performance of the proposed approach. Finally, the conclusion is presented in Section “Conclusion.”

Related work

The MPLS algorithm is an extended algorithm to address the oblique decomposition problem of the classic PLS. It first gives the input matrix and the output matrix as follows:

X = [\begin{matrix} x_{1}^{T} \\ \vdots \\ x_{N}^{T} \end{matrix}] \in R^{N \times m}, Y = [\begin{matrix} y_{1}^{T} \\ \vdots \\ y_{N}^{T} \end{matrix}] \in R^{N \times l}

(1)

x_{i} \in R^{m}, y_{i} \in R^{l}, i = 1, \dots, N

(2)

where N represents the number of samples, m and l represent the number of variables of input $X$ and output $Y$ respectively. MPLS first gives the following desired decomposition form and then decomposes $Y$ into two subspaces orthogonal to each other according to the relationship with $X$ .

Y = XM + E_{y} = \hat{Y} + E_{y}

(3)

where $M$ is the relationship information matrix of $X$ and $Y$ . According to the aforementioned relationship, we can get the following derivation¹⁷:

\frac{1}{N} Y^{T} X = \frac{1}{N} M^{T} X^{T} X + \frac{1}{N} {E_{y}}^{T} X \approx M^{T} \frac{X^{T} X}{N}

(4)
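As a hedged aside, (4) can be solved for $M$ explicitly. The step below assumes that $X^{T} X$ is invertible and that the cross term $\frac{1}{N} E_{y}^{T} X$ is negligible, as in (4); if $X^{T} X$ is singular, a pseudo-inverse would be the usual substitute (an assumption here, not a statement from the text).

```latex
\frac{1}{N} Y^{T} X \approx M^{T} \frac{X^{T} X}{N}
\;\Longrightarrow\;
M^{T} \approx Y^{T} X \left( X^{T} X \right)^{-1}
\;\Longrightarrow\;
M \approx \left( X^{T} X \right)^{-1} X^{T} Y \in R^{m \times l}
```

The last step uses the symmetry of $X^{T} X$ when transposing.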

Therefore, the relationship information matrix $M$ of $X$ and $Y$ can be calculated. Perform SVD²¹ on $M M^{T}$ to decompose $X$ orthogonally, then construct the orthogonal projection operator. MPLS projects the process variable $X$ along the direction of the aforementioned orthogonal projection operator, and two orthogonal subspaces $\hat{X}$ and $\bar{X}$ can be obtained. Finally, the external model of MPLS is given as follows:

\left\{ \begin{matrix} X = \hat{X} + \bar{X} \\ Y = XM + E_{y} \end{matrix} \right.

(5)

As can be seen from formula (5), $\hat{X}$ is responsible for predicting output $Y$ , while $\bar{X}$ has almost no contribution to predicting output $Y$ . Although the performance of the MPLS algorithm in actual applications has been greatly improved compared with PLS, the following problems still exist. (1) The space decomposition of MPLS may cause loss of the output information failing to orthogonally decompose the input according to the relationship between the input and the output correctly. (2) MPLS uses the traditional statistical and static threshold design methods, which may result in high FAR and low EAR.

Proposed fault detection approach

In this section, a new spatial decomposition algorithm, APLS, is proposed to address the shortcomings of MPLS. The section covers the spatial decomposition principle of the APLS algorithm, the statistic and threshold design method based on local information increment technology, and the detailed steps of the algorithm.

Complete space decomposition

The PLS algorithm aims to completely decompose the process variable $X$ into two subspaces that are related and unrelated to the quality variable $Y$ , and then monitor the two subspaces separately. Based on the goal of PLS, this paper first gives the desired decomposition form of the process variable $X$ as follows:

X = Y Φ + R_{X} = \hat{X} + R_{X}

(6)

It should be emphasized that $X$ and $Y$ in (6) are a known set of modeling data, and have no meanings when used as input and output data. $Φ$ is the model parameter obtained when $X$ and $Y$ are known as input and output data and contains correlation information between $X$ and $Y$ . $R_{X}$ is a space orthogonal to $Y$ containing information not related to $Y$ . The above discussion leads to the following formula:

cov (r_{x}, y) = ε {y r_{x}^{T}} = 0

(7)

where $r_{x}^{T}$ and $y^{T}$ are row vectors of $R_{X}$ and $Y$ , respectively. Without loss of generality, $Φ$ is assumed to be a full column-rank matrix and $1 \leq m < n$ .

According to (7), in the case of $N >> max {n, m}$ , it follows that

\frac{1}{N} X^{T} Y = \frac{1}{N} {(Y Φ + R_{X})}^{T} Y = \frac{1}{N} {(Y Φ)}^{T} Y + \frac{1}{N} R_{X}^{T} Y

(8)

Since $R_{X}^{T} Y = 0$ , then (8) becomes:

\frac{1}{N} X^{T} Y = \frac{1}{N} Φ^{T} Y^{T} Y

(9)

Formula (9) can be equivalently transformed to the following expression:

Φ = {(Y^{T} Y)}^{- 1} Y^{T} X

(10)
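The step from (9) to (10) can be spelled out as follows. This is a short sketch that assumes $Y^{T} Y$ is invertible (that is, $Y$ has full column rank), which the text does not state explicitly.

```latex
\frac{1}{N} X^{T} Y = \frac{1}{N} \Phi^{T} Y^{T} Y
\;\Longrightarrow\;
Y^{T} X = \left( Y^{T} Y \right)^{T} \Phi = Y^{T} Y \, \Phi
\;\Longrightarrow\;
\Phi = \left( Y^{T} Y \right)^{-1} Y^{T} X \in R^{l \times m}
```

The middle step transposes both sides of (9) and uses the symmetry of $Y^{T} Y$; the result is the ordinary least-squares coefficient of regressing $X$ on $Y$.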

So far, the relation matrix $Φ$ of $X$ and $Y$ has been obtained. Based on this result, $X$ is decomposed into two parts, $\hat{X}$ and $R_{X}$ : $\hat{X}$ contains only the variable information related to $Y$ , and $R_{X}$ contains only the variable information orthogonal to $Y$ . A simple way to perform this decomposition is to project $X$ orthogonally onto $span {Φ}$ and $span {Φ}^{⊥}$ , so that $\hat{X} \in span {Φ}$ and $R_{X} \in span {Φ}^{⊥}$ . The orthogonal projection requires the following steps:

Perform SVD on the $m \times m$ matrix $Φ^{T} Φ$ :

Φ^{T} Φ = [\begin{matrix} {\hat{Γ}}_{ψ} & {\bar{Γ}}_{ψ} \end{matrix}] [\begin{matrix} Λ_{ψ} & 0 \\ 0 & 0 \end{matrix}] [\begin{matrix} {\hat{Γ}}_{ψ}^{T} \\ {\bar{Γ}}_{ψ}^{T} \end{matrix}]

(11)

where ${\hat{Γ}}_{ψ} \in R^{m \times l}, {\bar{Γ}}_{ψ} \in R^{m \times (m - l)}$ and $Λ_{ψ} \in R^{l \times l}$ .

Construct the orthogonal projection operators $Ξ_{ψ}$ and $Ξ_{ψ}^{⊥}$ , which are the orthogonal projectors onto $span {Φ}$ and $span {Φ}^{⊥}$ respectively, that is,

Ξ_{ψ} = {\hat{Γ}}_{ψ} {\hat{Γ}}_{ψ}^{T}, Ξ_{ψ}^{⊥} = {\bar{Γ}}_{ψ} {\bar{Γ}}_{ψ}^{T}

(12)

Decompose $X$ into two subspaces as follows:

\left\{ \begin{matrix} \hat{X} = X Ξ_{ψ} = X {\hat{Γ}}_{ψ} {\hat{Γ}}_{ψ}^{T} \in S_{\hat{X}} = span {Φ} \\ R_{X} = X Ξ_{ψ}^{⊥} = X {\bar{Γ}}_{ψ} {\bar{Γ}}_{ψ}^{T} \in S_{R_{X}} = span {Φ}^{⊥} \end{matrix} \right.

(13)
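For the special case of a single quality variable ($l = 1$, as in the TEP study where only XMEAS (35) is used), steps (10) to (13) reduce to a rank-one projection, which the following minimal sketch illustrates. It assumes that the SVD is taken of the $m \times m$ matrix $Φ^{T} Φ$ (as in step 4 of Algorithm 1), so that the only retained direction is $Φ^{T} / \| Φ \|$. The function name `decompose_single_quality` and the list-of-rows data layout are illustrative choices, not part of the original method description.

```python
def decompose_single_quality(X, y):
    """Split X (N rows, m columns) into X_hat + R_x for one quality variable y.

    Sketch for l = 1 only: Phi is then a single row vector phi, and the
    projector onto span{Phi} is phi phi^T / ||phi||^2.
    """
    n_rows, n_cols = len(X), len(X[0])
    yty = sum(v * v for v in y)
    if yty == 0.0:
        raise ValueError("y must not be identically zero")
    # Eq. (10) with l = 1: phi_j = (y^T x_j) / (y^T y)
    phi = [sum(y[i] * X[i][j] for i in range(n_rows)) / yty
           for j in range(n_cols)]
    norm_sq = sum(p * p for p in phi)
    if norm_sq == 0.0:
        raise ValueError("X is uncorrelated with y; the projector is undefined")
    X_hat, R_x = [], []
    for row in X:
        # Coordinate of this sample along phi, then map back to the m-dim space
        coeff = sum(row[j] * phi[j] for j in range(n_cols)) / norm_sq
        hat_row = [coeff * phi[j] for j in range(n_cols)]
        X_hat.append(hat_row)
        R_x.append([row[j] - hat_row[j] for j in range(n_cols)])
    return phi, X_hat, R_x


# Small illustrative data set (made-up numbers)
X_demo = [[1.0, 2.0, 0.5], [2.0, 0.0, 1.0], [0.0, 1.0, -1.0], [1.5, -1.0, 0.0]]
y_demo = [1.0, 2.0, -1.0, 0.5]
phi_demo, X_hat_demo, R_x_demo = decompose_single_quality(X_demo, y_demo)
```

In this special case the residual part satisfies $R_{X}^{T} Y = 0$ up to rounding, which matches the orthogonality requirement in (7). For $l > 1$ a numerical SVD routine would be needed instead of the closed-form projector.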

$Y$ is decomposed according to the relationship with $\hat{X}$ into a predictable part of quality-related information $\hat{Y}$ and an unpredictable part of quality-related information $E_{Y}$ as follows:

Y = \hat{X} Ω + E_{Y} = \hat{Y} + E_{Y}

(14)

To achieve the above decomposition, perform SVD again on matrix $Ω Ω^{T}$

Ω Ω^{T} = [\begin{matrix} \hat{P} & \bar{P} \end{matrix}] [\begin{matrix} Λ_{Ω} & 0 \\ 0 & 0 \end{matrix}] [\begin{matrix} {\hat{P}}^{T} \\ {\bar{P}}^{T} \end{matrix}]

(15)

where $\hat{P} \in R^{m \times l}, \bar{P} \in R^{m \times (m - l)}$ and $Λ_{Ω} \in R^{l \times l}$ . Construct orthogonal projection operator $Ψ$ and $Ψ^{⊥}$ , which are the orthogonal projectors on $span {Ω}$ and $span {Ω}^{⊥}$ respectively, that is,

Ψ = \hat{P} {\hat{P}}^{T}, Ψ^{⊥} = \bar{P} {\bar{P}}^{T}

(16)

Decompose $Y$ into two subspaces as follows:

\hat{Y} = Y Ψ = Y \hat{P} {\hat{P}}^{T} \in S_{\hat{Y}} = span {Ω}

(17)

E_{Y} = Y Ψ^{⊥} = Y \bar{P} {\bar{P}}^{T} \in S_{E_{Y}} = span {Ω}^{⊥}

(18)

The final APLS model is given as follows:

\left\{ \begin{matrix} X = \hat{X} + R_{X} = X {\hat{Γ}}_{ψ} {\hat{Γ}}_{ψ}^{T} + X {\bar{Γ}}_{ψ} {\bar{Γ}}_{ψ}^{T} \\ Y = \hat{Y} + E_{Y} = Y \hat{P} {\hat{P}}^{T} + Y \bar{P} {\bar{P}}^{T} \end{matrix} \right.

(19)

Note that $\hat{X}$ only contains information related to the quality variable $Y$ , and $R_{X}$ is orthogonal to $Y$ ; $\hat{Y}$ is responsible for predicting information related to quality, and $E_{Y}$ is the residual of the quality variable.
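The claim that $R_{X}$ is orthogonal to $Y$ can be checked directly. The sketch below assumes that the SVD in (11) is taken of the $m \times m$ matrix $\Phi^{T} \Phi$ (as in step 4 of Algorithm 1), so that the columns of $\bar{\Gamma}_{\psi}$ span its null space, and that $\Phi$ is defined by (10).

```latex
\begin{aligned}
\Phi^{T}\Phi = \hat{\Gamma}_{\psi}\Lambda_{\psi}\hat{\Gamma}_{\psi}^{T}
  &\;\Longrightarrow\; \bar{\Gamma}_{\psi}^{T}\Phi^{T}\Phi\,\bar{\Gamma}_{\psi} = 0
  \;\Longrightarrow\; \Phi\,\bar{\Gamma}_{\psi} = 0, \\
X^{T}Y &= \Phi^{T}\,Y^{T}Y \quad \text{(transpose of (10) multiplied by } Y^{T}Y\text{)}, \\
R_{X}^{T}Y &= \bar{\Gamma}_{\psi}\bar{\Gamma}_{\psi}^{T}X^{T}Y
  = \bar{\Gamma}_{\psi}\,(\Phi\,\bar{\Gamma}_{\psi})^{T}\,Y^{T}Y = 0 .
\end{aligned}
```

Hence the residual part $R_{X} = X \bar{\Gamma}_{\psi} \bar{\Gamma}_{\psi}^{T}$ carries no sample correlation with $Y$, consistent with (7).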

Local information increment

The local information increment technology is mainly used to calculate the local data covariance matrix, local information increment matrix, and local information increment mean of the data matrix. The problem that the thresholds are static can be solved by using the local information increment technology. The specific steps are as follows:

Give a set of observation data:

X_{n} = [\begin{matrix} X_{1} (1) & X_{1} (2) & \dots & X_{1} (N) \\ X_{2} (1) & X_{2} (2) & \dots & X_{2} (N) \\ \vdots & \vdots & \ddots & \vdots \\ X_{m} (1) & X_{m} (2) & \dots & X_{m} (N) \end{matrix}] \in R^{m \times N}

(20)

Preprocess the sampled data and calculate the mean vector $b_{n}$ of $X_{n}$ as follows:

b_{n} = \frac{1}{n} X_{n} l_{n}

(21)

where $l_{n} = {[1, 1, \dots, 1]}^{T} \in R^{n \times 1}$ . Then, preprocess the original data to obtain $X_{n}^{l}$

X_{n}^{l} = X_{n} - b_{n} l_{n}^{T}

(22)

Choose the sampled data with a fixed window length L from normal data as the local data matrix:

X_{n}^{L} = [X (i_{n' - L + 1}), . . ., X (i_{n'})]

(23)

where $i_{n'}$ is a certain time in the normal sampling data. When the sampled data at the $(n + 1)$th time arrives, the local data matrix becomes:

X_{n + 1}^{L} = [X (i_{n' - L + 2}), . . ., X (i_{n'}), X (n + 1)]

(24)

It can be seen from formula (23) and (24) that the local data matrix formed by the common part of the two is

Y_{n, n + 1}^{L} = [X (i_{n' - L + 2}), . . ., X (i_{n'})]

(25)

From (25), the mean vector defined by each sampling data can be obtained by

y_{n, n + 1}^{L} = \frac{1}{L - 1} Y_{n, n + 1}^{L} i_{n}

(26)

where $i_{n} = {[1, 1, \dots, 1]}^{T} \in R^{(L - 1) \times 1}$ . For calculation simplicity, we define

K_{n}^{L} = X_{n}^{L} {(X_{n}^{L})}^{T}

(27)

Then, the local covariance matrix at the $n$th sampling time is

R_{n}^{L} = \frac{K_{n}^{L}}{L - 1} - \frac{L b_{n}^{L} {(b_{n}^{L})}^{T}}{L - 1}

(28)

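where $b_{n}^{L}$ is the mean vector of the local data matrix $X_{n}^{L}$ at the $n$th sampling time, that is, the mean of the common part in (25) together with the oldest sample $X (i_{n' - L + 1})$ :

b_{n}^{L} = \frac{(L - 1) y_{n, n + 1}^{L} + X (i_{n' - L + 1})}{L}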

(29)

Similarly, the following relationships hold at the $(n + 1)$th sampling time:

K_{n + 1}^{L} = X_{n + 1}^{L} {(X_{n + 1}^{L})}^{T}

(30)

R_{n + 1}^{L} = \frac{K_{n + 1}^{L}}{L - 1} - \frac{L b_{n + 1}^{L} {(b_{n + 1}^{L})}^{T}}{L - 1}

(31)

From (28) and (31), the local information increment matrix $D_{n + 1}^{L}$ can be obtained as:

D_{n + 1}^{L} = R_{n + 1}^{L} - R_{n}^{L}

(32)

The average local information increment at the $(n + 1)$th time is as follows:

λ_{n + 1}^{L} = \frac{\sum_{i = 1}^{p} \sum_{j = 1}^{p} | D_{n + 1}^{L} [i, j] |}{p^{2}}

(33)

Calculate the local dynamic threshold as follows:

σ_{n + 1}^{L} = \frac{h}{L} \sum_{k = n' - L + 1}^{n'} γ^{*}

(34)

where $γ^{*}$ is the local information increment mean without fault.
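The quantities in (27) to (34) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: a window is stored as a list of samples (the transpose of $X_{n}^{L}$), the mean vector is computed directly from the window, and the function names are hypothetical.

```python
def window_mean(window):
    """Mean vector b of a window given as a list of L samples."""
    L, p = len(window), len(window[0])
    return [sum(sample[j] for sample in window) / L for j in range(p)]


def local_covariance(window):
    """Eqs. (27)-(28): R = (K - L b b^T) / (L - 1), with K = sum_k x_k x_k^T."""
    L, p = len(window), len(window[0])
    b = window_mean(window)
    R = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            k_ij = sum(sample[i] * sample[j] for sample in window)
            R[i][j] = (k_ij - L * b[i] * b[j]) / (L - 1)
    return R


def increment_mean(prev_window, next_window):
    """Eqs. (32)-(33): mean absolute entry of D = R_{n+1} - R_n."""
    R_prev = local_covariance(prev_window)
    R_next = local_covariance(next_window)
    p = len(R_prev)
    total = sum(abs(R_next[i][j] - R_prev[i][j])
                for i in range(p) for j in range(p))
    return total / (p * p)


def dynamic_threshold(normal_increment_means, h, L):
    """Eq. (34): h / L times the sum of the last L fault-free increment means."""
    recent = normal_increment_means[-L:]
    return h * sum(recent) / L


# Two consecutive windows of length L = 3 that share two samples (made-up data)
w_n = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
w_n1 = [[3.0, 4.0], [5.0, 9.0], [7.0, 6.0]]
lam = increment_mean(w_n, w_n1)
```

The sketch recomputes both covariance matrices from scratch for clarity; an incremental update that reuses the common part (25) would avoid the repeated sums.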

Proposed APLS approach

Based on the above analysis, the main steps of APLS are summarized as follows.

Step 1: Collect normal process data and quality data into matrices $X$ and $Y$ .

Step 2: Calculate the coefficient matrix $Φ$ according to the desired decomposition of $X$ and $Y$ .

Step 3: Calculate orthogonal projection operators $Ξ$ and $Ξ^{⊥}$ by the coefficient matrix $Φ$ in Step 2.

Step 4: Decompose $X$ orthogonally into two subspaces $\hat{X}$ and $R_{X}$ .

Step 5: Calculate the coefficient matrix $Ω$ according to the desired decomposition of $Y$ and $\hat{X}$ .

Step 6: Calculate orthogonal projection operators $Ψ$ and $Ψ^{⊥}$ according to the coefficient matrix $Ω$ , and decompose $Y$ into a predictable quality-related space $\hat{Y}$ and an unpredictable quality-related space $E_{Y}$ .

Step 7: Normalize the testing data $X_{new}$ .

Step 8: Use Step 3 and Step 4 to decompose the testing data $X_{new}$ into ${\hat{X}}_{new}$ and $R_{X_{new}}$ subspaces.

Step 9: Use the parameter $Ω$ obtained in Step 5 to calculate the predictable quality-related space ${\hat{Y}}_{pre}$ and the unpredictable quality-related space $E_{Y_{res}}$

Step 10: Use the local information increment technique to calculate the average value of the information increment $D_{n}$ and the local dynamic threshold $σ_{n}$ in ${\hat{X}}_{new}$

Step 11: Use fault detection criteria for fault detection:

(a) if $D_{n} \geq σ_{n}$ , it is detected as a quality-related fault,

(b) if $D_{n} < σ_{n}$ , no quality-related fault has occurred.

According to the pseudocode of Algorithm 1 (Table 1), the computational complexity of the algorithm is $T (n) = O (n * m^{2})$ . The parameter h in formula (34) is the optimization coefficient of the threshold. It is chosen so that the fault detection false alarm rate and the fault alarm rate are optimized. The optimal value can be obtained by applying the particle swarm optimization (PSO) algorithm to a large amount of known normal-condition data in the offline modeling phase. Asadi and Karami²² and Li et al.²³ give the detailed steps of the PSO algorithm. The pseudocode of the APLS algorithm is given in Table 1.

Table 1.

The pseudocode of the APLS algorithm.

Algorithm 1: The Pseudocode of the APLS algorithm
1. Input: process variables X, X_new, and quality variables Y, Y_new; the number of principal components A; window length L; the number of variables m; and the sample size n;
2. normalize the process data and quality data in matrices X, Xnew, and Y: Y=normlnew(Y); X=normlnew(X); Xnew=normlnew(Xnew);
3. compute the relationship information matrix: M=pinv(Y'*Y)*Y'*X;
4. perform SVD decomposition: MM=M'*M; [PPL,LM,PPR]=svd(MM);
5. calculate the orthogonal projection operators: PM=PPL(:,1:A); PMw=PPL(:,A+1:m); PMT=PPR(:,1:A)'; PMwT=PPR(:,A+1:m)'; OM=PM*PMT; OMO=PMw*PMwT;
6. calculate the Xnew score matrices: for i=1:1:n txpnew(1:A,i)=PM'*Xnew(i,:)'; end for i=1:1:n txwnew(:,i)=PMw'*Xnew(i,:)'; end
7. calculate the principal component subspace and residual subspace: Xps=txpnew'*PMT; Xrs=txwnew'*PMwT;
8. preprocess the principal component subspace: X_n=X_pr-b_n*l_n';
9. calculate the statistics and local dynamic thresholds: for k=L:1:n R_n_L=(K_n_L-L*b_n_L*b_n_L')/(L-1); R_n1_L=(K_n1_L-L*b_n1_L*b_n1_L')/(L-1); D_n1_L=R_n1_L-R_n_L; for i=1:1:m for j=1:1:m temp=abs(D_n1_L(i,j)); T=T+temp; end end T=T/(m*m); Threshold=h*sum(T(n'-L+1:n'))/L; end
10. Output: the FAR and EAR are given based on the mean value of the local information increment matrix and local dynamic thresholds.

The determination process of whether a fault has occurred is as follows:

(1) Denote the system's dynamic threshold as $σ_{n + 1}^{L}$ . If $λ_{n + 1}^{L} \geq 3 σ_{n + 1}^{L}$ , a fault has occurred in the system, and $λ_{n + 1}^{L}$ is then denoted as ${}^{*}{λ_{n + 1}^{L}}$ . Since the fault occurred at time $n + 1$ , it is present in the sample $x (n + 1)$ , which is relabeled as $x^{*} (j_{m' + 1})$ . To continue detection and updating, the faulty sample $x^{*} (j_{m' + 1})$ must not be included when the local information increment mean is computed in the subsequent step; that is, $x (n + 1)$ is excluded from $X_{n + 1}^{L}$ . At the same time, the relevant parameters are reassigned as $X_{n + 1}^{L} = X_{n}^{L}, K_{n + 1}^{L} = K_{n}^{L}, R_{n + 1}^{L} = R_{n}^{L}$ . Consequently, if a fault occurs at time $n + 1$ , then when the sample $x (n + 2)$ arrives at time $n + 2$ , the corresponding local data matrix is $X_{n + 2}^{L} = [x (i_{n' - L + 2}), x (i_{n' - L + 3}), \dots, x (i_{n'}), x (n + 2)]$ . Let $H (n + 1) = {l | {}^{*}{λ_{l}^{L}}}$ be the set of all subscripts of the local information increment means ${}^{*}{λ_{l}^{L}}$ recorded at fault times, with $| H |$ representing the number of elements in this set. The set of all local information increment means at fault times is ${{}^{*}{λ_{j_{1}}^{L}}, \dots, {}^{*}{λ_{j_{k}}^{L}}, \dots, {}^{*}{λ_{j_{m' + 1}}^{L}}}$ , with subscripts satisfying ${j_{1}, \dots, j_{k}, \dots, j_{m' + 1}} \subseteq {1, 2, \dots, n + 1}$ . The local information increment means at fault-free times are denoted as ${λ_{i_{1}}^{L}, \dots, λ_{i_{k}}^{L}, \dots, λ_{i_{n'}}^{L}}$ , with subscripts satisfying ${i_{1}, i_{2}, \dots, i_{k}, \dots, i_{n'}} \subseteq {1, 2, \dots, n + 1}$ . The two index sets together cover all sampling times:

\begin{matrix} {i_{1}, i_{2}, \dots, i_{k}, \dots, i_{n'}} \cup {j_{1}, j_{2} \dots, j_{k}, \dots, j_{m' + 1}} \\ = {1, 2, \dots, n + 1} \end{matrix}

(35)

The dynamic threshold for local normal sampling data of length L based on a fixed window is represented as:

σ_{n + 1}^{L} = \frac{h}{L} \sum_{k = n' - L + 1}^{n'} λ_{i_{k}}^{L}

(36)

(2) If $λ_{n + 1}^{L} < 3 σ_{n + 1}^{L}$ , no fault has occurred in the system. Sampling continues, and step (1) is applied to the next sample. When the dynamic threshold is selected, the $n'$ in $σ_{n + 1}^{L}$ indexes the non-anomalous data up to time $n$ . When $n$ is not greater than L + 2 and the number of normal $λ^{L}$ values is less than L, the threshold is calculated from all available normal $λ^{L}$ values.
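The decision logic of steps (1) and (2) can be sketched as a simple loop over precomputed local information increment means. This is a simplified illustration under stated assumptions: the increment means are taken as given (the window reassignment $X_{n + 1}^{L} = X_{n}^{L}$ after a fault is therefore not modeled), the first value is treated as fault-free because no threshold exists yet, and when fewer than L fault-free values are available the sum is divided by their actual count. The name `detect_faults` and the `factor` parameter are hypothetical.

```python
def detect_faults(increment_means, h, L, factor=3.0):
    """Flag each increment mean against a threshold built from fault-free values only."""
    normal_history = []  # fault-free increment means seen so far
    flags = []
    for lam in increment_means:
        if not normal_history:
            # No threshold can be formed yet; assume the first value is normal
            normal_history.append(lam)
            flags.append(False)
            continue
        recent = normal_history[-L:]
        # Eq. (36); with fewer than L normal values, all of them are used
        sigma = h * sum(recent) / len(recent)
        is_fault = lam >= factor * sigma
        flags.append(is_fault)
        if not is_fault:
            # Only fault-free values update the threshold window
            normal_history.append(lam)
    return flags


# Made-up sequence with one large jump at the fourth position
flags_demo = detect_faults([1.0, 1.1, 0.9, 10.0, 1.0], h=1.0, L=3)
```

Because faulty values never enter `normal_history`, a single large increment does not inflate the threshold used for the following samples, which is the behaviour that step (1) describes.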

Remark 1: This method employs a fixed-size sampling data window and a fixed threshold window, and detects faults through the information increment matrix of the sliding-window covariance matrix. Notably, this approach avoids computing the eigenvalues and eigenvectors of the covariance matrix. Instead, it uses a covariance matrix that is updated in real time to derive the information increment matrix by subtraction, and then calculates the mean of the information increment. This not only reduces the data involved in updating the covariance matrix but also makes the dynamic thresholds more representative, since they are computed from data close to the current moment. Consequently, it holds promise for reducing the system's FAR.

Numerical example and case study

In this section, a numerical simulation is first performed to illustrate the implementation of the proposed APLS algorithm. Then, the effectiveness of APLS is verified by comparing its quality-related fault detection performance with that of MPLS and OSC-MPLS on the TE process simulation. Two indices, EAR and FAR,^12,24 are employed to evaluate performance, defined as follows:

EAR = \frac{\text{No. of effective alarms}}{\text{Total faulty samples}} \times 100 \%

(37)

FAR = \frac{\text{No. of false alarms}}{\text{Total fault-free samples}} \times 100 \%

(38)

From the perspective of industrial applications, a superior quality-related fault detection scheme should possess the following capability.

(a) The higher the EAR is, the stronger the performance of the algorithm in detecting quality-related faults is;

(b) The lower the FAR is, the more accurate the monitoring performance of the algorithm on normal data is.
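The two indices can be computed from a sequence of alarm decisions and the known fault labels, as in the following minimal sketch. It assumes the usual convention that EAR is normalized by the number of faulty samples and FAR by the number of fault-free samples; the function name `alarm_rates` is illustrative.

```python
def alarm_rates(alarms, is_faulty):
    """Return (EAR, FAR) in percent from per-sample alarm flags and fault labels."""
    if len(alarms) != len(is_faulty):
        raise ValueError("alarms and is_faulty must have the same length")
    n_faulty = sum(1 for f in is_faulty if f)
    n_normal = len(is_faulty) - n_faulty
    # Effective alarm: an alarm raised on a sample that is truly faulty
    effective = sum(1 for a, f in zip(alarms, is_faulty) if a and f)
    # False alarm: an alarm raised on a sample that is truly fault-free
    false_alarms = sum(1 for a, f in zip(alarms, is_faulty) if a and not f)
    ear = 100.0 * effective / n_faulty if n_faulty else float("nan")
    far = 100.0 * false_alarms / n_normal if n_normal else float("nan")
    return ear, far


# Made-up example: 4 fault-free samples followed by 4 faulty samples
alarms_demo = [False, True, False, False, True, True, False, True]
labels_demo = [False, False, False, False, True, True, True, True]
ear_demo, far_demo = alarm_rates(alarms_demo, labels_demo)
```

A detector that raises an alarm on every sample would score an EAR of 100% but also a FAR of 100%, which is why the two indices are reported together.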

Numerical example

Consider the following example in which the input is dynamic while the output is static¹²:

\left\{ \begin{matrix} t_{k} = a_{1} t_{k - 1} - a_{2} t_{k - 2} + {t_{k}}^{*} \\ x_{k} = P t_{k} + e_{k} \\ y_{k} = C_{1} x_{k} + C_{2} x_{k - 1} + v_{k} \end{matrix} \right.

(39)

Where $a_{1} = [\begin{matrix} 0.4389 & 0.1210 & - 0.0862 \\ - 0.2966 & - 0.0550 & 0.2274 \\ 0.4538 & - 0.6573 & 0.4239 \end{matrix}]$ , $a_{2} = [\begin{matrix} - 0.2998 & - 0.1905 & - 0.2669 \\ - 0.0204 & - 0.1585 & - 0.2950 \\ 0.1461 & - 0.0755 & 0.3749 \end{matrix}]$ , $P = [\begin{matrix} 0.5586 & 0.2042 & 0.6370 \\ - 0.2007 & 0.0492 & 0.4429 \\ 0.0874 & 0.6062 & 0.0664 \\ 0.9332 & 0.5463 & 0.3743 \\ 0.2594 & 0.0958 & 0.2491 \end{matrix}]$ , $C_{1} = {[\begin{matrix} 0.9249 & 0.4350 \\ 0.6295 & 0.9811 \\ 0.8783 & 0.0960 \\ 0.6417 & 0.5275 \\ 0.7948 & 0.5456 \end{matrix}]}^{T}$ , $C_{2} = {[\begin{matrix} 1.7198 & - 0.3715 \\ 0.5835 & 1.5011 \\ 1.4236 & 1.3226 \\ 0.4963 & - 1.4145 \\ - 2.5717 & 1.0696 \end{matrix}]}^{T}$

A fault is added to the samples according to (40):

x_{k} = {x_{k}}^{*} + x_{f}

(40)

where ${x_{k}}^{*}$ denotes the fault-free samples and $x_{f}$ is the added fault term. Formula (39) is used to generate 900 samples under normal working conditions to establish the APLS regression model and obtain the model parameters. Then, the quality-related and quality-unrelated faults are used to generate two sets of 900 samples as the testing data for detection. In each testing set, the first 400 samples are normal data and the last 500 samples are fault data. Here, the number of principal components A is 9, obtained by cross-validation,²⁵ and L is 3, obtained by the particle swarm optimization (PSO) algorithm.
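The construction of a testing set by (40) can be sketched as follows. The fault vector is the quality-related one given in (41); the fault-free samples are replaced by an all-zero placeholder here, because the noise distributions needed to simulate (39) are not specified in this section. The function name `inject_fault` is illustrative.

```python
def inject_fault(clean_samples, x_f, fault_start):
    """Eq. (40): x_k = x_k^* + x_f for k >= fault_start; earlier samples unchanged."""
    faulty = []
    for k, x in enumerate(clean_samples):
        if k >= fault_start:
            faulty.append([xi + fi for xi, fi in zip(x, x_f)])
        else:
            faulty.append(list(x))
    return faulty


# Quality-related fault vector from Eq. (41)
x_f_quality_related = [2.0, 1.0, -3.0, 2.0, -5.0]

# Placeholder for 900 fault-free samples of dimension 5 (assumption: zeros
# stand in for data that would normally be simulated from Eq. (39))
clean = [[0.0] * 5 for _ in range(900)]

# First 400 samples stay normal, the last 500 carry the fault
test_set = inject_fault(clean, x_f_quality_related, fault_start=400)
```

The same helper applies to the quality-unrelated case by passing the corresponding fault vector.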

Figure 1 shows the prediction results of the proposed APLS algorithm for the quality variable $Y$ under normal working conditions. It can be seen that the proposed APLS algorithm accurately tracks the true value of the quality and has strong prediction performance for changes in the quality variables. Then, online detection is performed on the samples with quality-related and quality-unrelated faults, respectively.

(a) The quality-related fault values are added as follows:

x_{f} = {[\begin{matrix} 2.0000 & 1.0000 & - 3.0000 & 2.0000 & - 5.0000 \end{matrix}]}^{T}

(41)

Figure 1.

Predicted values and true values under fault-free conditions.

Figure 2 shows the detection results after the quality-related fault is added. The fault is detected accurately as soon as it is introduced, which demonstrates the effectiveness of the proposed APLS algorithm for quality-related fault detection.

(b) The quality-unrelated fault values are added as follows:

x_{f} = {[\begin{matrix} 0.0054 & 0.3145 & - 0.0432 & 0.7516 & - 0.4440 \end{matrix}]}^{T}

(42)

Figure 2.

Detection results of the proposed approach under the quality-related fault.

Figure 3 shows the detection results after the quality-unrelated fault is added. The APLS algorithm detects the fault accurately as soon as the fault samples are introduced, which demonstrates the effectiveness of the proposed APLS algorithm for quality-unrelated fault detection.

Figure 3.

Detection results of the proposed approach under the quality-unrelated fault.

Tennessee Eastman Process

The Tennessee Eastman Process (TEP)^26,27 is a simulation example based on an actual industrial process, proposed by the process control department of the Tennessee Eastman Chemical Company in the United States in 1993. The various parts of the TEP system are strongly coupled, and the process is highly nonlinear and open-loop unstable, which makes it one of the challenging control problems in the field of process control. Because it is based on an actual industrial process, it is widely used to evaluate the performance of process monitoring methods and has achieved good practical results. The TEP system has 12 manipulated variables and 41 measured variables, of which 22 are continuous variables and 19 are component variables. In addition, the process includes 21 kinds of disturbances, among which 15 are known faults. These 15 known faults fall into two types: quality-related faults, such as IDV(1), IDV(2), IDV(5)–IDV(8), IDV(10), and IDV(12)–(13), and quality-unrelated faults, such as IDV(3)–(4), IDV(9), IDV(11), and IDV(15). The collected samples comprise normal data sets and fault data sets. The normal data set contains 480 samples, and each fault data set contains 960 samples. The data are standardized before modeling; the normal data are then used to build the regression model, and the fault data sets are used as the testing data for detection. Moreover, the final product component XMEAS (35) is selected as the quality variable $Y$ . Thirty-three variables are selected as the input data matrix $X$ , including 22 process variables (XMEAS (1–22)) and 11 manipulated variables (XMV (1–11)). The modeling parameters are set as h = 1.5, L = 3, A = 12.
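The data preparation described above can be sketched as follows. The column layout is a hypothetical assumption: each raw sample is taken to hold XMEAS (1–41) in positions 0 to 40 followed by XMV (1–11) in positions 41 to 51, which may differ from the files actually used. Standardization uses the mean and standard deviation of the normal (training) data, and the function names are illustrative.

```python
import math


def split_inputs_output(row):
    """Pick X = XMEAS(1-22) + XMV(1-11) and Y = XMEAS(35) from one raw sample.

    Assumed (hypothetical) layout: XMEAS(1..41) at indices 0..40,
    XMV(1..11) at indices 41..51.
    """
    x = list(row[0:22]) + list(row[41:52])
    y = [row[34]]
    return x, y


def fit_standardizer(train_rows):
    """Column means and sample standard deviations of the normal data."""
    n, p = len(train_rows), len(train_rows[0])
    means = [sum(r[j] for r in train_rows) / n for j in range(p)]
    stds = []
    for j in range(p):
        var = sum((r[j] - means[j]) ** 2 for r in train_rows) / (n - 1)
        # Guard against constant columns to avoid division by zero
        stds.append(math.sqrt(var) if var > 0.0 else 1.0)
    return means, stds


def standardize(rows, means, stds):
    """Apply training statistics to any data set (normal or faulty)."""
    return [[(r[j] - means[j]) / stds[j] for j in range(len(means))]
            for r in rows]
```

Applying the training statistics to the fault data sets, rather than re-estimating them, keeps a fault from being absorbed into the scaling.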

MPLS, OSC-MPLS, and the proposed approach are used to detect IDV(1). Figure 4 shows the detection results of IDV(1) by the MPLS algorithm. It can be seen that some statistics of the MPLS algorithm are below the threshold during the fault samples, which indicates that the MPLS algorithm has a lower EAR. Figure 5 shows the detection results of the OSC-MPLS algorithm for IDV(1). After orthogonal signal correction is performed on the data, the EAR of the OSC-MPLS algorithm is significantly improved, but some fault data are still not effectively detected by the algorithm.

Figure 4.

Detection results of the MPLS approach under IDV(1).

Figure 5.

Detection results of the OSC-MPLS approach under IDV(1).

Figure 6 shows the detection results of the proposed APLS algorithm on IDV(1). It can be seen that the fault data are effectively detected by the APLS algorithm once they are introduced. Compared with MPLS and OSC-MPLS, the FAR of APLS is significantly decreased, which indicates that the proposed algorithm performs well in quality-related fault detection.

Figure 6.

Detection results of the proposed approach under quality-related fault.

Table 2 gives the detection rates of the quality-related faults for MPLS, OSC-MPLS, and APLS, respectively. The bold part shows the group with the highest EAR among the three algorithms, which reflects that the EAR of the proposed APLS algorithm is significantly improved. In Table 2, the EAR of OSC-MPLS is generally higher than that of MPLS, since OSC-MPLS performs orthogonal signal correction on the data on top of MPLS, which removes the information in $X$ that is orthogonal to $Y$ ; only the EAR values of IDV(2), IDV(6), and IDV(13) are lower than those of MPLS. The quality-related fault detection rate of the proposed APLS algorithm is significantly improved compared with MPLS and OSC-MPLS, and only the EAR of IDV(5) is lower than that of OSC-MPLS, by 7.37%. Except for IDV(5), the EAR of all quality-related faults under APLS is above 95%, which achieves the purpose of accurately alarming quality-related faults. Besides, the EAR of IDV(7), IDV(8), and IDV(10) is improved markedly, being 43.87%, 31.83%, and 43.32% higher than that of OSC-MPLS, respectively.

Table 2.

EAR of three algorithms for the quality-related faults of the TEP.

Fault ID	Fault description	MPLS (%)	OSC-MPLS (%)	APLS (%)
IDV(1)	D feed temperature	88.63	93.01	99.88
IDV(2)	Reactor cooling water inlet temperature	91.76	91.63	99.38
IDV(5)	D feed temperature	99.75	99.87	92.50
IDV(6)	Reactor cooling water inlet temperature	99.00	98.75	100
IDV(7)	Reactor cooling water inlet temperature	18.85	52.68	96.55
IDV(8)	Condenser cooling water valve	63.42	67.42	99.25
IDV(10)	Reactor cooling water inlet temperature	20.97	53.93	97.25
IDV(12)	Condenser cooling water valve	80.39	90.26	99.88
IDV(13)	Reactor cooling water inlet temperature	87.39	86.39	98.75

Table 3 shows the FAR of MPLS, OSC-MPLS, and APLS for the quality-related faults. The bold entries mark the highest FAR among the three algorithms for each fault. The FAR of MPLS is low: it is nonzero only for IDV(12), at 6%. The FAR of OSC-MPLS is higher, reaching a maximum of 7% for IDV(10). The FAR of APLS is 1% for IDV(7) and IDV(10) and 0% for all other quality-related faults. A comparative analysis of the FAR and EAR obtained from multiple experiments with the three algorithms shows that the proposed algorithm has the best detection performance for quality-related faults.

Table 3.

FAR of three algorithms for the quality-related faults of the TEP.

Fault ID	Fault description	MPLS (%)	OSC-MPLS (%)	APLS (%)
IDV(1)	D feed temperature	0	3.00	0
IDV(2)	Reactor cooling water inlet temperature	0	3.00	0
IDV(5)	D feed temperature	0	4.00	0
IDV(6)	Reactor cooling water inlet temperature	0	2.00	0
IDV(7)	Reactor cooling water inlet temperature	0	6.00	1.00
IDV(8)	Condenser cooling water valve	0	2.00	0
IDV(10)	Reactor cooling water inlet temperature	0	7.00	1.00
IDV(12)	Condenser cooling water valve	6.00	3.00	0
IDV(13)	Reactor cooling water inlet temperature	0	1.00	0
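The EAR and FAR values in Tables 2 and 3 can be read as alarm fractions over the faulty and fault-free portions of a test run, respectively. The following minimal sketch shows that bookkeeping under the usual definitions, assuming a monitoring statistic and its control limit are already available. The function name `alarm_rates`, its arguments, and the toy numbers are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence, Tuple


def alarm_rates(
    statistic: Sequence[float],
    threshold: Sequence[float],
    fault_start: int,
) -> Tuple[float, float]:
    """Return (EAR, FAR) in percent.

    statistic[k] is the monitoring statistic at sample k (0-based) and
    threshold[k] is its control limit. Samples from fault_start onward are
    treated as faulty, earlier ones as fault-free.
    """
    if len(statistic) != len(threshold):
        raise ValueError("statistic and threshold must have equal length")
    if not 0 < fault_start < len(statistic):
        raise ValueError("fault_start must split the run into two non-empty parts")

    # An alarm is raised whenever the statistic exceeds its limit.
    alarms = [s > t for s, t in zip(statistic, threshold)]
    false_alarms = sum(alarms[:fault_start])        # alarms before the fault
    effective_alarms = sum(alarms[fault_start:])    # alarms during the fault
    far = 100.0 * false_alarms / fault_start
    ear = 100.0 * effective_alarms / (len(alarms) - fault_start)
    return ear, far


# Toy run (hypothetical values): 5 fault-free samples, then 5 faulty ones,
# monitored against a static limit of 1.0.
stat = [0.2, 0.4, 1.3, 0.1, 0.5, 2.0, 0.8, 3.1, 2.7, 1.9]
limit = [1.0] * len(stat)
ear, far = alarm_rates(stat, limit, fault_start=5)
print(f"EAR = {ear:.1f}%, FAR = {far:.1f}%")
```

Passing the threshold as a sequence rather than a single number lets the same function score both static and sample-dependent control limits.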

Gearbox fault detection experiment

Gearboxes are used widely, with typical applications in wind turbines, automobiles, and aerospace. In practical engineering, they change speed, alter the direction of transmission, change rotational torque, and distribute power. To validate the efficacy of the algorithm proposed in this article for real engineering applications, experimental tests were conducted using gearbox fault data. The data were obtained from the QPZZ-II Rotating Machinery Vibration Analysis and Fault Diagnosis Test Platform System, which supports comparative analysis and diagnosis of various fault states, in particular simulated misalignment of gear shafts. The platform has been widely used in universities, industrial and mining enterprises, and research institutes for research, teaching, product development, and personnel training. The Japan International Cooperation Agency (JICA) has consistently employed similar platforms to train international equipment diagnostic engineers, with favorable results.

The gearbox data were collected using nine sets of sensors: Channel 1-TACH1 (optical sensor measuring rotational speed); Channel 2-CH1 (measuring directional displacement); Channel 3-CH2 (measuring directional displacement); Channel 4-CH3 (measuring acceleration); Channel 5-CH4 (measuring acceleration); Channel 6-CH5 (measuring acceleration); Channel 7-CH6 (measuring acceleration); Channel 8-CH7 (measuring acceleration); Channel 9-CH8 (measuring magnetic-electric speed), with a sampling frequency of 2000 × 2.56 Hz (5120 Hz). The variables obtained from the nine channels are correlated to a certain degree, which indicates inter-variable coupling, dynamic characteristics, and time-varying properties. In addition, the gearbox data were not acquired under ideal conditions, so structural vibrations and external interference introduce a certain level of disturbance and noise. The collected data fall into two broad types. The first type comprises data obtained under normal operating conditions. These data are typically used as training data and can also be combined with fault data to form test datasets that simulate sudden faults during regular operation. The second type comprises data acquired under fault conditions, which are used primarily as test data for diagnostic testing and can also be used for fault reconstruction. Detailed information on the data collection conditions and fault types is provided in Table 4.

Table 4.

Fault types of gearbox data.

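Data scenarios	Fault locations	Fault types	Set speed (RPM)	Actual measured speed (RPM)
Normal data	/	/	1500	1475
Normal data	/	/	880	852
Data with faults	The large gear	Gear fracture	1500	1470
Data with faults	The large gear	Gear fracture	880	878
Data with faults	The large gear	Gear pitting	1500	1470
Data with faults	The large gear	Gear pitting	880	880
Data with faults	The large gear and the small gear	Gear fracture and gear wear	1470	1474
Data with faults	The large gear and the small gear	Gear fracture and gear wear	880	878
Data with faults	The large gear and the small gear	Gear pitting and gear wear	1470	1472
Data with faults	The large gear and the small gear	Gear pitting and gear wear	880	877
Data with faults	The small gear	Gear wear	1470	1478
Data with faults	The small gear	Gear wear	880	881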

The model parameters are initialized before fault detection is conducted on the gearbox. The gearbox dataset comprises five sets of normal operating condition data and 26 sets of fault data. Before experimentation, 1000 samples recorded under normal conditions were chosen as the modeling data to train the APLS model and obtain its parameters. Subsequently, a test dataset was formed from 1000 samples of normal operational data followed by 1000 samples recorded under fault conditions, which simulates the occurrence of a fault at time step 1001. Channel 1 was chosen as the response variable in both the training and test datasets, while channels 2 through 9, labeled CH1–CH8, were used as the process variables. The number of principal components, A, was determined to be 5 through cross-validation, and the window length, L, was set to 6, a value obtained with the particle swarm optimization (PSO) algorithm.
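The data layout described above can be sketched as follows. This is a minimal illustration only: the recorded channel data are replaced by synthetic Gaussian numbers, the helper names (`synthetic_block`, `standardize`, `split`) are hypothetical, and scaling with training-set statistics is an assumed preprocessing step rather than one stated in the text.

```python
import random
from statistics import mean, pstdev
from typing import List, Tuple

N_TRAIN, N_NORMAL_TEST, N_FAULT_TEST, N_CHANNELS = 1000, 1000, 1000, 9

rng = random.Random(0)


def synthetic_block(n_rows: int, offset: float) -> List[List[float]]:
    """Stand-in for recorded data: n_rows samples of the nine channels."""
    return [[rng.gauss(offset, 1.0) for _ in range(N_CHANNELS)] for _ in range(n_rows)]


# Modeling data: normal operation only.
train = synthetic_block(N_TRAIN, 0.0)
# Test data: 1000 normal samples followed by 1000 faulty samples.
test = synthetic_block(N_NORMAL_TEST, 0.0) + synthetic_block(N_FAULT_TEST, 3.0)
fault_onset = N_NORMAL_TEST + 1  # 1-based index of the first faulty sample

# Column-wise scaling parameters are taken from the training data only
# (assumed preprocessing).
columns = list(zip(*train))
mu = [mean(c) for c in columns]
sigma = [pstdev(c) for c in columns]


def standardize(rows: List[List[float]]) -> List[List[float]]:
    return [[(v - m) / s for v, m, s in zip(row, mu, sigma)] for row in rows]


def split(rows: List[List[float]]) -> Tuple[List[List[float]], List[float]]:
    """Channel 1 is the response y; channels 2-9 (CH1-CH8) form X."""
    y = [row[0] for row in rows]
    X = [row[1:] for row in rows]
    return X, y


X_train, y_train = split(standardize(train))
X_test, y_test = split(standardize(test))
print(len(X_train), len(X_train[0]), len(X_test), fault_onset)
```

Reusing the training mean and standard deviation for the test set keeps a fault visible as a shift relative to normal operation instead of being absorbed by the scaling.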

Figure 7 shows the fault detection results of the APLS algorithm for different types of gearbox faults, including gear wear, gear fracture, gear pitting, and combined faults. During sampling instants 1–1000, when no fault is present, the local dynamic threshold of the APLS algorithm adjusts according to adjacent sampling instants, which reduces the FAR. During sampling instants 1001–2000, when faults are present, the APLS algorithm accurately detects them. Furthermore, to evaluate the local dynamic control limits of the APLS algorithm, its performance was compared with that of the concurrent projection to latent structures (CPLS)²⁷ algorithm and the MPLS algorithm, both of which use static control limits. Figure 8 shows the detection results of the APLS, CPLS, and MPLS algorithms for the combined gear pitting and gear wear fault at 880 RPM. All three algorithms detect the fault and achieve a relatively high fault detection rate. Figure 8 also presents data from sampling instants 1–900. There, the CPLS and MPLS algorithms, which use static control limits, exhibit a higher FAR, whereas the APLS algorithm, which uses local dynamic thresholds, exhibits a lower FAR during normal operating conditions.
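The effect of using adjacent sampling instants can be illustrated with a generic moving-window average, which is a simplified stand-in and not the local information increment statistic or threshold of APLS. In this sketch the statistic is averaged over a trailing window of length L = 6 before it is compared with a static limit; the limit value and the toy data are hypothetical.

```python
from typing import List, Sequence


def window_mean(values: Sequence[float], length: int) -> List[float]:
    """Trailing mean over up to `length` samples ending at each index."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= length:
            running -= values[i - length]  # drop the sample leaving the window
        out.append(running / min(i + 1, length))
    return out


def alarm_indices(values: Sequence[float], limit: float) -> List[int]:
    """0-based indices at which the value exceeds the limit."""
    return [i for i, v in enumerate(values) if v > limit]


L = 6        # window length quoted in the text
LIMIT = 1.0  # hypothetical static control limit

# Toy statistic: fault-free samples with two isolated spikes, then a sustained fault.
normal = [0.3, 0.4, 1.6, 0.2, 0.3, 0.5, 0.4, 1.4, 0.3, 0.2]
faulty = [2.5, 2.8, 3.0, 2.6, 2.9, 3.1]
stat = normal + faulty

raw_alarms = alarm_indices(stat, LIMIT)
windowed_alarms = alarm_indices(window_mean(stat, L), LIMIT)
print("raw:", raw_alarms)
print("windowed:", windowed_alarms)
```

In this toy run the two isolated exceedances at indices 2 and 7 no longer raise alarms once the window is used, while the sustained fault is still flagged, one sample later than with the raw statistic. That delay is the usual price of window-based smoothing.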

Figure 7.

The detection results of APLS algorithm for different fault types at 880 RPM.

Figure 8.

Comparison of three algorithms for detecting gear pitting and gear wear combination fault conditions at 880 RPM.

The results obtained with the APLS algorithm on the gearbox fault data show a notably low FAR and a high EAR. The algorithm is therefore promising for detecting gearbox faults in industries such as wind power generation and aerospace.

Conclusion

This paper proposes a novel multi-space quality-related fault detection method based on advanced PLS (APLS). The proposed algorithm orthogonally decomposes the process variables directly according to their relationship with the quality variables, which avoids the loss of quality information during decomposition. The quality variable is then decomposed into predictable and unpredictable parts according to its relationship with the principal component subspace of the process variables, which removes the interference of system variation in the process variables with the quality prediction. Finally, a new statistic and a new threshold calculation method are designed in the quality-related space in combination with local information increment technology, which significantly improves the EAR of quality-related faults. Because the algorithm monitors only the quality-related space, the quality-unrelated subspace is not monitored. Although quality-unrelated faults have no impact on the final quality of the product, detecting them well may provide a more comprehensive picture of the fault status, which is an interesting topic for future research.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work in this article was supported in part by the 2023 School-level Education and Teaching Reform Research Project (No. 2023xjjy57), the Key Projects of Humanity and Social Science Research Project of Anhui Educational Committee (No. SK2020A0213), the National Natural Science Foundation of China (No. 52004008), the Independent Research Fund of the State Key Laboratory of Mining Response and Disaster Prevention and Control in Deep Coal Mines (Anhui University of Science and Technology) (No. SKLMRDPC20ZZ12), and the Research Project of Xi Jinping New Era Socialism with Chinese Characteristics Thought Research Center (sxzx2021-13).

Ethics approval

We declare that this manuscript has complied with all the ethical requirements of the journal.

Consent to participate

All authors of this manuscript have agreed to participate in the writing of the manuscript.

Consent for publication

All the authors of this manuscript consented to its publication.

ORCID iD

Guisheng Zhang

Data availability

Data used in this manuscript are available from the corresponding author.

References

1. Shi, Lim, et al. Fault detection filtering for nonhomogeneous Markovian jump systems via fuzzy approach. IEEE Trans Fuzzy Syst 2018; 26(1): 131–141.

2. Furqan, Islam, John, et al. Process monitoring and fault detection on a hot-melt extrusion process using in-line Raman spectroscopy and a hybrid soft sensor. Comput Chem Eng 2019; 125(9): 400–414.

3. Muradore, Fiorini. A PLS-based statistical approach for fault detection and isolation of robotic manipulators. IEEE Trans Ind Electron 2012; 59(8): 3167–3175.

4. Shang, Tian, Cao, et al. MPC performance monitoring and diagnosis based on dissimilarity analysis of PLS cross-product matrix. Acta Autom Sin 2017; 43(2): 271–279.

5. Jiang, Huang. Distributed monitoring for large-scale processes based on multivariate statistical analysis and Bayesian method. J Process Control 2016; 46(3): 75–83.

6. Kong, et al. An effective neural learning algorithm for extracting cross-correlation feature between two high-dimensional data streams. Neural Process Lett 2015; 42(2): 459–477.

7. Bansak, Ferwerda, Hainmueller, et al. Improving refugee integration through data-driven algorithmic assignment. Science 2018; 359(6373): 325–329.

8. Wang, Huang, et al. Data-driven resilient fleet management for cloud asset-enabled urban flood control. IEEE Trans Intell Transp Syst 2018; 19(6): 1827–1838.

9. Gao, Kong, et al. A generalized information criterion for generalized minor component extraction. IEEE Trans Signal Process 2017; 65(4): 947–959.

10. Kong, Feng, et al. Generalized weighted rules on modified moller algorithm. Acta Autom Sin 2020; 46(1): 193–199.

11. Ning. Event-triggered fault detection for linear stochastic systems. Trans Inst Meas Control 2017; 40(4): 1423–1431.

12. Jiao, Wang. A quality-related fault detection approach based on dynamic least squares for process monitoring. IEEE Trans Ind Electron 2016; 63(4): 2625–2632.

13. Doostmohammadian, Meskin. Sensor fault detection and isolation via networked estimation: full-rank dynamical systems. IEEE Trans Control Netw Syst 2020; 8(2): 987–996.

14. Doostmohammadian, Zarrabi, Charalambous. Sensor fault detection and isolation via networked estimation: rank-deficient dynamical systems. Int J Control 2023; 96(11): 2853–2870.

15. Qin, Zhou. Geometric properties of partial least squares for process monitoring. Automatica 2010; 46(1): 204–210.

16. Zhou, Qin. Total projection to latent structures for process monitoring. AIChE J 2010; 56(1): 168–178.

17. Yin, Ding, Zhang, et al. Study on modifications of PLS approach for process monitoring. IFAC Proc Vol 2011; 44(1): 12389–12394.

18. Wang, Yin. Quality-related fault detection approach based on orthogonal signal correction and modified PLS. IEEE Trans Industr Inform 2015; 11(2): 398–405.

19. Yang. Advanced prognosis and health management of aircraft and spacecraft subsystems. PhD Thesis, Massachusetts Institute of Technology, USA, 2000.

20. Wen. Fault diagnosis based on information incremental matrix. Acta Autom Sin 2012; 38(5): 1–8.

21. Murayama, Katada, Hayakawa, et al. Shortened mean transit time in CT perfusion with singular value decomposition analysis in acute cerebral infarction: quantitative evaluation and comparison with various VT perfusion parameters. J Comput Assist Tomogr 2016; 41(2): 173–174.

22. Asadi, Karami. Modeling of the city evacuation plan in case of earthquake with particle swarm optimization (PSO) and imperialist competition algorithm (ICA). Int J Disaster Resil Built Environ 2019; 11(1): 134–151.

23. Mang, Sun, et al. Method for designing the optimal trajectory for drilling a horizontal well, based on particle swarm optimization (PSO) and analytic hierarchy process (AHP). Chem Technol Fuels Oils 2019; 55(1): 105–115.

24. Zhou, Ren, Wang. Quality-relevant fault monitoring based on locally linear embedding orthogonal projection to latent structure. Ind Eng Chem Res 2019; 58(3): 1262–1272.

25. Tsamardinos, Greasidou, Borboudakis. Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation. Mach Learn 2018; 107(12): 1895–1922.

26. Zou, Xia. Fault diagnosis of Tennessee-Eastman process using orthogonal incremental extreme learning machine based on driving amount. IEEE Trans Cybern 2018; 48(12): 3403–3410.

27. Qin, Zheng. Quality-relevant and process-relevant fault monitoring with concurrent projection to latent structures. AIChE J 2013; 59(2): 496–504.

Study on advanced partial least squares for quality-related fault detection
