Multi-model quality prediction approach using fuzzy C-means clustering and support vector regression

Abstract

Quality prediction for complex production processes has increasingly attracted the interest of manufacturers and researchers. Such processes are characterized by mutually coupled sub-processes, nonlinear data, and multiple inputs and outputs, which makes effective quality prediction difficult. To address this problem, a multi-model modeling approach based on fuzzy C-means clustering and support vector regression is proposed in this article. First, the operation conditions are classified using the fuzzy C-means clustering algorithm. Local quality prediction models are then established for the individual operation conditions using support vector regression, and the model weights are obtained using adaptive mutation particle swarm optimization, which yields a weighted multi-model for quality prediction of the complex production process. The method addresses nonlinearity, a wide range of operating conditions, and the resulting prediction difficulty. A case study of the Tennessee Eastman process shows that the proposed model is feasible and efficient.

Keywords

Multi-model, fuzzy C-means clustering, support vector regression, quality prediction, adaptive mutation particle swarm optimization

Introduction

With growing requirements for high product quality in modern complex production processes, quality prediction has become increasingly important. Data-driven empirical prediction models, such as statistical regression and time series analysis, are built on the relationship between the independent and dependent quality variables. However, as production processes grow more complex, it becomes difficult to describe changes in quality characteristics with a precise mathematical model. With the development of artificial intelligence techniques in recent years, intelligent forecasting methods such as artificial neural networks, fuzzy estimation, and support vector machines (SVMs) have been widely used in process prediction.^1–5

A complex production process is highly complex, uncertain, multi-level, and networked, so effective quality prediction is difficult to achieve with a single intelligent control method. The multi-model strategy provides an effective way to solve the quality prediction problem of such processes. A multi-model modeling algorithm, which builds several models of a complex system according to certain rules, has higher accuracy than a single model. Since Bates and Granger⁶ proposed combining multiple models to simulate complex production processes, research on multi-model prediction methods has developed considerably, including adaptive methods, Gaussian mixture methods, multi-model methods, and so on.^7–10

Multiple model methods can handle problems related to rapid changes in operating mode. Maestri et al.¹¹ proposed a robust clustering approach that assumes all data of each operating mode share the same covariance matrix; this assumption, however, limits its applicability in practical situations. Yong et al.¹² proposed a multiple model recursive monitoring method, in which a computational intelligence-based clustering algorithm separates the different operating modes, recursive kernel principal component analysis reduces the dimension, support vector data description builds the models, and the corresponding statistics are constructed to detect process faults. Su et al.¹³ adopted a robust adaptive evidence-theoretic k-NN classification method to establish the local models and to define the membership of the current pattern in each local model; the multi-model prediction is then mainly a weighted sum of the multiple local models. Sun and Yuan¹⁴ used a weighting method to obtain the weights of the local models and then developed a global model to predict the quality of a complex process. These efforts form the foundation of the work herein, although prediction approaches for complex production processes are still under development. These methods deal well with nonlinear data; however, a model of a complex production process is not easy to obtain when sufficient data are not available.

This article aims to develop a multi-model method for quality prediction of complex production processes. The model consists of three main modules: a classifier module, a local prediction model module, and a weight value optimization module. In the classifier module, the operation conditions are classified using the fuzzy C-means (FCM) clustering algorithm, which generates multiple sub-group datasets that represent different operation conditions. In the local prediction model module, instead of constructing a single global prediction model, a series of local quality prediction models is established using support vector regression (SVR). In the weight value optimization module, adaptive mutation particle swarm optimization (AMPSO) is used to obtain the weights of the local models, which combine the forecast results of the different operation conditions into the global quality prediction model. Compared with traditional prediction methods, this method not only reduces the computational complexity but also improves the prediction performance.

This article is organized as follows: section “Quality prediction multi-model approach” describes the concepts needed, including the FCM clustering method, the principle of SVR, and AMPSO algorithm. Section “Multi-modeling algorithm for prediction control” introduces the quality prediction multi-model. Section “Application experiment” describes the simulation experiment design of the Tennessee Eastman (TE) benchmark case study, including the overview of TE process, experimental results, and analysis. Finally, in section “Conclusion,” conclusions and some discussions of this article are made.

Quality prediction multi-model approach

The multi-model approach establishes multiple models of a complex production process, one for each working condition, and thereby obtains higher prediction precision than a single model. In this article, the clustering algorithm provides the sample classification rule on which the models are built. The multi-model modeling algorithm uses the FCM clustering method to classify the training data and obtain the training samples of each sub-category, and then uses SVR to establish a model for each sub-class.

FCM clustering algorithm

Fuzzy clustering, which is based on fuzzy theory, is a classification method that divides data into groups according to their characteristics and degree of uncertainty. FCM is the most widely used of the fuzzy clustering algorithms based on an objective function. It provides a common clustering criterion: the objective function is optimized iteratively to obtain the classification of the dataset. The algorithm has good convergence and is widely used in pattern recognition, image recognition, and general classification problems that involve multiple classes.^15–17 In this article, the FCM clustering method is adopted to divide the operating conditions of the complex production process.

Suppose the process data form a set of n samples $X = \{x_{1}, x_{2}, \dots, x_{n}\}$ , where each sample x_k is a vector of m feature values: $x_{k} = (x_{k1}, x_{k2}, \dots, x_{km})$ , $k = 1, 2, \dots, n$ . The sample set can be divided into s classes according to the m features, 2 ≤ s ≤ n. The relative membership degree matrix of the sample set is $U = \{u_{ik}\}$ , where $u_{ik} = u_{\tilde{A}_{i}} (x_{k})$ $(i = 1, 2, \dots, s; k = 1, 2, \dots, n)$ is the degree to which $x_{k}$ belongs to $\tilde{A}_{i}$ , the $\tilde{A}_{i}$ are the s fuzzy subsets of X, and $u_{ik}$ satisfies the constraints

\begin{cases} u_{ik} \in [0, 1], & \forall i = 1, 2, \dots, s;\ k = 1, 2, \dots, n \\ 0 < \sum_{k=1}^{n} u_{ik} < n, & \forall i = 1, 2, \dots, s \\ \sum_{i=1}^{s} u_{ik} = 1, & \forall k = 1, 2, \dots, n \end{cases}

(1)

Assume that there are s clustering centers constituting a cluster center matrix $V = (v_{1}, v_{2}, \dots, v_{s})$ . At the same time, the distance of the sample $x_{k}$ and clustering center $v_{i}$ (Euclidean distance) is defined as

d_{ik} = \| x_{k} - v_{i} \| = \left( \sum_{j=1}^{m} (x_{kj} - v_{ij})^{2} \right)^{\frac{1}{2}}

(2)

The FCM objective function is

J_{m} = \sum_{k = 1}^{n} \sum_{i = 1}^{s} u_{i k}^{p} \cdot d_{i k}^{2}

(3)

where $p \in [1, \infty)$ is the fuzzy weighting exponent. FCM clustering is carried out through iterative optimization of the objective function above, with the cluster centers $v_{i}$ updated by

v_{i} = \frac{\sum_{k=1}^{n} u_{ik}^{p} \cdot x_{k}}{\sum_{k=1}^{n} u_{ik}^{p}}, \quad i = 1, 2, \dots, s

(4)

The parameters of the FCM clustering method therefore mainly comprise the number of clusters, the exponent for the partition matrix, the maximum number of iterations, the minimum amount of improvement, and the information display during iteration; suitable values can be set for the model at hand. The clustering result is used to divide the complex production process into operation conditions, for each of which a local model can be established.
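The iteration described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this article: the function name `fcm`, the random initialization, and the membership update (the standard FCM rule, which is not written out in the text) are all assumptions, and `p`, `max_iter`, and `tol` play the roles of the partition exponent, the maximum number of iterations, and the minimum amount of improvement.

```python
import numpy as np

def fcm(X, s, p=2.0, max_iter=100, tol=1e-5, seed=0):
    """Fuzzy C-means on X (n samples x m features) with s clusters.

    Returns the cluster centers V (s x m) and the membership matrix U (s x n).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; every column sums to 1 (third constraint of eq. (1)).
    U = rng.random((s, n))
    U /= U.sum(axis=0, keepdims=True)
    prev_J = np.inf
    for _ in range(max_iter):
        Up = U ** p
        # Cluster centers as membership-weighted means (weights u_ik^p).
        V = (Up @ X) / Up.sum(axis=1, keepdims=True)
        # Euclidean distance d_ik between center i and sample k, eq. (2).
        D = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
        D = np.fmax(D, 1e-12)  # avoid division by zero when a sample sits on a center
        # Objective function, eq. (3).
        J = float(np.sum(Up * D ** 2))
        # Standard membership update (assumed; requires p > 1).
        inv = D ** (-2.0 / (p - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)
        if abs(prev_J - J) < tol:
            break
        prev_J = J
    return V, U
```

Calling `V, U = fcm(X, s=4)` on a two-column array of control variables would return four centers and a 4 x n membership matrix; each sample can then be assigned to the condition with the largest membership via `U.argmax(axis=0)`.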

SVR

SVR is derived from the SVM, which was developed by Vapnik and co-workers at AT&T Bell Laboratories. Based on Vapnik–Chervonenkis (VC) dimension theory and the structural risk minimization principle of statistical learning theory, the SVM seeks the best trade-off between model complexity and learning ability given limited sample information. An SVM performs classification tasks by constructing an optimal separating hyperplane (OSH), which maximizes the margin between the nearest data points of two separate classes.

Suppose that the training set is (x_i, y_i), $i = 1, 2, \dots, n$ , x_i ∈ R^m, y_i ∈ {−1, 1}, and that the data can be separated by the hyperplane w · x + b = 0, where n is the number of sample observations, m is the dimension of each observation, w is the weight vector, and b is the bias. This hyperplane separates the two classes with maximal margin width 2/||w||, and the points lying on the margin boundaries are called support vectors. Because input data in real-world problems often contain a high level of noise, a soft-margin SVM is obtained by introducing non-negative slack variables (ξ_i, $i = 1, 2, \dots, n$ ), which leads to the constrained optimization problem in equation (5)

\begin{array}{l} \min_{w, b, ξ} \frac{1}{2} \| w \|^{2} + C \sum_{i=1}^{n} ξ_{i} \\ \text{s.t.}\ y_{i} ((w \cdot x_{i}) + b) \geq 1 - ξ_{i}, \quad i = 1, \dots, n \\ ξ_{i} \geq 0, \quad i = 1, \dots, n \end{array}

(5)

In equation (5), C is the penalty factor, and it determines the degree of the penalty assigned to an error. It can be viewed as a tuning parameter, which can be used to control the trade-off between maximizing the margin and the classification error.

SVM was originally designed for classification problems; however, many studies show that the algorithm also performs well in regression problems. The basic principle of SVR is to infer the output value y corresponding to a new input sample x from a model built on the training samples. In this article, the local models are established using SVR.^18–20 Assume the training set $T = \{(x_{i}, y_{i})\}, i = 1, 2, \dots, n,$ where x_i ∈ R^m is the input vector and y_i is the corresponding output, and the regression function $y = f (x) = w^{T} \cdot x + b$ is used to infer the y value corresponding to any input x. Given a precision ε, if the error between the true value y_i and the predicted value f(x_i) is not greater than ε, f(x) is considered to fit the training point correctly. To widen the scope of application, slack variables ξ_i and $ξ_{i}^{*}$ and a penalty parameter C are introduced, and the optimization problem is defined as

\begin{array}{l} \min_{w, b, ξ, ξ^{*}} \frac{1}{2} \| w \|^{2} + C \sum_{i=1}^{n} (ξ_{i} + ξ_{i}^{*}) \\ \text{s.t.}\ - (ε + ξ_{i}^{*}) \leq f (x_{i}) - y_{i} \leq ε + ξ_{i}, \quad i = 1, \dots, n \\ ξ_{i}, ξ_{i}^{*} \geq 0, \quad i = 1, \dots, n \\ C > 0 \end{array}

(6)

The first term of the objective function makes the function flatter and improves the generalization ability; the second term reduces the error; and the penalty parameter C balances the two.

A nonlinear regression problem can be transformed into a linear one by choosing a suitable kernel function K(x_i, x_j). The corresponding dual optimization problem is defined as

\begin{array}{l} \min_{α, α^{*}} \frac{1}{2} \sum_{i, j = 1}^{n} (α_{i}^{*} - α_{i}) (α_{j}^{*} - α_{j}) K (x_{i}, x_{j}) \\ + ε \sum_{i=1}^{n} (α_{i}^{*} + α_{i}) - \sum_{i=1}^{n} y_{i} (α_{i}^{*} - α_{i}) \end{array}

(7)
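For reference, the dual problem (7) is usually stated together with the constraints below, and the regression function is then recovered from the optimal multipliers. This is the standard ε-SVR result written in the sign convention of equations (6) and (7); it is added here as a reading aid and is not a result specific to this article.

```latex
\begin{aligned}
\text{s.t.}\quad & \sum_{i=1}^{n} (α_{i}^{*} - α_{i}) = 0, \qquad 0 \leq α_{i}, α_{i}^{*} \leq C, \quad i = 1, \dots, n \\
f(x) &= \sum_{i=1}^{n} (α_{i}^{*} - α_{i}) K(x_{i}, x) + b
\end{aligned}
```

Only the training samples with a nonzero coefficient $α_{i}^{*} - α_{i}$ contribute to f(x); these are the support vectors of the regression model.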

The SVR model uses a kernel function to handle nonlinear problems: the input vectors are mapped into a high-dimensional feature space in which the nonlinear problem becomes linear. The critical issue is the choice of the kernel function K(x_i, x_j). There are several types of kernel function:

Linear function, which is mainly used for the linearly separable case and is not suitable for the nonlinear case

K (x_{i}, x_{j}) = x_{i} \cdot x_{j}

Polynomial function

K (x_{i}, x_{j}) = {[(x_{i} \cdot x_{j}) + 1]}^{d}, d = 1, 2, …, n

Radial basis function (RBF)

K (x_{i}, x_{j}) = \exp (- \frac{| x_{i} - x_{j} |^{2}}{σ^{2}}) = \exp (- γ | x_{i} - x_{j} |^{2})

Sigmoid function

K (x_{i}, x_{j}) = \tanh (v (x_{i} \cdot x_{j}) + α)

Of these kernel functions, the linear function is not suitable for nonlinear cases, whereas the other three can handle them. The RBF kernel has good generalization ability and has been reported in the literature to be more stable, so it is adopted to establish the SVR models.
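The four kernel functions listed above can be written directly in Python with NumPy, as in the short sketch below. The function names and default parameter values are illustrative assumptions; `gamma=0.002` is the value reported in the parameter settings of this article, while `d`, `v`, and `alpha` are arbitrary defaults.

```python
import numpy as np

def linear_kernel(xi, xj):
    """K(xi, xj) = xi . xj"""
    return float(np.dot(xi, xj))

def poly_kernel(xi, xj, d=2):
    """K(xi, xj) = ((xi . xj) + 1)^d"""
    return float((np.dot(xi, xj) + 1.0) ** d)

def rbf_kernel(xi, xj, gamma=0.002):
    """K(xi, xj) = exp(-gamma * |xi - xj|^2), with gamma = 1 / sigma^2"""
    diff = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

def sigmoid_kernel(xi, xj, v=1.0, alpha=0.0):
    """K(xi, xj) = tanh(v * (xi . xj) + alpha)"""
    return float(np.tanh(v * np.dot(xi, xj) + alpha))
```

The RBF kernel equals 1 when both arguments coincide and decays toward 0 as the two samples move apart, with `gamma` controlling how quickly the similarity falls off.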

AMPSO

Particle swarm optimization (PSO), which originates from the study of the foraging behavior of birds, was proposed by Kennedy and Eberhart²¹ in 1995. PSO algorithms use the flying behavior of a swarm of particles to imitate the search for optimal solutions. Each particle represents a feasible solution and is associated with a position vector and a velocity vector. At each iteration, a particle moves toward an optimum according to its current velocity; the best solution found so far by each individual particle is recorded, and the global best solution is taken over all particles.

In PSO, each individual is considered a volume-less particle, and the set of positions of n particles in the m-dimensional search space is denoted $X = \{X_{1}, X_{2}, \dots, X_{i}, \dots, X_{n}\}$ ; the position and velocity of the ith particle are represented as $X_{i} = (x_{i1}, x_{i2}, \dots, x_{im})$ and $V_{i} = (v_{i1}, v_{i2}, \dots, v_{im})$ , respectively. Every particle flies toward a better position in the search space according to its own flying experience and that of the group. Let P_i and P_g be the best position of particle i and the global best position, respectively. The modified velocity and position of each particle can be calculated using the current velocity and the distances from P_i and P_g as follows

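\tilde{V}_{i}^{k+1} = V_{i}^{k} + c_{1} r_{1} (P_{i} - X_{i}^{k}) + c_{2} r_{2} (P_{g} - X_{i}^{k})

(8)

where $\tilde{V}_{i}^{k+1} = (\tilde{v}_{1}^{k+1}, \tilde{v}_{2}^{k+1}, \dots, \tilde{v}_{m}^{k+1})$ is the tentative velocity before the velocity limit is applied, c₁ and c₂ are positive constants, and r₁ and r₂ are random numbers in the range [0, 1].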

Then calculate each component of $V_{i}^{k + 1} = (v_{1}^{k + 1}, v_{2}^{k + 1}, \dots, v_{m}^{k + 1})$ by

v_{j}^{k+1} = \begin{cases} \operatorname{sgn} (\tilde{v}_{j}^{k+1})\, v_{j}^{max}, & \text{if } | \tilde{v}_{j}^{k+1} | > v_{j}^{max} \\ \tilde{v}_{j}^{k+1}, & \text{otherwise} \end{cases}

(9)

for $j = 1, 2, \dots, m$ , where the notation sgn indicates the usual signum function. Then, the ith particle position is updated with

X_{i}^{k + 1} = X_{i}^{k} + V_{i}^{k + 1}

(10)
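Equations (8)–(10) amount to one vectorized update per iteration. The sketch below is an illustration under assumptions rather than the code used in this article: the function name `pso_step` is hypothetical, r₁ and r₂ are drawn independently for every particle and dimension, and a single scalar `v_max` is used for all dimensions.

```python
import numpy as np

def pso_step(X, V, P, Pg, c1=2.0, c2=2.0, v_max=0.8, rng=None):
    """One PSO iteration for n particles in m dimensions.

    X, V, P are (n, m) arrays of positions, velocities and personal bests;
    Pg is the (m,) global best position. Returns the new (X, V).
    """
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(X.shape)
    r2 = rng.random(X.shape)
    # Eq. (8): tentative velocity from inertia, personal best and global best.
    V_new = V + c1 * r1 * (P - X) + c2 * r2 * (Pg - X)
    # Eq. (9): limit every component to [-v_max, v_max].
    V_new = np.clip(V_new, -v_max, v_max)
    # Eq. (10): move each particle by its new velocity.
    return X + V_new, V_new
```

`np.clip` reproduces the case distinction of equation (9): components whose magnitude exceeds `v_max` are replaced by `v_max` with the original sign, and all other components pass through unchanged.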

PSO is conceptually simple, has few control parameters, and combines evolutionary computation with swarm intelligence; it can search for the global optimal solution in a complex space through collaboration and competition between individuals. However, basic PSO is easily trapped in a local extremum: the other particles quickly move to this local position during the optimization, which makes it difficult to find the global best solution. AMPSO is used to cope with this “premature” convergence problem. It helps particles escape from local optima and search for the best solution in other regions of the space.

As can be seen from formula (10), the next position of a particle is determined by both its current position and its new velocity. The new velocity is determined by the immediately previous velocity, the individual best P_i, and the group best P_g, as shown in formula (8). If the algorithm has converged prematurely, the group best P_g is a local optimal solution. If P_g is changed, the search direction of the particles is redirected. The main idea of AMPSO is therefore to mutate P_g in the hope that the search leaves the local optimum and explores new individual and group optima.

The mutation of the PSO is designed as a random operator with a certain probability $ρ$ . Specifically, for a uniformly distributed random number $rand \in (0, 1)$ , a mutated new group optimum is obtained as follows

P_{g} = \begin{cases} (P_{g}^{max} - P_{g}^{min}) \cdot rand + P_{g}^{min}, & \text{if } rand > ρ \\ P_{g}, & \text{otherwise} \end{cases}

(11)
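The mutation operator of equation (11) can be sketched as follows. The function name and arguments are hypothetical, and two details are assumptions because the text does not specify them: the bounds `lower` and `upper` stand for $P_{g}^{min}$ and $P_{g}^{max}$, and a fresh random number is drawn for every dimension of the mutated position, separately from the number used in the test against ρ.

```python
import numpy as np

def mutate_global_best(Pg, lower, upper, rho=0.5, rng=None):
    """Return a possibly mutated copy of the global best position Pg."""
    rng = np.random.default_rng() if rng is None else rng
    r = rng.random()  # uniform random number used in the condition of eq. (11)
    if r > rho:
        # Re-sample the global best uniformly inside [lower, upper].
        return lower + (upper - lower) * rng.random(np.shape(Pg))
    # Otherwise the global best is kept unchanged.
    return np.array(Pg, dtype=float)
```

With `rho=0.5`, the value used in the experiments, the global best is re-sampled in roughly half of the calls; after a mutation the personal and global bests would be re-evaluated with the fitness function before the next velocity update.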

Multi-modeling algorithm for prediction control

Algorithm principle

The structure diagram of multi-model prediction algorithm is shown in Figure 1.

Figure 1.

Multi-model structure.

The FCM clustering algorithm is used to classify the different operation conditions of the complex production process based on the minimum Euclidean distance. Nonlinear SVR with the RBF kernel function (SVR-RBF) is then used to establish a local quality prediction model for each condition. Finally, AMPSO is adopted to obtain the weights of the local models, from which the global model for predicting the quality of the complex process is developed.

A multi-model with a weighted strategy combines the local models in a linear or nonlinear weighted way to obtain a global approximate model of the complex process. It aims to make the prediction results coincide with the actual process and thereby improve the prediction accuracy of process quality. In this article, a weighted strategy is used to combine the local models into the global model, which can be described as follows

\hat{y} (t) = w_{1} x_{1} (t) + w_{2} x_{2} (t) + \dots + w_{s} x_{s} (t)

(12)

where w_i (i = 1, …, s) are the weights of the local models, which satisfy w₁+w₂+···+w_s = 1 and w_i ≥ 0, i = 1, …, s; x_i(t) (i = 1, …, s) is the output of the ith local model; and t is the time index.

In the multi-model problem with s weighting factors, each particle consists of s parameters, and the optimal weights are solved using the particle position, the particle velocity, and the fitness function. The weighted multi-model problem is to find an optimal set of weighting coefficients w_i (i = 1, …, s) that minimizes the error between the multi-model output and the actual output; the fitness function can be described using the mean squared error (MSE)

fitness = \frac{1}{n} \sum_{i=1}^{n} {(y_{i} (t) - \hat{y}_{i} (t))}^{2}

(13)

where $\hat{y}_{i} (t)$ is the multi-model output and $y_{i} (t)$ is the actual output for the ith sample, and n is the number of samples over which the error is evaluated.
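The weighted combination of equation (12) and an MSE-based fitness can be sketched as below. This is an illustration under assumptions: the names `combine` and `fitness` are hypothetical, the fitness is taken as the mean squared error over the available samples, and the constraints w_i ≥ 0 and Σw_i = 1 are enforced by clipping and normalizing the particle position, which is only one of several possible ways to handle them (at least one weight must be positive).

```python
import numpy as np

def combine(weights, local_preds):
    """Eq. (12): weighted sum of the local model outputs.

    weights has shape (s,); local_preds has shape (s, n), one row per local model.
    """
    w = np.asarray(weights, dtype=float)
    return w @ np.asarray(local_preds, dtype=float)

def fitness(weights, local_preds, y):
    """Mean squared error between the combined output and the actual output."""
    # Assumed constraint handling: clip to non-negative values, then normalize to sum 1.
    w = np.clip(np.asarray(weights, dtype=float), 0.0, None)
    w = w / w.sum()
    err = np.asarray(y, dtype=float) - combine(w, local_preds)
    return float(np.mean(err ** 2))
```

A particle position is simply a candidate weight vector of length s, so `fitness` can be passed unchanged to the swarm as the function to minimize.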

Algorithm steps

The flowchart of the multi-model algorithm is shown in Figure 2, and the detailed implementation steps are as follows:

Step 1. Obtain the data of complex production process;

Step 2. Define the process input data and output data, normalize the data, and choose the sample training set S and testing set T;

Step 3. Initialize the system parameters, FCM parameters, SVR parameters, and AMPSO parameters;

Step 4. Divide the training set S into s conditions using FCM clustering and obtain the working section of local models;

Step 5. Establish the local models using SVR and search for a suitable SVR parameter by grid search and cross-validation;

Step 6. Obtain the weight values of local models through AMPSO. The MSE is used as the fitness function;

Step 7. Obtain the multi-model from the weight values. Then use the testing set T to check the model accuracy and calculate the relative error (RE) and MSE of the testing samples

RE = \frac{| \hat{y} (t) - y (t) |}{y (t)}

(14)

MSE = \sum_{i = 1}^{n} \frac{{({\hat{y}}_{i} (t) - y_{i} (t))}^{2}}{n}

(15)
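The two accuracy measures of equations (14) and (15) can be computed as in the sketch below. The function names are hypothetical, and reporting RE as the mean over all test samples in percent is an assumption, since equation (14) is stated per sample; the actual values y(t) are assumed to be positive, as is the case for composition measurements.

```python
import numpy as np

def relative_error_percent(y_hat, y):
    """Mean of |y_hat - y| / y over all samples, in percent (eq. (14) per sample)."""
    y_hat = np.asarray(y_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(100.0 * np.mean(np.abs(y_hat - y) / y))

def mse(y_hat, y):
    """Mean squared error over all samples, eq. (15)."""
    y_hat = np.asarray(y_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.mean((y_hat - y) ** 2))
```

For example, predictions of 51 and 49 against actual values of 50 and 50 give a relative error of 2% and an MSE of 1.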

Figure 2.

The flowchart of multi-model.

Application experiment

In this section, a case study of the TE chemical process is presented to analyze the performance of the proposed multi-model prediction method. First, the operation conditions are classified using the FCM clustering algorithm; next, the local quality prediction models of the individual operation conditions are established using SVR; then, the local model weights are obtained using AMPSO, and the quality of the complex production process is predicted by combining the local models. Finally, the results of the proposed method are compared with those of the local models, back propagation (BP) neural networks, and the SVR prediction method. All methods are coded in MATLAB (2009a) and use the LIBSVM toolbox from Chang and Lin.²²

TE chemical process

The TE chemical process was proposed by J.J. Downs and E.F. Vogel as a process control case based on a practical industrial process of the Eastman company, as shown in Figure 3. The TE process is a complex system with multiple inputs, multiple outputs, and multiple coupling relations between variables. McAvoy and Ye²³ achieved a certain control effect on the TE process using a cascade control method with the G/H composition of the two final products as the process control targets, but the control method is unstable. The simulation in this article is based on the TE process in fault 1 mode at the 50/50 G/H product ratio; fault 1 is a shift fault in which the A/C feed ratio changes while the B composition remains constant.

Figure 3.

Tennessee Eastman process.

TE process is a complex nonlinear process, and there are five major unit operations in the process: a reactor, a condenser, a recycle compressor, a separator, and a stripper. It contains eight components: A, B, C, D, E, F, G, and H. The four reactants A, C, D, and E and the inert B are fed to the reactor where the products G and H are formed, and a byproduct F is also produced. The process has 22 continuous process measurements, 12 manipulated variables, and 19 composition measurements sampled less frequently. Details on the process description are well explained in Chiang et al.²⁴

Parameters selected

In this article, the 22 continuous process measurements and the 8 purge gas composition measurements (components A, B, C, D, E, F, G, and H) are chosen as the prediction model inputs, the final products G and H as the model outputs,²⁵ and A feed and A+C feed as the control variables used to classify the operation condition. The data of the fault 1 mode in the TE simulation model are used for process quality prediction. Table 1 lists the 22 continuous process measurements, and Table 2 lists the 8 discrete variables.

Table 1.

Continuous process measurements of the TE process for multi-model.

Variable number	Variable name	Unit
1	A feed	kscmh
2	D feed	kg/h
3	E feed	kg/h
4	A+C feed	kscmh
5	Recycle flow	kscmh
6	Reactor feed	kscmh
7	Reactor pressure	kPa
8	Reactor level	%
9	Reactor temperature	°C
10	Purge rate	kscmh
11	Separator temperature	°C
12	Separator level	%
13	Separator pressure	kPa
14	Separator underflow	m³/h
15	Stripper level	%
16	Stripper pressure	kPa
17	Stripper underflow	m³/h
18	Stripper temperature	°C
19	Steam flow	kg/h
20	Compressor work	kW
21	Reaction/cool temperature	°C
22	Condition cool temperature	°C

TE: Tennessee Eastman.

Table 2.

Discrete quality variables of the TE process for multi-model.

Variable number	Variable name	Unit
1	% A in purge gas	mol %
2	% B in purge gas	mol %
3	% C in purge gas	mol %
4	% D in purge gas	mol %
5	% E in purge gas	mol %
6	% F in purge gas	mol %
7	% G in purge gas	mol %
8	% H in purge gas	mol %

TE: Tennessee Eastman.

The dataset contains 160 samples collected over a 48 h run with a sampling interval of 18 min in the fault 1 mode of the TE process. The proportions of A feed and A+C feed, which are the main factors of fault mode 1, are the inputs to the clustering algorithm. The proposed multi-model quality prediction method uses the 160 samples, divided into 4 classes, in fault 1 mode, with the final products G and H as the prediction targets and the 30 process variables as the input parameters.

In the FCM clustering method, the number of clusters is set to 4, the exponent for the partition matrix to 2.0, the maximum number of iterations to 100, and the minimum amount of improvement to 1e−5. The LIBSVM toolbox is used to establish the local SVR models: epsilon-SVR is chosen as the SVM type, with 0.4 as the epsilon in its loss function, RBF as the kernel function, 10,000 as the parameter C, and 0.002 as gamma in the kernel function; the other parameters are left at their defaults. In PSO, the maximum number of iterations is 500, the swarm size is 200, the maximum velocity is 0.8, the constants c₁ and c₂ are both 2, and the adaptive mutation probability is 50%.
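For readers who want to reproduce these settings, they can be collected in one place as in the sketch below. The dictionary layout and the helper `libsvm_options` are hypothetical; the flags themselves are the documented LIBSVM training options (`-s 3` for epsilon-SVR, `-t 2` for the RBF kernel, `-c` for the cost C, `-g` for gamma, and `-p` for the epsilon of the loss function).

```python
# Parameter values as reported in the text; the key names are illustrative.
SETTINGS = {
    "fcm": {"clusters": 4, "exponent": 2.0, "max_iter": 100, "min_improvement": 1e-5},
    "svr": {"svm_type": 3, "kernel_type": 2, "cost": 10000, "gamma": 0.002, "epsilon": 0.4},
    "pso": {"max_iter": 500, "swarm_size": 200, "v_max": 0.8, "c1": 2, "c2": 2,
            "mutation_probability": 0.5},
}

def libsvm_options(svr):
    """Build the LIBSVM option string for the SVR settings."""
    return "-s {svm_type} -t {kernel_type} -c {cost} -g {gamma} -p {epsilon}".format(**svr)

options = libsvm_options(SETTINGS["svr"])
```

The resulting string, `-s 3 -t 2 -c 10000 -g 0.002 -p 0.4`, is the form in which options are passed to LIBSVM's training routine, for example as the third argument of `svmtrain` in its MATLAB interface.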

Experiment results

Performance of FCM clustering algorithm

The computational results of the FCM clustering algorithm are shown in Figure 4 and Table 3. Table 3 gives the four cluster centers in terms of A feed and A+C feed, together with the number of samples in each cluster. The FCM clustering method effectively distinguishes the various operating conditions of the TE process in fault 1 mode, and the process is divided into four operation conditions.

Figure 4.

The result of fuzzy C-means clustering algorithm.

Table 3.

The clustering centers of fuzzy C-means (FCM).

The elements of FCM	First clustering center (number)	Second clustering center (number)	Third clustering center (number)	Fourth clustering center (number)
A feed	0.927 (11)	0.602 (11)	0.774 (61)	0.755 (77)
A+C feed	8.307 (11)	9.205 (11)	8.702 (61)	8.871 (77)

A small number of edge samples are separated out: the first and second classes contain only 11 points each. The main operating conditions, represented by the third and fourth classes, are the ones mainly relied on to predict the quality of the complex process.
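As an illustration of how a new operating point could be related to these four conditions, the sketch below evaluates the standard FCM membership formula against the centers of Table 3. Several points are assumptions: the membership formula is the usual FCM rule and is not written out in this article, the centers are used exactly as tabulated without any rescaling, and the query point passed to `memberships` is made up for the example.

```python
import numpy as np

# Cluster centers from Table 3, as (A feed, A+C feed) pairs.
CENTERS = np.array([
    [0.927, 8.307],
    [0.602, 9.205],
    [0.774, 8.702],
    [0.755, 8.871],
])

def memberships(x, centers=CENTERS, p=2.0):
    """Degree of membership of point x in each cluster (values sum to 1)."""
    d = np.linalg.norm(centers - np.asarray(x, dtype=float), axis=1)
    d = np.fmax(d, 1e-12)  # avoid division by zero when x coincides with a center
    inv = d ** (-2.0 / (p - 1.0))
    return inv / inv.sum()

u = memberships([0.76, 8.85])   # hypothetical operating point
condition = int(np.argmax(u))   # zero-based index of the closest operating condition
```

For this hypothetical point the largest membership falls on the fourth center, so the point would be handled mainly by the fourth local model.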

Performance of predicting the quality in different models

First, we evaluate the performance of the proposed multi-model quality prediction, in which an optimization method chooses the weights of the local SVR models. In the experiments, the four clustering results are used as the local model data, and SVR is used as the prediction method. The performance obtained with the best parameters is compared across methods in Table 4. The best multi-model weight values are then used to maximize the prediction accuracy.

Table 4.

The simulation results of G ratio.

G output ratio	Multi-model	Local model (first)	Local model (second)	Local model (third)	Local model (fourth)	BP	SVR
MSE	0.2354	0.3602	0.4242	0.3376	0.4172	0.5696	0.2937
RE (%)	0.72	0.89	0.97	0.84	0.93	1.07	0.80
Running time (s)	1.48	0.07	0.08	0.07	0.07	0.37	0.02

BP: back propagation; SVR: support vector regression; MSE: mean squared error; RE: relative error.

To investigate the effectiveness of the proposed multi-model method, three other methods are used for comparison: the first (local model) simply uses the four local models trained on the data classified by the FCM algorithm; the second (BP) uses the BP neural network algorithm as the prediction method; and the third (SVR) uses the SVR algorithm on the original data. The RE and MSE results are shown in Tables 4 and 5. Figure 5 shows the prediction results for composition G and composition H.

Table 5.

The simulation results of H ratio.

H output ratio	Multi-model	Local model (first)	Local model (second)	Local model (third)	Local model (fourth)	BP	SVR
MSE	0.2616	0.5622	0.5801	0.3544	0.3181	0.5127	0.4366
RE (%)	0.96	1.33	1.38	1.05	1.04	1.23	1.22
Running time (s)	1.47	0.07	0.08	0.07	0.08	0.36	0.02

BP: back propagation; SVR: support vector regression; MSE: mean squared error; RE: relative error.

Figure 5.

Prediction results of the proposed method: (a) composition G and (b) composition H.

The simulation outputs of the proposed method match the actual G and H ratios of the TE process well: the RE of compositions G and H is below 1%, and the MSE is also low. Compared with the other methods, the RE and MSE are clearly improved, and the prediction of the entire system is better. Although the method cannot predict exact data values, it can capture the trend of the complex production process. It helps operators understand the quality of the products and take effective measures to adjust the production process in unsuitable production states, which safeguards production quality.

In total, four local models are built with SVR, each trained on one class of the data produced by FCM clustering. Figure 6 and Tables 4 and 5 show that the local models of the first and second classes generally perform worse for the G and H output ratios than those of the third and fourth classes, which agrees with the FCM clustering result. For composition G, however, the first local model performs better than the fourth; this is attributed to the instability of the AMPSO method, which influences the local model results, but it does not affect the superiority of the proposed method. The third and fourth local models perform much better because they correspond to the main operating conditions, and their predictions are much closer to those of the global model.

Figure 6.

Prediction results of four local models: (a) composition G of the first local model, (b) composition H of the first local model, (c) composition G of the second local model, (d) composition H of the second local model, (e) composition G of the third local model, (f) composition H of the third local model, (g) composition G of the fourth local model, and (h) composition H of the fourth local model.

The BP neural network algorithm and the SVR algorithm are used for the comparative study. Tables 4 and 5 show that the prediction accuracy of the multi-model is much better than that of these two methods. At the same time, the RE and MSE of SVR are better than those of the BP neural network, especially for the G output ratio. The reason is that the BP neural network depends on the quantity and quality of the sample data, and because only 40 training samples are considered in this study, the task is a small-sample, noisy problem. Figures 7 and 8 show the prediction results of BP and SVR, respectively.
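The size of the improvement can be read off the MSE rows of Tables 4 and 5 with a short calculation. The sketch below only re-expresses the tabulated values as relative reductions; the helper name `reduction` is hypothetical.

```python
# MSE values taken from Tables 4 and 5.
MSE = {
    "G": {"multi": 0.2354, "BP": 0.5696, "SVR": 0.2937},
    "H": {"multi": 0.2616, "BP": 0.5127, "SVR": 0.4366},
}

def reduction(baseline, proposed):
    """Relative MSE reduction of the proposed method, in percent."""
    return 100.0 * (baseline - proposed) / baseline

for comp, row in MSE.items():
    for name in ("BP", "SVR"):
        print(f"{comp} vs {name}: {reduction(row[name], row['multi']):.1f}% lower MSE")
```

For composition G the multi-model MSE is about 59% lower than that of BP and about 20% lower than that of SVR; for composition H the reductions are about 49% and 40%, respectively.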

Figure 7.

Prediction results of BP method: (a) composition G and (b) composition H.

Figure 8.

Prediction results of SVR method: (a) composition G and (b) composition H.

Performance of AMPSO algorithm

To study the influence of the AMPSO algorithm, we test the proposed multi-model quality prediction system. AMPSO is used to solve for the weight values of the multi-model. Each particle consists of four parameters, one weight per local model, and the mutation probability is set to 0.5. The MSE between the multi-model output and the actual output is used as the fitness function. The optimization processes for composition G and composition H over five different runs of 500 steps each are shown in Figure 9. They demonstrate that AMPSO finds the optimum values effectively.

Figure 9.

The best fitness evaluation for different runs: (a) composition G and (b) composition H.

Conclusion

A complex production process is characterized by mutually coupled sub-processes, nonlinear data, and multiple inputs and outputs. A multi-model is proposed in this article to predict the quality of such processes. First, the FCM clustering algorithm is used to divide the complex process into relatively simple operation conditions, which provides an effective data basis for quality prediction. Based on this division, the local quality prediction models are established using SVR, and the global model is then obtained from the weighted local models. AMPSO, which introduces a mutation operator to improve PSO, is used to obtain the best local model weights.

The results of the case study on the TE process demonstrate that the multi-model is a very promising approach to solving quality problems in complex production processes. The prediction results for fault mode 1 in the TE process show MSE values of 0.2359 and 0.2616 and an RE below 1%, which demonstrates that the proposed method performs well. In addition, a local model, a BP neural network, and an SVR model are used as comparative methods to predict the quality of the TE process. The results show that the proposed method is feasible and efficient and evidently improves the prediction accuracy.

A linear weighted strategy is used in this article to combine the local models into the global model; nonlinear weighted strategies are not considered. Future research will study suitable weighting strategies according to the actual production process.

Footnotes

Academic Editor: Murat Uzam

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is financially supported by National Natural Science Foundation of China (NSFC) under Grant No. 51675450 and the Fundamental Research Funds for the Central Universities under Grant No. 2682016CX031.

References

1. Torija, Ruiz, Ramos-Ridao AF. Use of back-propagation neural networks to predict both level and temporal-spectral composition of sound pressure in urban sound environments. Build Environ 2012; 52: 45–56.

2. Ganesan, Rajakarunakaran, Thirugnanasambandam, et al. Artificial neural network model to predict the diesel electric generator performance and exhaust emissions. Energy 2015; 83: 115–124.

3. Han, Ying, Qiao. A fuzzy neural network approach for online fault detection in waste water treatment process. Comput Electr Eng 2014; 40: 2216–2226.

4. Dong, Luo. Bearing degradation process prediction based on the PCA and optimized LS-SVM model. Measurement 2013; 46: 3143–3152.

5. Jia, Wang, et al. Hybrid of simulated annealing and SVM for hydraulic valve characteristics prediction. Expert Syst Appl 2011; 38: 8030–8036.

6. Bates, Granger CWJ. The combination of forecasts. J Oper Res Soc 1969; 20: 451–468.

7. Schmelas, Feldmann, Bollin. Adaptive predictive control of thermo-active building systems (tabs) based on a multiple regression algorithm. Energ Buildings 2015; 103: 14–28.

8. Sang, Jin, Lee IB. Process monitoring using a Gaussian mixture model via principal component analysis and discriminant analysis. Comput Chem Eng 2004; 28: 1377–1387.

9. Varadarajan, Miller, Zhou. Region-based mixture of Gaussians modelling for foreground detection in dynamic scenes. Pattern Recogn 2015; 48: 3488–3503.

10. Li-Juan, Liu, et al. Multi-model predictive control based on AP-LSSVM. J Zhejiang Univ 2013; 47: 1741–1746.

11. Maestri, Farall, Groisman, et al. A robust clustering method for detection of abnormal situations in a process with multiple steady-state operation modes. Comput Chem Eng 2010; 34: 223–231.

12. Yong, Xin, Wang. Fault detection for a class of industrial processes based on recursive multiple models. Neurocomputing 2015; 169: 430–438.

13. Wang, Shen, et al. Multi-model strategy based evidential soft sensor model for predicting evaluation of variables with uncertainty. Appl Soft Comput 2011; 11: 2595–2610.

14. Sun, Yuan. Multi-model modeling approach for complex process. Chin J Sci Instrum 2011; 32: 132–137.

15. Liu, Zhang, Yan JJ. Short-term load forecasting technique for municipal supply water consumption based on fuzzy clustering theory. J Harbin Inst Tech 2009; 41: 162–165.

16. Wang, Sawada, Moriguchi. Landslide susceptibility analysis with logistic regression model based on FCM sampling strategy. Comput Geosci 2013; 57: 81–92.

17. Jamshidi, Rahimi, Ruiz, et al. Application of FCM for advanced risk assessment of complex and dynamic systems. IFAC: Papers OnLine 2016; 49: 1910–1915.

18. Wang. Support vector machines: theory and applications. Lect Notes Comput Sc 2005; 302: 249–257.

19. Jeng, et al. ARFNNS with SVR for prediction of chaotic time series with outliers. Artif Life Robot 2009; 37: 4441–4451.

20. Yaslan, Bican. Empirical mode decomposition based denoising method with support vector regression for time series prediction: a case study for electricity load forecasting. Measurement 2017; 103: 52–61.

21. Kennedy, Eberhart. Particle swarm optimization. In: Proceedings of the IEEE international conference on neural networks, 1995, vol. 4, pp.1942–1948. New York: IEEE, https://www.cs.tufts.edu/comp/150GA/homeworks/hw3/_reading6%201995%20particle%20swarming.pdf

22. Chang, Lin CJ. LIBSVM: a library for support vector machines. ACM Trans Intell Syst Tech 2007; 2: 27, http://www.csie.ntu.edu.tw/~cjlin/libsvm

23. McAvoy. Base control for the Tennessee Eastman problem. Comput Chem Eng 1994; 18: 383–413.

24. Chiang, Kotanchek, Kordon AK. Fault diagnosis based on Fisher discriminant analysis and support vector machines. Comput Chem Eng 2004; 28: 1389–1401.

25. Yang, Gao, et al. Multirate dynamic inferential modeling for multivariable processes. Chem Eng Sci 2004; 59: 855–864.