Abstract
The pneumatic diaphragm pump is an important component in intelligent spraying. When the pump does not work normally, the entire intelligent spraying production line malfunctions. To maintain and manage pneumatic diaphragm pumps effectively, their health status is generally graded according to their working state. Because of the effects of condition monitoring and random faults, health predictions based on observable data are often inaccurate. Very few papers deal with the health monitoring of pneumatic diaphragm pumps and the estimation of their residual life span. In this article, a method combining a vector autoregressive model and a continuous-time hidden Markov model is proposed to analyze and evaluate the life span of a pneumatic diaphragm pump based on the estimation error of its health condition and its cumulative deterioration. The deterioration is modeled as a continuous-time Markov chain with three states: unobservable healthy state 0, unobservable warning state 1, and observable fault state 2. The expectation–maximization algorithm is used to estimate the parameters of the fitted hidden Markov model. From the posterior probability that the pump is in warning state 1, the derived conditional reliability function and mean residual life span formula are calculated to evaluate the residual life span of the pump. The results show that the method can effectively predict the residual life span of the pneumatic diaphragm pump, which illustrates the effectiveness of the model, and that it improves the accuracy of the health status rating.
Keywords
Introduction
With the continuous development of the modern spraying process, spraying machine technology has made great progress. The requirements of automated industrial production are gradually increasing, and the emergence of the spraying robot has become inevitable. 1,2 The painting robot is an industrial robot that can automatically paint or spray various coatings. The pneumatic diaphragm pump (PDP) is a safe machine because it is driven by compressed air. Moreover, the constant pressure of the pump output meets the stable pressure requirement of spray micro-atomization. Thus, the PDP plays a vital role in the entire spraying system.
PDP is a new type of conveying machine. It pumps corrosive, high-viscosity, highly toxic, and other liquids well, and it shares the advantages of self-priming pumps and centrifugal pumps. The diaphragm, a critical part of the diaphragm pump, gives the pump the virtues of a piston pump, such as a simple and durable structure. In addition, the rubber diaphragm, which propels the liquid, gives the PDP higher abrasion and corrosion resistance than other reciprocating pumps. The PDP is widely used in various fields owing to recent developments in fretting corrosion resistance and corrosion resistance. It conveys various liquids through the reciprocating motion of the diaphragm in the cylinder. 3
PDP is a complicated machine with many internal parts, and the fault of any part affects its normal operation. 4 Faults such as diaphragm rupture, loss of diaphragm elasticity, and blockage of the inlet and outlet pipelines occur frequently during operation. Once a component of the PDP fails, the PDP will not work properly. If the abnormal working state of the PDP is not detected in time, the quality of the sprayed products cannot meet the standard. Some products then fail and must be reworked, which is time-consuming and labor intensive. In addition, repairing or replacing a faulty PDP delays the work of the entire production line. With data collection and computer technology, it is possible to implement an effective condition monitoring system for critical equipment so that plant productivity is improved. 5 The goal is to use the information obtained from condition monitoring to assess the actual condition of the operating PDP without any unnecessary interruptions. In the presence of condition monitoring data, the primary predictive feature used in machine prediction is the mean residual life span (MRL), which represents the average remaining time before a fault occurs, given the current machine condition and past observations. Thus, timely and accurate monitoring of the working state of the PDP and the estimation of its residual life span (RL) are of great significance for safe spraying production. 6
In recent years, the fault analysis of PDP has been widely studied. Yu et al. simulated a virtual prototype of the diaphragm pump using the numerical analysis software MATLAB 2014 a+ and the mechanical system dynamics simulation software ADAMS 2010, 64 bits 7 ; Xu et al. developed an analytical model for the elastomeric diaphragm using the Mooney–Rivlin modeling method and elastomeric theory 8 ; Gong et al. presented a fault diagnosis method for the high-pressure piston diaphragm pump based on acoustic emission. 9 However, few studies have considered the effect of random faults on partially observable systems or estimated the RL of PDP.
In this article, a statistical method is proposed to predict the RL of PDP. Statistical methodology is based on rigorous stochastic modeling and statistical analysis of the available condition monitoring information, so it is a common approach in fault diagnosis. The specific implementation process is shown in Figure 1. There are three main statistical methods for early fault prediction using condition monitoring information in industry: the proportional simulation modeling, stochastic recursive filtering, 10 and hidden Markov models (HMMs). 11 The HMM has proven to be the most effective for modeling progressively degrading systems and is widely used in many fields, such as speech and handwriting recognition, econometrics, and most recently, condition-based maintenance. 12,13 The state of the system is modeled as a three-state continuous-time Markov chain observed through real multivariate data, where state 0, state 1, and state 2 represent the healthy state, warning state, and fault state, respectively. 14,15 State 0 means that the PDP is working properly and there is no tendency to malfunction. In state 1, the PDP can operate normally, but there is a chance of malfunctioning. These two states are unobservable but can be estimated from the data obtained by monitoring the system. As the failure rate continues to increase, the system eventually enters state 2, at which point the PDP no longer operates and no further condition monitoring data are collected; this state is observable. 16

Figure 1. HMM-based procedure for RL prediction. HMM: hidden Markov model; RL: residual life span.
After the model is established, the expectation–maximization algorithm (EMA) is applied to estimate the state and observation parameters of the HMM. 17,18 The main advantage of the EMA over other maximum likelihood estimation methods is that it performs well even when some information or data are missing. 19 Because the state process in the HMM is unobservable, the EMA is particularly suitable for parameter estimation in the HMM framework. 20 There are generally two ways to estimate the parameters of an HMM using the EMA. One is to fit a vector autoregressive (VAR) model to the preprocessed historical data and then calculate the residuals of the entire data history using the fitted model. The other is to use the preprocessed historical data directly and apply the observation process model and the state space representation of the Kalman filter. This study uses the residuals obtained from the VAR model as the observation process in the hidden Markov framework. Once the hidden Markov parameters are estimated, the RL of the PDP can be predicted using the derived conditional reliability function (CRF) and the explicit formula of the MRL function, both expressed in terms of the posterior probability. 21,22 The MRL function, calculated from the posterior probability, indicates the average remaining time before a malfunction occurs, given the current PDP condition and past observations. 23,24 From the theory of partially observable Markov decision processes, it can be derived that the posterior probability of the PDP being in the warning state is a univariate statistic, whereas the observations sampled from the state are multidimensional vector data. 25 In the field of fault prediction, the HMM is used to model the state deterioration process, and estimating the RL of a PDP according to its deterioration condition is novel. Moreover, HMM-based RL prediction has not yet been applied to PDP.
The rest of this article is organized as follows. In the “Experimental platform construction and VAR model and residual calculation” section, the VAR model is applied to the diaphragm motion frequency and outlet pressure data of a PDP to obtain both healthy and unhealthy residuals. In the “Hidden Markov modeling and estimation of parameters” section, the obtained residuals are used as the observation process of the HMM, and the model parameters are estimated by the EMA. In the “CRF and MRL functions for RL prediction” section, the CRF and the MRL function are used to predict the remaining life span of the PDP. Conclusions and future perspectives are provided in the “Conclusions” section.
Experimental platform construction and VAR model and residual calculation
A failure of the PDP leads to a sustained abnormal working condition. At the same time, a series of pump variables change, such as the motion frequency of the diaphragm, the vibration amplitude of the pump, and the liquid pressure at the outlet. Among these, the motion frequency of the PDP and the liquid pressure at the outlet carry most of the fault information after a fault occurs. 9 For example, when the PDP is working normally, the number of times the cylinder diaphragm reciprocates per unit time stays within a certain range. If the number falls outside the normal range, the PDP may have a malfunction such as blockage or diaphragm rupture. To avoid excessive parameterization, two-dimensional (2-D) diagnostic data consisting of the diaphragm motion frequency and the outlet fluid pressure were studied.
An experimental platform was first built to acquire data. The working power of the PDP was provided by an air compressor with a stable pressure output (0.6 MPa). The main function of the programmable logic controller (PLC) is to obtain the operating state information of the PDP, such as the motion frequency of the diaphragm and the pressure of the water at the outlet of the PDP. The acquired status information was displayed on a personal computer through the application software, and the data were saved to a local database for further analysis. Figure 2 shows the experimental platform.

Figure 2. Construction of the experimental platform.
The data are acquired by the PLC and the sensors. First, a proximity switch is placed at the position of the diaphragm drive shaft, where it senses the magnetic ring of the drive shaft to record the motion frequency of the diaphragm. Each time the PDP reciprocates once, the proximity switch generates a switching signal. In addition, a pressure sensor is installed at the water outlet of the pump to obtain the pressure of the water at the outlet. Figure 3 shows the sensor layout.

Figure 3. Sensor position on PDP experiment platform. PDP: pneumatic diaphragm pump.
Two types of data histories were considered. One ends with PDP failure, and the other ends when the PDP is halted before a fault occurs. A fault history is defined as historical data that end with an observed fault, indicating that the PDP can no longer be used normally. A pause history is defined as historical data that end while the PDP is still running without having lost its function. During the life span of the PDP, data are acquired and analyzed every 1 h. A total of three data histories were collected throughout the study. Figure 4 shows the trends in diaphragm motion frequency and outlet pressure over time for all three data histories.

Figure 4. Trend graph of motion frequency and outlet fluid pressure.
It can be observed from Figure 4 that the measured values are stable from the first cycle to the 130th cycle. This indicates that the PDP operates under normal conditions, and this part is considered to be the healthy part of the historical data. After the 130th cycle, the diaphragm frequency decreased significantly and the outlet pressure increased significantly. This indicates that the entire working system starts to operate abnormally, and this part is considered the unhealthy part of the historical data. In our experiments, the pressure of the water at the outlet of the PDP is a pulsating value. As long as it fluctuates stably within a certain range, the condition can be regarded as normal, which explains the local fluctuation of the outlet water pressure trend in Figure 4.
Time series segmentation is the division of a nonstationary time series into a finite number of stationary segments. 26,27 The segmentation in this study serves to obtain a stationary history in the healthy part of the data, so that a stationary time series model can be fitted to the healthy part and the residuals can be calculated using the fitted model. 27 There are many ways to segment a series. In this study, the historical data only need to be divided into a healthy part and an unhealthy part. Since there is no basic “optimal” segmentation standard, for simplicity the split point in each data history is selected by visual inspection of the graph.
In all historical data, the healthy part of it is represented as
We assume that health data follow a common VAR process as follows
where $\varepsilon_n$ are i.i.d. $N_2(0, C)$, $p \in \mathbb{N}$ is the model order, $\Phi_r \in \mathbb{R}^{2 \times 2}$ are the autocorrelation matrices, and $\delta_0 \in \mathbb{R}^2$ and $C \in \mathbb{R}^{2 \times 2}$ are the mean and covariance model parameters.
To make the health data history
Where
The following is the least squares estimate for A and C
where $T$ is the total number of available data points. The parameters of these models are unknown and need to be estimated. The model order $p \in \mathbb{N}$ is estimated by testing $H_0: \Phi_p = 0$ against $H_a: \Phi_p \neq 0$ using the likelihood ratio statistic given by
where $S_p$ is the residual sum of squares, whose specific expression is as follows
For the actual 2-D pump data, $M_2 = 10.8923$ and $M_3 = 5.5229$ can be obtained by the above formula. From the $\chi^2$ distribution with two degrees of freedom and $\alpha = 0.05$,
With the estimate
where
The method for computing the residuals in formula (4) is given in the literature, 24,27 and with it the residuals of both the healthy part and the unhealthy part of the historical data can be calculated. The residuals are shown as a 2-D scatter plot in Figure 5. The blue dots indicate the residuals calculated from the healthy portion of the pump historical data, whereas the orange dots represent the residuals calculated from the unhealthy portion.
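The order-selection test described earlier in this section can be checked numerically. The following is a minimal sketch that takes the reported statistics $M_2 = 10.8923$ and $M_3 = 5.5229$ at face value and compares them with the $\chi^2$ critical value for two degrees of freedom at $\alpha = 0.05$, as stated in the text. The function names are illustrative and are not taken from the study's code, and the closed-form critical value $-2\ln\alpha$ is valid only for two degrees of freedom. The sequential rule (increase $p$ until $H_0$ is no longer rejected) is an assumed reading of the procedure.

```python
import math


def chi2_critical_2dof(alpha: float) -> float:
    """Upper-alpha critical value of a chi-square distribution with 2 degrees of freedom.

    With 2 degrees of freedom the survival function is exp(-x / 2),
    so the critical value has the closed form -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)


def select_order(stats: dict, alpha: float = 0.05) -> int:
    """Walk through the tested orders in increasing p and stop at the first
    order whose statistic M_p fails to reject H0: Phi_p = 0."""
    critical = chi2_critical_2dof(alpha)
    for p in sorted(stats):
        if stats[p] <= critical:
            return p - 1  # Phi_p is not significant, so keep order p - 1
    return max(stats)     # every tested order was significant


# Statistics reported in the text for the 2-D pump data.
reported = {2: 10.8923, 3: 5.5229}
critical_value = chi2_critical_2dof(0.05)
chosen_order = select_order(reported)
print(f"critical value = {critical_value:.4f}, selected order p = {chosen_order}")
```

Under this reading, $M_2$ exceeds the critical value while $M_3$ does not, which points to a second-order model.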

Figure 5. Residual two-dimensional scatter plot of historical data.
The ultimate goal of this article is to model the state process of a PDP and to evaluate the remaining life span of the PDP with the fitted model. After the data are preprocessed through historical data segmentation and VAR modeling, the resulting residuals serve as the observation process. The hidden Markov parameters are estimated in the following text, and the established model is then used to evaluate the RL of the PDP.
In the next section, the residuals are used as the hidden Markov observation process, and the EMA is used to estimate the hidden Markov state and observation parameters.
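The preprocessing chain (fit a VAR model to the healthy part, then compute residuals for any history) can be sketched as follows. This is an illustration under stated assumptions, not the study's code: it assumes the standard VAR($p$) form $y_n - \delta_0 = \sum_{r=1}^{p} \Phi_r (y_{n-r} - \delta_0) + \varepsilon_n$, where $y_n$ is a hypothetical name for the 2-D observation, and fits the equivalent regression with an intercept by least squares. The synthetic data, the parameter values, and all function names are invented for the example.

```python
import numpy as np


def build_regressors(y: np.ndarray, p: int):
    """Rows [1, y_{n-1}, ..., y_{n-p}] with targets y_n, for n = p, ..., T-1."""
    T = y.shape[0]
    lags = [y[p - r : T - r] for r in range(1, p + 1)]
    X = np.hstack([np.ones((T - p, 1))] + lags)
    return X, y[p:]


def fit_var(y_healthy: np.ndarray, p: int):
    """Least-squares estimate of the stacked coefficients and the residual covariance."""
    X, target = build_regressors(y_healthy, p)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    cov = resid.T @ resid / (X.shape[0] - X.shape[1])
    return coef, cov


def var_residuals(y: np.ndarray, coef: np.ndarray, p: int) -> np.ndarray:
    """One-step-ahead residuals of any history under the fitted healthy model."""
    X, target = build_regressors(y, p)
    return target - X @ coef


# Synthetic stand-in for one data history (hypothetical values throughout).
rng = np.random.default_rng(0)
delta0 = np.array([60.0, 0.6])            # assumed healthy means: frequency, pressure
phi = [0.4 * np.eye(2), 0.1 * np.eye(2)]  # assumed autocorrelation matrices
T, split, p = 170, 130, 2
y = np.tile(delta0, (T, 1))
for n in range(p, T):
    y[n] = delta0 + sum(phi[r] @ (y[n - 1 - r] - delta0) for r in range(p))
    y[n] += rng.normal(0.0, 0.1, size=2)
# After the split point the frequency drifts down and the pressure drifts up.
y[split:] += np.linspace(0.0, 1.0, T - split)[:, None] * np.array([-5.0, 3.0])

coef, cov = fit_var(y[:split], p)
healthy_res = var_residuals(y[:split], coef, p)
unhealthy_res = var_residuals(y[split - p :], coef, p)  # keep p lags before the split
print(healthy_res.shape, unhealthy_res.shape)
```

Fitting only on the healthy part means that unhealthy observations show up as residuals that are systematically displaced from zero, which is what makes the residuals usable as an observation process.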
Hidden Markov modeling and estimation of parameters
In this section, the residual data are used as the observed data to fit the HMM. 28 The state of the PDP is divided into a good state (state 0), a warning state (state 1), and a fault state (state 2), and only the fault state is observable. This article models the state process $(X_t : t \in \mathbb{R}_+)$ as a continuous-time homogeneous Markov chain with state space {0, 1, 2}. Assuming that the PDP starts in a good state of health, that is, $X_0 = 0$, the transition rate matrix is as follows
$$\begin{pmatrix} -(\lambda_{01} + \lambda_{02}) & \lambda_{01} & \lambda_{02} \\ 0 & -\lambda_{12} & \lambda_{12} \\ 0 & 0 & 0 \end{pmatrix} \tag{5}$$
where $\lambda_{01}, \lambda_{02}, \lambda_{12} \in (0, +\infty)$ are unknown model parameters that need to be estimated. Since the system typically deteriorates from state 0 to state 1 before failing, and failure is more likely in state 1, it is assumed that $\lambda_{12} > \lambda_{02}$. 29 The last row of equation (5) contains only zero elements because once the system enters failure state 2, it cannot leave that state before being replaced, that is, $P(Y_n = \eta \mid X_{n\Delta} = 2) = 1$, where $\eta \notin \mathbb{R}^2$ represents the fault signal. Moreover, for a stable system, the elements of each row of the transition rate matrix sum to zero. 30
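The structure described here (three positive rates, rows summing to zero, an absorbing fault state) can be made concrete with a short simulation. The sketch below is only an illustration: the rate values are hypothetical, chosen so that $\lambda_{12} > \lambda_{02}$, and the function names are not from the study. It builds the rate matrix implied by the description and draws failure times $\xi$ by sampling exponential sojourn times.

```python
import random


def rate_matrix(lam01: float, lam02: float, lam12: float):
    """Transition rate matrix of the three-state chain; state 2 is absorbing."""
    return [
        [-(lam01 + lam02), lam01, lam02],
        [0.0, -lam12, lam12],
        [0.0, 0.0, 0.0],
    ]


def simulate_failure_time(lam01: float, lam02: float, lam12: float, rng: random.Random) -> float:
    """Draw one failure time, starting from the healthy state X_0 = 0."""
    exit_rate0 = lam01 + lam02
    t = rng.expovariate(exit_rate0)        # sojourn time in state 0
    if rng.random() < lam01 / exit_rate0:  # jump 0 -> 1 rather than 0 -> 2
        t += rng.expovariate(lam12)        # sojourn time in state 1
    return t


# Hypothetical rates per hour, for illustration only.
LAM01, LAM02, LAM12 = 0.02, 0.001, 0.05
Q = rate_matrix(LAM01, LAM02, LAM12)
rng = random.Random(42)
times = [simulate_failure_time(LAM01, LAM02, LAM12, rng) for _ in range(20000)]
sample_mean = sum(times) / len(times)
# Mean time to failure from state 0: 1/v0 + (lam01/v0) * (1/lam12), with v0 = lam01 + lam02.
exact_mean = (LAM12 + LAM01) / ((LAM01 + LAM02) * LAM12)
print(f"simulated mean = {sample_mean:.2f} h, closed form = {exact_mean:.2f} h")
```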
Given the state of the PDP, the residual process $(Y_n : n \in \mathbb{N})$ is assumed to be conditionally independent. For all $n \in \mathbb{N}$, the study assumes that $Y_n$, conditional on $X_{n\Delta} = \chi$, $\chi = 0, 1$, follows a bivariate normal distribution $N_2(\mu_\chi, \Sigma_\chi)$ with density function
where µ 0, µ 1 ∈ R2 and Σ0, Σ1 ∈ R2×2 are unknown observation parameters.
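For reference, the bivariate normal density assumed for the residuals can be evaluated directly for a 2×2 covariance matrix. The sketch below uses the textbook density of $N_2(\mu, \Sigma)$; the parameter values for the healthy and warning states are hypothetical placeholders, not the estimates obtained in the study.

```python
import math


def bivariate_normal_pdf(y, mu, sigma) -> float:
    """Density of N_2(mu, sigma) at the point y, for a 2x2 covariance matrix sigma."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    dy0, dy1 = y[0] - mu[0], y[1] - mu[1]
    # Quadratic form (y - mu)^T sigma^{-1} (y - mu), using the explicit 2x2 inverse.
    quad = (d * dy0 * dy0 - (b + c) * dy0 * dy1 + a * dy1 * dy1) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


# Hypothetical observation parameters for state 0 (healthy) and state 1 (warning).
mu0, sigma0 = (0.0, 0.0), ((0.1, 0.0), (0.0, 0.1))
mu1, sigma1 = (-1.0, 1.0), ((0.1, 0.0), (0.0, 0.1))

residual = (-0.9, 1.1)  # a residual lying close to the assumed warning-state mean
density_healthy = bivariate_normal_pdf(residual, mu0, sigma0)
density_warning = bivariate_normal_pdf(residual, mu1, sigma1)
print(density_warning > density_healthy)
```

The ratio of the two densities is what drives the posterior probability of the warning state: a residual that is far more likely under state 1 than under state 0 pushes the posterior toward 1.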
Here $F_i$ and $S_j$ denote the fault histories and pause histories, respectively, and $\xi = \inf\{t \in \mathbb{R}_+ : X_t = 2\}$ denotes the observable failure time, where $i \in [1, N]$ and $j \in [1, M]$. Assuming that the fault history $F_i$ has the form of
Since the sample path of the state process is unobservable, the associated likelihood function cannot be maximized analytically. 32 Thus, this study solves the problem by means of the EMA. The EMA is well suited to the parameter estimation problem of the HMM and iteratively maximizes a pseudo-likelihood function to estimate the unknown parameters. The EMA consists of two steps: an E-step and an M-step. The E-step calculates the expectation of the associated likelihood function, and the M-step maximizes it with respect to the unknown state and observation parameters. The specific steps are as follows:
It is assumed that λ, θ are the initial values of the unknown state parameter and the observed parameter, respectively, and
E-step: the pseudo likelihood function is defined by
where C represents all data sets.
M-step: Calculate λ* and θ* according to the following formula
The calculated λ* and θ* are substituted as new parameters into the E-step, and the E-step and the M-step are repeated until the Euclidean norm
The unique maximizers of the state parameters and observation parameters are given by the following formulas
and
Using equations (9) and (10), and the Euclidean norm stopping criterion
Table 1. Iterations of the EMA. EMA: expectation–maximization algorithm.
Then, the transition rate matrix of equation (5) can be obtained as
In Table 1, the initial values of the parameters are selected on the basis of previous experience. The E-step and the M-step are then iterated until the parameter estimates satisfy the Euclidean norm stopping criterion. All calculations are coded in Python.
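The iteration pattern (E-step, M-step, stop when the Euclidean norm of the parameter update falls below a tolerance) can be illustrated on a much simpler model. The sketch below is not the HMM estimation of this study: it swaps in a toy two-component one-dimensional Gaussian mixture so that both steps fit in a few lines. The data, the initial values, and the tolerance are hypothetical.

```python
import math
import random


def normal_pdf(x: float, mean: float, var: float) -> float:
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def e_step(data, params):
    """Posterior probability that each point belongs to component 1."""
    w, m0, m1, v0, v1 = params
    resp = []
    for x in data:
        p0 = (1.0 - w) * normal_pdf(x, m0, v0)
        p1 = w * normal_pdf(x, m1, v1)
        resp.append(p1 / (p0 + p1))
    return resp


def m_step(data, resp):
    """Closed-form maximizers of the expected complete-data log-likelihood."""
    n1 = sum(resp)
    n0 = len(data) - n1
    m0 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n0
    m1 = sum(r * x for r, x in zip(resp, data)) / n1
    v0 = sum((1.0 - r) * (x - m0) ** 2 for r, x in zip(resp, data)) / n0
    v1 = sum(r * (x - m1) ** 2 for r, x in zip(resp, data)) / n1
    return (n1 / len(data), m0, m1, v0, v1)


def run_em(data, init, tol: float = 1e-6, max_iter: int = 500):
    """Alternate E- and M-steps until the Euclidean norm of the update is below tol."""
    params = init
    for iteration in range(1, max_iter + 1):
        new_params = m_step(data, e_step(data, params))
        change = math.dist(new_params, params)  # Euclidean norm of the update
        params = new_params
        if change < tol:
            break
    return params, iteration


rng = random.Random(1)
data = [rng.gauss(0.0, 0.5) for _ in range(300)] + [rng.gauss(3.0, 0.5) for _ in range(100)]
initial = (0.5, -1.0, 4.0, 1.0, 1.0)  # weight, mean 0, mean 1, variance 0, variance 1
estimate, iterations = run_em(data, initial)
print(iterations, [round(v, 3) for v in estimate])
```

As in Table 1, the quantity worth tracking from one iteration to the next is the full parameter vector, since the stopping rule is defined on the distance between consecutive estimates.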
In the next section, based on the posterior probability of PDP in state 1, the study uses the estimated parameters of the model to calculate the CRF and the MRL function for the remaining life span of PDP.
CRF and MRL functions for RL prediction
In this section, the RL is estimated using the CRF and the MRL function, based on the model fitted in the previous section. Both formulas are functions of the posterior probability of the warning state (state 1).
The uncertainty of the deterioration process of the PDP is assumed to follow the continuous-time homogeneous Markov chain $(X_t : t \in \mathbb{R}_+)$ with state space Z = {0, 1, 2}. State 0 and state 1 are unobservable operational states, and state 2 is an observable fault state. Because the posterior probability of the PDP being in the warning state is sufficient for the final prediction, the study only needs to monitor this posterior probability. The posterior probability of the PDP being in the warning state is expressed as
where Δ represents the equidistant sampling time.
Solving the Kolmogorov backward differential equations 33 with the transition rate matrix in equation (5) yields the following transition probability matrix
where the transition probabilities are $P_{ij}(t) = P(X_t = j \mid X_0 = i)$, $i, j \in [0, 1]$.
Using Bayes' theorem, for $n \geq 1$ the posterior probability can be calculated recursively as
where Π0 = 0.
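A recursive Bayesian update of this kind can be sketched as follows. The article's own update formula is not reproduced in this text, so the code is an assumed reading: it uses the standard filter for a three-state chain whose fault state is observable, conditioning on the pump still being operational at each sampling time. The closed-form transition probabilities follow from the rate structure with rates $\lambda_{01}$, $\lambda_{02}$, $\lambda_{12}$ (and require $\lambda_{01} + \lambda_{02} \neq \lambda_{12}$). All parameter values and names are hypothetical.

```python
import math


def transition_probs(lam01, lam02, lam12, t):
    """P00(t), P01(t), P11(t) of the three-state chain, assuming lam01 + lam02 != lam12."""
    v0, v1 = lam01 + lam02, lam12
    p00 = math.exp(-v0 * t)
    p11 = math.exp(-v1 * t)
    p01 = lam01 * (p11 - p00) / (v0 - v1)
    return p00, p01, p11


def bivariate_normal_pdf(y, mu, sigma):
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dy0, dy1 = y[0] - mu[0], y[1] - mu[1]
    quad = (d * dy0 * dy0 - (b + c) * dy0 * dy1 + a * dy1 * dy1) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def update_posterior(prev, y, rates, obs, delta):
    """One Bayes step for the warning-state probability, given survival up to now."""
    p00, p01, p11 = transition_probs(*rates, delta)
    (mu0, sigma0), (mu1, sigma1) = obs
    # Unnormalized weights of being in state 0 or state 1 after observing y.
    w0 = bivariate_normal_pdf(y, mu0, sigma0) * p00 * (1.0 - prev)
    w1 = bivariate_normal_pdf(y, mu1, sigma1) * (p01 * (1.0 - prev) + p11 * prev)
    return w1 / (w0 + w1)


# Hypothetical parameters: rates per hour, sampling interval of 1 h.
rates = (0.02, 0.001, 0.05)
cov = ((0.1, 0.0), (0.0, 0.1))
obs = (((0.0, 0.0), cov), ((-1.0, 1.0), cov))
residuals = [(0.0, 0.0)] * 5 + [(-1.0, 1.0)] * 3  # healthy-like, then warning-like

posterior = 0.0  # the pump starts in the healthy state
posteriors = []
for y in residuals:
    posterior = update_posterior(posterior, y, rates, obs, delta=1.0)
    posteriors.append(posterior)
print([round(p, 4) for p in posteriors])
```

With residuals that sit at the healthy mean, the posterior stays near zero; it rises sharply as soon as the residuals move to the warning-state mean, which is the kind of abrupt change the posterior plot is meant to reveal.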
According to equation (6), the ratio of normal density can be derived as
where
where $\Pi_0 = P(X_0 = 1) = 0$. When $n \geq 1$, the formula is as follows
According to the above CRF, it can be concluded that the MRL function in the nth period is as follows
The MRL and CRF functions have been widely used for RL prediction in the field of life prediction research. The formulas introduced above are applied to a real fault history containing the bivariate residual observations of the study. To evaluate the condition of the PDP, the MRL and CRF are estimated from the HMM. The corresponding posterior probability is calculated using the above posterior probability formula, and the results are shown in Figure 6.
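How the CRF and MRL depend on the posterior probability can be sketched for the three-state structure used here. The article's explicit formulas are not reproduced in this text, so the expressions below are an assumed reading derived from that structure: the reliability over a horizon $t$ mixes the survival probabilities from state 0 and state 1 with weights $1 - \Pi_n$ and $\Pi_n$, and the MRL is its integral over $t$. The rate values are hypothetical, and a numerical integral is included only as a cross-check of the closed form.

```python
import math


def transition_probs(lam01, lam02, lam12, t):
    """P00(t), P01(t), P11(t) of the three-state chain, assuming lam01 + lam02 != lam12."""
    v0, v1 = lam01 + lam02, lam12
    p00 = math.exp(-v0 * t)
    p11 = math.exp(-v1 * t)
    p01 = lam01 * (p11 - p00) / (v0 - v1)
    return p00, p01, p11


def conditional_reliability(t, posterior, lam01, lam02, lam12):
    """Probability of surviving a further time t, given the warning-state posterior."""
    p00, p01, p11 = transition_probs(lam01, lam02, lam12, t)
    return (1.0 - posterior) * (p00 + p01) + posterior * p11


def mean_residual_life(posterior, lam01, lam02, lam12):
    """Integral of the conditional reliability over t in [0, infinity), in closed form."""
    v0 = lam01 + lam02
    from_healthy = (lam12 + lam01) / (v0 * lam12)  # expected life starting in state 0
    from_warning = 1.0 / lam12                     # expected life starting in state 1
    return (1.0 - posterior) * from_healthy + posterior * from_warning


def mrl_numeric(posterior, lam01, lam02, lam12, step=0.1, horizon=2000.0):
    """Trapezoidal cross-check of the closed-form mean residual life."""
    n = int(horizon / step)
    values = [conditional_reliability(k * step, posterior, lam01, lam02, lam12) for k in range(n + 1)]
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


RATES = (0.02, 0.001, 0.05)  # hypothetical rates per hour, with lam12 > lam02
mrl_values = {pi: mean_residual_life(pi, *RATES) for pi in (0.0, 0.5, 1.0)}
for pi, mrl in mrl_values.items():
    print(f"posterior = {pi:.1f}: MRL = {mrl:.2f} h")
```

Because $\lambda_{12} > \lambda_{02}$, the MRL in this sketch decreases linearly as the posterior probability of the warning state grows, so a jump in the posterior translates directly into a drop in the predicted RL.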

Figure 6. Posterior probability of the typical failure PDP. PDP: pneumatic diaphragm pump.
Figure 6 depicts the posterior probability for the PDP data history. It can be seen from the figure that the PDP starts running in a healthy state and remains healthy from the first to the 18th sampling period. In the 19th sampling period, the posterior probability changes abruptly, indicating that the PDP has entered the warning state from the healthy state, that is, it has moved from state 0 to state 1.
Based on the initial value of the posterior probability and equation (13), the estimated conditional reliability for each sampling period of PDP historical data can be derived. The result is shown in Figure 7.

Figure 7. Conditional reliability function of failure history data.
Figure 7 shows the estimated conditional reliability for each sampling period of the PDP data history. It can be seen from the figure that the reliability of the PDP shows a significant downward trend at about the 19th sampling period, as the PDP moves from the healthy state to the warning state. At about the 26th sampling period, the system begins to move from the warning state toward the failure state, indicating that the probability of PDP failure at this time is high. The MRL function is completely determined by the posterior probability, and the MRL of the failure history data is shown in Figure 8. It can be seen from Figure 8 that the RL drops rapidly after the 25th sampling period, and the RL in the 26th sampling period is only 72.0559 h, indicating that the PDP is close to state 2 at this time. The PDP should therefore be comprehensively inspected and repaired in time to avoid production delays caused by the fault and additional replacement costs. Thus, the proposed model is suitable for predicting the remaining useful life span of a deteriorating PDP.

Figure 8. Average residual life span function of failure history data.
Conclusions
The life span of the PDP in intelligent spraying is predicted from real multivariate data consisting of the diaphragm motion frequency and the outlet pressure. A prediction model was developed based on MRL estimation from multivariate observations obtained from condition monitoring of the entire system. The state follows an unobservable continuous-time hidden Markov process. The state process was modeled as a three-state continuous-time Markov chain, where the good and warning states are unobservable and only the fault state is observable. The VAR model was used to clean and preprocess the healthy part of the historical data and to obtain the healthy and unhealthy residuals from the data histories. The residuals were selected as the observation process in the HMM framework. The parameters of the HMM were estimated by means of the EMA, which considerably increases the speed of the computation. With the unknown parameters estimated, the RL of the PDP is predicted from the explicit formulas derived for the CRF and the MRL. As a result, when the monitored status of the PDP in the intelligent spraying system changes and the RL is greatly reduced, preventive maintenance measures should be taken immediately to keep the entire intelligent spraying system in a healthy working state. This avoids unqualified spray products due to system failures, delays in the entire spray line, and even the shutdown of production.
Although this article mainly analyzes the historical data of PDP in the intelligent spraying system, the methods involved in this article can be widely applied to other fields. For example, the model established in this study can be used for other types of condition monitoring data, such as quality or performance monitoring data.
In fact, the failure of a PDP is a gradual process, and many kinds of characteristic information appear over its life span. However, only three states, namely the healthy state, the warning state, and the failure state, were considered in this study, and satisfactory results were nevertheless obtained. This article only predicts the fault state of the PDP and does not identify the cause of the malfunction. Therefore, fault type analysis can be studied in depth in future work.
Footnotes
Acknowledgment
The authors are grateful to all colleagues from Shanghai University for their enthusiastic support of this study.
Author contributions
LY supervised the project and guided the project implementation and organized the manuscript. JC conducted the experiment and prepared the manuscript. PY provided technical corrections and grammatical modifications to the manuscript. YY and KC executed the experimental protocol, and SH carried out the software development.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
