System-level performance prognosis based on data augmentation for air brake systems of in-service trains

Abstract

To prognose the performance of air brake systems of in-service trains, this paper proposes a data augmentation method based on correlation analysis and improved multivariate support vector regression. Using black box theory and correlation analysis, brake cylinder pressure was extracted as the performance identification signal of the air brake system, and the input and output signals were used as auxiliary signals to construct a black box model of the air brake system. To make full use of the acquired data, a data augmentation prognosis model based on improved multivariate support vector regression was established. The optimal parameters of the prognosis model were selected by the particle swarm optimization (PSO) algorithm, and root mean squared error (RMSE) and mean absolute error (MAE) were used as evaluation indicators. Finally, case studies were conducted on test-rig performance experiment data of the air brake system. The prognosis model was trained on the service 1-7N and emergency experiment data and verified on the service full brake-relief and stage brake-relief experiment data, and the modeling and calibration errors at the pressure-holding stage were both on the order of ±5 kPa, which demonstrates that the proposed method is accurate and effective.

Keywords

In-service train, air brake system, performance prognosis, data augmentation, time-varying system

Introduction

With the continuous development of the rail transit network, a large number of trains have entered service. Their high speed, long service time and large passenger capacity make the safety and reliability of train equipment extremely important, and this topic has received widespread attention.^1–3 As the safety equipment of trains, the brake system plays a vital role in slowing down or stopping trains, and it has become an important constraint on the further improvement of train speed and traction quality. However, due to harsh working conditions, such as vibration, humidity, electromagnetic interference and frequent usage, performance degradation of the brake system inevitably occurs, which further affects the operation safety, efficiency, quality and maintenance costs of trains. Fault prognosis, detection and diagnosis are effective ways to improve the active safety of the system and reduce accident risk.^4,5

Research on fault prognosis, detection and diagnosis of brake systems has been reported in recent years,^6,7 and the relevant methods fall into three main categories: model-based, data-driven, and signal processing-based. Ding and Zuo⁵ proposed a performance degradation prognosis model based on relative characteristic and long short-term memory network for components of brake systems. Niu and Zhao⁸ proposed a fault detection and isolation method for locomotive brake system based on bond graph model. Zuo et al.⁹ established a performance degradation monitoring model based on data fusion method for pneumatic brake system. Lu et al.,¹⁰ Zuo,¹¹ and Zhou et al.¹² proposed data-driven methods to detect and isolate faults of sensors, pneumatic units and brake cylinders, respectively. Seo et al.¹³ conducted fault diagnosis for solenoid valve of railway brake systems with embedded sensor signals and physical interpretation. The existing literature is thus largely limited to fault prognosis, detection and isolation methods for components of brake systems, and few studies on system-level performance prognosis for the air brake system of in-service trains have been reported.

Prognosis methods are generally divided into two categories: model-based and data-driven.^14–16 As the air brake system is a mechanical-electrical-pneumatic coupled, time-varying nonlinear system with complex structure and multiple operating modes, it is hard to establish an accurate mathematical model, whereas data-driven methods focus on data characteristics.^17,18 Moreover, time series are formed during the service period of the air brake system. Therefore, a data-driven prognosis method is more appropriate and was adopted in this paper. Considering the nonlinearity of the input and output time series of the air brake system under multiple operating modes, commonly used data-driven nonlinear prognosis methods, such as artificial neural networks (ANN) and support vector regression (SVR), were considered. You et al.¹⁹ proposed a neural network-based method for real-time health status prediction of electric vehicle batteries. Shen et al.²⁰ used multivariate SVR to predict the remaining life of rolling bearings, and experimental results showed that the method could obtain accurate prediction results when samples are scarce. Although ANN has a strong ability to approximate nonlinear mappings, it learns according to the empirical risk minimization criterion, which requires a large sample size for training and is prone to overfitting, impairing its generalization ability. SVR, by contrast, is based on the structural risk minimization criterion, which considers both empirical risk and confidence interval minimization. It has strong generalization ability, requires fewer samples and a shorter training time, and is robust to noise, which can effectively overcome the shortcomings of ANN.^21,22 Nevertheless, the kernel parameters have a great impact on SVR-based prognosis models. Genetic algorithms²³ and particle swarm algorithms²⁴ have been used to optimize the kernel parameters and penalty factors to improve the prognosis accuracy. However, SVR-based prognosis models also have the deficiency that the continuity and correlation between data points of a time series are not fully considered.

In this paper, a black box model of the air brake system was constructed, a data augmentation prognosis model based on an improved multivariate support vector regression algorithm was established to make fuller use of the acquired data, and the particle swarm optimization (PSO) algorithm was used to optimize the model parameters.

The remaining parts of this paper are organized as follows. Section 2 introduces the air brake system and its black box model. Section 3 presents the data augmentation prognosis method. Section 4 describes case studies to demonstrate the effectiveness of the proposed method. Finally, conclusions are drawn in Section 5, with some perspectives on research and development.

Black box model of air brake system

Brake system of high-speed train

The microcomputer-controlled electropneumatic brake system (Figure 1(a)) is widely used in trains, and it is mainly composed of driver controller, compressed air supply unit (CASU), electronic brake control unit (EBCU), pneumatic brake control unit (PBCU), and basic brake unit (BBU).⁵

Figure 1.

Black box model of air brake system. (a) Schematic diagram of brake system; (b) Schematic diagram of black box model.

Driver controller generates brake command signals and transmits them to EBCU for brake force calculation and distribution. CASU supplies compressed air to PBCU and other air-consuming equipment. PBCU generates brake cylinder pressure according to the instructions of EBCU. BBU transfers brake cylinder pressure into mechanical brake force.²⁵

Black box model of air brake system

The brake force of trains generally consists of electric brake force and air friction brake force. The electric brake force generated by the traction motor decreases as the train speed decreases, while the air friction brake force is provided by the air brake system, which comes into effect in the cases of service brake and fast brake at low speed, as well as emergency brake. The air brake system is therefore the indispensable safety system for slowing down and stopping trains. Since the various pneumatic components of the air brake system are driven by compressed air, pressure can effectively reflect the performance of the air brake system, and studying how pressure changes is of great value for its performance prognosis.

The air brake system has a complex structure, its working medium is compressed air, it is strongly nonlinear, and its components are coupled and influence each other. As a result, it is hard to establish an accurate mathematical model in practice, and a data-driven prognosis method is more suitable for the air brake system. Although the structure is complex, the air brake system is a control system governed by a nonlinear mechanism,²⁶ and for a given input, changes within the system will have an impact on the final output. By introducing black box theory²⁷ and focusing on the numerical characteristics, the input and output signals were used as the identification signals to construct the black box model of the air brake system.

As shown in Figure 1(b), the pneumatic input of air brake system is brake supply reservoir pressure, that is, P_SR, the electric control input is EP current, that is, I_EP, and the output is brake cylinder pressure, that is, P_BCP. Moreover, P_BCP is an important variable to reflect the performance of air brake system,⁹ so the performance prognosis of air brake system can be achieved by predicting the time series of P_BCP.

In order to ensure the validity of the black box model, it was assumed that P_BCP is related to P_SR and I_EP, and a correlation test based on the Spearman correlation coefficient was carried out to verify this hypothesis. The coefficient is calculated as follows²⁸

r_{s} = 1 - \frac{6 \sum_{i = 1}^{n} d_{i}^{2}}{n (n^{2} - 1)}

(1)

where n is the number of samples, and $d_{i}$ represents the rank difference between the two variables at the ith moment.
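As an illustration of equation (1), the following minimal Python sketch ranks two equally long series and applies the formula directly. The function names `ranks` and `spearman` and the sample values are hypothetical, and the sketch assumes there are no tied values, since equation (1) in this form holds only for untied ranks.

```python
def ranks(values):
    """Return ranks 1..n for a sequence (assumes no tied values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman coefficient from rank differences, as in equation (1)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = ranks(x), ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Illustrative values only (not measured data)
current = [10, 20, 30, 40]
pressure = [1, 3, 2, 4]
r_s = spearman(current, pressure)  # rank differences 0, -1, 1, 0 give 0.8
```

A coefficient close to 1 or −1 indicates a strong monotonic relationship between the two signals, which is the property the hypothesis test relies on.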

Prognosis model based on data augmentation method

Multivariate support vector regression

According to the black box model of the air brake system, the prognosis model has more than one input variable. Meanwhile, considering the nonlinearity of the input and output time series of the air brake system under multivariable operating conditions, the multivariate support vector regression (MSVR) algorithm was considered. The basic idea of MSVR is to establish a nonlinear mapping between a multi-dimensional input vector and the output vector. By introducing a kernel function, the original data in the input space are mapped into a high-dimensional feature space, so that the nonlinear regression problem in the input space is transformed into a linear regression problem in the feature space.²⁹ The main steps of MSVR are as follows.

For the training sample set $D_{SVR} = \{(x_{i}, y_{i}) \mid i = 1, 2, \dots, n\}$ , $x_{i} \in R^{m_{0} \times n}$ is the input vector, $y_{i} \in R$ is the output vector, $m_{0}$ is the number of variables, and n is the number of training samples. The modeling process is to construct a nonlinear regression hyperplane that can fit the input vector and output vector of the training samples, and the nonlinear regression function is expressed as

f (x) = w^{T} \cdot φ (x) + b

(2)

where w is the weight vector, $φ (\cdot)$ is the nonlinear mapping function, and b is the bias.

In order to obtain w and b, the training error of MSVR is defined as equation (3)

ξ (f (x) - y, x) = \begin{cases} 0, & | y - f (x) | \leq ε \\ | y - f (x) | - ε, & \text{otherwise} \end{cases}

(3)

where $f (x)$ and y are the predicted value and the observed value of P_BCP, respectively, and $ε$ is the insensitivity.

Next, a convex quadratic optimization problem is constructed:

Γ (w) = \frac{1}{2} ‖ w ‖^{2} + C \sum_{i = 1}^{n} ξ (f (x_{i}) - y_{i}, x_{i})

(4)

where $\frac{1}{2} ‖ w ‖^{2}$ is the regularization term, which is used to reduce the complexity of the model and reduce over-fitting, $\sum_{i = 1}^{n} ξ (f (x_{i}) - y_{i}, x_{i})$ is the empirical risk term, C is the penalty factor, which is used to balance the training error and the generalization ability.
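To make equations (3) and (4) concrete, the sketch below evaluates the ε-insensitive loss and the regularized objective for a given weight vector and a set of predictions. The function names and the numbers in the usage lines are hypothetical and serve only to illustrate the arithmetic.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Equation (3): zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def regularized_risk(w, y_true, y_pred, C, eps):
    """Equation (4): 0.5 * ||w||^2 plus C times the summed eps-insensitive loss."""
    regularization = 0.5 * sum(wi * wi for wi in w)
    empirical = sum(eps_insensitive_loss(t, p, eps)
                    for t, p in zip(y_true, y_pred))
    return regularization + C * empirical


# Illustrative numbers: only the second residual (3.0) exceeds eps = 1.0
risk = regularized_risk(w=[3.0, 4.0],
                        y_true=[0.0, 0.0],
                        y_pred=[0.5, 3.0],
                        C=10.0, eps=1.0)
# 0.5 * 25 + 10 * (0 + 2) = 32.5
```

Residuals smaller than ε contribute nothing to the objective, which is why only a subset of the training samples ends up as support vectors.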

Then, slack variables $ξ_{i}$ and $ξ_{i}^{*}$ are introduced, and equation (4) is transformed into a constrained quadratic optimization problem. The optimization objective is as shown in equation (5)

\begin{aligned} \min_{w, b, ξ, ξ^{*}} \quad & \frac{1}{2} ‖ w ‖^{2} + C \sum_{i = 1}^{n} (ξ_{i} + ξ_{i}^{*}) \\ \text{s.t.} \quad & w^{T} \cdot φ (x_{i}) + b - y_{i} \leq ε + ξ_{i}, \\ & y_{i} - w^{T} \cdot φ (x_{i}) - b \leq ε + ξ_{i}^{*}, \\ & ξ_{i} \geq 0, \; ξ_{i}^{*} \geq 0, \quad i = 1, 2, \dots, n \end{aligned}

(5)

By introducing Lagrangian multipliers $α_{i}$ , $α_{i}^{*}$ , $β_{i}$ , $β_{i}^{*}$ , the Lagrangian function is constructed as

\begin{aligned} L (w, b, ξ, ξ^{*}, α_{i}, α_{i}^{*}, β_{i}, β_{i}^{*}) = {} & \frac{1}{2} ‖ w ‖^{2} + C \sum_{i = 1}^{n} (ξ_{i} + ξ_{i}^{*}) \\ & - \sum_{i = 1}^{n} α_{i} [ε + ξ_{i} + y_{i} - (w^{T} \cdot φ (x_{i}) + b)] \\ & - \sum_{i = 1}^{n} α_{i}^{*} [ε + ξ_{i}^{*} - y_{i} + (w^{T} \cdot φ (x_{i}) + b)] \\ & - \sum_{i = 1}^{n} (β_{i} ξ_{i} + β_{i}^{*} ξ_{i}^{*}) \end{aligned}

(6)

In order to satisfy the minimization of the original variables w, b, $ξ_{i}$ , $ξ_{i}^{*}$ , and the maximization of the dual variables $α_{i}$ , $α_{i}^{*}$ , $β_{i}$ , $β_{i}^{*}$ , according to the KKT complementarity condition,³⁰ the equation (6) is transformed into the dual Lagrangian function with $α_{i}$ and $α_{i}^{*}$ .

\begin{aligned} \max_{α, α^{*}} \quad & \sum_{i = 1}^{n} y_{i} (α_{i}^{*} - α_{i}) - ε \sum_{i = 1}^{n} (α_{i}^{*} + α_{i}) - \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} (α_{i}^{*} - α_{i}) (α_{j}^{*} - α_{j}) (φ^{T} (x_{i}) \cdot φ (x_{j})) \\ \text{s.t.} \quad & \sum_{i = 1}^{n} (α_{i}^{*} - α_{i}) = 0, \\ & 0 \leq α_{i} \leq C, \; 0 \leq α_{i}^{*} \leq C \end{aligned}

(7)
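For reference, the step from equation (6) to equation (7) can be spelled out through the stationarity conditions of the Lagrangian with respect to the original variables. The following is a standard derivation written in the notation of equation (6); it is offered as a reading aid rather than as part of the original formulation.

```latex
\begin{aligned}
\frac{\partial L}{\partial w} = 0 \;&\Rightarrow\; w = \sum_{i = 1}^{n} (α_{i}^{*} - α_{i}) φ (x_{i}) \\
\frac{\partial L}{\partial b} = 0 \;&\Rightarrow\; \sum_{i = 1}^{n} (α_{i}^{*} - α_{i}) = 0 \\
\frac{\partial L}{\partial ξ_{i}} = 0 \;&\Rightarrow\; C - α_{i} - β_{i} = 0 \\
\frac{\partial L}{\partial ξ_{i}^{*}} = 0 \;&\Rightarrow\; C - α_{i}^{*} - β_{i}^{*} = 0
\end{aligned}
```

Because $β_{i} \geq 0$ and $β_{i}^{*} \geq 0$ , the last two conditions give the box constraints $0 \leq α_{i} \leq C$ and $0 \leq α_{i}^{*} \leq C$ , and substituting the expression for w into equation (6) eliminates w, b, $ξ_{i}$ and $ξ_{i}^{*}$ , leaving the dual problem in $α_{i}$ and $α_{i}^{*}$ only.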

The nonlinear regression function is obtained as

f (x) = \sum_{i = 1}^{n} (α_{i}^{*} - α_{i}) [φ^{T} (x_{i}) \cdot φ (x)] + b

(8)

Because computing the inner product in the high-dimensional feature space explicitly is costly, $φ^{T} (x_{i}) \cdot φ (x)$ in equation (8) is replaced by the kernel function $K (x_{i}, x)$ , and the nonlinear regression function is expressed as

f (x) = \sum_{i = 1}^{n} (α_{i}^{*} - α_{i}) K (x_{i}, x) + b

(9)

Improved multivariate support vector regression based on data augmentation

Although the MSVR-based prognosis model has strong nonlinear mapping ability, it only considers the correspondence between the input vector and the output vector at the same time instant during model training. In fact, according to the Takens delay embedding theorem,³¹ there is a functional relationship between the future value of a time series and its previous $m_{q}$ values. That is, given a time series $X_{n} = (x_{1}, x_{2}, \dots, x_{n})$ , the known data at and before time $t$ can be used to predict the data at time $t + 1$ , which is expressed as

{\hat{x}}_{t + 1} = f (x_{t}, x_{t - 1}, \dots, x_{t - m_{q} + 1})

(10)

where $m_{q}$ is the embedding dimension.

Based on the above ideas, in order to make full use of the acquired data for prognosis, data augmentation was carried out for the training set and testing set, and an improved multivariate support vector regression (IMSVR) algorithm was proposed. Specifically, different from utilizing the input vector and output vector directly, new input vector X and output vector Y were constructed according to equations (11) and (12).

X = \begin{bmatrix} {\bar{x}}_{m + 1} \\ {\bar{x}}_{m + 2} \\ \vdots \\ {\bar{x}}_{n} \end{bmatrix} = \begin{bmatrix} y_{1} & y_{2} & \cdots & y_{m} & x_{m + 1} \\ y_{2} & y_{3} & \cdots & y_{m + 1} & x_{m + 2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ y_{n - m} & y_{n - m + 1} & \cdots & y_{n - 1} & x_{n} \end{bmatrix}

(11)

Y = \begin{bmatrix} y_{m + 1} \\ y_{m + 2} \\ \vdots \\ y_{n} \end{bmatrix}

(12)

The nonlinear regression function of IMSVR is obtained as

\begin{matrix} x_{t + 1} = f ({\bar{x}}_{t}) = \sum_{i = 1}^{n - m} (α_{i}^{*} - α_{i}) K ({\bar{x}}_{i}, {\bar{x}}_{t}) + b, \\ t = m + 1, m + 2, \dots, n \end{matrix}

(13)

Thus, the IMSVR-based prognosis model is expressed as

{\hat{x}}_{n + 1} = \sum_{i = 1}^{n - m} (α_{i}^{*} - α_{i}) K ({\bar{x}}_{i}, {\bar{x}}_{n + 1}) + b

(14)

{\bar{x}}_{n + 1} = \{y_{n - m + 1}, y_{n - m + 2}, \dots, y_{n}, x_{n + 1}\}

(15)

where ${\hat{x}}_{n + 1}$ is the predicted value at point n + 1.
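The construction in equations (11), (12) and (15) can be sketched in a few lines of Python. The helper names `augment` and `next_input` are hypothetical, and the window length m = 2 and the sample values in the usage lines are chosen only for illustration; the paper does not state the value of m used in the case study.

```python
def augment(x, y, m):
    """Build the augmented set (X, Y) of equations (11) and (12).

    x: list of exogenous input rows at times 1..n (each row a list,
       e.g. [P_SR, I_EP]); y: list of outputs at times 1..n;
    m: number of past outputs appended to each input row.
    """
    n = len(y)
    if len(x) != n:
        raise ValueError("x and y must have the same length")
    if not 0 < m < n:
        raise ValueError("m must satisfy 0 < m < n")
    X, Y = [], []
    for t in range(m, n):                        # 0-based t is time t + 1
        X.append(list(y[t - m:t]) + list(x[t]))  # y_{t-m+1..t}, then x_{t+1}
        Y.append(y[t])                           # target y_{t+1}
    return X, Y


def next_input(y, x_next, m):
    """Equation (15): last m outputs followed by the new exogenous input."""
    return list(y[-m:]) + list(x_next)


# Illustrative series with one exogenous variable and m = 2
x = [[10], [20], [30], [40], [50]]
y = [1, 2, 3, 4, 5]
X, Y = augment(x, y, m=2)
# X == [[1, 2, 30], [2, 3, 40], [3, 4, 50]] and Y == [3, 4, 5]
```

Each augmented row therefore carries the recent history of the output alongside the current inputs, which is how the continuity of the time series enters the regression.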

Model parameter optimization and evaluation indicator

Parameter optimization of prognosis model

Kernel function has a great impact on the accuracy of prognosis model, and kernel functions mainly include d-order polynomial, sigmoid and radial basis function (RBF) kernel functions. Among them, RBF kernel function has the advantage that the center of each basis function corresponds to a support vector. Compared with the d-order polynomial kernel function, there are fewer parameters involved in the operation, which reduces the complexity of model. Moreover, the sigmoid kernel function may be invalid when taking some parameter values, so RBF kernel function is selected, which is expressed as

K (x_{i}, x_{j}) = e^{- γ {‖ x_{i} - x_{j} ‖}^{2}}

(16)

where γ is the kernel parameter.
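As a small numerical illustration of equations (9) and (16), the sketch below evaluates the RBF kernel and the resulting kernel expansion for a query point. The names `rbf_kernel` and `svr_predict`, the support vectors and the coefficients are hypothetical; in practice the coefficients $α_{i}^{*} - α_{i}$ and the bias b would come from solving the dual problem.

```python
import math


def rbf_kernel(xi, xj, gamma):
    """Equation (16): exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, support_vectors, dual_coefs, b, gamma):
    """Equation (9): kernel expansion over the training points.

    dual_coefs[i] stands for (alpha_i^* - alpha_i).
    """
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + b


# Illustrative values only
support_vectors = [[0.0, 0.0], [3.0, 4.0]]
dual_coefs = [2.0, -1.0]
prediction = svr_predict([0.0, 0.0], support_vectors, dual_coefs,
                         b=0.5, gamma=0.01)
# equals 2.0 * 1 - exp(-0.25) + 0.5
```

A smaller γ makes the kernel decay more slowly with distance, so each training point influences a wider region of the input space.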

In order to further improve the accuracy of the prognosis model, it is necessary to reasonably select the kernel parameter γ and the penalty factor C in the modeling process. By comparing Cross Validation (CV), Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), it was found that PSO algorithm has better optimization effect,^32–34 and it has been applied in the railway domain in recent years,³⁵ so PSO algorithm was chosen to optimize the model parameters.

The basic principle of PSO algorithm is to find the optimal solution for an individual in the population through competition or cooperation, which has the advantages of simple algorithm, easy implementation and strong global convergence.³⁶ The implementation steps of PSO algorithm are as follows.

Firstly, a set of random values is initialized as a swarm of particles. The particles have two properties: velocity and position, and the velocity and position of the ith particle in s-dimensional space are denoted as $v_{i} = (v_{i 1}, v_{i 2}, \cdot \cdot \cdot, v_{i s})$ and $x_{i} = (x_{i 1}, x_{i 2}, \cdot \cdot \cdot, x_{i s})$ , respectively.

Secondly, the particles search for the optimal solution iteratively. During each iteration, the best position found so far by each particle is denoted as pbest and the best position found by the whole particle swarm is denoted as gbest; these are called the personal best and the global best, respectively. All particles in the swarm update their velocity and position by using pbest and gbest. The updating equations for particle velocity and position are expressed as³⁴

\begin{cases} v_{i} (l + 1) = w_{v} \cdot v_{i} (l) + c_{1} \cdot r_{1} \cdot [pbest_{i} (l) - x_{i} (l)] + c_{2} \cdot r_{2} \cdot [gbest_{i} (l) - x_{i} (l)] \\ x_{i} (l + 1) = x_{i} (l) + w_{p} \cdot v_{i} (l + 1) \end{cases}

(17)

where $v_{i} (l)$ and $x_{i} (l)$ are the velocity and position of the ith particle at the lth iteration, l is the iteration index, w_v and w_p are the inertia weights, r₁ and r₂ are random numbers in (0, 1), and c₁ and c₂ are learning factors.

After several iterations, the personal best, global best and particle swarm velocity values are continuously updated until the termination condition is reached, completing the selection of the optimal parameters of γ and C.
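The update rule in equation (17) can be sketched as a small, self-contained Python routine. This is a simplified illustration under stated assumptions rather than the implementation used in the paper: the function `pso_minimize` and the objective `toy_error` are hypothetical, positions are clamped to the search bounds, the global best is updated as soon as a particle improves on it, and the default inertia weight of 0.7 is an assumed value. In a real application the objective would be the validation error of the prognosis model for a candidate pair (γ, C).

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=200,
                 w_v=0.7, w_p=1.0, c1=1.5, c2=1.7, seed=0):
    """Minimize objective(position) with the update rule of equation (17)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]              # best value after each iteration
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w_v * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Assumption: keep particles inside the search bounds
                pos[i][d] = min(hi, max(lo, pos[i][d] + w_p * vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def toy_error(params):
    """Hypothetical stand-in for a validation error over (gamma, C)."""
    gamma, C = params
    return (gamma - 0.3) ** 2 + ((C - 40.0) / 100.0) ** 2


best, best_val, history = pso_minimize(
    toy_error, bounds=[(0.001, 1.0), (0.1, 100.0)], seed=1)
```

The returned `history` is non-increasing by construction, which gives a simple way to check whether the chosen number of iterations was sufficient for the swarm to settle.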

Evaluation indicator of prognosis model

Moreover, in order to evaluate the accuracy of the prognosis model, root mean squared error (RMSE) and mean absolute error (MAE),³⁷ which are widely used in the field of time series prognosis, are selected as evaluation indicators, and the calculation formulas are expressed as

RMSE = \sqrt{\sum_{k = 1}^{N} \frac{{(y_{p} (k) - y (k))}^{2}}{N}}

(18)

MAE = \frac{1}{N} \sum_{k = 1}^{N} | y_{p} (k) - y (k) |

(19)

where $y (k)$ is the measured value of P_BCP, $y_{p} (k)$ is the predicted value, and N is the number of samples.
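Equations (18) and (19) translate directly into code. The sketch below uses hypothetical function names and made-up values to show the calculation; it is not tied to the experimental data.

```python
import math


def rmse(y_pred, y_true):
    """Equation (18): root mean squared error."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n)


def mae(y_pred, y_true):
    """Equation (19): mean absolute error."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(y_pred, y_true)) / n


# Illustrative values: errors of 3, -4, 0 and 0
predicted = [103.0, 96.0, 100.0, 100.0]
measured = [100.0, 100.0, 100.0, 100.0]
# rmse = sqrt((9 + 16) / 4) = 2.5, mae = (3 + 4) / 4 = 1.75
```

Because RMSE squares the errors before averaging, it weights large deviations more heavily than MAE, so reporting both gives a fuller picture of the error distribution.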

Case study

Experiment sample acquisition and preprocessing

Since the air brake system operates under variable conditions, performance tests under three typical operating conditions, namely service 1-7N and emergency, service full brake-relief, and stage brake-relief, were conducted on the test-rig of the brake system of a high-speed train (Figure 2) to construct the training set and testing set and thereby validate the effectiveness of the data augmentation prognosis method. According to the black box model of the air brake system, the collected signals include EP current, brake supply reservoir pressure and brake cylinder pressure, and the sampling frequency was set to 10 Hz.

Figure 2.

Test-rig of brake system of high-speed train. (a) Test site; (b) Test console.

Two issues affect the collected time series data. On the one hand, due to interference from the working environment, there is measurement noise in the data sampling process. On the other hand, the EP current, brake supply reservoir pressure and brake cylinder pressure are not of the same order of magnitude. Using the original data directly would reduce the accuracy of the prognosis model, and could even cause the model to fail. Therefore, in order to eliminate the influence of noise and magnitude, five-point moving average filtering was applied to the collected original data. Then, the [0, 1] interval normalization method was used to normalize the filtered data, and the conversion formula is expressed as

x^{*} = \frac{x - x_{min}}{x_{max} - x_{min}}

(20)

where x is the original data, $x_{\min}$ is the minimum value of the original data, $x_{\max}$ is the maximum value of the original data, and $x^{*}$ is the normalized data.
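The two preprocessing steps can be sketched as follows. The function names are hypothetical, and the moving average is assumed here to use a centered window that is truncated at the ends of the series, since the text does not specify how the boundaries were handled.

```python
def moving_average(x, window=5):
    """Centered moving average; the window is truncated at the edges
    (an assumption, the boundary handling is not given in the text)."""
    half = window // 2
    smoothed = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        segment = x[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def min_max_normalize(x):
    """Equation (20): scale a series to the [0, 1] interval."""
    lo, hi = min(x), max(x)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in x]


# Illustrative series
raw = [1, 2, 3, 4, 5, 6, 7]
filtered = moving_average(raw)        # [2.0, 2.5, 3.0, 4.0, 5.0, 5.5, 6.0]
scaled = min_max_normalize([2, 4, 6])  # [0.0, 0.5, 1.0]
```

When a trained model is applied to new data, the minimum and maximum obtained from the training data would normally be reused so that all signals stay on a consistent scale.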

Figure 3 shows the performance test curves of the EP current, brake supply reservoir pressure and brake cylinder pressure under three typical operating conditions, respectively.

Figure 3.

Performance test curves of the EP current, brake supply reservoir pressure and brake cylinder pressure. (a) Service 1-7N and emergency; (b) Service full brake-relief; (c) Stage brake-relief.

Verification work

Correlation analysis results

The correlations among the EP current, brake supply reservoir pressure and brake cylinder pressure under the three typical operating conditions are shown in Tables 1 to 3, respectively.

Table 1.

Correlation analysis results under service 1-7N and emergency operating conditions.

		EP current	Brake supply reservoir pressure	Brake cylinder pressure
EP current	Spearman correlation coefficient	1	−0.7536	0.9892
EP current	Significance	0**	0**	0**
Brake supply reservoir pressure	Spearman correlation coefficient	−0.7536	1	−0.7695
Brake supply reservoir pressure	Significance	0**	0**	0**
Brake cylinder pressure	Spearman correlation coefficient	0.9892	−0.7695	1
Brake cylinder pressure	Significance	0**	0**	0**

** in Table 1 means that there is an extremely significant correlation.

Table 2.

Correlation test results under service full brake-relief operating conditions.

		EP current	Brake supply reservoir pressure	Brake cylinder pressure
EP current	Spearman correlation coefficient	1	−0.6425	0.7923
EP current	Significance	0**	0**	0**
Brake supply reservoir pressure	Spearman correlation coefficient	−0.6425	1	−0.8519
Brake supply reservoir pressure	Significance	0**	0**	0**
Brake cylinder pressure	Spearman correlation coefficient	0.7923	−0.8519	1
Brake cylinder pressure	Significance	0**	0**	0**

** means that there is an extremely significant correlation.

Table 3.

Correlation test results under stage brake-relief operating conditions.

		EP current	Brake supply reservoir pressure	Brake cylinder pressure
EP current	Spearman correlation coefficient	1	−0.8107	0.9799
EP current	Significance	0**	0**	0**
Brake supply reservoir pressure	Spearman correlation coefficient	−0.8107	1	−0.8790
Brake supply reservoir pressure	Significance	0**	0**	0**
Brake cylinder pressure	Spearman correlation coefficient	0.9799	−0.8790	1
Brake cylinder pressure	Significance	0**	0**	0**

** means that there is an extremely significant correlation.

It can be seen from the results with ** in Table 1 to Table 3 that there is an extremely significant correlation among EP current, brake supply reservoir pressure and brake cylinder pressure. Therefore, the hypothesis that the brake cylinder pressure is related to the brake supply reservoir pressure and EP current could be validated, which further demonstrated the validity and feasibility of the black box model of air brake system developed in this paper.

Prognosis results

Prognosis model based on IMSVR

Considering the variable operating conditions of air brake system, in order to make the prognosis model have sufficient generalization capability, the performance test data under service 1-7N and emergency operating conditions shown in Figure 3(a) were selected to construct the training set and used for the training of IMSVR-based prognosis model. The values of γ and C were optimized by PSO algorithm.

The key parameters of the PSO algorithm were set as follows: c₁ is 1.5, c₂ is 1.7, w_v is 1, w_p is 1, the number of termination generations is 200, and the population size is 20. The optimal parameters for the service 1-7N and emergency operating conditions were γ = 0.01 and C = 46.22. The prognosis model was trained using the constructed training set with the optimal parameters, and the obtained prognosis model was used to predict the testing set. For analysis purposes, the testing set was aligned with the training set, and the predicted and measured brake cylinder pressure time series curves under the service 1-7N and emergency operating conditions were obtained as shown in Figure 4.

Figure 4.

Prognosis results under service 1-7N and emergency operating conditions. (a) Comparison curves between the predicted and measured values; (b) Prognosis error.

As can be seen in Figure 4, the predicted values of brake cylinder pressure under the service 1-7N and emergency operating conditions follow the measured values well, with the errors within ±5 kPa except at the moments of stage switching. In addition, the RMSE and MAE of the IMSVR-based prognosis model were 3.0160 and 2.8372, respectively, which shows that the performance prognosis model of the air brake system based on the proposed method has high accuracy.

Comparison results of prognosis model based on MSVR

In order to further validate the accuracy of the developed performance prognosis model, the conventional MSVR-based model without data augmentation was used to predict the time series of brake cylinder pressure under the service 1-7N and emergency operating conditions. The optimal parameters were γ = 0.01 and C = 6.94. The predicted and measured values of the brake cylinder pressure under the service 1-7N and emergency operating conditions are shown in Figure 5.

Figure 5.

Comparison prognosis results under service 1-7N and emergency operating conditions. (a) Comparison curves between the predicted and measured values; (b) Prognosis error.

As can be seen in Figure 5, the predicted values of the brake cylinder pressure time series of MSVR-based prognosis model follow the measured values less well than those of IMSVR-based prognosis model. In addition, the evaluation indicators of the MSVR-based prognosis model were calculated. The RMSE and MAE of the MSVR-based prognosis model were higher than those of the IMSVR-based prognosis model, with the RMSE being 68.12% higher and the MAE being 51.23% higher. Therefore, the prediction accuracy of the IMSVR-based prognosis model was significantly better than that of the MSVR-based prognosis model, and it also proved that the data augmentation method could effectively improve the accuracy of the performance prognosis for air brake system.

Calibration results

In order to verify the generalization performance of the IMSVR-based prognosis model, the performance test data shown in Figure 3(b) and (c) were selected as the calibration signals. The comparison curves of the predicted and measured brake cylinder pressure time series and the error curves under the service full brake-relief and stage brake-relief operating conditions are shown in Figures 6 and 7, respectively.

Figure 6.

Calibration results under service full brake-relief. (a) Comparison curves between the predicted and measured values; (b) Calibration error.

Figure 7.

Calibration results under stage brake-relief. (a) Comparison curves between the predicted and measured values; (b) Calibration error.

As can be seen in Figures 6 and 7, the predicted and measured values of the brake cylinder pressure time series under both service full brake-relief and stage brake-relief operating conditions were in good agreement, and the calibration errors, with the exception of the stage switching moments, were basically within ±5 kPa. In addition, the RMSE and MAE of the IMSVR-based prognosis model were 3.1507 and 2.4872 for the service full brake-relief calibration signal, and 1.3066 and 0.8855 for the stage brake-relief calibration signal, respectively.

It is obvious that the performance prognosis model of air brake system has a high calibration accuracy under both service full brake-relief and stage brake-relief calibration signals, indicating that the data augmentation method proposed in this paper has good generalization performance.

Conclusion

This paper investigated a performance prognosis method for the air brake system of in-service trains. Compared with traditional prognosis strategies, the main improvements are that the method is accurate, with errors within ±5 kPa outside the stage switching moments in the case studies, and that it is suitable in practice for systems with complex structure and time-varying nonlinear characteristics.

(1) In view of the complex structure of air brake system, brake cylinder pressure was extracted as the performance identification signal, and black box model of air brake system was constructed. Correlation analysis tests were conducted using Spearman’s correlation coefficient, and the validity of the black box model was verified by the performance test data of air brake system.

(2) To address the time-varying nonlinear characteristics of brake cylinder pressure time series, as well as to make full use of obtained data, a data augmentation prognosis model was developed based on the IMSVR algorithm, and PSO algorithm was used to optimize the parameters to improve the accuracy of the prognosis model.

(3) The performance tests of air brake system under different operating conditions were carried out, and the analysis results of test data showed that the modeling error of the prognosis model established by using the performance test data under different stages was within ±5 kPa, and the calibration errors under two typical operating conditions of service full brake-relief and stage brake-relief were also within ±5 kPa, which demonstrated the validity and applicability of the proposed method.

By comparing the prognosis results with those based on the MSVR algorithm, it is further verified that the proposed method has higher accuracy. The method can be applied to extend the data and provide a reference for monitoring the service status of air brake system. In the future, fault warning algorithm of air brake system can be designed by comparing the predicted and measured values.

Footnotes

Handling Editor: Chenhui Liang

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (Grant No. 62273258), Science and Technology Research and Development Programme Topics of China State Railway Group Co., Ltd. (Grant No. N2022J019, Grant No. N2022J037-A), and Shanghai Collaborative Innovation Research Center for Multi-network & Multi-model Rail Transit.

ORCID iDs

Jingxian Ding

Jianyong Zuo

References

1. Fumeo, Oneto, Anguita. Condition based maintenance in railway transportation systems based on Big Data Streaming analysis. Procedia Comput Sci 2015; 53: 437–446.

2. Bai, et al. Tracking research on service performance of CRH₃ EMU in Wuhan-Guangzhou passenger dedicated line. Railway Locomotive and Motor Car 2018; 1: 37–40.

3. Tian, Sha, Shi, et al. The safety evaluation method for service of traction system on multiple units and research on standards. Railw Veh 2015; 53: 21–24.

4. Liu, Pan, Zuo, et al. A review on fault diagnosis for rail vehicles. J Mech Eng 2016; 52: 134–146.

5. Ding, Zuo. Performance degradation prognosis based on relative characteristic and long short-term memory network for components of brake systems of in-service trains. Appl Sci 2022; 12: 11725.

6. Liu, Zio. A SVM framework for fault detection of the braking system in a high speed train. Mech Syst Signal Process 2017; 87: 401–409.

7. Liu, Zio. A scalable fuzzy support vector machine for fault detection in transportation systems. Expert Syst Appl 2018; 102: 36–43.

8. Niu, Zhao. Bond graph model based fault detection and isolation for locomotive brake. J Tongji Univ 2015; 43: 894–899.

9. Zuo, Ding, et al. Performance degradation monitoring based on data fusion method for in-service train pneumatic brake system. Proc IMechE, Part C: J Mechanical Engineering Science 2019; 233: 1924–1938.

10. Fan, Gao, et al. A data-based approach for sensor fault detection and diagnosis of electro-pneumatic brake. In: 2019 IEEE international conference on prognostics and health management, San Francisco, CA, USA, 17–20 June 2019.

11. Zuo, Ding, Feng. Latent leakage fault identification and diagnosis based on multi-source information fusion method for key pneumatic units in Chinese standard electric multiple units (EMU) braking system. Appl Sci 2019; 9: 300.

12. Zhou, et al. Fault detection and isolation of the brake cylinder system for electric multiple units. IEEE Trans Control Syst Technol 2018; 26: 1744–1757.

13. Seo, et al. Solenoid valve diagnosis for railway braking systems with embedded sensor signals and physical interpretation. In: Proceedings of the annual conference of the prognostics and health management society, 2016, pp.337–343.

14. Lei, Gontarz, et al. A model-based method for remaining useful life prediction of machinery. IEEE Trans Reliab 2016; 65: 1314–1326.

15. Kim, Choi JH. Practical options for selecting data-driven or physics-based prognostics algorithms with reviews. Reliab Eng Syst Saf 2015; 133: 223–236.

16. Baraldi, Cadini, Mangili, et al. Model-based and data-driven prognostics under different available information. Probab Eng Mech 2013; 32: 66–79.

17. Lei, Gebraeel, et al. Multi-sensor data-driven remaining useful life prediction of semi-observable systems. IEEE Trans Ind Electron 2021; 68: 11482–11491.

18. Zhao, Jiang, Long. Remaining useful life estimation of mechanical systems based on the data-driven method and Bayesian theory. J Mech Eng 2018; 54: 115–124.

19. You, Park. Real-time state-of-health estimation for electric vehicle batteries: a data-driven approach. Appl Energy 2016; 176: 92–103.

20. Shen, Chen, et al. Remaining life predictions of rolling bearing based on relative features and multivariable support vector machine. J Mech Eng 2013; 49: 183–189.

21. Huang, Jiang, et al. Prediction of groundwater levels using evidence of chaos and support vector machine. J Hydroinformatics 2017; 19: 586–606.

22. Zhao, Tao, Ding, et al. A dynamic particle filter-support vector regression method for reliability prediction. Reliab Eng Syst Saf 2013; 119: 109–116.

23. Zhao, Tao, Zio. System reliability prediction by support vector regression with analytic selection and genetic algorithm parameters selection. Appl Soft Comput 2015; 30: 792–802.

24. Liu, Qiao, Zhao, et al. Using the AR–SVR–CPSO hybrid model to forecast vibration signals in a high-speed train transmission system. Proc IMechE, Part F: J Rail and Rapid Transit 2019; 233: 701–714.

25. Zuo, Ding, Liu, et al. A virtual prototype for performance analysis of electropneumatic brake on metro trains. Adv Mech Eng 2020; 12: 1687814020926275.

26. Zuo, Ding. Research progress on intelligent control and maintenance technology of railway vehicle braking system. J Traffic Transp Eng 2021; 21: 50–62.

27. Bunge. A general black box theory. Philos Sci 1963; 30: 346–358.

28. Spearman. The proof and measurement of association between two things. Int J Epidemiol 2010; 39: 1137–1150.

29. Cortes, Vapnik. Support-vector networks. Mach Learn 1995; 20: 273–297.

30. Moura MDC, Zio, Lins, et al. Failure and reliability prediction by support vector machines regression of time series data. Reliab Eng Syst Saf 2011; 96: 1527–1534.

31. Takens. Detecting strange attractors in turbulence. Lect Notes Math 1981; 898: 366–381.

32. Lan. Forecasting performance of support vector machine for the Poyang Lake’s water level. Water Sci Technol 2014; 70: 1488–1495.

33. Chen. Forecasting systems reliability based on support vector regression with genetic algorithms. Reliab Eng Syst Saf 2007; 92: 423–432.

34. Chang, Hsu, Lin, et al. Query-based learning for dynamic particle swarm optimization. IEEE Access 2017; 5: 7648–7658.

35. Cole, McSweeney. Applications of particle swarm optimization in the railway domain. Int J Rail Transp 2016; 4: 167–190.

36. Kennedy, Eberhart RC. Particle swarm optimization. In: Proceedings of IEEE international conference on neural networks, Perth, WA, Australia, 1995, pp.1942–1948.

37. Chai, Draxler RR. Root mean square error (RMSE) or mean absolute error (MAE)? – arguments against avoiding RMSE in the literature. Geosci Model Dev 2014; 7: 1247–1250.