Optimization of LMBP high-speed railway wheel size prediction algorithm based on improved adaptive differential evolution algorithm

Abstract

Forecasting the hidden danger index of a wheel-set helps the maintenance department to formulate maintenance strategies and reduce maintenance costs. Based on an analysis of the research status of wheel-set life prediction at home and abroad and of the repair of wheel-set wear, this article designs and implements a wheel-set size prediction model that combines an adaptive differential evolution algorithm with a Levenberg–Marquardt back propagation (LMBP) neural network. Because the back propagation neural network easily falls into local extrema, the back propagation algorithm is improved with the Levenberg–Marquardt numerical optimization algorithm. Because random initialization of the connection weights and thresholds also makes the network prone to local extrema, the differential evolution algorithm is used to optimize the initial connection weights and thresholds between the layers of the neural network. To speed up the search for the optimal initial weights and thresholds of the differential evolution LMBP neural network and to further optimize the initial values, an adaptive differential evolution algorithm LMBP (ADE-LMBP) wheel-set size prediction model is designed and implemented. Comparison with the LMBP, differential evolution LMBP, and long short-term memory models shows that the proposed ADE-LMBP model is effective and significantly improves the prediction accuracy.

Keywords

Wheel-set size prediction, neural network model, Levenberg–Marquardt algorithm, differential evolution algorithms

Introduction

With the increasing mileage of high-speed railways in China, railway train safety inspection is becoming more and more important. Any minor fault in the train components may affect the safety of high-speed trains and even cause major safety accidents. The wheel-set is one of the three major consumable parts of the train, and the wear condition of the rim and the tread is a key factor affecting the safe and stable operation of the train, the ride comfort, and the service life of the wheel and track.¹,² Because the wheel-set wears continuously during operation, the expense of repairing or replacing it in time is an important component of the train maintenance cost.³ Therefore, on the basis of the historical state of the wheel-set, predicting its hidden danger index helps the railway management department to formulate effective maintenance measures in time and to improve the operational safety, reliability, and economy of the train.

In railway transportation, wheel-set wear of railway vehicles occurs widely at all kinds of sites and has been one of the main research directions in recent years. At present, research on railway wheel-set state prediction at home and abroad falls mainly into two categories. The first is based on the vehicle–track system dynamics model, the wheel–rail local contact model, and the local wear mechanism model of the wheel–rail material, involving physical quantities such as wear and creep distance, normal force, and material hardness; the wheel wear profile is then analyzed by numerical simulation.⁴⁻⁷ The second analyzes historical wear profile data obtained from statistics, predicts the wear, and calculates the remaining service life of the wheel-set in order to propose a maintenance strategy; examples are the time series model,⁸ the support vector machine,⁹ the gray prediction algorithm,¹⁰ and the Bayesian algorithm.¹¹ With the advent of the era of data explosion, big data technology and numerical analysis methods have gradually been applied to safety prediction for various devices, and neural network algorithms are widely used among them. Gebraeel et al.¹² developed an experimental device for accelerated bearing testing to obtain bearing vibration data samples, established a bearing residual life prediction model based on a back propagation (BP) neural network, and verified its effectiveness. Wei Zhang et al.¹³ established a three-layer BP neural network for multi-stress accelerated life testing, which can effectively predict the failure time at the normal stress level and obtain the prediction curve of the reliability function. However, the traditional BP neural network converges slowly and easily falls into a local optimum.

Therefore, scholars have improved the BP algorithm or combined other algorithms with the BP neural network to overcome these shortcomings. In terms of algorithm combination, Lixin et al.¹⁴ combined time series analysis with a BP neural network to predict the remaining life of a cooling fan and improved the prediction accuracy. He and Zhang¹⁵ combined principal component analysis (PCA) with a BP neural network, which provides a good reference for predicting the endpoint phosphorus content in the basic oxygen furnace (BOF). In terms of algorithm improvement, Zhang et al.¹⁶ used a genetic algorithm to globally optimize the weights of a BP neural network; the tool residual life results show that the prediction is better than that of the BP neural network alone. Zhaoyang Ye and Kim¹⁷ optimized the BP neural network with the Levenberg–Marquardt (LM) algorithm to predict the power consumption of buildings; the results show that the Levenberg–Marquardt back propagation (LMBP) neural network is more accurate and stable than the BP neural network. Huaixian Yin et al.¹⁸ used the particle swarm optimization algorithm to optimize the BP network and established a failure prediction model for the wheel, axle, and axle box of the urban rail train bogie, which performs better than the BP neural network.

At present, methods for predicting wheel wear based on dynamic models are mature, but most of them rely on numerical simulation and do not use field data. Few studies predict wheel wear from historical tread measurement data; where field data are used, the prediction accuracy is not high, and algorithms that combine global optimization with the training process are rare. This article studies the wheel-set life problem by extracting useful information from the change of the wheel diameter over time. A prediction model is established to predict the trend of wheel diameter change, which serves as an auxiliary tool for the maintenance department when formulating maintenance strategies. Because the BP neural network easily falls into local extrema, the LM numerical optimization algorithm is used to improve it. Because random initialization of the connection weights and thresholds also makes the BP neural network prone to local extrema, the differential evolution (DE) algorithm is used to optimize the initial connection weights and thresholds between the layers of the neural network. To speed up the search for the optimal initial weights and thresholds of the differential evolution algorithm Levenberg–Marquardt back propagation (DE-LMBP) neural network and to further optimize the initial values, a wheel-set size prediction model based on the adaptive DE-LMBP algorithm is designed and implemented.

Theory

LMBP neural network algorithm

A BP neural network is a multilayer feedforward neural network that consists of an input layer, one or more hidden layers, and an output layer. For a single hidden layer BP neural network, the input layer contains $m$ nodes, the hidden layer contains $n$ nodes, and the output layer contains 1 node, as shown in Figure 1.

Figure 1.

Topological structure of a single hidden layer BP network.

The weight matrix from the input layer to the hidden layer of the BP network is denoted $W$, the weight matrix from the hidden layer to the output layer is denoted $V$, the threshold vector of the hidden layer is denoted $A$, and the threshold of the output layer is denoted $s$. They are expressed as follows

W = \left[ \begin{matrix} w_{11} & w_{12} & \dots & w_{1 n} \\ w_{21} & w_{22} & \dots & w_{2 n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{m 1} & w_{m 2} & \dots & w_{m n} \end{matrix} \right]

V = [\begin{matrix} v_{1} & v_{2} & \dots & v_{n} \end{matrix}]

A = {[\begin{matrix} a_{1} & a_{2} & \dots & a_{n} \end{matrix}]}^{T}

The standard BP algorithm minimizes the sum of squares of errors between the expected output vectors of training samples and the actual output vectors of the network by adjusting the weight vectors and thresholds between the connecting layers. The sum of squares of errors is the objective function that the LM algorithm needs to optimize.
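As an illustration of the objective described above, the following minimal sketch computes the output of a single hidden layer network and the sum of squared errors. It is not the authors' implementation: the function names are hypothetical, thresholds are assumed to be subtracted from the weighted sums, and the logistic sigmoid is assumed as the transfer function of both layers.

```python
import numpy as np

def sigmoid(z):
    # Logistic transfer function (assumed form of the sigmoid).
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W, V, A, s):
    # x: (m,) input, W: (m, n), A: (n,) hidden thresholds,
    # V: (n,) hidden-to-output weights, s: scalar output threshold.
    hidden = sigmoid(x @ W - A)      # assumption: thresholds are subtracted
    return sigmoid(hidden @ V - s)

def sum_squared_error(X, y, W, V, A, s):
    # Objective minimized during training: sum over samples of (expected - actual)^2.
    pred = np.array([forward(x, W, V, A, s) for x in X])
    return float(np.sum((np.asarray(y) - pred) ** 2))

# Tiny check with all-zero parameters: every output is sigmoid(0) = 0.5.
m, n = 3, 4
W0, V0, A0, s0 = np.zeros((m, n)), np.zeros(n), np.zeros(n), 0.0
out0 = forward(np.ones(m), W0, V0, A0, s0)
err0 = sum_squared_error(np.ones((2, m)), [0.5, 0.5], W0, V0, A0, s0)
```

With all parameters set to zero the network returns 0.5 for any input, so the error against targets of 0.5 is zero, which makes the sketch easy to check by hand.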

The LM algorithm is used to adjust the weights and thresholds of BP neural network. The formula is as follows

\begin{matrix} Δ W_{k} = W_{k + 1} - W_{k} \\ = - {[J^{T} (W_{k}) J (W_{k}) + α_{k} I]}^{- 1} J^{T} (W_{k}) e (W_{k}) \end{matrix}

(1)

where $W_{k}$ is the weight vector at the $k$th iteration, $α_{k}$ is a constant greater than zero (the adjustment factor) that controls the iteration of the LM algorithm, $I$ is the identity matrix, $e (W_{k})$ is the error vector, and $J (W_{k})$ is the Jacobian matrix of the error vector with respect to the weights.

During network training, as the number of iterations increases and $α_{k}$ approaches zero, the LM algorithm approaches the Gauss–Newton method, which converges faster than the BP algorithm based on gradient descent and is faster and more accurate when the error is close to its minimum. When $α_{k}$ is large, the update approaches a gradient descent step with a small step size. The algorithm therefore provides a compromise between the speed of the Gauss–Newton method and the guaranteed convergence of the gradient descent method.
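The update in equation (1) can be sketched in a few lines of NumPy. This is a minimal illustration on a toy linear least-squares problem, not the training code used in the article; the function names and the damping schedule (divide $α_{k}$ by 10 after a successful step, multiply it by 10 otherwise) are assumptions.

```python
import numpy as np

def lm_step(w, residual_fn, jacobian_fn, alpha):
    # Weight increment of equation (1): -(J^T J + alpha I)^(-1) J^T e.
    e = residual_fn(w)                     # error vector e(W_k)
    J = jacobian_fn(w)                     # Jacobian of e with respect to w
    A = J.T @ J + alpha * np.eye(w.size)
    return -np.linalg.solve(A, J.T @ e)

def lm_fit(w, residual_fn, jacobian_fn, alpha=1e-2, iters=50):
    # Assumed schedule: accept a step only if it lowers the sum of squared errors.
    sse = np.sum(residual_fn(w) ** 2)
    for _ in range(iters):
        w_new = w + lm_step(w, residual_fn, jacobian_fn, alpha)
        sse_new = np.sum(residual_fn(w_new) ** 2)
        if sse_new < sse:
            w, sse, alpha = w_new, sse_new, alpha / 10.0
        else:
            alpha *= 10.0
    return w

# Toy problem (hypothetical data): recover a = 2.0 and b = 0.5 in y = a * x + b.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 0.5
residual = lambda w: w[0] * x + w[1] - y
jacobian = lambda w: np.column_stack([x, np.ones_like(x)])
w_fit = lm_fit(np.zeros(2), residual, jacobian)
```

Solving the linear system with `np.linalg.solve` avoids forming the matrix inverse explicitly, which is usually the more stable choice.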

Differential evolution algorithm

DE¹⁹ is essentially a greedy, real-coded evolutionary algorithm that preserves the best individuals. Its basic principle is as follows: two randomly selected individuals of the population form a difference vector, which is scaled and added to a third individual to generate a new (variant) individual; the parent and the corresponding variant individual are then crossed; finally, the fitter of the parent and the crossed individual is kept as the offspring.

DE mainly includes population initialization, mutation operation, crossover operation, and selection operation:

1. Population initialization

The weight matrices $W$ and $V$, the threshold vector $A$, and the threshold $s$ of the BP network are mapped to the chromosome strings of the differential evolution algorithm. The mapping relationship is as follows

R_{i} (t) = (w_{1, 1}, w_{1, 2}, \dots, w_{m, n}, v_{1}, v_{2}, \dots, v_{n}, a_{1}, a_{2}, \dots, a_{n}, s)

Let $R (t) = (R_{1} (t), R_{2} (t), \dots, R_{N} (t))$ be the population of generation $t$, in which $R_{i} (t) = (r_{i 1} (t), r_{i 2} (t), \dots, r_{il} (t))$ is the $i$th chromosome of generation $t$, $i = 1, 2, \dots, N$, $t = 1, 2, \dots, t_{\max}$, $N$ is the population size, $t_{\max}$ is the maximum number of generations, and $l$ is the chromosome length.

2. Mutation operation

The mutation operation is based on individual vector differences. Assuming that the current evolutionary individual is $R_{i} (t)$, three chromosomes $R_{p 1} (t)$, $R_{p 2} (t)$, and $R_{p 3} (t)$ with $i \neq p 1 \neq p 2 \neq p 3$ are randomly selected from the population of this generation, and $F$ is the variation factor. The difference between two of these individuals is scaled by $F$ and added to the third to obtain the variant individual $U_{i} (t + 1)$

u_{ij} (t + 1) = r_{p 1 j} (t) + F (r_{p 2 j} (t) - r_{p 3 j} (t))

(2)

3. Crossover operation

The variant individual $U_{i} (t + 1)$ and the current evolutionary individual $R_{i} (t)$ are combined by discrete crossover to generate the crossover individual $C_{i} (t + 1)$, which increases the diversity of the population

c_{ij} (t + 1) = \left\{ \begin{matrix} u_{ij} (t + 1), & rand_{ij} (0, 1) \leq CR \ or \ j = rand (i) \\ r_{ij} (t), & rand_{ij} (0, 1) > CR \ and \ j \neq rand (i) \end{matrix} \right.

(3)

where $rand_{ij} (0, 1)$ is a random number in $(0, 1)$, $CR$ is the crossover factor, and $rand (i)$ is a randomly chosen component index between 1 and $l$. This crossover strategy ensures that at least one component of $C_{i} (t + 1)$ is contributed by the corresponding component of the variant individual $U_{i} (t + 1)$.

4. Fitness function

The fitness evaluation is carried out by using the square error measure in the following form

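f (R_{i} (t)) = \frac{1}{E} = \frac{p}{\sum_{k = 1}^{p} {(y_{k} - {y'}_{k})}^{2}}

(4)

where $p$ is the number of training samples, ${y'}_{k}$ is the actual output of the network for the $k$th sample, $y_{k}$ is the corresponding expected output, and $E$ is the mean squared error over the training samples.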

5. Selection operation

The crossover individual $C_{i} (t + 1)$ and the current evolutionary individual $R_{i} (t)$ are compared by fitness, and the better one is selected greedily, that is

R_{i} (t + 1) = \left\{ \begin{matrix} R_{i} (t), & f (R_{i} (t)) > f (C_{i} (t + 1)) \\ C_{i} (t + 1), & f (R_{i} (t)) \leq f (C_{i} (t + 1)) \end{matrix} \right.

(5)

Repeat equations (2)–(5) until the termination condition (for example, the maximum number of generations $t_{\max}$) is met.
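The steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: `decode` and `de_generation` are hypothetical names, the fitness used in the demonstration is a toy stand-in for the network error of equation (4), and the small constant in its denominator only guards against division by zero.

```python
import numpy as np

def decode(r, m, n):
    # Split a chromosome into W (m x n), V (n,), A (n,) and s,
    # following the order of the mapping in step 1.
    W = r[:m * n].reshape(m, n)
    V = r[m * n:m * n + n]
    A = r[m * n + n:m * n + 2 * n]
    return W, V, A, r[-1]

def de_generation(pop, fitness, F, CR, rng):
    # One generation of equations (2), (3) and (5); larger fitness is better.
    N, l = pop.shape
    new_pop = pop.copy()
    for i in range(N):
        others = [k for k in range(N) if k != i]
        p1, p2, p3 = rng.choice(others, size=3, replace=False)
        u = pop[p1] + F * (pop[p2] - pop[p3])      # mutation, equation (2)
        mask = rng.random(l) <= CR                 # crossover, equation (3)
        mask[rng.integers(l)] = True               # forced component j = rand(i)
        c = np.where(mask, u, pop[i])
        if fitness(c) >= fitness(pop[i]):          # greedy selection, equation (5)
            new_pop[i] = c
    return new_pop

# Toy run: chromosome length for m = 3 inputs and n = 2 hidden nodes.
rng = np.random.default_rng(0)
m, n = 3, 2
l = m * n + 2 * n + 1
pop = rng.uniform(-1.0, 1.0, size=(8, l))
toy_fitness = lambda r: 1.0 / (np.mean(r ** 2) + 1e-12)   # stand-in for 1 / E
best_before = max(toy_fitness(r) for r in pop)
for _ in range(30):
    pop = de_generation(pop, toy_fitness, F=0.5, CR=0.9, rng=rng)
best_after = max(toy_fitness(r) for r in pop)
W_demo, V_demo, A_demo, s_demo = decode(np.arange(float(l)), m, n)
```

Because selection is greedy, the best fitness in the population cannot decrease from one generation to the next. In a real run, `toy_fitness` would be replaced by a function that decodes the chromosome, evaluates the network on the training samples, and returns the reciprocal of the error.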

Adaptive DE-LMBP algorithm

To speed up the search for the optimal initial weights and thresholds of the DE-LMBP neural network and to further optimize the initial values, an adaptive DE-LMBP neural network model is proposed. The difference between the two models is that the crossover probability and mutation probability are fixed values in the DE-LMBP neural network model, whereas in the adaptive DE-LMBP neural network model they are adjusted according to the individual fitness.

The key to the adaptive algorithm lies in its variation and crossover operations, that is, in the dynamic adjustment of the crossover factor $CR$ and the variation factor $F$ according to the individual fitness values. The basic idea is as follows. Let $f_{\max}$ be the highest fitness in the population of a given generation and $f_{avg}$ the average fitness of that generation. The difference between the maximum fitness and the average fitness indicates, to a certain extent, the stability of the population. The smaller the difference, the smaller the fitness differences between individuals and the more likely the population is to converge prematurely. Conversely, the larger the difference, the greater the fitness differences and the more divergent the individual characteristics. Therefore, when $f_{\max} - f_{avg}$ is small, the values of $CR$ and $F$ should be increased; when $f_{\max} - f_{avg}$ is large, the values of $CR$ and $F$ should be reduced. The calculation formulas for $CR$ and $F$ are as follows

\left\{ \begin{matrix} CR = \frac{k_{1}}{f_{\max} - f_{avg}} \\ F = \frac{k_{2}}{f_{\max} - f_{avg}} \end{matrix} \right.

(6)

Analysis of the above formula shows that the values of $CR$ and $F$ do not depend on the fitness of any individual, and all individuals have the same crossover and mutation probability, which is not conducive to the convergence of the algorithm to the global optimal solution. In addition, when the population approaches the global optimal solution, $f_{\max} - f_{avg}$ decreases, the values of $CR$ and $F$ increase, the optimal individuals are easily destroyed, and the population may not converge to the global optimal solution. To this end, the above formula is adjusted to

\left\{ \begin{matrix} CR = \frac{k_{1} (f_{\max} - f)}{f_{\max} - f_{avg}} \\ F = \frac{k_{2} (f_{\max} - f^{'})}{f_{\max} - f_{avg}} \end{matrix} \right.

(7)

where $f$ is the larger fitness value of the two individuals to be crossed and $f^{'}$ is the fitness value of the individual to be mutated.

When the crossover and mutation factors are adjusted according to formula (7), the probability of crossover and mutation is close to or equal to zero for individuals whose fitness is close to or equal to the maximum fitness. As a result, the good individuals remain almost unchanged in the early stage of evolution, although at that stage they are not necessarily the global optimal solution. The final crossover and mutation functions are therefore obtained by the following adjustment

\begin{matrix} CR = \left\{ \begin{matrix} k_{1}, & f < f_{avg} \\ k_{2} (f_{\max} - f) / (f_{\max} - f_{avg}), & f_{avg} \leq f < f_{avg} + k_{4} (f_{\max} - f_{avg}) \\ k_{3}, & f \geq f_{avg} + k_{4} (f_{\max} - f_{avg}) \end{matrix} \right. \\ F = \left\{ \begin{matrix} {k'}_{1}, & f^{'} < f_{avg} \\ {k'}_{2} (f_{\max} - f^{'}) / (f_{\max} - f_{avg}), & f_{avg} \leq f^{'} < f_{avg} + {k'}_{4} (f_{\max} - f_{avg}) \\ {k'}_{3}, & f^{'} \geq f_{avg} + {k'}_{4} (f_{\max} - f_{avg}) \end{matrix} \right. \end{matrix}

(8)

where $k_{1}, k_{2}, k_{3}, k_{4}, k'_{1}, k'_{2}, k'_{3}, k'_{4}$ are between zero and one. The method raises the crossover probability and mutation probability of individuals whose fitness is close to or equal to the maximum fitness of the population to $k_{3}$ and $k'_{3}$, and $k_{4}$ and $k'_{4}$ control how close to the maximum fitness an individual must be for this to apply. For poor individuals whose fitness is lower than the average fitness, a large crossover probability and mutation probability are uniformly adopted so that these individuals evolve toward the optimal individual.
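The piecewise rule of equation (8) has the same shape for $CR$ and $F$, so a single helper can illustrate both. The sketch below is only an illustration: the function name is hypothetical, and the constants in the example are made-up values between zero and one, not the settings used in the experiments.

```python
def adaptive_factor(f, f_max, f_avg, k1, k2, k3, k4):
    # Piecewise factor of equation (8).
    # For CR, f is the larger fitness of the two individuals to be crossed;
    # for F, pass the fitness of the individual to be mutated and k'1..k'4.
    if f < f_avg:
        return k1                      # poor individual: fixed large factor
    spread = f_max - f_avg
    if f >= f_avg + k4 * spread:
        return k3                      # near-best individual: fixed floor
    # Middle branch; spread > 0 here, otherwise the branch above would apply.
    return k2 * (f_max - f) / spread

# Hypothetical constants and population statistics.
k1, k2, k3, k4 = 0.9, 0.6, 0.3, 0.8
cr_poor = adaptive_factor(3.0, f_max=10.0, f_avg=5.0, k1=k1, k2=k2, k3=k3, k4=k4)
cr_mid = adaptive_factor(7.0, f_max=10.0, f_avg=5.0, k1=k1, k2=k2, k3=k3, k4=k4)
cr_best = adaptive_factor(9.5, f_max=10.0, f_avg=5.0, k1=k1, k2=k2, k3=k3, k4=k4)
```

With these example values, an individual with fitness 7 lies in the middle branch and receives $0.6 \times (10 - 7) / (10 - 5) = 0.36$, while individuals below the average or above the threshold $5 + 0.8 \times 5 = 9$ receive the constants $k_{1}$ and $k_{3}$.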

Evaluation index

In this article, mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and correlation coefficient R are selected as model evaluation indicators

MSE = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {y'}_{i})}^{2}

(9)

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {y'}_{i})}^{2}}

(10)

MAPE = \frac{1}{n} \sum_{i = 1}^{n} | \frac{(y_{i} - {y'}_{i})}{y_{i}} |

(11)

R = \frac{\sum_{i = 1}^{n} (y_{i} - \bar{y}) ({y'}_{i} - \bar{y'})}{\sqrt{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2} \sum_{i = 1}^{n} {({y'}_{i} - \bar{y'})}^{2}}}

(12)

where $y_{i}$ is the expected output value, $y'_{i}$ is the network output value, $\bar{y}$ and $\bar{y'}$ are the averages of the expected values and the network output values, respectively, and $n$ is the number of predicted samples. MAE is the mean of the absolute errors $|y_{i} - y'_{i}|$ over the $n$ samples. The smaller the MSE, RMSE, MAE, and MAPE values, the higher the prediction accuracy of the model; the closer the R value is to 1, the higher the prediction accuracy of the model.
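The indicators can be computed directly from equations (9)–(12). The following sketch is one possible NumPy implementation; the function name and the sample values are illustrative only, and MAE is implemented from its standard definition because no formula for it is given in the text.

```python
import numpy as np

def evaluation_indicators(y, y_pred):
    # y: expected values, y_pred: network outputs (same length).
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y - y_pred
    mse = np.mean(err ** 2)                                   # equation (9)
    dy, dp = y - y.mean(), y_pred - y_pred.mean()
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),                                 # equation (10)
        "MAE": np.mean(np.abs(err)),                          # standard definition
        "MAPE": np.mean(np.abs(err / y)),                     # equation (11)
        "R": np.sum(dy * dp) / np.sqrt(np.sum(dy ** 2) * np.sum(dp ** 2)),  # (12)
    }

# Made-up values for illustration, not measurement data.
scores = evaluation_indicators([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Note that MAPE as written in equation (11) is a fraction rather than a percentage, and it is undefined when an expected value is zero.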

Experimental procedures

The flowchart for predicting the change trend of wheel pair diameter based on improved adaptive DE-LMBP neural network is shown in Figure 2.

Figure 2.

Adaptive DE-LMBP neural network prediction process.

According to Figure 2, we have designed the following wheel-set dimensional change trend prediction process as shown in Algorithm 1.

Algorithm 1. Wheel-set size prediction

1. Denoise the selected data and normalize it.
2. Construct sample feature points: construct the input data set according to

X = (x_{i}, x_{i + 1}, \dots, x_{i + (m - 1)})

in the original sequence, and construct the output data set according to

Y = x_{i + m}

Then divide the samples into a training set and a test set according to a certain proportion.
3. Use the adaptive differential evolution algorithm to find the optimal initial weights and thresholds.
4. In the iterative process, use the LMBP algorithm to continue adjusting the weights and thresholds to get the best results.
5. Use the model to predict wheel-set size trends.
6. Calculate the model evaluation indicators.
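Step 2 of Algorithm 1 amounts to sliding a window of length $m$ over the series. A minimal sketch is given below; the function names are hypothetical, the chronological split (earliest samples for training) is an assumption, and the series is a synthetic placeholder rather than wheel diameter data.

```python
import numpy as np

def make_windows(series, m):
    # Input X_i = (x_i, ..., x_{i+m-1}), output Y_i = x_{i+m}.
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + m] for i in range(len(series) - m)])
    Y = series[m:]
    return X, Y

def chronological_split(X, Y, train_fraction):
    # Assumed split: the first part of the samples is used for training.
    cut = int(train_fraction * len(X))
    return (X[:cut], Y[:cut]), (X[cut:], Y[cut:])

# Synthetic placeholder series of 450 points with a window length of 10.
series = np.arange(450.0)
X, Y = make_windows(series, m=10)
(train_X, train_Y), (test_X, test_Y) = chronological_split(X, Y, train_fraction=0.8)
```

A series of 450 points with a window of length 10 yields 440 samples, which an 8:2 split divides into 352 training samples and 88 test samples.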

Data sample preparation

The data studied in this article come from the LY series dynamic inspection system. As shown in Figure 3, the system is installed on the operating line or on the entrance line that the electric multiple units (EMUs) pass through. When a train enters the detection area at a limited speed and triggers the detection sensor, the system enters the working state.

Figure 3.

LY series dynamic inspection system for wheel profile.

The dynamic inspection system for wheel profile uses the “light intercept image measurement technology” to measure the wheel-set dimensions online. The principle is shown in Figure 4. When a train enters the detection range, laser line sources project from both sides of the track onto the wheel surface to form an optical curve, which is captured by the CCD camera. Through real-time image acquisition, processing, and correction, the real contour curve is obtained. On this basis, numerical calculation is performed to obtain the wheel size parameters.

Figure 4.

Optical intercept image detection principle.

The research object of this article is the China Railway High-speed (CRH) wheel-set. The tread of the CRH380BL model is the S1002CN wear-type tread. Figure 5 shows the historical wheel diameter measurement data of train CRH380BL-3539 over 1.5 years. The wheel-set measurement data are affected by the measurement position, the measurement habits and manual corrections of the maintenance personnel, loads, rail conditions, and other factors. The historical measurement data therefore contain fluctuations, obvious abnormal points, and repairs. This experiment does not consider repairs.

Figure 5.

Historical wheel diameter measurement data.

The time series sample data are constructed from the wheel-set data of train CRH380BL-3539. The time interval of the series is 1 day, and 450 wheel diameter data points at equal time intervals are obtained. Because the data contain several repairs and manual corrections, the outliers in the data are corrected. The final sample data are shown in Figure 6.

Figure 6.

Data sample set.

Set network parameters

Sample feature points have to be constructed first.²⁰ The input dimension is chosen by observing the correlation coefficient. Figure 7 shows that the correlation coefficient R is largest when the input dimension is 10 and decreases gradually for input dimensions below or above 10. Therefore, the input layer is set to 10 historical wheel-set diameter values, and the output layer is the one-dimensional wheel diameter value at the next moment: the inputs are the wheel diameter values at times $t - 10, t - 9, \dots, t - 1$, and the target output is the wheel diameter value at time $t$. The number of hidden layer nodes is selected by empirical traversal; the best training effect is obtained with 10 nodes.

Figure 7.

The relationship between the input dimension and the correlation coefficient R.

The neuron transfer function of the hidden layer and the output layer of the neural network is a continuous and differentiable sigmoid function. To avoid the saturation region of the sigmoid function and to improve the convergence speed and sensitivity of the network, the sample data are normalized before network training. In this article, min–max normalization is used to map the data into the $(- 1, 1)$ interval. The ratio of the training set to the test set is 8:2.
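Min–max scaling to the range from -1 to 1 and its inverse can be sketched as follows. The function names are hypothetical, the diameter values are made up for illustration, and taking the minimum and maximum from a given reference array (for example, the training portion) is an assumption, since the text does not say which data define the range.

```python
import numpy as np

def fit_minmax(reference):
    # Range taken from a reference array (assumption: e.g. the training data).
    return float(np.min(reference)), float(np.max(reference))

def to_unit_range(x, lo, hi):
    # Map lo -> -1 and hi -> +1 linearly.
    return 2.0 * (np.asarray(x, dtype=float) - lo) / (hi - lo) - 1.0

def from_unit_range(z, lo, hi):
    # Inverse mapping, used to bring network outputs back to physical units.
    return (np.asarray(z, dtype=float) + 1.0) * (hi - lo) / 2.0 + lo

# Made-up wheel diameter values (mm), not measurement data.
diameters = np.array([920.0, 918.5, 917.0, 915.5, 914.0])
lo, hi = fit_minmax(diameters)
scaled = to_unit_range(diameters, lo, hi)
restored = from_unit_range(scaled, lo, hi)
```

Predictions made in the normalized range should be passed through `from_unit_range` before the evaluation indicators are computed if the indicators are to be reported in the original units.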

Results and discussions

To better assess the performance of the improved adaptive DE-LMBP neural network model, an LMBP neural network model with the same network structure, a standard DE-LMBP neural network model, and a long short-term memory (LSTM) neural network were established with the same data samples for comparative analysis. The LSTM neural network is a variant of the recurrent neural network and is suitable for the analysis of time series data.

Because neural network training is inherently random, the authors carried out several simulation experiments on the four models to obtain the optimal convergence speed and accuracy. The convergence results of the four neural network models (LMBP, DE-LMBP, LSTM, and adaptive DE-LMBP) are shown in Figure 8.

Figure 8.

(a) LMBP, (b) DE-LMBP, (c) LSTM, and (d) adaptive DE-LMBP convergence speed comparison.

It can be seen from Figure 8 that the LMBP neural network needs more than 100 iterations to reach the convergence goal of 0.1. Although the DE-LMBP neural network converges faster than LMBP, it still takes 100 iterations to reach the convergence goal of 0.1, and the LSTM neural network has excellent convergence speed. The adaptive DE-LMBP algorithm reaches the convergence goal in fewer than 20 iterations, and its training time is shorter than that of the other models. This shows that the adaptive DE-LMBP converges much faster than the other three models.

The prediction results of the four models are shown in Figure 9. Although LSTM converges faster than LMBP and DE-LMBP, its prediction results are poor. The other three models predict better, and among them the adaptive DE-LMBP gives the best prediction and the best stability.

Figure 9.

LMBP, DE-LMBP, LSTM, and adaptive DE-LMBP prediction results.

It can be seen from Table 1 that, although all four models can predict the wheel diameter, the index values of the adaptive DE-LMBP neural network model are better than those of the other three models. This indicates that the improved adaptive DE-LMBP neural network model is more suitable for predicting the trend of wheel-set size change.

Table 1.

Prediction accuracy of the four models.

Index	LMBP	DE-LMBP	LSTM	Adaptive DE-LMBP
MSE	0.0128	0.0103	0.1057	0.0038
RMSE	0.1134	0.1015	0.3251	0.0616
MAE	0.0882	0.0772	0.2859	0.0480
MAPE	0.0099	0.0084	0.0103	0.0054
R	0.9955	0.9974	0.9564	0.9997

DE-LMBP: differential evolution algorithm Levenberg–Marquardt back propagation; LSTM: long short-term memory; RMSE: root mean squared error; MSE: mean squared error; MAE: mean absolute error; MAPE: mean absolute percentage error.

Conclusion

In this article, the combination of the DE algorithm, the LM algorithm, and the BP neural network is used for wheel-set size prediction. The main work and contribution are as follows. To overcome the tendency of the traditional BP neural network to become trapped in local extreme points during training, and to improve its accuracy, this article first uses the adaptive DE algorithm, which has strong global optimization ability, for global pre-optimization and then uses the LM algorithm for further optimization and training of the BP neural network. The experimental results show that the improved adaptive DE-LMBP algorithm proposed in this article is effective: compared with the LMBP neural network, the standard DE-LMBP neural network, and the LSTM neural network, its prediction accuracy is significantly improved.

Therefore, in practical applications, the algorithm can be applied to the prediction of the wheel size of high-speed trains. The predicted wheel size trend provides the maintenance department with a reference, which helps to reduce the maintenance cost.

Footnotes

Acknowledgements

The authors thank Southwest Jiaotong University Photoelectric Engineering Institute for their kind support in the experiment.

Handling Editor: Diego A Tibaduiza

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Nature Science Foundation of China (Grant No. 61471304) and we wish to acknowledge them for their support.

ORCID iD

Jiawen Zhang

References

1. Wang, Yuan, et al. Optimization of the re-profiling strategy and remaining useful life prediction of wheels based on a data-driven wear model. Syst Eng Theor Pract 2011; 31(6): 1143–1152.

2. JH. Analysis on the wearing of Guangzhou metro line 1 vehicles. Techn Elect Locomot 2001; 24(3): 16–17.

3. Hua. Research on wheel wear predicting and re-profiling strategy based on intelligence analysis. Nanjing, China: Nanjing University of Aeronautics and Astronautics, 2017.

4. Ignesti, Innocenti, Marini, et al. A numerical procedure for the wheel profile optimisation on railway vehicles. Proc IMechE Part J: J Engineering Tribology 2014; 228(2): 206–222.

5. Lim, Mba. Fault detection and remaining useful life estimation using switching Kalman filters. In: Tse, Mathew, Wong (eds) Engineering asset management - systems, professional practices and certification. New York: Springer International Publishing, 2015, pp. 53–64.

6. Son, Jung, Kwon, et al. Fatigue life prediction of a railway hollow axle with a tapered bore surface. Eng Fail Anal 2015; 58: 44–55.

7. Han, Zhang WH. A new binary wheel wear prediction model based on statistical method and the demonstration. Wear 2015; 324–325: 90–99.

8. Hong, Yang BS. Estimation and forecasting of machine health condition using ARMA/GARCH model. Mech Syst Sig Process 2010; 24(2): 546–558.

9. Xing, Mao, Liao, et al. Forecasting of wheelset size of urban rail train based on PSO-SVM model. J Shengyang Univ Tech 2014; 36(4): 411–415.

10. Prediction of wheelset tread wear based on grey theory. Lanzhou, China: Lanzhou Jiaotong University, 2015.

11. Lin, Pulido, Asplund. Reliability analysis for preventive maintenance based on classical and Bayesian semi-parametric degradation approaches using locomotive wheel-sets as a case study. Reliab Eng Syst Saf 2015; 134: 143–156.

12. Gebraeel, Lawley, Liu, et al. Residual life predictions from vibration-based degradation signals: a neural network approach. IEEE Trans Indus Elect 2004; 51(3): 694–700.

13. Zhang, Jiang, et al. Life-prediction of multi-stress accelerated life testing based on BP algorithm of artificial neural network. Acta Aero Astron Sin 2009; 30(9): 1691–1696.

14. Lixin, Zhenhuan, Yudong, et al. Remaining life predictions of fan based on time series analysis and BP neural networks. In: Proceedings of the 2016 IEEE information technology, networking, electronic and automation control conference, Chongqing, China, 20–22 May 2016. New York: IEEE.

15. Zhang. Prediction model of end-point phosphorus content in BOF steelmaking process based on PCA and BP neural network. J Process Control 2018; 66: 51–58.

16. Zhang, Zhao. Tool life prediction model based on GA-BP neural network. Mater Sci Forum 2016; 836–837: 256–262.

17. Kim MK. Predicting electricity consumption in a building using an optimized back-propagation and Levenberg–Marquardt back-propagation neural network: case study of a shopping mall in China. Sustain Cities Soc 2018; 42: 176–183.

18. Yin, Wang, Zhang, et al. Fault prediction based on PSO-BP neural network about wheel and axle bogie in urban rail train. Complex Syst Complex Sci 2015; 12(4): 97–103.

19. Hou, Zhao, et al. Fuzzy neural network optimization and network traffic forecasting based on improved differential evolution. Fut Generat Comput Syst 2017; 81: 425–432.

20. Zhongjie, Xuefeng, Zhengjia, et al. Remaining life predictions of rolling bearing based on relative features and multivariable support vector machine. J Mech Eng 2013; 49(2): 183–189.