Abstract
The catenary is a key part of the traction power supply system of electric railways. It is exposed outdoors for long periods and has a very high failure rate, and a failure directly affects driving safety. To address this, this paper proposes a catenary health status identification model based on an extreme learning machine optimized by the firefly algorithm, combined with variational mode decomposition. Variational mode decomposition is used to decompose each original catenary detection curve into a series of intrinsic mode function components. The components selected by the correlation coefficient method are then input into the firefly-algorithm-optimized extreme learning machine to identify the health status. Comparison with several other models shows that the proposed model achieves better health status identification.
Keywords
Introduction
The catenary is one of the “three components” of an electrified railway, and its working status is directly related to the safety of train operation. The catenary is erected along the railway track and installed outdoors. It is vulnerable to wind, sun and the impact of the pantographs of high-speed trains, which makes it the main point of failure in the traction power supply system.1,2 At present, most railways use catenary inspection vehicles to detect catenary faults. Fault judgments are made by humans based on experience, and their efficiency and accuracy cannot meet the requirements of high-speed, high-volume transportation.3,4 Therefore, it is particularly important to find an accurate and efficient method for identifying the health status of the catenary.
Fault identification has long been a research hotspot. Zhang and Chen5 presented a rolling bearing fault identification method based on empirical mode decomposition (EMD) and spectral kurtosis. The fault signal was first decomposed by EMD to obtain multiple mode functions, and suitable intrinsic mode function (IMF) components were then selected for reconstruction to achieve a certain denoising effect. However, EMD is an empirical decomposition method that is prone to end effects and mode mixing, which affects the identification results. Liu et al.6 applied ensemble empirical mode decomposition (EEMD) to extract the fault characteristics of train bogie bearings. Although EEMD effectively suppressed the mode mixing problem of EMD, it required multiple EMD passes, which increased the running time. Wu et al.7 presented a fault feature extraction method combining variational mode decomposition (VMD) and dispersion entropy, in which the waveform method was used to determine the mode number
To solve these problems, this paper proposes a catenary health status identification method based on VMD and an extreme learning machine (ELM) optimized by the firefly algorithm (FA). First, the original detection curves of contact wire height, contact pressure and contact wire gradient are decomposed by VMD. Appropriate IMF components are then selected by the correlation coefficient method and normalized. Finally, the normalized data are classified by the FA-optimized ELM to identify the health status. The method works offline: it identifies the health status of the catenary from data collected in advance and provides an effective reference for maintenance personnel.
Firefly optimization algorithm
The firefly algorithm is a swarm intelligence heuristic algorithm proposed by Professor Xin-She Yang of Cambridge University.11,12 The algorithm simulates the way fireflies emit light to make their companions aware of them. The brightness of a firefly is used as the criterion for judging whether its position is optimal. Fireflies search for nearby companions and move toward brighter ones, which iteratively optimizes their positions. The position of a firefly therefore represents a candidate solution, and its brightness corresponds to the value of the objective function of the problem being solved. During iteration, the target is optimized by moving toward companions with higher brightness. The steps are as follows:
where where
In order to obtain the best global search ability, inertia weight function
Finally, fireflies’ final position
To verify the convergence of FA, the following test functions are used:
First, 50 fireflies are arranged randomly. The initial population is shown in Figure 1, and the solid dots represent fireflies.

Initial population of fireflies.
After the test, all fireflies have been attracted toward the brighter fireflies and have gathered near them. The results are shown in Figure 2, which shows that FA has good convergence.

The results of fireflies aggregation.
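The aggregation behaviour described above can be reproduced with a short sketch. The code below is a minimal Python version of the standard firefly algorithm, not the exact variant used in this paper: the inertia weight function is omitted, and the sphere test function, the search bounds and the values of `beta0`, `gamma` and `alpha` are assumptions chosen for illustration. Only the population size (50) and the number of iterations (40) follow the text.

```python
import numpy as np


def sphere(x):
    """Assumed benchmark (not from the paper): f(x) = sum(x_i^2), minimum 0 at the origin."""
    return np.sum(np.asarray(x) ** 2, axis=-1)


def firefly_minimize(objective, dim=2, n_fireflies=50, n_iter=40,
                     beta0=1.0, gamma=0.05, alpha=0.2,
                     bounds=(-5.0, 5.0), seed=0):
    """Standard firefly algorithm for minimization; a lower cost means a brighter firefly."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    pos = rng.uniform(low, high, size=(n_fireflies, dim))  # random initial population
    cost = objective(pos)
    best = int(np.argmin(cost))
    best_pos, best_cost = pos[best].copy(), float(cost[best])
    history = [best_cost]  # best cost found so far, recorded once per iteration

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter, so i moves toward j
                    r2 = np.sum((pos[i] - pos[j]) ** 2)
                    beta = beta0 * np.exp(-gamma * r2)      # attractiveness decays with distance
                    step = alpha * (rng.random(dim) - 0.5)  # small random perturbation
                    pos[i] = np.clip(pos[i] + beta * (pos[j] - pos[i]) + step, low, high)
                    cost[i] = objective(pos[i])
                    if cost[i] < best_cost:
                        best_pos, best_cost = pos[i].copy(), float(cost[i])
        alpha *= 0.95  # assumed schedule: shrink the random step over time
        history.append(best_cost)
    return best_pos, best_cost, history


best_pos, best_cost, history = firefly_minimize(sphere)
print("best position:", best_pos, "best cost:", best_cost)
```

Because the best position is tracked separately from the moving population, `history` never increases, which makes it convenient for plotting a convergence curve.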
In order to further verify the optimization performance of FA, FA was compared with differential evolution algorithm (DEA) and genetic algorithm (GA). Test functions
The parameter settings of the three algorithms are shown in Table 1, where the population size is set to 50 and the number of iterations is set to 40.
Parameter setting of 3 algorithms.
The optimization results of 3 algorithms for test functions

Contrast curve of

Contrast curve of
It can be seen from Figures 3 and 4 that FA is superior to DEA and GA in convergence accuracy and convergence speed, and has better optimization ability. Therefore, FA is used to optimize the parameters of ELM in this paper.
VMD-FA-ELM health state identification model
Variational mode decomposition
VMD is a complex signal decomposition method proposed by K. Dragomiretskiy and D. Zosso on the basis of EMD.13,14 It decomposes the signal into a preset number of band-limited modes with different center frequencies. Each mode and its center frequency are updated with the alternating direction method of multipliers, so that each mode is progressively adjusted to its own frequency band, yielding components with different center frequencies.
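To make the alternating update of modes and center frequencies concrete, the following is a simplified Python sketch of VMD. It is an illustration under stated assumptions rather than the configuration used in this paper: the mirror extension of the signal and the Lagrange multiplier update are omitted, the penalty factor `alpha=2000` and the uniform initialization of the center frequencies are assumed values, and the function name `vmd` is hypothetical.

```python
import numpy as np


def vmd(signal, n_modes, alpha=2000.0, n_iter=200, tol=1e-7):
    """Simplified VMD sketch: returns (modes, center_freqs) sorted by center frequency.

    Center frequencies are normalized (cycles per sample, between 0 and 0.5).
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    freqs = np.fft.fftfreq(n)
    pos = freqs >= 0                        # non-negative half of the spectrum
    f_hat_plus = np.where(pos, np.fft.fft(signal), 0.0)
    f_hat_plus[0] *= 0.5                    # DC is halved so 2*Re(ifft) restores it once

    u_hat = np.zeros((n_modes, n), dtype=complex)
    omega = 0.5 * np.arange(n_modes) / n_modes  # assumed uniform initialization

    for _ in range(n_iter):
        u_prev = u_hat.copy()
        for k in range(n_modes):
            # Spectrum left over once all other modes are removed
            residual = f_hat_plus - u_hat.sum(axis=0) + u_hat[k]
            # Wiener-filter update of mode k around its current center frequency
            u_hat[k] = residual / (1.0 + alpha * (freqs - omega[k]) ** 2)
            # Center frequency = power-weighted mean frequency of mode k
            power = np.abs(u_hat[k, pos]) ** 2
            omega[k] = np.dot(freqs[pos], power) / (power.sum() + 1e-12)
        if np.sum(np.abs(u_hat - u_prev) ** 2) / n < tol:
            break

    modes = 2.0 * np.real(np.fft.ifft(u_hat, axis=1))  # back to real time-domain modes
    order = np.argsort(omega)
    return modes[order], omega[order]


# Synthetic two-tone signal (illustrative data, not catenary measurements)
t = np.arange(1000)
slow = np.cos(2 * np.pi * 0.02 * t)
fast = 0.5 * np.cos(2 * np.pi * 0.2 * t)
modes, center_freqs = vmd(slow + fast, n_modes=2)
print("center frequencies:", center_freqs)
```

On this synthetic signal the two recovered center frequencies settle near the two tone frequencies, which mirrors how the IMF components in the text are ordered from low to high center frequency.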
In VMD, the desired decomposition mode number
The parameters involved in VMD are mode number
VMD parameter configuration.
The center frequencies of the IMF components decomposed by VMD are distributed from low to high. If the mode number
Center frequency of each IMF component under different
From Table 3, when

VMD decomposition results. (a) Modal components and their spectrum of the detection curve of contact wire height by VMD, (b) Modal components and their spectrum of the detection curve of contact pressure by VMD, (c) Modal components and their spectrum of the detection curve of contact wire gradient by VMD.
The correlation coefficient method is employed to determine whether each IMF component is a noise component. The correlation coefficient formula is shown in equation (7).17
The correlation coefficient between each mode component and the original signal calculated by equation (7) is shown in Table 4.
Correlation coefficient of each IMF.
As shown in Table 4, for all three kinds of signals, the correlation coefficients of IMF1 components are far greater than those of other mode components. Therefore, the decomposed IMF1 components of the three signals can be used for health status identification.
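The selection rule can be sketched in a few lines of Python. The example assumes that the correlation coefficient is the standard Pearson coefficient, and it uses synthetic stand-ins for the signal and its components; the function name `select_imf_by_correlation` is hypothetical.

```python
import numpy as np


def select_imf_by_correlation(signal, imfs):
    """Return the index of the IMF most correlated with the signal, plus all coefficients."""
    coeffs = np.array([np.corrcoef(signal, imf)[0, 1] for imf in imfs])
    best = int(np.argmax(np.abs(coeffs)))
    return best, coeffs


# Synthetic stand-ins (assumed data): a slow trend plus a small fast ripple
t = np.arange(1000) / 1000.0
trend = np.sin(2 * np.pi * t)
ripple = 0.1 * np.sin(2 * np.pi * 40 * t)
signal = trend + ripple

best, coeffs = select_imf_by_correlation(signal, [trend, ripple])
print("selected component:", best, "coefficients:", coeffs)
```

Components with a small coefficient carry little of the original curve and are treated as noise, while the component with the largest coefficient is kept as the model input.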
The comparison curves of the contact wire height, contact pressure and contact wire gradient before and after decomposition are shown in Figure 6. The instability and pulsation of the decomposed IMF1 components are significantly reduced, while their similarity to the original signals remains high, which indicates a good denoising effect.

Comparison of three detection curves before and after decomposition. (a) Comparison between before and after decomposition of the detection curve of contact wire height, (b) Comparison between before and after decomposition of the detection curve of contact pressure, (c) Comparison between before and after decomposition of the detection curve of contact wire gradient.
FA optimized ELM
ELM is a learning algorithm proposed by Professor Guangbin Huang for training single hidden layer feed-forward neural networks.18,19 Unlike neural network algorithms that update weights by gradient descent, ELM calculates its output weights by the least squares method, which gives it better generalization performance and faster learning speed. Compared with deep learning algorithms, ELM requires fewer computing resources. Therefore, ELM is used to identify the health status of the catenary in this paper.
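The least squares training step can be illustrated with a minimal ELM in Python. This is a sketch under assumptions, not the authors' implementation: the weight range of [-1, 1], the synthetic regression data and the class name `ELM` are illustrative, while the 3 inputs, 50 hidden nodes, single output and sigmoid activation follow the experimental settings given later in the paper.

```python
import numpy as np


class ELM:
    """Single hidden layer network with random hidden parameters and least squares output weights."""

    def __init__(self, n_inputs, n_hidden=50, seed=0):
        rng = np.random.default_rng(seed)
        # Input weights and hidden biases are drawn at random and never trained
        self.W = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))
        self.b = rng.uniform(-1.0, 1.0, size=n_hidden)
        self.beta = None

    def _hidden(self, X):
        # Sigmoid activation of the hidden layer
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        H = self._hidden(X)
        # Output weights from the Moore-Penrose pseudo-inverse (least squares solution)
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


# Synthetic data with three input features (assumed, for illustration only)
rng = np.random.default_rng(1)
X = rng.random((200, 3))
y = X[:, 0] + X[:, 1] * X[:, 2]

model = ELM(n_inputs=3, n_hidden=50).fit(X, y)
train_mse = float(np.mean((model.predict(X) - y) ** 2))
print("training MSE:", train_mse)
```

Training reduces to one matrix pseudo-inverse, which is why ELM learns much faster than gradient-based networks.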
Although ELM has the advantages of fast convergence speed and less parameters to be set, the input weight matrix

Structure of FA-ELM.
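A compact sketch of how FA can be coupled with ELM is given below. It assumes that each firefly encodes the input weights and hidden biases that plain ELM would draw at random, and that brightness is measured by the training mean squared error; the encoding and fitness used in this paper may differ. The small network size, the FA constants and the synthetic data are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np


def elm_training_mse(params, X, y, n_hidden):
    """Fitness of one firefly: training MSE of the ELM built from its parameter vector."""
    n_in = X.shape[1]
    W = params[:n_in * n_hidden].reshape(n_in, n_hidden)  # input weights
    b = params[n_in * n_hidden:]                          # hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    beta = np.linalg.pinv(H) @ y                          # least squares output weights
    return float(np.mean((H @ beta - y) ** 2))


def fa_optimize_elm(X, y, n_hidden=5, n_fireflies=10, n_iter=10,
                    beta0=1.0, gamma=0.1, alpha=0.1, seed=0):
    """Search ELM hidden parameters with a standard firefly algorithm."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hidden + n_hidden
    pos = rng.uniform(-1.0, 1.0, size=(n_fireflies, dim))
    cost = np.array([elm_training_mse(p, X, y, n_hidden) for p in pos])
    best = int(np.argmin(cost))
    best_params, best_cost = pos[best].copy(), float(cost[best])
    initial_cost = best_cost  # best purely random ELM, kept for comparison

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # lower error means brighter
                    r2 = np.sum((pos[i] - pos[j]) ** 2)
                    attract = beta0 * np.exp(-gamma * r2)
                    step = alpha * (rng.random(dim) - 0.5)
                    pos[i] = np.clip(pos[i] + attract * (pos[j] - pos[i]) + step, -1.0, 1.0)
                    cost[i] = elm_training_mse(pos[i], X, y, n_hidden)
                    if cost[i] < best_cost:
                        best_params, best_cost = pos[i].copy(), float(cost[i])
    return best_params, best_cost, initial_cost


# Synthetic data (assumed, for illustration only)
rng = np.random.default_rng(1)
X = rng.random((100, 3))
y = X[:, 0] + X[:, 1] * X[:, 2]
best_params, best_cost, initial_cost = fa_optimize_elm(X, y)
print("best random ELM:", initial_cost, "after FA:", best_cost)
```

In practice a validation error would be a safer fitness than the training error, since it penalizes parameter sets that only memorize the training samples.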
Combination model
Based on the FA, VMD and ELM model mentioned above, the VMD-FA-ELM health status identification model is built in this paper, as shown in Figure 8, and the steps are as follows:

Structure of VMD-FA-ELM combined model.
Experimental verification
Identification results
The catenary detection data of a power section in 2019 are selected as the data source, with 2400 sampling points used as training samples and 2400 sampling points used as testing samples. Three types of catenary detection data are selected: contact wire height, contact pressure and contact wire gradient. Labels are then designed for all data based on 8 indicators, as shown in Table 5.
Labels of health status.
The experimental parameters are set as follows:
Parameters of ELM: the number of input layer nodes is 3, the number of hidden layer nodes is 50, the number of output layer nodes is 1, and the activation function is the “sig” function. Parameters of FA: the firefly population size is set to 30, the spatial dimension is set to 250, the maximum attractiveness
The parameters of VMD are shown in Table 2.
The identification results based on VMD-FA-ELM are shown in Figure 9, and the confusion matrix of the identification results is shown in Figure 10. The diagonal boxes give the number of samples whose predicted category matches the actual category. The remaining boxes from F1 to F8 give the number of wrongly identified samples. The last row of the matrix gives the Accuracy of each category, which equals the number of samples correctly identified divided by the number of samples identified as that category. The last column gives the Recall of each category, which equals the number of samples correctly identified divided by the number of real samples of that category. The box in the bottom right corner gives the total identification accuracy.20
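The quantities in the last row, last column and bottom right corner can be computed as in the following Python sketch. It assumes that rows of the matrix index the actual category and columns index the predicted category; the labels are made-up integers for illustration, not the F1 to F8 results of this paper.

```python
import numpy as np


def confusion_summary(y_true, y_pred, n_classes):
    """Confusion matrix with per-category Accuracy, per-category Recall and total accuracy."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for actual, predicted in zip(y_true, y_pred):
        cm[actual, predicted] += 1  # rows: actual category, columns: predicted category
    correct = np.diag(cm)
    # Accuracy per category: correct / samples identified as that category (column sums)
    accuracy = correct / np.maximum(cm.sum(axis=0), 1)
    # Recall per category: correct / real samples of that category (row sums)
    recall = correct / np.maximum(cm.sum(axis=1), 1)
    total = correct.sum() / cm.sum()
    return cm, accuracy, recall, total


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm, accuracy, recall, total = confusion_summary(y_true, y_pred, n_classes=3)
print(cm)
print("accuracy per category:", accuracy)
print("recall per category:", recall)
print("total accuracy:", total)
```

The per-category Accuracy defined this way is what is often called precision elsewhere; the sketch keeps the naming used in the text.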

Identification result of VMD-FA-ELM.

Confusion matrix of VMD-FA-ELM.
It can be seen from Figure 10 that the identification accuracy of VMD-FA-ELM is 98.75%, and that the accuracy and recall rates are higher than 99% for every category except F5 and F6, which indicates that the proposed method identifies the health status well.
Comparison and analysis
On the basis of section “Identification results”, five further models, namely VMD-GA-ELM, FA-ELM, GA-ELM, VMD-ELM and ELM, are used for comparison. The identification effect is measured by the identification accuracy, mean squared error (MSE), coefficient of determination (R2) and running time.
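For reference, the error measures named above can be computed as in the following Python sketch. It assumes the usual definitions of MSE and R2, which the text does not spell out, and the sample values are illustrative only.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between actual and predicted values."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


y_true = [1, 2, 3, 4]
y_pred = [1, 2, 3, 5]
print("MSE:", mse(y_true, y_pred))        # one error of size 1 over four samples
print("R2:", r2_score(y_true, y_pred))
```

A lower MSE and an R2 closer to 1 both indicate predictions that follow the actual labels more closely.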
In order to ensure the fairness and rationality of comparative analysis, the same test samples and training samples as in section “Identification results” are used to test the above five models. The confusion matrices are drawn as shown in Figure 11.

Identification results of five models. (a) Confusion matrix of VMD-GA-ELM, (b) Confusion matrix of FA-ELM, (c) Confusion matrix of GA-ELM, (d) Confusion matrix of VMD-ELM, (e) Confusion matrix of ELM.
It can be seen from Figures 10 and 11 that the VMD-FA-ELM model has the highest identification accuracy, with both the accuracy and recall rates above 93%. Its identification accuracy is 28.6% higher than that of ELM. The VMD-GA-ELM model ranks second in identification accuracy (an increase of 22.1%), but its accuracy or recall is low (less than 70%) in some cases.
Compared with the ELM model, the identification accuracy of the FA-ELM and GA-ELM models is improved by 16.1% and 8.8% respectively. Compared with the VMD-ELM model, the identification accuracy of the VMD-FA-ELM and VMD-GA-ELM models is improved by 18.8% and 11.8% respectively. This indicates that FA improves the model more than GA does. Compared with the ELM and FA-ELM models, the identification accuracy of the VMD-ELM and VMD-FA-ELM models is improved by 9.3% and 11.8% respectively. This shows that denoising by VMD also improves the model.
Because the result of each run is subject to randomness, each of the six models is tested ten times to reconfirm the identification performance of the VMD-FA-ELM model. Table 6 shows the running time of the six models over the ten tests.
Running time of the six models over ten tests.
Combined with Table 6 and Figures 10 and 11, it can be seen that the identification accuracy of the VMD-GA-ELM model is slightly lower than that of the VMD-FA-ELM model, while its running time is more than twice as long. This reduces the identification efficiency and does not meet the requirement of rapid health status identification of the catenary. The traditional ELM does not decompose the original data and obtains its parameters randomly, so its running time is the shortest. However, its identification accuracy is too low to meet the requirements of catenary health status identification.
Figure 12 shows box-plots of the identification effect of the different models over the ten tests. Each box represents the identification accuracy, MSE or R2 of one model over the ten tests, and the abscissa is the model type.

Comparison of identification effect under different models. (a) Box-plot of identification accuracy under different models, (b) Box-plot of MSE under different models, (c) Box-plot of R2 under different models.
According to the results of the ten runs in Figure 12, VMD-FA-ELM not only has a higher identification accuracy than the other models, but its accuracy also converges better across runs. Its MSE is significantly lower than that of the other models and also converges better, and its R2 is better than that of the other models.
Conclusions
In this paper, the health status identification of electrified railway catenary is studied, and a health status identification method based on VMD-FA-ELM is proposed. The conclusions are as follows:
(1) Decomposing the original detection curves of the catenary by VMD stabilizes these non-stationary and complex curves. Experiments show that using the decomposed IMF1 component, which has the highest correlation coefficient, for health status identification gives a higher identification accuracy than using the curves before decomposition.
(2) FA has good parameter optimization ability. It effectively solves the problem of low identification accuracy caused by the random generation of parameters in ELM learning and makes full use of the identification ability of ELM. Compared with the typical GA, its optimization effect on the model is more prominent. The comparison with VMD-ELM indicates the effectiveness of optimizing ELM by FA.
(3) Experiments indicate that the identification model based on VMD-FA-ELM can accurately and quickly identify the health status of the catenary, which provides a new method for catenary health status identification. It has application value in the era of widespread railway electrification.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is supported by the National Natural Science Foundation of China (61572416), Hunan province Natural science Zhuzhou United foundation (2020JJ6009) and Key Laboratory Open Project Fund of Disaster Prevention and Mitigation for Power Grid Transmission and Transformation Equipment.
