Abstract
To address the difficulty of accurately predicting the remaining useful life (RUL) of rolling bearings, which arises from the phased nature of bearing degradation, this paper proposes a local-global cooperative learning strategy that makes fuller use of the information in stage-wise degradation data. The strategy is combined with the least squares support vector machine (LSSVM) to build a regression model that improves the accuracy of RUL prediction. The strategy first evaluates the health state of the bearing throughout its degradation process using singular value decomposition and a kurtosis criterion, dividing the process into degradation stages so that the degradation information of each stage can be learned. Then, through the proposed cooperative learning mechanism, the local learning of each stage is extended to global learning of the whole degradation process. Finally, the learning strategy is applied to the LSSVM model to predict the RUL of bearings. On the PHM2012 dataset, the proposed method achieves a root mean square error of 35.4461 and a mean absolute percentage error of 0.2041.
Introduction
The rolling bearing is a common mechanical component, and Prognostics and Health Management (PHM) for bearings has received much attention. PHM consists of four main topics: fault detection, diagnostics, prognostics, and decision making. 1 Within prognostics, Remaining Useful Life (RUL) prediction is a challenging task that aims to predict the remaining working time of a bearing before failure. 2
Currently, RUL prediction methods for bearings fall into two main categories: physics-based and data-driven. 3 Physics-based methods focus on constructing physical and mathematical models, such as state-space models 4 and partial differential equations, 5 to estimate the current health state and predict future health states. These methods require extensive domain knowledge, and they still cannot build robust models for complex critical components. Thus, they are not ideal for predicting the RUL of bearings. In contrast, data-driven methods require neither domain knowledge nor physical or mathematical models. They learn the degradation characteristics of bearings from a large amount of historical data and treat the degradation process as a functional relationship between the health states and the monitoring data. 3 The RUL of bearings is then predicted by learning this degradation process with intelligent techniques such as Machine Learning (ML) or Deep Learning (DL).
At present, research on data-driven RUL prediction for bearings can be divided into two types. The first type establishes a Health Indicator (HI) or Degradation Indicator (DI) from the information in the bearing degradation process. 6 Then, machine learning or time series prediction methods, 7 such as Support Vector Machine (SVM) 9 and Support Vector Data Description (SVDD), 10 are used to build a model and predict RUL 8 based on these indicators. Wang et al. 11 established a health indicator by combining Deep Convolutional Auto-Encoder (DCAE) and Self-Organizing Map (SOM) networks to obtain a more advanced characterization of the original vibration data. Yang et al. 12 established a health indicator by combining Piecewise Cubic Hermite Interpolating Polynomial Local Characteristic-scale Decomposition (PCHIP-LCD) and Generalized Regression Neural Network (GRNN) to predict the RUL of rolling bearings.
The second type of RUL prediction method directly extracts data features from the bearing degradation process and then builds a model on the extracted features to predict RUL. Kaya et al. 13 proposed a new feature extraction method based on eight local directional filters under 1D-Local Binary Pattern (1D-LBP) to determine the vibration signal velocities of different fault sizes and types. Kaya et al. 14 proposed a new feature extraction method based on co-occurrence matrices for bearing vibration signals. Zhang et al. 15 proposed a novel bidirectional gated recurrent unit with temporal self-attention mechanism (BiGRU-TSAM) to predict RUL. Zhang et al. 16 proposed a novel adaptive approach based on the Kalman filter and expectation maximization with Rauch–Tung–Striebel smoothing (KF-EM-RTS) to predict the RUL.
Deep learning 17 is a kind of machine learning algorithm that has appeared in recent years. Ren et al. 18 used time–frequency domain features as input and built an RUL prediction model based on the deep neural network. Mao et al. 19 extracted data features by using the Contractive Denoising Auto-Encoder (CDAE) and the Transfer Component Analysis (TCA), and built a regression prediction model to predict RUL.
The degradation of a bearing is a progressive process from the normal state to complete failure, and the trend differs from one degradation state to the next. In other words, the degradation data of bearings change in phases and are strongly nonlinear. Because of these data characteristics, most existing prediction methods cannot make full use of the information in the degradation process. As a result, the resulting model is incomplete and cannot fully describe the state changes of the degradation process, which reduces the accuracy of RUL prediction. For example, Ren et al. 18 did not analyze the degradation information of each health state, so their RUL predictions for bearings were inaccurate. Although Mao et al. 19 analyzed the information of the fast degradation state of bearings, they did not analyze the information of the normal state and the slow degradation state, which also led to inaccurate RUL prediction.
Therefore, in order to study the degradation characteristics of each local degradation state of bearings and improve the accuracy of RUL prediction, this paper proposes a local-global cooperative learning strategy. The main contributions of this paper are as follows:
To study the local degradation states, we first evaluate the degradation states of bearings based on Singular Value Decomposition (SVD) 20 and the kurtosis criterion. 21 In this way, the degradation process is divided into the normal state, the slow degradation state, and the fast degradation state. Next, we build a local model for each degradation state to study how the data change within that state. In this way, we can make full use of the data information of each degradation state.
To study the global degradation process, we propose a cooperative learning mechanism that connects the local models to form a global model. In this way, the study of the local models is extended to the study of the global model.
To counter the loss of RUL prediction accuracy caused by the above problems, the strategy is combined with the least squares support vector machine to establish a local-global cooperative least squares support vector machine model. The local-global model captures not only the detailed changes within each local degradation state but also the trend of the global degradation process. In this way, the accuracy of RUL prediction is improved.
This paper is organized as follows. Section II describes the proposed local-global cooperative LSSVM and its use for RUL prediction of bearings, introducing the underlying LSSVM algorithm and the proposed local-global cooperative learning strategy. Section III presents experiments on the IEEE PHM Challenge 2012 bearing data set, and Section IV concludes the paper.
Local-global cooperative LSSVM and RUL prediction of bearings
In this section, the proposed local-global cooperative learning strategy is described in detail and then combined with LSSVM to predict the RUL of bearings. Figure 1 gives the overall flowchart of the proposed method, which includes three steps: signal preprocessing, health assessment, and building the local-global cooperative LSSVM model. First, the vibration signal of the bearing is preprocessed to obtain the HHT marginal spectrum, which is used as the input data of the model. Second, the health status of the bearing is evaluated based on SVD and the kurtosis criterion. Finally, a local-global cooperative least squares support vector machine model is built based on the results of the health assessment, with the marginal spectrum data as the model input and the RUL of the bearing as the model output. The main steps are described in detail below.

Figure 1. Local-global cooperative LSSVM model.
Signal preprocessing
Signal preprocessing of bearings is helpful to study the degradation characteristics. Different from the Fast Fourier Transform (FFT), which needs a complete oscillation period to determine the local frequency value, the instantaneous frequency of the Hilbert-Huang Transform (HHT) is defined as a function of time. 22 Therefore, the marginal spectrum from HHT is used to describe the local characteristics of the signal, especially for non-stationary signals. Because of the non-stationary characteristics of bearing signals, we use HHT to preprocess bearing signals. We provide a brief introduction to HHT.
Suppose the original vibration signal is
here,
HHT is used for each IMF to obtain the Hilbert spectrum and its analytic signal. The Hilbert spectrum is shown in equation (2):
The analytic signal is shown in equation (3):
The amplitude function
The corresponding instantaneous frequency is:
Thus, the original signal
The Hilbert spectrum is integrated to obtain the marginal spectrum:
here,
The Hilbert spectrum accurately reflects the amplitude variation trend relative to duration and frequency. The marginal spectrum reflects the contribution of each frequency to the amplitude.
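For orientation, the standard HHT relations that this subsection follows can be summarized as below. The notation is assumed here and may differ from the authors' symbols and equation numbering: $c_i(t)$ is the $i$-th intrinsic mode function (IMF), $r_n(t)$ the residual, $\hat{c}_i(t)$ the Hilbert transform of $c_i(t)$, $a_i(t)$ and $\theta_i(t)$ the instantaneous amplitude and phase, $\omega_i(t)$ the instantaneous frequency, $H(\omega,t)$ the Hilbert spectrum, and $T$ the signal length. The residual is omitted in the sixth relation, as is customary.

```latex
\begin{align}
x(t) &= \sum_{i=1}^{n} c_i(t) + r_n(t), \\
\hat{c}_i(t) &= \frac{1}{\pi}\,\mathrm{P.V.}\!\int_{-\infty}^{\infty} \frac{c_i(\tau)}{t-\tau}\,d\tau, \\
z_i(t) &= c_i(t) + j\,\hat{c}_i(t) = a_i(t)\,e^{j\theta_i(t)}, \\
a_i(t) &= \sqrt{c_i^2(t) + \hat{c}_i^2(t)}, \qquad
\theta_i(t) = \arctan\frac{\hat{c}_i(t)}{c_i(t)}, \\
\omega_i(t) &= \frac{d\theta_i(t)}{dt}, \\
x(t) &= \operatorname{Re}\sum_{i=1}^{n} a_i(t)\,\exp\!\Big(j\!\int \omega_i(t)\,dt\Big), \\
h(\omega) &= \int_{0}^{T} H(\omega,t)\,dt.
\end{align}
```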
Health assessment based on SVD and kurtosis criterion
In order to divide the degradation process of bearings into different states during their entire life cycle, SVD and kurtosis criterion are used to evaluate their health status. The fast degradation state of bearings is determined based on SVD and correlation coefficient. The normal state and slow degradation state are determined based on kurtosis criterion. In general, this algorithm can shrink the calculation range and make the discrimination between normal state and slow degradation state more prominent by determining the starting point of fast degradation process first. Our main ideas are as follows:
In matrix theory, singular values usually represent important information hidden in the matrix and are robust to small perturbations. 23 At the same time, the correlation coefficient represents the relationship between two observations. Therefore, when the signal changes with small amplitude, the singular values stay steady and the correlation coefficient between the matrices built from different time-domain signal segments should be high. Conversely, when the signal changes dramatically, the singular values change significantly and the correlation coefficient becomes low. SVD and the correlation coefficient therefore make it easy to identify the fast degradation state of bearings.
The slow degradation state of bearings is difficult to identify at low frequencies but easy to identify at high frequencies. 24 Therefore, the signal can be decomposed into multiple intrinsic mode functions (IMFs) to isolate the modulation information at high frequency. The kurtosis value can then be calculated from the high-frequency IMFs to distinguish the normal state from the early fault state.
In this section, the main steps of proposed health assessment are as follows:
Step 1: Suppose the vibration signal of the bearing is
here, m, n are the number of rows and columns of the Hankel matrix, respectively.
Next, conduct SVD for each matrix
Step 2: The singular value is linearly scaled to [−1, 1], and the scaled singular value is then reconstructed to
here,
Step 3: In order to separate the normal and slow degradation states of the bearing, the bearing life cycle is divided at high frequency based on the kurtosis criterion. First, EMD is used to obtain all IMF components of the bearing vibration signal. Then, the kurtosis coefficient of each IMF component is calculated by equation (11). Note that other signal decomposition methods, such as variational mode decomposition, also work well. We do not compare these methods in detail because the performance of signal decomposition methods is not the topic of this paper. Equation (11) is shown as:
here,
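The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the correlation is taken between the scaled singular-value vectors of a reference segment and a later segment (the paper correlates matrices rebuilt from the scaled singular values), and the kurtosis is the usual non-excess form, the fourth central moment divided by the squared variance. The EMD step that produces the IMFs is assumed to be done elsewhere.

```python
import numpy as np


def hankel_matrix(signal, m):
    """Return the m x n Hankel matrix of a 1-D signal, with n = len(signal) - m + 1."""
    x = np.asarray(signal, dtype=float)
    n = x.size - m + 1
    if m < 1 or n < 1:
        raise ValueError("m must satisfy 1 <= m <= len(signal)")
    rows = np.arange(m)[:, None]
    cols = np.arange(n)[None, :]
    return x[rows + cols]  # entry (i, j) is signal[i + j]


def scale_to_unit_range(values):
    """Linearly scale values to [-1, 1]; a constant vector maps to zeros."""
    v = np.asarray(values, dtype=float)
    span = v.max() - v.min()
    if span == 0.0:
        return np.zeros_like(v)
    return 2.0 * (v - v.min()) / span - 1.0


def singular_value_correlation(reference, segment, m):
    """Pearson correlation between the scaled singular values of two segments.

    Assumption: this simplified signature stands in for the paper's
    reconstructed matrices, whose exact form is not reproduced here.
    """
    s_ref = np.linalg.svd(hankel_matrix(reference, m), compute_uv=False)
    s_seg = np.linalg.svd(hankel_matrix(segment, m), compute_uv=False)
    return float(np.corrcoef(scale_to_unit_range(s_ref),
                             scale_to_unit_range(s_seg))[0, 1])


def kurtosis(values):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    v = np.asarray(values, dtype=float)
    centered = v - v.mean()
    variance = np.mean(centered ** 2)
    return float(np.mean(centered ** 4) / variance ** 2)
```

In use, `singular_value_correlation` would be evaluated for each sample against an early, healthy reference sample, and `kurtosis` would be applied to the selected high-frequency IMF of each remaining sample.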
Local-global cooperative LSSVM model
According to the above health assessment results, the entire life cycle of a bearing can be divided into three states: normal state, slow degradation state, and fast degradation state. Next, a cooperative learning mechanism is proposed. A weight is calculated for each health state based on the distance between the input data and the center of that health state's cluster. The weights indicate which health state the input data belong to, so the local model built for that health state dominates the prediction; that is, the proportion of its output in the final result is increased. In this way, the local-global cooperative learning strategy is formed. Finally, we combine the strategy with LSSVM to obtain a local-global cooperative LSSVM that predicts the RUL of bearings more accurately.
LSSVM (local model construction)
We will introduce the principle of LSSVM in this section because the local model is built based on it. The objective function of LSSVM is shown as equation (12):
here,
It is transformed into equation (13) through the dual representation of the objective function:
here,
Based on the KKT conditions, the linear equations corresponding to equation (14) are obtained:
Thus,
here,
Finally, the estimation equation of LSSVM is obtained:
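To make the training and estimation steps concrete, the sketch below uses the standard LSSVM regression formulation, in which training reduces to one linear system in the bias $b$ and the multipliers $\alpha$, and the estimate is $f(x) = \sum_i \alpha_i K(x, x_i) + b$. The RBF kernel, the parameter names `gamma` (regularization) and `sigma` (kernel width), and the function names are assumptions for illustration; the paper's own symbols may differ.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """RBF kernel matrix between row-sample arrays A (n, d) and B (m, d)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    sq_dist = (np.sum(A ** 2, axis=1)[:, None]
               + np.sum(B ** 2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))


def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y] for b and alpha."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0
    system[1:, 0] = 1.0
    system[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(system, rhs)
    return solution[0], solution[1:]  # bias b, multipliers alpha


def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate f(x) = sum_i alpha_i K(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```

Because the equality constraints turn the quadratic program into a linear solve, the cost is dominated by one dense factorization of an (n + 1) × (n + 1) matrix, which is why `gamma` and `sigma` can be tuned by grid search at moderate sample sizes.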
Global model construction based on cooperative learning mechanism
We have built a local model for each of the three states. Therefore, we combine the three local models to build the global model based on the cooperative learning mechanism. The proposed cooperative learning mechanism is calculated by equation (17):
here,
here,
here,
In order to divide the input space and unify it to form a global model, the coordination function is normalized:
In the model, the probability that the input data belong to each health state is determined based on equation (18). Then, the output contribution rate of each local model to the global model is calculated based on this probability. In this way, several local models of different health states cooperate to form a global model. Thus, a local-global LSSVM model is established.
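The cooperation between local models can be illustrated with a short sketch. It assumes, for illustration only, a Gaussian coordination function of the distance between the input and each health-state cluster center, normalized so that the weights sum to one; the exact form of the paper's coordination function, the `width` parameter, and the function names are hypothetical.

```python
import numpy as np


def cooperative_weights(x, centers, width=1.0):
    """Normalized weights from distances between x and each cluster center.

    Assumption: a Gaussian of the squared distance is used as the
    (unnormalized) coordination function.
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    sq_dist = np.sum((centers - x) ** 2, axis=1)
    # Shifting by the minimum distance avoids underflow and cancels
    # out after normalization.
    raw = np.exp(-(sq_dist - sq_dist.min()) / (2.0 * width ** 2))
    return raw / raw.sum()


def global_prediction(x, centers, local_models, width=1.0):
    """Weighted sum of local model outputs; each model is a callable f(x)."""
    weights = cooperative_weights(x, centers, width)
    outputs = np.array([model(x) for model in local_models], dtype=float)
    return float(weights @ outputs)
```

An input close to one cluster center is predicted almost entirely by that state's local model, while an input near a state boundary receives a blend of the neighboring local models, which gives a smooth transition between health states.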
Theoretical analysis
The general form of single model LSSVM regression is:
here,
According to equation (17), the output of the local global model proposed in this paper is:
here,
After writing equation (22) in the same form as equation (21), we obtain:
here,
Then, we define
here,
We suppose
Then, the objective optimization function of the local-global cooperative LSSVM model proposed in this paper is:
Lagrange duality is used to obtain:
The linear equations are obtained from equation (27) based on the KKT conditions:
We eliminate
here,
The estimation function of the local-global cooperative LSSVM model is:
RUL prediction of bearings based on local-global cooperative LSSVM model
We introduce the application of the proposed local-global cooperative LSSVM for the RUL prediction of bearings in detail in this section. The local-global cooperative LSSVM is shown in Algorithm. Meanwhile, Figure 2 gives the full prediction method of RUL. The main steps are as follows:
Step 1: HHT is used to extract marginal spectrum of bearings during signal preprocessing. The data of marginal spectrum effectively reflect the full cycle degradation characteristics of bearings. Meanwhile, it is applied to the RUL prediction of bearings.
Step 2: First, we determine the fast degradation state based on SVD and correlation coefficient. Then, according to the kurtosis criterion, we determine the normal state and slow degradation state of bearings from the residual degradation process. Therefore, the full life cycle of bearings is divided into three states: normal state, slow degradation state, and fast degradation state.
Step 3: According to the results of health assessment, we first build the local LSSVM models for each state. Then, we combine each local model based on the cooperative mechanism to form the local-global cooperative LSSVM model. Finally, we adopt marginal spectrum data by HHT as the input of the model to predict the RUL of bearings.

Figure 2. RUL prediction of bearings based on local-global cooperative LSSVM.
Experimental simulation
In this paper, experimental simulation analysis is conducted on the IEEE PHM 2012 bearing data set, and the proposed method is compared with other methods to verify its effectiveness. In this section, we first evaluate the health status of each bearing over its whole life cycle and then predict the RUL of the bearings based on the evaluation results. All input variables are normalized to the range [−1, 1]. All experiments were conducted in MATLAB 2020b on a computer with an Intel i5-10400F CPU, 8 GB RAM, and a DUAL-RTX 2060-06G-EVO GPU.
Data description
The IEEE PHM Challenge 2012 bearing data set used in this section is collected from the bearing testing platform of the FEMTO-ST Laboratory. 25 The platform can perform accelerated degradation experiments on bearings under certain conditions and simultaneously collect the relevant bearing data.
The data set provides bearing data under three working conditions: 1800 rpm with a load of 4000 N, 1650 rpm with a load of 4200 N, and 1500 rpm with a load of 5000 N. Vibration signals are collected by acceleration sensors placed in the horizontal and vertical directions. The sampling frequency of the acceleration sensors is 25.6 kHz.
It should be noted that the degradation processes of bearings differ across working conditions, and even different bearings under the same working condition degrade differently. The problem addressed in this paper is that RUL prediction is inaccurate because many methods cannot make full use of the information in the degradation process, whose data change in phases. Therefore, to verify the effectiveness of the proposed method, we use five bearings under the first working condition to carry out five groups of simulation analysis. The horizontal signals are chosen for testing because they track the degradation process of bearings well. 26
Because the time domain signal is difficult to analyze directly, we use HHT to preprocess the original vibration signal. HHT has two advantages for signal analysis: (1) it does not require an orthogonal basis to be set in advance and (2) it handles non-stationary signals well. Figure 3 shows the time domain signal and the marginal spectrum for the three health states of bearing 1_1. The marginal spectrum shows that the slow degradation process changes little, while the fast degradation process changes markedly. Therefore, preprocessing the original signal with HHT through a time-frequency transformation reveals the trend of the degradation process in more detail.
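As a rough illustration of this preprocessing step, the sketch below computes a marginal spectrum from IMFs that are assumed to have been obtained already (the EMD step is not shown). The analytic signal is built with an FFT-based Hilbert transform, and the amplitude is accumulated over time into frequency bins. The bin count, the function names, and the binning scheme are assumptions for illustration, not the authors' MATLAB implementation.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal x + j*Hilbert(x), built by zeroing negative frequencies."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)


def marginal_spectrum(imfs, fs, n_bins=128):
    """Accumulate instantaneous amplitude over time into frequency bins.

    imfs: array of shape (n_imfs, n_samples); fs: sampling frequency in Hz.
    Returns (bin_centers_hz, spectrum).
    """
    imfs = np.atleast_2d(np.asarray(imfs, dtype=float))
    edges = np.linspace(0.0, fs / 2.0, n_bins + 1)
    spectrum = np.zeros(n_bins)
    for imf in imfs:
        z = analytic_signal(imf)
        amplitude = np.abs(z)
        phase = np.unwrap(np.angle(z))
        # Instantaneous frequency from the discrete phase derivative.
        freq = np.diff(phase) * fs / (2.0 * np.pi)
        amp_mid = 0.5 * (amplitude[1:] + amplitude[:-1])
        valid = (freq >= 0.0) & (freq <= fs / 2.0)
        hist, _ = np.histogram(freq[valid], bins=edges, weights=amp_mid[valid])
        spectrum += hist / fs  # weight by dt = 1/fs to approximate the time integral
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, spectrum
```

For one 0.1 s sample of 2560 points at 25.6 kHz, the returned vector would serve as one input row of the model, before dimensionality reduction.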

Figure 3. Time domain signal and marginal spectrum of bearing 1_1: (a) normal, (b) slow degradation, and (c) fast degradation.
Health assessment
In this section, the performance of the health assessment is demonstrated on bearing 1_1. We first use the correlation coefficient of the SVD matrix to determine the fast degradation process of bearing 1_1. Figure 4 shows the result.

Figure 4. Result of the fast degradation of bearing 1_1.
As Figure 4 shows, when the bearing enters the fast degradation state, its correlation coefficient also decreases rapidly, given a preset threshold of 95%. Therefore, we can clearly identify the point at which the decline begins and use it to determine the fast degradation state. Note that the sampling frequency of the vibration signals is 25.6 kHz and that 2560 time points are recorded every 10 s. Each sample point in this paper therefore contains 2560 time points.
On the basis of these results, the kurtosis is used to determine the normal state and the slow degradation state of the bearing. Since the fast degradation state has already been identified, that part of the data is discarded and the remaining data are used to determine the normal state and the slow degradation state. In Figure 5, the IMF components are sorted by frequency. Because a fault causes high-frequency resonance, the high-frequency components of the vibration signal are sensitive to changes in the bearing health state. Therefore, priority is given to the high-frequency IMF components obtained by EMD. Furthermore, as IMFs with higher normalized kurtosis values are rich in fault information, 27 we choose these IMFs for further calculation.

Figure 5. The first three IMFs of bearing 1_1: (a) IMF1, (b) IMF2, and (c) IMF3.
Figure 6 gives the results of kurtosis value for each IMF component. It is clear from Figure 6 that the kurtosis value of the IMF2 component is the highest. Therefore, in order to determine the normal state and the slow degradation state of bearings, we should choose the IMF2 component to analyze.

Figure 6. Kurtosis coefficients of each IMF component of bearing 1_1.
The kurtosis value of the IMF2 component of each sample of the remaining vibration signal is calculated. The moving average method is used to smooth it and the results are obtained as shown in Figure 7.

Figure 7. Smoothed kurtosis value of the IMF2 component of bearing 1_1 under the first working condition.
Figure 7 shows that the kurtosis value of the bearing rises gradually as its degradation deepens. Therefore, we take the sample point at which the kurtosis first begins to rise as the boundary between the normal state and the slow degradation state. From the figure, the initial rise point of bearing 1_1 is at about sample 1452.
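The smoothing and boundary-finding step can be sketched as follows. The moving average matches the description above; the rule for locating the initial rise (the first smoothed value that exceeds the baseline mean by `k` baseline standard deviations) is a hypothetical stand-in, since the paper reads the rise point from the figure rather than stating a numerical rule. The names `n_baseline` and `k` are likewise assumptions.

```python
import numpy as np


def moving_average(values, window):
    """Simple moving average; the output has len(values) - window + 1 points."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


def first_rise_index(smoothed, n_baseline, k=3.0):
    """Index of the first point after the baseline exceeding mean + k * std.

    Assumption: the first n_baseline points are taken as the healthy baseline.
    Returns None if the threshold is never exceeded.
    """
    smoothed = np.asarray(smoothed, dtype=float)
    baseline = smoothed[:n_baseline]
    threshold = baseline.mean() + k * baseline.std()
    above = np.nonzero(smoothed[n_baseline:] > threshold)[0]
    if above.size == 0:
        return None
    return int(above[0]) + n_baseline
```

Note that the index returned refers to the smoothed series, which is shorter than the raw kurtosis series by `window - 1` points, so it has to be offset accordingly when mapped back to sample numbers.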
According to the results of health assessment, the full life cycle of bearings is divided into three health states: normal state, slow degradation state, and fast degradation state. The determined different states of each bearing are shown in Table 1.
Table 1. Classification of bearing health status.
Experimental results and analysis
In order to make full use of the degradation process information of bearings and improve the accuracy of RUL prediction, this paper divides the full life cycle of bearings into multiple health states, and then builds local models for each health state to study the data characteristics of each health state. Finally, the local models are combined to form a global model based on the proposed cooperative learning mechanism and the output of the global model is taken as the actual output.
For each local model, the marginal spectrum data of the signals in the corresponding health state are taken as the input, and the remaining useful life of the bearing in that health state is taken as the output. The first 80% of the data form the training set, and the last 20% form the test set. For the global model, the HHT marginal spectrum data of the full-cycle vibration signal of the bearing are taken as the input, and the remaining useful life of the bearing is taken as the output. Because the input data have a large dimension, Principal Component Analysis (PCA) is used to reduce the dimension, retaining the components whose contribution rate reaches 98%. As this paper does not focus on dimensionality reduction, we do not introduce PCA in detail; see Lu et al. 28
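A minimal sketch of the dimensionality reduction and the 80/20 split is given below. It assumes that the 98% figure refers to cumulative explained variance and that the split is chronological (first 80% for training, last 20% for testing); the function names are hypothetical.

```python
import numpy as np


def pca_reduce(X, variance_ratio=0.98):
    """Project X onto the fewest principal components reaching variance_ratio.

    Returns (scores, components, mean). Assumes X has nonzero total variance.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    # First index where the cumulative ratio reaches the target, as a count.
    k = int(np.searchsorted(np.cumsum(explained), variance_ratio)) + 1
    k = min(k, s.size)
    components = vt[:k]
    return centered @ components.T, components, mean


def chronological_split(X, y, train_fraction=0.8):
    """Split ordered samples into the first train_fraction and the remainder."""
    n_train = int(round(train_fraction * len(X)))
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]
```

One design choice worth noting: to avoid leaking test information, the projection (`components` and `mean`) can be fitted on the training portion only and then applied to the test portion as `(X_test - mean) @ components.T`. Whether the paper does this is not stated.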
The RUL calculation of bearings is shown by equation (31):
here,
The prediction performance indexes selected in this experiment are Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE). The corresponding calculation equations are as follows:
here,
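For reference, the two metrics in their usual form are $\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_i (y_i - \hat{y}_i)^2}$ and $\mathrm{MAPE} = \frac{1}{N}\sum_i \left|\frac{y_i - \hat{y}_i}{y_i}\right|$, where $y_i$ is the actual RUL and $\hat{y}_i$ the predicted RUL. The sketch below assumes that MAPE is reported as a fraction rather than a percentage, which is consistent with the reported value of 0.2041, and that samples with an actual RUL of zero are skipped in MAPE to avoid division by zero; both points are assumptions.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((a - p) ** 2)))


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction, skipping zero targets."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    nonzero = a != 0.0  # assumption: zero-RUL samples are excluded
    return float(np.mean(np.abs((a[nonzero] - p[nonzero]) / a[nonzero])))
```

For example, with actual RUL values of 100 and 200 and predictions of 110 and 180, the RMSE is the square root of 250 and the MAPE is 0.1.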
In this paper, the health status of the bearing over its full life cycle is evaluated and local models are then built, which significantly improves the prediction accuracy. Figure 8 shows the prediction results of each local model for bearing 1_1. Each local model predicts the remaining useful life in its corresponding health state well, with very small error. This indicates that each local model learns the data characteristics of its health state and makes full use of the information in that state. The local model prediction performance for all five bearings is shown in Table 2. All LSSVM model parameters in this paper are optimized using grid search and cross-validation.
Table 2. Local model prediction performance of all bearings.

Figure 8. Local model prediction results of bearing 1_1: (a) local model 1, (b) local model 2, and (c) local model 3.
To handle the multiple health states in the whole life of a bearing, this paper proposes a local-global cooperative learning strategy and combines it with the LSSVM to build a local-global cooperative LSSVM model for predicting the RUL of bearings. Due to space limitations, we only provide the RUL prediction results of four bearings, as shown in Figure 9.

Figure 9. Prediction results of four bearings: (a) bearing 1_1, (b) bearing 1_2, (c) bearing 1_3, and (d) bearing 1_4.
In order to verify the effectiveness of the proposed method, it is compared with the least squares support vector machine model, Ren's method, 18 Wen's method, 29 and Mao's method. 19 All results are shown in Table 3. Specifically, Ren's method 18 uses the spectrum principal energy vector of the original vibration signal as the input of a CNN and then builds an RUL prediction model directly on the CNN. Its main component is a CNN that consists of eight layers, including three convolutional layers, three pooling layers, and one fully connected layer. Each layer uses the ReLU activation function, and the average pooling window is 2 × 2. The prediction model is a six-layer deep neural network with network parameters [200, 100, 50, 30, 8, 1]. The training loss function of the whole network is the mean square error, and the number of epochs is 100. In Wen's method, 29 the parameters λ and b are 0.0172 and 2.055, respectively, and the parameters σ and γ are 0.4406 and 2.0208, respectively. In Mao's method, 19 deep features of the HHT marginal spectrum data of the original vibration signals are extracted by CDAE. The deep features are then used as the input of TCA to match the domain distributions of the target bearing and the auxiliary bearing, yielding 25-dimensional data features. Finally, the LSSVM regression prediction model is built with these data features as input. Its main parameters are as follows: the output of the CDAE has 50 dimensions, the output of TCA has 25 dimensions, and the LSSVM parameters are optimized by grid search and cross-validation.
Table 3. Comparison results of prediction performance of four different methods.
Table 3 shows that the least squares support vector machine cannot accurately predict the remaining useful life of the bearing, and its prediction accuracy is the lowest among the five compared methods. Ren's method 18 cannot study the characteristics of each degradation state because it lacks a health assessment, which leads to insufficient prediction accuracy. This indicates that building a uniform prediction model for the whole life of a bearing, with no health state assessment, cannot exploit enough information about the degradation process. Wen's method 29 and Mao's method 19 evaluate the fast degradation state of the bearing, but not the slow degradation state or the normal state. Although their prediction accuracy is improved compared with LSSVM and Ren's method, there is still room for improvement in their prediction of the remaining useful life of bearings. In our method, the SVD correlation coefficient matrix and the kurtosis coefficient are used to evaluate the degradation state of the bearing, so the degradation state is fully evaluated. Meanwhile, we build a local model for each degradation state to study its characteristics. In this way, we exploit enough information about the degradation process, which leads to the higher prediction accuracy of our method.
Conclusion
In this paper, to address the phased nature of bearing degradation, a new RUL prediction method based on a local-global cooperative least squares support vector machine is proposed to improve the accuracy of remaining useful life prediction for rolling bearings. From the experimental results, we draw the following conclusions:
To address the phased nature of bearing degradation, we propose a local-global cooperative learning strategy that evaluates each degradation state of the bearing with our proposed health assessment method. Meanwhile, we build a local model for each degradation state to learn the bearing degradation process. Compared with other methods, it makes full use of the information in the bearing degradation process and captures the state changes of the bearing well.
To address the difficulty of accurately predicting the remaining useful life under phased degradation, the proposed local-global cooperative learning strategy is combined with the least squares support vector machine to form a local-global cooperative least squares support vector machine model. Compared with other methods, the RMSE and MAPE of our method are lower, and the accuracy of RUL prediction is effectively improved.
In this paper, we use only the horizontal direction signal to study the RUL of bearings. However, the bearing signal is collected in multiple directions, and a single horizontal direction signal is not sufficient. Therefore, in future work we plan to mine the relationship between signal characteristics and the degradation process, looking for shared information among different signal characteristics to strengthen the prediction of the remaining useful life of bearings.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the National Natural Science Foundation of China under Grant 61763049, in part by the Key Projects of Applied Basic Research in Yunnan Province under Grant 2018FA032, and in part by Yunnan Provincial Young and middle-aged Academic and technical Leader Reserve Talent Project under Grant 202005AC160115.
