Abstract
The fault characteristic signals of rolling bearings are coupled with each other, which makes the fault characteristics difficult to identify. In this study, a fault diagnosis method for rolling bearings based on the least squares support vector machine is proposed. First, a least squares support vector machine model is trained with samples of known classes. The least squares support vector machine algorithm involves the selection of a kernel function. The complexity of the samples in high-dimensional space can be adjusted by changing the parameters of the kernel function, which affects the search for the optimal function as well as the final classification results. Particle swarm optimization and the 10-fold cross-validation method are adopted to optimize the parameters of the training model. Then, the classification model is updated with the optimized parameters. Finally, with the feature vectors of the test samples as the input of the least squares support vector machine, pattern recognition of the test samples is performed to achieve fault diagnosis. Actual bearing fault data are analyzed with the diagnosis method. The method gives accurate classification results and fast diagnosis and can be applied to the diagnosis of compound faults of rolling bearings.
Keywords
Introduction
Rolling bearings are widely used in rotating machinery and are easily damaged. According to statistical data, 30% of rotating machinery failures are caused by bearing failures. Therefore, the fault diagnosis of rolling bearings is important. 1
The essence of fault diagnosis is the pattern recognition of fault features, whereas the key to pattern recognition is to design a reasonable classifier. In recent years, intelligent pattern recognition methods have been developed, such as expert system, neural network, support vector machine (SVM), and least squares support vector machine (LSSVM).
An expert system is the earliest intelligent system. Knowledge and experiences of experts in a certain class are saved into a computer to establish a database, which enables the computer to simulate the thinking and judgment processes of experts, and then deal with some complex problems.2,3 However, an expert system still has some defects. The expert system has no self-learning ability and is highly dependent on experts. It is necessary to acquire and update expert knowledge. The level of experts largely determines the level of system fault diagnosis, thus seriously affecting the efficiency of the expert system.
Neural network refers to artificial neural network, which completes the logical expression between input and output by imitating the structure of animal neurons and has a strong nonlinear mapping ability. It is widely used in pattern recognition, fault diagnosis, automatic control, and other fields.4,5 However, the defects of neural network, such as long training time, poor generalization ability, and slow convergence speed, are not conducive to the pattern recognition of small samples.
SVM was first introduced in 1995 by Cortes and Vapnik 6 to deal with binary classification problems and showed unique advantages in pattern recognition of nonlinear small samples.7,8 Its application scope was then extended to machine learning. However, solving the quadratic programming problem of SVM is time-consuming, which lengthens fault diagnosis with an SVM model. Widodo and Yang 9 used the wavelet support vector machine to analyze the current signals of many faults for an induction motor and obtained a high fault identification rate. Zhang et al. 10 used an SVM to obtain an ideal diagnostic effect for cutting tool faults. However, when the noise in the signal is stronger than the actual fault signal, the characteristic parameters of the vibration signal used for state monitoring and fault diagnosis are not clear, and it is difficult to express the complicated mapping relationship between the health status of equipment and the measured signal with a traditional SVM. Moreover, the large amount of data collected by the acquisition system for mechanical system fault diagnosis leads to the problem of high dimensionality, which makes it difficult to obtain a high accuracy rate for fault diagnosis.11–14
Based on SVM, Suykens and Vandewalle 15 proposed the LSSVM algorithm as an extension of the standard SVM. The inequality constraints in SVM are changed into equality constraints in the LSSVM algorithm, which greatly facilitates the solution of the Lagrange multipliers. The quadratic programming problem thus becomes a problem of solving linear equations, which reduces the computational complexity and accelerates the solution process. S Zhang 16 used LSSVM as the classifier and combined detection coils with LSSVM for bearing fault detection. The detection coils provided signals for LSSVM, and LSSVM reported the fault status through a statistical learning algorithm. Wang et al. 17 explored the nonlinear and non-stationary properties of vibration signals of bearings and identified the faults with the particle swarm optimization least squares support vector machine (PSO-LSSVM) algorithm. Li et al. 18 used the LSSVM to recognize different fault types of wheel bearings. The PSO algorithm is a kind of evolutionary algorithm and is similar to the simulated annealing algorithm. It starts from a random solution, evaluates the solution by fitness, and finds the optimal solution by iteration. Its rules are simpler than those of a genetic algorithm because it has no crossover or mutation operation. PSO searches for the global optimum by following the best value found so far. PSO has attracted wide attention because of its easy implementation, high accuracy, fast convergence, and high performance in solving practical problems.
In this article, particle swarm optimization with 10-fold cross-validation (PSO-10-fold CV) and the LSSVM are combined to identify various failure modes of rolling bearings. First, with LSSVM as the classifier of bearing fault signals, the parameters of the classifier are optimized by combining PSO with the 10-fold cross-validation method. Then, the performance of the method in multi-classification is analyzed. Finally, the LSSVM multi-classifier is used to identify the failure modes of a rolling bearing.
Principle of LSSVM
SVM is a method that minimizes the structural risk and lies at the core of statistical learning theory. It achieves good generalization from limited sample information and handles nonlinear and small-sample problems well. SVM is widely applied in the fields of fault diagnosis and regression estimation. 19
SVM is a process of searching for the optimal hyperplane in a linearly separable state. For a binary classification problem,

Figure 1. SVM optimal hyperplane schematic diagram.
However, in practice, most of the sample sets are linearly inseparable, thus generating the problem of nonlinear sample classification. In order to solve this kind of problem, the sample set is mapped to a high-dimensional space by nonlinear mapping, so that the problem becomes a linearly separable classification problem in high-dimensional space. Then the problem is solved through searching for the optimal hyperplane. Finally, the optimal hyperplane in the high-dimensional space is mapped back to the original space, so that it becomes the optimal hyperplane in the nonlinear problem. The process of nonlinear classification is shown in Figure 2.

Figure 2. Nonlinear classification process.
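As a minimal illustration of this idea (not taken from the article), the sketch below maps four XOR-type points, which no straight line can separate in the plane, into three dimensions with a hypothetical mapping φ(x) = (x1, x2, x1·x2). The data, the mapping, and the hyperplane are assumptions chosen only to make the effect visible.

```python
import numpy as np

# Four XOR-type points: no straight line in the (x1, x2) plane separates the classes.
X = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]])
y = np.array([1, 1, -1, -1])

def feature_map(X):
    """Hypothetical nonlinear mapping phi(x) = (x1, x2, x1*x2) into 3-D space."""
    return np.column_stack([X[:, 0], X[:, 1], X[:, 0] * X[:, 1]])

Z = feature_map(X)

# In the mapped space the hyperplane w.z + b = 0 with w = (0, 0, 1), b = 0
# separates the two classes (assumed hyperplane, picked by inspection).
w, b = np.array([0.0, 0.0, 1.0]), 0.0
pred = np.sign(Z @ w + b).astype(int)
print(pred, (pred == y).all())
```

In the mapped space, the third coordinate alone carries the class label, so a single plane separates the two classes.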
According to the strategy of SVM, linearly inseparable sample points are mapped to high-dimensional space to obtain linearly separable sample points. Then the problem is equivalent to the problem of finding the optimal hyperplane in high-dimensional space. Therefore, the introduction of an appropriate nonlinear mapping
Then the optimal hyperplane in the original space is
The objective function is
According to equation (2), the Lagrangian function is constructed as
Take partial derivatives respectively with respect to
Substituting equation (4) into equation (3) gives
According to the Karush–Kuhn–Tucker (KKT) condition, the optimal solution should meet the following conditions
The kernel function
The final optimal classification function is
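For reference, the textbook soft-margin form of the SVM problem, its dual, and the resulting classification function are summarized below. This is the standard formulation of Cortes and Vapnik written with conventional symbols (weight vector w, bias b, slack variables ξ_i, penalty factor C, mapping φ, multipliers α_i, kernel K); the symbols and the absence of equation numbers are assumptions and may differ from the article's own notation. The hard-margin case is recovered as C tends to infinity.

```latex
\begin{aligned}
&\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad \text{s.t.}\quad y_i\bigl(w^{\mathsf T}\varphi(x_i)+b\bigr)\ge 1-\xi_i,\ \ \xi_i\ge 0,\\[4pt]
&\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i-\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}
\alpha_i\alpha_j y_i y_j K(x_i,x_j)
\quad \text{s.t.}\quad \sum_{i=1}^{n}\alpha_i y_i=0,\ \ 0\le\alpha_i\le C,\\[4pt]
&f(x)=\operatorname{sgn}\Bigl(\sum_{i=1}^{n}\alpha_i y_i K(x_i,x)+b\Bigr),
\qquad K(x_i,x_j)=\varphi(x_i)^{\mathsf T}\varphi(x_j).
\end{aligned}
```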
In order to solve large-scale problems and accelerate the calculation, LSSVM 15 is used. A least squares loss function is adopted, so that the quadratic programming problem of the traditional SVM is replaced by a linear system, and the inequality constraints are replaced with equality constraints. Although the sparsity and robustness of LSSVM are slightly decreased, the algorithm is greatly simplified and the calculation cost is reduced. LSSVM shows high performance in fault diagnosis and pattern recognition.
In general, the nonlinear optimal classification function is set as
The optimization problem of LSSVM is expressed as
where
Take partial derivative respectively with respect to
After eliminating the variables
where
According to Mercer condition, there are mapping functions
Substituting equation (13) into the nonlinear optimal classification function
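The practical consequence of the equality constraints can be sketched in a few lines. The code below follows the standard LSSVM classifier of Suykens and Vandewalle, in which the bias b and the multipliers α are obtained from the linear system [0, yᵀ; y, Ω + I/γ][b; α] = [0; 1] with Ω_ij = y_i y_j K(x_i, x_j). The function names, the symbol names gamma and sigma, the RBF kernel form, and the synthetic two-class data are assumptions made for illustration, not the article's implementation.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix K[i, j] = exp(-||A_i - B_j||^2 / (2 * sigma^2))."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_train(X, y, gamma, sigma):
    """Solve the LSSVM linear system for the bias b and the multipliers alpha."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y                      # first row:    y^T alpha = 0
    A[1:, 0] = y                      # first column: y * b
    A[1:, 1:] = np.outer(y, y) * rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)     # one linear solve, no quadratic programming
    return sol[0], sol[1:]

def lssvm_predict(X_train, y_train, alpha, b, X_new, sigma):
    """Classification function sign(sum_i alpha_i y_i K(x, x_i) + b)."""
    return np.sign(rbf_kernel(X_new, X_train, sigma) @ (alpha * y_train) + b)

# Synthetic, well-separated two-class data (assumed, for demonstration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, (30, 2)), rng.normal(2.0, 0.5, (30, 2))])
y = np.concatenate([-np.ones(30), np.ones(30)])

b, alpha = lssvm_train(X, y, gamma=10.0, sigma=1.0)
train_acc = np.mean(lssvm_predict(X, y, alpha, b, X, sigma=1.0) == y)
print(train_acc)
```

Every training sample receives a nonzero multiplier in general, which is the loss of sparsity mentioned above; in exchange, training reduces to a single call to a linear solver.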
Kernel function selection and parameter optimization
Selection of Kernel functions
According to the LSSVM algorithm, the kernel function
The selection of kernel function has great flexibility. The complexity of samples in high-dimensional space can be adjusted through changing the parameters of kernel function, thus affecting the search of the optimal function and the final classification effect. The commonly used kernel functions are introduced as follows:
1. Linear kernel function
2. Sigmoid kernel function
3. Polynomial kernel function
4. Radial basis function (RBF)
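The four kernels listed above can be written compactly as follows. The forms used here are the common textbook ones; the parameter names (kappa, theta, degree, c, sigma) and their default values are assumptions and are not necessarily the symbols used in the article.

```python
import numpy as np

def linear_kernel(x, z):
    """K(x, z) = x . z"""
    return float(np.dot(x, z))

def sigmoid_kernel(x, z, kappa=1.0, theta=0.0):
    """K(x, z) = tanh(kappa * x . z + theta)"""
    return float(np.tanh(kappa * np.dot(x, z) + theta))

def polynomial_kernel(x, z, degree=3, c=1.0):
    """K(x, z) = (x . z + c) ** degree"""
    return float((np.dot(x, z) + c) ** degree)

def rbf_kernel(x, z, sigma=1.0):
    """K(x, z) = exp(-||x - z||^2 / (2 * sigma^2)); sigma is the kernel width."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma**2)))

x = np.array([1.0, 2.0])
z = np.array([0.5, -1.0])
print(linear_kernel(x, z), sigmoid_kernel(x, z),
      polynomial_kernel(x, z), rbf_kernel(x, z))
```

The RBF kernel has a single width parameter, whereas the sigmoid and polynomial kernels in this form each carry two.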
The characteristics of LSSVM vary with the selected kernel function. Both the Sigmoid kernel function and the polynomial kernel function have certain limitations in parameter selection. Inappropriate parameters may increase the calculation load and sometimes lead to improper results. The linear kernel function is a special form of the RBF kernel function. The RBF kernel function has good performance and strong learning ability and has been widely used in pattern recognition and fault diagnosis. It also has universal features. By adjusting the appropriate parameter
Parameter optimization
LSSVM classifier based on RBF kernel function has two main structural parameters: width
In order to further improve the performance of LSSVM classifier, PSO-10-fold CV methods are adopted to improve the selection of parameters. PSO20,21 is characterized by a simple procedure, fast convergence, and high global optimization and constraint processing ability. Its steps are provided as follows:
1. The given sample set is divided into 10 independent subsets of the same size.
2. Two parameters in the kernel function that need to be optimized,
3. For each particle, the precision of the classifier under the current parameters is obtained by the 10-fold cross-validation algorithm and taken as the fitness value of that particle. In each fold, one subset is taken as the validation set, whereas the other nine subsets are taken as the training set, and the accuracy of the LSSVM classifier is calculated.
4. The fitness value of each particle is compared with the fitness value at that particle's own best position so far. If it is better, the particle's best position is updated.
5. The fitness value of each particle is compared with the fitness value at the global optimal position. If it is better, the global optimal position is updated.
6. The positions and velocities of the particles are updated. A classification accuracy of more than 90% is set as the termination condition. When the termination condition is not satisfied, go to Step 3. When it is satisfied, the process ends and the optimal parameter
The flow chart of LSSVM parameter optimization is shown in Figure 3.

Figure 3. Parameter optimization process of LSSVM.
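The optimization loop described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's code: the two tuned parameters are taken to be the regularization parameter gamma and the RBF width sigma, the search is run on log10 scales within assumed bounds, the inertia and acceleration coefficients (0.7, 1.5, 1.5) and swarm size are conventional choices, and the data are synthetic.

```python
import numpy as np

def rbf(A, B, sigma):
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit_predict(Xtr, ytr, Xte, gamma, sigma):
    """Train a binary LSSVM on (Xtr, ytr) and return predicted labels for Xte."""
    n = len(ytr)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:], A[1:, 0] = ytr, ytr
    A[1:, 1:] = np.outer(ytr, ytr) * rbf(Xtr, Xtr, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], np.ones(n))))
    b, alpha = sol[0], sol[1:]
    return np.sign(rbf(Xte, Xtr, sigma) @ (alpha * ytr) + b)

def cv_accuracy(X, y, gamma, sigma, folds):
    """Step 3: each fold is the validation set once, the other nine train."""
    correct = 0
    for k in range(len(folds)):
        val = folds[k]
        trn = np.concatenate([folds[j] for j in range(len(folds)) if j != k])
        pred = lssvm_fit_predict(X[trn], y[trn], X[val], gamma, sigma)
        correct += np.sum(pred == y[val])
    return correct / len(y)

def pso_cv(X, y, n_particles=10, n_iter=15, target=0.90, seed=0):
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), 10)      # Step 1
    lo, hi = np.array([-1.0, -1.0]), np.array([3.0, 1.0])    # assumed log10 bounds
    pos = rng.uniform(lo, hi, size=(n_particles, 2))         # Step 2
    vel = np.zeros_like(pos)
    pbest, pbest_fit = pos.copy(), np.full(n_particles, -1.0)
    gbest, gbest_fit = pos[0].copy(), -1.0
    for _ in range(n_iter):
        for i in range(n_particles):
            fit = cv_accuracy(X, y, 10.0**pos[i, 0], 10.0**pos[i, 1], folds)
            if fit > pbest_fit[i]:                           # Step 4
                pbest[i], pbest_fit[i] = pos[i].copy(), fit
            if fit > gbest_fit:                              # Step 5
                gbest, gbest_fit = pos[i].copy(), fit
        if gbest_fit > target:                               # termination condition
            break
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)                     # Step 6
    return 10.0**gbest[0], 10.0**gbest[1], gbest_fit

# Synthetic two-class data (assumed, for demonstration only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.6, (30, 2)), rng.normal(2.0, 0.6, (30, 2))])
y = np.concatenate([-np.ones(30), np.ones(30)])
best_gamma, best_sigma, best_acc = pso_cv(X, y)
print(best_gamma, best_sigma, best_acc)
```

Searching on a logarithmic scale is a design choice of this sketch: both parameters can span several orders of magnitude, and a linear scale would concentrate almost all particles near the upper bound.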
Multi-classification problem analysis
It can be seen from the above basic theories that both SVM and LSSVM are binary classification models, which use the optimal hyperplane to divide the original samples into two classes. However, in practice, many problems are not limited to two classes, so it is necessary to extend SVM and LSSVM to solve multi-classification problems. The most intuitive idea is to combine multiple binary classifiers. Common methods include the one-versus-one, one-versus-rest, and one-versus-others classification algorithms.
The main idea of the one-versus-one classification algorithm is to establish

Figure 4. One-versus-one classification method.
The main idea of the one-versus-rest classification algorithm is to establish m classifiers for the m classes of samples. In the ith classifier, the ith class of samples is set as positive and the rest are set as negative. Each classifier can separate one class of samples. The classification method is shown in Figure 5.

Figure 5. One-versus-rest classification method.
The main idea of the one-versus-others classification algorithm is to establish

Figure 6. One-versus-others classification method.
Multiple classification of LSSVM can be realized through the above three methods. However, according to the one-versus-one classification method, two classes may have the same number of votes for a sample in the voting, indicating that the sample will be inseparable. For one-versus-rest classification method, overlapping regions or unknown regions often occur. Since the classified samples are removed from original samples in one-versus-others classification method, the above phenomena do not occur. Therefore, one-versus-others classification method is adopted to classify bearing faults and the classification order is from single fault to composite fault.
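A minimal sketch of the one-versus-others scheme is given below. It assumes that the classes are processed in a fixed order, that classifier k separates class k from all classes still remaining, and that the samples of class k are removed before the next classifier is trained, so that m classes need m − 1 classifiers. The binary LSSVM, its fixed parameters, and the three-class synthetic data are assumptions for illustration only.

```python
import numpy as np

def rbf(A, B, sigma):
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_binary(X, y, gamma=10.0, sigma=1.0):
    """Train a binary LSSVM (labels +1/-1) and return a prediction function."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:], A[1:, 0] = y, y
    A[1:, 1:] = np.outer(y, y) * rbf(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], np.ones(n))))
    b, alpha = sol[0], sol[1:]
    return lambda Xn: np.sign(rbf(Xn, X, sigma) @ (alpha * y) + b)

def fit_one_vs_others(X, labels, order):
    """Train len(order) - 1 stages; stage k splits order[k] from the classes after it."""
    stages, keep = [], np.ones(len(labels), dtype=bool)
    for cls in order[:-1]:
        y = np.where(labels[keep] == cls, 1.0, -1.0)
        stages.append((cls, fit_binary(X[keep], y)))
        keep &= (labels != cls)        # remove the class that was just separated
    return stages

def predict_one_vs_others(stages, last_cls, Xn):
    """Pass samples through the stages; the first positive decision fixes the class."""
    out = np.full(len(Xn), last_cls)
    undecided = np.ones(len(Xn), dtype=bool)
    for cls, clf in stages:
        hit = undecided & (clf(Xn) > 0)
        out[hit] = cls
        undecided &= ~hit
    return out

# Three well-separated synthetic classes (assumed, for demonstration only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
X = np.vstack([rng.normal(c, 0.5, (20, 2)) for c in centers])
labels = np.repeat([0, 1, 2], 20)

order = [0, 1, 2]
stages = fit_one_vs_others(X, labels, order)
acc = np.mean(predict_one_vs_others(stages, order[-1], X) == labels)
print(len(stages), acc)
```

Because every sample leaves the cascade at exactly one stage, or falls through to the last class, tied votes and unclassified regions cannot arise in this scheme.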
Experimental verification and analysis
Previous studies in the field of pattern recognition focused on single rolling bearing faults. In this article, a rolling bearing fault diagnosis method based on the LSSVM classifier is studied. For both single faults and compound faults, it gives more accurate classification and requires less time. The following experiments show that this method is applicable to rolling bearing fault diagnosis.
Case 1: Case Western Reserve University Bearing Data
In order to verify the classification effect of LSSVM, the bearing failure data from Case Western Reserve University Bearing Data Center were selected for the verification experiment. The data center only provided the fault data of a single bearing, so the fault data of composite bearings were obtained through the principle of vector superposition simulation.
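The article does not spell out the superposition step here. One plausible reading, shown below purely as an assumption, is a point-wise sum of two single-fault vibration records sampled at the same rate. The sinusoidal stand-in signals, their frequencies, and the 12 kHz sampling rate are hypothetical placeholders rather than the actual bearing data.

```python
import numpy as np

def superpose(sig_a, sig_b):
    """Point-wise sum of two equally sampled signals, truncated to the shorter one."""
    n = min(len(sig_a), len(sig_b))
    return np.asarray(sig_a[:n], dtype=float) + np.asarray(sig_b[:n], dtype=float)

# Hypothetical stand-ins for an outer-race and an inner-race fault record.
fs = 12000                                   # assumed sampling rate in Hz
t = np.arange(2000) / fs
outer = np.sin(2 * np.pi * 100 * t)
inner = 0.5 * np.sin(2 * np.pi * 160 * t)

compound = superpose(outer, inner)           # simulated outer-inner compound fault
print(compound.shape)
```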
When SVM or LSSVM is used as a classifier for pattern recognition, a training model should be established first. Therefore, the data set should be divided into training sample subset and testing sample subset. According to the above analysis, the classifier used RBF as the kernel function. Sampling points of time-domain signals could be directly inputted into the classifier as fault eigenvectors, so 1000 sampling points were selected for each set of eigenvectors and 50 sets of data were collected for each fault, including 15 sets of data as training samples and 35 sets of data as testing samples. The two parameters

Figure 7. Fault classification of rolling bearing based on SVM.

Figure 8. Fault classification of rolling bearing based on LSSVM.
Table 1. Comparison of fault diagnosis results.
SVM: support vector machine; LSSVM: least squares support vector machine.
Compared with SVM, LSSVM had 11.9% higher fault diagnosis accuracy and reduced diagnosis time by 3.1127 (Figures 7 and 8 and Table 1), indicating that the modeling speed and accuracy of LSSVM were significantly higher than those of SVM. LSSVM had a better classification effect than SVM.
In order to improve the classification accuracy, the PSO-10-fold CV method was adopted to optimize the two parameters of the classifier and the optimal parameters were obtained as

Figure 9. LSSVM classification with optimized parameters.
Case 2: data of MFS mechanical fault simulation bench
In order to further verify the classification method in this article, the MFS mechanical fault comprehensive simulation test bench produced by SQI Company in the United States was used in the experiment. The structure and device of the test bench are shown in Figure 10. The test bench can simulate the common mechanical equipment failure and be used to study fault diagnosis.

Figure 10. Mechanical fault simulation experiment device.
By changing fault bearings, the MFS mechanical fault comprehensive simulation test bench can simulate the outer fault, inner fault, ball fault, outer-inner fault, outer-ball fault, and inner-ball fault of rolling bearings to obtain actual bearing fault vibration signals. The six types of fault bearings are shown in Figure 11(a)–(f). The acceleration sensor was installed on the bearing seat to measure the vibration signal of the bearing. The bearing model is KR-12K, with an inner diameter of 19.05 mm, an outer diameter of 46.99 mm, eight steel balls, a ball diameter of 7.87 mm, and a contact angle of 0°.

Figure 11. Six types of fault bearing: (a) outer fault bearing, (b) inner fault bearing, (c) ball fault bearing, (d) outer-inner fault bearing, (e) outer-ball fault bearing, and (f) inner-ball fault bearing.
When the rolling bearing vibration signals were collected on the MFS mechanical fault comprehensive simulation laboratory bench, the sampling frequency was 2.56 kHz and the rotation frequency was 30 Hz. Normal bearing signals and six kinds of bearing failure signals were collected through the module conversion process.
LSSVM was used as a classifier for pattern recognition of the above bearing signals. With RBF as the kernel function, 100 sets of vibration signals were selected for each fault. Each set of vibration signals contained 4000 sampling points. Twenty sets of signals were used as training samples, and the remaining 80 sets of signals were used as testing samples. Since there were six failure states and one normal state in the experiment, it was necessary to construct six LSSVM classifiers, which respectively corresponded to the outer fault, inner fault, ball fault, inner-ball fault, outer-ball fault, and outer-inner fault; the remaining signals were classified as normal signals. The two parameters were set as

Figure 12. Fault classification results based on three feature vectors: (a) fault classification diagram of MSE, (b) fault classification diagram of energy, and (c) fault classification diagram of MSEE.
Table 2. Fault diagnosis rate of LSSVM.
LSSVM: least squares support vector machine; MSE: multiscale sample entropy; MSEE: multiscale sample entropy energy.
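The feature extraction behind the MSE feature vector is not shown in this section. As a hedged illustration of what multiscale sample entropy commonly means, the sketch below coarse-grains a signal by averaging non-overlapping windows and computes the sample entropy at each scale, with the tolerance fixed from the original signal. The embedding dimension m = 2, the tolerance factor r = 0.2, the number of scales, and the test signals are assumptions, and the article's own settings may differ.

```python
import numpy as np

def sample_entropy(x, m, tol):
    """SampEn(m, tol) with Chebyshev distance, excluding self-matches."""
    x = np.asarray(x, dtype=float)
    n = len(x)

    def count_pairs(length):
        # The same n - m template start points are used for both lengths.
        templ = np.array([x[i:i + length] for i in range(n - m)])
        dist = np.abs(templ[:, None, :] - templ[None, :, :]).max(axis=2)
        return (np.count_nonzero(dist <= tol) - len(templ)) / 2

    B, A = count_pairs(m), count_pairs(m + 1)
    return np.inf if A == 0 or B == 0 else float(-np.log(A / B))

def coarse_grain(x, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    x = np.asarray(x, dtype=float)
    n = len(x) // scale
    return x[:n * scale].reshape(n, scale).mean(axis=1)

def multiscale_sample_entropy(x, max_scale=3, m=2, r=0.2):
    tol = r * np.std(x)               # tolerance fixed from the original signal
    return np.array([sample_entropy(coarse_grain(x, s), m, tol)
                     for s in range(1, max_scale + 1)])

# A regular signal should score lower than white noise (synthetic check).
rng = np.random.default_rng(0)
sine = np.sin(2 * np.pi * np.arange(500) / 50)
noise = rng.standard_normal(500)
mse_sine = multiscale_sample_entropy(sine)
mse_noise = multiscale_sample_entropy(noise)
print(mse_sine, mse_noise)
```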

Figure 13. Parameter optimization process.

Figure 14. LSSVM classification obtained with optimized parameters.
Conclusion
In this article, a fault diagnosis method based on LSSVM is proposed to address the mutual coupling of rolling bearing fault characteristics and the resulting difficulty in identifying the characteristics of single faults and composite faults. Before the training model was established, the PSO-10-fold CV algorithm was used to optimize the parameters, so that the model had a better classification effect. Through the processing and comparative analysis of actual bearing fault vibration signals, the experimental results showed that the fault diagnosis classifier based on LSSVM classified rolling bearing faults more accurately and in a shorter time and thus had higher diagnosis performance.
In this article, a supervised fault diagnosis method and the sample data of known classes were adopted. However, the classification of samples of unknown classes still needs to be further studied.
Footnotes
Handling Editor: Andreas Rosenkranz
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: National Natural Science Foundation of China (grant nos. 61803005, 61640312, 61763037), Beijing Natural Science Foundation (grant nos. 4192011, 4172007), Shandong Province Key Research and Development Program Project (grant no. 2018CXGC0608), and Education Commission of Beijing Funding.
