Abstract
To extract features effectively and improve the accuracy of gear fault diagnosis, this research proposes a method based on wavelet-packet independent component analysis and a support vector machine with kernel function fusion. The wavelet-packet independent component analysis feature extraction method combines the advantages of the wavelet packet and independent component analysis methods and acquires more comprehensive feature information, while the kernel-function-fusion support vector machine integrates the strengths of the individual kernel functions. The energy features of the wavelet packet coefficients are obtained by four-layer wavelet packet decomposition and are then further optimized by independent component analysis. The kernel-function-fusion support vector machine performs the gear fault diagnosis, with the two kernel function models that have the best self-classification accuracy employed cooperatively. The test samples are first classified by the main kernel function model, and selected samples are then reclassified by the other kernel function model, so that the two models jointly determine the type of each test sample. Comparative investigations demonstrate that the proposed method achieves high diagnosis accuracy, with per-state identification accuracies between 98.67% and 100%.
Introduction
In industrial production, condition monitoring and fault diagnosis systems play a very important role in minimizing the losses caused by machine faults. The gear is a crucial primary element of rotating machinery. Therefore, timely and effective monitoring of the running states of gears is of great significance for the operation and maintenance of machinery.
Researchers have carried out many investigations on fault diagnosis, with fruitful results. Ruo et al. 1 applied structured sparse time-frequency analysis to gear fault diagnosis and compared it with several state-of-the-art vibration separation methods and traditional time-frequency analysis methods, showing that the method can separate the characteristic signals under strong noise and resolve the internal time–frequency structure of the gear vibration signal. The Riemannian manifold method was adopted to test the fault status of mechanical equipment and to detect abnormality by visualizing the distribution of the covariance matrix. 2 A wind turbine gearbox was used as the experimental object, and the results showed that the Riemannian manifold method is reasonable and effective. M Heidari et al. 3 compared the wavelet packet (WP) support vector machine (SVM) and the least squares SVM (LSSVM) for gearbox fault diagnosis. The classification results indicated that both methods are very effective and that the wavelet SVM performs better. A back propagation (BP) neural network method was proposed for gearbox fault diagnosis and performed well. 4 In addition, many other feature extraction and mode classification methods have been investigated for gear fault diagnosis, such as empirical mode decomposition, 5 ensemble empirical mode decomposition, 6 local mean decomposition, 7 independent component analysis (ICA), 8 singular value decomposition, 9 SVM,10,11 neural network,12,13 hidden Markov model,14,15 decision tree, 16 root mean square, 17 and so on.18–25 In summary, there are a great many feature extraction and mode classification methods, and each has its own advantages and limitations.
In gear fault diagnosis, a single feature extraction method is usually adopted to acquire the characteristic information of the gear’s running states, yet the measured vibration signal is complex and vulnerable to noise interference. Hence, a traditional single feature extraction method in most cases has difficulty acquiring features that fully reflect the running states of the gears, which degrades the diagnosis accuracy. Meanwhile, the SVM is usually used together with LSSVM or other variants, but little work has focused on kernel function fusion.
To extract the relevant features of gear faults more effectively and further improve the diagnostic accuracy, this research proposes a fault diagnosis method based on wavelet-packet independent component analysis (WP-ICA) and SVM with kernel function fusion. WP-ICA is adopted to acquire the running features of the gear conditions; it combines the advantages of the WP and ICA methods and can extract more comprehensive feature information. Meanwhile, the SVM classification method with kernel function fusion is proposed to achieve better identification of gear faults.
WP-ICA feature extraction
The working status of running mechanical equipment is usually reflected in its vibration signal. That is to say, the running status can be evaluated by analyzing the characteristics of the vibration signal. Therefore, an effective feature extraction method plays a very important role in the fault diagnosis of mechanical equipment. At present, many feature extraction methods are used in machinery fault diagnosis. Each vibration-signal feature extraction method has its own advantages, so a method that integrates the advantages of different feature extraction methods can obtain more effective feature information. In view of the advantages of the WP and ICA methods, this article combines the two methods to extract the features.
WP method
WP analysis is a signal processing method that extends wavelet analysis. It decomposes not only the low-frequency part of the signal but also the high-frequency part that wavelet decomposition leaves undecomposed, which greatly improves the time-frequency resolution. The essence of WP decomposition is to perform orthogonal decomposition of a function space step by step. Multi-resolution analysis is generally based on different scale factors
The orthogonal decomposition of Hilbert space
Defining a subspace
where
Since the scale function
That
ICA method
Independent component analysis (ICA) is a comparatively recent signal processing method. Its core idea is to blindly separate the mixed observation signals on the premise of statistical independence, decomposing them into several independent components so as to recover the source signals from the observation signals. Its basic model is shown as follows
where
The purpose of the ICA is to find a weight matrix B such that
where
The specific algorithm flow is shown in Figure 1.

ICA feature extraction algorithm flow.
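For reference, the noise-free linear ICA model and its separating step are commonly written as below. Only the weight matrix B is named in the text; the symbols x, s, A, and y are notation assumed here and may differ from the original equations.

```latex
\begin{aligned}
\mathbf{x}(t) &= \mathbf{A}\,\mathbf{s}(t), \\
\mathbf{y}(t) &= \mathbf{B}\,\mathbf{x}(t) \approx \mathbf{s}(t),
\end{aligned}
```

Here x(t) is the vector of observation signals, s(t) is the vector of statistically independent source signals, A is the unknown mixing matrix, and y(t) is the estimate of the sources, which can be recovered only up to a permutation and scaling of the components.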
WP-ICA combination
Because the gear vibration signal suffers from serious noise interference, this article proposes a WP-ICA feature extraction method for the gear vibration signal, which combines the advantages of the WP and ICA methods to remove noise and extract as much useful feature information as possible. Zuo et al. 26 applied the wavelet transform and ICA to gearbox fault diagnosis, using the wavelet transform coefficients at different scales as the input to ICA. Without employing a classification method, Zuo et al. mapped the wavelet coefficients into a polar diagram to enhance the periodic transients caused by gearbox and bearing faults. 27 In this investigation, the WP-ICA feature extraction method is proposed to acquire the features of the gear states and to work together with the kernel-function-fusion SVM classification method to achieve effective gear fault diagnosis.
In this research, to acquire the features of the gear states effectively, the vibration signal data are preprocessed with a wavelet function and down sampling. The preprocessed signal is then decomposed by a four-layer WP, and the energy features of each frequency band are used as the inputs to ICA. The WP method has high resolution, which preserves as much useful information of the signal as possible. The ICA method separates signal series into independent signal series. The proposed WP-ICA method makes full use of the advantages of both feature extraction methods: it not only retains effective feature information but also further increases the differences between the features.
The specific WP-ICA feature extraction steps are shown as follows:
Step 1: The wavelet function is used to remove noise from the original gear vibration signal in the normal, slight fault, medium fault, and broken tooth states.
Step 2: To further decrease the impact of noise, the vibration signal data are divided into small segments of 100 data points each. The average value of each segment is then used as one sampling point, and the formula is as follows
where
Step 3: The energy features are acquired by performing WP decomposition. The new sample data are subjected to four-layer WP decomposition. Then, the energy characteristics of each frequency band are extracted and the formula is shown as follows
where
Step 4: The energy feature vector is further optimized using the ICA method.
Step 5: The final sample data feature vectors are obtained by normalizing energy feature vectors.
The overall feature extraction flow is shown in Figure 2. The wavelet function and down sampling are employed to reduce the noise interference in the gear vibration signal, and a 16-dimensional frequency-band energy feature vector is then acquired by four-layer WP decomposition. Finally, the ICA method is used to further optimize the extracted features, yielding more useful features that fully reflect the characteristics of the vibration signal and thereby improving the classification accuracy.
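Steps 2, 3, and 5 above can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar filter pair, the sum-to-one normalization, and all function names are assumptions made for illustration, since the text does not state the wavelet basis or the normalization formula. Step 1 (wavelet denoising) and Step 4 (ICA) are omitted.

```python
import math

def block_average(signal, block=100):
    """Step 2: replace each full block of `block` points by its mean value."""
    n_blocks = len(signal) // block
    return [sum(signal[i * block:(i + 1) * block]) / block
            for i in range(n_blocks)]

def haar_split(x):
    """One orthonormal Haar analysis step (assumed basis).

    Returns the low-pass (approximation) and high-pass (detail) halves.
    """
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet_energies(x, levels=4):
    """Step 3: split BOTH halves at every level, giving 2**levels bands,
    then take the sum of squared coefficients of each band as its energy.
    Bands are returned in natural (filter-bank) order, not frequency order."""
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            next_nodes.extend(haar_split(node))
        nodes = next_nodes
    return [sum(c * c for c in node) for node in nodes]

def normalize(features):
    """Step 5 (assumed form): scale the features so that they sum to 1."""
    total = sum(features)
    return [f / total for f in features] if total > 0 else list(features)

# Synthetic stand-in for a vibration record (not data from the paper).
raw = [math.sin(0.05 * n) + 0.3 * math.sin(1.7 * n) for n in range(3200)]
downsampled = block_average(raw)                  # 3200 points -> 32 points
energies = wavelet_packet_energies(downsampled)   # 16 band energies
features = normalize(energies)
```

Because the Haar step is orthonormal, the 16 band energies add up to the energy of the down-sampled signal whenever its length is a multiple of 16, which gives a quick sanity check on the decomposition.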

WP-ICA algorithm.
SVM with kernel function fusion
As an important part of fault diagnosis, mode recognition usually uses the extracted feature information as input parameters, adopts a suitable classification method to train the input parameters, and finally employs the trained network model to identify the category of the test sample. Among many mode recognition methods, SVM is widely applied due to its strong adaptability, high training speed, and good generalization ability in small sample classification problems.
SVM introduction
SVM is based on the statistical learning theory proposed by Vapnik.28,29 It is a machine learning method based on Vapnik-Chervonenkis (VC) dimension theory and the structural risk minimization principle. For a two-class linear classification problem, SVM maximizes the margin and transforms the problem into a convex quadratic programming problem. Its goal is to find an optimal hyper-plane in the
where
The optimal hyper-plane decision function of the linear classifier is shown as follows
where
The optimal hyper-plane decision function of the nonlinear classifier is shown as follows
where
For multi-classification problems, the solution is to construct multiple two-class SVM classifiers. The common SVM multi-classification methods are one-to-one and one-to-rest. When the one-to-one method is used to classify k categories of sample data, k(k – 1)/2 binary classifiers are constructed, one for each pair of categories. The category of a sample is determined by counting the total number of votes it receives from these classifiers. When the one-to-rest method is used, k binary classifiers need to be constructed. When constructing each binary classifier, the sample data belonging to the category in question are marked as one category, and the rest of the sample data are marked as the other category.
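The one-to-one voting scheme can be sketched as follows. The pairwise decision function is a toy stand-in for trained binary SVMs (a hypothetical nearest-centre rule on a one-dimensional feature), chosen only so that the example runs without any SVM library.

```python
from itertools import combinations

def one_vs_one_votes(sample, classes, pairwise_decide):
    """Collect one vote from each of the k(k-1)/2 pairwise classifiers."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(sample, a, b)] += 1
    return votes

def predict(sample, classes, pairwise_decide):
    """Return the class with the most votes (ties go to the earlier class)."""
    votes = one_vs_one_votes(sample, classes, pairwise_decide)
    return max(classes, key=lambda c: votes[c])

# Hypothetical class centres standing in for four trained gear-state models.
CENTRES = {"normal": 0.0, "slight": 1.0, "medium": 2.0, "broken": 3.0}

def nearest_centre(sample, a, b):
    """Toy binary classifier: choose whichever class centre is nearer."""
    return a if abs(sample - CENTRES[a]) <= abs(sample - CENTRES[b]) else b

classes = list(CENTRES)
votes = one_vs_one_votes(1.9, classes, nearest_centre)
label = predict(1.9, classes, nearest_centre)
```

With k = 4 gear states there are 4 × 3 / 2 = 6 pairwise classifiers, so the vote counts of any sample sum to 6 and the largest possible count for one class is 3.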
Kernel function fusion
The SVM classification method is also commonly called a kernel classification method. A linear relationship can be expressed as a dot product, and the dot product determines the Mercer kernel. Therefore, by introducing a kernel function, a nonlinear problem in the low-dimensional feature space can be transformed into a linear problem in a high-dimensional feature space and solved there as a linear classification problem. At present, the common kernel functions fall into four categories: the linear kernel function, the polynomial kernel function, the radial basis kernel function, and the sigmoid kernel function. Each has its own advantages. The linear kernel function has good linear classification performance; the polynomial kernel function has excellent global performance; the radial basis kernel function has superior local performance; and the sigmoid kernel function has no local minimum problem.
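The four kernel families named above are usually written in the forms sketched below. The parameter names (gamma, coef0, degree) and their default values are assumptions for illustration; the text does not give the parameterization used in the experiments.

```python
import math

def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    return dot(u, v)

def polynomial_kernel(u, v, gamma=1.0, coef0=1.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)

u, v = [1.0, 2.0], [3.0, 4.0]
k_lin = linear_kernel(u, v)
k_poly = polynomial_kernel(u, v)
k_rbf = rbf_kernel(u, v, gamma=0.5)
k_sig = sigmoid_kernel(u, v, gamma=0.1)
```

The radial basis value depends only on the distance between the two samples and decays quickly as they move apart, which is the sense in which it is local, whereas the polynomial value depends on the inner product and is influenced by samples far from each other.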
Generally, when the SVM method is used to classify sample data, only a single type of kernel function model is adopted to identify the category of the test samples. Each kernel function has its own characteristics. If the kernel functions are employed cooperatively, their complementary characteristics can be integrated to achieve better performance. When one trained kernel function model misclassifies a sample, another model may still classify it correctly. Hence, the proposed kernel function fusion method can improve the classification accuracy to a certain extent.
On the basis of WP-ICA feature extraction, the feature samples are divided into two parts, namely training samples and test samples. The training samples are used to train the SVM kernel function models. Of the four trained kernel function models, the two with the best self-classification accuracy are selected: the one with the highest self-classification accuracy serves as the main classification model, and the other serves as the auxiliary classification model that cooperates with it. The main model is first used to classify the test samples. Then, according to the output vote vectors of the SVM toolbox and the set vote parameters (such as the maximum and minimum vote values), some samples are selected to be reclassified by the auxiliary kernel function model. Finally, the two kernel function models together produce the recognition result. The specific algorithm flow is shown in Figure 3.

Kernel-function-fusion SVM.
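The two-stage decision can be sketched as below. The selection rule is an assumption: the text says only that samples are chosen from the vote vectors according to set vote parameters, so this sketch treats a sample as ambiguous when the gap between its two highest vote counts is below a hypothetical threshold `min_margin`. The main and auxiliary models are replaced by lookup tables so that the example is self-contained.

```python
def fused_predict(sample, main_votes, aux_predict, min_margin=1):
    """Classify with the main model; hand ambiguous samples to the auxiliary.

    main_votes(sample)  -> dict mapping class name to vote count
    aux_predict(sample) -> class name from the auxiliary kernel model
    min_margin          -> assumed rule: required gap between the top two
                           vote counts for the main decision to be accepted
    """
    votes = main_votes(sample)
    ranked = sorted(votes.values(), reverse=True)
    if ranked[0] - ranked[1] >= min_margin:
        return max(votes, key=votes.get), "main"
    return aux_predict(sample), "auxiliary"

# Hypothetical vote vectors from a main model and labels from an auxiliary one.
MAIN_VOTES = {
    "sample_1": {"normal": 3, "slight": 2, "medium": 1, "broken": 0},
    "sample_2": {"normal": 2, "slight": 2, "medium": 2, "broken": 0},
}
AUX_LABELS = {"sample_1": "normal", "sample_2": "medium"}

results = {name: fused_predict(name, MAIN_VOTES.get, AUX_LABELS.get)
           for name in MAIN_VOTES}
```

In this toy run the first sample has a clear winner and keeps the main model's label, while the second sample has a three-way tie and is decided by the auxiliary model.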
Experimental study
In order to verify the effectiveness of the proposed WP-ICA and SVM with kernel function fusion method in this article, multi-view experimental analyses were conducted.
Experimental sample data and analysis planning
In order to ensure the universality of the proposed method, authoritative public data (Kayvan Rafiee’s Official Webpage—Free Dataset) were used for experimental analysis. These data contain measured vibration data of the equipment under normal, slight, medium, and broken tooth conditions. The original sample data are abundant and representative.
To fully verify the effectiveness of the proposed method, in-depth experimental analysis and comparative investigation were conducted on both the feature extraction and the mode classification aspects, referred to as the longitudinal and horizontal comparisons, respectively. The first verifies the effectiveness of the proposed WP-ICA feature extraction method by comparing its classification accuracy with that of the single WP feature extraction method. The second compares the classification effectiveness of kernel function fusion and a single kernel function in the SVM classification method, using the feature extraction method suggested in this article. The specific experimental analysis planning is shown in Figure 4.

Experimental analysis planning.
Data processing
According to the WP-ICA feature extraction method described in section “WP-ICA combination,” the features of the original gear vibration signal are extracted. The four-layer WP decomposition is shown in Figure 5, and the feature vector curves of WP and WP-ICA are shown in Figure 6. Figure 6(a) shows the 16-dimensional energy features of the gear’s four states under the four-layer WP decomposition, and Figure 6(b) shows the 16-dimensional energy features after optimization by the ICA method on the basis of the four-layer WP decomposition. Comparing the dispersion of the feature values in the same dimension under different gear conditions, especially in dimensions 11–15, Figure 6 shows that the features extracted with the WP-ICA method for the different running states of the gear are easier to distinguish than those extracted with the WP method.

Normal signal coefficient reconstruction waveform.

The feature vector curves: (a) WP and (b) WP-ICA.
Six hundred feature samples are obtained for each running state (normal status, slight fault, medium fault, and broken tooth), of which 300 are randomly selected as training samples and the remaining 300 are used as test samples. In terms of feature extraction, the characteristic information of the gear vibration signal is extracted using the WP and WP-ICA methods, respectively, and the SVM classification method is employed to identify the type of the sample data. The c–g parameter optimization with WP and WP-ICA features is shown in Figure 7(a) and (b), respectively. In terms of mode recognition, the training samples are employed to train the SVM kernel function models. According to the fusion algorithm described in section “Kernel function fusion,” the polynomial and radial basis kernel functions of SVM are fused for the recognition and classification of the gear faults. The polynomial kernel function, which has the highest self-classification accuracy, is selected as the main classifier to classify the test samples. Based on the vote parameters summarized from the output vote vectors of the SVM toolbox, a part of the test samples is selected to be reclassified with the auxiliary radial basis kernel function model. The two models cooperate to identify the type of the test samples.

c and g parameter optimization of grid algorithm: (a) WP and (b) WP-ICA.
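The random 300/300 split and a grid search over c and g can be sketched as follows. The power-of-two grid, the fixed random seed, and the scoring function are assumptions for illustration: in practice `score(c, g)` would return something like the cross-validated accuracy of an SVM trained with those parameters, whereas here a toy function with a known optimum stands in for it.

```python
import math
import random

def split_train_test(samples, n_train=300, seed=0):
    """Randomly pick n_train samples for training; the rest become test samples."""
    rng = random.Random(seed)
    order = list(range(len(samples)))
    rng.shuffle(order)
    train = [samples[i] for i in order[:n_train]]
    test = [samples[i] for i in order[n_train:]]
    return train, test

def grid_search(score, exponents=range(-5, 6)):
    """Evaluate score(c, g) on the grid c = 2**i, g = 2**j and keep the best."""
    best = None
    for i in exponents:
        for j in exponents:
            c, g = 2.0 ** i, 2.0 ** j
            value = score(c, g)
            if best is None or value > best[0]:
                best = (value, c, g)
    return best

def toy_score(c, g):
    """Stand-in for cross-validated accuracy; peaks at c = 4, g = 0.5."""
    return -((math.log2(c) - 2.0) ** 2 + (math.log2(g) + 1.0) ** 2)

# 600 placeholder samples for one running state.
train, test = split_train_test(list(range(600)))
best_value, best_c, best_g = grid_search(toy_score)
```

Searching on a logarithmic grid is a common choice because SVM accuracy typically varies over orders of magnitude of c and g rather than over small linear steps.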
Experimental analysis and results
The experimental analysis was carried out from both feature extraction and mode recognition to verify the effectiveness of the proposed methods. In terms of feature extraction, the feature sample information of the gear vibration signal is first extracted using WP and the proposed WP-ICA, respectively. Then the training and testing feature samples are used to train the SVM network model and examine the identification performances, respectively. The identification results are shown in Table 1.
Identification results of different feature extraction methods.
WP: wavelet packet; WP-ICA: wavelet-packet independent component analysis.
It can be seen from Table 1 that the highest and lowest accuracies of the four status identifications with the WP feature extraction method are 90.67% and 73.33%, respectively, whereas the corresponding highest and lowest identification accuracies with the WP-ICA feature extraction method are 100% and 97.67%. The lowest per-state accuracy thus rises from 73.33% to 97.67%. The experimental results verify that the WP-ICA feature extraction method acquires the running information of the gears more effectively.
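If each per-state accuracy is computed over the 300 test samples of that state (an assumption, since the text does not state the denominator), the reported percentages correspond to whole numbers of correctly classified samples. The counts below are inferred from the percentages under that assumption and are not taken from the tables.

```python
def accuracy_percent(correct, total=300):
    """Per-state identification accuracy in percent, rounded to two decimals."""
    return round(100.0 * correct / total, 2)

# Inferred counts (assumed denominator of 300 test samples per state).
wp_lowest = accuracy_percent(220)
wp_highest = accuracy_percent(272)
wp_ica_lowest = accuracy_percent(293)
wp_ica_highest = accuracy_percent(300)
```

Under this reading, the lowest WP-ICA accuracy of 97.67% means 7 misclassified samples out of 300, compared with 80 for the lowest WP accuracy.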
The performance of a classifier can vary with the characteristics of the data. Using the proposed feature extraction method, this article therefore carries out a horizontal comparison that compares the performance of SVM classifiers with kernel function fusion and with a single kernel function. The identification results of SVM with a single kernel function and with kernel function fusion are shown in Table 2.
Identification results of single kernel function and kernel function fusion.
From the experimental results in Table 2, the lowest recognition accuracies are 97.67% and 98.67% for the SVM with a single kernel function and with kernel function fusion, respectively, and the highest is 100% for both. Both methods show high recognition accuracy. Taking the highest and lowest recognition accuracies together with the accuracy for each state, the classification performance of the kernel-function-fusion SVM is better than that of the SVM with a single kernel function. The longitudinal and horizontal comparisons thus verify that the proposed WP-ICA feature extraction method combined with the proposed kernel-function-fusion SVM classification method achieves higher fault diagnosis accuracy than the traditional method.
Conclusion
To improve the accuracy of gear fault diagnosis, this research develops a fault diagnosis method based on WP-ICA and SVM with kernel function fusion. During data processing, a wavelet function and down sampling are adopted to reduce the influence of noise. A feature extraction method based on WP-ICA is proposed, which integrates the advantages of the WP and ICA feature extraction methods and effectively acquires the features of the gear running states. Meanwhile, the SVM with kernel function fusion is proposed to further improve the diagnosis accuracy. To verify the effectiveness of the proposed method comprehensively, longitudinal and horizontal comparison investigations are carried out. The longitudinal comparison shows that the proposed feature extraction method acquires the running features of the gear more effectively than the single WP feature extraction method. The horizontal comparison verifies that the proposed SVM with kernel function fusion achieves higher diagnosis accuracy than the traditional single-kernel SVM method. The developed method combining WP-ICA and SVM with kernel function fusion performs very well in gear fault diagnosis.
Footnotes
Handling Editor: Zengtao Chen
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was financially supported by the National Natural Science Foundation of China (61773078), Open Foundation of Remote Measurement and Control Key Lab of Jiangsu Province (YCCK201303), and Industrial Technology Project Foundation of ChangZhou Government (CE20175040).
