Intelligent diagnosis of rolling bearing compound faults based on device state dictionary set sparse decomposition feature extraction

Abstract

Identification of rolling bearing fault patterns, especially compound faults, has attracted notable attention and remains a challenge in fault diagnosis. Intelligent diagnosis is an effective approach for compound faults of rolling element bearings, and effective fault feature extraction is the key step that largely determines the diagnosis result. When a compound fault arises in a rolling bearing, sparse decomposition can capture the complex impulsive characteristic components more effectively than other time–frequency analysis methods. Based on self-learned dictionaries that correspond to the characteristic modes of the device under different operating states, this article proposes an intelligent diagnosis method for rolling bearing compound faults that combines device state dictionary set sparse decomposition feature extraction with a hidden Markov model. First, characteristic dictionaries of the rolling bearing under different operating conditions are extracted by a sparse decomposition self-learning method, and the state dictionary set of the rolling bearing is constructed. Then, the compound fault signals of the bearing are transformed into the sparse domain using the constructed dictionary set to extract sparse features. Finally, the extracted sparse features are used as training and testing vectors of the hidden Markov model, and satisfactory intelligent diagnosis results are obtained. The validity of the proposed method is verified on compound faults of a rolling element bearing. In addition, its advantages are verified by comparison with other feature extraction and intelligent diagnosis methods, and the proposed method provides a feasible and efficient solution for diagnosing rolling bearing compound faults.

Keywords

Intelligent diagnosis rolling bearing compound fault dictionary set sparse decomposition hidden Markov model

Introduction

The rolling element bearing is one of the most commonly used rotating components, and effective fault diagnosis methods are needed to avoid catastrophic accidents. The fast Fourier transform (FFT) and envelope demodulation are the two classical signal-processing methods for this purpose. In recent years, various new signal-processing methods for fault diagnosis of rolling element bearings have emerged, such as the wavelet transform (WT),¹ tunable quality-factor wavelet transform (TQWT),² ensemble empirical mode decomposition (EEMD),³ spectral kurtosis (SK),⁴ and the modified fast kurtogram.⁵ However, most of these techniques are effective only for a single bearing fault. When faults occur in different parts of a rolling element bearing at the same time, most of them do not work effectively, for two reasons: (1) the compound faults are usually coupled with each other, and (2) the gear-meshing signal and its harmonics cause strong interference when compound faults arise in a gearbox. Fault diagnosis of rolling bearing compound faults is therefore both challenging and an active research area. A large body of literature has appeared in recent years, most of it focusing on classification of compound faults using intelligent algorithms. In one study,⁶ the support vector machine (SVM) was improved into a one-against-all multiclass support vector machine (MSVM), which was combined with heterogeneous feature models and applied to multiple combined fault diagnosis of bearings. The experimental results showed that the proposed method improved the classification performance.
A step-by-step compound fault diagnosis method for equipment, based on majorization–minimization (MM) and constrained sparse component analysis (SCA), was proposed to separate compound faults.⁷ A deep decoupling convolutional neural network was proposed and applied successfully to intelligent compound fault diagnosis.⁸ A method combining multi-scale feature extraction (MFE) and MSVM with particle parameter adaptive (PPA) was proposed for intelligent multiple-fault diagnosis.⁹ To diagnose compound faults of locomotive roller bearings accurately, a novel hybrid intelligent diagnosis method was proposed,¹⁰ and the diagnosis results verified that the hybrid method could accurately recognize the compound faults. Another method used combined mode functions to select the intrinsic mode functions, in place of the maximum cross-correlation coefficient-based EEMD technique, together with convolutional neural networks (deep neural networks) as fault classifiers.¹¹ A multi-fault diagnosis method for rolling bearing elements combining wavelet analysis with the hidden Markov model (HMM) was proposed.¹² An effective method for multi-fault diagnosis was presented that optimizes the signal decomposition levels using wavelet analysis and SVM.¹³ A rolling bearing fault diagnosis method based on a combination of EEMD, weighted permutation entropy, and an improved SVM ensemble classifier was proposed.¹⁴ The stationary WT and singular value decomposition were combined into the stationary wavelet singular entropy, which was used to extract fault features of rolling bearing compound faults.
The extracted features were passed to a kernel extreme learning machine classifier, and satisfactory results were obtained in experimental verification.¹⁵ The multi-scale wavelet entropy of rolling bearing compound faults was computed and used as the training and test input of a kernel extreme learning machine classifier, and the results verified that this method had slightly better diagnosis accuracy than other related methods.¹⁶

The literature above indicates that intelligent diagnosis of rolling bearing compound faults has two key steps: feature extraction and an efficient intelligent classification algorithm. As a signal-processing method, sparse decomposition can capture the implicit characteristics of the vibration signal when a fault arises in rotating machinery, and it has been used widely in fault diagnosis of rotating machinery and in other areas.²,¹⁷–²⁰ However, most papers on sparse-decomposition-based fault diagnosis of rolling bearings focus on a single fault, although the method also has great potential for feature extraction of rolling bearing compound faults. In recent years, various intelligent classification algorithms have emerged, such as the deep belief network (DBN),²¹,²² generalized linear regression model,²³ hidden Markov random field,²⁴ hybrid MLPNN-ICA,²⁵ and hybrid CNN-MLP.²⁶ However, most of these methods are used mainly in image classification. In contrast, HMM is an effective and mature intelligent algorithm that has been used widely in fault diagnosis of rotating machinery,²⁷–²⁹ and the authors of this article have also done some work on HMM.³⁰ This article therefore proposes an intelligent diagnosis method for rolling bearing compound faults based on device state dictionary set sparse decomposition feature extraction-HMM. First, characteristic dictionaries of the rolling bearing under different operating conditions are extracted by a sparse decomposition self-learning method, and the state dictionary set of the rolling bearing is constructed. Then, the compound fault signals of the bearing are transformed into the sparse domain using the constructed dictionary set to extract sparse features. Finally, the extracted sparse features are used as training and testing vectors of the HMM, and satisfactory intelligent diagnosis results are obtained.

The article is organized as follows: section “The sparse feature extraction method based on device state dictionary set” is dedicated to the sparse feature extraction method based on the device state dictionary set. Section “HMM” and section “The flow chart of the proposed method” are dedicated to the basic theory of HMM and the flow chart of the proposed method, respectively. The experiment, the analysis results, and the comparison and discussion are presented in section “Experiment,” and the conclusion is given in section “Conclusion.”

The sparse feature extraction method based on device state dictionary set

The construction method of sparse feature

Suppose there exist $L$ known target classes and the total training sample sets are represented by $Y = [Y_{1}, Y_{2}, \dots, Y_{L}]$ , where $Y_{i}$ is the $i$th training sample set. Let $D_{i} = [d_{i, 1}, d_{i, 2}, \dots, d_{i, n_{i}}]$ represent the sub-dictionary of the $i$th target class; $D_{i}$ can be obtained by solving the following problem

\begin{array}{l} {D_{i}, X_{i}} = \arg \min_{D_{i}, X_{i}} {‖ Y_{i} - D_{i} X_{i} ‖}_{2}^{2} + λ {‖ X_{i} ‖}_{0} \\ s . t . \forall i, {‖ d_{i, k} ‖}_{2} \leq 1 \end{array}

(1)

All the sub-dictionaries are gathered into a larger redundant dictionary $D = [D_{1}, D_{2}, \dots, D_{L}]$ , so that the dictionary contains the sub-dictionaries of all target classes. $D$ is called the dictionary set to distinguish it from $D_{i}$ . Let $y$ be a test sample signal whose sparse coefficients under $D$ are $X = [X_{1}, X_{2}, \dots, X_{L}]$ , where $X_{i} = [x_{i, 1}, x_{i, 2}, \dots, x_{i, n_{i}}]$ are the sparse coefficients corresponding to $D_{i}$ . The component of $y$ associated with each sub-dictionary can then be obtained from the sub-dictionary $D_{i}$ and its corresponding coefficients $X_{i}$

y_{i} = D_{i} X_{i} = \sum_{m} x_{i, m} d_{i, m} (1 \leq i \leq L)

(2)

Meanwhile, the test sample signal $y$ can be expressed as

y = \sum_{j = 1}^{L} y_{j}

(3)

A sub-dictionary learned from a given class adapts well to test signals of that class. If $i$ is the class label of the test sample $y$ , the sub-dictionary $D_{i}$ is more likely than the others to be activated to represent $y$ ; that is, the non-zero entries of the sparse coefficients $X$ are most likely to appear in $X_{i}$ . Sparse coefficients are often used directly as sparse features for classification in image and speech signal processing. For mechanical signals, however, the dimension of the sparse coefficients is often high. This article therefore proposes a sparse feature construction method based on energy distribution.

The test sample signal $y$ could be decomposed into the sum of a series of sub-components under the redundant dictionary set $D = [D_{1}, D_{2}, \dots, D_{L}]$ using the following equation

y = \sum_{j = 1}^{L} l_{j}

(4)

where $l_{i} = D_{i} X_{i}$ . The energy of each sub-component is defined as follows

E_{i} = \sum_{n} {(l_{i} (n))}^{2}

(5)

The normalized energy in equation (6) is used as the sparse feature, which prevents very large energy values from appearing in the feature vector

{\tilde{E}}_{i} = \frac{E_{i}}{E} = \frac{E_{i}}{\sum_{m} E_{m}}

(6)

The resulting normalized sparse feature vector is given in equation (7)

F = [{\tilde{E}}_{1}, {\tilde{E}}_{2}, \dots, {\tilde{E}}_{L}]

(7)
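Equations (4)–(7) can be illustrated with a minimal sketch in plain Python. It assumes that the sparse coefficients $X_{i}$ have already been computed by some solver; the function names, the list-of-atoms layout, and the toy numbers are illustrative assumptions and do not come from the original experiment.

```python
from typing import List, Sequence

Vector = List[float]


def sub_component(atoms: Sequence[Vector], coeffs: Sequence[float]) -> Vector:
    """Return l_i = D_i X_i, where `atoms` holds the columns d_{i,m} of D_i."""
    length = len(atoms[0])
    return [sum(x * atom[n] for x, atom in zip(coeffs, atoms)) for n in range(length)]


def sparse_energy_features(
    sub_dicts: Sequence[Sequence[Vector]], sub_coeffs: Sequence[Sequence[float]]
) -> Vector:
    """Normalized energies [E~_1, ..., E~_L] following equations (5)-(7)."""
    energies = [
        sum(v * v for v in sub_component(atoms, coeffs))  # E_i, equation (5)
        for atoms, coeffs in zip(sub_dicts, sub_coeffs)
    ]
    total = sum(energies)
    if total == 0.0:
        raise ValueError("all sub-components are zero; the features are undefined")
    return [e / total for e in energies]  # equation (6)


# Toy data (assumed): L = 2 classes, two atoms of length 3 per sub-dictionary.
D1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
D2 = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
X1 = [3.0, 0.0]  # only the first atom of D1 is active
X2 = [0.0, 1.0]  # only the second atom of D2 is active

F = sparse_energy_features([D1, D2], [X1, X2])
print(F)  # energies 9 and 1, so F is [0.9, 0.1]
```

Because the feature vector has only $L$ entries regardless of the number of atoms, it is much shorter than the raw coefficient vector, which is the motivation given above for using energies instead of coefficients.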

In summary, the flow chart of sparse feature construction method based on dictionary learning is shown in Figure 1, and it could be divided into two main steps: the construction of redundant dictionary set $D = [D_{1}, D_{2}, \dots, D_{L}]$ and sparse features $F = [{\tilde{E}}_{1}, {\tilde{E}}_{2}, \dots, {\tilde{E}}_{L}]$ . The shift invariant sparse coding (SISC)³¹ method is used to construct the redundant dictionary set $D = [D_{1}, D_{2}, \dots, D_{L}]$ , and the feature-sign search (FSS)³² method which will be discussed in section “Fast algorithm for sparse decomposition” is used to calculate the sparse coefficients of $D_{i}$ .

Figure 1.

Sparse feature construction based on dictionary learning.

Fast algorithm for sparse decomposition

Although there are many algorithms for solving sparse coefficients, such as matching pursuit (MP)³³ and basis pursuit (BP),³⁴ they are computationally expensive. The FSS method obtains an analytic solution by guessing the signs of the coefficients, which is more efficient than MP and BP. When the dictionary set $D$ is known, the objective function of FSS can be expressed as equation (8)

\min_{S} {‖ y - D S ‖}_{2}^{2} + β \sum_{j} | s_{j} |

(8)

Equation (8) is an $L_{1}$-regularized least squares problem. If the sign of each element of the coefficient vector $S$ is known, each absolute value $| s_{j} |$ can be replaced by a linear term: when $s_{j} < 0$ , $| s_{j} | = - s_{j}$ ; when $s_{j} > 0$ , $| s_{j} | = s_{j}$ ; when $s_{j} = 0$ , $| s_{j} | = 0$ . Equation (8) then becomes an unconstrained quadratic optimization problem that can be solved efficiently. The main steps of FSS are as follows:

1. Initialize every sparse coefficient and its sign to zero, that is, $s_{i} = 0$ and $θ_{i} = 0$ with $θ_{i} \in \{- 1, 0, 1\}$ . The active set is initialized as empty, $active set = \{\}$ .

2. Among all the coefficients with value 0, select $i = \arg \max_{i} | \partial {‖ y - D S ‖}_{2}^{2} / \partial s_{i} |$ .

If $\partial {‖ y - D S ‖}_{2}^{2} / \partial s_{i} > β$ , then $θ_{i} = - 1$ , $active set = \{i\} \cup active set$ .

If $\partial {‖ y - D S ‖}_{2}^{2} / \partial s_{i} < - β$ , then $θ_{i} = 1$ , $active set = \{i\} \cup active set$ .

3. Let $\hat{D}$ , $\hat{S}$ , and $\hat{θ}$ denote the parts of $D$ , $S$ , and $θ$ that correspond to $active set$ , and compute the analytical solution ${\hat{S}}_{new}$ of the resulting unconstrained quadratic optimization problem from $\hat{D}$ and $\hat{θ}$ . Then carry out a discrete line search on the closed line segment from $\hat{S}$ to ${\hat{S}}_{new}$ , and compare the sign of the searched point ${\hat{S}}^{'}$ with the sign of ${\hat{S}}_{new}$ as follows:

If the signs of ${\hat{S}}^{'}$ and ${\hat{S}}_{new}$ are the same, then $\hat{S} = {\hat{S}}_{new}$ ;

If the signs of ${\hat{S}}^{'}$ and ${\hat{S}}_{new}$ are not the same, then $\hat{S} = {\hat{S}}^{'}$ .

4. Remove the zero components of $\hat{S}$ from $active set$ , update $θ = sign (S)$ , and check the optimality conditions as follows:

(a) For non-zero coefficients: $\partial {‖ y - D S ‖}_{2}^{2} / \partial s_{i} + β sign (s_{i}) = 0, \forall s_{i} \neq 0$ . If condition (a) is not satisfied, go back to step 3. Otherwise, check condition (b).

(b) For zero coefficients: $| \partial {‖ y - D S ‖}_{2}^{2} / \partial s_{i} | \leq β, \forall s_{i} = 0$ . If condition (b) is not satisfied, go back to step 2. Otherwise, end the iteration process.
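The optimality conditions (a) and (b) characterize any minimizer of equation (8), whichever solver produced it. The sketch below checks them numerically on a toy problem. To stay short it does not implement FSS itself; it uses the simpler iterative shrinkage-thresholding algorithm (ISTA) as a stand-in solver, and the dictionary, signal, and $β$ value are made-up illustrative numbers.

```python
from typing import List

Matrix = List[List[float]]  # list of rows
Vector = List[float]


def gradient(D: Matrix, y: Vector, s: Vector) -> Vector:
    """Gradient of ||y - D s||_2^2 with respect to s, i.e. -2 D^T (y - D s)."""
    residual = [yi - sum(dij * sj for dij, sj in zip(row, s)) for row, yi in zip(D, y)]
    return [-2.0 * sum(row[j] * r for row, r in zip(D, residual)) for j in range(len(s))]


def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t * |v|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(D: Matrix, y: Vector, beta: float, n_iter: int = 2000) -> Vector:
    """Minimize ||y - D s||_2^2 + beta * sum_j |s_j| with ISTA (a stand-in for FSS)."""
    # 2 * ||D||_F^2 is an upper bound on the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * sum(v * v for row in D for v in row))
    s = [0.0] * len(D[0])
    for _ in range(n_iter):
        g = gradient(D, y, s)
        s = [soft_threshold(sj - step * gj, step * beta) for sj, gj in zip(s, g)]
    return s


def is_optimal(D: Matrix, y: Vector, s: Vector, beta: float, tol: float = 1e-8) -> bool:
    """Check conditions (a) and (b) of step 4 up to a numerical tolerance."""
    for sj, gj in zip(s, gradient(D, y, s)):
        if sj != 0.0:
            sign = 1.0 if sj > 0 else -1.0
            if abs(gj + beta * sign) > tol:  # condition (a)
                return False
        elif abs(gj) > beta + tol:  # condition (b)
            return False
    return True


# Toy dictionary with two unit-norm atoms (columns) and a toy signal.
D = [[1.0, 0.6], [0.0, 0.8], [0.0, 0.0]]
y = [3.0, 0.2, 1.0]
beta = 1.0

s_hat = ista(D, y, beta)
print(s_hat, is_optimal(D, y, s_hat, beta))
```

For this toy problem the second coefficient is driven exactly to zero and the first settles at 2.5, so the second atom never needs to enter the active set; the all-zero vector, by contrast, violates condition (b) and would trigger step 2.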

HMM

An HMM¹² is a finite-state statistical model with a fixed number of states, and it is generally suitable for analyzing non-stationary signals such as speech and time-varying noise. An HMM is a doubly embedded stochastic process: the underlying stochastic process is not directly observable and can be observed only through another stochastic process that produces the sequence of observations. Depending on whether the observations are discrete or continuous, HMMs are divided into the discrete hidden Markov model (DHMM) and the continuous hidden Markov model (CHMM). A DHMM is described by the following parameters:

1. States. Let $N$ represent the number of states in the model. The states can be described as $S = \{S_{1}, S_{2}, \dots, S_{N}\}$ . The individual state at time $t$ is denoted as $q_{t}$ , with $q_{t} \in S$ .

2. Observation symbols. Let $M$ denote the number of distinct observation symbols per state. The observation symbols are represented as $V = \{V_{1}, V_{2}, \dots, V_{M}\}$ . The observable value at time $t$ is denoted as $o_{t}$ , with $o_{t} \in V$ .

3. State transition probability distribution $A = \{a_{i j}\}$

a_{i j} = P (q_{t + 1} = S_{j} | q_{t} = S_{i}) (1 \leq i, j \leq N)

(9)

with the following property

\sum_{j = 1}^{N} a_{i j} = 1 \forall i

(10)

in which $i$ and $j$ denote the state indices.

4. Observation symbol probability distribution $B = {b_{j} (k)}$

b_{j} (k) = P (o_{t} = V_{k} | q_{t} = S_{j}) (1 \leq j \leq N, 1 \leq k \leq M)

(11)

5. Initial state distribution $π = {π_{i}}$

π_{i} = P (q_{1} = S_{i}) (1 \leq i \leq N)

(12)

In summary, an HMM $λ$ is defined by the two model parameters $N$ and $M$ , the observation symbols, and the three sets of probability measures $A$ , $B$ , and $π$ . The model $λ$ is expressed as

λ = (π, A, B)

(13)
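Given a DHMM $λ = (π, A, B)$ , the probability $P (O | λ)$ of an observation sequence can be computed with the standard forward algorithm. The sketch below is a minimal illustration in plain Python; the two-state model and its probabilities are toy values chosen for the example, not parameters from this article.

```python
from typing import List, Sequence


def forward_likelihood(
    pi: Sequence[float],
    A: Sequence[Sequence[float]],
    B: Sequence[Sequence[float]],
    obs: Sequence[int],
) -> float:
    """Return P(O | lambda) for a discrete HMM lambda = (pi, A, B).

    pi[i]   : initial probability of state i, equation (12)
    A[i][j] : transition probability from state i to state j, equation (9)
    B[j][k] : probability of emitting symbol k in state j, equation (11)
    obs     : observation sequence as symbol indices
    """
    n_states = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha: List[float] = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][o]
            for j in range(n_states)
        ]
    # Termination: P(O | lambda) = sum_i alpha_T(i)
    return sum(alpha)


# Toy model with N = 2 states and M = 2 symbols (illustrative values only).
pi = [0.5, 0.5]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]

p = forward_likelihood(pi, A, B, [0, 1])
print(p)  # approximately 0.1915
```

For long sequences the unscaled recursion underflows, so practical implementations rescale `alpha` at each step or work with logarithms.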

In practical applications, the observations are usually continuous. Although a continuous signal can be encoded into discrete symbols, much valuable information may be lost in the encoding process. In this case, a CHMM has an advantage over a DHMM. In a CHMM, a Gaussian mixture model is usually used to fit the probability distribution of the observations

b_{j} (O) = \sum_{m = 1}^{M} c_{j m} N (O, μ_{j m}, U_{j m}) (1 \leq j \leq N)

(14)

In equation (14), $M$ is the number of Gaussian elements, $c_{j m}$ is the mixture coefficient of the $m$th Gaussian element in the $j$th state, and $μ_{j m}$ and $U_{j m}$ are the mean vector and covariance matrix of the $m$th Gaussian element in the $j$th state. In summary, a CHMM whose observation probability is a Gaussian mixture distribution can be described as in equation (15)

λ = (π, A, C, μ, U)

(15)

The parameters can be estimated using the expectation-maximization (EM) algorithm.³⁵
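The emission density of equation (14) can be evaluated for one state as in the following sketch. For brevity it assumes diagonal covariance matrices, so each $U_{j m}$ is stored as a list of variances; equation (14) itself allows full covariance matrices. The function names and numbers are illustrative.

```python
import math
from typing import Sequence


def gaussian_diag(o: Sequence[float], mean: Sequence[float], var: Sequence[float]) -> float:
    """Density at `o` of a Gaussian with diagonal covariance (variances `var`)."""
    log_p = 0.0
    for x, m, v in zip(o, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return math.exp(log_p)


def emission_density(
    o: Sequence[float],
    weights: Sequence[float],
    means: Sequence[Sequence[float]],
    variances: Sequence[Sequence[float]],
) -> float:
    """b_j(O) = sum_m c_jm * N(O; mu_jm, U_jm) for a single state j, equation (14)."""
    return sum(
        c * gaussian_diag(o, mu, var) for c, mu, var in zip(weights, means, variances)
    )


# Toy state with M = 2 one-dimensional Gaussian elements (illustrative values).
b = emission_density(
    o=[1.0],
    weights=[0.5, 0.5],
    means=[[0.0], [2.0]],
    variances=[[1.0], [1.0]],
)
print(b)  # both elements are one standard deviation away from the observation
```

The mixture coefficients $c_{j m}$ of each state must sum to one for $b_{j}$ to be a valid probability density.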

The flow chart of the proposed method

The flow chart of the proposed method is shown in Figure 2. The method comprises the following four main steps:

Figure 2.

Flow chart of the proposed method.

1. State dictionary set construction. Apply the dictionary learning method to the training vibration data of the different running states of the rolling bearing to obtain the sub-dictionary $D_{i}$ of each running state. Then fuse the sub-dictionaries to obtain the device state dictionary set $D = [D_{1}, D_{2}, \dots, D_{C}]$ .

2. Sparse feature extraction. Using $D = [D_{1}, D_{2}, \dots, D_{C}]$ , decompose each group of signals into the sum of a series of sub-components, that is, $L = \{L_{1}, L_{2}, \dots, L_{C}\}$ . Normalize the resulting energies to obtain the sparse feature $F = [{\tilde{E}}_{1}, {\tilde{E}}_{2}, \dots, {\tilde{E}}_{C}]$ .

3. HMM training. Train one HMM for each state of the data and save the trained models in the model base, that is, $λ = \{λ_{1}, λ_{2}, \dots, λ_{C}\}$ .

4. On-line diagnosis. Diagnose the test samples with the trained HMMs in the model base. First, extract the features of the test samples using the state dictionary set to generate the feature sequence $O = \{o_{1}, o_{2}, \dots, o_{T}\}$ . Then calculate the probability of the test samples under each model, that is, $P (O | λ_{i})$ , and assign the test samples to the state whose model gives the largest $P (O | λ_{i})$ .
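The decision rule of the on-line diagnosis step is a maximum-likelihood selection over the model base. The sketch below shows only that rule. Each model is represented by a scoring function that returns a log-likelihood-like value for a feature sequence; the prototype-distance scorers and the feature values are hypothetical placeholders standing in for trained HMMs.

```python
from typing import Callable, Dict, List, Sequence, Tuple

FeatureSequence = Sequence[Sequence[float]]
Scorer = Callable[[FeatureSequence], float]


def diagnose(
    feature_sequence: FeatureSequence, model_base: Dict[str, Scorer]
) -> Tuple[str, Dict[str, float]]:
    """Score the sequence under every model and return the best-scoring state."""
    scores = {label: score(feature_sequence) for label, score in model_base.items()}
    best = max(scores, key=lambda label: scores[label])
    return best, scores


def prototype_scorer(prototype: List[float]) -> Scorer:
    """Placeholder for a trained model: higher score means closer to the prototype."""

    def score(seq: FeatureSequence) -> float:
        return -sum((x - p) ** 2 for o in seq for x, p in zip(o, prototype))

    return score


# Hypothetical model base for the four bearing states.
model_base = {
    "N": prototype_scorer([1.0, 0.0, 0.0, 0.0]),
    "OB": prototype_scorer([0.0, 1.0, 0.0, 0.0]),
    "OI": prototype_scorer([0.0, 0.0, 1.0, 0.0]),
    "OIB": prototype_scorer([0.0, 0.0, 0.0, 1.0]),
}

# Two feature vectors whose energy is concentrated on the second sub-dictionary.
O = [[0.10, 0.80, 0.05, 0.05], [0.20, 0.70, 0.05, 0.05]]
state, scores = diagnose(O, model_base)
print(state)  # OB
```

In a real pipeline each scorer would return $\log P (O | λ_{i})$ from the corresponding trained HMM; comparing logarithms gives the same decision as comparing the probabilities themselves.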

Experiment

A rolling element bearing compound fault experiment is carried out in this section to verify the effectiveness of the proposed method. Figure 3 shows the test rig, and the rolling bearing used in the experiment is of type NU205. The corresponding parameters of NU205 are presented in Table 1. Four running states of the test bearing are implemented: normal (N), outer race and ball compound fault (OB), outer and inner race compound fault (OI), and outer race, inner race, and ball compound fault (OIB). Figure 4(a)–(c) shows the machined faults on the inner race, rolling element, and outer race of the test bearing, respectively, and the three kinds of compound faults are realized by combining them. The bearing supporting the right end of the shaft is detachable so that the test bearing can be replaced conveniently during the tests. During the experiment, the outer ring is fixed and the inner ring rotates synchronously with the shaft. The acceleration sensor is installed near the test bearing and is used to collect the peak value of the corresponding vibration signal. The sampling frequency is $f_{s} = 8192 Hz$ and the rotating frequency of the shaft is $f_{r} = 13.3 Hz$ . Equations (16)–(18) are used to calculate the characteristic frequencies of the inner race fault, outer race fault, and rolling element fault

f_{i} = \frac{Z}{2} (1 + \frac{d}{D} \cos β) f_{r}

(16)

f_{o} = \frac{Z}{2} (1 - \frac{d}{D} \cos β) f_{r}

(17)

f_{b} = \frac{D}{d} [1 - {(\frac{d \cos β}{D})}^{2}] f_{r}

(18)

Figure 3.

Test rig.

Table 1.

Parameters of the test rolling element bearing.

Type	Ball number	Ball diameter (mm)	Pitch diameter (mm)	Contact angle	Motor speed (r/min)
NU205	12	7.5	39	0	800

Figure 4.

Machined faults on each part of the test bearing: (a) inner race fault, (b) rolling element fault, and (c) outer race fault.

In equations (16)–(18), $Z$ is the number of rolling elements, $d$ is rolling element diameter, $D$ is the pitch diameter, and $β$ is the contact angle. The values of $f_{i}$ , $f_{o}$ , and $f_{b}$ are 95.38, 64.61, and 5.38 Hz, respectively, through calculation.
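The calculation can be reproduced from Table 1 with the short sketch below, which takes $f_{r}$ as 800/60 Hz from the motor speed. Evaluated exactly as written, equations (16) and (17) give the quoted inner and outer race values, whereas equation (18) gives about 66.77 Hz rather than 5.38 Hz. The value 5.38 Hz coincides with the cage (fundamental train) frequency $\frac{1}{2} (1 - \frac{d}{D} \cos β) f_{r}$ , a standard formula that is not part of the original text and is included here only for comparison.

```python
import math
from typing import Dict


def bearing_frequencies(
    z: int, d: float, D: float, beta_deg: float, f_r: float
) -> Dict[str, float]:
    """Characteristic frequencies in Hz from equations (16)-(18).

    z: number of rolling elements, d: rolling element diameter,
    D: pitch diameter, beta_deg: contact angle in degrees,
    f_r: shaft rotating frequency in Hz.
    """
    ratio = (d / D) * math.cos(math.radians(beta_deg))
    return {
        "inner": z / 2 * (1 + ratio) * f_r,            # equation (16)
        "outer": z / 2 * (1 - ratio) * f_r,            # equation (17)
        "ball_eq18": (D / d) * (1 - ratio ** 2) * f_r,  # equation (18) as written
        "cage": 0.5 * (1 - ratio) * f_r,               # comparison only, not in the text
    }


# Table 1: NU205, 12 rolling elements, d = 7.5 mm, D = 39 mm, contact angle 0, 800 r/min.
freqs = bearing_frequencies(z=12, d=7.5, D=39.0, beta_deg=0.0, f_r=800 / 60)
for name, value in freqs.items():
    print(f"{name}: {value:.2f} Hz")
```

The inner and outer race results round to 95.38 Hz and 64.62 Hz, matching the values quoted above to within rounding.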

The time-domain waveforms of the four states of the test bearing and their envelope demodulation spectra are shown in Figure 5. Figure 5(a) and (b) shows the time-domain waveform of N and its envelope demodulation spectrum. Very few impulses appear in the time-domain waveform of N. In Figure 5(d), the fault characteristic frequency (FCF) of the outer race is extracted, but the FCF of the ball is not. In Figure 5(f) and (h), the spectral lines are chaotic, and the compound fault features of OI and OIB cannot be identified clearly.

Figure 5.

Time-domain waveforms of the four states of the test bearing with their corresponding envelope demodulation spectra: (a) time-domain waveform of N, (b) envelope demodulation spectrum of the signal shown in (a), (c) time-domain waveform of OB, (d) envelope demodulation spectrum of the signal shown in (c), (e) time-domain waveform of OI, (f) envelope demodulation spectrum of the signal shown in (e), (g) time-domain waveform of OIB, and (h) envelope demodulation spectrum of the signal shown in (g).

Ten groups of samples of each state are selected randomly as training samples and used to learn the corresponding sub-dictionary $D_{i}$ . The original signals of the four bearing states are analyzed by SISC with the following parameters: the atomic length is 256, the overlap rate is 0.25, the sparsity is 1, and the number of base atoms is 4. The sub-dictionaries of the four states are then obtained, where $D_{1}$ represents the sub-dictionary of N, $D_{2}$ the sub-dictionary of OB, $D_{3}$ the sub-dictionary of OI, and $D_{4}$ the sub-dictionary of OIB. The device state dictionary set $D = [D_{1}, D_{2}, D_{3}, D_{4}]$ is obtained by fusing the four sub-dictionaries. The basis atoms learned for each state using SISC are presented in Figure 6.

Figure 6.

Learned basis atoms for each condition: (a1)–(a4) for Normal; (b1)–(b4) for OB; (c1)–(c4) for OI; and (d1)–(d4) for OIB.

Each sub-dictionary adapts much better to samples of its own state, since it is learned from the training samples of that state. In other words, the matching sub-dictionary is more easily activated when a test sample is represented by the state dictionary set. Once the dictionary set is obtained, a test sample can be decomposed into a series of sub-components $\{l_{1}, l_{2}, l_{3}, l_{4}\}$ using the sparse feature construction method introduced in the previous section. The normalized sparse feature $\{{\tilde{E}}_{1}, {\tilde{E}}_{2}, {\tilde{E}}_{3}, {\tilde{E}}_{4}\}$ is then obtained, and its values represent the energy distribution of the sample over the sub-dictionaries. Figure 7 shows the energy distribution of each class of bearing data on each sub-dictionary.

Figure 7.

Energy distribution of each class of bearing data on each sub-dictionary: (a) N, (b) OB, (c) OI, and (d) OIB.

Each group of training samples is divided into 10 segments, and the length of each segment is 0.08 s. Each segment of the signal is sparsely decomposed over the dictionary set, and sparse features are extracted. A 4 × 10 feature vector sequence is thus obtained from each training sample group. An HMM is trained using the feature sequences of each state, and four HMM models ${λ_{N C}, λ_{ORF}, λ_{IRF}, λ_{REF}}$ corresponding to the four bearing states are obtained. Sparse features are extracted from each group of test samples and input into each trained HMM. The likelihood under each model is calculated in turn by the Viterbi algorithm, and the state of the test samples is determined by the maximum likelihood value. There are 80 sets of test samples in total: No. 1–20 come from the N state, No. 21–40 from the OB state, No. 41–60 from the OI state, and No. 61–80 from the OIB state. Figure 8 shows the diagnosis results using different feature vectors: (a) sparse feature vectors (SFVs); (b) time-domain statistical feature vectors (TDSVs) such as AMP, P-P, and so on; and (c) wavelet packet energy (WPE). Misclassified samples are marked with black circles. In Figure 8(a), two N samples and one OIB sample are misclassified. In total, 9 samples are misclassified in Figure 8(b) and 17 samples in Figure 8(c). The SFV therefore gives a better classification result than the other two feature vectors.
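As a quick arithmetic check, the misclassification counts above can be converted into diagnostic rates over the 80 test samples. The helper name in this sketch is illustrative.

```python
def diagnostic_rate(n_test: int, n_misclassified: int) -> float:
    """Percentage of correctly classified test samples."""
    return 100.0 * (n_test - n_misclassified) / n_test


# Misclassified samples reported for Figure 8(a)-(c), out of 80 test samples.
misclassified = {"SFV": 3, "TDSV": 9, "WPE": 17}
rates = {name: diagnostic_rate(80, n) for name, n in misclassified.items()}
print(rates)  # {'SFV': 96.25, 'TDSV': 88.75, 'WPE': 78.75}
```

These rates agree with the HMM column of Table 2.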

Figure 8.

Diagnosis results using different feature set: (a) SFV, (b) TDSV features, and (c) WPE features.

For comparison, the K-nearest-neighbor (KNN) and BP neural network algorithms are also applied to the above three kinds of features, and the diagnosis results are given in Table 2. The BP neural network adopts a three-layer structure, in which the number of nodes in the input layer equals the feature dimension, and the numbers of nodes in the hidden layer and the output layer are 10 and 4, respectively. With the same sparse feature as input, the diagnostic rates of KNN, BP, and HMM are 90%, 92.5%, and 96.25%, respectively, so HMM has the highest diagnostic accuracy. HMM also gives the highest rate with the TDSV feature, whereas with the WPE feature its rate (78.75%) is slightly lower than that of KNN (80%). Comparing different features under the same classifier shows that the diagnostic rate of SFV is significantly higher than that of the TDSV and WPE features.

Table 2.

Comparisons with other feature set and classifiers.

Feature class	KNN (%)	BP (%)	HMM (%)
SFV	90	92.5	96.25
TDSV	77.5	83.75	88.75
WPE	80	77.5	78.75

KNN: K-nearest-neighbor; HMM: hidden Markov model; SFV: sparse feature vector; TDSV: time-domain statistical feature vector.

Conclusion

Identification of rolling bearing fault patterns, especially compound faults, has attracted notable attention and remains a challenge in fault diagnosis. Sparse decomposition can capture the implicit characteristics of the vibration signal when a fault arises in rotating machinery, and it has great application potential in compound fault diagnosis of rolling bearings. In this article, a method of constructing sparse features based on dictionary learning is proposed. The method learns a feature dictionary for each state of the device, fuses all the feature dictionaries into the state dictionary set, and then extracts the sparse features of the different rolling bearing compound faults based on this state dictionary set. The extracted sparse features are used as the training and test input of an HMM, and satisfactory classification results are obtained. The concrete flow chart of the proposed method is given, and its validity is verified by a compound fault experiment on rolling bearings. In addition, to highlight the superiority of the sparse feature, its classification results are compared with those of the time-domain feature and the wavelet packet energy feature. The comparison shows that the sparse feature achieves high classification accuracy and stability.

Footnotes

Handling Editor: James Baldwin

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research is supported by the National Natural Science Foundation (approved grant: U1804141) and the Key Science and Technology Research Project of the Henan Province (approved grant: 192102210105).

ORCID iD

HongChao Wang

References

1. Qiu, Lee, Lin, et al. Wavelet filter-based weak signature detection method and its application on rolling element bearing prognostics. J Sound Vib 2006; 289: 1066–1090.

2. Selesnick IW. Wavelet transform with tunable Q-Factor. IEEE T Signal Proces 2011; 59: 3560–3575.

3. Huang NE. Ensemble empirical mode decomposition: a noise assisted data analysis method. Adv Adapt Data Anal 2009; 1: 1–41.

4. Antoni. The spectral kurtosis: a useful tool for characterizing non-stationary signals. Mech Syst Signal Pr 2006; 20: 282–307.

5. Antoni. Fast computation of the kurtogram for the detection of transient faults. Mech Syst Signal Pr 2007; 21: 108–124.

6. Manjurul, Kim JM. Reliable multiple combined fault diagnosis of bearings using heterogeneous feature models and multiclass support vectors machines. Reliab Eng Syst Safe 2019; 184: 55–66.

7. Hao, Song, Ren, et al. Step-by-step compound faults diagnosis method for equipment based on majorization-minimization and constrained SCA. IEEE-ASME T Mech 2019; 24: 2477–2487.

8. Huang, Liao, Zhang, et al. Deep decoupling convolutional neural network for intelligent compound fault diagnosis. IEEE Access 2019; 7: 1848–1858.

9. Zhang, Cai, Cheng WM. Multiple-fault diagnosis method based on multiscale feature extraction and MSVM-PPA. Shock Vib 2018; 2018: 6209371.

10. Lei YY. Application of a novel intelligent method to compound fault diagnosis of locomotive roller bearings. J Vib Acoust 2008; 130: 034501.

11. Singh, Kumar, Dwivedi JP. Compound fault prediction of rolling bearing using multimedia data. Multimed Tools Appl 2017; 76: 18771–18788.

12. Purushotham, Narayanan, Prasad SAN. Multi-fault diagnosis of rolling bearing elements using wavelet analysis and hidden Markov model based fault recognition. NDT&E Int 2005; 38: 654–664.

13. Abbasion, Rafsanjani, Farshidianfar, et al. Rolling element bearings multi-fault classification based on the wavelet denoising and support vector machine. Mech Syst Signal Pr 2007; 21: 2933–2945.

14. Zhou, Qian, Chang, et al. A novel bearing multi-fault diagnosis approach based on weighted permutation entropy and an improved SVM ensemble classifier. Sensors 2018; 18: 1934.

15. Rodriguez, Cabrera, Lagos, et al. Stationary wavelet singular entropy and kernel extreme learning for bearing multi-fault diagnosis. Entropy 2017; 19: 541.

16. Rodriguez, Pablo A, Lida B, et al. Combining multi-scale wavelet entropy and kernelized classification for bearing multi-fault diagnosis. Entropy 2019; 21: 152.

17. Wang, Chen, Dong GM. Feature extraction of rolling bearing’s early weak fault based on EEMD and tunable Q-factor wavelet transform. Mech Syst Signal Pr 2014; 48: 103–119.

18. Lee, Battle, Raina, et al. Efficient sparse coding algorithms. Adv Neural Inf Process Syst 2007; 19: 801–808.

19. Liu, Huang YX. Adaptive feature extraction using sparse coding for machinery fault diagnosis. Mech Syst Signal Pr 2011; 25: 558–574.

20. Rahmani, Akbarizadeh. Unsupervised feature learning based on sparse coding and spectral clustering for segmentation of synthetic aperture radar images. IET Comput Vis 2014; 9: 629–638.

21. Samadi, Akbarizadeh, Kaabi. Change detection in SAR image using deep belief network: a new training approach based on morphological images. IET Image Process 2019; 13: 2255–2264.

22. Zalpour, Akbarizadeh, Alaei-Sheini N. A new approach for oil tank detection using deep learning features with control false alarm rate in high-resolution satellite imagery. Int J Remote Sens 2020; 41: 2239–2262.

23. Moghaddam, Akbarizadeh, Kaabi. Automatic detection and segmentation of blood vessels and pulmonary nodules based on a line tracking method and generalized linear regression model. Signal Image Video P 2019; 13: 457–464.

24. Tirandaz, Akbarizadeh, Kaabi. PolSAR image segmentation based on feature extraction and data compression using Weighted Neighborhood Filter Bank and Hidden Markov random field-expectation maximization. Measurement 2020; 153: 107432.

25. Ahmadi, Akbarizadeh. Iris tissue recognition based on GLDM feature extraction and hybrid MLPNN-ICA classifier. Neural Comput Appl 2020; 32: 2267–2281.

26. Sharifzadeh, Akbarizadeh, Kavian YS. Ship classification in SAR images using a new hybrid CNN–MLP classifier. J Indian Soc Remote 2019; 47: 551–562.

27. Kouadri, Hajji, Harkat, et al. Hidden Markov model based principle component analysis for intelligent fault diagnosis of wind energy converter systems. Renew Energ 2020; 150: 598–606.

28. Xin, Hamzaoui, Antoni. Semi-automated diagnosis of bearing faults based on a hidden Markov model of the vibration signals. Measurement 2018; 127: 141–166.

29. Sadhu, Prakash, Narasimhan. A hybrid hidden Markov model towards fault detection of rotating components. J Vib Control 2017; 23: 3175–3195.

30. Wang HC. Fault diagnosis of rolling element bearing based on wavelet Kernel principle component analysis-coupled hidden Markov. J Vibroeng 2017; 19: 5992–6006.

31. Wang, Gong, et al. Blind source separation of rolling element bearing’ single channel compound fault based on shift invariant sparse coding. J Vibroeng 2017; 19: 1809–1822.

32. Elad, Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE T Image Process 2006; 15: 3736–3745.

33. Hong XB, Zhou JX, He YK. Damage detection of anchored region on the messenger cable based on matching pursuit algorithm. Mech Syst Signal Pr 2019; 130: 221–247.

34. Zheng, Chen ZY. The application of basis de-noising of gear fault diagnosis. J Vib Meas Diagn 2003; 23: 128–130.

35. Rabiner LR. A tutorial on hidden Markov models and selected application in speech recognition. Proc IEEE 1989; 77: 257–286.
