Fault diagnosis of motor bearing based on deep learning

Abstract

Effective fault diagnosis of motor bearings not only ensures the smooth and efficient operation of equipment but also allows running faults to be detected and eliminated in time to prevent major accidents. Based on deep learning, this article constructs a stacked auto-encoder network. A sparsity constraint is introduced to compress the input data and reduce its dimension, so that the network can accurately extract the fault characteristics of the input data, and random noise is introduced to improve the fault recognition ability of the network. The simulation results show that the stacked auto-encoder network overcomes the shortcomings of traditional fault diagnosis methods, which require fault samples to be distinguished manually and need a large amount of prior knowledge, and also realizes self-learning of fault signal features. The accuracy of fault identification reaches 98.5%, 94%, 96%, and 95.5% under four different working conditions. Moreover, the network exhibits strong robustness under different working conditions. Finally, new research ideas for fault diagnosis in thermal power plants are put forward by borrowing the idea of motor bearing fault diagnosis.

Keywords

Fault diagnosis, feature self-learning, deep learning, sparsity constraint, data reconstruction

Introduction

Motor bearings are an important component of electric motors and are widely used in industrial fields such as electric power production. If the equipment fails, it will affect system operation and may even cause serious economic losses and casualties. Therefore, the effective fault diagnosis of the motor bearing can not only ensure the efficient operation of the systems but also detect and eliminate operational faults in time, effectively preventing major accidents.¹

Because vibration signals are a highly accurate indicator of the health condition of mechanical equipment, they are widely used in fault diagnosis.² The traditional method used to detect motor bearing faults is generally divided into three steps. First, a sensor collects the vibration signal of the motor bearing; then time-domain and frequency-domain analysis methods are applied to the collected signal; and finally the result indicates whether the motor bearing is faulty.^3,4 The analysis methods include multinomial logistic regression (MLR), support vector machine (SVM), wavelet packet transform (WPT), and stability analysis.⁵ Wang et al.^6–10 used sliding-mode control for fault diagnosis based on nonlinear Markovian jump singular systems. However, the model represented by nonlinear Markovian jump singular systems is random, so it may fail to match the real fault characteristics. Several researchers have tried to address this problem. Sun and Yang¹¹ proposed a fault diagnosis system that combines a functional link neural network with the least squares method and SVM to improve the recognition rate of motor bearing faults. The method requires a relatively long calculation time and is therefore inefficient. Liang et al.^12,13 used a geometric approach to solve the problem of fault detection and isolation and verified the usefulness of the proposed technique. However, this method is only suitable for discrete systems, so it cannot be widely applied. Liu et al.¹⁴ combined the wavelet transform with empirical mode decomposition and extracted signal features from the envelope-demodulated signal. However, the wavelet transform is not suitable for data with ambient noise. To solve this problem, Pan et al.¹⁵ proposed a fault feature extraction method based on complex wavelet multi-scale decomposition.

However, because of the growing complexity of rotary machinery systems and the heterogeneity of sensory data, effectively diagnosing multiple health states from sensory data with strong ambient noise and fluctuating working conditions remains a major challenge when applying these methods to complex engineering systems, where information loss and external influences are possible. When the machinery system becomes complicated, it is very difficult to diagnose the health state of the devices from the vibration signal, because it is unrealistic to extract fault features and sort fault types from complicated sensory data with strong ambient noise.¹⁶

In recent years, deep learning has developed rapidly and has gradually become a hot research topic.¹⁷ Compared with shallow machine learning, deep learning attempts to build more complex nonlinear functions by simulating the way the human brain learns. One of its great advantages is the use of unsupervised training methods: it can learn and extract the characteristics of the data by itself, which greatly reduces the difficulty of identifying fault data.¹⁸ In addition, deep learning is believed to be able to discover useful high-order feature representations, as well as the relevance of initial signals, which motivates promising applications to diagnosis problems that involve classifying complex and mixed system health states both effectively and accurately.^19,20 To overcome the difficulty of extracting fault features and sorting fault types from complicated sensory data with strong ambient noise, we apply deep learning to the fault diagnosis of motor bearings. Despite this great potential and the pressing need to address these challenges, deep learning techniques are still rarely applied in current fault diagnosis research on electromechanical systems.

This article proposes a stacked auto-encoder (SAE) deep neural network and uses it for fault diagnosis. SAE can learn and extract features from the vibration signal of the bearings. It adopts an unsupervised training method, which can independently learn data features and effectively avoids the problem of manually classifying fault data.²¹ This article uses SAE to diagnose motor bearing faults in three steps: (1) constructing the SAE deep neural network, adding sparse constraints to improve the compression capability, and introducing random noise to the input information to reconstruct the original data; (2) inputting the original vibration signal, training the SAE network layer by layer, and extracting feature information by self-learning; and (3) testing the accuracy of fault identification on the test data and comparing the fault recognition rate with that of other algorithms, including auto-encoder (AE), deep belief network (DBN), and SVM. We also compare with existing reference results at the end of the article to show the advantage of the proposed method.

Fault diagnosis based on SAE

The description of AE

SAE is a deep neural network built by stacking AEs. An AE adjusts the network weights through training so that the output of the network approximates the input of the network. Its network structure is shown in Figure 1, where $\{x_{1}, x_{2}, \dots, x_{n}; x_{i} \in R^{n}\}$ can be treated as a set of unlabeled data and $\{x'_{1}, x'_{2}, \dots, x'_{n}; x'_{i} \in R^{n}\}$ represents the network output. The unit $b$ is called the bias unit.

Figure 1.

The network structure of auto-encoder.

The transfer process of raw data from the input layer to the hidden layer is called encoding, and the transfer process from the hidden layer to the output layer is called decoding, which can be described by equations (1) and (2)

d = S (W_{1} x + b_{1})

(1)

y = S (W_{2} d + b_{2})

(2)

where $S (\cdot)$ represents sigmoid function, $W_{1}$ represents the weight matrix between the input and hidden layers, $W_{2}$ represents the weight matrix between the hidden and output layers, and $b_{1}$ and $b_{2}$ represent the bias.

AE tries to learn a function $h_{w, b} (x) \approx x$ . In other words, AE is trained to learn an approximate function such that the network output is similar to the network input.
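The encoding and decoding steps can be sketched in a few lines of NumPy. This is only an illustration: the layer sizes, the random initialization, and the function names `encode` and `decode` are assumptions, and the decoder is applied to the hidden code d, following the description of decoding as the transfer from the hidden layer to the output layer.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid S(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def encode(x, W1, b1):
    """Encoding step: hidden code d = S(W1 x + b1)."""
    return sigmoid(W1 @ x + b1)

def decode(d, W2, b2):
    """Decoding step: reconstruction y = S(W2 d + b2)."""
    return sigmoid(W2 @ d + b2)

rng = np.random.default_rng(0)
n_in, n_hidden = 8, 3                      # illustrative sizes, not from the paper
W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_in, n_hidden))
b2 = np.zeros(n_in)

x = rng.random(n_in)                       # one unlabeled input vector in [0, 1)
d = encode(x, W1, b1)                      # compressed representation
y = decode(d, W2, b2)                      # reconstruction of x (untrained here)
print(d.shape, y.shape)                    # (3,) (8,)
```

Before training, y is not yet close to x; training adjusts the weights and biases so that the reconstruction error shrinks.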

The training of AE

Assume a training set $\{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}$ consisting of m training samples. The network can be trained using batch gradient descent. For a single training example $(x, y)$ , the cost function can be defined as

J (W, b; x, y) = \frac{1}{2} \| h_{w, b} (x) - y \|^{2}

(3)

For a training set of m samples, the overall cost function is

\begin{matrix} J (W, b) = [\frac{1}{m} \sum_{i = 1}^{m} (\frac{1}{2} \| h_{w, b} (x^{(i)}) - y^{(i)} \|^{2})] \\ + \frac{λ}{2} \sum_{l = 1}^{n_{l} - 1} \sum_{i = 1}^{s_{l}} \sum_{j = 1}^{s_{l + 1}} {(W_{ji}^{(l)})}^{2} \end{matrix}

(4)

where $λ$ represents the weight decay coefficient that controls the relative importance of the two terms in equation (4). $W_{ji}^{(l)}$ represents the synaptic weight between the ith neuron in layer l and the jth neuron in layer l + 1. $n_{l}$ represents the number of layers in the AE, so layer $n_{l}$ is the output layer of the network, and $s_{l}$ represents the total number of neurons in layer l. The first term in the definition of $J (W, b)$ is an average sum-of-squares error term. The second term is a weight decay term that decreases the magnitude of the weights and helps prevent overfitting.
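As a rough illustration of the overall cost, the sketch below evaluates the average squared-error term and the weight decay term for a small sigmoid network. The function name `overall_cost`, the row-per-sample data layout, and the toy sizes are assumptions made for this example only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def overall_cost(X, Y, weights, biases, lam):
    """Average squared-error term plus weight decay term.

    X, Y    : arrays of shape (m, n_in) and (m, n_out), one sample per row
    weights : list of weight matrices, one per layer, shape (n_next, n_prev)
    biases  : list of bias vectors, one per layer
    lam     : weight decay coefficient (lambda)
    """
    A = X
    for W, b in zip(weights, biases):
        A = sigmoid(A @ W.T + b)            # forward pass through each layer
    m = X.shape[0]
    data_term = np.sum(0.5 * np.sum((A - Y) ** 2, axis=1)) / m
    decay_term = 0.5 * lam * sum(np.sum(W ** 2) for W in weights)
    return data_term + decay_term

# Toy auto-encoder example: the target Y is the input X itself.
rng = np.random.default_rng(0)
X = rng.random((10, 6))
weights = [rng.normal(scale=0.1, size=(3, 6)), rng.normal(scale=0.1, size=(6, 3))]
biases = [np.zeros(3), np.zeros(6)]
print(overall_cost(X, X, weights, biases, lam=1e-3))
```

The bias vectors are deliberately left out of the decay term, matching the cost function above, which penalizes only the weights.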

The weight W and bias b are updated with gradient descent as follows

{W'}_{ij}^{(l)} = W_{ij}^{(l)} - α \frac{\partial}{\partial W_{ij}^{(l)}} J (W, b)

(5)

{b'}_{i}^{(l)} = b_{i}^{(l)} - α \frac{\partial}{\partial b_{i}^{(l)}} J (W, b)

(6)

where $α$ represents the learning rate. The partial derivatives in the equations above are derived as follows

\frac{\partial}{\partial W_{ij}^{(l)}} J (W, b) = [\frac{1}{m} \sum_{i = 1}^{m} \frac{\partial}{\partial W_{ij}^{(l)}} J (W, b, x^{(i)}, y^{(i)})] + λ W_{ij}^{(l)}

(7)

\frac{\partial}{\partial b_{i}^{(l)}} J (W, b) = \frac{1}{m} \sum_{i = 1}^{m} \frac{\partial}{\partial b_{i}^{(l)}} J (W, b, x^{(i)}, y^{(i)})

(8)

where

\frac{\partial}{\partial W_{ij}^{(l)}} J (W, b; x, y) = a_{j}^{(l)} δ_{i}^{(l + 1)}

(9)

\frac{\partial}{\partial b_{i}^{(l)}} J (W, b; x, y) = δ_{i}^{(l + 1)}

(10)

where $a_{j}^{(l)}$ represents the activation of unit j in layer l and $δ_{i}^{(l + 1)}$ represents the error term of unit i in layer $l + 1$ . For a hidden layer l, the error term is given by

δ_{i}^{(l)} = (\sum_{j = 1}^{s_{l + 1}} W_{ji}^{(l)} δ_{j}^{(l + 1)}) f' (z_{i}^{(l)})

(11)

where $z_{i}^{(l)}$ represents the weighted sum of the inputs to unit i in layer l, and $f' (\cdot)$ represents the derivative of the sigmoid function. The error term of the output layer $(n_{l})$ is given by

δ_{i}^{(n_{l})} = - (y_{i} - a_{i}^{(n_{l})}) f' (z_{i}^{(n_{l})})

(12)

f' (z_{i}^{(l)}) = a_{i}^{(l)} (1 - a_{i}^{(l)})

(13)

where $a_{i}^{(l)}$ represents the activation of unit i of layer l, $a_{i}^{(n_{l})}$ represents the activation of unit i in output layer, and $z_{i}^{(n_{l})}$ represents the input weighted sum of unit i in output layer.

Iterating the updates in equations (5) and (6), with the gradients given by equations (7) to (13), minimizes the overall cost function (equation (4)) and drives the output of the AE toward the input of the AE.
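The training procedure can be sketched as follows for a single AE with one hidden layer. This is a minimal illustration under assumed settings (toy data, layer sizes, learning rate, and the function name `cost_and_grads` are all hypothetical), not the configuration used in the experiments.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grads(X, W1, b1, W2, b2, lam):
    """Cost and gradients of a one-hidden-layer AE whose target is its input."""
    m = X.shape[0]
    A1 = sigmoid(X @ W1.T + b1)              # hidden activations, shape (m, h)
    A2 = sigmoid(A1 @ W2.T + b2)             # output activations, shape (m, n)
    cost = (np.sum(0.5 * (A2 - X) ** 2) / m
            + 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2)))
    # Output-layer error term, using the sigmoid derivative a(1 - a)
    D2 = -(X - A2) * A2 * (1.0 - A2)
    # Hidden-layer error term, back-propagated through W2
    D1 = (D2 @ W2) * A1 * (1.0 - A1)
    # Averaged gradients; only the weights receive the decay term
    gW2 = D2.T @ A1 / m + lam * W2
    gb2 = D2.mean(axis=0)
    gW1 = D1.T @ X / m + lam * W1
    gb1 = D1.mean(axis=0)
    return cost, (gW1, gb1, gW2, gb2)

rng = np.random.default_rng(0)
X = rng.random((20, 6))                       # toy data, not bearing signals
W1 = rng.normal(scale=0.1, size=(3, 6)); b1 = np.zeros(3)
W2 = rng.normal(scale=0.1, size=(6, 3)); b2 = np.zeros(6)
alpha, lam = 0.5, 1e-4                        # assumed learning rate and decay

history = []
for _ in range(200):                          # batch gradient descent
    cost, (gW1, gb1, gW2, gb2) = cost_and_grads(X, W1, b1, W2, b2, lam)
    history.append(cost)
    W1 -= alpha * gW1; b1 -= alpha * gb1
    W2 -= alpha * gW2; b2 -= alpha * gb2

print(history[0] > history[-1])               # the cost decreases over training
```

A finite-difference check of one gradient entry against `cost_and_grads` is a cheap way to confirm that the error terms are implemented consistently with the cost.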

Improvement of AE

To learn feature information from complex input signals, the original data must be effectively compressed. Usually, this is done by making the number of neurons in the hidden layer smaller than the number of nodes in the input layer, so that the high-dimensional raw data are compressed in the hidden layer. This makes it easier for the AE to learn the features of the original data.

Restricting the number of neurons can reduce the dimension of the data, but the network can then learn only a few features in the hidden layer. To preserve the diversity of the features in the hidden layer, a method called sparse constraint is introduced in this study to improve the AE. The main idea is not to reduce the number of neurons but to impose restrictions that limit the activity of the neurons, which still reduces the effective dimension of the input data. Accordingly, the original overall cost function (equation (4)) is modified by introducing an additional penalty term, given by

J_{sparse} (W, b) = J (W, b) + β \sum_{j = 1}^{s} KL (ρ ∥ {\hat{ρ}}_{j})

(14)

where

KL (ρ ∥ {\hat{ρ}}_{j}) = ρ \log \frac{ρ}{{\hat{ρ}}_{j}} + (1 - ρ) \log \frac{1 - ρ}{1 - {\hat{ρ}}_{j}}

(15)

where $\sum_{j = 1}^{s} KL (ρ ∥ {\hat{ρ}}_{j})$ represents the sparsity penalty term, $β$ controls the weight of the sparsity penalty term, ${\hat{ρ}}_{j}$ represents the average activation of unit j in hidden layer, $ρ$ represents a sparsity parameter, and s represents the number of units in one hidden layer.

The penalty term has the following property: if ${\hat{ρ}}_{j} = ρ$ , then $KL (ρ ∥ {\hat{ρ}}_{j}) = 0$ ; otherwise, the value increases monotonically as ${\hat{ρ}}_{j}$ diverges from $ρ$ . Therefore, when $ρ$ is set close to zero, minimizing the penalty keeps the average activations of the hidden units small.
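The behavior of the sparsity penalty can be checked numerically with a short sketch. The function names and the sample values of the average activation are illustrative assumptions.

```python
import numpy as np

def kl_divergence(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat)."""
    return (rho * np.log(rho / rho_hat)
            + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))

def sparse_cost(base_cost, hidden_activations, rho, beta):
    """Add the weighted sparsity penalty to an already computed cost J(W, b).

    hidden_activations : array of shape (m, s), sigmoid outputs of one hidden layer
    """
    rho_hat = hidden_activations.mean(axis=0)   # average activation per hidden unit
    return base_cost + beta * np.sum(kl_divergence(rho, rho_hat))

rho = 0.05
for rho_hat in (0.05, 0.10, 0.20, 0.50):
    print(rho_hat, kl_divergence(rho, rho_hat))  # zero at rho_hat == rho, then growing
```

Because the hidden activations come from a sigmoid, each average activation lies strictly between 0 and 1, so the logarithms are well defined.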

In addition, to improve the ability of the AE to identify multiple types of fault information, this article introduces random noise into the original data, so that the network can learn a more essential representation of the original data. The main idea is to set a small number of nodes in the input layer to zero with a small probability. However, the probability of introducing random noise should be chosen appropriately; otherwise, the noise may cause irreversible damage to the input data.
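One common way to realize this kind of corruption is masking noise, sketched below. Whether the original experiments zero each entry independently is not stated, so the per-entry sampling and the function name `add_masking_noise` are assumptions.

```python
import numpy as np

def add_masking_noise(X, p, rng):
    """Set each input entry to zero independently with probability p."""
    keep = rng.random(X.shape) >= p     # True where the entry is kept
    return X * keep

rng = np.random.default_rng(0)
X = np.ones((1000, 50))                 # dummy inputs, all ones
X_noisy = add_masking_noise(X, p=0.1, rng=rng)
print(np.mean(X_noisy == 0))            # fraction of zeroed entries, close to 0.1
```

During training the corrupted input is fed to the encoder, while the reconstruction error is still measured against the clean input, which is what pushes the network toward noise-robust features.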

By stacking the AEs layer by layer, we get an SAE, and the output of each layer in the network can be represented as the characteristics of the original data in different dimensions. We added the softmax classifier to the last layer of SAE, then aggregated the learned features. Finally, fault diagnosis of the motor states can be achieved. The SAE structure in this article is shown in Figure 2.

Figure 2.

The network structure of SAE.

Motor bearing’s fault experiment

The experimental data in this article are from the Case Western Reserve University Bearing Data Center.²² The motor bearing vibration signal is measured by the experimental system shown in Figure 3. The system consists of a 2 hp motor (left), a dynamometer (right), and associated control circuitry. The bearings are made by SKF. The experimenters use an electric spark machine to seed faults in the inner ring, the outer ring, and the ball of the bearing. They then place an accelerometer on the casing and acquire the data with a 16-channel recorder. The sampling frequencies are 12 and 48 kHz.

Figure 3.

The bearing test system.

We select the vibration signal of the bearing’s drive end (DE) to train the SAE, using the data sampled at 12 kHz under the working conditions listed in Table 3.

Table 1.

The comparison of test results under a single training sample.

Methods	Working condition 2 (%)	Working condition 3 (%)	Working condition 4 (%)
SAE	93	94.6	94.5
AE	80	79.5	81.6
DBN	82.3	85.5	87
SVM	70	72	73.1

SAE: stacked auto-encoder network; AE: auto-encoder; DBN: deep belief network; SVM: support vector machine.

Using the collected data, we plot the time-domain and frequency-domain curves of the bearing vibration signal, as shown in Figures 4 and 5.

Figure 4.

The bearing vibration signal in the time domain: (a) normal operation, (b) inner ring fault, (c) outer ring fault, and (d) ball fault.

Figure 5.

The bearing vibration signal in the frequency domain: (a) normal operation, (b) inner ring fault, (c) outer ring fault, and (d) ball fault.

It can be seen from Figure 4 that although the time-domain waveform can reveal the fault-related pulses in advance, in some cases (especially when a ball fault occurs) it suffers from particularly strong noise interference, which hinders fault diagnosis. It can also be seen from Figure 5 that the spectral characteristics of the various fault types cannot be accurately distinguished because of noise interference. In summary, for complex bearing vibration signals, a more effective method is needed to improve the accuracy of fault diagnosis.
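A frequency-domain curve of the kind plotted in Figure 5 can be obtained from a sampled signal with a real FFT. The sketch below uses a synthetic signal, since the actual recordings are not reproduced here; the 160 Hz tone and the noise level are hypothetical values chosen only to show the mechanics at the 12 kHz sampling rate.

```python
import numpy as np

fs = 12_000                                   # sampling frequency in Hz (12 kHz)
n = 12_000                                    # one second of samples
t = np.arange(n) / fs

rng = np.random.default_rng(0)
# Synthetic stand-in for a vibration signal: one tone buried in noise.
signal = np.sin(2 * np.pi * 160 * t) + 0.5 * rng.normal(size=n)

spectrum = np.abs(np.fft.rfft(signal)) / n    # one-sided amplitude spectrum
freqs = np.fft.rfftfreq(n, d=1 / fs)          # frequency axis in Hz

peak = freqs[np.argmax(spectrum[1:]) + 1]     # skip the DC bin
print(peak)                                   # dominant frequency, about 160 Hz
```

With real bearing data, broadband noise raises the whole spectrum, which is why the fault-related peaks become hard to separate by inspection.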

Simulation and analysis

Parameter determination

The choice of SAE model has an important effect on the accuracy of fault diagnosis. The unsupervised learning performance of SAE is affected by model parameters such as the number of nodes in the input and hidden layers, the sparsity parameter $ρ$ , and the number of training iterations.^23,24 We use the data under working condition 1 as the training data and carry out experiments to determine the optimal parameters of the SAE. The evaluation criterion is the reconstruction error of the SAE’s first layer, as defined in equation (3). The experimental results are shown in Figure 6.

Figure 6.

Reconstruction error curves for different SAE model parameters.

In principle, the more nodes in the input layer, the more information the model can learn. However, considering the computational complexity, the number of input layer nodes cannot be increased arbitrarily, and an appropriate value should be selected. As can be seen from Figure 6(a), when the number of input layer nodes is increased from 100 to 500, the reconstruction error of SAE gradually decreases to its lowest value. Increasing the number of nodes further does not change the reconstruction error much. This result shows that SAE has the best learning effect when the number of input layer nodes is 500.

The number of nodes in the hidden layer determines the degree to which the model compresses the input data: the fewer the hidden nodes, the higher the degree of compression. Considering the computational complexity, and based on experience, we build the SAE from three layers of AE. The numbers of hidden layer nodes in the last two layers are set to 50 and 25, respectively. We adjust the number of nodes in the first hidden layer and observe the change in the reconstruction error. It can be seen from Figure 6(b) that when the number of hidden layer nodes is less than 20% of the number of input layer nodes, the reconstruction error is within a reasonable range. This result shows that when the number of hidden layer nodes is small, the original data can be better compressed, which helps reduce the complexity of SAE feature learning.

The sparse constraint is introduced to improve the capability of the SAE model. It can be seen from Figure 6(c) that when the value of $ρ$ is between 0.05 and 0.25, the reconstruction error of the network continues to decrease, showing that the inhibitory effect on the neurons is appropriate. As $ρ$ increases further, the inhibitory effect on the neurons becomes excessive, and the reconstruction error increases rapidly.

As can be seen from Figure 6(d), when the noise probability is in the range of 0 to 0.1, the reconstruction error decreases as the noise probability increases. However, as the noise probability increases further, the reconstruction error becomes larger. This result indicates that a reasonable noise probability helps enhance the diversity of what the SAE learns, whereas too much noise destroys the characteristics of the original data and degrades learning. Therefore, it is appropriate to set the noise probability to 0.1. Based on the above experimental analysis, the key parameters of the SAE are shown in Table 2.

Table 2.

The SAE network parameters.

Structure parameters	Input neurons	Hidden layer 1	Hidden layer 2	Hidden layer 3	Output layer	Transfer function
	500	100	50	25	4	Sigmoid
Learning parameters	Number of training iterations	Batch size	$ρ$	Noise probability	$β$	Sparse punishment factor
	150	100	0.25	0.1	0.003	0.1

SAE: stacked auto-encoder network.
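The structure parameters in Table 2 translate into a network with layer sizes 500, 100, 50, and 25 followed by a four-class softmax output. The sketch below only wires up a forward pass with randomly initialized, untrained weights to make the shapes concrete; the initialization scale and the helper names `init_params` and `predict_proba` are assumptions.

```python
import numpy as np

LAYER_SIZES = [500, 100, 50, 25]    # input layer and three hidden layers (Table 2)
N_CLASSES = 4                       # output layer size (Table 2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)      # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def init_params(sizes, n_classes, rng):
    """Random (untrained) encoder weights plus a softmax layer."""
    encoders = [(rng.normal(scale=0.01, size=(n_out, n_in)), np.zeros(n_out))
                for n_in, n_out in zip(sizes[:-1], sizes[1:])]
    classifier = (rng.normal(scale=0.01, size=(n_classes, sizes[-1])),
                  np.zeros(n_classes))
    return encoders, classifier

def predict_proba(X, encoders, classifier):
    """Pass a batch through the stacked encoders, then the softmax classifier."""
    A = X
    for W, b in encoders:
        A = sigmoid(A @ W.T + b)
    Wc, bc = classifier
    return softmax(A @ Wc.T + bc)

rng = np.random.default_rng(0)
encoders, classifier = init_params(LAYER_SIZES, N_CLASSES, rng)
batch = rng.random((100, LAYER_SIZES[0]))     # one batch of 100 samples (Table 2)
proba = predict_proba(batch, encoders, classifier)
print(proba.shape)                            # (100, 4)
```

In the full method, each encoder would first be pre-trained as a sparse, noise-corrupted AE before the softmax layer is trained on the learned features.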

Simulation analysis of fault diagnosis based on SAE

The process of data propagation in SAE can be seen as the process of reconstructing data features. When the data pass through a hidden layer, they are compressed. This is equivalent to reducing the dimension of the data and reducing the complexity of the SAE. Using the data under working condition 1, we apply SAE and principal component analysis (PCA) separately to learn features from the original data. Figure 7 shows the results.

Figure 7.

Scatterplots of the features extracted by different methods: (a) features learned by PCA and (b) features learned by SAE.

It can be seen from Figure 7 that the features learned by SAE clearly separate the various types of fault data, whereas the features learned by PCA overlap and cannot be distinguished. The experimental results show that the ability of deep learning to extract effective information from complex raw data is superior to that of traditional feature extraction methods.
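For reference, the PCA baseline amounts to a linear projection onto the directions of largest variance. A minimal sketch using the singular value decomposition is given below; the synthetic data and the choice of two components (for a scatterplot) are assumptions for illustration.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the first principal components."""
    Xc = X - X.mean(axis=0)                       # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T               # scores, shape (m, n_components)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))                   # synthetic stand-in for 500-point samples
Z = pca_project(X, n_components=2)
print(Z.shape)                                    # (200, 2)
```

Because the projection is linear, PCA cannot untangle classes whose differences are nonlinear, which is consistent with the overlap seen in Figure 7(a).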

To further verify the effectiveness of SAE in motor fault diagnosis, we feed the features obtained by SAE into the softmax classifier for training and finally obtain the fault type of the motor bearing as the output. We carry out simulation analysis on the four working conditions listed in Table 3. For each working condition, 1000 sets of data are selected as the training set and 200 sets of data as the test set. The fault diagnosis results of SAE are shown in Figures 8 and 9, where 1 indicates normal operation, 2 indicates inner ring fault, 3 indicates outer ring fault, and 4 indicates ball fault.

Table 3.

DE bearing working conditions.

Working conditions	Load (hp)	Spinning speed (r/min)	Fault diameter (inch)
Working condition 1	0	1797	0.007
Working condition 2	1	1772	0.014
Working condition 3	2	1750	0.021
Working condition 4	3	1730	0.021

Figure 8.

SAE fault diagnosis results for working conditions 1 and 2.

Figure 9.

SAE fault diagnosis results for working conditions 3 and 4.

It can be seen from Figure 8 that, of the 200 test samples under working condition 1, only three ball fault samples are misdiagnosed as inner ring faults, giving a fault diagnosis accuracy of 98.5%. The accuracy under the other three working conditions reaches 94%, 96%, and 95.5%, respectively. The experimental results show that the constructed SAE has a high recognition ability for motor bearing faults. We also compare the proposed method with existing approaches, namely AE, DBN, and SVM, by applying them to fault identification under the same four working conditions. The experimental results are shown in Figure 10. As can be seen from the figure, SVM has the worst fault recognition capability when faced with complex systems. Because AE has only a single-layer structure, its ability to extract valid information from raw data is insufficient, and its recognition ability is limited. Although DBN has a deep neural network structure, its fault recognition capability still needs improvement because it lacks a mechanism for handling random noise in the data. SAE overcomes these problems, so its fault diagnosis performance is significantly better than that of the other methods.
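The 98.5% figure for working condition 1 follows directly from the counts: 3 errors in 200 test samples. The sketch below reproduces it from a confusion matrix; the even split of 50 test samples per class is an assumption, since the per-class counts are not stated.

```python
import numpy as np

# Rows: true class, columns: predicted class.
# Class order: 1 normal, 2 inner ring fault, 3 outer ring fault, 4 ball fault.
# Assumed 50 test samples per class; 3 ball faults predicted as inner ring faults.
confusion = np.array([
    [50,  0,  0,  0],
    [ 0, 50,  0,  0],
    [ 0,  0, 50,  0],
    [ 0,  3,  0, 47],
])

accuracy = np.trace(confusion) / confusion.sum()   # correct predictions / all samples
print(f"{accuracy:.1%}")                           # 98.5%
```

The overall accuracy does not depend on the assumed class split, only on the totals of 197 correct predictions out of 200.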

Figure 10.

The comparison of fault diagnosis results of various methods under different operating conditions.

In actual industrial processes, the load on the motor may change frequently. To verify the robustness of the SAE for motor bearing fault diagnosis, we use only the 1000 sets of data under working condition 1 as the training samples and use the data under the other working conditions as the test samples. The simulation results are shown in Table 1.

The experimental results in Table 1 show that fluctuations in the working conditions lead to a decrease in the accuracy of fault identification. Under working condition 4, the recognition accuracy of SVM decreases the most, by 6.2%, whereas the SAE’s fault recognition rate remains within an acceptable range. These results show that the recognition ability of SAE is strongly robust.
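For SAE, the size of this decrease can be worked out from numbers already reported: the same-condition accuracies quoted in the text and the SAE row of the table of results under a single training sample. The short sketch below does the subtraction; the dictionary names are illustrative.

```python
# SAE accuracy (%) when trained and tested on the same working condition (from the text)
same_condition = {2: 94.0, 3: 96.0, 4: 95.5}
# SAE accuracy (%) when trained on working condition 1 only (SAE row of the comparison table)
single_training = {2: 93.0, 3: 94.6, 4: 94.5}

drops = {wc: round(same_condition[wc] - single_training[wc], 1)
         for wc in same_condition}
print(drops)                      # {2: 1.0, 3: 1.4, 4: 1.0}
print(max(drops.values()))        # 1.4
```

The largest decrease for SAE is 1.4 percentage points, which is small compared with the 6.2% decrease reported for SVM.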

Application prospect of SAE in thermal power plant

Thermal power generation plays an important role in economic development, so monitoring the mechanical equipment in thermal power plants for faults is also very important. The research object of this article is a motor, for which fault data can be generated by artificially seeding faults; however, a thermal power plant is a complex, integrated system, so it is impossible to build an actual physical model to generate fault data.

Zeng et al.²⁵ established an accurate coal mill model based on the mass balance and energy balance theorems. Borrowing the idea of seeding faults to obtain fault data, we can use model simulation to work around the fact that a physical coal mill fault model cannot be built. We can then use the SAE proposed in this article to realize the fault diagnosis of the coal mill.

Looking further ahead, the method of this article can also be applied to the diagnosis of other important mechanical equipment in industry. First, we establish an accurate mathematical model through mechanism analysis, then generate fault data through simulation, and finally use SAE to realize fault diagnosis of the equipment.

Conclusion

SAE is a deep neural network composed of stacked AEs. It is trained with an unsupervised learning method and learns the characteristics of the input data autonomously by reconstructing the original information, which overcomes the problem that traditional fault diagnosis methods need the fault data to be classified manually. However, fault diagnosis based on SAE also has its own shortcomings, such as relying on experiments to determine the parameters of the network. Therefore, more in-depth theoretical research on the selection of network parameters is needed in the future.

Footnotes

Handling Editor: James Baldwin

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Yifan Jian

References

1. Shi, Xiong, Chen, et al. Divisional fault diagnosis of power grids based on RBF neural network and fuzzy integral fusion. P Chin Soc Electr Eng 2014; 34: 562–569.

2. Chen, Shen, Yan. Enhanced least squares support vector machine-based transfer learning strategy for bearing fault diagnosis. Chin J Sci Instrum 2017; 38: 33–39.

3. Yan, Zhang, et al. Stability analysis for delayed neural networks via improved auxiliary polynomial-based functions. IEEE T Neur Net Lear 2019; 30: 2562–2568.

4. Bai, Huang, et al. Improved stability analysis for delayed neural networks. IEEE T Neur Net Lear 2018; 29: 4535–4541.

5. Yan, Zhan, et al. Improved inequality-based functions approach for stability analysis of time delay system. Automatica 2019; 108: 108416.

6. Wang, Karimi, Shen, et al. Fuzzy-model-based sliding mode control of nonlinear descriptor systems. IEEE T Cybernetics 2019; 49: 3409–3419.

7. Wang, Karimi, Lam, et al. An improved result on exponential stabilization of sampled-data fuzzy systems. IEEE T Fuzzy Syst 2018; 26: 3875–3883.

8. Wang, Yang, Yan. Reliable fuzzy tracking control of near-space hypersonic vehicle using aperiodic measurement information. IEEE T Ind Electron 2019; 66: 9439–9447.

9. Wang, Xia, Shen, et al. SMC design for robust stabilization of nonlinear Markovian jump singular systems. IEEE T Automat Contr 2018; 63: 219–224.

10. Wang, Shen, Karimi, et al. Dissipativity-based fuzzy integral sliding mode control of continuous-time T-S fuzzy systems. IEEE T Fuzzy Syst 2018; 26: 1164–1176.

11. Sun, Yang SY. Application of functional link artificial neural networks constructed with least squares support vector machine in fault diagnosis of rolling bearings. P Chin Soc Electr Eng 2010; 30: 82–87.

12. Liang, Zhang, Huang, et al. Prescribed performance cooperative control for multiagent systems with input quantization. IEEE T Cybernetics. Epub ahead of print 31 January 2019. DOI: 10.1109/TCYB.2019.2893645

13. Liang, Zhang, Ahn CK. Event-triggered fault detection and isolation of discrete-time systems based on geometric technique. IEEE T Circuits-II. Epub ahead of print 27 March 2019. DOI: 10.1109/TCSII.2019.2907706

14. Liu. Order bispectrum analysis based on fault characteristic frequency and its application to the fault diagnosis of rolling bearings. P Chin Soc Electr Eng 2013; 33: 123–130.

15. Pan, Liang, et al. Application of the complex wavelet analysis in fault feature extraction of blower rolling bearing. P Chin Soc Electr Eng 2015; 35: 4147–4152.

16. Shen, Wang, Kong, et al. Fault diagnosis of rotating machinery based on the statistical parameters of wavelet packet paving and a generic support vector regressive classifier. Measurement 2013; 46: 1551–1564.

17. Liu, Luo. Research and development on deep learning. Appl Res Comput 2014; 31: 1921–1930.

18. Arel, Rose, Karnowski TP. Deep machine learning-a new frontier in artificial intelligence research. IEEE Comput Intell M 2010; 5: 13–18.

19. Zhang, Zhu, Sun, et al. Cross-media retrieval with collective deep semantic learning. Multimed Tools Appl 2018; 77: 22247–22266.

20. Niu, Liu, Zhou, et al. Multiple Lyapunov functions for adaptive neural tracking control of switched nonlinear nonlower-triangular systems. IEEE T Cybernetics. Epub ahead of print 3 April 2019. DOI: 10.1109/TCYB.2019.2906372

21. et al. Research and prospect of deep auto-encoders. Comp Modern 2014; 8: 128–134.

22. Loparo KA. Case Western Reserve University bearing data center [EB/OL]. Cleveland, OH: Case Western Reserve University, 2012, http://csegroups.case.edu/bearingdatacenter/pages/welcome-case-western-reserve-university-bearing-data-center-website

23. Wang, Jing, Yang. Study of isolated speech recognition based on deep learning neural networks. Appl Res Comput 2015; 32: 2289–2298.

24. Pan, Wang. Survey on collaborative filtering recommendation algorithm based on extreme learning machine stacked denoising autoencodes. Appl Res Comput 2016; 33: 2332–2335.

25. Zeng, Gao. Modeling and simulation of MPS medium speed coal mills. J Chin Soc Power Eng 2015; 35: 55–61.