Abstract
Gearbox diagnosis under stationary operating conditions has been extensively investigated; however, variable operating conditions such as load and speed changes strongly affect the accuracy of gearbox diagnosis. This article presents an integrative approach of intrinsic time-scale decomposition and hierarchical temporal memory for gearbox diagnosis under variable operating conditions. The approach comprises two modules: a feature extraction method and an integrative feature fusion and classification model. The intrinsic time-scale decomposition method is investigated to extract gearbox features that are insensitive to variable operating conditions, and it outperforms the commonly used empirical mode decomposition in terms of decomposition result and computational efficiency. Hierarchical temporal memory integrates feature fusion and pattern classification in one model to autonomously diagnose gearbox defects. A performance comparison among the presented method, back-propagation neural network, support vector machine, and fuzzy c-means clustering using experimental data demonstrates the effectiveness of the presented method.
Keywords
Introduction
As an essential component in virtually all industrial processes, the gearbox is widely used to convert speed and torque to maintain the normal operation of machinery. 1 A gearbox defect may cause failure of the whole system, leading to significant economic losses, costly downtime, and even catastrophic damage. Thus, online monitoring and fault diagnosis of gearboxes are of great importance to achieve a high degree of availability, reliability, and operation safety.
Effective signal processing algorithms play important roles in gearbox defect diagnosis and have become an active research field.2–4 Typical gear faults including pitting, chipping, and cracks may cause amplitude or phase modulation of vibration signals; thus, advanced signal processing techniques have been investigated for gearbox diagnosis. Vibration analysis techniques including statistical metrics, spectral kurtosis, and envelope analysis are applied to diagnose the presence of naturally developed faults within gearboxes in Elasha et al. 5 A hybrid technique integrating Hilbert transform and wavelet packet transform is presented to improve time–frequency resolution for incipient gear fault detection in Fan and Zuo. 6 A sparsity-enabled signal decomposition method based on tunable Q-factor wavelet transform is also investigated for fault feature extraction of gearboxes. 7 To eliminate the selection of scale and base wavelet in wavelet transform, empirical mode decomposition (EMD), as an adaptive signal processing technique, is performed with statistical parameter analysis of vibration and acoustic signals to detect local faults of helical gears. 8 Another study on fault diagnosis of wind turbine gearboxes addressed the nonlinear and nonstationary problem by combining EMD and wavelet transform. 9 A self-adaptive noise cancellation method was presented in Tian and Qian 10 to eliminate white noise, which effectively improved the accuracy of fault diagnosis. A least mean square–based adaptive filtering scheme is investigated to diagnose tooth breakage with different severities. 11 These techniques are effective in fault feature extraction. However, the relationships between fault features and gearbox failure modes are not explicit, which causes difficulty in identifying gearbox failure modes. 12
To address the above issues, artificial intelligence techniques have been investigated to classify gearbox defects by taking the extracted fault features as inputs. In Hajnayeb et al., 13 a novel system based on multilayer perceptron artificial neural networks (ANNs) was designed to classify four different conditions of a gearbox using its vibration signals. In another study, 14 Daubechies wavelets (db1–db15) were introduced for feature extraction of vibration signals produced by a bevel gearbox in various conditions and faults, and the J48 algorithm was used for feature selection and classification of the gearbox conditions. Statistical features, frequency-domain features, and instantaneous energies based on EMD are applied for the diagnosis of gear cracks of different levels. 15 Time history, spectrum analysis, and fractal dimension are used for classifying the tooth crack and spalling failure of a gear system in Ma and Chen. 16 In Liang et al., 17 the fault symptoms of gear tooth crack are identified and located through analyzing the effect of multiple vibration sources. An integrative approach of ensemble EMD (EEMD) and principal component analysis (PCA) is also reported in Yang and Wu 18 for gearbox defect diagnosis. The above techniques are suitable for gearbox feature extraction and fault diagnosis under stationary operating conditions. However, load and speed changes of gearboxes as well as the impact of external factors exist in practice, 19 so gearboxes usually operate under variable conditions in application scenarios such as wind turbines. Gearbox diagnosis under varying operating conditions is also of particular interest because it may highlight a series of transient defect features which enhance gearbox diagnosis.
Much effort has been devoted to fault feature extraction of gearboxes under variable operating conditions. Dynamic and three-dimensional finite element analytical models of cracked gears are established for gear fault diagnosis under different load conditions in Meltzer and Dien. 20 Given that a faulty gearbox is more susceptible to load than a healthy gearbox, a regression equation describing the slope of vibration meshing component amplitude with respect to instantaneous input speed is selected as the gearbox fault feature under varying load conditions. 21 In Cheng et al., 12 a three-dimensional finite element model was built to reveal stress intensity factors for a surface crack on the spur gear tooth. An envelope order spectrum is developed to illustrate the amplitude-modulated and frequency-modulated features of vibration signals under varying operating speed conditions.22,23 In Heyns et al., 24 an autoregressive exogenous (ARX)–based model is investigated to obtain a residual vibration signal representing the fault feature of a gearbox under fluctuating operating conditions. Different techniques have thus been investigated for gearbox fault feature extraction under variable operating conditions. However, the extracted fault features still need to be identified by visual interpretation for gearbox diagnosis.
To advance gearbox defect diagnosis under variable operating conditions, this article presents an integrative approach of intrinsic time-scale decomposition (ITD) and hierarchical temporal memory (HTM) for autonomous gearbox defect diagnosis. ITD is a recently developed adaptive time–frequency analysis technique which is suitable for the multi-component amplitude-modulated and frequency-modulated signals found in gearboxes. The features insensitive to varying operating conditions are extracted from the ITD decomposition results as inputs for HTM. HTM is a dynamic pattern classifier modeling human brain activity, which takes the features extracted by the ITD method as inputs. It adopts a hierarchical structure to identify the most representative patterns from the features, thereby eliminating the separate feature fusion step of conventional pattern classifiers (e.g. support vector machine (SVM) and ANN). The integrative approach of ITD and HTM is experimentally shown to yield higher classification accuracy in gearbox defect diagnosis under variable operating conditions.
The rest of this article is organized as follows: The theoretical background of ITD and HTM is first discussed in section “Theoretical background.” The theoretical framework of the integrative approach of ITD and HTM is then presented in section “Integrative approach of ITD and HTM.” The feature extraction strategy as well as performance comparison between ITD and commonly used EMD is also discussed. Next, the effectiveness of the developed method is demonstrated in the experimental studies on a gearbox testbed. Finally, the conclusions are drawn.
Theoretical background
ITD
As an adaptive time–frequency analysis method, ITD was first proposed in Frei and Osorio. 25 It has been widely used in biomedical signal processing and bearing defect diagnosis. 26 Compared with other adaptive time–frequency analysis methods such as EMD and local mean decomposition (LMD), ITD shows clear advantages in computational efficiency and frequency resolution for complex and nonstationary signals. First, ITD was originally proposed for nonstationary, time-varying signals. The instantaneous frequency and amplitude of the proper rotation components (PRCs) can be preserved accurately, and this instantaneous information is essential under variable working conditions. Second, ITD can effectively control the end effect, confining the distortion to the edges near the first and last extreme points. Third, ITD dispenses with the time-consuming interpolation and sifting processes and is therefore more efficient than EMD. ITD is thus well suited to dynamic analysis and to processing large quantities of raw data, which makes it promising for gearbox signal processing.
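Instead of the iterative envelope extraction in the EMD method, the ITD method adopts a linear transformation to adaptively decompose the signal into a series of PRCs independent of each other. For a signal $X_t$, ITD utilizes a baseline-extracting operator $\xi$ for signal decomposition. The first decomposition of signal $X_t$ is shown below

$$X_t = \xi X_t + (1 - \xi) X_t = L_t + H_t \qquad (1)$$

where $L_t = \xi X_t$ is the decomposed baseline signal and $H_t = (1 - \xi) X_t$ is the PRC. After extracting the baseline from signal $X_t$, the residual of the signal becomes an inherent rotation component.

Define $\{\tau_k, k = 1, 2, \ldots\}$ as the times of the local extrema of $X_t$, with $\tau_0$ set as 0, and let $X_k$ and $L_k$ denote $X(\tau_k)$ and $L(\tau_k)$, respectively. Denote

$$\xi X_t = L_t = L_k + \frac{L_{k+1} - L_k}{X_{k+1} - X_k}\,(X_t - X_k), \quad t \in (\tau_k, \tau_{k+1}] \qquad (2)$$

where $L_{k+1}$ is supposed as

$$L_{k+1} = \alpha \left[ X_k + \frac{\tau_{k+1} - \tau_k}{\tau_{k+2} - \tau_k}\,(X_{k+2} - X_k) \right] + (1 - \alpha)\,X_{k+1} \qquad (3)$$

where $\alpha \in (0, 1)$ and is usually set as 0.5.

According to equation (1), a baseline signal and a PRC are obtained in the decomposition process. The baseline signal obtained in the first decomposition is then treated as a new signal and decomposed again, and the procedure is repeated so that

$$X_t = H X_t + \xi X_t = H X_t + (H + \xi)\,\xi X_t = \left( H \sum_{k=0}^{p-1} \xi^k + \xi^p \right) X_t \qquad (4)$$

where $H = 1 - \xi$ is the PRC-extracting operator, $H \xi^k X_t$ is the $(k + 1)$th layer of the PRC, and $\xi^p X_t$ is either the monotonic trend or the lowest frequency baseline.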
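The decomposition described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `itd_step` and `itd` are hypothetical, the two end points of the signal are simply treated as extrema with their baseline values set to the signal values (an assumed boundary rule), and the loop stops after a fixed number of PRCs or when the baseline has no interior extrema left.

```python
import numpy as np

def itd_step(x, alpha=0.5):
    """One ITD pass: split x into a baseline L_t and a proper rotation component H_t."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Interior local extrema: sign change of the first difference.
    d = np.diff(x)
    interior = np.where(d[:-1] * d[1:] < 0)[0] + 1
    # Assumption: both end points are treated as extrema to close the boundary.
    tau = np.concatenate(([0], interior, [n - 1]))
    xk = x[tau]
    # Baseline knots L_k; end knots keep the signal value (assumed boundary rule).
    lk = xk.copy()
    for k in range(1, len(tau) - 1):
        frac = (tau[k] - tau[k - 1]) / (tau[k + 1] - tau[k - 1])
        lk[k] = alpha * (xk[k - 1] + frac * (xk[k + 1] - xk[k - 1])) + (1 - alpha) * xk[k]
    # Between consecutive extrema the baseline is a linear transform of the signal.
    baseline = np.empty(n)
    for k in range(len(tau) - 1):
        seg = slice(tau[k], tau[k + 1] + 1)
        dx = xk[k + 1] - xk[k]
        if dx == 0:
            baseline[seg] = lk[k]
        else:
            baseline[seg] = lk[k] + (lk[k + 1] - lk[k]) / dx * (x[seg] - xk[k])
    return baseline, x - baseline

def itd(x, max_prcs=6):
    """Repeat the baseline extraction on successive baselines; return (PRCs, residual)."""
    prcs, residual = [], np.asarray(x, dtype=float)
    for _ in range(max_prcs):
        baseline, h = itd_step(residual)
        if np.allclose(h, 0):  # no interior extrema left: the baseline is the trend
            break
        prcs.append(h)
        residual = baseline
    return prcs, residual
```

By construction the PRCs and the final residual add back up to the input signal, which mirrors the operator expansion of the decomposition.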
The instantaneous amplitudes and instantaneous frequencies of the PRCs are analyzed in the frequency domain. Through analyzing the spectrum, the amplitude modulation and frequency modulation of the signal can be derived, respectively. 25
Hierarchical temporal memory
HTM is a recently developed machine learning technology that aims to capture the structural and algorithmic properties of the neocortex in human brain. 27 It has been applied to classify human body acceleration patterns, 28 vision-based hand shape, 29 remote gaze gesture, 30 and sign language. 31 In comparison with traditional ANNs, HTM not only has better self-adaptability, higher learning efficiency, and lower requirements for the number of samples but also can recognize complicated patterns with strong noise.
Most machine learning techniques are relatively static. The model accuracy highly depends on the quality and quantity of training data. HTM is an online learning system 32 which continuously updates the model with new data arriving. HTM is a memory-based system by modeling the neurons as arranged in columns, regions, layers, and a hierarchy structure. Figure 1 shows a simplified HTM diagram arranged in a two-level hierarchy. The inputs to the bottom layer are time varying data, and the recognition results are obtained from the top layer. Each layer is decomposed into different regions, and each region consists of a sheet of highly interconnected cells arranged in columns. Figure 2 describes a small section of an HTM region with four cells per column organized in a two-dimensional array of columns. Each column connects to a subset of the input, and each cell connects to other cells in the region. 33 It simulates the information representation in human brain named sparse distributed representation, which represents a small portion of active neurons within a large population of neurons. As shown in Figure 2, the HTM region creates a sparse distributed representation after receiving an input from its previous level, and the dark neurons represent active cells.

Figure 1. HTM network arranged in a two-level hierarchy.

Figure 2. HTM region with a sparse distributed representation. 31
There are three basic functions in HTM including learning, recognition, and prediction. An HTM region performs learning tasks by finding the sequences of patterns in sensory data. It searches the combinations of input bits that occur often, named spatial patterns. Then, it studies how these spatial patterns appear in sequence over time, which is stored as temporal patterns. After finishing learning tasks, an HTM region can perform pattern recognition on new inputs. When receiving a new input, the region matches it with previously learned spatial and temporal patterns using sparse distributed representations. The great majority of memories in HTMs are used to store the sequences of patterns as well as transitions between spatial patterns. An HTM region can predict what inputs it will likely receive next by means of matching stored sequences with current input.
Based on the representative patterns in the spatial pooler of an HTM, the input numeric feature values are transformed into a sparse bit matrix, which effectively suppresses the interference of noise. In addition, the bits of the bit matrix are comparatively independent of each other, which makes them well suited to a Bayesian classifier. The representative patterns described by the bit matrices are fed into the Bayesian classifier as follows:34,35
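1. A prior probability $P(C_i)$ for each working state is calculated, which is approximated by the ratio of the number of samples $N_i$ of each state to the total number of samples $N$ as

$$P(C_i) = \frac{N_i}{N}$$

where $i$ is the index of state $C_i$.
2. For a test sample $X$, HTM calculates the posterior probability of the state to which $X$ belongs

$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$

where $P(X \mid C_i)$ is the likelihood of $X$ under state $C_i$ and $P(X)$ is the probability of observing $X$, which is the same for all states.
3. The state $C_i$ with the maximum posterior probability is selected as the classification result

$$C(X) = \arg\max_{C_i} P(C_i \mid X)$$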
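The three steps above can be sketched in Python for binary patterns. This is only an illustration under stated assumptions: the bits are treated as conditionally independent given the state (a naive Bayes factorization of the likelihood), Laplace smoothing is added to avoid taking the logarithm of zero, and the names `fit_bayes` and `predict_bayes` are hypothetical.

```python
import numpy as np

def fit_bayes(bits, labels):
    """bits: (n_samples, n_bits) array of 0/1; labels: (n_samples,) state ids."""
    bits, labels = np.asarray(bits), np.asarray(labels)
    classes = np.unique(labels)
    # Step 1: prior P(C_i) = N_i / N.
    priors = np.array([(labels == c).mean() for c in classes])
    # Per-bit P(x_j = 1 | C_i) with Laplace smoothing (an assumption of this sketch).
    theta = np.array([(bits[labels == c].sum(axis=0) + 1) / ((labels == c).sum() + 2)
                      for c in classes])
    return classes, priors, theta

def predict_bayes(model, x):
    classes, priors, theta = model
    x = np.asarray(x)
    # Step 2: log posterior up to the constant log P(X), which is shared by all states.
    log_like = (x * np.log(theta) + (1 - x) * np.log(1 - theta)).sum(axis=1)
    log_post = np.log(priors) + log_like
    # Step 3: pick the state with the maximum posterior.
    return classes[np.argmax(log_post)]
```

For example, after `model = fit_bayes(train_bits, train_labels)`, calling `predict_bayes(model, test_bits[0])` returns the most probable state for one test pattern.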
Integrative approach of ITD and HTM
The essentials of gearbox diagnosis are to extract representative features insensitive to varying operating conditions and improve the pattern classification accuracy. To enhance gearbox diagnosis under variable operating conditions, this article presents an integrative approach of ITD and HTM, and the details are discussed below.
Formulation of the integrative approach
The framework of the integrative approach of ITD and HTM is shown in Figure 3. The vibration signal measured from a gearbox is first processed by ITD to extract fault feature information which is insensitive to varying operating conditions. The extracted features are then converted into the input sequence of HTM.

Figure 3. Procedure diagram of fault diagnosis based on HTM.
The training process of HTM in the spatial pooler can be approximately divided into three stages as follows:
1. Coverage: obtain the bits of each region according to the current input sequence;
2. Inhibition: determine the active regions, whose bits are set to 1 while the others are set to 0;
3. Learning: update the corresponding coefficients of the HTM model.
With the utilization of the spatial pooler, HTM network obtains corresponding representative patterns from the input sequence, and the representative patterns are fed into Bayesian classifier to autonomously identify gearbox defects. The details of ITD and HTM are discussed below.
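A heavily simplified sketch of the three training stages is given below for illustration. It is not the HTM implementation used in this study: the sizes, the connection threshold, the learning step, and the global top-k inhibition are all assumptions, and features of a full spatial pooler such as boosting or local inhibition are omitted. The function name `spatial_pool` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_columns, n_active = 240, 64, 4            # illustrative sizes (assumed)
perm = rng.uniform(0.0, 1.0, (n_columns, n_inputs))   # synapse strengths per column
threshold, step = 0.5, 0.05                           # assumed threshold and learning step

def spatial_pool(x, learn=True):
    """Return the sorted indices of the active columns for one binary input vector x."""
    x = np.asarray(x)
    connected = perm >= threshold
    # Stage 1: count, for each column, the connected synapses that see an active input bit.
    overlap = (connected & (x > 0)).sum(axis=1)
    # Stage 2: inhibition keeps only the n_active best-matching columns.
    active = np.argsort(overlap)[-n_active:]
    # Stage 3: strengthen synapses of active columns on active bits, weaken the others.
    if learn:
        delta = np.where(x > 0, step, -step)
        perm[active] = np.clip(perm[active] + delta, 0.0, 1.0)
    return np.sort(active)
```

The returned indices define a sparse pattern: a vector of `n_columns` zeros with ones at the active columns, which can then be passed to the classifier.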
ITD for feature extraction
EMD and ITD algorithms are two types of adaptive signal decomposition methods which are well suited to processing nonlinear and nonstationary signals. 36 A performance comparison between ITD and EMD is illustrated using a simulated gearbox vibration signal. First, an amplitude and frequency-modulated simulation signal x(t) is constructed as follows
White Gaussian noise with amplitude 0.05 is added to the above signal x(t), and the time-domain waveform of the simulated signal is shown in Figure 4.

Figure 4. Time-domain waveform of simulated signal.
The ITD algorithm is applied to decompose the signal, and the analysis results are shown in Figure 5(a). There are three PRCs and a residue component r3, and most of the signal energy is concentrated in the first two PRCs, PRC1 and PRC2, which represent the original signal components x1(t) and x2(t), respectively. The third PRC, PRC3, represents the white noise.

Figure 5. Decomposition results of the simulated signal using (a) ITD and (b) EEMD.
The decomposition results of EEMD are shown in Figure 5(b). Six intrinsic mode functions (IMFs) representing decomposed signal components are obtained. Compared with ITD, the calculation time of EEMD is far longer: the ITD method takes 0.11 s to complete the signal decomposition, whereas EEMD takes 13.5 s when the number of realizations (NR) is chosen as 50. Moreover, parameter selection has a significant influence on the performance of EEMD decomposition. Thus, ITD is more efficient than EEMD in terms of computational cost.
The ITD algorithm is applied to decompose the vibration signal, and the proper rotation components corresponding to gearbox defects are selected as the signals of interest. Different features are extracted, including PRC energy ratio, PRC energy entropy, impulsion index, tolerance index, kurtosis index, peak index, and waveform index, to represent the status of the gearbox under variable operating conditions. The energy of the $i$th PRC is computed as

$$E_i = \sum_{t=1}^{N} |c_i(t)|^2$$

where $c_i(t)$ represents the $i$th PRC and $N$ is the total length of the PRC. The energy ratio of the PRC describes the energy distribution over different frequency bands 27 and thus can be selected as a feature of the gearbox

$$p_i = \frac{E_i}{\sum_{j} E_j}$$

The energy entropy characterizes the uncertainty of the energy distribution of the PRCs among different frequency bands, which is expressed as

$$H_{PRC} = -\sum_{i} p_i \log p_i$$
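The energy features can be computed with a few lines of Python. The sketch below assumes the PRCs are available as equal-length arrays and uses the natural logarithm for the entropy, which is an assumption since the base is not stated here; the function name `prc_energy_features` is hypothetical.

```python
import numpy as np

def prc_energy_features(prcs):
    """prcs: sequence of equal-length 1-D arrays c_i(t). Returns (energy ratios, entropy)."""
    energies = np.array([np.sum(np.square(c)) for c in prcs])  # energy of each PRC
    p = energies / energies.sum()                               # energy ratio of each PRC
    nz = p[p > 0]                                               # 0 * log(0) is taken as 0
    entropy = -np.sum(nz * np.log(nz))                          # natural log (assumed base)
    return p, entropy
```

A uniform energy distribution gives the largest entropy, while energy concentrated in a single PRC gives an entropy of zero, which matches the interpretation of the entropy as a measure of uncertainty of the energy distribution.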
Under variable speed and load conditions, commonly used indicators such as variance and root mean square value are no longer applicable, because they change with the operating condition. Most dimensionless indexes, in contrast, are insensitive to the working condition, load, and speed of the equipment; they depend mainly on the state of the device while remaining sufficiently sensitive to faults. Therefore, dimensionless indexes are well suited to diagnosis under variable working conditions. The dimensionless indexes selected are impulsion index, tolerance index, kurtosis index, peak index, and waveform index. 37
These dimensionless indexes are widely used in statistical analysis to characterize the location and variability of a dataset. Tolerance index, impulsion index, and kurtosis index are sensitive to incipient faults and can reflect variations of sparse impulsive signals. Waveform index and peak index are sensitive to slight deviations in the time domain. The above features are summarized in Table 1.
Table 1. Expressions of features for gearbox defect diagnosis.
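As an illustration of the five dimensionless indexes, the following Python sketch uses definitions that are common in vibration analysis. These definitions are assumptions and may differ in detail from the expressions in Table 1; the function name `dimensionless_indexes` is hypothetical.

```python
import numpy as np

def dimensionless_indexes(x):
    """Return (I_f, CL_f, K_v, C_f, S_f) for a 1-D vibration segment x."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    peak = abs_x.max()                        # peak amplitude
    rms = np.sqrt(np.mean(x ** 2))            # root mean square value
    mean_abs = abs_x.mean()                   # mean absolute amplitude
    root_amp = np.mean(np.sqrt(abs_x)) ** 2   # square-root amplitude
    impulsion = peak / mean_abs               # impulsion index I_f
    tolerance = peak / root_amp               # tolerance index CL_f
    kurtosis = np.mean(x ** 4) / rms ** 4     # kurtosis index K_v
    crest = peak / rms                        # peak index C_f
    waveform = rms / mean_abs                 # waveform index S_f
    return impulsion, tolerance, kurtosis, crest, waveform
```

Each index is a ratio of two quantities with the same dependence on amplitude, so multiplying the signal by a constant leaves all five values unchanged. This is the property that makes such indexes attractive when load and speed change the overall vibration level.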
HTM for defect classification
HTM is applied here to transform the feature input into a sparse image matrix as output. Considering that the default input format of HTM is a sequence of binary bits, the extracted features are first converted into a binary sequence. The specific steps are as follows:
1. Normalization. The extracted features are first normalized into the range of [0, 99] and represented by two decimal digits
where floor(·) is a round down function.
2. Bit vector conversion. The normalized data are then converted into bit vectors following the rules as shown in Table 2. For example, the number 10 is represented by the bit vector of [1 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1].
3. Input sequence generation. An N-dimensional feature vector is converted to an N × 20 bit matrix. The input sequence can be visually expressed as a black and white binary image or bitmap, in which a black block represents bit 1 and a white block represents bit 0. The number of bits in the bitmap equals the length of the sequence. Taking a 12-dimensional vector as an example, the generated 12 × 20 bitmap is shown in Figure 6, in which the vectors are arranged from top to bottom.
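The three conversion steps can be sketched in Python as follows. Two parts are assumptions: the normalization is taken to be min-max scaling followed by rounding down, and the digit-to-bits rule (three ones centred on the digit position in a circular 10-bit window) is inferred only from the example given for the number 10, so it may not match Table 2 for every digit. All function names are hypothetical.

```python
import numpy as np

def normalize_0_99(values, lo, hi):
    """Scale values from [lo, hi] to integers in [0, 99] (assumed min-max scaling)."""
    values = np.asarray(values, dtype=float)
    scaled = np.floor(99 * (values - lo) / (hi - lo))
    return np.clip(scaled, 0, 99).astype(int)

def digit_bits(d):
    """Assumed rule: ones at positions d-1, d, d+1 of a circular 10-bit window."""
    bits = np.zeros(10, dtype=int)
    bits[[(d - 1) % 10, d, (d + 1) % 10]] = 1
    return bits

def encode(n):
    """Encode a number in [0, 99] as 20 bits: tens digit first, then units digit."""
    tens, units = divmod(int(n), 10)
    return np.concatenate((digit_bits(tens), digit_bits(units)))

def to_bitmap(features, lo, hi):
    """Convert an N-dimensional feature vector into an N x 20 bit matrix."""
    return np.vstack([encode(n) for n in normalize_0_99(features, lo, hi)])

print(encode(10).tolist())
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
```

With a 12-dimensional feature vector, `to_bitmap` returns a 12 × 20 matrix, that is, 240 bits in total.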
Table 2. Bit vectors of the numbers 0 through 9.

Figure 6. Illustration of HTM classification model.
The input sequence is fed into the bottom region of the HTM model. The top region and the bottom region are in a parent–child relationship, 32 with the top region as the parent. With the spatial pooler discussed above, HTM fuses the bit vectors in the bottom region to obtain representative bit patterns in the top region and then classifies the patterns using the Bayesian classifier. Thus, HTM integrates feature fusion and pattern classification in one model.
Experimental studies
Experiments on a parallel-shaft helical gearbox rig are performed to evaluate the performance of the presented integrative ITD and HTM method. The test rig has a gear ratio of 80:32, and its transmission mechanism is shown in Figure 7. Different types of gear faults including tooth fracture and tooth wear are introduced as shown in Figure 8. The gearbox is driven by an induction motor (rated power 2 kW), and its speed is controlled by a variable speed controller. A magnetic brake (maximum load: 17.25 N m) is installed on the output shaft of the gearbox to change the load conditions. Four accelerometers are installed at different locations of the gearbox, namely the front end of the input shaft, the back end of the input shaft, the front end of the output shaft, and the back end of the output shaft, to acquire the gearbox status using a data acquisition system. All the accelerometers are attached with magnetic bases in the vertical direction.

Figure 7. Experimental setup of gearbox testbed.

Figure 8. Photographs of fault gears: (a) tooth fracture and (b) tooth wear.
To investigate gearbox diagnosis under variable operating conditions, 38 four different scenarios regarding speed and load are considered: (1) constant speed and constant load condition: the gearbox runs at the speed of 480 r/min without load; (2) constant speed and varying load condition: the gearbox runs at the speed of 480 r/min with varying load from 0% to 40% of the full range; (3) varying speed and constant load condition: the gearbox runs at varying speed from 360 to 600 r/min without load; and (4) varying speed and varying load condition: the gearbox runs at varying speed from 360 to 600 r/min and varying load from 0% to 60% of the full range. Take the data obtained at the back end of the output shaft as an example. Figure 9(a) shows the vibration signal for the case of gear tooth fracture under rotational speed fr = 480 r/min. Its spectrum in Figure 9(b) clearly shows the high-speed shaft rotating frequency (f1 = 19.53 Hz), the gear meshing frequency (f2 = 640.2 Hz), and the gearbox resonance frequencies (f3) excited by tooth fracture. The ITD method is then applied to decompose the vibration signal, and the decomposed results are shown in Figure 9(c).
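The reported spectral lines can be cross-checked against nominal values with a short calculation. The sketch below assumes that the 480 r/min speed refers to the input shaft and that the 80:32 gear ratio corresponds to tooth counts of 80 and 32; neither assumption is stated explicitly in the text.

```python
speed_rpm = 480           # rotational speed from the test scenario
z_in, z_out = 80, 32      # assumed tooth counts, inferred from the 80:32 gear ratio

f_in = speed_rpm / 60     # input shaft rotating frequency, Hz
f_mesh = f_in * z_in      # gear meshing frequency, Hz
f_out = f_mesh / z_out    # output (high-speed) shaft rotating frequency, Hz

print(f_in, f_mesh, f_out)  # 8.0 640.0 20.0
```

Under these assumptions the nominal meshing frequency of 640 Hz and high-speed shaft frequency of 20 Hz are close to the measured f2 = 640.2 Hz and f1 = 19.53 Hz.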

Figure 9. Experimental data signal processing using ITD: (a) time series data, (b) spectrum of experimental data, and (c) decomposition results of ITD method.
Next, the features listed in Table 1 are extracted to represent the gearbox status. All the indexes are used as the 12 input values, including 1 PRC energy entropy, 6 PRC energy ratios, and 5 dimensionless indexes. Table 3 shows the representative feature of PRC energy entropy under different operating scenarios. The feature has been analyzed in terms of mean value and variation range. The PRC energy entropy of gears under normal condition is much larger than that under fault condition. Because normal gears show no obvious impact characteristics and their energy distribution is relatively uniform, the frequency distribution of the energy has relatively high uncertainty. For a faulty gear, the vibration energy is more concentrated in the high frequency range, mainly at the mesh frequency and its higher-order harmonics, so the uncertainty of the frequency distribution is relatively small. It is found that the extracted features are distinctive for different gear statuses and are robust to the variable operating conditions.
Table 3. Feature evaluation under variable operating conditions.
The extracted 12-dimensional vector T = [HPRC, p1, p2, p3, p4, p5, p6, If, CLf, Kv, Cf, Sf] is then fed into the HTM model for autonomous gearbox diagnosis. The average feature vectors are as shown in Table 4.
Table 4. Average feature vectors in all conditions.
Every feature vector is converted into a 240-bit bitmap (12 × 20 bits), which acts as an input sequence of the HTM network. A bitmap of the normal state under the constant speed and constant load condition is shown in Figure 10. The bitmap is identified using HTM, and the active column area and top state vectors of each sample are then obtained, as shown in Table 5.

Figure 10. Bitmap of normal state in the working condition of constant speed and constant load.
Table 5. Active columns and state vectors of the top region corresponding to the training samples at constant speed and under constant load.
The top state vectors obtained from Table 5 are taken as feature vectors and classified by the classifier to realize the fault diagnosis. The k-fold cross-validation method 39 is used to calculate the recognition accuracy of the HTM model. The basic principle of k-fold cross-validation is to divide the N experimental samples into k disjoint subsets of samples. In this article, the 36 samples for each case are divided evenly into 4 subsets. A total of 144 sets of data are adopted, of which 108 sets are used for training and 36 sets for testing. The gearbox defect identification accuracy under variable operating conditions is shown in Figure 11. In the first two operating scenarios, the gearbox defects are identified accurately. In the last two operating scenarios, the defect identification accuracy is 97.2%.
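The data split can be sketched as a stratified 4-fold partition in Python. The grouping used here, 144 samples arranged as four groups of 36 with each group spread evenly over the folds, is an assumption made to reproduce the 108/36 split; the function name `stratified_folds` is hypothetical.

```python
import numpy as np

def stratified_folds(labels, k=4, seed=0):
    """Yield (train_idx, test_idx) pairs with each group split evenly over k folds."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(labels), dtype=int)
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])  # shuffle within the group
        fold_of[idx] = np.arange(len(idx)) % k           # deal samples to the folds
    for f in range(k):
        yield np.where(fold_of != f)[0], np.where(fold_of == f)[0]

labels = np.repeat(np.arange(4), 36)   # 4 groups of 36 samples (assumed grouping)
for train_idx, test_idx in stratified_folds(labels):
    print(len(train_idx), len(test_idx))   # 108 36 on each of the four folds
```

If the accuracy is evaluated on 36 test samples, a value of 97.2% corresponds to 35 correct classifications out of 36 (35/36 ≈ 0.972).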

Figure 11. Gearbox diagnosis results under different operating conditions: (a) constant speed and constant load, (b) constant speed and varying load, (c) varying speed and constant load, and (d) varying speed and varying load.
In order to demonstrate the superiority of the HTM for variable working conditions, commonly used pattern classification techniques including back-propagation (BP) neural network (NN), fuzzy c-means clustering (FCM), and SVM are also tested, and the results are compared in Table 6.
Table 6. Classification accuracy using different methods.
BP: back-propagation; FCM: fuzzy c-means clustering; SVM: support vector machine; HTM: hierarchical temporal memory; ITD: intrinsic time-scale decomposition.
According to the classification results of HTM, the normal and tooth wear states can be accurately classified, while part of the tooth fracture signals may be misclassified as the tooth wear condition. This is mainly caused by the similarity of the fault characteristic frequencies of tooth wear and tooth fracture in the spectrum. Moreover, as the degree of tooth wear increases, the difference between the two faults becomes smaller. With the data collected here, the proposed method tends to misclassify tooth fracture as tooth wear more often than FCM and SVM do. Nevertheless, the total recognition rate of HTM is higher than that of the other classification techniques. This is because BP NN and SVM essentially transform a set of input and output samples into a nonlinear optimization problem, whereas FCM is better suited to sample sets in which the samples of each class differ little. The standard deviation of the total accuracy also indicates that the presented method possesses the highest accuracy and stability. The results show that HTM has superiority over conventional pattern classifiers and can be effectively applied in gearbox fault diagnosis under variable working conditions.
Conclusion
This article developed an integrative approach of ITD and HTM method for fault diagnosis of gearbox under variable operating conditions. Simulation and experimental studies have been performed to validate the effectiveness of the presented method. From the above analysis, the conclusions can be drawn as follows:
1. A feature extraction method based on ITD is investigated. A variety of features including PRC energy ratio, PRC energy entropy, and dimensionless indexes are obtained to represent the gearbox status. Experimental studies show that the obtained features are insensitive to variable operating conditions including speed and load changes.
2. An integrative feature fusion and pattern classification model based on HTM is developed. The effectiveness of the presented method under variable working conditions is validated through the performance comparison with BP NN, SVM, and FCM.
More experimental studies under different operating conditions will be performed to further evaluate the presented method in our future research.
Footnotes
Acknowledgements
The authors appreciate the valuable comments from the anonymous reviewers, which helped to improve the article’s quality.
Academic Editor: Pak Wong
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research received financial support provided by National Science Foundation of China (no. 51504274) and Science Foundation of China University of Petroleum, Beijing (nos 2462014YJRC039 and 2462015YQ0403).
