Abstract
Prognostics and health management of planetary gearboxes is important for preventing catastrophic events and economic loss. The time-varying gear meshing position and the complex transmission path make health monitoring of planetary gearboxes challenging. Recent research shows that multichannel information fusion methods can provide more comprehensive fault information and thus more accurate diagnostic results for planetary gearboxes. This paper proposes a novel information fusion method called variational embedded composite multiscale diversity entropy (veCMDE). The proposed veCMDE applies moving average windows under each scale factor to extract richer fault information at deep scales, which overcomes the poor statistical reliability of the original variational embedding method at those scales. Based on veCMDE and the eXtreme Gradient Boosting classifier, a novel multichannel information fusion health monitoring framework for planetary gearboxes is proposed. A rigid-flexible coupling dynamics model is then built to examine the performance and explore the dynamical properties of the proposed veCMDE. Lastly, an experiment on a practical planetary gearbox is conducted to demonstrate the superiority of the proposed veCMDE method. Both simulation and experimental results show that the proposed veCMDE enhances the statistical reliability at deep scales, resulting in better information fusion performance.
Keywords
Introduction
Planetary gearboxes provide a large gear ratio in a small, lightweight package. They are critical components for mechanical transmission and are commonly found in aircraft propulsion systems, wind turbines, etc. 1 As a power-transmitting component, a planetary gearbox may suffer a sudden mechanical failure during operation. Such a failure may cause disastrous consequences or huge economic losses. Health monitoring technology is therefore indispensable for guaranteeing the reliability and availability of planetary gearboxes.
Accurately capturing abnormal conditions of the planetary gearbox helps engineers make optimal maintenance decisions and replace faulty components in time. 2 However, compared with a fixed-axis gearbox, the planetary gearbox has unique dynamic characteristics: a time-varying meshing position, multiple pairs of teeth meshing simultaneously, and a complex vibration transmission path. 3 As a result, the vibration response of a planetary gearbox is complex, modulated, time-varying, and transient.
These unique dynamic characteristics make fault diagnosis of planetary gearboxes challenging, and many researchers have worked on fault identification for planetary gearboxes. Lei et al. 4 proposed two time- and frequency-domain features, which perform better than the traditional time or frequency characteristics. Feng et al. 5 applied the Vold-Kalman filter to adaptively select the appropriate bandwidth of the vibration signal; this signal processing method greatly reduces the effect of background noise on fault identification in wind turbine planetary gearboxes. Sharma and Tiwari 6 proposed weighted multiscale fluctuation-based dispersion entropy to extract fault characteristics from the vibration signal, which provides more separable features for fault identification. Zhang et al. 7 investigated the dynamic behavior of the planetary gear through a rigid-flexible coupling dynamics model.
The studies above focus on methods that use vibration signals measured from a single channel. Recent research shows that multichannel information fusion methods can provide more comprehensive fault information than single-channel methods and thus achieve more accurate diagnostic results. Fu et al. 8 fused the diagnosis results from multiple channels to decide the fault state of the gearbox comprehensively. Their experimental results showed that the accuracy of the three-sensor diagnosis method is significantly higher than that of the single-sensor method. Maheswari and Umamaheswari 9 combined multivariate mode signal processing with noise-assisted analysis to measure vibrations containing multiple frequency modes. Their results also showed that the multichannel information fusion method outperforms the single-channel method in planetary gear condition monitoring.
The difficulty of information fusion lies in how to comprehensively and deeply fuse multichannel signals with different dimensions, different information content, and different characteristics. Multivariate entropy provides an alternative approach to multichannel feature extraction and multichannel information fusion. 10 Multivariate entropy uses a multivariate phase space to reconstruct multichannel signals with different dimensions in a state space with uniform dimensions, which allows the entropy measure to explore the dynamic characteristics hidden in the different channels. Wei et al. 11 combined the composite procedure with multivariate multiscale symbolic dynamic entropy to recognize gearbox faults. Wang et al. 12 improved multivariate multiscale sample entropy with a generalized refined composite procedure for the health monitoring of rotating machinery.
Recently, Wang et al. 13 proposed variational embedded multiscale diversity entropy (veMDE) for multichannel information fusion. Compared with the mainstream multivariate embedding procedures, veMDE establishes multiple phase spaces with different structures for each channel. This overcomes the limitation that traditional multivariate entropy methods have poor recognition ability for a single fault hidden in multiple channels.
However, veMDE has a defect: the multiscale procedure loses statistical reliability at deep scales, because it shortens the time series as the scale increases. 14 For example, for a signal with 2048 data points, the multiscale time series at scale 10 has only 204 data points, about 1/10 of the original signal. Such shortened time series may provide insufficient fault information for the entropy method, resulting in inaccurate quantification of the fault features of the planetary gearbox.
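The shortening effect described above can be checked with a few lines of Python. This is a minimal sketch of the standard non-overlapping coarse-graining step, assuming NumPy; the function name `coarse_grain` and the random test signal are illustrative and not taken from the paper.

```python
import numpy as np

def coarse_grain(x, scale):
    """Average non-overlapping windows of length `scale` (standard multiscale step)."""
    x = np.asarray(x, dtype=float)
    n = x.size // scale                      # number of complete windows
    return x[: n * scale].reshape(n, scale).mean(axis=1)

# Illustrative 2048-point signal (white noise stands in for a vibration record)
signal = np.random.default_rng(0).standard_normal(2048)
lengths = {s: coarse_grain(signal, s).size for s in (1, 2, 5, 10, 20)}
print(lengths)  # {1: 2048, 2: 1024, 5: 409, 10: 204, 20: 102}
```

At scale 20 only 102 points remain, which is the regime where the entropy estimate becomes statistically fragile.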
To overcome this defect, this paper proposes variational embedded composite multiscale diversity entropy (veCMDE) for multichannel information fusion. The proposed veCMDE applies moving average windows under each scale factor to enhance the statistical reliability at deep scales. Based on veCMDE and the eXtreme Gradient Boosting (XGBoost) classifier, 15 a novel multichannel information fusion health monitoring framework for planetary gearboxes is proposed. A rigid-flexible coupling dynamics model is then established to obtain simulated signals of the planetary gearbox under different working conditions and at different measuring points. The simulated signals are used to assess the performance of the proposed veCMDE against single-channel-based methods. Lastly, an experiment on a practical planetary gearbox is conducted to demonstrate the superiority of the proposed veCMDE method.
Methodology
Variational embedding composite multiscale diversity entropy
The original veMDE has the defect that the multiscale procedure loses statistical reliability at deep scales, because the multiscale procedure cuts down the data length at high scales. Thus, this paper proposes variational embedding composite multiscale diversity entropy (veCMDE) for multivariate information fusion. The veCMDE applies moving average windows under each scale factor to improve the statistical reliability, which allows the diversity entropy to explore more refined fault information hidden at deeper scales.
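To make the moving-average idea concrete, the sketch below implements the commonly used composite coarse-graining procedure: for scale factor τ, the averaging window is started at each of the τ possible offsets, giving τ coarse-grained series whose entropy values are then averaged. This is an assumption about the general form of the composite step, not a reproduction of the paper's equations; the function names and the use of `np.std` as a stand-in entropy function are hypothetical.

```python
import numpy as np

def composite_coarse_grain(x, scale):
    """Return `scale` coarse-grained series, one per starting offset k = 0..scale-1."""
    x = np.asarray(x, dtype=float)
    series = []
    for k in range(scale):
        n = (x.size - k) // scale            # complete windows available from offset k
        series.append(x[k : k + n * scale].reshape(n, scale).mean(axis=1))
    return series

def composite_value(x, scale, entropy_fn):
    """Average an entropy-like statistic over all offset series at one scale."""
    return float(np.mean([entropy_fn(y) for y in composite_coarse_grain(x, scale)]))

x = np.arange(2048, dtype=float)
shifted = composite_coarse_grain(x, 10)
print(len(shifted), shifted[0].size, shifted[9].size)  # 10 204 203
print(composite_value(x, 10, np.std))                  # np.std is only a placeholder statistic
```

Each individual series is still short, but the statistic is now averaged over ten differently aligned series instead of being estimated from one.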
In this section, we elaborate on the derivation procedure of veCMDE, as displayed in Figure 1. For a multivariate collected signal
where

Flowchart for calculating the proposed veCMDE.
The calculation of the proposed veCMDE contains five key steps, which are displayed in Figure 1. Specifically, the calculation steps of veCMDE are as follows:
Step 1. Given vibration signal
Step 2. Embed the one-dimensional
Physically,
Step 3. Evaluate the cosine similarity as equation (4). Then a set of similarities between every two adjacent are denoted as
Step 4. Obviously,
Step 5. Based on the composite multi-scale time series for each channel, the veCMDE can be calculated as:
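As a rough illustration of the embedding, cosine-similarity, and probability steps, the sketch below computes single-channel diversity entropy for one time series. It assumes the usual definition: consecutive embedded vectors (orbits) are compared by cosine similarity, the similarities are binned over [-1, 1], and the Shannon entropy of the bin probabilities is normalized by the logarithm of the number of bins. The embedding dimension `m=4`, the bin count `n_bins=30`, and the function name are assumptions, and the variational embedding across channels is not reproduced here.

```python
import numpy as np

def diversity_entropy(x, m=4, n_bins=30):
    """Single-channel diversity entropy sketch (m and n_bins are assumed defaults)."""
    x = np.asarray(x, dtype=float)
    # Rows are orbits [x_i, ..., x_{i+m-1}] with unit time delay
    orbits = np.lib.stride_tricks.sliding_window_view(x, m)
    a, b = orbits[:-1], orbits[1:]
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    # Cosine similarity between adjacent orbits; zero-norm pairs are mapped to 0
    cos = np.einsum("ij,ij->i", a, b) / np.where(norms == 0, 1.0, norms)
    counts, _ = np.histogram(np.clip(cos, -1.0, 1.0), bins=n_bins, range=(-1.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum() / np.log(n_bins))

t = np.arange(1024)
de_sine = diversity_entropy(np.sin(2 * np.pi * 5 * t / 1024))
de_noise = diversity_entropy(np.random.default_rng(0).standard_normal(1024))
print(de_sine, de_noise)  # the regular sine should score well below the white noise
```

In a composite multiscale setting, a function like this would be evaluated on every offset coarse-grained series at a given scale and the values averaged, as described in Step 5.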
EXtreme Gradient Boosting
The XGBoost classifier is a commonly used classifier for processing large-scale data, with fast computation and good classification ability. XGBoost uses a second-order Taylor expansion of the loss function, so that both the first- and second-order derivatives are used, which speeds up convergence. In addition, a regularization term in the objective controls the model complexity. The XGBoost procedure can be divided into three steps:
Step 1. Continually split on features to grow a tree. Each time a tree is added, a new function f(x) is learned to fit the residuals of the previous prediction.
Step 2. After training, the model contains K trees. In each tree, a sample falls into one leaf node according to its features, and each leaf node provides a score that contributes to the final score of the sample.
Step 3. Finally, predict the value for the sample by adding up the scores from all trees.
In this paper, the booster used at each iteration is set to “gbtree.” The minimum loss reduction required to split a node is set to 0.5. The maximum number of trees is set to 50.
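The role of the 0.5 split threshold can be illustrated with the standard XGBoost split-gain and leaf-weight formulas, which follow from the second-order expansion: a split is kept only if the regularized gain exceeds the threshold γ. The sketch below uses only NumPy and a toy squared-error example; the regularization weight `lam=1.0` and the toy data are assumptions, and the `params` dictionary reflects one reading of how the paper's settings map onto the parameter names of the xgboost scikit-learn wrapper (`booster`, `gamma`, `n_estimators`).

```python
import numpy as np

# Assumed mapping of the paper's settings onto xgboost parameter names
params = {"booster": "gbtree", "gamma": 0.5, "n_estimators": 50}

def leaf_weight(g, h, lam=1.0):
    """Optimal leaf score w* = -G / (H + lambda) from summed gradients and Hessians."""
    return -g.sum() / (h.sum() + lam)

def split_gain(g, h, left_mask, lam=1.0, gamma=0.5):
    """Regularized gain of splitting a node; the split is kept only if this is > 0."""
    def score(gs, hs):
        return gs.sum() ** 2 / (hs.sum() + lam)
    left = score(g[left_mask], h[left_mask])
    right = score(g[~left_mask], h[~left_mask])
    return 0.5 * (left + right - score(g, h)) - gamma

# Toy node: squared-error loss, current prediction 0.5, so g = pred - y and h = 1
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
g, h = 0.5 - y, np.ones_like(y)
left = y == 0
print(split_gain(g, h, left, gamma=params["gamma"]))   # about 0.3, so the split is kept
print(leaf_weight(g[left], h[left]))                   # -0.4
```

With only four samples in the same toy node the gain would fall below zero, which shows how γ suppresses splits that are supported by too little evidence.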
Multichannel information fusion health monitoring framework
To accurately assess the health condition of the planetary gearbox, a novel multichannel information fusion health monitoring framework is established using the proposed veCMDE and XGBoost. The proposed framework comprehensively fuses the vibration signals from different channels to give an accurate diagnostic result for the planetary gearbox. The framework is realized in the three steps below.
Step 1. Data collection: synchronously collect the vibration signals of the planetary gearbox under various fault severities using multiple sensors installed at different locations.
Step 2. Feature extraction: calculate the dynamic complexity of the collected signals with the proposed veCMDE; the resulting values form the feature set of each signal.
Step 3. Fault severity identification: input the extracted features to train the XGBoost classifier and build a health monitoring model of the planetary gearbox. The trained model can then be used to monitor the health condition of the planetary gearbox.
Simulation study
In this section, a rigid-flexible coupling dynamics model is constructed to simulate the vibration signals of the planetary gearbox under different health conditions. The simulated signals are used to explore the information fusion stability of veCMDE. In addition, the information fusion ability of the proposed method is compared with that of five methods: CMSE, CMFE, CMPE, CMDE, and veMDE.
Simulation setting
To assess the information fusion performance of the proposed veCMDE, a rigid-flexible coupling dynamics model was established to obtain simulated signals of the planetary gearbox under different working conditions and at different measuring points. First, a three-dimensional model of the planetary gearbox was built to match the practical planetary gearbox described in Section “Experiment study.” The three-dimensional model is shown in Figure 2. The gearbox is a single-stage planetary gearbox with a ring gear of 81 teeth, three planet gears each with 31 teeth, and a sun gear of 21 teeth. The detailed structural parameters are shown in Table 1.

Three-dimensional model of planetary gearbox: (a) 3D model and (b) layout.
Structural parameters of the planetary gearbox.
Second, the rigid-flexible coupling dynamics model of the planetary gearbox was set up as follows: the front cover, ring gear, and back cover of the planetary gearbox were meshed into finite elements using HyperMesh software; the planetary gearbox model was then imported into Adams software, where the front cover, ring gear, and back cover were replaced with flexible bodies. The other components were treated as rigid bodies.
Finally, the constraint forces, contact forces, and measuring points were set in Adams. Figure 2 shows the three locations of the measuring points. The measured signals can be regarded as vibration signals acquired by accelerometers. The sampling frequency of the simulated signal was set to 16,384 Hz, and the rotation speed of the sun gear was 2400 rpm.
In this paper, three health conditions of the planetary gearbox are considered: normal state, sun gear broken, and ring gear broken. Thus, this is a three-class classification problem. The faulty gear models are shown in Figure 3. To facilitate the training and visualization of the fault diagnosis model, 126 samples were generated for each class. Each sample contains vibration signals collected at three different locations (sensor positions are shown in Figure 2). Each sample has a data length of 3 × 1024. The waveforms of the simulated signals at the three measuring points are shown in Figure 4.
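For orientation, the stated tooth counts, sun speed, and sampling rate imply the characteristic frequencies computed below. The sketch assumes the usual arrangement with a fixed ring gear, the sun gear as input, and the carrier as output; this arrangement is not stated explicitly in the text, and all variable names are illustrative.

```python
# Values taken from the simulation setting
z_sun, z_ring = 21, 81
sun_rpm = 2400
fs_hz = 16384
n_points = 1024

# Assumption: ring gear fixed, sun gear driving, carrier as output
f_sun = sun_rpm / 60                                # 40.0 Hz
f_carrier = f_sun * z_sun / (z_sun + z_ring)        # about 8.235 Hz
f_mesh = f_carrier * z_ring                         # about 667.06 Hz
ratio = f_sun / f_carrier                           # about 4.857, i.e. 1 + z_ring / z_sun

duration = n_points / fs_hz                         # 0.0625 s per 1024-point channel
print(f"mesh frequency: {f_mesh:.2f} Hz, reduction ratio: {ratio:.3f}")
print(f"sun revolutions per sample: {duration * f_sun:.2f}")      # 2.50
print(f"mesh cycles per sample: {duration * f_mesh:.1f}")         # 41.7
```

Under these assumptions each 1024-point channel covers 2.5 sun revolutions, so coarse-graining at deep scales leaves very few points per revolution.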

Fault gear in the simulation: (a) sun gear broken and (b) ring gear broken.

Waveforms of simulated gear signal.
Results and analysis
After obtaining the simulated signals described in Section “Simulation setting,” we used them to examine the information fusion and feature extraction ability of veCMDE. Visualized features and the Fisher score (FS) are adopted to show the superiority of veCMDE intuitively. The Fisher score is an index for examining feature extraction ability. 16 FS denotes the ratio of the inter-class distance to the inner-class distance, as calculated in equation (8). Therefore, a higher FS value indicates a better feature extraction ability of the entropy method.
where indicates the total number of the samples.
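As an illustration of the inter-class to inner-class ratio, the sketch below computes one common form of the Fisher score: the size-weighted squared distance of each class mean from the global mean, divided by the summed squared distance of samples from their own class mean. This particular form and the function name are assumptions, and the exact expression in equation (8) may differ.

```python
import numpy as np

def fisher_score(features, labels):
    """Between-class scatter divided by within-class scatter (one common form)."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    if X.ndim == 1:
        X = X[:, None]                       # treat a 1-D input as a single feature
    mu = X.mean(axis=0)
    between, within = 0.0, 0.0
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        between += Xc.shape[0] * np.sum((mu_c - mu) ** 2)
        within += np.sum((Xc - mu_c) ** 2)
    return between / within

# Two compact, well-separated classes give a high score
print(fisher_score([0.0, 2.0, 10.0, 12.0], [0, 0, 1, 1]))   # 25.0
# The same class means with more spread inside each class give a lower score
print(fisher_score([-3.0, 5.0, 7.0, 15.0], [0, 0, 1, 1]))
```

The score rises when class centers move apart and falls when samples scatter around their own center, which matches how the visualized clusters are interpreted.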
The visualized features of each method are shown in Figure 5. Three conclusions can be drawn:

Result of the simulation evaluation: (a) CMSE, (b) CMFE, (c) CMPE, (d) CMDE, (e) veMDE, and (f) veCMDE.
First, among the single-channel-based feature extraction methods (CMSE, CMFE, CMPE, and CMDE), only CMDE has a clear class center. Accordingly, the FS of CMDE (3.67) is higher than that of the other single-channel-based feature extraction methods. It can be concluded that CMDE has the best feature extraction ability for single-channel signals.
Second, the variational-embedding-based multichannel information fusion methods have higher FS values than the single-channel-based feature extraction methods. The most obvious examples are veMDE versus CMSE, CMFE, and CMPE: these three single-channel-based methods struggle to form an obvious class center. Although CMDE has a clear class center, its inner-class distance is too large to form a clear borderline between classes. For veMDE, there is a clear class center, and the variational embedding procedure reduces the inner-class distance, resulting in a clear borderline for each class. This demonstrates the effectiveness of variational-embedding-based methods in the multichannel information fusion of the planetary gearbox.
Finally, veCMDE obtains the highest FS (12.53), indicating that veCMDE extracts the best features among the six methods. Although veMDE makes each class centrally distributed, the distance between different classes is still too small. The veCMDE not only has a clear class center but also has a larger inter-class distance, so the three clusters can be easily distinguished. This is because the composite multiscale procedure used in veCMDE greatly improves the stability at high scales, so the diversity entropy extracts more reliable fault information. This better feature extraction ability gives veCMDE the better performance in the simulation test.
To conclude, the feature extraction methods based on multichannel information fusion perform better than the single-channel-based feature extraction methods. The results also show that veCMDE has a stronger feature extraction ability than veMDE.
Experiment study
In this section, the proposed veCMDE-XGBoost method is applied to diagnose mechanical faults of planetary gearboxes. The experiment includes different fault types to measure the performance of the proposed veCMDE-XGBoost method. In addition, the information fusion methods mvMSE, mvMPE, mvMFE, mvMDE, and veMDE are compared with veCMDE.
Experiment setting
In this section, the proposed fault severity identification framework is evaluated on a real planetary gearbox. The planetary gearbox dataset was collected from the test rig HD-FD-H-03X. The experiment was conducted by the Aviation Equipment Intelligent Maintenance Lab, Northwestern Polytechnical University. The test object, planetary gearbox model ZLS160-5-S-T, is a single-stage planetary gearbox with a ring gear of 81 teeth, three planet gears each with 31 teeth, and a sun gear of 21 teeth. The test rig is displayed in Figure 6(a), and the structure diagram of the test planetary gearbox is plotted in Figure 6(b). A three-phase asynchronous motor drove the planetary gearbox with magnetic damping. The magnetic damping is an electromagnetic particle brake that provides a radial load at the output shaft. The radial load was 5 Nm. The torque load provided by the electromagnetic particle brake keeps the gears in contact. Because this study investigates only purely geometric faults, the load has little effect on the measured signal. Three accelerometers are mounted on the case of the planetary gearbox, as displayed in Figure 7. The sampling frequency of the dataset was 16 kHz.

(a) Test rig and (b) structure diagram of planetary gearbox.

Measuring points on the test gearbox.
The experiment covers nine health states: normal condition, half-broken teeth on planet gear (HBTP), crack on planet gear (CPG), wear on planet gear (WPG), inner ring fault on sun gear bearing (IRFS), crack on planet bearing shaft (CPBS), half-broken tooth on sun gear (HBTS), wear on sun gear (WSG), and outer ring fault on sun gear bearing (ORFS). The faulty components of the test planetary gearbox are displayed in Figure 8. For each health state, 292 samples are selected, and the data length of each sample is 1024.

Faulty components: (a) half-broken teeth on planet gear, (b) crack on planet gear, (c) wear on planet gear, (d) inner ring fault on sun gear bearing, (e) crack on planet bearing shaft, (f) half-broken tooth on sun gear, (g) wear on sun gear, and (h) outer ring fault on sun gear bearing.
The experiment follows the developed multichannel information fusion health monitoring framework. After the vibration signals are obtained, veCMDE is employed to extract features from five channels. The training dataset is then constructed by randomly selecting 292 samples for each health state, and the remaining samples form the test dataset. The fault features are then fed into the XGBoost classifier. To reduce the randomness of the result, each entropy-based method is run 20 times. Finally, the diagnostic accuracy output by the classifier is used to assess the feature extraction ability of each tested method. A higher test accuracy indicates a better feature extraction ability.
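The repeated random-split protocol can be sketched as follows. To keep the example self-contained, a simple nearest-centroid rule stands in for the XGBoost classifier, synthetic features stand in for the entropy features, and the per-class training size `n_train_per_class` is a hypothetical parameter; none of these choices come from the paper.

```python
import numpy as np

def stratified_split(labels, n_train_per_class, rng):
    """Randomly pick the same number of training samples from every class."""
    labels = np.asarray(labels)
    train = []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        train.extend(idx[:n_train_per_class])
    train = np.sort(np.array(train))
    test = np.setdiff1d(np.arange(labels.size), train)
    return train, test

def nearest_centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier: assign each test sample to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dist = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dist.argmin(axis=1)]

def repeated_accuracy(X, y, n_train_per_class, n_runs=20, seed=0):
    """Mean and standard deviation of test accuracy over repeated random splits."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(n_runs):
        tr, te = stratified_split(y, n_train_per_class, rng)
        pred = nearest_centroid_predict(X[tr], y[tr], X[te])
        accs.append(np.mean(pred == y[te]))
    return float(np.mean(accs)), float(np.std(accs))

# Synthetic stand-in data: 9 classes, 40 samples each, 5 features per sample
rng = np.random.default_rng(1)
y = np.repeat(np.arange(9), 40)
X = rng.standard_normal((y.size, 5)) + 10.0 * y[:, None]
mean_acc, std_acc = repeated_accuracy(X, y, n_train_per_class=20)
print(mean_acc, std_acc)
```

Reporting both the mean and the standard deviation over the 20 runs separates feature quality from the luck of a particular split.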
Result analysis
Figure 9 displays the diagnostic accuracies of the different entropy methods. From Figure 9, three conclusions can be drawn:

Experiment result.
First, the mean accuracy of veMDE (92.35%) is higher than that of mvMSE (78.78%), mvMFE (80.73%), mvMPE (91.53%), and mvMDE (87.00%). This demonstrates that the variational embedding strategy has a better feature extraction ability than the other multivariate strategies. The reason is that the variational embedding strategy uses a different structure for each channel to construct the phase space, which can extract unique fault features from the different channels. Conversely, the original multivariate strategy may fail to capture the potential fault information hidden in different channels.
Second, the diagnostic accuracy of veCMDE (95.20%) is higher than that of veMDE (92.35%). This indicates that the proposed variational embedded composite multiscale process extracts more fault information. Specifically, the veCMDE method uses a moving average window to generate multiple time series under each scale factor. The veCMDE is then calculated as the average diversity entropy value of the composite multiscale time series, which greatly improves the stability of high-scale diversity entropy. Consequently, the diversity entropy can acquire more comprehensive fault information at larger scales.
Third, the proposed method also achieves the lowest standard deviation among the six methods. This suggests that the composite multiscale method greatly improves the stability of high-scale diversity entropy. To give a clear view of the stability of veCMDE at high scales, Figure 10 shows the standard deviation of veCMDE compared with that of mvMDE and veMDE at different scales.

Standard deviation of diversity entropy values under different scales: (a) normal gear and (b) fault gear.
Figure 10 shows that the standard deviations of mvMDE and veMDE increase with the scale factor. This is because the multiscale procedure cuts down the data length of the time series at high scales, and the shortened multiscale time series cannot provide enough information for DE to estimate the dynamic complexity. Therefore, the diversity entropy values fluctuate at high scales, showing the unstable behavior in Figure 10. The large standard deviation of the entropy value enlarges the deviation of the extracted features at high scales, resulting in the unsatisfactory classification results in Figure 9.
However, veCMDE exhibits stable complexity values at high scales. In Figure 10, the standard deviation of veCMDE gradually stabilizes as the scale increases, whereas the standard deviations of mvMDE and veMDE gradually increase with the scale. In addition, the standard deviation of veCMDE is always lower than that of mvMDE and veMDE. This shows that the composite multiscale procedure enhances the statistical reliability of the complexity estimation at high scales, resulting in a better classification performance (low standard deviation).
Conclusion
This paper proposes variational embedded composite multiscale diversity entropy (veCMDE) for multichannel information fusion. The proposed veCMDE applies moving average windows under each scale factor to enhance the statistical reliability at deep scales. Based on veCMDE and the eXtreme Gradient Boosting (XGBoost) classifier, a novel multichannel information fusion health monitoring framework for planetary gearboxes is proposed. The contributions of this paper are summarized as follows:
A novel multichannel information fusion health monitoring framework for planetary gearboxes has been proposed.
A rigid-flexible coupling dynamics model of planetary gearbox has been established to explore the information fusion stability of veCMDE.
An experiment using practical planetary gearbox shows that the proposed veCMDE has the best information fusion and the best stability compared to the existing methods.
In future work, we will investigate how to build models using only simulated signals based on transfer learning, addressing the issue of unavailability of fault samples in practical engineering scenarios.
Footnotes
Handling Editor: Sharmili Pandian
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Scientific research program funded by National Natural Science Foundation of China (No.72401228), Natural Science Foundation of Shaanxi Province under Grant 2023-JC-QN-0414.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
