
Abstract

Fault detection in rotating machinery is critical to reliability and safety. However, it faces difficulties due to complex, noisy fault signatures, non-stationary behavior, and the impracticality of obtaining large labeled datasets, limiting the effectiveness of both traditional and deep learning-based methods in real-world applications. This paper introduces a novel approach that combines Variational Mode Decomposition (VMD) and Long Short-Term Memory (LSTM) networks to improve gear and bearing defect detection, filling a gap in fault diagnostics by effectively handling limited training data. VMD decomposes signals into intrinsic mode functions (IMFs), while LSTM classifies fault types and severity levels based on time-domain features extracted from the IMFs. Tested on the Case Western Reserve University Dataset (CWRUDS) for bearing defects and the Laboratory of Mechanics and Structures Dataset (LMSDS) for combined gear and bearing defects, the method outperforms vibratory analysis and conventional classifiers such as MLP, 1D-CNN, 2D-CNN, and standalone LSTM. The results show that the VMD-LSTM model is superior at reliably detecting defects and accurately diagnosing faults in complex, data-limited scenarios, making it a promising solution for machinery health monitoring.

Keywords

variational mode decomposition, long short-term memory, fault detection, severity classification, rotating machinery, signal decomposition, feature extraction, vibratory analysis

Introduction

In industrial settings, fault detection in rotating machinery must be reliable and timely in order to ensure operational safety and reduce unplanned downtime. Despite the importance of this task, traditional methods frequently fail to accurately diagnose faults due to the complexity of fault patterns and the difficulties posed by noisy, non-stationary signals. Traditionally, vibration analysis has been a cornerstone of fault diagnosis, employing signal processing techniques to detect irregularities in machine operation. For instance, Kebabsa et al.¹ applied spectral analysis to monitor machine conditions and implement predictive maintenance in industrial settings. While such methods are useful in some situations, they frequently struggle with complex fault signatures that are influenced by noise, overlapping frequency components, and variations in operating conditions.

To address these limitations, researchers investigated advanced signal processing techniques. The wavelet transform,² empirical mode decomposition (EMD),³ and its variations, such as ensemble EMD,⁴ complete ensemble EMD with adaptive noise,⁵ and improved complete ensemble EMD with adaptive noise,⁶ have all been extensively studied. Although these methods are more adaptable and noise-resistant than traditional spectral analysis, they still struggle with highly non-stationary signals and overlapping fault frequencies. Other techniques, such as high-frequency resonance analysis,⁷ cyclostationary analysis,⁸ and kurtogram analysis,⁹ have demonstrated promise but are frequently limited by computational complexity and parameter selection sensitivity.

Variational Mode Decomposition (VMD)¹⁰ is a recent advancement in signal decomposition that has gained popularity for its ability to adaptively decompose signals into oscillatory modes with distinct frequencies and amplitudes. Choudhury et al.¹¹ found that VMD outperformed EMD for Tacho-Less Order Tracking in gear fault detection. VMD’s noise tolerance makes it ideal for diagnosing faults in rotating machinery, as demonstrated by Mohanty et al.¹² The authors used VMD to diagnose ball-bearing defects, demonstrating its effectiveness in signal decomposition. Sharma and Parey¹³ applied VMD to extract weak fault transients, enabling the detection of gearbox defects under varying speed conditions. Sharma¹⁴ expanded this work by proposing a VMD-based approach combined with permutation entropy for gear fault detection in real-speed applications and successfully validated its effectiveness in diagnosing planetary gearbox faults. Chen et al.¹⁵ introduced a VMD-acoustic emission technique for gearbox fault detection, concluding that the method could generate distinct features representing various fault conditions. Zhang et al.¹⁶ also developed a refined Coarse-to-fine VMD approach aimed at improving pattern recognition accuracy and parameter selection, demonstrating its effectiveness through experimental results.

In parallel, the integration of artificial intelligence (AI) techniques has emerged as a powerful tool for fault detection, offering improved feature extraction and classification capabilities. Unsupervised learning¹⁷ methods, such as Principal Component Analysis (PCA),¹⁸ Competitive Learning,¹⁹ and Self-Organizing Maps (SOM),²⁰ have been used to identify patterns in unlabeled data.

However, these methods frequently necessitate extensive preprocessing and may struggle with complex, multi-dimensional data. On the other hand, supervised learning²¹ approaches, including Support Vector Machines (SVM),²² Convolutional Neural Networks (CNN),²³ and Nearest Neighbor algorithms,²⁴ have shown greater promise in fault classification. Long Short-Term Memory (LSTM) networks,²⁵ a type of recurrent neural network (RNN), have received special attention for their ability to capture long-term dependencies in sequential data, making them extremely useful for fault prognosis in complex systems.

Xie and Zhang²⁶ demonstrated that LSTM outperforms other methods in machine prognosis for complex bearing systems. Anwarsha and Narendiranath Babu²⁷ further validated the effectiveness of LSTM networks in detecting faults across various components of rotating machinery. To assess LSTM’s performance in real-world applications, Cao et al.²⁸ employed it for diagnosing wind turbine gearbox faults, confirming its capability for accurate fault detection. Additionally, Masri and Al-Jabi²⁹ utilized LSTM-based neural networks to develop analytical predictive models for wind speed, direction, and mechanical power, achieving an average error of less than 3% and an R-squared value of 0.95, demonstrating the model’s precision.

Despite these advances, a significant challenge persists: the reliance on large labeled datasets, which is frequently impractical in industrial settings due to time, cost, and sensor limitations. This limitation has prompted the creation of hybrid approaches that combine advanced signal processing and AI techniques. Several studies investigated hybrid approaches. Damou et al.³⁰ proposed an approach combining ensemble empirical mode decomposition, wavelet packet transform, and random forest for bearing fault diagnosis, achieving high accuracy in identifying defects. Moumene and Ouelaa³¹ combined wavelet transform and pattern recognition to detect faults in gears and bearings, resulting in high accuracy in identifying combined fault types.

Gu et al.³² enhanced fault detection by combining VMD and Continuous Wavelet Transform (CWT) to improve CNN-based diagnosis. Li et al.³³ utilized wavelet packet decomposition and support vector machines (SVMs) to detect aero-engine bearing faults, concluding that their approach was both accurate and easy to implement. In an experimental study, Almutairi and Sinha³⁴ applied vibration-based machine learning (VML) to classify rotor and bearing faults, evaluating their method at two different speeds. Furthermore, Tong et al.³⁵ proposed a novel fault diagnosis framework for rolling element bearings that combines dual-tree complex wavelet packet transform, improved intrinsic time-scale decomposition, singular value decomposition, and an online sequential extreme learning machine to effectively extract meaningful fault features and identify fault patterns from vibration signals. While these hybrid methods are promising, they still require significant computational resources and may not perform well in data-limited scenarios.

Despite advances in fault detection methods for rotating machinery, complex fault patterns and the need for large datasets continue to pose challenges. This reliance limits their practical use in industrial settings, where collecting large amounts of labeled data is costly and time-consuming. To address these issues, this paper proposes a novel approach that combines VMD and LSTM networks to improve fault detection in rotating machinery despite limited training data. VMD decomposes signals into intrinsic mode functions (IMFs), which LSTM then uses to classify fault types and severity levels. This approach takes advantage of the strengths of both methods: VMD’s ability to handle noisy, non-stationary signals and LSTM’s ability to learn complex temporal patterns. The proposed method, which reduces the reliance on large labeled datasets, provides a practical solution for real-time fault diagnosis in industrial applications where data availability is frequently limited.

The manuscript is structured as follows: Section “Introduction” reviews the literature, followed by Sections “Proposed approach” and “Theoretical background,” which explain the proposed approach and its underlying methods. Section “Experimentation” describes the datasets and the experimental setup. Section “Results and discussion” presents and discusses the results. Conclusions are given in Section “Conclusion.”

Proposed approach

Traditional vibratory analysis is widely used for fault detection in rotating machinery; however, it often struggles with complex fault patterns, noise interference, and varying fault conditions, making accurate diagnosis challenging. While deep learning methods have been introduced to classify defects directly from raw signals, they typically require large labeled datasets, which are often impractical to obtain in industrial settings. To address these challenges, advanced signal decomposition techniques have been developed to extract meaningful fault signatures by breaking down complex signals into more interpretable components.

This paper proposes a novel integration of Variational Mode Decomposition (VMD) and Long Short-Term Memory (LSTM) networks that improves fault detection accuracy while using significantly fewer training samples. VMD adaptively decomposes vibration signals, revealing important fault-related information that would otherwise go undetected. The LSTM then uses sequential learning to classify fault types and severity levels, ensuring reliable performance even in small-data scenarios. By combining these two techniques, the proposed approach overcomes the limitations of both traditional vibratory analysis and deep learning models that require extensive training data, resulting in superior accuracy in complex fault diagnosis. Figure 1 depicts the proposed method in step-by-step detail:

➢ Data acquisition: This is the initial step in the workflow. Vibrational signals are repeatedly measured on the machine during various fault scenarios. These signals provide the raw input data for the fault detection process.

➢ Apply VMD: In this step, the measured vibrational signals are decomposed into intrinsic mode functions (IMFs) using the VMD technique. The number of IMFs, denoted K, is set to the mean number of IMFs obtained per signal, following the work of Mnassri et al.¹⁸ and Isham et al.³⁶

➢ Data splitting: Following decomposition, the dataset is split into three subsets: training (80%), validation (10%), and testing (10%). This division ensures that the model is trained on a large enough dataset while keeping separate subsets for validation and testing to prevent overfitting. Each signal in these subsets is represented by its IMFs, which serve as the foundation for subsequent analysis. The training set teaches the model, the validation set is used to tune hyperparameters, and the testing set evaluates the model’s performance on previously unseen data.

➢ Feature extraction: Scalar indicators are the most important features in the detection of faults in rotating machinery. Consequently, scalar indicators are calculated for all IMFs obtained in the three subsets.

➢ Training and classification: The features of the training and validation subsets are used to train the LSTM network, resulting in a reliable classification model. During training, the LSTM network learns to classify fault types and severity levels using the extracted features. Once trained, the model is used to classify the testing set based on the calculated features.
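The data-splitting step in the list above can be illustrated with a minimal sketch. It assumes a simple shuffled split of sample indices; the function name `split_indices` and the fixed seed are illustrative choices, not details given by the authors.

```python
import random

def split_indices(n_samples, train=0.8, val=0.1, seed=0):
    """Shuffle sample indices and split them into train/validation/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # fixed seed gives a reproducible split
    n_train = round(n_samples * train)
    n_val = round(n_samples * val)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])     # whatever remains is the test set

# 30 samples per class, as an example of the 80/10/10 proportions
train_idx, val_idx, test_idx = split_indices(30)
print(len(train_idx), len(val_idx), len(test_idx))
```

With 30 samples this gives subsets of 24, 3, and 3 indices. Splitting within each class, rather than over the pooled dataset, keeps the classes balanced across the three subsets.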

Figure 1.

Flow chart of the proposed VMD-LSTM fault detection approach.

Finally, to confirm the superiority of the proposed approach over conventional vibration analysis, an envelope analysis is performed on the testing set’s IMFs to obtain envelope spectra and verify the potential detection of faults in these signals.
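To make the envelope analysis step concrete, the following is a minimal, dependency-free sketch of an envelope spectrum computed through the analytic signal. It is an assumption about how such an analysis is commonly done, not the authors' implementation; the naive DFT, the function names, and the synthetic 64 Hz carrier modulated at 8 Hz are all illustrative. In practice an FFT library would replace the O(N²) transform.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (dependency-free for clarity)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * m / n)
                  for m, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out

def envelope(signal):
    """Magnitude of the analytic signal; an even-length signal is assumed."""
    n = len(signal)
    spectrum = dft(signal)
    half = n // 2
    # Keep DC and Nyquist, double positive frequencies, zero negative ones.
    for k in range(1, half):
        spectrum[k] *= 2
    for k in range(half + 1, n):
        spectrum[k] = 0
    return [abs(z) for z in dft(spectrum, inverse=True)]

def envelope_spectrum(signal):
    """Amplitude spectrum of the mean-removed envelope, bins 0..N/2."""
    env = envelope(signal)
    mean = sum(env) / len(env)
    spectrum = dft([e - mean for e in env])
    return [abs(c) / len(env) for c in spectrum[:len(env) // 2 + 1]]

# Synthetic example (hypothetical values): 64 Hz carrier, 8 Hz modulation.
fs, n = 256, 256
x = [(1 + 0.5 * math.cos(2 * math.pi * 8 * t / fs))
     * math.cos(2 * math.pi * 64 * t / fs) for t in range(n)]
spec = envelope_spectrum(x)
peak_bin = max(range(1, len(spec)), key=spec.__getitem__)
peak_hz = peak_bin * fs / n   # frequency of the dominant envelope component
```

In this toy case the dominant envelope component sits at the modulation frequency rather than at the carrier, which is why a fault that modulates a resonance is expected to show up at its characteristic frequency in the envelope spectrum.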

Theoretical background

Understanding the theoretical foundations is pivotal in comprehending the proposed approach for fault detection. This section briefly describes the basic principles of the methods used.

Variational mode decomposition

Variational Mode Decomposition is a signal processing technique that adaptively decomposes a signal into a set of oscillatory components called Intrinsic Mode Functions (IMFs). The core principle of VMD is to minimize a cost function while decomposing a given signal x(t) into a sum of K narrowband oscillatory modes u_k(t), k = 1, 2, …, K, with corresponding center frequencies f_k.

The optimization problem to obtain the VMD can be formulated as¹⁰:

$$\min \sum_{k=1}^{K} \left\| x(t) - \sum_{k=1}^{K} u_{k}(t) \right\|_{2}^{2} + λ \sum_{k=1}^{K-1} \left\| f_{k+1} - f_{k} \right\|_{2}^{2} \tag{1}$$

where u_k(t) represents the kth mode and f_k is its center frequency. The first term ensures the fidelity of the input signal reconstruction. The second term enforces a smooth variation in the center frequencies. $λ$ is a regularization parameter controlling the trade-off between fidelity and smoothness; it is chosen following the work of Dragomiretskiy and Zosso.¹⁰

Long short-term memory

Long Short-Term Memory is a type of recurrent neural network (RNN) architecture designed to capture long-range dependencies in sequential data. The LSTM unit consists of various gates (input, forget, output) that regulate the flow of information within the network.

Mathematically, an LSTM unit for a given input sequence $X = \{x_{1}, x_{2}, \ldots, x_{T}\}$ of length T can be defined by its computation steps as²⁵:

Compute the input, forget, and output gates:

$$\begin{aligned} i_{t} &= σ(W_{i} \cdot [h_{t-1}, x_{t}] + b_{i}) \\ f_{t} &= σ(W_{f} \cdot [h_{t-1}, x_{t}] + b_{f}) \\ o_{t} &= σ(W_{o} \cdot [h_{t-1}, x_{t}] + b_{o}) \end{aligned} \tag{2}$$

Update the cell state C_t and hidden state h_t:

$$\begin{aligned} \tilde{C}_{t} &= \tanh(W_{C} \cdot [h_{t-1}, x_{t}] + b_{C}) \\ C_{t} &= f_{t} * C_{t-1} + i_{t} * \tilde{C}_{t} \\ h_{t} &= o_{t} * \tanh(C_{t}) \end{aligned} \tag{3}$$

where i_t, f_t, and o_t are the input, forget, and output gate vectors at time step t; W_i, W_f, W_o, and W_C are weight matrices; b_i, b_f, b_o, and b_C are bias terms; σ denotes the sigmoid function; tanh denotes the hyperbolic tangent function; and * denotes element-wise multiplication.
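As an illustration of equations (2) and (3), a single LSTM time step can be written in a few lines of plain Python. This is a sketch for understanding only: the dictionary layout of the parameters, the helper names, and the all-zero toy weights are assumptions, and a real model would rely on a deep learning framework.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W . v + b for a matrix W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following equations (2) and (3)."""
    concat = h_prev + x_t                                   # [h_{t-1}, x_t]
    i_t = [sigmoid(z) for z in affine(params["W_i"], params["b_i"], concat)]
    f_t = [sigmoid(z) for z in affine(params["W_f"], params["b_f"], concat)]
    o_t = [sigmoid(z) for z in affine(params["W_o"], params["b_o"], concat)]
    c_tilde = [math.tanh(z) for z in affine(params["W_C"], params["b_C"], concat)]
    # Element-wise update of the cell state, then of the hidden state.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Toy example: 1 hidden unit, 2 inputs, all-zero parameters (hypothetical values).
zeros = {name: [[0.0, 0.0, 0.0]] for name in ("W_i", "W_f", "W_o", "W_C")}
zeros.update({name: [0.0] for name in ("b_i", "b_f", "b_o", "b_C")})
h_t, c_t = lstm_step([0.3, -0.2], [0.0], [1.0], zeros)
```

With all parameters at zero, every gate equals 0.5 and the candidate state is 0, so the new cell state is simply half of the previous one. The example shows how the forget gate scales the memory carried over from the previous step.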

Time domain features

Time-domain indicators serve as crucial descriptors extracted from the intrinsic characteristics of signals, providing valuable information about fault conditions in rotating machines. Their importance lies in capturing nuanced fault signatures that might otherwise remain hidden, offering a concise yet informative representation of underlying fault conditions. Table 1 lists the thirteen scalar indicators used in this study.

Table 1.

Time domain indicators for a given signal S(n).

Scalar indicator	Equation	Scalar indicator	Equation
Crest Factor CF	$\frac{\max |S(n)|}{\sqrt{\frac{1}{N} \sum_{n=1}^{N} S(n)^{2}}}$	Peak to Peak P2P	$\max S(n) - \min S(n)$
Crest Value CV	$\max |S(n)|$	Root Mean Squared RMS	$\sqrt{\frac{1}{N} \sum_{n=1}^{N} S(n)^{2}}$
Energy E	$\sum_{n=1}^{N} S(n)^{2}$	Standard Deviation Std	$\sqrt{\frac{1}{N-1} \sum_{n=1}^{N} (S(n) - \bar{S})^{2}}$
Impulse Factor IF	$\frac{\max |S(n)|}{\frac{1}{N} \sum_{n=1}^{N} |S(n)|}$	Shannon Entropy SE	$-\sum_{n=1}^{N} S(n)^{2} \log (S(n)^{2})$
K-Factor KF	$\max |S(n)| \cdot \sqrt{\frac{1}{N} \sum_{n=1}^{N} S(n)^{2}}$	Shape Factor SF	$\frac{\sqrt{\frac{1}{N} \sum_{n=1}^{N} S(n)^{2}}}{\frac{1}{N} \sum_{n=1}^{N} |S(n)|}$
Kurtosis K	$\frac{\frac{1}{N} \sum_{n=1}^{N} (S(n) - \bar{S})^{4}}{\left[\frac{1}{N} \sum_{n=1}^{N} (S(n) - \bar{S})^{2}\right]^{2}}$	Skewness S	$\frac{\frac{1}{N} \sum_{n=1}^{N} (S(n) - \bar{S})^{3}}{\left(\sqrt{\frac{1}{N-1} \sum_{n=1}^{N} (S(n) - \bar{S})^{2}}\right)^{3}}$
Margin Factor MF	$\frac{\max |S(n)|}{\left(\frac{1}{N} \sum_{n=1}^{N} |S(n)|\right)^{2}}$
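A subset of the indicators in Table 1 can be computed as in the sketch below. It covers only the crest value, energy, RMS, crest factor, shape factor, impulse factor, margin factor, and kurtosis. The function name, the dictionary keys, and the four-sample toy signal are illustrative assumptions, and the signal is assumed to be non-constant so that no denominator is zero.

```python
import math

def time_domain_features(s):
    """Compute a subset of the Table 1 indicators for a list of samples s."""
    n = len(s)
    mean = sum(s) / n
    peak = max(abs(v) for v in s)              # Crest Value: max |S(n)|
    energy = sum(v * v for v in s)             # Energy: sum of squares
    rms = math.sqrt(energy / n)                # Root Mean Squared
    mean_abs = sum(abs(v) for v in s) / n      # mean absolute value
    m2 = sum((v - mean) ** 2 for v in s) / n   # second central moment
    m4 = sum((v - mean) ** 4 for v in s) / n   # fourth central moment
    return {
        "crest_value": peak,
        "energy": energy,
        "rms": rms,
        "crest_factor": peak / rms,
        "shape_factor": rms / mean_abs,
        "impulse_factor": peak / mean_abs,
        "margin_factor": peak / mean_abs ** 2,
        "kurtosis": m4 / m2 ** 2,
    }

# Toy signal (hypothetical values) to exercise the function
features = time_domain_features([1.0, -1.0, 3.0, -3.0])
```

Applying the function to each IMF of a signal and stacking the results gives one column of indicators per IMF, which is the layout of the feature matrix used later as classifier input.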

Experimentation

In the experimental phase, the proposed approach was tested on two distinct datasets. The first dataset, CWRUDS (Case Western Reserve University Dataset),³⁷ was obtained online from Case Western Reserve University and only included bearing defect signals. The second dataset, LMSDS (Laboratory of Mechanics and Structures Dataset), was created in the Laboratory of Mechanics and Structures at the University of Guelma in Algeria and includes a comprehensive collection of signals covering both bearing and gear defects. The use of these distinct datasets aimed to validate and demonstrate the approach’s efficacy in fault detection across different severities and types of defects.

CWRUDS

The Bearing Data Center at Case Western Reserve University³⁷ has a large database of vibration signals derived from various bearing types operating under different conditions and exhibiting multiple defects. This database contains signals from both normal and faulty bearings, allowing for a thorough investigation of bearing behavior under varying loads, speeds, and lubrication conditions. The signals were collected using a specialized test bench designed to generate typical bearing faults and then capture their characteristics. Figure 2 shows the test bench setup, which includes a motor on the left, a “transducer/encoder” coupling in the center, and a dynamometer on the right. In this study, signals were recorded at a sampling rate of 48 kHz, giving a 10-s signal for each case, which was then segmented into 1-s samples. This set included signals from both normally operating bearings and those with faults (ball defect, inner ring defect, outer ring defect), allowing for a thorough analysis across a wide range of operational conditions and fault scenarios.

Figure 2.

Case Western Reserve University’s test bench.

LMSDS

The LMSDS included various vibration signals for gears and bearings in a variety of working states and conditions. The data were collected at the Laboratory of Mechanics and Structures (LMS), University of Guelma, Algeria, using the testing setup shown in Figure 3. This setup consists of an electrical motor with speed control, which allows for variable rotational frequencies. The flexible coupling transmits the motor’s rotational motion to the gearbox’s drive shaft. The gearbox consists of four spur gears arranged on three shafts (drive, middle, and driven), which are guided by six housing bearings fixed to a steel box. The Plexiglas top face makes it easier to create gear faults and replace gears, as well as apply lubrication. An electromagnetic brake connected to the third shaft creates a load that simulates transmission pressure. Defects were simulated on gears 2 and 4, as well as the middle shaft bearing. The vibration data were collected under identical load conditions induced by the brake. The data were sampled at 16,384 Hz with a recording time of 1 s.

Figure 3.

Laboratory of Mechanics and Structures’ test bench.

Table 2 displays the frequency characteristics required for fault detection, including the rotation frequencies Fr and calculated frequencies of the bearings (outer race defect BPFO, inner race defect BPFI, ball defect BPF, and cage defect CF).

Table 2.

Frequency characteristics of the defects for both sets.

LMSDS				CWRUDS
F (Hz)	Shaft 1	Shaft 2	Shaft 3	Fr (Hz)	BPFI	BPFO	BPF	CF
F_r	14	11.76	16.93	28.66–29.95	155.19–162.18	102.74–107.36	135.08–141.16	10.68–11.16
BPFO	42.88	36.02	51.86
BPFI	69.10	58.04	83.56
BPF	56.50	47.46	68.32
CF	5.34	4.49	6.46
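A short arithmetic check helps in reading Table 2: dividing each characteristic frequency by the rotation frequency of its shaft gives a dimensionless fault order. The sketch below uses only the values printed in the table; the variable and function names are illustrative, and treating the CWRUDS entries as (low, high) bounds paired with the two ends of the speed range is an assumption.

```python
# Values copied from Table 2 (Hz); LMSDS lists hold shafts 1 to 3.
lmsds = {
    "Fr":   [14.0, 11.76, 16.93],
    "BPFO": [42.88, 36.02, 51.86],
    "BPFI": [69.10, 58.04, 83.56],
    "BPF":  [56.50, 47.46, 68.32],
    "CF":   [5.34, 4.49, 6.46],
}
# CWRUDS entries are (low, high) bounds over the range of shaft speeds.
cwru = {
    "Fr":   (28.66, 29.95),
    "BPFI": (155.19, 162.18),
    "BPFO": (102.74, 107.36),
    "BPF":  (135.08, 141.16),
    "CF":   (10.68, 11.16),
}

def orders(table):
    """Divide each characteristic frequency by the matching rotation frequency."""
    return {name: [f / fr for f, fr in zip(freqs, table["Fr"])]
            for name, freqs in table.items() if name != "Fr"}

def relative_spread(values):
    """(max - min) / mean, a simple measure of how constant the ratios are."""
    return (max(values) - min(values)) / (sum(values) / len(values))

lmsds_orders = orders(lmsds)
cwru_orders = orders(cwru)
for name, values in lmsds_orders.items():
    print(name, [f"{v:.3f}" for v in values], f"spread={relative_spread(values):.4f}")
```

The ratios are nearly constant within each row (about 3.06 for the LMSDS BPFO on all three shafts, for example). This is expected because the characteristic frequencies of a given bearing scale linearly with shaft speed.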

Table 3 lists the full composition of the two datasets, where SD, AD, and CD denote Small Defect, Average Defect, and Critical Defect, respectively. Each severity level has 30 samples of 1 s each, divided into 24 for training, 3 for validation, and 3 for testing.

Table 3.

Total compositions of the datasets used (CWRUDS on the left and LMSDS on the right).

Data set	Defect type	Defect severity	Data set	Defect type	Defect severity
CWRUDS	Without defect	/	LMSDS	Without defects	/
	Ball defect	SD		Bearings defects	SD
		AD			AD
		CD			CD
	Inner ring defect	SD		Gears defects	SD
		AD			AD
		CD			CD
	Outer ring defect	SD			CD + SD
		AD			CD + AD
		CD			CD + CD
				Combined gears and bearings defects	SD bearing + SD gear
					SD bearing + AD gear
					SD bearing + CD gear
					AD bearing + SD gear
					AD bearing + AD gear
					AD bearing + CD gear
					CD bearing + SD gear
					CD bearing + AD gear
					CD bearing + CD gear

Each vibration signal was segmented into 1-s samples, with CWRUDS sampled at 48 kHz and LMSDS at 16,384 Hz. The dataset was split into 80% training, 10% validation, and 10% testing subsets. The LSTM model comprised two LSTM layers (each with 64 hidden units), a dropout layer (rate 0.2), and a fully connected softmax layer. It was trained with the Adam optimizer (learning rate = 0.001, batch size = 32, 60–100 epochs) with early stopping to prevent overfitting.
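The segmentation into 1-s samples can be sketched as follows. Non-overlapping windows and the dropping of any incomplete tail are assumptions, since the text does not state how the windows were cut; the function name `segment` is illustrative.

```python
def segment(signal, fs, seconds=1.0):
    """Cut a signal into consecutive, non-overlapping windows of the given duration."""
    size = int(round(fs * seconds))        # samples per window
    n_windows = len(signal) // size        # any incomplete tail is dropped
    return [signal[i * size:(i + 1) * size] for i in range(n_windows)]

# A 10-s record at 48 kHz (zeros stand in for real vibration data).
record = [0.0] * (10 * 48_000)
windows = segment(record, fs=48_000)
print(len(windows), len(windows[0]))
```

At 48 kHz each window holds 48,000 points and at 16,384 Hz it holds 16,384 points, so the two datasets yield samples of different lengths before decomposition and feature extraction.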

Results and discussion

Vibratory analysis

This section presents a conventional vibration analysis of the testing sets. The VMD decomposition parameters (K, α) were set based on Isham et al.³⁶ and Feng et al.,³⁸ resulting in an average number of IMFs equal to 5. Out of the large number of samples treated, the ball bearing defects and the gear defects of the CWRUDS and LMSDS, respectively, are presented below.

Figure 4 depicts the analysis procedure: Figure 4(a) shows the time domain representation of three signals with small ball defects, Figure 4(b) presents the resulting IMFs for each signal, and Figure 4(c) displays the envelope spectra of the IMFs. The envelope spectra in Figure 4(c) show no component at the ball defect frequency BPF. Although some peaks appear in IMFs 2 and 3 of the first signal and IMF 3 of the second signal, they correspond to the rotation frequency Fr and therefore give no diagnosis of the bearing’s condition.

Figure 4.

Vibratory analysis of small ball defect of CWRUDS: (a) time domain representation, (b) resulting IMFs, and (c) envelope spectrum of the IMFs.

Figure 5 shows the envelope spectra for the remaining severity levels of the ball defect. Figure 5(a) depicts various samples of the average defect, while Figure 5(b) examines the critical defect. The BPF is also absent from these envelope spectra. Despite VMD’s robustness for fault detection, which allowed it to detect both inner and outer ring defects in the same dataset, diagnosing the ball defect proved impossible at all three severity levels. These findings are consistent with the works of Nouioua et al.³⁹ and Smith and Randall.⁴⁰

Figure 5.

IMFs’ envelope spectrum of ball bearing defect signals from CWRUDS: (a) average ball defect and (b) critical ball defect.

The LMSDS signals contain noise, which makes detection more difficult. Detection was not possible in all cases. VMD is an effective signal processing tool, but the measured signals in this case presented a challenge. For the gear defects, analysis of the small defect signals shown in Figure 6 yielded no results. Figure 6(a) depicts the time domain representation of the three signals, which are clearly corrupted by high levels of noise.

Figure 6.

Vibratory analysis of small gear defect of LMSDS: (a) time domain representation, (b) resulting IMFs, and (c) envelope spectrum of the IMFs.

The resulting IMFs in Figure 6(b) are processed by envelope analysis to produce the spectra in Figure 6(c). These spectra show peaks at F_r₁ and its harmonics (×2, ×3, …), which belong to the first shaft and can be interpreted as misalignment of the input shaft. However, F_r₂, which is related to the defective gear, is absent from the spectra in Figure 6(c).

The analysis of the remaining gear defects in Figure 7 likewise yields no clear indication of the defects. In Figure 7(a) and (b), the presence of F_r₁ and its harmonics confirms the misalignment diagnosis for the average and critical gear defects, with the harmonics’ amplitudes exceeding that of the rotation frequency.

Figure 7.

IMFs’ envelope spectrum of gear defect signals from LMSDS: (a) average gear defect and (b) critical gear defect.

Figure 8(a) to (c) show the analysis of combined gear defects, which represent critical and small, critical and average, and critical and critical defects, respectively. Some F_r₂ peaks can be seen, but there are no harmonics, and F_r₃ is completely absent, making it impossible to detect gear defects.

Figure 8.

IMFs’ envelope spectrum of combined gear defect signals from LMSDS: (a) critical and small defects, (b) critical and average defects, and (c) critical and critical defects.

The diagnostic process for the two datasets containing gear and bearing defects revealed that traditional VMD analysis struggled to accurately detect the majority of fault types, with the exception of inner and outer ring defects. While acknowledging these limitations, it is important to emphasize the inherent complexity and variability of fault signatures, which may render traditional methods ineffective in certain scenarios.

Application of the VMD-LSTM combination

This section presents the classification and fault diagnosis results for the gear and bearing defects in both datasets. Several recurrent neural networks (RNNs) were trained to detect the faults. Each RNN is built by feeding the LSTM network with the feature matrix obtained by computing the scalar indicators of the IMFs produced by the VMD. Table 4 shows the feature matrix extracted from a defect-free CWRUDS signal: each column corresponds to one IMF and each row to one scalar indicator. Because the matrix covers a wide range of signal-derived parameters, each describing a different aspect of the vibration signal, it allows the LSTM network to learn complex relationships and temporal dependencies in the data. The model’s predictive accuracy is then refined through iterative training and validation, supporting proactive maintenance and condition monitoring of rotating machinery.

Table 4.

Feature matrix of a signal without defect from the CWRUDS.

Scalar indicator	Signal without defect
	IMF1	IMF2	IMF3	IMF4	IMF5
STD	0.03958896	0.0368447	0.02334455	0.00817395	0.00174969
Skewness	0.05681834	−0.00012252	−1.85E-05	0.00533201	4.48E-05
Kurtosis	2.71165859	2.89355662	1.57581672	3.15807748	3.13994139
Peak to Peak	0.25788425	0.24297596	0.09014244	0.08313773	0.01543696
RMS	0.04057073	0.03684446	0.02334432	0.00817412	0.00174968
Crest Factor	3.35643633	3.32929993	1.91088351	4.96755348	4.41690562
Shape Factor	1.24073652	1.26023958	1.11796921	1.25710614	1.2617955
Impulse Factor	4.16445314	4.19571555	2.13630892	6.24474199	5.57323165
Margin Factor	127.357565	143.51159	102.308736	960.385072	4019.17476
Energy	79.0072305	65.1606772	26.1579428	3.20717998	0.14694662
Crest Value	0.13617307	0.12266625	0.04553417	0.04253234	0.00772818
K Factor	0.00552464	0.00451957	0.00106296	0.00034766	1.35E-05
Entropy	451.797539	382.196711	187.878762	28.4404843	1.75466756
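The entries of Table 4 can be cross-checked with simple arithmetic. The sketch below tests two relations on the printed values: Energy ≈ N · RMS², with N = 48,000 samples per 1-s segment, and K Factor ≈ Crest Value · RMS. That these relations hold is an observation drawn from the tabulated numbers, not a statement made by the authors; the variable names and tolerances are illustrative.

```python
import math

N = 48_000  # samples in a 1-s CWRUDS segment sampled at 48 kHz
# Rows copied from Table 4, columns IMF1..IMF5.
rms    = [0.04057073, 0.03684446, 0.02334432, 0.00817412, 0.00174968]
energy = [79.0072305, 65.1606772, 26.1579428, 3.20717998, 0.14694662]
crest  = [0.13617307, 0.12266625, 0.04553417, 0.04253234, 0.00772818]
k_fact = [0.00552464, 0.00451957, 0.00106296, 0.00034766, 1.35e-05]

# Energy should equal N times the squared RMS, up to rounding of the table.
energy_ok = [math.isclose(e, N * r * r, rel_tol=1e-4)
             for e, r in zip(energy, rms)]
# The K Factor row matches crest value times RMS; the looser tolerance
# accounts for the 3-significant-figure IMF5 entry.
k_ok = [math.isclose(k, c * r, rel_tol=5e-3)
        for k, c, r in zip(k_fact, crest, rms)]
print(energy_ok, k_ok)
```

Both relations hold for all five IMFs within rounding. This confirms that the indicators were computed on 48,000-point segments and that the rows of the feature matrix are mutually consistent.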

Figure 9 shows that the RNNs trained for the CWRUDS start with defect type classification (RNN1) and then proceed to severity classification for each type of defect (RNN2, RNN3, RNN4). Finally, to test the proposed approach further, all classes of this dataset were classified at once (RNN5). This last network combines the tasks of the previous RNNs (RNN1 to RNN4) to give the fault type and its corresponding severity directly.
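The two-stage arrangement described above (a type classifier followed by one severity classifier per defect type) can be sketched as a simple routing function. The stand-in classifiers below are hypothetical threshold rules used only to make the example runnable; they are not the trained networks, and the function and key names are illustrative.

```python
def diagnose(features, type_model, severity_models):
    """Two-stage diagnosis: predict the defect type, then its severity."""
    defect_type = type_model(features)
    if defect_type not in severity_models:   # e.g. a healthy bearing
        return defect_type, None
    return defect_type, severity_models[defect_type](features)

# Stand-in classifiers (hypothetical rules, NOT the trained RNNs).
def type_model(f):
    return "ball" if f["kurtosis"] > 3.5 else "without defect"

severity_models = {"ball": lambda f: "SD" if f["rms"] < 0.1 else "CD"}

result = diagnose({"kurtosis": 4.2, "rms": 0.05}, type_model, severity_models)
healthy = diagnose({"kurtosis": 3.0, "rms": 0.05}, type_model, severity_models)
```

A single network over all classes, as with RNN5, replaces this routing with one prediction. The staged form keeps each classifier small and makes it clear which stage produced an error.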

Figure 9.

Trained RNNs for fault detection of CWRUDS.

The RNNs trained for the LMSDS, shown in Figure 10, were built on the same principle. The first step is to determine the defect type (RNN6). The next step is to determine the severity of the bearing and gear defects (RNN7, RNN8). Finally, the combinations of the two defects are classified (RNN9) to obtain a complete diagnosis.

Figure 10.

Trained RNNs for fault detection of LMSDS.

This study presents the RNNs that were used to classify the defect types in both sets (RNN1 and RNN6). For the severity classification, we chose ball defects from CWRUDS (RNN2) and gear defects from LMSDS (RNN8) in accordance with the previous vibratory analysis. Figure 11 shows the training progress of RNN1. The total number of iterations was set to 60, and the training accuracy reached 100% before the end of the process, despite some disruption at the start of the training process, as shown in Figure 11. The model initially had high variance, but as progress was made and the performance of the LSTM networks in sequential training improved, the RNN converged, and both training accuracy and validation (Figure 11(a)) reached 100% and stabilized before the end of learning. The loss function in Figure 11(b) confirms the model’s convergence, as we can see a good concordance with the training process and lower loss values.

Figure 11.

Training progress of RNN1: (a) training accuracy and (b) loss.

Figure 12 depicts the trained RNN1’s confusion matrix for detecting CWRUDS fault types. The strong diagonal concentration demonstrates the RNN1’s performance in identifying various states of the machine’s health. Even with multiple states, the trained network correctly predicted each target class. Despite the limitations of traditional vibratory analysis using VMD, the proposed approach accurately diagnosed the health condition, providing an intelligent fault detection approach for bearing faults.
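For readers less familiar with confusion matrices, the accuracy figures quoted in this section follow from the matrix by dividing its diagonal sum by the total count. The sketch below uses hypothetical matrices whose class sizes (3 and 9 test samples) are illustrative and are not taken from the figures.

```python
def accuracy(confusion):
    """Overall accuracy: correctly classified samples (diagonal) over all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

def per_class_recall(confusion):
    """Fraction of each true class (row) that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]

# Hypothetical 4-class matrices (rows: true class, columns: predicted class).
perfect = [[3, 0, 0, 0], [0, 9, 0, 0], [0, 0, 9, 0], [0, 0, 0, 9]]
one_miss = [[3, 0, 0, 0], [0, 8, 1, 0], [0, 0, 9, 0], [0, 0, 0, 9]]

print(accuracy(perfect), accuracy(one_miss))
print(per_class_recall(one_miss))
```

With test sets this small, a single misclassified sample changes the overall accuracy by several percentage points, which is worth keeping in mind when comparing classifiers.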

Figure 12.

Confusion matrix resulting from the trained RNN1 for defect type classification of CWRUDS.

Figure 13 displays the performance of the remaining trained RNNs for the CWRUDS. Figure 13(a) to (c) show the classification results for the severity detection of various bearing defects found in the CWRUDS, which supplement the results from RNN1. Figure 13(a) depicts the confusion matrix for RNN2 detection of the ball defect. The classification accuracy was 100%, demonstrating the proposed approach’s ability to detect ball defects accurately, even in early stages, such as small defects. Figure 13(b) and (c) show the results of RNNs 3 and 4, and the classification is 100% accurate. This result confirms those obtained from classical vibratory analysis, in which VMD was able to detect defects in both the inner and outer rings.

Figure 13.

Confusion matrices obtained from the classification of CWRUDS: (a) RNN2, (b) RNN3, (c) RNN4, and (d) RNN5.

Figure 13(d), on the other hand, shows the results for the classification using RNN5, in which the proposed approach performed well in detecting bearing defects in the CWRUDS. The classification was performed on all 10 cases in the dataset, and the accuracy was 100%, demonstrating the proposed approach’s ability to monitor machine conditions and diagnose defects. Each target class was correctly predicted, from the healthy bearing to the various defects (ball, inner ring, outer ring) at each of the three severity levels (small, average, critical).

Figure 14 shows the training progress of RNN6 for defect-type classification on the LMSDS. The training accuracy reached 100% before the end of the iterations, which is consistent with the validation accuracy shown in Figure 14(a). Both results indicate that the trained RNN6 has learned the characteristics that distinguish the defect types. The loss shown in Figure 14(b) agrees with both curves, dropping to nearly zero.

Figure 14.

Training progress of RNN6: (a) training accuracy and (b) loss.

Figure 15 shows the classification results for RNN6. The confusion matrix is fully concentrated on the diagonal: all of the target classes were correctly predicted, and the proposed approach was able to identify and classify the different machine cases, even in the presence of both gear and bearing defects.

Figure 15.

Confusion matrix obtained from the trained RNN6 for defect type classification of LMSDS.

Figure 16 illustrates the performance evaluation of the LMSDS using confusion matrices generated from three different fault severity categorization scenarios. First, Figure 16(a) depicts the confusion matrix for bearing severity classification generated by RNN7, which demonstrates an impressive 100% accuracy. This demonstrates the effectiveness of combining VMD and LSTM in accurately classifying the severity levels of bearing defects.

Figure 16.

Confusion matrices obtained from the classification of LMSDS: (a) results of RNN7, (b) results of RNN8, and (c) results of RNN9.

Similarly, Figure 16(b) depicts the confusion matrix from RNN8 for gear defect severity classification, which achieves an accuracy of 100% and detects all gear faults. Even when combined defects are present, the proposed approach is capable of correctly classifying the severity levels of gear defects. However, when classifying combined gear and bearing defects, as shown in Figure 16(c), the accuracy of RNN9 drops to 83.3%. This decrease is due to the inherent complexity of combined gear and bearing defects. Overall, the use of VMD-LSTM as a combined approach is robust and reliable for fault type detection and severity classification in rotating machinery. Its ability to accurately identify fault types and severity levels, even in the presence of multiple defects, demonstrates its potential as a tool for proactive maintenance and condition monitoring in industrial applications.
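To make the link between a confusion matrix and the reported accuracies concrete, the following minimal sketch computes overall accuracy as the diagonal sum divided by the total count, together with per-class recall. The matrices and class counts are hypothetical (the test-set sizes are not given in this section), and the row-as-true-class convention is an assumption; the figures in the paper may use the transposed layout, which leaves the overall accuracy unchanged.

```python
def accuracy(cm):
    """Overall accuracy: correctly classified samples (diagonal) over all samples."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_recall(cm):
    """Fraction of each true class (one row per class) predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]


# Hypothetical counts, NOT taken from the paper.
# Rows = true severity class, columns = predicted severity class.
perfect = [[4, 0, 0],
           [0, 4, 0],
           [0, 0, 4]]
mixed = [[4, 0, 0],
         [0, 3, 1],
         [0, 1, 3]]

print(f"{100 * accuracy(perfect):.1f} %")  # fully diagonal matrix
print(f"{100 * accuracy(mixed):.1f} %")    # two off-diagonal samples out of twelve
print(per_class_recall(mixed))
```

With these illustrative counts, two misclassified samples out of twelve are enough to produce an accuracy of about 83.3%, which shows how sensitive the figure is to individual errors when the test set is small.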

This approach’s key novelty is its ability to achieve high fault detection accuracy using a small dataset for training, reducing the need for extensive data that conventional deep learning models require. Using the VMD’s ability to extract meaningful features from signals, the LSTM can efficiently classify fault types and severity levels, even with a limited training dataset. This makes the proposed method especially useful in industrial applications where obtaining large labeled datasets is impractical. The method’s ability to handle small datasets is demonstrated by its high classification accuracy, which was validated by two separate datasets: bearing defects from CWRUDS and combined gear and bearing defects from LMSDS.
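As a rough illustration of how decomposed modes can be turned into a compact input sequence for a recurrent classifier, the sketch below computes a few common time-domain indicators per mode. It rests on several assumptions: the indicator set (RMS, peak, crest factor, kurtosis) is chosen for illustration and may differ from the features used in this study, the function names are hypothetical, and two synthetic sinusoids stand in for the modes that a VMD implementation would return.

```python
import math


def time_domain_features(x):
    """Scalar indicators of one mode; assumes x is non-empty and not constant."""
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)
    m2 = sum((v - mean) ** 2 for v in x) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth central moment
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "kurtosis": m4 / (m2 * m2),  # non-excess kurtosis (3.0 for Gaussian data)
    }


def feature_sequence(modes):
    """One feature vector per mode, in mode order: a K x 4 sequence."""
    names = ("rms", "peak", "crest_factor", "kurtosis")
    sequence = []
    for mode in modes:
        feats = time_domain_features(mode)
        sequence.append([feats[name] for name in names])
    return sequence


# Stand-in for VMD output (assumption): two pure sinusoids over whole periods.
n = 1000
modes = [
    [amp * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]
    for amp, cycles in ((1.0, 5), (0.5, 20))
]
seq = feature_sequence(modes)
for row in seq:
    print([round(v, 4) for v in row])
```

Each row of `seq` would then be one time step of the sequence passed to the classifier. For a pure sinusoid the crest factor is close to √2 and the kurtosis close to 1.5, so impulsive fault content shows up as a departure from these values.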

Performance evaluation of the proposed approach

Table 5 summarizes the results of vibratory analysis and the VMD-LSTM method for fault detection across two different datasets: CWRUDS and LMSDS. In the case of vibratory analysis with VMD, it is clear that while the method was effective in detecting inner and outer ring defects in the CWRUDS dataset, it had limitations in diagnosing other fault types.

Table 5.

Results of vibratory analysis and VMD-LSTM approach.

Dataset	Vibratory analysis		VMD-LSTM approach
	Defect	Detection	RNNs	Test accuracy (%)
CWRUDS			RNN1	100
	Ball defects	No detection	RNN2	100
	Inner ring defects	Fault detected	RNN3	100
	Outer ring defects	Fault detected	RNN4	100
			RNN5	100
LMSDS			RNN6	100
	Bearing defects	No detection	RNN7	100
	Gear defects	No detection	RNN8	100
	Bearing + Gear defects	No detection	RNN9	83.3

Specifically, VMD-based vibratory analysis failed to diagnose ball defects in the CWRUDS, which are inherently difficult to detect, and it did not provide a diagnosis for any fault case in the LMSDS, whose signals were corrupted with noise. In contrast, the proposed VMD-LSTM approach proved to be a strong alternative, outperforming vibratory analysis on both datasets. It detected all fault types in the CWRUDS with 100% accuracy, including the ball faults that classical VMD analysis failed to detect. Furthermore, the VMD-LSTM approach maintained its strong performance on the LMSDS, achieving 100% accuracy for all fault types except combined gear and bearing defects, where it achieved 83.3%; this lower figure is attributed to gear defects typically being more prevalent than bearing defects.

Using a small set of training samples, the proposed VMD-LSTM approach is compared to several other well-established methods, including Long Short-Term Memory (LSTM),⁴¹ Multi-Layer Perceptron (MLP),⁴² One-Dimensional Convolutional Neural Network (1D-CNN),⁴³ and Two-Dimensional Convolutional Neural Network (2D-CNN).⁴⁴ The parameters for each method were obtained from previous works to ensure a fair and consistent comparison.

The comparison demonstrates the effectiveness of the VMD-LSTM approach in fault detection and severity classification, especially in data-constrained scenarios. Table 6 summarizes the performance of each classifier on the CWRUDS dataset across different defect types, demonstrating that the VMD-LSTM model outperforms other classifiers by achieving 100% accuracy across all defect types and the entire dataset.

Table 6.

Results for the classification models of the CWRUDS.

Classifier	CWRUDS test accuracy
	Defect type (%)	Ball defects (%)	Inner ring defects (%)	Outer ring defects (%)	Entire dataset (%)
MLP	41.66	33.33	58.33	66.66	46.66
1D-CNN	50.00	40.66	58.33	58.33	53.33
2D-CNN	66.66	41.66	80.00	83.33	75.00
LSTM	58.33	50.00	66.66	66.66	73.33
VMD-LSTM	100	100	100	100	100

MLP and 1D-CNN have relatively low overall accuracy (46.66% and 53.33%), with particular difficulty in detecting ball defects (33.33% and 40.66%). 2D-CNN and LSTM perform better, with overall accuracies of 75.00% and 73.33%, respectively. However, neither matches the perfect classification achieved by VMD-LSTM, demonstrating its efficacy in fault detection in rotating machinery.

Table 7, on the other hand, compares the classification performance of the same models on the LMSDS dataset for bearing defects, gear defects, and combined gear and bearing defects. The VMD-LSTM method achieves 100% accuracy for bearing and gear defects and 83.3% accuracy for combined defects, demonstrating its robustness. In contrast, MLP has the lowest overall performance, particularly in detecting bearing defects (55.56%) and combined defects (60.41%). 1D-CNN and LSTM perform moderately, with accuracies ranging from 66.70% to 83.33% for bearing and gear defects, and 68.75% and 72.91% for combined defects. 2D-CNN achieves high accuracy, particularly for gear defects (94.44%) and bearing defects (88.88%), but falls short of VMD-LSTM performance. Overall, VMD-LSTM is the most effective model for fault detection in this dataset.

Table 7.

Results for the classification models of the LMSDS.

Classifier	LMSDS test accuracy
	Defect type (%)	Bearing defects (%)	Gear defects (%)	Combined gear and bearing defects (%)
MLP	58.33	55.56	66.66	60.41
1D-CNN	75.00	66.70	77.77	68.75
2D-CNN	91.66	88.88	94.44	79.16
LSTM	83.34	77.77	83.33	72.91
VMD-LSTM	100	100	100	83.3

The VMD-LSTM approach combines the strengths of VMD’s signal decomposition with LSTM’s sequential learning capabilities to provide a powerful and dependable framework for identifying fault patterns and assessing fault severity. This method not only overcomes the limitations of traditional vibratory analysis techniques but also excels at dealing with complex fault scenarios. Furthermore, it demonstrates consistent and superior diagnostic performance across a wide range of defect types, even when only minimal training data is used, making it especially useful in real-world applications where data is often limited.
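The margins implied by Tables 6 and 7 can be recomputed directly from the reported accuracies. The short sketch below transcribes the table values and, for each column, finds the strongest baseline and its gap to VMD-LSTM in percentage points. The helper name `margins` and the shortened column labels are illustrative choices, not part of the original study.

```python
# Test accuracies (%) transcribed from Table 6 (CWRUDS) and Table 7 (LMSDS).
COLS_6 = ["Defect type", "Ball defect", "Inner ring", "Outer ring", "Entire dataset"]
TABLE_6 = {
    "MLP":      [41.66, 33.33, 58.33, 66.66, 46.66],
    "1D-CNN":   [50.00, 40.66, 58.33, 58.33, 53.33],
    "2D-CNN":   [66.66, 41.66, 80.00, 83.33, 75.00],
    "LSTM":     [58.33, 50.00, 66.66, 66.66, 73.33],
    "VMD-LSTM": [100.0, 100.0, 100.0, 100.0, 100.0],
}

COLS_7 = ["Defect type", "Bearing defects", "Gear defects", "Combined defects"]
TABLE_7 = {
    "MLP":      [58.33, 55.56, 66.66, 60.41],
    "1D-CNN":   [75.00, 66.70, 77.77, 68.75],
    "2D-CNN":   [91.66, 88.88, 94.44, 79.16],
    "LSTM":     [83.34, 77.77, 83.33, 72.91],
    "VMD-LSTM": [100.0, 100.0, 100.0, 83.3],
}


def margins(table, columns, proposed="VMD-LSTM"):
    """Per column: (best baseline name, gap to the proposed model in points)."""
    result = {}
    for j, col in enumerate(columns):
        baselines = [(name, vals[j]) for name, vals in table.items() if name != proposed]
        best_name, best_val = max(baselines, key=lambda pair: pair[1])
        result[col] = (best_name, round(table[proposed][j] - best_val, 2))
    return result


for label, table, cols in (("CWRUDS", TABLE_6, COLS_6), ("LMSDS", TABLE_7, COLS_7)):
    print(label)
    for col, (name, gap) in margins(table, cols).items():
        print(f"  {col}: best baseline {name}, margin {gap} points")
```

Read this way, the gap is widest for CWRUDS ball defects and narrowest for the combined gear and bearing case in the LMSDS, where 2D-CNN comes within about four points of VMD-LSTM.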

Conclusion

This study described and evaluated a novel method for detecting defects in the gears and bearings of rotating machinery that combines Variational Mode Decomposition (VMD) with Long Short-Term Memory (LSTM) networks. The proposed VMD-LSTM framework outperformed classical vibratory methods and conventional classifiers in detecting a wide range of gear and bearing defects. The proposed approach was tested on two distinct datasets: the first (CWRUDS) from Case Western Reserve University, which contained bearing defects, and the second (LMSDS) from the Laboratory of Mechanics and Structures at the University of Guelma in Algeria, which contained gear and bearing defects.

VMD on its own detected inner and outer ring defects in the CWRUDS, but it had limitations in identifying other defect types, such as ball defects in the same set and all defects in the noisy LMSDS signals. These limitations of conventional vibratory analysis are overcome by combining VMD’s adaptive signal decomposition capabilities with LSTM’s sequential learning abilities. The method accurately detects complex fault signatures and classifies defect types and severity. The VMD-LSTM combination produced promising results, particularly for complex defects that are difficult to detect using either method alone.

In comparison to traditional classifiers such as MLP, 1D-CNN, 2D-CNN, and standalone LSTM, the VMD-LSTM model consistently performed better, particularly when dealing with complex fault signatures. The intelligent combination of advanced signal processing and deep learning techniques enables the proactive monitoring of rotating machinery.

This study presented a fundamental approach to fault detection in rotating machinery, particularly in data-limited scenarios. However, several promising research directions can help to increase its effectiveness. One important area is to optimize the VMD-LSTM fusion technique by investigating additional scalar indicators in the time, frequency, and time-frequency domains, which could improve feature extraction and classification accuracy. Another potential advancement is the incorporation of traditional classifiers, such as Support Vector Machines (SVM) and k-Nearest Neighbors (kNN), to assess their compatibility with deep learning algorithms. Finally, implementing the proposed approach in real-time industrial settings would demonstrate its practical applicability and enable proactive fault detection in machinery systems.

Footnotes

Handling Editor: Divyam Semwal

ORCID iDs

Ammar Mrabti

Ramdane Younes

Nouredine Ouelaa

Tarek Kebabsa

Zakarya Ouelaa

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The current research was conducted by the “Structural Dynamics & Industrial Maintenance” research group at the Mechanics & Structures Laboratory (LMS) of the University 8 Mai 1945, Guelma, Algeria, under the funding of the General Directorate of Scientific Research and Technological Development (DGRSDT) through the PRFU research project: A11N01UN240120220004.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. Kebabsa, Babouri, Djebala, et al. Advanced diagnostic techniques for turbo compressors: a spectral analysis approach for preventive maintenance. Adv Mech Eng 2024; 16: 16878132241252329.

2. Staszewski, Tomlinson GR. Application of the wavelet transform to fault detection in a spur gear. Mech Syst Signal Process 1994; 8: 289–307.

3. Gai. The processing of rotor startup signals based on empirical mode decomposition. Mech Syst Signal Process 2006; 20: 222–235.

4. Gear fault detection based on ensemble empirical mode decomposition and Hilbert-Huang transform. In: 2008 fifth international conference on fuzzy systems and knowledge discovery, Jinan, China, 18–20 October 2008, pp.173–177. New York: IEEE.

5. Lei, Liu, Ouazri, et al. A fault diagnosis method of rolling element bearings based on CEEMDAN. Proc IMechE, Part C: J Mechanical Engineering Science 2015; 231: 1804–1815.

6. Qin, Fei, et al. Fault diagnosis of rolling bearing based on ICEEMDAN and SSA-RVM. J Phys Conf Ser 2023; 2419: 012077.

7. Geropp. Envelope analysis - a signal analysis technique for early detection and isolation of machine faults. IFAC Proc Vol 1997; 30: 977–981.

8. Babouri, Ouelaa, Kebabsa, et al. Application of the cyclostationarity analysis in the detection of mechanical defects: comparative study. Int J Adv Manuf Technol 2019; 103: 1681–1699.

9. Sheng, Zhang. Applications in bearing fault diagnosis of an improved Kurtogram algorithm based on flexible frequency slice wavelet transform filter bank. Measurement 2021; 174: 108975.

10. Dragomiretskiy, Zosso. Variational mode decomposition. IEEE Trans Signal Process 2014; 62: 531–544.

11. Choudhury, Hong, Dhupia. A comparative analysis between EMD- and VMD-based tacho-less order tracking techniques for fault detection in gears. In: Oberst, Halkon, et al. (eds) Vibration engineering for a sustainable future. Springer International Publishing, 2021, pp.203–209.

12. Mohanty, Gupta, Raju KS. Bearing fault analysis using variational mode decomposition. In: 2014 9th international conference on industrial and information systems (ICIIS), Gwalior, India, 15–17 December 2014, pp.1–6. New York: IEEE.

13. Sharma, Parey. Extraction of weak fault transients using variational mode decomposition for fault diagnosis of gearbox under varying speed. Eng Fail Anal 2020; 107: 104204.

14. Sharma. Gear fault detection based on instantaneous frequency estimation using variational mode decomposition and permutation entropy under real speed scenarios. Wind Energy 2021; 24: 246–259.

15. Chen, Liu, et al. Gearbox fault diagnosis based on VMD and acoustic emission technology. In: 2019 IEEE international instrumentation and measurement technology conference (I2MTC), Auckland, New Zealand, 20–23 May 2019, pp.1–6. New York: IEEE.

16. Zhang, Shao, Ding, et al. An improved VMD approach for sensitive feature extraction in the application of gears fault classification. In: Ball, Gelman, Rao (eds) Advances in asset management and condition monitoring. Springer International Publishing, 2020, pp.223–232.

17. Dike, Zhou, Deveerasetty, et al. Unsupervised learning based on artificial neural network: a review. In: 2018 IEEE international conference on cyborg and bionic systems (CBS), Shenzhen, China, 25–27 October 2018, pp.322–327. New York: IEEE.

18. Mnassri, El Adel, Ananou, et al. Fault detection and diagnosis based on PCA and a new contribution plot. IFAC Proc Vol 2009; 42: 834–839.

19. Babbar, Syrmos. Data driven approach for fault detection and identification using competitive learning techniques. In: 2007 European control conference (ECC), Kos, Greece, 2–5 July 2007, pp.2280–2287. New York: IEEE.

20. Saucedo-Dorantes, Delgado-Prieto, Romero-Troncoso, et al. Multiple-fault detection and identification scheme based on hierarchical self-organizing maps applied to an electric machine. Appl Soft Comput 2019; 81: 105497.

21. Rajakarunakaran, Venkumar, Devaraj, et al. Artificial neural network approach for fault detection in rotary system. Appl Soft Comput 2008; 8: 740–748.

22. Widodo, Yang B-S. Support vector machine in machine condition monitoring and fault diagnosis. Mech Syst Signal Process 2007; 21: 2560–2574.

23. Janssens, Slavkovikj, Vervisch, et al. Convolutional neural network based fault detection for rotating machinery. J Sound Vib 2016; 377: 331–345.

24. Qian, et al. Enhanced K-nearest neighbor for intelligent fault diagnosis of rotating machinery. Appl Sci 2021; 11: 919.

25. Yang, Huang, et al. Rotating machinery fault diagnosis using long-short-term memory recurrent neural network. IFAC Pap OnLine 2018; 51: 228–232.

26. Xie, Zhang. A long short term memory recurrent neural network approach for rotating machinery fault prognosis. In: 2018 IEEE CSAA guidance, navigation and control conference (CGNCC), Xiamen, China, 10–12 August 2018, pp.1–6. New York: IEEE.

27. Anwarsha, Narendiranath Babu. Intelligent fault detection of rotating machinery using long-short-term memory (LSTM) network. In: Al-Sharafi, Al-Emran, Al-Kabi, et al. (eds) Proceedings of the 2nd international conference on emerging technologies and intelligent systems. Springer International Publishing, 2023, pp.76–83.

28. Cao, Zhang, Wang, et al. Intelligent fault diagnosis of wind turbine gearbox based on long short-term memory networks. In: 2019 IEEE 28th international symposium on industrial electronics (ISIE), Vancouver, BC, Canada, 12–14 June 2019, pp.890–895. New York: IEEE.

29. Masri, Al-Jabi. LSTM neural network techniques-based analytical predictive models for wind energy and mechanical power. Adv Mech Eng 2022; 14: 16878132221143633.

30. Damou, Ratni, Benazzouz. Intelligent multi-fault identification and classification of defective bearings in gearbox. Adv Mech Eng 2024; 16: 16878132241246673.

31. Moumene, Ouelaa. Gears and bearings combined faults detection using optimized wavelet packet transform and pattern recognition neural networks. Int J Adv Manuf Technol 2022; 120: 4335–4354.

32. Peng, et al. A novel fault diagnosis method of rotating machinery via VMD, CWT and improved CNN. Measurement 2022; 200: 111635.

33. Chen, et al. Independence-oriented VMD to identify fault feature for wheel set bearing fault diagnosis of high speed locomotive. Mech Syst Signal Process 2017; 85: 512–529.

34. Almutairi, Sinha JK. Experimental vibration data in fault diagnosis: a machine learning approach to robust classification of rotor and bearing defects in rotating machines. Machines 2023; 11: 943.

35. Tong, Cao, Han, et al. A fault diagnosis approach for rolling element bearings based on dual-tree complex wavelet packet transform-improved intrinsic time-scale decomposition, singular value decomposition, and online sequential extreme learning machine. Adv Mech Eng 2017; 9: 1687814017737721.

36. Isham, Leong, Lim, et al. Variational mode decomposition: mode determination method for rotating machinery diagnosis. J Vibroeng 2018; 20: 2604–2621.

37. Case Western Reserve University (CWRU) Bearing Data Center. Download a data file, https://engineering.case.edu/bearingdatacenter/download-data-file (accessed April 2023).

38. Feng, Zhang, Zuo MJ. Planetary gearbox fault diagnosis via joint amplitude and frequency demodulation analysis based on variational mode decomposition. Appl Sci 2017; 7: 775.

39. Nouioua, Younes, Mrabti, et al. Self-organizing maps and VMD for accurate diagnosis of bearing defects. J Vib Eng Technol 2024; 12(3): 5241–5255.

40. Smith, Randall RB. Rolling element bearing diagnostics using the Case Western Reserve University data: a benchmark study. Mech Syst Signal Process 2015; 64–65: 100–131.

41. Hochreiter, Schmidhuber. Long short-term memory. Neural Comput 1997; 9: 1735–1780.

42. Rumelhart, Hinton, Williams RJ. Learning representations by back-propagating errors. Nature 1986; 323: 533–536.

43. Kiranyaz, Ince, Gabbouj. Real-time patient-specific ECG classification by 1-D convolutional neural networks. IEEE Trans Biomed Eng 2016; 63: 664–675.

44. Lecun, Bottou, Bengio, et al. Gradient-based learning applied to document recognition. Proc IEEE 1998; 86: 2278–2324.

Robust fault detection and severity classification in rotating machinery using VMD-LSTM for limited data scenarios
