New criteria for wrapper feature selection to enhance bearing fault classification

Abstract

Classification is a critical task in many fields, including signal processing and data analysis. The accuracy and stability of classification results can be improved by selecting the most relevant features from the data. In this paper, a new criterion for wrapper feature selection is proposed. It evaluates the classification results according to the accuracy and stability (standard deviation) of each class and the number of selected features. The proposed method is evaluated using the Random Forest (RF) and Ant Colony Optimization (ACO) algorithms on a benchmark dataset. The results show that the proposed method outperforms classical feature selection methods in terms of accuracy and stability of the classification results, especially for the difficult-to-classify combined damage class. This study demonstrates that the proposed wrapper feature selection criterion improves the performance of classification algorithms, with higher stability (STD: C1 = 0.5, C2 = 0.8, C3 = 0.6, C4 = 1.8) and better accuracy (average C1 = 98.5%, C2 = 96.6%, C3 = 98.5%, C4 = 93%) when both the stator current and the vibration signal are used, compared to other techniques. Machine learning methods have proven their efficiency in fault diagnosis of time-varying machines when features extracted from vibration signals and stator currents are used as inputs. Moreover, using both signals together demonstrated higher robustness and clearly better results.

Keywords

Vibration signature, stator current, bearing fault classification, new criteria for wrapper feature selection, Random Forest, Ant Colony Optimization, adaptive time-varying morphological filtering

Introduction

Rotating machinery, such as pumps, fans, compressors, generators, and motors, plays a crucial role in various industrial and commercial applications.¹ Despite their importance, the continuous operation of rotating machinery can result in various types of faults, including bearing faults.² Bearings are critical components in rotating machinery and their failure can cause significant damage to the machinery and surrounding equipment.³ Bearing faults can occur due to a variety of causes, including improper lubrication, misalignment, excessive loads, and manufacturing defects.¹ These faults can result in increased vibration, noise, and heat, which can be detected by monitoring the vibration signals or current signals of the machinery. The early detection of bearing faults is essential for ensuring the safe and efficient operation of the machinery and preventing catastrophic failures.⁴

To address this issue, many condition monitoring techniques have been proposed.⁵ Most diagnostic techniques are based on signal processing, which is divided into three steps: processing the raw data, extracting the features, and creating the model,^6,7 but these classical techniques have several limitations.^8,9 Since signal processing techniques are based on the frequency signature, they cannot be used for fault identification and location (two faults may have the same frequency signature, as with gears).¹⁰ Recently, many researchers have focused on developing models based on new feature extraction techniques that use morphological filters. Examples include enhanced morphological difference filtering (EMDF) to diagnose bearing failures,¹¹ a novel multi-scale morphological filtering algorithm based on the entropy threshold (IET-MMF) to extract effective fault feature information and perform early fault detection of bearings,¹² and the unbiased-autocorrelation morphological filter (UAMF), which removes random impulse interference by combining a morphological filter with an autoregressive filter.¹¹ These techniques have been successfully applied to fault feature extraction. However, they still have certain restrictions: the conventional selection of the structural element (SE) scale parameters of the morphological filter, based on a sliding window, leads to a complicated and poorly suited filtering process for bearings under high background noise.

To obtain a robust and adaptive SE scale for bearing fault feature extraction, an adaptive time-varying morphological filtering (ATVMF) algorithm was proposed for adaptive extraction of impulse features.¹³ The ATVMF uses a new time-varying SE strategy to adaptively determine the shape and scale of the SE according to the inherent characteristics of the vibration signal and stator currents, and adopts a new morphology hat product operator (MHPO)¹⁴ to extract fault-related impulse features from vibration signals and stator currents. As a result, the ATVMF shows significantly enhanced impulse feature extraction capability and higher computational efficiency compared to traditional methods.

The number and nature of the features extracted by the ATVMF algorithm can affect classification performance and accuracy. For this reason, researchers have proposed using feature selection algorithms before the classification stage.^15–17 Feature selection is a dimensionality reduction technique, and many optimization algorithms have been used for it, such as particle swarm optimization,¹⁸ the Genetic Algorithm,¹⁹ the Marine Predators Algorithm (MPA),²⁰ Henry Gas Solubility Optimization (HGSO),²¹ the Emperor Penguin Optimizer (EPO),²² the Slime Mould Algorithm (SMA),²³ and the Tree Seed Algorithm (TSA).²⁴ However, these algorithms have some drawbacks and suffer from high computational complexity. By contrast, a wrapper method based on Ant Colony Optimization (ACO)²⁵ requires a learning algorithm and overcomes the previous limitations, which leads to a higher accuracy but also a much higher computation time. It aims to choose a small subset of relevant features from the original ones by removing irrelevant, redundant, or noisy features. The selected features are then fed into the Random Forest learning algorithm for classification.²⁶

To overcome the shortcomings of conventional signal processing-based feature extraction techniques and to enhance the reliability of fault identification, a new wrapper feature selection criterion based on adaptive time-varying morphological filtering (ATVMF) is proposed. It is combined with the ant colony optimization (ACO) algorithm and Random Forest (RF) to perform operating condition monitoring and fault classification of bearings. The contributions of this research can be summarized in the following four points:

We propose a new feature selection criterion based on stability and accuracy of each class. To achieve this, we conduct 10 tests on the classifier to evaluate its performance.

The average accuracy and stability are then captured by two costs, cost 1 and cost 2, which measure the mean accuracy and the stability of each class. Cost 3 is the ratio of selected features to the total number of features.

The final cost is calculated from the three previous costs. The feature selection process is carried out using the ant colony optimization (ACO) algorithm, a swarm intelligence optimization method that mimics the behavior of ants searching for food. Finally, the classification results are evaluated using the random forest (RF) classifier, an ensemble learning method that creates multiple decision trees and combines their results to obtain a final prediction, and are compared with those of similar classifiers.

The method was tested on data collected under varying loads, forces, and speeds and showed high performance.

Theoretical background

Adaptive time-varying morphological filtering

Feature extraction is concerned with highlighting the information that is important for the classification task. In the proposed system, we used an extraction method named adaptive time-varying morphological filtering (ATVMF) to extract time-domain and frequency-domain features.

The ATVMF algorithm is a method for extracting impulse features from vibration signals in rolling bearings. It uses an adaptive time-varying structure element (SE) strategy and a new morphology hat product operator (MHPO) to identify fault-related impulse features. The process of the ATVMF algorithm is as follows:

Collect the vibration acceleration signal.

Create a series of adaptive SEs based on the vibration signal.

Divide the vibration signal into small segments and use the MHPO and time-varying SE to extract fault-related impulse features using the ATVMF method.¹⁴

This approach provides improved impulse feature extraction and computational efficiency compared to traditional methods.

Here’s a mathematical representation of the ATVMF process:

Initialization: The ATVMF algorithm starts by initializing the morphological filter, usually by defining a structuring element (SE) and an operation (such as dilation or erosion).

Time-varying adaptation: At each time step, the morphological filter is adapted based on the current signal conditions. The adaptation process is typically implemented as a time-varying filter, given by:

$h (t) = g (t) \cdot h (t - 1)$

(1)

Where:

$h (t)$ : is the time-varying filter at time t.

$g (t)$ : is the time-varying gain.

$h (t - 1)$ : is the filter from the previous time step.

The time-varying gain can be computed based on various criteria, such as signal properties or statistical models.

Filtering: The signal is filtered using the adapted morphological filter:

$y (t) = \min_{d} \{ x (t + d) - h (d) \}$

(2)

Where:

$y (t)$ = output signal at time t.

$x (t)$ = input vibration signal at time t.

$h (d)$ = structuring element at time lag d.

This equation represents the erosion operation: for each lag d covered by the structuring element, the structuring-element value is subtracted from the shifted input sample, and the minimum over that neighborhood is kept. The dilation operation is the dual: it takes the maximum of $x (t - d) + h (d)$ over the same neighborhood.¹³
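The filtering step can be illustrated with a minimal sketch of the two basic morphological operators on a 1-D signal. It assumes the standard definitions (erosion as the minimum of $x (t + d) - h (d)$, dilation as the maximum of $x (t - d) + h (d)$), a fixed structuring element, and hypothetical sample values. It does not implement the adaptive SE strategy or the MHPO of the ATVMF.

```python
def erode(x, se):
    """Grayscale erosion: y[t] = min over d of x[t + d] - se[d].

    Lags that fall outside the signal are skipped.
    """
    n, m = len(x), len(se)
    out = []
    for t in range(n):
        vals = [x[t + d] - se[d] for d in range(m) if t + d < n]
        out.append(min(vals))
    return out


def dilate(x, se):
    """Grayscale dilation: y[t] = max over d of x[t - d] + se[d].

    Lags that fall outside the signal are skipped.
    """
    n, m = len(x), len(se)
    out = []
    for t in range(n):
        vals = [x[t - d] + se[d] for d in range(m) if t - d >= 0]
        out.append(max(vals))
    return out


signal = [0.0, 1.0, 5.0, 1.0, 0.0]  # hypothetical samples with one impulse
flat_se = [0.0, 0.0, 0.0]           # flat structuring element of length 3
eroded = erode(signal, flat_se)
dilated = dilate(signal, flat_se)
```

With a flat structuring element, erosion suppresses the isolated impulse while dilation spreads it over the following samples, which is why combinations of the two are used to isolate impulsive content.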

ATVMF is a powerful signal processing technique that can effectively handle changing signal conditions and produce high-quality results. The key to the success of ATVMF is the ability to adapt the morphological filter over time to changing signal conditions, which makes it well suited for many signal processing tasks (Figure 1).¹³

Figure 1.

Flowchart of the ATVMF method for bearing fault diagnosis.

Ant Colony Optimization for feature selection

Feature selection means selecting and retaining only the most important features in the model. In the wrapper method, the feature selection algorithm exists as a wrapper around the predictive model and uses that same model to select the best features. Although it is computationally expensive and prone to overfitting, it gives better performance.^27,28
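The wrapper idea can be sketched as a search loop that scores each candidate subset with the predictive model itself. The sketch below uses an exhaustive search over small subsets instead of ACO, and `toy_score` is a hypothetical stand-in for training and evaluating a classifier on the subset; both are assumptions made only for illustration.

```python
from itertools import combinations


def wrapper_select(n_features, score_subset, max_size):
    """Return the feature subset (tuple of indices) with the highest score."""
    best_subset, best_score = None, float("-inf")
    for size in range(1, max_size + 1):
        for subset in combinations(range(n_features), size):
            score = score_subset(subset)  # the model is evaluated here
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


def toy_score(subset):
    """Hypothetical scorer: features 0 and 2 are useful, every feature has a small cost."""
    return len(set(subset) & {0, 2}) - 0.1 * len(subset)


best, score = wrapper_select(n_features=4, score_subset=toy_score, max_size=2)
```

An exhaustive search grows combinatorially with the number of features, which is the reason metaheuristics such as ACO are used to explore the subset space in practice.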

The Ant Colony Optimization algorithm is inspired by the behavior of ants in finding food. Ants use pheromones to communicate the path to food, with the quantity of pheromones laid depending on the distance, quantity, and quality of the food. When an ant detects a pheromone trail, it is likely to follow it and lay more pheromones, strengthening the trail and making it more attractive to other ants. This creates a positive feedback loop where the most attractive path is the one with the highest concentration of pheromones. This behavior is applied in optimization problems, where the algorithm finds the most efficient path.

Ant Colony Optimization (ACO) is a meta-heuristic algorithm proposed to solve combinatorial optimization problems. It is based on the behavior of ants in finding food sources, using pheromones to communicate the quality of a path.²⁹ The ACO was initially used to solve the Traveling Salesman Problem and has since been applied to other optimization problems like data mining, telecommunications networks, and vehicle routing. It has been shown to be effective in finding good solutions (Figure 2).³⁰

Figure 2.

Ant Colony Optimization algorithm flowchart.

The ACO algorithm consists of the following steps:

Initialization: The algorithm starts by initializing the pheromone trail on all edges in the solution space to a small positive value.

Solution construction: At each iteration, each ant constructs a solution by visiting a series of cities (or nodes) based on a probability rule that balances exploitation and exploration. The probability of choosing a particular edge is given by:

$p (i, j) = \frac{τ (i, j)^{α} \cdot η (i, j)^{β}}{\sum_{k} τ (i, k)^{α} \cdot η (i, k)^{β}}$

(3)

Where:

$τ (i, j)$ is the pheromone trail on edge $(i, j)$ , $η (i, j)$ is a heuristic information about the edge $(i, j)$ , and α and β are user-defined parameters that control the relative importance of pheromone trail and heuristic information. The denominator is a normalization term that ensures that the probabilities add up to 1.

Pheromone update: After all ants have finished constructing solutions, the pheromone trail is updated using the following rule:

$τ (i, j) = (1 - ρ) \cdot τ (i, j) + Δ τ (i, j)$

(4)

Where:

$ρ$ is the evaporation rate (a value between 0 and 1 that controls how quickly the pheromone trail evaporates) and $Δ τ (i, j)$ is the amount of pheromone to be deposited on edge $(i, j)$ , which depends on the quality of the solution constructed by the ants.
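Equations (3) and (4) can be sketched in a few lines. The parameter values (`alpha`, `beta`, `rho`) and the pheromone, heuristic, and deposit values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def transition_probabilities(tau_row, eta_row, alpha=1.0, beta=2.0):
    """Equation (3): probability of moving from node i to each candidate node.

    tau_row[k] is the pheromone on edge (i, k), eta_row[k] its heuristic value.
    """
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau_row, eta_row)]
    total = sum(weights)
    return [w / total for w in weights]


def update_pheromone(tau, deposits, rho=0.5):
    """Equation (4): evaporate every trail, then add the deposited pheromone."""
    return [
        [(1.0 - rho) * tau[i][j] + deposits[i][j] for j in range(len(tau[i]))]
        for i in range(len(tau))
    ]


probs = transition_probabilities([1.0, 1.0, 2.0], [1.0, 2.0, 1.0])
new_tau = update_pheromone([[1.0, 2.0]], [[0.5, 0.0]], rho=0.5)
```

In this example the second edge has the highest probability (4/7) because `beta = 2` gives the heuristic value more weight than the pheromone level.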

Equations (5) to (7) below describe the three variants of the Ant System (AS) optimization algorithm: ant-quantity, ant-density, and ant-cycle.

The ant-quantity algorithm updates the pheromone trail after each step, and the pheromone quantity per unit of length is given by equation (5). $Δ τ_{ij}^{k}$ is the quantity of pheromone laid on edge (i, j) by the kth ant between time t and t + 1. Q is a constant, $d_{ij}$ is the Euclidean distance between i and j, and $L^{k}$ is the tour length of the kth ant.

The ant-density algorithm updates the pheromone trail after each step, and the pheromone quantity per unit of length is given by equation (6). Q is a constant.

The ant-cycle algorithm updates the pheromone trail at the end of the tour. The pheromone quantity per unit of length is given by equation (7). $Δ τ_{ij}^{k}$ is the quantity of pheromone laid on edge (i, j) by the kth ant after n tours. Q is a constant, $d_{ij}$ is the Euclidean distance between i and j, and $L^{k}$ is the tour length of the kth ant.

Note: Equations (5) and (7) both use the same formula for the pheromone update rule, but they apply it at different times.

ANT-Quantity:

$Δ τ_{ij}^{k} (t, t + 1) = \begin{cases} \frac{Q_{s}}{L^{k}} & \text{if the } k\text{th ant travels edge } (i, j) \\ 0 & \text{otherwise} \end{cases}$

(5)

ANT-Density:

$Δ τ_{ij}^{k} (t, t + 1) = \begin{cases} Q_{2} & \text{if the } k\text{th ant travels edge } (i, j) \\ 0 & \text{otherwise} \end{cases}$

(6)

ANT-cycle:

$Δ τ_{ij}^{k} (t, t + 1) = \begin{cases} \frac{Q_{s}}{L^{k}} & \text{if the } k\text{th ant travels edge } (i, j) \\ 0 & \text{otherwise} \end{cases}$

(7)

The ACO algorithm is a simple and effective approach for solving optimization problems that can be applied to a wide range of real-world problems.³¹

In our work, we focus on classification using Random Forest to obtain the best accuracy.

Random Forest for automatic classification

In the past few years, Deep Neural Networks (DNNs) have shown remarkable results in various application domains, including bearing fault diagnosis. However, they have several disadvantages, including the need for a large labeled dataset, the complexity of hyperparameter tuning, the difficulty of training, and sensitivity to outliers or missing data.

In this paper, we adopted the random forest algorithm, which is a supervised machine-learning method suitable for binary and multiclass problems.³² It has been applied to a range of problems in various domains and has several desirable properties: it is robust to noise and outliers, fast, provides useful error information, is simple to deploy (few hyperparameters), is easy to parallelize, and can handle missing data. The main characteristics of this algorithm are:

Ensemble method combining multiple decision trees

Each tree in the forest votes for the best prediction

Avoids over-fitting by de-correlating trees through random feature selection

Handles missing data and non-linear relationships between features and target variable.

Can handle both binary and multi-class problems

Has been applied to various domains, such as gene selection, remote sensing, and protein prediction.

Random Forest uses the Gini impurity to determine the quality of a split in each decision tree. The Gini impurity measures the probability of misclassifying a randomly chosen element from the dataset. It is defined as:

$\text{Gini Index} = 1 - \sum_{i = 1}^{n} {P_{i}}^{2}$

(8)

Where $P_{i}$ denotes the probability that an element is classified into class $i$, and $n$ is the number of classes.
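As a small worked example of equation (8), the sketch below estimates the class probabilities from label counts. The label lists are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


balanced = gini_impurity(["C1", "C1", "C2", "C2"])  # two classes, equal shares
pure = gini_impurity(["C1", "C1", "C1"])            # a single class
```

A pure node has an impurity of 0, and a node split evenly between two classes has an impurity of 0.5, so a good split is one that lowers the weighted impurity of the child nodes.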

The random forest algorithm is an ensemble learning technique that combines multiple decision trees to form a stronger model. The trees are trained on different subsets of the data and features, reducing over fitting and improving accuracy.³³ During prediction, the class with the majority vote from all trees is assigned to the input sample as shown in Figure 3.

Figure 3.

General schematic diagram of the random forest classifier.
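The voting step described above can be sketched as follows. The per-tree predictions are hypothetical, and the sketch only shows how the votes are combined, not how the trees are trained.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


votes = ["inner", "outer", "inner", "combined", "inner"]  # one label per tree
predicted = majority_vote(votes)
```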

Experimental study

Data set

The benchmark dataset for bearing fault diagnosis described in Lessmeier et al.³⁴ consists of synchronously measured currents and vibration signals from six healthy bearings and 26 faulty bearings. Vibration and the stator currents were filtered and sampled at 64 kHz. The dataset includes both artificially induced and real damages (12 have artificially induced damages and 14 have real damages).

This paper focuses on diagnosing real bearing damage using the test data provided by this dataset. The data used here include four classes: healthy, outer race damage, inner race damage, and combined damage, each recorded under several operating conditions (see Tables 1 and 2). A detailed description of the datasets can be found in Lessmeier et al.³⁴ (Figure 4).

Table 1.

Categorization of datasets.

Healthy	Inner ring damage	Outer ring damage	Combined damage
K001	KI04	KA04	KB23
K002	KI14	KA15	KB24
K003	KI16	KA16	KB27
K004	KI17	KA22
K005	KI18	KA30
K006	KI21

Table 2.

Operating parameters.

No.	Rotational speed (rpm)	Load torque (Nm)	Radial force (N)	Name of setting
0	1500	0.7	1000	N15_M07_F10
1	900	0.7	1000	N09_M07_F10
2	1500	0.1	1000	N15_M01_F10
3	1500	0.7	400	N15_M07_F04

Figure 4.

Experimental test rig.

The data sets (e.g. K001) consist of 80 measurements, each lasting 4 s, for the operating conditions listed in Table 2. The operating conditions (such as speed, torque, and radial force) vary among the diverse datasets.

Discussion results and comparative study

The suggested method for classifying bearing faults consists of three main steps, summarized in the flowchart in Figure 5.

Figure 5.

Flowchart of proposed method.

Feature extraction based on ATVMF

Our method is applied to the current and vibration database. The damage in this dataset is real damage caused by accelerated lifetime tests. The dataset consists of healthy, inner ring damage, outer ring damage, and combined damage classes. As shown in Figures 6 and 7, the state of the bearings cannot be distinguished from the raw signals because the fault-related impulses are hidden by noise. First, the signals of the four states are processed and decomposed by ATVMF. Second, the 20 statistical features illustrated in Table 3 are extracted in the time and frequency domains to construct the fault-related impulse features for the stator current and vibration signals.

Figure 6.

Original stator currents.

Figure 7.

Original vibration signal.

Table 3.

Time domain and frequency domain features extracted.

Time domain features		Frequency domain features
Maximum	$\max |x_{i}|$	Mean	$F_{1} = \frac{\sum_{k = 1}^{K} s (k)}{K}$
Minimum	$\min |x_{i}|$	Variance of mean frequency	$F_{2} = \frac{\sum_{k = 1}^{K} {(s (k) - F_{1})}^{2}}{K - 1}$
Mean square value	$\frac{1}{N} \sum_{i = 1}^{N} {x_{i}}^{2}$	Skewness power spectrum	$F_{3} = \frac{\sum_{k = 1}^{K} {(s (k) - F_{1})}^{3}}{K {(\sqrt{F_{2}})}^{3}}$
RMS	$\sqrt{\frac{1}{N} \sum_{i = 1}^{N} {x_{i}}^{2}}$	Kurtosis power spectrum	$F_{4} = \frac{\sum_{k = 1}^{K} {(s (k) - F_{1})}^{4}}{K {F_{2}}^{2}}$
Kurtosis	$\frac{N \sum_{i = 1}^{N} {(x_{i} - \bar{x})}^{4}}{{(\sum_{i = 1}^{N} {(x_{i} - \bar{x})}^{2})}^{2}}$	Frequency center	$F_{5} = \frac{\sum_{k = 1}^{K} f_{k} s (k)}{\sum_{k = 1}^{K} s (k)}$
Skewness	$\frac{N \sum_{i = 1}^{N} {(x_{i} - \bar{x})}^{3}}{(N - 1) σ^{3}}$	Root variance	$F_{6} = \sqrt{\frac{\sum_{k = 1}^{K} {(f_{k} - F_{5})}^{2} s (k)}{K}}$
Mean	$\frac{\sum_{i = 1}^{N} x_{i}}{N}$	Root-mean square	$F_{7} = \sqrt{\frac{\sum_{k = 1}^{K} {f_{k}}^{2} s (k)}{\sum_{k = 1}^{K} s (k)}}$
Variance	$\frac{\sum_{i = 1}^{N} {(x_{i} - \bar{x})}^{2}}{N}$	Mean frequency that crosses the mean of the time-domain signal	$F_{8} = \sqrt{\frac{\sum_{k = 1}^{K} {f_{k}}^{4} s (k)}{\sum_{k = 1}^{K} {f_{k}}^{2} s (k)}}$
Standard deviation	$\sqrt{\frac{\sum_{i = 1}^{N} {(x_{i} - \bar{x})}^{2}}{N}}$	Stabilization factor	$F_{9} = \frac{\sum_{k = 1}^{K} {f_{k}}^{2} s (k)}{\sqrt{\sum_{k = 1}^{K} s (k) \sum_{k = 1}^{K} {f_{k}}^{4} s (k)}}$
Rssq	$\sqrt{\sum_{i = 1}^{N} |x_{i}|^{2}}$	Coefficient of variability	$F_{10} = \frac{F_{6}}{F_{5}}$
Peak2peak	$\max (x) - \min (x)$	Skewness	$F_{11} = \frac{\sum_{k = 1}^{K} {(f_{k} - F_{5})}^{3} s (k)}{K {F_{6}}^{3}}$
Entropy	$E (S) = - \sum {S_{i}}^{2} \log ({S_{i}}^{2})$	Kurtosis	$F_{12} = \frac{\sum_{k = 1}^{K} {(f_{k} - F_{5})}^{4} s (k)}{K {F_{6}}^{4}}$
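A few of the time-domain features in Table 3 can be computed as in the sketch below. It follows the formulas as written in the table (population variance, kurtosis without bias correction), and the input samples are hypothetical.

```python
import math


def time_domain_features(x):
    """Compute a subset of the time-domain features of Table 3 for a 1-D signal."""
    n = len(x)
    mean = sum(x) / n
    sum_sq_dev = sum((v - mean) ** 2 for v in x)
    sum_sq = sum(v * v for v in x)
    return {
        "mean": mean,
        "variance": sum_sq_dev / n,
        "std": math.sqrt(sum_sq_dev / n),
        "mean_square": sum_sq / n,
        "rms": math.sqrt(sum_sq / n),
        "rssq": math.sqrt(sum_sq),
        "peak2peak": max(x) - min(x),
        # Undefined (division by zero) for a constant signal.
        "kurtosis": n * sum((v - mean) ** 4 for v in x) / sum_sq_dev ** 2,
    }


feats = time_domain_features([1.0, -1.0, 1.0, -1.0])
```

For this alternating signal the RMS is 1, the peak-to-peak value is 2, and the kurtosis is 1, the lowest value the measure can take, which reflects the absence of impulsive peaks.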

Feature selection and classification using classical wrapper method

The statistical features extracted from the ATVMF output in the time and frequency domains, for both stator currents and vibration signals, lead to a high-dimensional and complex feature set. It is inherently difficult to select the most effective features for bearing condition monitoring, which generally leads to inadequate detection and classification. To address this issue, an optimization step is introduced to select the relevant features and discard the redundant ones. To assess the robustness and effectiveness of the proposed approach, a comparison was carried out between RF in tandem with ACO and multiple classifiers (KNN, DT, RF, NB, and SVM) in tandem with other optimization algorithms (EPO, HGSO, MPA, and TSA). The results of these techniques were evaluated based on global accuracy and standard deviation (STD) over 10 tests.

The classification was carried out in three tests: using only the current signal (Table A1), using only the vibration signal (Table A2), and using both signals combined (Table A3). It can be observed from Table 4 that the best results were obtained using both the current and vibration signals, with the RF classifier and ACO optimization, achieving a mean accuracy of 98.38% and an STD of 0.31.

Table 4.

Fault classification performance globally for vibration signal, current and both signals.

	Vibration signal	Current signal	Both
	RF-EPO	RF-ACO	RF-ACO
Max	95.42	96.88	98.75
Min	92.5	93.75	97.71
Mean	93.94	95.63	98.38
STD	0.94	0.95	0.31

To analyze the fault recognition ability of the method for the four fault classes (C1, C2, C3, and C4), predictions were made on the test set for each class. The confusion matrix of the prediction results is shown in Figure 8.

Figure 8.

Recognition results for different bearing faults using RF in tandem with ACO.

It can be seen from Figure 8 that the global accuracy reaches 98.33%, which means that the fault classes are well separated from one another overall. By contrast, the accuracy for combined damage (fourth class) is only 90%, and its STD is 2.7.

We conclude that, even when the best global accuracy and stability are obtained, the accuracy and stability of each class can differ, as illustrated in Table 5.

Table 5.

Fault classification performance of RF in tandem ACO for each class.

ATVMF-RF-ACO
	C1	C2	C3	C4
Max	99.5	98.2	98.6	94
Min	100	99.1	99.3	97.3
Mean	97.8	96.8	97.8	90
STD	0.8	0.8	0.6	2.7

To improve the accuracy and stability of the system, we propose a new criterion for selecting optimal features with the wrapper method. The criterion not only improves the classification accuracy (cost₁), but also enhances the stability of each class by decreasing its STD (cost₂). It also reduces the number of inputs by selecting only the relevant features (cost₃), which decreases the processing time and speeds up the classification process.

The flowchart of the proposed criterion is shown in Figure 9. The selected features are used by a classifier, which is run N times to test the stability of the classification results. The accuracy of each class is recorded for each of the N runs, and the average accuracy and the STD of each class are then combined using the Cost₁ and Cost₂ equations.

$\mathrm{Cost}_{1} = \sum_{i = 1}^{C} α_{i} μ_{i}$

(9)

$\mathrm{Cost}_{2} = \sum_{i = 1}^{C} β_{i} σ_{i}$

(10)

$\mathrm{Cost}_{3} = \frac{\text{number of selected features}}{\text{total number of features}}$

(11)

Figure 9.

Flowchart of the proposed criteria for feature selection.

Where:

N = Number of tests.

C = Number of classes.

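$μ_{i} = \frac{1}{N} \sum_{j = 1}^{N} (1 - Acc_{ij})$ is the mean accuracy term for class $i$, written as one minus the accuracy so that a smaller value means a higher accuracy.

$σ_{i} = \sqrt{\frac{1}{N} \sum_{j = 1}^{N} {(Acc_{ij} - \overline{Acc}_{i})}^{2}}$ is the STD for class $i$, where $\overline{Acc}_{i} = \frac{1}{N} \sum_{j = 1}^{N} Acc_{ij} = 1 - μ_{i}$.

$Acc_{ij}$ = The accuracy value for class $i$ in test $j$ (e.g. $Acc_{1, 2}$ = accuracy value for class 1 in test 2).

$α_{i}$ = The weight of class $i$ in Cost 1.

$β_{i}$ = The weight of class $i$ in Cost 2.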

The final cost is the weighted sum of $\mathrm{Cost}_{1}$, $\mathrm{Cost}_{2}$, and the feature ratio ($\mathrm{Cost}_{3}$), as shown in equation (12). The cost is then evaluated: a minimal cost indicates accurate and stable classification results with few features. If the cost is not minimal, the selected features are adjusted.

$\mathrm{Cost} = W_{1} \times \mathrm{Cost}_{1} + W_{2} \times \mathrm{Cost}_{2} + W_{3} \times \mathrm{Cost}_{3}$

(12)

Where $(W_{1}, W_{2}, W_{3})$ are the weights of the three costs in the final cost.
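Equations (9) to (12) can be put together in a short sketch. It assumes accuracies expressed as fractions between 0 and 1 and computes the STD of each class around its mean accuracy. The weights, the accuracies, and the feature counts are hypothetical, since the text does not state the values used.

```python
import math


def class_terms(acc_runs):
    """Return (mu, sigma) for one class from its accuracies over N runs."""
    n = len(acc_runs)
    mean_acc = sum(acc_runs) / n
    mu = 1.0 - mean_acc  # mean of (1 - Acc_ij)
    sigma = math.sqrt(sum((a - mean_acc) ** 2 for a in acc_runs) / n)
    return mu, sigma


def final_cost(acc, n_selected, n_total, alpha, beta, w):
    """Equation (12); acc[i][j] is the accuracy (0..1) of class i in run j."""
    terms = [class_terms(runs) for runs in acc]
    cost1 = sum(a * mu for a, (mu, _) in zip(alpha, terms))        # equation (9)
    cost2 = sum(b * sigma for b, (_, sigma) in zip(beta, terms))   # equation (10)
    cost3 = n_selected / n_total                                   # equation (11)
    return w[0] * cost1 + w[1] * cost2 + w[2] * cost3


# Hypothetical example: 2 classes, 2 runs, 5 of 20 features selected.
acc = [[1.0, 1.0], [0.8, 1.0]]
cost = final_cost(acc, n_selected=5, n_total=20,
                  alpha=[0.5, 0.5], beta=[0.5, 0.5], w=(1.0, 1.0, 1.0))
```

Here the second class contributes both a mean error of 0.1 and an STD of 0.1, so the final cost is 0.05 + 0.05 + 0.25 = 0.35. A subset that made this class both more accurate and more stable would lower the cost even if the global accuracy barely changed.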

Tables 6 to 10 show the results obtained for several classifiers (KNN, DT, RF, NB, and SVM) combined with several optimization algorithms (ACO, EPO, HGSO, MPA, and TSA). Table 6 shows the best results. These tables show that the proposed criterion for feature selection improved the accuracy and stability of classification, especially for the fourth class (combined damage), for which we obtain an accuracy of 93% and an STD of 1.8.

Table 6.

Classification results using ACO with several classifiers.

		RF				KNN				SVM
		C1	C2	C3	C4	C1	C2	C3	C4	C1	C2	C3	C4
ACO	Mean	99.5	98	99.4	95.5	92.3	87.6	92	81.2	96.6	90.1	93.2	91
	Max	100	99.2	100	98.6	95.4	93.2	94.6	85.5	98.6	94.4	96.2	97.3
	Min	98.5	96.6	98.5	93	87.2	81.1	88.7	77.8	93.7	83.6	89.6	83.1
	STD	0.5	0.8	0.6	1.8	2.7	4.2	1.8	2.5	1.5	3.3	2.1	4.8
		DT				NB
		C1	C2	C3	C4	C1	C2	C3	C4
	Mean	97.1	93.5	94.6	85	88.9	53.7	55.9	48.7
	Max	100	96.3	97.3	90.2	92.8	62	64.6	61.6
	Min	93.2	88.1	90.6	77.9	84.3	47.9	43.4	37.5
	STD	2.2	2.5	2.4	4.7	2.9	3.9	7.4	7

Table 7.

Classification results using EPO with several classifiers.

		RF				KNN				SVM
		C1	C2	C3	C4	C1	C2	C3	C4	C1	C2	C3	C4
EPO	Mean	92.3	83.2	91.1	87.9	92.4	86.9	92.3	86.8	92.5	82.1	91.5	81.1
	Max	97.1	90	95.3	91.8	95.1	89.6	94.9	91	95.3	86.5	97.3	90.5
	Min	87.4	76.1	87.5	82.5	87.5	83.3	90.1	82.8	88.5	76.2	87.5	68.8
	STD	2.6	4.3	2.4	2.7	2.1	1.6	1.5	2.8	2.3	3.8	3	6.9
		DT				NB
		C1	C2	C3	C4	C1	C2	C3	C4
	Mean	93	78	92.1	75.9	89.5	35.3	44.9	36.3
	Max	95.2	83.1	96	82.8	92.8	39.7	50.4	44.3
	Min	90.2	72.7	87.2	69.7	85.1	32.2	37.7	27.4
	STD	1.6	3.8	2.9	4.9	2.5	2.4	3.6	5.2

Table 8.

Classification results using HGSO with several classifiers.

		RF				KNN				SVM
		C1	C2	C3	C4	C1	C2	C3	C4	C1	C2	C3	C4
HGSO	Mean	94.4	93.4	95.7	91.7	93.5	93.2	92.2	90.7	94.5	91.6	92.9	87.9
	Max	97.2	97.5	98.6	100	97.6	96.8	95	93.5	97.9	96.4	96	94.7
	Min	91.3	90.5	93.9	85.9	88.5	90.9	89.7	86.8	91.4	82.6	89.2	79.5
	STD	1.9	2	1.6	4	2.4	2	1.7	2.5	1.9	4.1	2.4	5.9
		DT				NB
		C1	C2	C3	C4	C1	C2	C3	C4
	Mean	95.9	87.1	90.6	85.4	78.2	61.1	52.7	41.3
	Max	98.6	91.9	96.1	91	88.1	75.4	64.2	51.4
	Min	90.8	81.1	82.7	77.3	70.2	46.6	43	33.3
	STD	2.6	3.8	4.1	4.4	5.5	9.5	7.3	5.4

Table 9.

Classification results using MPA with several classifiers.

		RF				KNN				SVM
		C1	C2	C3	C4	C1	C2	C3	C4	C1	C2	C3	C4
MPA	Mean	97.2	81.9	94	81.8	92.6	78.9	87.9	74	95	90.7	90.9	83.1
	Max	99.2	88.2	94.9	86.3	95	85.7	91.1	86.3	97.3	94.9	97.1	89.2
	Min	95.9	77.5	91.2	73.1	89.8	73.6	86.3	66.7	91.5	87.2	85.4	75.8
	STD	1.1	3.2	1.1	4.6	1.4	3.9	1.6	6.4	2.1	2.4	4.5	4.3
		DT				NB
		C1	C2	C3	C4	C1	C2	C3	C4
	Mean	92.1	82.3	91.8	83.8	94.9	34.2	49.1	45.8
	Max	97.1	87.3	94.3	94.6	99.3	41	64.1	53.7
	Min	86.7	77.7	85	72.7	89.8	27.9	32.9	38.8
	STD	2.9	3	2.8	6.8	2.6	4.7	9.8	5

Table 10.

Classification results using TSA with several classifiers.

		RF				KNN				SVM
		C1	C2	C3	C4	C1	C2	C3	C4	C1	C2	C3	C4
TSA	Mean	98.7	95.7	98.4	90.2	90.5	86.4	88.9	81.9	95.9	90.5	90.5	87.6
	Max	100	98.4	100	94.3	93.1	91.8	93.4	90	98.4	93.6	94.6	92.9
	Min	96.3	93.1	96.8	84.3	85.7	82.9	83.4	75.9	89.2	86.2	83.3	82.9
	STD	1.4	1.7	1.1	3.2	2.2	2.7	2.9	4.3	2.9	2.5	4	3.6
		DT				NB
		C1	C2	C3	C4	C1	C2	C3	C4
	Mean	96.2	86.9	88.8	81.9	85.8	62.7	42	41.5
	Max	98.7	90.5	95.9	89.5	93.2	69	50.4	47.8
	Min	94.3	83	85.4	70.3	79.9	52.3	26.6	34.4
	STD	1.8	2.7	3.4	6.3	4.3	5.4	6.6	4.6

We conclude that the proposed criterion not only improves the overall accuracy but also enhances the classification of each fault class, making it a more effective method for bearing fault diagnosis. The results obtained with the proposed method (Table 11) show that it can accurately identify and classify different types of bearing faults under different operating conditions (speed, torque, and force) and can provide valuable information for bearing condition monitoring.

Table 11.

Fault classification performance of RF in tandem Cost-ACO for each class.

ATVMF-RF-Cost-ACO
	C1	C2	C3	C4
Max	99.5	98	99.4	95.5
Min	100	99.2	100	98.6
Mean	98.5	96.6	98.5	93
STD	0.5	0.8	0.6	1.8

It can be observed from Figure 10 that the large majority of samples for the four bearing fault classes are correctly predicted, with per-class accuracies between 93% and 98.5%. This means that each fault class is well separated from the other classes.

Figure 10.

Recognition results for different bearings faults of RF in tandem Cost-ACO.

Additionally, the proposed method improves the accuracy and STD of each class, especially the fourth class (combined damage), which was the most often misclassified by the classical method, as well as the other classes, as shown in Table 12.

Table 12.

Comparison between results of classic and proposed method.

		ACO CLASS				Cost-ACO
		C1	C2	C3	C4	C1	C2	C3	C4
RF	Max	99.5	98.2	98.6	94	99.5	98	99.4	95.5
	Min	100	99.1	99.3	97.3	100	99.2	100	98.6
	Mean	97.8	96.8	97.8	90	98.5	96.6	98.5	93
	STD	0.8	0.8	0.6	2.7	0.5	0.8	0.6	1.8

This improvement highlights the effectiveness of the proposed method for classifying bearing faults under different operating conditions (speed, torque, and force).

Conclusion

In this paper, which is based on the evaluation of the classification results according to the accuracy and stability (stander deviation) of each class and the number of selected features for bearings components. It is based on the constructed of a new criterion for feature selection using wrapper for classification. In detail, the proposed technique based on Adaptive TVMF in tandem with the optimization algorithm ACO and the Random forest algorithm for bearings faults classification for currents and vibrations signals.

The performance and robustness of the proposed approach were established through different comparative case studies. First, a set of relevant features was extracted in the time and frequency domains from the stator current and vibration signals using Adaptive TVMF. Second, the extracted features were used with the new feature selection criterion, combined with the optimization algorithm (Cost-ACO), to separate the different states of the bearing while taking into account the impact of the operating conditions of the synchronous machine bearings. Finally, the features selected by the wrapper method were fed into an RF model to classify the different health states of the bearings.
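As an illustration of how a criterion of this kind can be evaluated for one candidate feature subset, the following minimal sketch combines the three ingredients named above: per-class accuracy, per-class standard deviation across repeated runs, and the number of selected features. The weighted-sum form, the weights `w_acc`, `w_std`, and `w_feat`, the use of the population standard deviation, and all sample values are assumptions made for the example; they are not the cost function or data used in this study.

```python
from statistics import mean, pstdev
from typing import Dict, List


def wrapper_cost(
    class_acc_runs: Dict[str, List[float]],
    n_selected: int,
    n_total: int,
    w_acc: float = 0.6,   # assumed weight, not from the paper
    w_std: float = 0.3,   # assumed weight, not from the paper
    w_feat: float = 0.1,  # assumed weight, not from the paper
) -> float:
    """Hypothetical cost to minimise (lower is better).

    class_acc_runs maps each class label to its accuracies (in [0, 1])
    over repeated classifier runs on one candidate feature subset.
    """
    # Error term: average over classes of (1 - mean accuracy).
    err = mean(1.0 - mean(runs) for runs in class_acc_runs.values())
    # Stability term: average per-class population standard deviation.
    spread = mean(pstdev(runs) for runs in class_acc_runs.values())
    # Parsimony term: fraction of the available features that were kept.
    feat_ratio = n_selected / n_total
    return w_acc * err + w_std * spread + w_feat * feat_ratio


# Two hypothetical candidate subsets evaluated over three runs each.
subset_a = {
    "C1": [0.98, 0.97, 0.99],
    "C2": [0.96, 0.97, 0.95],
    "C3": [0.98, 0.97, 0.99],
    "C4": [0.86, 0.93, 0.91],
}
subset_b = dict(subset_a, C4=[0.93, 0.94, 0.92])

cost_a = wrapper_cost(subset_a, n_selected=20, n_total=40)
cost_b = wrapper_cost(subset_b, n_selected=12, n_total=40)
print(f"cost A = {cost_a:.4f}, cost B = {cost_b:.4f}")
```

In a wrapper search, an optimizer such as ACO would propose candidate subsets and keep the one with the lowest cost. Here subset B scores lower than subset A because it classifies C4 more accurately and more consistently while using fewer features.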

The proposed feature selection criterion improved the accuracy and stability of classification, especially for the fourth class (combined damage), for which an accuracy of 93% and an STD of 1.8 were obtained. Although the combined defect is difficult to classify, the method not only classified it but also improved its results relative to the conventional method. This demonstrates the efficiency of the new wrapper feature selection criterion in improving the performance of classification algorithms, with higher stability (STD: C1 = 0.5, C2 = 0.8, C3 = 0.6, C4 = 1.8) and better accuracy (mean: C1 = 98.5%, C2 = 96.6%, C3 = 98.5%, C4 = 93%) for both the stator current and the vibration signal compared to other techniques. Overall, combined methods have proven their efficiency in fault diagnosis of time-varying machines when taking current and vibration signals as inputs, and the proposed feature selection criterion demonstrated higher robustness and clear superiority.

Footnotes

Appendix

Table A3.

Classification results using vibration signals and currents with different classifiers and optimization algorithms.

		ACO	EPO	HGSO	MPA	TSA
KNN	Max	91.25	94.17	87.29	88.33	88.33
	Min	85.00	90.83	84.79	83.75	83.54
	Mean	87.81	91.96	86.08	86.79	86.10
	STD	1.67	1.11	0.68	1.50	1.62
SVM	Max	96.04	89.79	94.58	93.54	91.67
	Min	91.67	85.42	89.79	89.58	87.08
	Mean	94.31	86.85	92.75	90.96	89.58
	STD	1.51	1.41	1.55	1.33	1.33
NB	Max	68.96	62.71	65.00	62.29	65.63
	Min	60.83	52.71	56.46	57.08	57.71
	Mean	64.83	59.79	59.63	60.29	62.63
	STD	3.13	2.85	2.51	1.79	2.24
RF	Max	98.75	96.88	98.54	95.21	94.79
	Min	97.71	95.21	96.04	91.67	92.08
	Mean	98.38	95.92	97.35	93.71	93.48
	STD	0.31	0.54	0.80	1.11	1.04
DT	Max	94.79	88.33	94.58	91.46	93.96
	Min	92.08	83.13	90.63	87.92	84.79
	Mean	93.63	86.17	92.44	89.73	89.21
	STD	0.98	1.47	1.24	1.04	2.42
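The Mean rows of Table A3 can be compared programmatically to identify the best optimizer for each classifier and the best overall pairing. The snippet below copies the mean accuracies from the table; the variable names are illustrative only.

```python
# Mean accuracy (%) per classifier and optimizer, copied from Table A3.
optimizers = ["ACO", "EPO", "HGSO", "MPA", "TSA"]
mean_acc = {
    "KNN": [87.81, 91.96, 86.08, 86.79, 86.10],
    "SVM": [94.31, 86.85, 92.75, 90.96, 89.58],
    "NB": [64.83, 59.79, 59.63, 60.29, 62.63],
    "RF": [98.38, 95.92, 97.35, 93.71, 93.48],
    "DT": [93.63, 86.17, 92.44, 89.73, 89.21],
}

# Best optimizer for each classifier (highest mean accuracy in its row).
best_per_classifier = {
    clf: optimizers[vals.index(max(vals))] for clf, vals in mean_acc.items()
}

# Best classifier/optimizer pairing over the whole table.
best_clf, best_opt, best_val = max(
    ((clf, opt, v) for clf, vals in mean_acc.items()
     for opt, v in zip(optimizers, vals)),
    key=lambda t: t[2],
)

print(best_per_classifier)
print(f"Best pairing: {best_clf} with {best_opt} ({best_val:.2f}%)")
```

ACO gives the highest mean accuracy for every classifier except KNN, where EPO is better, and RF with ACO is the best pairing overall at 98.38%.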

Handling Editor: Chenhui Liang

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Mohammed Amine Sahraoui

Chemseddine Rahmoune

Toufik Bettahar

References

1. Malla, Panigrahi. Review of condition monitoring of rolling element bearing using vibration analysis and other techniques. J Vib Eng Technol 2019; 7: 407–414.

2. Liu, Zhang. A review of failure modes, condition monitoring and fault diagnosis methods for large-scale wind turbine bearings. Measurement 2020; 149: 107002.

3. Althubaiti, Elasha, Teixeira JA. Fault diagnosis and health management of bearings in rotating equipment based on vibration analysis – a review. J Vibroengineering 2022; 24: 46–74.

4. Liu, Jiang, et al. Rolling bearing fault diagnosis using variational autoencoding generative adversarial networks with deep regret analysis. Measurement 2021; 168: 108371.

5. Rao, Sheng, Guo, et al. A review of online condition monitoring and maintenance strategy for cylinder liner-piston rings of diesel engines. Mech Syst Signal Process 2022; 165: 108385.

6. Tiboni, Remino, Bussola, et al. A review on vibration-based condition monitoring of rotating machinery. Appl Sci 2022; 12: 972.

7. Cui, Guan, Chen, et al. A novel advancing signal processing method based on coupled multi-stable stochastic resonance for fault detection. Appl Sci 2021; 11: 5385.

8. Ding, et al. Intelligent fault diagnosis for rotating machinery using deep Q-network based health state classification: a deep reinforcement learning approach. Adv Eng Inform 2019; 42: 100977.

9. Tao, Ming, et al. Intelligent monitoring and diagnostics using a novel integrated model based on deep learning and multi-sensor feature fusion. Measurement 2020; 165: 108086.

10. Walther, Fuerst. Reduced data volumes through hybrid machine learning compared to conventional machine learning demonstrated on bearing fault classification. Appl Sci 2022; 12: 2287.

11. Yan, Liu, et al. Bearing fault feature extraction method based on enhanced differential product weighted morphological filtering. Sensors 2022; 22: 6184.

12. Yao, Guo, Deng, et al. A novel mathematical morphology spectrum entropy based on scale-adaptive techniques. ISA Trans 2022; 126: 691–702.

13. Chen, Song, Zhang, et al. A performance enhanced time-varying morphological filtering method for bearing fault diagnosis. Measurement 2021; 176: 109163.

14. Chen, Cheng, Zhang, et al. Investigation on enhanced mathematical morphological operators for bearing fault feature extraction. ISA Trans 2022; 126: 440–459.

15. Zair, Rahmoune, Benazzouz. Multi-fault diagnosis of rolling bearing using fuzzy entropy of empirical mode decomposition, principal component analysis, and SOM neural network. Proc IMechE, Part C: J Mechanical Engineering Science 2019; 233: 3317–3328.

16. Al-Yaseen, Idrees, Almasoudy FH. Wrapper feature selection method based differential evolution and extreme learning machine for intrusion detection system. Pattern Recognit 2022; 132: 108912.

17. Bettahar, Chemseddine, Benazzouz. Faults’ diagnosis of time-varying rotational speed machinery based on vibration and acoustic signals features extraction, and machine learning methods. J Vib Eng Technol. Epub ahead of print 21 September 2022. DOI: 10.1007/s42417-022-00705-7.

18. El Aboudi, Benhlima. Review on wrapper feature selection approaches. In: 2016 international conference on engineering & MIS (ICEMIS), Agadir, Morocco, 22–24 September 2016, pp.1–5. New York: IEEE.

19. Lee C-Y, Hsieh Y-J, TA. Induction Motor Fault classification based on combined genetic algorithm with symmetrical uncertainty method for feature selection task. Mathematics 2022; 10: 230.

20. Houssein, Hassaballah, Ibrahim, et al. An automatic arrhythmia classification model based on improved marine predators algorithm and convolutions neural networks. Expert Syst Appl 2022; 187: 115936.

21. Islam, Awal, Laboni, et al. HGSORF: Henry Gas Solubility Optimization-based Random Forest for C-Section prediction and XAI-based cause analysis. Comput Biol Med 2022; 147: 105671.

22. Khan, Almalaise Alghamdi, Abushark, et al. Recycling waste classification using emperor penguin optimizer with deep learning model for bioenergy production. Chemosphere 2022; 307: 136044.

23. Chauhan, Vashishtha. A synergy of an evolutionary algorithm with slime mould algorithm through series and parallel construction for improving global optimization and conventional design problem. Eng Appl Artif Intell 2023; 118: 105650.

24. Beşkirli, Temurtaş, Özdemir. Determination with linear form of Turkey’s energy demand forecasting by the tree seed algorithm and the modified tree seed algorithm. Adv Electr Comput Eng 2020; 20: 27–34.

25. Ghosh, Guha, Sarkar, et al. A wrapper-filter feature selection technique based on ant colony optimization. Neural Comput Appl 2020; 32: 7839–7857.

26. Disha, Waheed. Performance analysis of machine learning models for intrusion detection system using Gini Impurity-based Weighted Random Forest (GIWRF) feature selection technique. Cybersecurity 2022; 5: 1.

27. Meddour, Messekher, Younes, et al. Selection of bearing health indicator by GRA for ANFIS-based forecasting of remaining useful life. J Braz Soc Mech Sci Eng 2021; 43: 1–14.

28. Vashishtha, Kumar. Feature selection based on Gaussian ant lion optimizer for fault identification in centrifugal pump. In: Gupta, Amarnath, Tandon, et al. (eds) Recent advances in machines and mechanisms. Singapore: Springer, 2022, pp.295–310.

29. Chauhan, Singh. Modified ant colony optimization based PID controller design for coupled tank system. Eng Res Express 2021; 3: 045005.

30. Sun, Kong, et al. A hybrid gene selection method based on ReliefF and ant colony optimization algorithm for tumor classification. Sci Rep 2019; 9: 8978.

31. Ikhlef, Rahmoune, Toufik, et al. Gearboxes fault detection under operation varying condition based on MODWPT, ant colony optimization algorithm and random forest classifier. Adv Mech Eng 2021; 13: 16878140211043004.

32. Imane, Rahmoune, Zair, et al. Bearing fault detection under time-varying speed based on empirical wavelet transform, cultural clan-based optimization algorithm, and random forest classifier. J Vib Control 2023; 29: 286–297.

33. Marins, Barros, Santos, et al. Fault detection and classification in oil wells and production/service lines using random forest. J Pet Sci Eng 2021; 197: 107879.

34. Lessmeier, Kimotho, Zimmer, et al. Condition monitoring of bearing damage in electromechanical drive systems by using motor current signals of electric motors: a benchmark data set for data-driven classification. In: Proceedings of the European conference of the prognostics and health management society, 2016.