Abstract
This paper proposes a novel fault diagnosis framework for rolling bearings, integrating improved composite multi-scale fuzzy entropy (ICMFE), t-distributed stochastic neighbor embedding (t-SNE), and beetle antennae search-optimized support vector machine (BAS-SVM). To address signal homogenization in conventional methods, ICMFE introduces a refined coarse-graining strategy with weighted averaging to quantify signal irregularity and self-similarity, generating discriminative high-dimensional features. t-SNE then refines these features via nonlinear dimensionality reduction, preserving critical local relationships while eliminating redundancy. Finally, a support vector machine (SVM) with hyperparameters tuned by the beetle antennae search (BAS) algorithm performs intelligent fault classification, outperforming SVMs tuned by particle swarm, simulated annealing, and artificial fish swarm optimization. Fault diagnosis experiments on two sets of real rolling bearings show that ICMFE improves the fault recognition rate by up to 3.12% compared with traditional multi-scale fuzzy entropy. The proposed framework achieves an accuracy of at least 99.58% when identifying different bearing fault states. Furthermore, the proposed fault diagnosis model achieves the highest performance in terms of accuracy, precision, recall, and F1-score, with significantly lower pattern recognition time than the other classifiers, further confirming its superiority in classification accuracy and robustness. Together, these results demonstrate the effectiveness, generality, and application value of the proposed framework.
Keywords
Introduction
As a key component of rotating machinery, rolling bearings are prone to failure. Once a bearing fails, it can trigger a chain of further failures, reducing the economic returns of the enterprise and even causing major accidents.1,2
In recent years, some scholars have applied deep learning and machine learning algorithms to the field of fault diagnosis and achieved some progress. 3 For example, Feng et al. 4 proposed an indicator based on cyclic-correntropy, and verified the effectiveness of this indicator in gear wear monitoring through experiments. Li et al. developed an interpretable cross-modal zero-sample diagnostic framework for industrial gearbox health monitoring, which synergizes infrared thermal imaging and acoustic emission data through a composite neuro-fuzzy architecture. This non-contact multi-sensing methodology enables reliable fault identification under unseen operating conditions while providing transparent decision-making processes. 5 Moreover, Feng et al. proposed a digital twin methodology. This methodology integrates the geographically distributed wear coefficients and a gear-box updating mechanism, enabling real-time monitoring and accurate prediction of the non-uniform wear progression of gears. 6
Since the vibration signals of rolling bearings are non-stationary and nonlinear, entropy-based nonlinear techniques that measure the degree of signal irregularity have advanced, such as approximate entropy (AE), sample entropy (SE), and fuzzy entropy (FE). These nonlinear analysis techniques have found widespread application in the fault diagnosis of rolling bearings.7–9 AE, as a nonlinear dynamic analysis method, is extensively applied to assess the regularity of vibration signals in mechanical systems, providing a quantitative measure for early-stage fault identification through pattern recognition of non-stationary signals. 10 Shang et al. 11 developed a diagnostic framework that applies AE to dissolved gas analysis data and achieves probabilistic fault identification in power transformers through nonlinear dynamic analysis of insulation degradation patterns under partial discharge, overheating, and insulation failure conditions. However, AE is sensitive to data length, and short records may lead to inaccurate estimates, which is a problem when real-time monitoring is required or data collection is limited. As an enhanced version of AE, SE demonstrates superior consistency, particularly when processing short datasets. 12 Zhuang et al. 13 proposed a bearing fault diagnosis method that decomposes equipment vibration signals through variational mode decomposition (VMD), extracts the SE of each component as a feature vector, and inputs these features into a classifier for fault identification. However, SE is limited by parameter sensitivity, weak noise resistance, and the hard binary decision of its similarity criterion, which has motivated the development of FE.
FE replaces the unit step function used in AE and SE with an exponential function, overcoming the disadvantages of AE and SE. Zhou et al. 14 proposed that FE could be used as a health indicator for mechanical fault diagnosis. Zheng et al. 15 presented a multi-scale fuzzy entropy (MFE) method to address the shortcomings associated with single-scale FE analysis. Owing to the unique capability of entropy measures in quantifying system complexity, disorder, and uncertainty, numerous intelligent fault diagnosis methodologies employing entropy-based analysis have been documented in mechanical system monitoring.3,16–19
Experimental analysis indicates that the features extracted by MFE are more comprehensive than those of single-scale fuzzy entropy, because MFE calculates fuzzy entropy values at multiple scales of a time series. However, because of the way the coarse-graining procedure is defined, the entropy error increases with the scale factor. 20 This article proposes an improved composite multi-scale fuzzy entropy (ICMFE). For a given scale factor, multiple coarse-grained sequences are constructed by averaging the original series, 21 and the fuzzy entropy value is then computed from these sequences together. Consequently, the likelihood of invalid fuzzy entropy values caused by directly averaging the fuzzy entropy values of multiple coarse-grained sequences is reduced. ICMFE is then applied to extract fault features of rolling bearings.
The bearing fault features extracted using ICMFE are high-dimensional and nonlinear. Feeding them directly into the classifier may increase its computational burden and reduce classification efficiency. The t-SNE algorithm, however, can uncover low-dimensional manifold structures embedded within high-dimensional data. It is widely used in applications including text classification, image recognition, and facial localization. 22 For example, Wang and Jiang used t-SNE to reduce the dimensionality of the collected fault feature matrix and obtain a sensitive low-dimensional feature matrix. Experiments with two gearboxes show that diagnosis improves after dimensionality reduction. 23 Serna-Serna et al. 24 used real-life datasets to show that t-SNE is superior to other semi-supervised dimensionality reduction methods in data visualization and classification tasks. Hu et al. used t-SNE to preserve the local structure and global classification information of fault records in a low-dimensional feature space. This method ensures the robustness and effectiveness of fault type diagnosis and is applied to power transformer fault diagnosis. 25 Xiao et al. reduced the dimensions of vibration and current feature vectors via t-SNE. Simulation and laboratory bench results indicate that the proposed model is more accurate than other models. 26 Therefore, this paper introduces t-SNE into the dimensionality reduction of high-dimensional bearing fault features to obtain sensitive fault feature sets that can be easily identified.
The essence of rolling bearing fault diagnosis lies in pattern recognition. SVM, as a machine learning algorithm, is especially effective for analyzing small datasets and identifying nonlinear patterns in data. 27 However, its performance is strongly affected by the penalty factor and the kernel function parameters. The beetle antennae search (BAS) algorithm is a recent meta-heuristic algorithm with fast solution speed and high precision. 28 It searches for the global optimal solution by simulating how a longhorn beetle forages according to the strength of the odor received by its antennae. Compared with optimization algorithms such as particle swarm optimization and artificial fish swarm, BAS requires only a single individual, which reduces the computational burden of the optimization process. Accordingly, this study integrates BAS into the parameter optimization of SVM, proposing the BAS-SVM algorithm for rolling bearing fault pattern recognition.
Based on theoretical analysis, this paper proposes ICMFE to extract fault features and effectively capture fault patterns from complex vibration signals of rolling bearings. To further enhance performance, t-SNE is introduced for dimensionality reduction, which yields compact, informative fault features, reduces the burden on the classifier, and improves diagnostic accuracy and efficiency. The resulting low-dimensional features are then fed into a multiclass SVM classifier. Because the critical SVM parameters strongly influence identification accuracy, the BAS meta-heuristic is employed to determine their optimal values. In summary, this study develops a new fault diagnosis framework combining ICMFE, t-SNE, and BAS-SVM. Experimental results on 1210 self-aligning ball bearings and aerospace bearings demonstrate that the proposed framework can accurately identify rolling bearing faults.
The content of this article is summarized as follows. Section 2 introduces the principle and detailed calculation method of ICMFE, followed by simulation experiments. Section 3 presents the optimization principle of BAS-SVM and includes corresponding simulation experiments. Section 4 provides an experimental analysis based on the established fault diagnosis model. Finally, the main conclusions are summarized in Section 5.
Improved composite multi-scale fuzzy entropy for feature extraction
Multi-scale fuzzy entropy
To address the limitations of single-scale fuzzy entropy analysis, the concept of multi-scale fuzzy entropy was proposed. The procedure is as follows. 14
Perform a coarse-grained reconstruction of the time series
where N is the signal length, and s is the scale factor.
Calculate the FE of the coarse-grained sequence at each scale factor.
where FE is the fuzzy entropy, m means the embedding dimension, r is the similarity tolerance, and n is the gradient parameter of the exponential function.
In evaluating the complexity of time series data, MFE addresses the limited accuracy of traditional single-scale fuzzy entropy. However, its coarse-graining process is sensitive to the length of the time series, and the deviation of entropy values increases with the scale factor s.
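The two MFE steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the commonly used coarse-graining by non-overlapping window averages, a membership function of the form exp(-(d/r)^n), removal of the local mean from each template, and a tolerance r fixed from the standard deviation of the original series. The function names are hypothetical.

```python
import math
import random


def coarse_grain(x, s):
    """Average consecutive, non-overlapping windows of length s."""
    return [sum(x[j * s:(j + 1) * s]) / s for j in range(len(x) // s)]


def mean_similarity(x, dim, count, r, n):
    """Mean fuzzy similarity over all ordered pairs of the first `count` templates."""
    templates = []
    for i in range(count):
        window = x[i:i + dim]
        baseline = sum(window) / dim          # remove the local mean of the template
        templates.append([v - baseline for v in window])
    total = 0.0
    for i in range(count):
        for j in range(count):
            if i != j:
                # Chebyshev distance between templates, mapped to (0, 1]
                d = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                total += math.exp(-((d / r) ** n))   # assumed membership function
    return total / (count * (count - 1))


def fuzzy_entropy(x, m=2, r=0.15, n=2):
    """FE = ln(phi_m) - ln(phi_{m+1}); r is an absolute tolerance."""
    count = len(x) - m                        # same template count for m and m + 1
    return (math.log(mean_similarity(x, m, count, r, n))
            - math.log(mean_similarity(x, m + 1, count, r, n)))


def mfe(x, max_scale, m=2, r_factor=0.15, n=2):
    """Fuzzy entropy of the coarse-grained series at scales 1..max_scale."""
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    r = r_factor * sd                         # assumption: r fixed from the original series
    return [fuzzy_entropy(coarse_grain(x, s), m, r, n)
            for s in range(1, max_scale + 1)]


random.seed(0)
signal = [random.gauss(0.0, 1.0) for _ in range(200)]   # synthetic white noise
entropy_by_scale = mfe(signal, max_scale=3)
```

At scale s only about N/s points remain after coarse-graining, which is why the estimate becomes less reliable as s grows.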
Improved composite multi-scale fuzzy entropy
The ICMFE method first generates multiple coarse-grained sequences that share a common scale factor. This addresses the inherent problem of the MFE coarse-graining process, in which the entropy error tends to grow with the scale factor. Second, a refinement is introduced to avoid the invalid values that can result from directly averaging multiple coarse-grained FE values at the same scale. This enhancement involves the averaging of both
The coarse-graining sequence reconstruction
Figure 1 shows the results of coarse-grained sequence construction using MFE and the proposed ICMFE, based on equations (1) and (3), respectively, at a scale of 3. It can be observed that the original MFE method generates only one coarse-grained sequence at this scale. In contrast, ICMFE constructs three distinct subsequences at the same scale. This comparison demonstrates that the proposed method can effectively extract signal feature information across multiple levels.

Coarse-grained sequence reconstruction at scale 3: (a) MFE and (b) ICMFE.
Determine the quantity of vectors in m-dimensional and m + 1-dimensional spaces corresponding to each coarse-grained sequence, denoted as
In the range of
where
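As a rough illustration of the procedure in this subsection, the sketch below builds the s offset coarse-grained sequences for each scale and pools their template similarities before taking the logarithm. The pooling rule (summing the m-dimensional and (m + 1)-dimensional similarities over the s sequences, as in refined composite entropy variants) is an assumption, since the corresponding equations are not reproduced here; the membership function exp(-(d/r)^n) and all function names are likewise hypothetical.

```python
import math
import random


def composite_coarse_grain(x, s):
    """Return the s coarse-grained sequences whose windows start at offsets 0..s-1."""
    sequences = []
    for k in range(s):
        n_windows = (len(x) - k) // s
        sequences.append([sum(x[k + j * s:k + (j + 1) * s]) / s
                          for j in range(n_windows)])
    return sequences


def mean_similarity(x, dim, count, r, n):
    """Mean fuzzy similarity over all ordered pairs of the first `count` templates."""
    templates = []
    for i in range(count):
        window = x[i:i + dim]
        baseline = sum(window) / dim
        templates.append([v - baseline for v in window])
    total = 0.0
    for i in range(count):
        for j in range(count):
            if i != j:
                d = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                total += math.exp(-((d / r) ** n))   # assumed membership function
    return total / (count * (count - 1))


def icmfe(x, max_scale, m=2, r_factor=0.15, n=2):
    """Pool similarities over all offset sequences, then take one logarithm per scale."""
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    r = r_factor * sd
    values = []
    for s in range(1, max_scale + 1):
        phi_m, phi_m1 = 0.0, 0.0
        for y in composite_coarse_grain(x, s):
            count = len(y) - m
            phi_m += mean_similarity(y, m, count, r, n)
            phi_m1 += mean_similarity(y, m + 1, count, r, n)
        # The common factor 1/s of the two averages cancels in the ratio.
        values.append(math.log(phi_m / phi_m1))
    return values


random.seed(0)
signal = [random.gauss(0.0, 1.0) for _ in range(200)]   # synthetic white noise
icmfe_by_scale = icmfe(signal, max_scale=3)
```

Taking a single logarithm of pooled similarities means that one sequence with very few matches does not by itself produce an undefined entropy value, which is the motivation given above for not averaging per-sequence FE values directly.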
Simulation experiment
The ICMFE algorithm has five parameters that need to be set manually: scale factor s, embedding dimension m, similarity tolerance r, exponential function gradient parameter n, and data length N.
The scale factor s is usually set larger than 10. In this study, s is set to 25.
In the process of dynamically reconstructing the initial time series, the higher the embedding dimension m, the more information it contains. However, the higher the m, the longer the required sequence length ($N = 10^m$ to $30^m$). Therefore, m is usually set to 2.
The similarity tolerance r has a minor influence on the algorithm and is usually equal to 0.1–0.25SD, where SD is the standard deviation of the one-dimensional time series.
The gradient parameter n of the exponential function determines how template similarity is computed. A larger value of n attenuates or removes finer details of the information. Consequently, n is usually set to a small integer (2 in this paper) to retain more useful information.
The time series length N has a minor effect on ICMFE. When the embedding dimension is set to 2, N is between 100 and 900, the corresponding time series length is
To examine the effect of the similarity tolerance on ICMFE and MFE, both algorithms are applied with several values of r (0.05SD, 0.1SD, 0.15SD, and 0.2SD). This experimental investigation is conducted within the framework of white noise and the

Analysis results of two types of noise using MFE and ICMFE under different tolerances r: (a) 0.05SD, (b) 0.1SD, (c) 0.15SD, and (d) 0.2SD.
Based on the data depicted in Figure 2, several conclusions can be drawn.
On most scales, the entropy values of
The entropy value curve of ICMFE shows smaller fluctuations than that of MFE for both types of noise. This demonstrates that the ICMFE algorithm addresses the limitation of MFE, where valuable information is often lost during the coarse-graining process, thereby highlighting the advantages of the ICMFE method.
Although an increase in the similarity tolerance will gradually reduce the MFE and ICMFE entropy values at the same scale, the MFE and ICMFE variation curves of white noise and
Beetle antennae search optimization support vector machine for fault identification
Beetle antennae search optimization algorithm
BAS is a novel meta-heuristic algorithm known for its fast solution speed and high precision. It finds the global optimal solution by simulating a beetle’s food-searching behavior based on the intensity of the odor detected by its antennae. 28 Compared with optimization algorithms such as particle swarm optimization (PSO), genetic algorithm (GA), simulated annealing (SA), and artificial fish swarm algorithm (AFSA), BAS requires only a single individual, thereby markedly reducing the computational cost of the optimization procedure.29–31 The detailed process of BAS is as follows.
Initialize the following setting parameters: variable step size parameter Eta, the distance between two whiskers d0, step size, number of iterations n, optimization dimension D, and random initial solution a = rands(D, 1).
Compute the coordinates of the two whiskers. The spatial coordinates denoting the positions of the left and right whiskers are expressed.
where dir = rands(D, 1).
Compute the odor intensity of the two whiskers according to the fitness function.
where
Use the variable step size method to determine the next beetle step.
Evaluate whether the maximum number of iterations has been reached. If so, the calculation is terminated; otherwise, the loop continues.
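The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the exact update rules of this paper: it uses the commonly published BAS form, in which the two whiskers sit at plus and minus d0/2 along a random unit direction, the beetle steps toward the whisker with the lower objective value, and the step is multiplied by Eta after every iteration. The function name `bas_minimize` and the default parameter values are hypothetical.

```python
import math
import random


def bas_minimize(f, x0, d0=2.0, step=1.0, eta=0.95, n_iter=100, seed=0):
    """Minimise f with a single 'beetle'; returns (best position, best value)."""
    rng = random.Random(seed)
    x = list(x0)
    best_x, best_f = list(x), f(x)
    for _ in range(n_iter):
        # Random search direction, normalised to unit length (dir = rands(D, 1)).
        direction = [rng.uniform(-1.0, 1.0) for _ in x]
        norm = math.sqrt(sum(v * v for v in direction)) or 1.0
        direction = [v / norm for v in direction]
        # Positions of the left and right whiskers.
        left = [xi + d0 * di / 2.0 for xi, di in zip(x, direction)]
        right = [xi - d0 * di / 2.0 for xi, di in zip(x, direction)]
        f_left, f_right = f(left), f(right)
        sign = (f_left > f_right) - (f_left < f_right)
        # Move away from the whisker with the larger (worse) objective value.
        x = [xi - step * di * sign for xi, di in zip(x, direction)]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = list(x), fx
        step *= eta                      # variable step size
    return best_x, best_f


def sphere(p):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in p)


best_position, best_value = bas_minimize(sphere, x0=[3.0, -4.0])
```

Each iteration costs three objective evaluations regardless of the problem dimension, which is the source of the low computational cost noted above for a single-individual search.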
Beetle antennae search optimization support vector machine
Since the performance of SVM is highly sensitive to the parameters C and
Input and normalize the training set and test set.
Initialize the SVM and BAS parameters, including the variable step size parameter Eta, the distance between two whiskers d0, the step size, and the number of iterations n.
Use equation (5) to compute the coordinates of the beetle’s left and right whiskers.
The odor intensity at the left and right whiskers of the longhorn beetle is calculated according to equation (6). The fitness function value of the model is the average recognition rate of the training samples, as determined through threefold cross-validation.
Determine the next step of the longhorn beetle using the variable step length method defined in equation (7).
Evaluate whether the maximum number of iterations has been reached. If so, the calculation is terminated; otherwise, the loop continues.
Output the global optimal food position (i.e. C and

Flowchart of the BAS-SVM.
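The tuning loop in the steps above can be sketched as a BAS search that maximizes a cross-validated accuracy over two SVM parameters. The sketch makes several assumptions that are not taken from the paper: the search runs in log2 space over C and a kernel parameter called gamma here, positions are clipped to hypothetical bounds, and `toy_cv_accuracy` is a smooth stand-in for the real fitness so that the example runs without an SVM library. In practice the callback would return the mean threefold cross-validation accuracy of an SVM trained with the candidate parameters.

```python
import math
import random


def bas_maximize(fitness, bounds, d0=1.0, step=2.0, eta=0.95, n_iter=60, seed=0):
    """Maximise `fitness` inside box `bounds` with a single beetle."""
    rng = random.Random(seed)

    def clip(point):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(point, bounds)]

    x = [rng.uniform(lo, hi) for lo, hi in bounds]
    best_x, best_fit = list(x), fitness(x)
    for _ in range(n_iter):
        direction = [rng.uniform(-1.0, 1.0) for _ in bounds]
        norm = math.sqrt(sum(v * v for v in direction)) or 1.0
        direction = [v / norm for v in direction]
        left = clip([xi + d0 * di / 2.0 for xi, di in zip(x, direction)])
        right = clip([xi - d0 * di / 2.0 for xi, di in zip(x, direction)])
        f_left, f_right = fitness(left), fitness(right)
        sign = (f_left > f_right) - (f_left < f_right)
        # Step toward the whisker with the higher fitness (stronger odor).
        x = clip([xi + step * di * sign for xi, di in zip(x, direction)])
        fit = fitness(x)
        if fit > best_fit:
            best_x, best_fit = list(x), fit
        step *= eta
    return best_x, best_fit


def toy_cv_accuracy(point):
    """Hypothetical stand-in for threefold cross-validated SVM accuracy."""
    log2_c, log2_gamma = point
    return math.exp(-((log2_c - 3.0) ** 2 + (log2_gamma + 2.0) ** 2) / 20.0)


search_bounds = [(-5.0, 10.0), (-10.0, 5.0)]        # assumed log2 ranges
best_point, best_accuracy = bas_maximize(toy_cv_accuracy, search_bounds)
best_c, best_gamma = 2.0 ** best_point[0], 2.0 ** best_point[1]
```

Searching in log space is a design choice of this sketch: SVM accuracy typically varies over orders of magnitude of the penalty and kernel parameters, so equal steps in the exponent explore the range more evenly.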
Simulation experiment
Simulation experiment 1: Comparison of optimization methods
In order to verify the convergence of BAS, a simulation experiment analysis is carried out by using the benchmark function of equation (8). The global minimum value of F(x) is approximately 1.
Moreover, BAS is compared with other methods, including PSO, GA, SA, and AFSA. The detailed parameter settings for all five methods are provided in Table 1, and the average convergence results from 20 independent experiments are presented in Figure 4.
Parameters of different optimization algorithms.

Average convergence curves of different methods.
As demonstrated in Figure 4, the BAS algorithm exhibits superior capability in escaping local optima and identifying the most promising search regions with fewer iterations compared to conventional optimization methods (i.e. GA, PSO, SA, and AFSA). The convergence curve of the BAS algorithm maintains stability throughout subsequent iterations, thereby demonstrating its superior convergence performance.
Simulation experiment 2: Comparison of different classifiers
To demonstrate its superiority, BAS-SVM is applied to the classification task using the wine dataset from the University of California Irvine database (a description of the dataset is provided in Table 2). Additionally, the proposed classifier is compared with PSO-SVM, GA-SVM, SA-SVM, and AFSA-SVM, with the parameter settings for each classifier listed in Table 1. The average accuracy and average recognition time of the five classifiers after 100 repeated recognitions on the wine dataset are shown in Figure 5. The experiments were run in MATLAB on an Intel(R) Core(TM) i5-3230M CPU @ 2.60 GHz with 12 GB RAM.
Description of the wine dataset.

Average recognition accuracy and recognition time.
According to Figure 5, the following results can be obtained.
The average identification rate of the BAS-SVM classifier on the dataset reaches 99.47%, which is 0.49%, 0.74%, 0.46%, and 0.32% higher than that of the PSO-SVM, GA-SVM, SA-SVM, and AFSA-SVM classifiers, respectively.
The recognition time of the BAS-SVM classifier on the dataset is 0.128 s, which is 11.5, 10.1, 44.9, and 51.1 times faster than that of the PSO-SVM, GA-SVM, SA-SVM, and AFSA-SVM classifiers, respectively. The above analysis verifies the high efficiency of BAS-SVM.
Proposed rolling bearing fault diagnosis framework
T-distributed stochastic neighbor embedding for feature dimension reduction
t-SNE is a modern manifold learning algorithm. It uses joint probabilities between high-dimensional data points, and between their low-dimensional embedded counterparts, to express the similarity among the data points. t-SNE then minimizes the Kullback-Leibler (KL) divergence between the two distributions to obtain the optimal low-dimensional result. 32
Define the conditional probability of similarity between data points $x_i$ and $x_j$ for a high-dimensional data set
where
The joint probability between high-dimensional points $x_i$ and $x_j$ and between low-dimensional points $y_i$ and $y_j$ is expressed by $p_{ij}$ and $q_{ij}$, respectively.
The KL divergence measures how faithfully the low-dimensional embedded points represent their high-dimensional counterparts.
Gradient descent is used to minimize the KL divergence, which maximizes the fidelity of the embedding. The expression is as follows.
The dimensionality reduction outcome
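The quantities in this subsection can be made concrete with a small Python sketch that evaluates the t-SNE objective for a given embedding. It follows the standard t-SNE definitions (Gaussian similarities in the original space, Student-t similarities in the embedding, KL divergence between the two) but simplifies one point: a single shared bandwidth `sigma` replaces the per-point bandwidths that t-SNE normally obtains from the perplexity. The point coordinates and function names are illustrative only.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def joint_p(points, sigma=1.0):
    """Symmetrised Gaussian similarities p_ij (shared sigma, no perplexity search)."""
    n = len(points)
    cond = []
    for i in range(n):
        weights = [0.0 if i == k else
                   math.exp(-sq_dist(points[i], points[k]) / (2.0 * sigma ** 2))
                   for k in range(n)]
        total = sum(weights)
        cond.append([w / total for w in weights])        # p_{k|i}
    return [[(cond[i][j] + cond[j][i]) / (2.0 * n) for j in range(n)]
            for i in range(n)]


def joint_q(embedding):
    """Student-t (one degree of freedom) similarities q_ij in the embedding."""
    n = len(embedding)
    w = [[0.0 if i == j else 1.0 / (1.0 + sq_dist(embedding[i], embedding[j]))
          for j in range(n)] for i in range(n)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]


def kl_divergence(p, q):
    """KL(P || Q), skipping zero entries of P (the diagonal)."""
    return sum(pij * math.log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row) if pij > 0.0)


# Two tight clusters in 3-D and two candidate 1-D embeddings.
high_dim = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (5.0, 5.0, 5.0), (5.5, 5.0, 5.0)]
faithful = [(0.0,), (0.5,), (5.0,), (5.5,)]      # neighbours stay neighbours
scrambled = [(0.0,), (5.0,), (0.5,), (5.5,)]     # neighbours are pulled apart

p = joint_p(high_dim)
kl_faithful = kl_divergence(p, joint_q(faithful))
kl_scrambled = kl_divergence(p, joint_q(scrambled))
```

The embedding that keeps neighbouring points together yields the smaller divergence, which is the quantity gradient descent drives down during t-SNE optimization.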
Proposed fault diagnosis model
A fault diagnosis model for rolling bearing faults is proposed, integrating ICMFE, t-SNE, and BAS-SVM. The overall procedure is illustrated in Figure 6, and the detailed steps are described below.
Step 1: Signal acquisition. Vibration signals from Q groups of rolling bearings under E different operating conditions are collected at a sampling frequency of fs. In total, E × Q groups are obtained. For each operating condition, P groups of signals are randomly selected as training samples, while the remaining Q−P groups are used as test samples.
Step 2: Feature extraction. In the first feature extraction stage, the ICMFE entropy values of each group of vibration signals are calculated, and the original high-dimensional fault feature set is constructed. In the second feature extraction stage, the t-SNE algorithm is employed to reduce the dimensionality of the original high-dimensional fault feature set and extract a low-dimensional, sensitive fault feature set.
Step 3: Pattern recognition. The BAS-SVM classifier uses a one-versus-one strategy to build the multi-fault classification model. Specifically, the training feature set is used to construct the classification model. Then, the test feature set is input into the trained model to enable intelligent fault classification.
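The sample partition in Step 1 can be illustrated with a short Python sketch that draws P training groups at random from each operating condition and keeps the remaining Q − P groups for testing. The function name and the fixed random seed are assumptions for illustration; the sizes in the example (E = 4 conditions, Q = 50 groups, P = 10) are those reported for Case 1.

```python
import random


def split_per_condition(labels, n_train, seed=0):
    """Randomly pick n_train sample indices per condition; the rest form the test set."""
    rng = random.Random(seed)
    by_condition = {}
    for index, label in enumerate(labels):
        by_condition.setdefault(label, []).append(index)
    train, test = [], []
    for label in sorted(by_condition):
        indices = by_condition[label][:]
        rng.shuffle(indices)
        train.extend(indices[:n_train])
        test.extend(indices[n_train:])
    return sorted(train), sorted(test)


# E = 4 conditions with Q = 50 groups each; P = 10 groups per condition for training.
labels = [state for state in ("NOR", "IRF", "ORF", "RBF") for _ in range(50)]
train_idx, test_idx = split_per_condition(labels, n_train=10)
```

Splitting within each condition, rather than over the pooled data, guarantees that every fault state is represented equally in the small training set.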

Fault diagnosis model.
Fault diagnosis experimental analysis
Case 1
To validate the proposed fault diagnosis model, it was applied to the analysis of rolling bearing fault tests. The test bench is shown in Figure 7. Type 1210 self-aligning ball bearings were chosen as the test objects in this experiment. An electric engraving device was used to machine the inner ring, outer ring, and rolling ball of the bearing to simulate pitting failure. We collected 50 groups of rolling bearing acceleration vibration data for each condition: normal (NOR), inner ring fault (IRF), outer ring fault (ORF), and rolling ball fault (RBF), totaling 200 vibration signals. The sampling frequency was set to 5120 Hz. Each group of signals contains 4096 sampling points. The time-domain waveforms of the acceleration vibration signals in the four states are shown in Figure 8. Ten groups of signals of each type are randomly selected as training samples, and the remaining 40 groups serve as test samples, resulting in a total of 40 training samples and 160 test samples.

Rolling bearing fault diagnosis platform.

Waveforms of rolling bearings.
According to Step 2 of the proposed fault diagnosis model, the high-dimensional fault feature set of the rolling bearing is constructed using the ICMFE algorithm and compared with the MFE. The entropy values and deviation curves of the four states are shown in Figure 9. The parameters are defined as follows: r = 0.15SD, s = 25, m = 2, and n = 2.

Entropy values and deviation values of MFE and ICMFE: (a) NOR, (b) IRF, (c) ORF, and (d) RBF.
Based on the data presented in Figure 9, several conclusions can be derived.
The MFE and ICMFE entropy curves of the four states of rolling bearings are relatively close. The entropy deviation values of ICMFE are smaller than those of MFE, verifying the stability of ICMFE.
When the scale factor is set to 1, the entropy value in NOR (0.163) is lower than that of IRF (0.207), ORF (0.401), and RBF (0.370), demonstrating that the fuzzy entropy value is suitable for fault monitoring.
Although the fuzzy entropy at a single scale can monitor whether a fault occurs, it cannot determine the specific fault type. Hence, the fuzzy entropy still needs to be analyzed from multiple scales.
The two feature extraction sets are input to the BAS-SVM multi-fault classifier for fault recognition, aiming to quantify the effectiveness of the ICMFE and MFE algorithms. The identification results are presented in Figure 10 and Table 3.

Recognition results of MFE and ICMFE by BAS-SVM: (a) MFE and (b) ICMFE.
Accuracy of MFE and ICMFE test samples by BAS-SVM multi-fault classifier.
From Figure 10 and Table 3, BAS-SVM achieves an accuracy on the ICMFE test set samples that is 3.12% higher than on the MFE test set samples, confirming the advantages of ICMFE for feature extraction. However, misclassification of some individual categories occurs due to redundant feature information. Therefore, applying a dimensionality reduction method for secondary feature extraction is recommended.
The t-SNE algorithm is applied to reduce the dimension of the ICMFE fault feature set. The intrinsic dimension is set to 3 for data visualization, and the perplexity parameter is set to 30. The dimensionality reduction results are shown in Figure 11. It can be observed that the samples of the four states are separated in the three-dimensional visualization produced by the t-SNE algorithm. Moreover, the samples of each state cluster tightly, verifying the feasibility of the dimensionality reduction. BAS-SVM is then used for training and testing. The confusion matrix of the test samples is shown in Figure 12. It demonstrates that BAS-SVM can accurately diagnose each fault type in the ICMFE + t-SNE test samples, achieving a recognition accuracy of 100%. The above analysis verifies the superiority of feature extraction based on ICMFE + t-SNE.

Three-dimensional visualization result using the t-SNE algorithm.

Confusion matrix of the ICMFE + t-SNE feature extraction method.
Finally, BAS-SVM is compared with GA-SVM, PSO-SVM, SA-SVM, and AFSA-SVM. The parameter settings are the same as in Table 1. Table 4 and Figure 13 show the identification results of the five classifiers on the ICMFE and ICMFE + t-SNE feature sets.
Optimization results and recognition time by different classifiers.

Recognition metrics of different classifiers: (a) ICMFE, and (b) ICMFE + t-SNE.
The following conclusions can be drawn from Table 4 and Figure 13.
For the ICMFE feature set, the BAS-SVM multi-fault classifier achieved the highest accuracy, precision, recall, and F1-score on the test samples. Moreover, its pattern recognition time is significantly lower than that of the other classifiers, verifying the efficiency of using the BAS algorithm for SVM parameter optimization.
The accuracy, precision, recall, and F1-score of all five classifiers on the ICMFE + t-SNE feature set test samples reached 100%, demonstrating that ICMFE + t-SNE can accurately extract fault features that clearly distinguish between different operating states.
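For reference, the four recognition metrics used in this comparison can be computed from a confusion matrix as sketched below. The paper does not state how per-class values are combined, so macro-averaging (the unweighted mean over classes) is an assumption here, and the function name is hypothetical. The first example reproduces the reported 100% result on the 160 Case 1 test samples (40 per state); the second uses a made-up two-class matrix purely to show non-trivial values.

```python
def classification_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n_classes = len(confusion)
    total = sum(map(sum, confusion))
    accuracy = sum(confusion[i][i] for i in range(n_classes)) / total
    precisions, recalls, f1_scores = [], [], []
    for i in range(n_classes):
        tp = confusion[i][i]
        predicted = sum(confusion[k][i] for k in range(n_classes))   # column sum
        actual = sum(confusion[i])                                   # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f1 = (2.0 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n_classes,   # macro average (assumed)
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1_scores) / n_classes,
    }


# Perfect four-state classification with 40 test samples per state.
perfect = [[40, 0, 0, 0], [0, 40, 0, 0], [0, 0, 40, 0], [0, 0, 0, 40]]
perfect_metrics = classification_metrics(perfect)

# Hypothetical two-class matrix: one class-0 sample is predicted as class 1.
example_metrics = classification_metrics([[3, 1], [0, 4]])
```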
Case 2
To further verify the robustness of the proposed fault diagnosis method, it was applied to the analysis of aerospace bearing fault data, 33 with the experimental test rig depicted in Figure 14. The fault data of the bearing in five states were collected at a sampling frequency of 51,200 Hz. The detailed information is shown in Table 5. A total of 200 samples are collected for each state. Each group of signals contains 4096 sampling points. The time-domain waveforms of the acceleration vibration signals in the five states are shown in Figure 15. Ten samples are randomly selected from each state as training samples, and the remaining 190 are used as test samples, giving a total of 50 training samples and 950 test samples across the five bearing states.

Experimental platform.
Fault description of aerospace bearings.

Waveforms of rolling bearings.
Based on the proposed fault diagnosis model, ICMFE is applied to extract fault features from the aerospace bearing dataset, and a comparative analysis is conducted with the MFE method. The entropy value results and the standard deviation curves are shown in Figures 16 and 17, respectively.

The entropy value curves extracted by MFE and ICMFE.

Standard deviation comparison of features extracted by MFE and ICMFE.
The following conclusions can be drawn from Figures 16 and 17.
The entropy curves of MFE and ICMFE for each aerospace bearing condition are relatively similar. However, the standard deviations of the entropy values obtained using ICMFE are consistently smaller than those from MFE under the same conditions, confirming the greater stability of the ICMFE algorithm.
When the scale is set to 1, the fuzzy entropy values of the five aerospace bearing states demonstrate significant variability, thereby validating their potential applicability in fault monitoring systems.
Although the fuzzy entropy under a single scale can monitor whether a fault occurs, it cannot determine the specific type of fault. Therefore, the fuzzy entropy still needs to be analyzed on multiple scales.
The feature extraction results from ICMFE and MFE are fed into the BAS-SVM multi-classifier, and the corresponding classification outcomes are shown in Figure 18 and Table 6. In the identification results, MFE misclassifies 58 test samples, while ICMFE misclassifies 35. Accordingly, ICMFE achieves an accuracy that is 2.43% higher than MFE, demonstrating the advantages of using ICMFE for feature extraction. However, the presence of redundant features may still lead to misclassification in certain categories. Therefore, incorporating dimensionality reduction techniques into the secondary feature extraction process is recommended.
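The reported accuracy gain can be checked directly from the misclassification counts and the 950 test samples. The short calculation below shows that the exact difference is about 2.42 percentage points, while 2.43 is obtained when each accuracy is first rounded to two decimals; that the reported figure was derived from rounded accuracies is an assumption, since Table 6 is not reproduced here.

```python
n_test = 950
misclassified = {"MFE": 58, "ICMFE": 35}

# Accuracy in percent for each feature extraction method.
accuracy = {name: 100.0 * (n_test - errors) / n_test
            for name, errors in misclassified.items()}

gain_exact = accuracy["ICMFE"] - accuracy["MFE"]
gain_from_rounded = round(accuracy["ICMFE"], 2) - round(accuracy["MFE"], 2)

print(f"MFE accuracy:   {accuracy['MFE']:.2f}%")
print(f"ICMFE accuracy: {accuracy['ICMFE']:.2f}%")
print(f"gain (exact): {gain_exact:.2f}, gain (rounded inputs): {gain_from_rounded:.2f}")
```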

Recognition results of MFE and ICMFE by BAS-SVM multi-fault classifier.
Accuracy of MFE and ICMFE test samples by BAS-SVM multi-fault classifier.
The t-SNE algorithm is used for dimensionality reduction of the ICMFE feature set. The perplexity is set to 30, and the intrinsic dimensionality is set to 3 for data visualization. The dimensionality reduction results are presented in Figure 19. As shown in the three-dimensional visualization, although there is some overlap between a few IRF1 and RF1 samples, as well as between IRF2 and RF2 samples, the samples from the five different states remain well separated overall.

Three-dimensional visualization result using the t-SNE algorithm.
Furthermore, the samples from each state demonstrate strong clustering, which validates the feasibility of the dimensionality reduction approach. Subsequently, the BAS-SVM multi-fault classifier was utilized to train and test the samples, and the confusion matrix for the test samples is presented in Figure 20. As demonstrated in Figure 20, the BAS-SVM multi-fault classifier achieves highly accurate diagnosis of fault types in the ICMFE + t-SNE test samples, with an average accuracy of 99.58%. This analysis validates the superiority of the proposed feature extraction approach.

Multi-class confusion matrix of ICMFE + t-SNE feature extraction method.
Finally, BAS-SVM is compared with GA-SVM, PSO-SVM, SA-SVM, and AFSA-SVM. The parameter settings are the same as in Table 1. Table 7 and Figure 21 show the recognition performance of these classifiers on the ICMFE and ICMFE + t-SNE feature sets.
The optimization parameters and recognition times of different classifiers.

The recognition indicators of different classifiers: (a) ICMFE, and (b) ICMFE + t-SNE.
The following conclusions can be drawn from Table 7 and Figure 21.
(1) For both the ICMFE and ICMFE + t-SNE feature sets, the BAS-SVM multi-fault classifier achieves the highest accuracy, precision, recall, and F1-score. In addition, the pattern recognition time of BAS-SVM is significantly lower than that of the other classifiers. These results confirm the efficiency and effectiveness of using the BAS algorithm to tune the parameters of the SVM.
(2) For the same classifier, the accuracy, precision, recall, and F1-score obtained with ICMFE + t-SNE are better than those obtained with ICMFE alone, indicating that ICMFE + t-SNE extracts fault features that readily distinguish the operating states.
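The four indicators compared above can all be derived from a multi-class confusion matrix. The following is a minimal sketch, assuming that precision, recall, and F1-score are macro-averaged over the classes (the averaging scheme is not stated here). The function name `macro_metrics` and the three-class matrix are hypothetical illustrations, not data from the experiments.

```python
def macro_metrics(cm):
    """Return (accuracy, precision, recall, f1) for a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Precision, recall, and F1 are macro-averaged (an assumption of this sketch).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return (correct / total,
            sum(precisions) / n,
            sum(recalls) / n,
            sum(f1s) / n)


# Hypothetical 3-class confusion matrix (not taken from the paper).
example_cm = [
    [48, 2, 0],
    [1, 49, 0],
    [0, 0, 50],
]
accuracy, precision, recall, f1 = macro_metrics(example_cm)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

With macro-averaging, each fault state contributes equally to precision, recall, and F1-score regardless of how many test samples it has, which suits balanced test sets such as the hypothetical one above.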
Parameter influence experiment
(1) Influence of the adjustable initial step size parameter in BAS-SVM on diagnostic results.
To evaluate the impact of the adjustable initial step size parameter in BAS-SVM on diagnostic performance, we conducted experiments using the ICMFE feature set from Case 2. BAS-SVM classifiers with different initial step size parameters (0.45, 0.55, 0.65, 0.75, 0.85, and 0.95) were trained and tested on this feature set. The recognition results are presented in Figure 22.

Influence of different initial step size parameters on diagnosis performance.
As shown in Figure 22, the four recognition metrics of the BAS-SVM classifier on the ICMFE feature set exhibit an overall upward trend as the adjustable initial step size parameter increases. This parameter is analogous to the beetle’s search range in the initial phase: a sufficiently large value lets the algorithm mimic the beetle’s extensive exploration of the surrounding environment, which increases the likelihood of finding the global optimum. Therefore, the adjustable initial step size parameter was set to 0.95 in this study.
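The role of the initial step size can be illustrated with a minimal sketch of a beetle antennae search loop. This is a generic form of BAS under stated assumptions, not the exact variant used in this study: the step decay factor, the antenna-to-step ratio, the iteration count, and the function names are all hypothetical, and `toy_validation_error` is a smooth stand-in for the SVM validation error over two hyperparameters.

```python
import math
import random


def bas_minimize(objective, x0, initial_step=0.95, step_decay=0.95,
                 antenna_ratio=2.0, iterations=200, seed=0):
    """Minimize `objective` with a basic beetle antennae search (sketch)."""
    rng = random.Random(seed)
    x = list(x0)
    best_x, best_f = list(x), objective(x)
    step = initial_step
    for _ in range(iterations):
        # Random unit vector: the direction the beetle is facing.
        direction = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(d * d for d in direction)) or 1.0
        direction = [d / norm for d in direction]
        # Antenna length is tied to the current step size (assumed ratio).
        antenna = step / antenna_ratio
        left = [xi + antenna * di for xi, di in zip(x, direction)]
        right = [xi - antenna * di for xi, di in zip(x, direction)]
        # Move toward the antenna that senses the lower objective value.
        diff = objective(left) - objective(right)
        sign = (diff > 0) - (diff < 0)
        x = [xi - step * sign * di for xi, di in zip(x, direction)]
        fx = objective(x)
        if fx < best_f:
            best_x, best_f = list(x), fx
        # Shrink the step so the search gradually turns local.
        step *= step_decay
    return best_x, best_f


def toy_validation_error(params):
    """Hypothetical stand-in for SVM validation error over two parameters."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


# Sweep the same initial step sizes as in the experiment above.
for step0 in (0.45, 0.55, 0.65, 0.75, 0.85, 0.95):
    best_params, best_error = bas_minimize(
        toy_validation_error, [0.0, 0.0], initial_step=step0)
    print(f"initial_step={step0:.2f} best_error={best_error:.4f}")
```

Because the step shrinks geometrically, the initial step size bounds how far the search can travel from its starting point, which is one way to read the exploration argument above. The toy objective is convex, so this sweep is not expected to reproduce the trend in Figure 22.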
(2) Influence of the perplexity parameter in t-SNE on diagnostic performance.
To examine the effect of the perplexity parameter in t-SNE on diagnostic performance, an experimental analysis was performed using the ICMFE feature set from Case 2. The feature set underwent dimensionality reduction with t-SNE using various perplexity values (10, 20, 30, 40, 50, and 60). The resulting lower-dimensional feature sets were then used to train and test the BAS-SVM classifier. The recognition results are presented in Figure 23.

Influence of perplexity on diagnostic performance.
As demonstrated in Figure 23, the four recognition metrics of BAS-SVM on the dimensionality-reduced feature set first increase and then decrease as the perplexity parameter increases, with the best values achieved at a perplexity of 30. An overly small perplexity may lead to excessive focus on local structures while neglecting the global distribution of the data, so that the reduced features fail to represent the overall data patterns. Conversely, an excessively large perplexity may obscure local details, so that local features that were originally separable become indistinguishable. Therefore, after a comprehensive evaluation, the perplexity parameter of t-SNE in this study was set to 30.
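The trade-off described above follows from how perplexity is defined. In the original t-SNE formulation, each point i has a Gaussian neighbor distribution P_i, the perplexity is 2 raised to the Shannon entropy of P_i, and the Gaussian bandwidth is found by binary search so that this value matches the user setting. The sketch below illustrates this for a single point. The helper names and the one-dimensional neighbor distances are hypothetical, and the code is not taken from any particular t-SNE implementation.

```python
import math


def conditional_probs(sq_dists, sigma):
    """Gaussian neighbor probabilities p(j|i) for one point i."""
    m = min(sq_dists)  # shift by the minimum for numerical stability
    weights = [math.exp(-(d - m) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def perplexity(probs):
    """Perplexity = 2 ** (Shannon entropy in bits)."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy


def sigma_for_perplexity(sq_dists, target, lo=1e-3, hi=1e3, iters=100):
    """Binary search (on a log scale) for the bandwidth matching `target`."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if perplexity(conditional_probs(sq_dists, mid)) > target:
            hi = mid  # distribution too flat: narrow the Gaussian
        else:
            lo = mid  # distribution too peaked: widen the Gaussian
    return math.sqrt(lo * hi)


# Hypothetical point with 100 neighbors at distances 1, 2, ..., 100.
sq_dists = [float(k * k) for k in range(1, 101)]
for target in (10, 30, 60):
    sigma = sigma_for_perplexity(sq_dists, target)
    print(f"perplexity={target:2d} -> sigma={sigma:.3f}")
```

A larger perplexity forces a wider bandwidth, so each point treats more of the data set as its neighborhood. This is why small values emphasize local structure and large values blur it.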
Conclusions
The ICMFE was presented to overcome the limitation of invalid entropy values in MFE, improving the stability of entropy results. The bearing diagnosis experimental results showed that the entropy features obtained by ICMFE have a lower standard deviation, and that their recognition accuracy is up to 3.12% higher than that of MFE.
The BAS-SVM classifier was proposed to enable fast and accurate sample classification. It outperforms existing classifiers such as PSO-SVM, SA-SVM, and AFSA-SVM. While achieving higher recognition accuracy, it also significantly improves recognition speed by factors of 11.5, 44.9, and 51.1, respectively. Experimental results from bearing fault diagnosis further demonstrate that the BAS-SVM classifier offers superior diagnostic accuracy and efficiency.
A rolling bearing fault diagnosis model based on ICMFE, t-SNE, and BAS-SVM was established. The experimental results of fault diagnosis of 1210 self-aligning ball bearings and aerospace bearings showed that the proposed framework achieves an accuracy of more than 99.58% when identifying different fault states of bearings.
In the future, the proposed ICMFE method can be combined with multi-modal theory to fully exploit data from multiple sensors. Moreover, the proposed fault diagnosis model can be extended to other areas of rotating machinery fault diagnosis, such as gear fault diagnosis.
Footnotes
Ethical considerations
Ethical statement is not applicable for this article.
Informed consent/patient consent
There are no human subjects in this article and informed consent is not applicable.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by General Project of Natural Science Research in Universities of Anhui Province (Grant No. KJ2021B04), Postdoctoral Fellowship Program of China Postdoctoral Science Foundation (Grant No. GZC20231284), and China Postdoctoral Science Foundation (Grant No. 2024M751643).
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The datasets used or analyzed during the current study are available from the corresponding author on reasonable request.
Trial registration number/date
Trial registration number/date is not applicable for this article.
