Condition monitoring of water pump bearings using ensemble classifier

Abstract

Bearing faults are reported to be the major cause of centrifugal pump (CP) failures. Limited literature is available on diagnosing minor scratches in the bearing surface through non-intrusive condition monitoring techniques. Recent research on the analysis of bearing scratches through non-intrusive motor current analysis (MCA) has shown encouraging results; that work compared machine learning algorithms and convolutional neural networks (CNNs) for the classification of healthy and faulty bearings (holes and scratches). However, the reported fault classification accuracy of 89.26% for MCA combined with machine learning and CNN algorithms is low. The key factors behind the low accuracy were identified as the low amplitudes of the harmonics in the MCA spectrum, the magnitude of environmental noise, and the use of conventional feature extraction techniques. This paper tackles the problem by developing a novel feature extractor (NFE) that extracts powerful features from the integrated current and voltage sensor data. The NFE is built on a threshold-based decision mechanism that locates each feature harmonic, extracts it, measures the amplitude of the fault component, and compares it with the derived threshold. Experimental data were collected for bearing ball (BB), bearing cage (BC), inner race (IR), and outer race (OR) faults. The NFE was tested with an ensemble classifier (CatBoost) and achieved better classification accuracy (99.2% for an individual feature and 100% with a combination of two or more features) than previously reported methods.

Keywords

Centrifugal pump performance, fault amplitudes, scratches in bearing, current measurement, voltage measurement, feature extraction

Introduction

CPs usually operate in industrial environments with high temperatures, humidity, dust, and noise.^1–3 The bearing is one of the critical components of a CP and is prone to various faults. Figure 1 indicates that the maintenance cost of CPs is the highest in the industry, and Figure 2 shows that bearings malfunction more often than other components.⁴ Thus, fault diagnosis of CP bearings is of great importance.^5–10 Bearings are the reason for more than 41% of machine breakdowns; this paper therefore investigates various bearing faults. The bearing structure and its components are shown in Figure 3.^11,12 Most of the literature has focused on techniques for diagnosing holes and cracks in the IR and OR.^13–16 Only two published papers focus on the diagnosis of mini scratches in the OR.^17,18 The diagnosis of mini scratches in the IR is not reported in the literature.^19–22 Thus, the scope of this study is to examine mini scratches in the IR and OR along with faults in the BB and BC.

Figure 1.

The maintenance cost of the petrochemical plant.

Figure 2.

The statistics of the machine failure due to various components.

Figure 3.

The schematic of the bearing.

The typical sensors used in the fault diagnosis of CPs are accelerometers (vibration sensors), current transducers (CTs), voltage transducers (VTs), noise sensors, temperature sensors, and magnetic flux sensors.²³ Fault diagnosis techniques are usually named after the sensor type used for data collection. Vibration analysis, thermal analysis, magnetic flux analysis, noise analysis, and acoustic emission techniques are categorized as intrusive techniques because their sensors are installed on the machine surface. Intrusive techniques are well known in the industry, and ISO standards are available to categorize machine failures based on the sensor data. However, some machines are located where access for sensor fitting and status monitoring is not easy, so intrusive techniques are not suitable for such applications. Furthermore, the high cost of the sensors is another major disadvantage of intrusive techniques.^24–29 In the past, researchers developed a non-intrusive condition monitoring technique that applies the fast Fourier transform to the motor line current data and analyzes the fault-related harmonics. This technique was named motor current analysis (MCA).^30–34 Several papers have been published in the past decade to improve the performance of MCA by developing pre-processing algorithms that make the condition monitoring system reliable, efficient, and economical and that reduce its complexity.^35–38 Although MCA is a better alternative to intrusive techniques, the harmonics associated with defects are suppressed by the amplitude of the line frequency, which can lead to false alarms. Another issue reported with MCA is its high tendency to raise false alarms in a highly noisy environment.^39–41

Fault diagnosis of machines through artificial intelligence (AI) has been a trend in the last couple of years, and various AI algorithms such as support vector machine (SVM), Naive Bayes classifier (NBC), k-nearest neighbor (k-NN), and neural networks (NN) have been used in the literature for the diagnosis and classification of bearing faults. However, conventional machine learning models have shown low classification accuracies for the diagnosis and classification of bearing scratch faults.^{17,18,42–47} Furthermore, the capability of individual features to determine the fault classification accuracy has not been tested in the past.

The literature review given in the earlier paragraphs highlights the limitations of intrusive and non-intrusive condition monitoring techniques and points to significant improvements needed in non-intrusive techniques so that they become capable of reliable fault diagnosis in a highly noisy environment. Low classification accuracy has been reported to be the main issue with conventional machine learning models. The literature on bearing fault diagnosis therefore suggests reliable diagnosis and classification of scratches in the OR and IR as a research direction. Thus, the contributions of this paper are:

A novel feature extractor (NFE) has been developed that extracts only powerful features from the IPS data, with the objective of enhancing the classification accuracy of ensemble classifiers.

An ensemble classifier (CatBoost) has been developed to evaluate the effectiveness of the NFE and to test the capability of the individual IPS features, as well as combinations of them, for fault diagnosis.

The rest of the paper is structured as follows: Section 2 presents the mathematical steps for feature identification, selection, and extraction, and the NFE development. Section 3 describes the condition monitoring setup. The results and discussions are provided in Section 4. Finally, the conclusion is presented in Section 5.

Development of novel feature extractor

The features for bearing ball (BB), bearing cage (BC), inner race frequency (IRF), and outer race frequency (ORF) are shown in Table 1. The derivation of the fault features and the sample calculations are shown in Appendix A.

Table 1.

The harmonic locations for various faults.

Bearing features	Slip	Frequency of rotation (Hz)	Principal harmonic in the IPS (Hz)	Lower sideband in the IPS (Hz)	Upper sideband in the IPS (Hz)
BB $f_{b 1}, f_{b 2}, f_{b 3}$	0.082	22.96	90	10	190
BC $f_{c 1}, f_{c 2}, f_{c 3}$	0.082	22.96	8.6	91.4	108.6
IRF $f_{i 1}, f_{i 2}, f_{i 3}$	0.082	22.96	110.2	10.2	210.2
ORF $f_{o 1}, f_{o 2}, f_{o 3}$	0.082	22.96	73.4	26.6	173.4

In Table 1, $f_{x 1}$ represents the principal harmonic, $f_{x 2}$ is the lower sideband, and $f_{x 3}$ is the upper sideband.

The novel feature extractor (NFE) has been developed using the following steps:

The data has been collected from voltage and current sensors and has been converted into frequency spectrum using IPS algorithm.

The normal bearings data has been used as a benchmark for amplitude calculation and comparison.

The frequencies associated with the faults such as $f_{x 1}, f_{x 2}, f_{x 3}$ are identified.

An algorithm has been constructed to automatically extract the $f_{x 1}, f_{x 2}, f_{x 3}$ from the spectrum. The magnitude of the extracted components has been identified.

The amplitude difference between the benchmark values and the values of $f_{x 1}, f_{x 2}, f_{x 3}$ has been calculated. A zero difference represents a healthy bearing. A non-zero amplitude difference indicates the presence of a bearing scratch.

The final comparison of magnitudes has been performed between the threshold value and those $f_{x 1}, f_{x 2}, f_{x 3}$ features whose amplitude difference is greater than zero. The features whose values are greater than the threshold are segregated and used in the ensemble learning algorithm for bearing fault classification.
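The steps above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`amplitude_db`, `nfe_extract`) are hypothetical, the input is any uniformly sampled signal (how the IPS signal is formed from the voltage and current data is not reproduced here), and the amplitude at each feature frequency is read with a single-bin discrete Fourier transform instead of a full spectrum.

```python
import cmath
import math


def amplitude_db(signal, fs, freq):
    """Amplitude in dB of the component of `signal` at `freq` (single-bin DFT)."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * i / fs)
              for i, x in enumerate(signal))
    # Peak amplitude of a sinusoid at `freq` (exact when the record spans whole cycles).
    amplitude = 2.0 * abs(acc) / n
    return 20.0 * math.log10(max(amplitude, 1e-15))  # floor avoids log10(0)


def nfe_extract(fault_signal, healthy_signal, fs, feature_freqs, threshold_db):
    """Compare each feature frequency with the healthy benchmark and the threshold."""
    results = []
    for freq in feature_freqs:
        healthy_db = amplitude_db(healthy_signal, fs, freq)
        fault_db = amplitude_db(fault_signal, fs, freq)
        difference_db = fault_db - healthy_db
        # Keep a feature only if it rose above the benchmark AND exceeds the threshold.
        selected = difference_db > 0.0 and fault_db > threshold_db
        results.append({
            "freq_hz": freq,
            "healthy_db": healthy_db,
            "fault_db": fault_db,
            "difference_db": difference_db,
            "selected": selected,
        })
    return results
```

The single-bin transform is a design choice for brevity; a full FFT with a nearest-bin lookup would serve the same purpose when many frequencies are inspected.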

The flow chart of the NFE development has been shown in Figure 4.

Figure 4.

The flow chart of the NFE.

Experimental procedure

The system developed for the performance monitoring of the centrifugal pump is shown in Figure 5. The current and voltage sensors are placed on the electric power line. The data collected from the sensors are interfaced through an NI PXIe 6363 and examined in LabVIEW. Four faults are simulated in the bearing: Type 1 is a ball defect; Type 2 is a broken cage; Type 3 is a scratch of 0.5 mm width, 0.5 mm depth, and 5 mm length in the inner surface of the bearing; and Type 4 is a scratch in the outer surface of the bearing with the same dimensions as Type 3. The simulated bearing faults are shown in Figure 6.

Figure 5.

The system developed for the condition monitoring of the centrifugal pump.

Figure 6.

Faults in bearing: (a) BBF, (b) BCF, (c) IRF, and (d) ORF.

Results and discussions

Figures 7 to 10 show the IPS plots. In Figure 7(a) and (b), an amplitude difference of about 10 dB is observed when comparing the features ( $f_{x 1}, f_{x 2}, f_{x 3}$ ) of the healthy bearing and the bearing with a ball fault. An amplitude difference of about 13 dB is observed at the features ( $f_{x 1}, f_{x 2}, f_{x 3}$ ) of Figure 8(a) and (b) for cage faults. Similarly, an amplitude difference of about 18 dB is observed at the features ( $f_{x 1}, f_{x 2}, f_{x 3}$ ) of Figure 9(a) and (b), which indicates the presence of inner race (IR) faults. Finally, Figure 10(a) and (b) shows an amplitude difference of about 16 dB, which is an indicator of bearing outer race (OR) faults.

Figure 7.

The IPS of the bearing for (a) normal case and (b) ball fault.

Figure 8.

The IPS of the bearing for (a) normal case and (b) cage fault.

Figure 9.

The IPS of the bearing for (a) normal case and (b) IR fault.

Figure 10.

The IPS of the bearing for (a) normal case and (b) OR fault.

The NFE has been used to extract the amplitudes of each feature ( $f_{x 1}, f_{x 2}, f_{x 3}$ ) shown in Table 1. The NFE calculates the amplitude difference for each type of fault. A zero NFE output represents the normal bearing case; a non-zero output indicates a bearing fault. Figures 7 to 10 show that the various faults cause amplitude differences in the range of 10–18 dB. Such small amplitude differences could be missed when the machines operate in an industrial environment where the noise variations are sometimes much larger than the fault amplitudes. This scenario would cause missed or false detections in an automatic fault detection system. The issue has been addressed here by comparing the fault amplitudes with a threshold value that has been derived by taking the noise variations into consideration. Harmonic components whose amplitudes are higher than the threshold are considered robust features; they are segregated and fed to the ensemble learning classifier. Harmonic components whose amplitudes are lower than the threshold are weak features and are neglected. The feature segregation is shown in Table 2, and the threshold derivation is given in Appendix B.

Table 2.

The features selection using NFE.

Fault type	Normal condition (dB)	Fault condition (dB)	Amplitude difference (dB)	Threshold (dB)	Comments
BBF	−82.03	−72.14	9.89	−73.5	Select this feature
	−81.28	−70.93	10.35		Select this feature
	−80.47	−70.2	10.27		Select this feature
BCF	−87.24	−74.2	13.04	−73.5	Neglect this feature
	−86.9	−73.25	13.65		Select this feature
	−86.67	−73.01	13.66		Select this feature
IRF scratch	−80.98	−62.09	18.89	−73.5	Select this feature
	−80.35	−62.03	18.32		Select this feature
	−80.73	−62.07	18.66		Select this feature
ORF scratch	−84.31	−68.25	16.06	−73.5	Select this feature
	−84.14	−68.53	15.61		Select this feature
	−82.17	−66.38	15.79		Select this feature
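The selection rule behind Table 2 can be reproduced with a few lines of code. The sketch below copies the normal and fault amplitudes from Table 2 and applies the −73.5 dB threshold; the names `TABLE_2` and `select_features` are hypothetical and the code is only an illustration of the comparison, not the authors' software.

```python
THRESHOLD_DB = -73.5  # threshold value listed in Table 2

# (fault type, normal condition dB, fault condition dB), copied from Table 2
TABLE_2 = [
    ("BBF", -82.03, -72.14), ("BBF", -81.28, -70.93), ("BBF", -80.47, -70.20),
    ("BCF", -87.24, -74.20), ("BCF", -86.90, -73.25), ("BCF", -86.67, -73.01),
    ("IRF", -80.98, -62.09), ("IRF", -80.35, -62.03), ("IRF", -80.73, -62.07),
    ("ORF", -84.31, -68.25), ("ORF", -84.14, -68.53), ("ORF", -82.17, -66.38),
]


def select_features(rows, threshold_db):
    """Return (fault type, amplitude difference in dB, keep?) for every row."""
    decisions = []
    for fault_type, normal_db, fault_db in rows:
        difference_db = round(fault_db - normal_db, 2)
        # A feature is kept when it rose above the benchmark and exceeds the threshold.
        keep = difference_db > 0 and fault_db > threshold_db
        decisions.append((fault_type, difference_db, keep))
    return decisions


if __name__ == "__main__":
    for fault_type, difference_db, keep in select_features(TABLE_2, THRESHOLD_DB):
        print(fault_type, difference_db, "select" if keep else "neglect")
```

Only the first BCF row (−74.2 dB) falls below the threshold, which matches the single "Neglect this feature" entry in Table 2.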

Performance of fault classification algorithm

The features of the normal bearing (NB) and the four fault classes shown in Table 1 are extracted from the IPS by the NFE algorithm and are used by the ensemble learning algorithm for bearing fault classification. The total number of samples is 640, of which 70% are used for training the algorithm and 20% are used for testing it and measuring its performance. Three features are used for classification, defined as A1, A2, and A3, where A1 is the amplitude at feature $f_{x 1}$ , A2 is the amplitude at feature $f_{x 2}$ , and A3 is the amplitude at feature $f_{x 3}$ . These are shown in the IPS plots of Figures 7 to 10.

CatBoost is a gradient-boosting method based on decision trees. Its distinguishing characteristics can be summarized in three points. First, it reduces target leakage by modifying gradient boosting with an ordered boosting technique. Second, the algorithm works efficiently with small datasets. Third, the algorithm can handle a wide range of data types and formats. Since its introduction, CatBoost has been used in many other areas, including finance, and with many different kinds of datasets, such as time-series data. In one-hot encoding, each category gets a new binary feature in place of the original variable. When dealing with categorical features during model training, CatBoost uses efficient modified target-based statistics that handle them properly, which saves a significant amount of computational time. Additionally, the algorithm uses random permutations to estimate leaf values while selecting the tree structure, in order to avoid the overfitting that is common with conventional gradient boosting methods. The ordered boosting process is another key feature of CatBoost. In conventional gradient-boosted trees (GBTs), a prediction model is built by performing multiple boosting steps on all of the training data. As a result, the model's predictions shift, creating a new kind of target leakage issue. The ordered boosting scheme used by CatBoost overcomes this problem.

For a given feature set F = {f1, f2, …, fN} the feature importance of fi (i = 1, 2, …, N) in the trained CatBoost model has been calculated using equations (1) and (2).

\text{Feature importance}(f_{i}) = \sum_{\text{leaf} \in S_{f_{i}}} \left[ {(v_{1} - avr)}^{2} \cdot c_{1} + {(v_{2} - avr)}^{2} \cdot c_{2} \right]

(1)

avr = \frac{v_{1} \cdot c_{1} + v_{2} \cdot c_{2}}{c_{1} + c_{2}}

(2)

where S denotes the different paths to the leaf nodes in the decision tree, $c_{1}$ and $c_{2}$ denote the total weight coefficients in the left and right leaves, respectively, and $v_{1}$ and $v_{2}$ denote the formula values in the left and right leaves, respectively. The block diagram of the CatBoost classifier is shown in Figure 11. The performance of the CatBoost classifier is shown in Table 3.
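As a small numerical illustration of equations (1) and (2), the sketch below evaluates the contribution of individual splits and sums them. The function names and the representation of a split as a tuple `(v1, c1, v2, c2)` are assumptions made for this example; the snippet does not call the CatBoost library.

```python
def split_contribution(v1, c1, v2, c2):
    """Contribution of one split, following equations (1) and (2)."""
    avr = (v1 * c1 + v2 * c2) / (c1 + c2)                 # equation (2)
    return (v1 - avr) ** 2 * c1 + (v2 - avr) ** 2 * c2    # summand of equation (1)


def feature_importance(splits):
    """Sum the contributions of all splits that use a given feature."""
    return sum(split_contribution(v1, c1, v2, c2) for v1, c1, v2, c2 in splits)
```

For example, a split with leaf values 1 and 3 and equal weights of 1 has avr = 2 and contributes (1 − 2)² + (3 − 2)² = 2, whereas a split whose two leaves hold the same value contributes nothing.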

Figure 11.

The block diagram of the CatBoost classifier.

Table 3.

The performance of the CatBoost classifier.

Test	Test description	A1 selected	A2 selected	A3 selected	Accuracy (%) (training)	Accuracy (%) (10-fold)
1	Only A1 feature used for classification	1	0	0	99.2	90
2	Only A2 feature used for classification	0	1	0	100	90
3	Only A3 feature used for classification	0	0	1	100	87
4	A1, A2 features used for classification	1	1	0	100	91
5	A1, A3 features used for classification	1	0	1	100	93
6	A2, A3 features used for classification	0	1	1	100	99
7	All features used for classification	1	1	1	100	100

The accuracies of the individual features and of the combinations of features fed to the CatBoost classifier are shown in Table 3. The confusion matrices for the various combinations of features are shown in Figure 12. CatBoost achieves a training accuracy of 100% in every test except Test 1 (99.2%, feature A1 only), while the 10-fold accuracy ranges from 87% for a single feature to 100% when all three features are combined.
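The seven tests in Table 3 are exactly the non-empty subsets of the three features A1, A2, and A3. A short sketch of how such a test plan can be enumerated is given below; the helper names are hypothetical and the snippet only generates the feature masks, it does not train any classifier.

```python
from itertools import combinations

FEATURES = ("A1", "A2", "A3")


def feature_subsets(features):
    """All non-empty feature subsets, ordered by size (single features first)."""
    subsets = []
    for size in range(1, len(features) + 1):
        subsets.extend(combinations(features, size))
    return subsets


def selection_mask(subset, features):
    """0/1 mask in the style of the feature columns of Table 3."""
    return tuple(int(name in subset) for name in features)


if __name__ == "__main__":
    for test_no, subset in enumerate(feature_subsets(FEATURES), start=1):
        print(test_no, subset, selection_mask(subset, FEATURES))
```

With three features this yields seven subsets in the same order as Tests 1 to 7.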

Figure 12.

The confusion matrices of the CatBoost classifier for various combinations of features: (a) Test 1, (b) Test 2, (c) Test 3, (d) Test 4, (e) Test 5, (f) Test 6, and (g) Test 7.

Performance comparison

Comparison of the performance with other machine learning techniques

The classification accuracy of the CatBoost classifier has been compared with that of other well-known machine learning classifiers using the same dataset, and the results are summarized in Table 4. The CatBoost classifier gives better accuracy than the Support Vector Machine, Naïve Bayes Classifier, and Gradient Boost Classifier.

Table 4.

The summary of the performance comparison of various algorithms.

Sr. No	Classifier name	Accuracy (%)
1	Support Vector Machine	96.87
2	Naïve Bayes Classifier	96
3	Gradient Boost Classifier	96.87
4	CatBoost Classifier	100

Comparison of the performance with other published papers

The comparison with other published papers indicates that the classification performance has been significantly improved relative to earlier work, including studies based on XGBoost and CatBoost. For example, Esakimuthu Pandarakone et al.¹⁷ investigated minor scratches in the bearing outer surface. They used non-intrusive MCA for data collection and frequency analysis for feature extraction, and measured the classification accuracy of several algorithms (SVM, k-NN, NBC, and CNN). However, they achieved a maximum accuracy of only 89.26%. Vakharia et al.⁴⁸ used a minimum permutation entropy based best wavelet feature extraction technique for the analysis and classification of bearing faults. They reported a classification accuracy of 97.5% using ANN and SVM techniques. However, they relied on vibration analysis, which is an intrusive method. Davo et al.⁴⁹ developed a multi-fusion signal processing and mutual information technique on vibration data for the classification of bearing faults. They compared four algorithms, of which random forest (RF) showed the best classification accuracy. Yuan et al.⁵⁰ used a feature ranking and selection method to classify bearing faults with a CatBoost classifier and reported a classification accuracy of 99.17%. Gareev et al.⁵¹ compared SVM and CatBoost classifiers for diagnosing mechanical faults and concluded that SVM gives lower accuracy (85.3%) while CatBoost reaches up to 99.3%. Long et al.⁵² used an improved AdaBoost classifier fed with multi-sensor data for motor fault diagnosis and achieved a classification accuracy of 92.38%; the multi-sensor data collection setup has a high cost, and the AdaBoost performance was not satisfactory. Zhang et al.⁵³ used time-domain vibration analysis of a gearbox, with wavelet packet decomposition as the feature extraction method and an AdaBoost classifier, and achieved a classification accuracy of 96.94%. The CatBoost classifier used in the present work has shown better accuracy. This improvement in classification accuracy reflects the significance of the proposed NFE. The comparison of the proposed work with other published papers is summarized in Table 5.

Table 5.

The comparison of various algorithms performance using conventional feature extraction techniques and the proposed NFE method.

Published studies						This study
Reference	Year	Feature extraction technique	Classification algorithm	Accuracy (%)	Limitations	Feature extraction technique	Classification algorithm	Accuracy (%)
Esakimuthu Pandarakone et al.¹⁷	2019	Frequency feature extraction	SVM; k-NN; NBC; CNN	83.04%; 87.85%; 84.31%; 89.26%	Low accuracies; individual features not investigated	NFE (accuracy calculation of individual as well as combined features, frequency domain features, non-intrusive data collection)	CatBoost	100%
Vakharia et al.⁴⁸	2015	Minimum permutation entropy based best wavelet selection	SVM; ANN	97.5%; 97.5%	Intrusive data collection technique; time-domain analysis	NFE (accuracy calculation of individual as well as combined features, frequency domain features, non-intrusive data collection)	CatBoost	100%
Davea et al.⁴⁹	2020	Multi fusion signal processing	RF; SVM; ANN; IBK	Training (100%), 10-fold (98.43%); Training (98.43%), 10-fold (93.75%); Training (96.87%), 10-fold (89.06%); Training (100%), 10-fold (96.87%)	Intrusive data collection technique; time-domain analysis	NFE (accuracy calculation of individual as well as combined features, frequency domain features, non-intrusive data collection)	CatBoost	Training (100%), 10-fold (99%)
Yuan et al.⁵⁰	2021	Feature ranking	CatBoost	99.17%	Intrusive technique for data acquisition; individual features not investigated	NFE (accuracy calculation of individual as well as combined features, frequency domain features, non-intrusive data collection)	CatBoost	100%
Gareev et al.⁵¹	2019	Time domain extraction	CatBoost	99.3%	Intrusive technique for data acquisition; individual features not investigated	NFE (accuracy calculation of individual as well as combined features, frequency domain features, non-intrusive data collection)	CatBoost	100%
Long et al.⁵²	2021	Weight distribution matrix	AdaBoost	92.38%	High-cost data acquisition; individual features not investigated; low accuracy	NFE (accuracy calculation of individual as well as combined features, frequency domain features, non-intrusive data collection)	CatBoost	100%
Zhang et al.⁵³	2020	Wavelet packet decomposition	AdaBoost	96.94%	Intrusive technique for data acquisition; individual features not investigated; low accuracy	NFE (accuracy calculation of individual as well as combined features, frequency domain features, non-intrusive data collection)	CatBoost	100%
Nishat Toma and Kim⁵⁴	2020	Wavelet packet decomposition and line frequency filtration	XGBoost	99.3%	Individual features not investigated; low accuracy	NFE (accuracy calculation of individual as well as combined features, frequency domain features, non-intrusive data collection)	CatBoost	100%

Conclusions

This paper has developed a novel feature extractor for the classification of various bearing faults. The locations of the features ( $f_{x 1}, f_{x 2}, f_{x 3}$ ) have been derived through mathematical models. The NFE method has been developed to extract these features from the IPS plot. The NFE mitigates the impact of noise by adopting a thresholding technique. The amplitudes (A1, A2, A3) of the extracted fault features ( $f_{x 1}, f_{x 2}, f_{x 3}$ ) are measured and compared with the benchmark data to verify the amplitude variation. An ensemble learning classifier, CatBoost, has been developed to classify various machine health conditions using individual features as well as combinations of the extracted features. The proposed method gives better classification accuracies than previously published techniques. The comparison of the classification accuracy of the developed feature extraction technique with other state-of-the-art techniques reported in the literature demonstrates the significance of this research.

Footnotes

Appendix A

The mathematical model, values of the parameters, and sample calculations for Table 1 are given below:

\begin{matrix} \text{Synchronous speed} = 1500\ \text{rpm} \\ \text{Measured speed} = 1377\ \text{rpm} \\ \text{Fundamental frequency} = f_{f} = 50\ \text{Hz} \\ \text{Pitch diameter} = D_{p} = 25\ \text{mm} \\ \text{Ball diameter} = D_{b} = 6\ \text{mm} \\ θ = 0^{\circ} \\ \text{Slip} = \frac{\text{Synchronous speed} - \text{Measured speed}}{\text{Synchronous speed}} = \frac{1500 - 1377}{1500} = 0.082 \\ \text{Shaft rotational frequency} = \frac{f_{f}}{2} (1 - \text{slip}) = \frac{50}{2} (1 - 0.082) = 22.96\ \text{Hz} \end{matrix}

Calculations for bearing ball features:

(A1)

\begin{matrix} \text{BB feature } (f_{b 1}) = \frac{D_{p}}{D_{b}} \times \text{Shaft rotational frequency} \times \left( 1 - \frac{D_{b}^{2}}{D_{p}^{2}} \cos θ \right) \\ = \frac{25}{6} \times 22.96 \times \left( 1 - \frac{6^{2}}{25^{2}} \cos 0 \right) = 90\ \text{Hz} \end{matrix}

(A2)

\begin{matrix} \text{BB sidebands } (f_{b 2}, f_{b 3}) = | 2 f_{f} \pm f_{b 1} | \\ = | 2 \times 50 \pm 90 | = 10\ \text{Hz},\ 190\ \text{Hz} \end{matrix}

Calculations for bearing cage features:

(A3)

\text{BC feature } (f_{c 1}) = \frac{\text{Shaft rotational frequency}}{2} \times \left( 1 - \frac{D_{b}}{D_{p}} \cos θ \right)

(A4)

\begin{matrix} \text{BC feature } (f_{c 1}) = \frac{22.96}{2} \times \left( 1 - \frac{6}{25} \cos 0 \right) = 8.6\ \text{Hz} \\ \text{BC sidebands } (f_{c 2}, f_{c 3}) = | 2 f_{f} \pm f_{c 1} | = | 2 \times 50 \pm 8.6 | = 91.4\ \text{Hz},\ 108.6\ \text{Hz} \end{matrix}

Calculations for bearing inner race features:

(A5)

\text{IRF feature } (f_{i 1}) = 0.6 \times 8 \times \text{Shaft rotational frequency}

(A6)

\begin{matrix} \text{IRF feature } (f_{i 1}) = 0.6 \times 8 \times 22.96 = 110.2\ \text{Hz} \\ \text{IRF sidebands } (f_{i 2}, f_{i 3}) = | 2 f_{f} \pm f_{i 1} | = | 2 \times 50 \pm 110.2 | = 10.2\ \text{Hz},\ 210.2\ \text{Hz} \end{matrix}

Calculations for bearing outer race features:

(A7)

\text{ORF feature } (f_{o 1}) = 0.4 \times 8 \times \text{Shaft rotational frequency}

(A8)

\begin{matrix} \text{ORF feature } (f_{o 1}) = 0.4 \times 8 \times 22.96 = 73.4\ \text{Hz} \\ \text{ORF sidebands } (f_{o 2}, f_{o 3}) = | 2 f_{f} \pm f_{o 1} | = | 2 \times 50 \pm 73.4 | = 26.6\ \text{Hz},\ 173.4\ \text{Hz} \end{matrix}
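The calculations of this appendix can be checked with a short script. The sketch below follows the formulas exactly as written above; the function names are hypothetical, and the factor 8 in the race formulas is assumed here to be the number of balls, which the appendix does not state explicitly. Values evaluated this way can differ slightly from the rounded entries of Table 1.

```python
import math


def slip(sync_rpm, measured_rpm):
    return (sync_rpm - measured_rpm) / sync_rpm


def shaft_frequency(f_supply, s):
    # Divisor 2 as in Appendix A (consistent with 1500 rpm synchronous speed at 50 Hz).
    return f_supply / 2.0 * (1.0 - s)


def ball_feature(f_shaft, d_pitch, d_ball, theta=0.0):
    return d_pitch / d_ball * f_shaft * (1.0 - d_ball**2 / d_pitch**2 * math.cos(theta))


def cage_feature(f_shaft, d_pitch, d_ball, theta=0.0):
    return f_shaft / 2.0 * (1.0 - d_ball / d_pitch * math.cos(theta))


def inner_race_feature(f_shaft, n_balls=8):
    return 0.6 * n_balls * f_shaft  # n_balls = 8 is an assumption (see lead-in)


def outer_race_feature(f_shaft, n_balls=8):
    return 0.4 * n_balls * f_shaft


def sidebands(f_supply, f_feature):
    """Lower and upper sidebands |2 f_f -/+ f_feature|."""
    return abs(2.0 * f_supply - f_feature), abs(2.0 * f_supply + f_feature)


if __name__ == "__main__":
    s = slip(1500, 1377)
    f_shaft = shaft_frequency(50.0, s)
    print("slip:", round(s, 3), "shaft frequency (Hz):", round(f_shaft, 2))
    for name, f in [("BB", ball_feature(f_shaft, 25.0, 6.0)),
                    ("BC", cage_feature(f_shaft, 25.0, 6.0)),
                    ("IRF", inner_race_feature(f_shaft)),
                    ("ORF", outer_race_feature(f_shaft))]:
        print(name, round(f, 1), [round(x, 1) for x in sidebands(50.0, f)])
```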

Appendix B

The threshold (γ) has been calculated using the following procedure.¹⁹

(B1)

\emptyset = \sqrt{σ^{2} / N} = \sqrt{σ_{1}^{2}}

(B2)

k = \frac{γ}{\emptyset}

(B3)

γ = k \emptyset

Where:

$γ$ is the designed threshold,

$\emptyset$ is the noise variance of the measured signal, which is $1.06565 \times 10^{-4}$,

k is the design parameter. Its value is selected to obtain an appropriate probability of false detection. In this paper, k = 1.98 is selected, which corresponds to a probability of wrong detection of 2.3%.

$γ = k \emptyset = 1.98 \times 1.06565 \times 10^{-4} = 0.000211$, that is, $20 \log_{10}(0.000211) \approx -73.5$ dB
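The arithmetic of the threshold can be verified with the snippet below. It is a minimal check under the assumption that the linear value is converted with the amplitude convention 20·log10, which is the convention that reproduces the −73.5 dB quoted above; the function names are illustrative.

```python
import math


def threshold_linear(k, noise_term):
    """Equation (B3): threshold as the design parameter times the noise term."""
    return k * noise_term


def to_db(value):
    """Amplitude-style decibel conversion (assumed convention)."""
    return 20.0 * math.log10(value)


if __name__ == "__main__":
    gamma = threshold_linear(1.98, 1.06565e-4)
    print(f"gamma = {gamma:.6f} ({to_db(gamma):.1f} dB)")
```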

Appendix C

The details of the parameters of the classifier are given below:

learning_rate 0.1

reg_lambda 0.43 (Coefficient at the L2 regularization term of the cost function.)

depth 7

min_data_in_leaf 20 (The minimum number of training samples in a leaf. CatBoost does not search for new splits in leaves with a sample count less than the specified value.)

Acknowledgements

The authors would like to acknowledge the support of the Deputy for Research and Innovation, Ministry of Education, Kingdom of Saudi Arabia for funding this research through grant code (NU/IFC/ENT/01/011) under the Institutional Funding Committee at Najran University, Kingdom of Saudi Arabia.

Handling Editor: Chenhui Liang

Author contributions

M.I performed the experiments, data analysis, visualization, and paper writing. M.I, A.S.A, and A.G performed project and resource management. K.S.Q and A.A contributed to algorithm development. A.G, S.R, S.A, F.S.A, F.A, S.M.G, and H.A performed editing, resource management, and data visualization. M.K.A and O.A performed editing and re-writing of the paper.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Ministry of Education, Saudi Arabia through a grant (NU/IFC/ENT/01/011) under the institutional funding committee of the Najran University Saudi Arabia.

ORCID iDs

Muhammad Irfan

Adam Glowacz

Saifur Rahman

Omar Alshorman

References

Gülich

JF.

Centrifugal pumps. Berlin/Heidelberg: Springer, 2010.

Jiang

Heng

Liu

, et al. A review of design considerations of centrifugal pump capability for handling inlet gas-liquid two-phase flows. Energies 2019; 12: 1078.

Stel

Ofuchi

Sabino

RHG

, et al. Investigation of the motion of bubbles in a centrifugal pump impeller. J Fluid Eng 2019; 141: 031203.

Irfan

Glowacz

Design of a novel electric diagnostic technique for fault analysis of centrifugal pumps. Appl Sci 2019; 9: 5093.

Cui

Zhang

, et al. Investigation on centrifugal pump performance degradation under air-water inlet two-phase flow conditions. Houille Blanche 2018; 104: 41–48.

Dalvand

Kang

Dalvand

, et al. Detection of generalized-roughness and single-point bearing faults using linear prediction-based current noise cancellation. IEEE Trans Ind Electron 2018; 65: 9728–9738.

AlShorman

Alkahatni

Masadeh

, et al. Sounds and acoustic emission-based early fault diagnosis of induction motor: a review study. Adv Mech Eng 2021; 13: 1–19.

Glowacz

Kozik

, et al. Detection of deterioration of three-phase induction motor using vibration signals. Meas Sci Rev 2019; 19: 241–249.

Silvestri

Forcina

Introna

, et al. Maintenance transformation through industry 4.0 technologies: a systematic literature review. Comput Ind 2020; 123: 103335.

10.

Tortorella

Fogliatto

Cauchick-Miguel

, et al. Integration of Industry 4.0 technologies into total productive maintenance practices. Int J Prod Econ 2021; 240: 108224.

11.

Irfan

Modeling of fault frequencies for distributed damages in bearing raceways. J Nondestruct Eval 2019; 38: 1–10.

12.

Riera-Guasp

Antonino-Daviu

Capolino

GA.

Advances in electrical machine, power electronic, and drive condition monitoring and fault detection: State of the art. IEEE Trans Ind Electron 2015; 62: 1746–1759.

13.

Kumar

Singh

Outer race defect width measurement in taper roller bearing using discrete wavelet transform of vibration signal. Measurement 2013; 46: 537–545.

14.

Kulkarni

Bewoor

Vibration based condition assessment of ball bearing with distributed defects. J Meas Eng 2016; 4: 87–94.

15.

Kuruppu

Kulatunga

NA.

D-Q current signature-based faulted phase localization for SM-PMAC machine drives. IEEE Trans Ind Electron 2015; 62: 113–121.

16.

Irfan

Saad

Alwadie

, et al. An automated feature extraction algorithm for diagnosis of gear faults. J Fail Anal Prev 2019; 19: 98–105.

17.

Esakimuthu Pandarakone

Mizuno

Nakamura

. A comparative study between machine learning algorithm and artificial intelligence neural network in detecting minor bearing fault of induction motors. Energies 2019; 12: 2105.

18.

Irfan

Alwadie

Glowacz

, et al. A novel feature extraction and fault detection technique for the intelligent fault identification of water pump bearings. Sensors 2021; 21: 4225.

19.

Saad

Irfan

Ibrahim

. Condition monitoring and faults diagnosis of induction motors: electrical signature analysis. Boca Raton, FL: CRC Press & Routledge – Taylor & Francis Group, 2018.

20.

Irfan

Saad

Ibrahim

, et al. An assessment on the non-invasive methods for condition monitoring of induction motors. In: Fault diagnosis and detection. London: InTech Publishing, 2017.

21.

Irfan

Saad

Ibrahim

, et al. An intelligent diagnostic condition monitoring system for AC motors via instantaneous power analysis. Int Rev Electr Eng 2013; 8: 664–672.

22.

Sheikh

Nor

Ibrahim

. Non-Invasive methods for condition monitoring and electrical fault diagnosis of induction motors. In: Fault diagnosis and detection. London: InTech Publishing, 2017.

23. Singh, Kumar. Detection of bearing faults in mechanical systems using stator current monitoring. IEEE Trans Ind Inform 2017; 13: 1341–1349.

24. Vakharia, Gupta, Kankar PK. A comparison of feature ranking techniques for fault diagnosis of ball bearing. Soft Comput 2016; 20: 1601–1619.

25. Gao, Cecati, Ding SX. A survey of fault diagnosis and fault-tolerant techniques – Part I: Fault diagnosis with model-based and signal-based approaches. IEEE Trans Ind Electron 2015; 62: 3757–3767.

26. Vakharia, Gupta, Kankar PK. Ball bearing fault diagnosis using supervised and unsupervised machine learning methods. Int J Acoust Vib 2015; 20: 244–250.

27. Dolenc, Boškoski, Pfajfar, et al. Vibration based diagnosis of distributed bearing faults. In: Vibration engineering and technology of machinery, proceedings of VETOMAC X, 2014. Manchester, UK: University of Manchester.

28. Irfan, Saad, Ibrahim, et al. An online fault diagnosis system for induction motors via instantaneous power analysis. Tribol Trans 2017; 60: 592–604.

29. Irfan, Saad, Ibrahim, et al. Condition monitoring of induction motors via instantaneous power analysis. J Intell Manuf 2017; 28: 1259–1267.

30. Medrano Hurtado, Tello, Sarduy. A review on location, detection and fault diagnosis in induction machines. J Eng Sci Technol Rev 2015; 8: 185–195.

31. Irfan, Saad, Ibrahim, et al. A non-invasive method for condition monitoring of induction motors operating under arbitrary loading conditions. Arab J Sci Eng 2016; 41: 3463–3471.

32. Eftekharnejad, Carrasco, Charnley, et al. The application of spectral kurtosis on acoustic emission and vibrations from a defective bearing. Mech Syst Signal Process 2011; 25: 266–284.

33. Glowacz, et al. Fault diagnosis of three phase induction motor using current signal, MSAF-Ratio15 and selected classifiers. Arch Metall Mater 2017; 62: 2413–2419.

34. Liang, et al. Application of bandwidth EMD and adaptive multiscale morphology analysis for incipient fault diagnosis of rolling bearings. IEEE Trans Ind Electron 2017; 64: 6506–6517.

35. Irfan, Saad, Ibrahim, et al. An intelligent fault diagnosis of induction motors in an arbitrary noisy environment. J Nondestruct Eval 2016; 35: 1–13.

36. Gunasekaran, Esakimuthu Pandarakone, Asano, et al. Condition monitoring and diagnosis of outer raceway bearing fault using support vector machine. In: Proceedings of the international conference on condition monitoring and diagnosis (CMD 2018), 23–26 September 2018, Perth, WA, Australia, pp.1–6.

37. Frosini, Harlisca, Szabo. Induction machine bearing fault detection by means of statistical processing of the stray flux measurement. IEEE Trans Ind Electron 2015; 62: 1846–1854.

38. AlShorman, Irfan, Saad, et al. A review of artificial intelligence methods for condition monitoring and fault diagnosis of rolling element bearings for induction motor. Shock Vib 2020; 2020: 1–20.

39. Faiz, Takbash, Mazaheri-Tehrani. A review of application of signal processing techniques for fault diagnosis of induction motors – Part I. AUT J Electr Eng 2017; 49: 109–122.

40. Nandi, Toliyat. Condition monitoring and fault diagnosis of electrical motors: a review. IEEE Trans Energy Convers 2005; 20: 719–729.

41. Faiz, Ebrahimi, Sharifian MBB. Different faults and their diagnosis techniques in three-phase squirrel-cage induction motors: a review. Electromagnetics 2006; 26: 543–569.

42. Elforjani, Shanbr. Prognosis of bearing acoustic emission signals using supervised machine learning. IEEE Trans Ind Electron 2018; 65: 5864–5871.

43. Soualhi, Razik, Clerc, et al. Prognosis of bearing failures using hidden Markov models and the adaptive neuro-fuzzy inference system. IEEE Trans Ind Electron 2014; 61: 2864–2874.

44. Tayyab, Asghar, Pennacchi, et al. Intelligent fault diagnosis of rotating machine elements using machine learning through optimal features extraction and selection. Procedia Manuf 2020; 51: 266–273.

45. Orrù, Zoccheddu, Sassu, et al. Machine learning approach using MLP and SVM algorithms for the fault prediction of a centrifugal pump in the oil and gas industry. Sustainability 2020; 12: 4776.

46. Jin, Yan, Chen, et al. Light neural network with fewer parameters based on CNN for fault diagnosis of rotating machinery. Measurement 2021; 181: 109639.

47. Liu, Yang, Zio, et al. Artificial intelligence for fault diagnosis of rotating machinery: A review. Mech Syst Signal Process 2018; 108: 33–47.

48. Vakharia, Gupta, Kankar. A multiscale permutation entropy based approach to select wavelet for fault diagnosis of ball bearings. J Vib Control 2015; 21: 3123–3131.

49. Dave, Singh, Vakharia. Diagnosis of bearing faults using multi fusion signal processing techniques and mutual information. Ind J Eng Mater Sci 2020; 27: 878–888.

50. Yuan, Zhou, Liu, et al. Fault diagnosis approach for rotating machinery based on feature importance ranking and selection. Shock Vib 2021; 2021: 1–17.

51. Gareev, Minaev, Stadnik, et al. Machine-learning algorithms for helicopter hydraulic faults detection: model based research. J Phys Conf Ser 2019; 1368: 1–6.

52. Long, Zhang, et al. Motor fault diagnosis using attention mechanism and improved adaboost driven by multi-sensor information. Measurement 2021; 170. DOI: 10.1016/j.measurement.2020.108718.

53. Zhang, Jia, et al. A diagnosis method for the compound fault of gearboxes based on multi-feature and BP-AdaBoost. Symmetry 2020; 12: 461.

54. Nishat Toma, Kim. Bearing fault classification of induction motors using discrete wavelet transform and ensemble machine learning algorithms. Appl Sci 2020; 10: 5251.