Sage Journals: Discover world-class research

Abstract

In bearing fault diagnosis using the convolution neural network (CNN), signal data processing and network parameter setting are both complex. To solve these problems, a rolling bearing fault diagnosis method is proposed based on improved particle swarm optimization and convolution neural networks with wide kernels in the first layer (IPSO-WCNN). A particle self-adaptive jump out algorithm is proposed to overcome the shortcomings of particle swarm optimization (PSO), and an adaptive inertia weight and linearly changing acceleration coefficients are adopted to form the improved particle swarm optimization (IPSO). A fault diagnosis method using convolution neural networks with wide kernels in the first layer (WCNN) is proposed for one-dimensional rolling bearing vibration signals, and the parameters of the WCNN are optimised by IPSO. According to the verification experiments, the proposed method achieves higher accuracy than the compared methods with good adaptability.

Keywords

Particle swarm optimization, convolution neural network, rolling bearing, fault diagnosis

Introduction

Rolling bearings are widely used in rotating machinery, and about 50% of rotating machinery failures are related to bearings.^1,2 The failure and degradation of bearings can cause unexpected machine breakdowns, leading to large economic losses and even serious human injuries. Therefore, effective fault diagnosis technology can help identify bearing faults at an early stage and improve the reliability of mechanical equipment.

The complex operating environment and huge data size in modern equipment have made traditional methods unable to meet the requirements of fault diagnosis.³ With the rapid development of machine learning, intelligent fault diagnosis methods based on data-driven technologies have been widely used in recent years. Limited by their shallow network structure, traditional intelligent fault diagnosis methods have difficulty extracting deep, subtle fault features.⁴ Researchers have improved the performance of intelligent fault diagnosis by improving fault feature extraction methods and introducing shallow intelligent learning algorithms.^5–7 However, the diagnostic results remain unsatisfactory, and these methods are difficult to apply to huge data sizes.

Hinton⁸ proposed the theory of deep learning to overcome the shortcomings of traditional intelligent diagnosis methods. Recently, deep learning has become a hotspot in the field of fault diagnosis. Shao et al.⁹ designed a deep wavelet auto-encoder network, using an extreme learning machine as a classifier for rolling bearings fault diagnosis. The researchers captured signal features with a wavelet auto-encoder with a wavelet function. Jiang et al.¹⁰ designed an improved deep recurrent neural network for rolling bearings fault diagnosis, using spectrum sequences as inputs. Zhu et al.¹¹ combined principal component analysis with the deep belief network for the bearings fault diagnosis, adopting the principal component analysis method to reduce the dimension of raw bearing vibration signals.

As an important deep learning algorithm, CNN has advantages in feature extraction and processing big data.^12–14 Ahmed and Nandi¹⁵ converted vibration signals into 2D grayscale images for CNN inputs to classify bearing faults. Guo et al.¹⁶ processed vibration signals by wavelet transform and simplified parameter tuning by optimising the pooling layer of CNN. Wen et al.¹⁷ constructed CNN based on Lenet-5 and converted the original signals into two-dimensional images as inputs for fault diagnosis. The above cases show that CNN is an effective method for fault diagnosis. However, there are two drawbacks. First, the original CNN model is designed for image recognition and classification, so when it is applied to bearing fault diagnosis, the inputs must be two-dimensional images. Second, CNN has a large number of parameters. Designing the model requires repeated manual tuning to find a reasonable configuration, so the model has poor self-adaptability.

Researchers constructed new CNNs to meet the structure of the original signals and solve the first problem of CNN. Eren¹⁸ proposed a one-dimensional CNN detection system for motor bearing with better detection accuracy. Chen et al.¹⁹ designed a one-dimensional CNN based on original signals and introduced the dropout layer to improve the accuracy of bearing fault diagnosis. Zhang et al.²⁰ designed a deep one-dimensional CNN and improved the fault diagnosis accuracy by increasing the width of the convolution kernel in the first convolution layer.

Researchers have studied parameters optimisation in CNN to solve the second problem of CNN. Syulistyo et al.²¹ used PSO to optimise the output vector of CNN, improving the recognition accuracy of the handwritten digit from MNIST. Wang et al.²² established CNN with two convolutional and pooling layers for bearing fault diagnosis and used PSO to optimise the key parameters of CNN, improving the adaptability of the model. Chen et al.²³ constructed CNN with three convolutional and pooling layers for bearing fault diagnosis, using short-time Fourier transform for signal processing, and the key parameters of CNN were solved by PSO.

The IPSO-WCNN method for bearing fault diagnosis is proposed to solve these problems simultaneously. The main contributions of this study are as follows:

WCNN is proposed, which is adapted to the original signals to simplify the process of bearing fault diagnosis. The original signal data can be taken directly as the input of WCNN.

IPSO with particle self-adaptive jump out algorithm, adaptive inertia weight and linear change acceleration coefficients is proposed to improve the optimisation performance of the PSO algorithm.

IPSO is used to optimise the key parameters in WCNN to improve the adaptability of WCNN.

The remainder of this paper is organised as follows. Section 2 introduces the vibratory mechanism of rolling bearing and the basic theory of PSO and CNN. In Section 3, the IPSO-WCNN intelligent diagnostic method is proposed. In Section 4, the IPSO-WCNN method is verified by experiments, and the results are discussed. Conclusions are drawn in Section 5.

Related theory

Vibratory mechanism of rolling bearing

Vibration is always present when a rolling bearing is working, and it is related to the structure, working environment, assembly, running characteristics and other factors of the rolling bearing. The main factors of vibration are detailed as follows: First, the rolling balls and the inner or outer ring of the bearing press against each other when the bearing is under load, which can lead to a vibration impact. Generally, the load on a rolling ball becomes smaller as it moves away from the load centre. Second, manufacturing problems such as diameter errors, axis offset, excessive clearance and poor surface quality can lead to bearing vibration. Third, the elastic deformation caused by improper bearing installation leads to a misalignment of shafts and an unsatisfactory interference fit between the shaft and inner ring, which can also lead to bearing vibration. Fourth, when deformation faults such as cracks and peeling occur in rolling bearings, an impact is generated when a rolling ball runs over the fault location. Fifth, foreign bodies and external forces in the working environment lead to abnormal vibration signals of bearings.

When a bearing fails, new frequency characteristics appear in the vibration signals, and the fault information can be diagnosed by spectral analysis. When the contact surfaces of the rolling balls and the inner or outer ring of the bearing are damaged, an impact is generated each time the fault location is passed over. This impact excites the bearing system and produces vibration. This kind of vibration is characterised as a low-frequency periodic impact, and its repetition frequency is the fault frequency. The fault type of the bearing can be identified by comparison with the theoretical characteristic fault frequencies.²⁴ The characteristic fault frequencies are given as follows:

f_{bf} = \frac{D}{2 d} | f_{ir} - f_{or} | (1 - \frac{d^{2}}{D^{2}} \cos^{2} α)

(1)

f_{if} = \frac{Z}{2} | f_{ir} - f_{or} | (1 + \frac{d}{D} \cos α)

(2)

f_{of} = \frac{Z}{2} | f_{ir} - f_{or} | (1 - \frac{d}{D} \cos α)

(3)

f_{cf} = \frac{1}{2} | f_{ir} - f_{or} | (1 - \frac{d}{D} \cos α)

(4)

where $f_{bf}$ , $f_{if}$ , $f_{of}$ and $f_{cf}$ are the corresponding characteristic fault frequencies of the rolling ball, inner ring, outer ring and cage, respectively. Further, Z is the number of rolling balls, d is the diameter of the rolling ball, D is the pitch diameter of the bearing, $f_{ir}$ and $f_{or}$ are the rotating frequencies of the inner ring and outer ring and $α$ is the contact angle.
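As a worked illustration of equations (1) to (4), the following minimal Python sketch evaluates the four characteristic frequencies for a given geometry. The function name `bearing_fault_frequencies` and the geometry values in the example call are hypothetical and chosen only for illustration; they are not measurements from this study.

```python
import math

def bearing_fault_frequencies(Z, d, D, alpha_deg, f_ir, f_or=0.0):
    """Characteristic fault frequencies (Hz) following equations (1)-(4).

    Z: number of rolling balls, d: ball diameter, D: pitch diameter,
    alpha_deg: contact angle in degrees, f_ir / f_or: ring rotation frequencies (Hz).
    """
    f_rel = abs(f_ir - f_or)                       # |f_ir - f_or|
    ratio = d / D
    cos_a = math.cos(math.radians(alpha_deg))
    return {
        "ball": D / (2 * d) * f_rel * (1 - ratio ** 2 * cos_a ** 2),  # eq. (1)
        "inner": Z / 2 * f_rel * (1 + ratio * cos_a),                 # eq. (2)
        "outer": Z / 2 * f_rel * (1 - ratio * cos_a),                 # eq. (3)
        "cage": 0.5 * f_rel * (1 - ratio * cos_a),                    # eq. (4)
    }

# Hypothetical geometry (illustrative only), fixed outer ring, shaft at 1797 rpm.
freqs = bearing_fault_frequencies(Z=9, d=8.0, D=39.0, alpha_deg=0.0, f_ir=1797 / 60)
for name, value in freqs.items():
    print(f"{name}: {value:.2f} Hz")
```

Two relations that follow directly from the formulas make a quick sanity check: the inner and outer ring frequencies sum to Z times the relative rotation frequency, and the outer ring frequency is Z times the cage frequency.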

Basic theory of PSO

The PSO algorithm is simple to implement, practical and fast to compute, and it is widely used for optimisation problems.^25,26 Each particle has a position $X_{i, t} = (x_{i, t}^{1}, x_{i, t}^{2}, \dots, x_{i, t}^{D})$ and a velocity $V_{i, t} = (v_{i, t}^{1}, v_{i, t}^{2}, \dots, v_{i, t}^{D})$ , where $i = 1, 2, \dots, N$ . Further, N is the population size of the swarm, D is the dimension of the optimisation problem and t is the current iteration number. The particles are evaluated by the fitness function. In each iteration, the local best position of every particle $P_{ibest} = (p_{i 1}, p_{i 2}, \dots, p_{iD})$ is updated according to the fitness function. Additionally, the global best position $P_{gbest} = (p_{g 1}, p_{g 2}, \dots, p_{gD})$ is taken as the best of the local best positions in the swarm. To find the optimal position, the velocity and position of the particles are updated as follows:

\begin{matrix} V_{i} (t + 1) = ω V_{i} (t) + c_{1} r_{1} [P_{ibest} - X_{i} (t)] \\ + c_{2} r_{2} [P_{gbest} - X_{i} (t)] \end{matrix}

(5)

X_{i} (t + 1) = X_{i} (t) + V_{i} (t + 1)

(6)

where $ω$ is the inertia weight of particles, $c_{1}$ and $c_{2}$ are the acceleration coefficients, commonly set to 2.0, and $r_{1}$ and $r_{2}$ are two random values in the range [0, 1].
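The update rule in equations (5) and (6) can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: the toy objective `sphere` is minimised here purely for illustration (the method in this paper maximises verification accuracy instead), $r_{1}$ and $r_{2}$ are redrawn for every dimension, and the helper name `pso_step` and the fixed value $ω = 0.7$ are hypothetical choices.

```python
import random

def pso_step(positions, velocities, pbest, gbest, omega=0.7, c1=2.0, c2=2.0, rng=random):
    """Apply equations (5) and (6) once to every particle, in place."""
    for i in range(len(positions)):
        for k in range(len(positions[i])):
            r1, r2 = rng.random(), rng.random()   # assumption: fresh draws per dimension
            velocities[i][k] = (omega * velocities[i][k]
                                + c1 * r1 * (pbest[i][k] - positions[i][k])
                                + c2 * r2 * (gbest[k] - positions[i][k]))   # eq. (5)
            positions[i][k] += velocities[i][k]                             # eq. (6)

def sphere(x):
    """Toy objective to minimise (illustrative stand-in for a real fitness)."""
    return sum(v * v for v in x)

rng = random.Random(0)
N, D = 15, 3
positions = [[rng.uniform(-5, 5) for _ in range(D)] for _ in range(N)]
velocities = [[0.0] * D for _ in range(N)]
pbest = [p[:] for p in positions]
gbest = min(pbest, key=sphere)[:]
initial_best = sphere(gbest)

for _ in range(50):
    pso_step(positions, velocities, pbest, gbest, rng=rng)
    for i, x in enumerate(positions):
        if sphere(x) < sphere(pbest[i]):   # keep each particle's best position so far
            pbest[i] = x[:]
    gbest = min(pbest, key=sphere)[:]      # best of all local best positions

print(initial_best, sphere(gbest))
```

Because each local best position is replaced only when the objective improves, the global best value can never get worse from one iteration to the next.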

Basic theory of CNN

CNN has a strong feature extraction capability by constructing multiple filters and extracting the input features from the layers.^27,28 Generally, CNN is composed of the input layer, convolutional layer, pooling layer, full-connected layer and output layer.²⁹

(1) Input layer: the original CNN model is designed for image recognition and classification, so the input of CNN usually is two-dimensional image data.

(2) Convolutional layer: convolutional operation is the key of CNN. The convolutional layer traverses the local input regions with convolutional kernels and outputs the corresponding features. The convolutional layer parameters are reduced by weight-sharing, which uses the same kernel function in each filter to extract the features of the local input region.²⁷ The operation of the convolutional layer is described as follows:

c_{i, j}^{l} = K_{i}^{l} * X_{j}^{l - 1} = \sum_{n \in S} k_{i}^{l} (n) \times x_{j}^{l - 1} (n)

(7)

where $K_{i}^{l}$ is the convolutional kernel in channel i of layer l, $X_{j}^{l - 1}$ is the input in the j-th local region of layer l−1, $c_{i, j}^{l}$ is the corresponding output, $k_{i}^{l} (n)$ is the n-th value of $K_{i}^{l}$ , $x_{j}^{l - 1} (n)$ is the n-th value of $X_{j}^{l - 1}$ and S is the size of the convolutional kernel.
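For a one-dimensional signal, equation (7) reduces to sliding one kernel over the input and taking a dot product in each local region. The following minimal sketch shows this for a single channel; the function name `conv1d`, the `stride` argument and the absence of padding are assumptions made for illustration.

```python
def conv1d(signal, kernel, stride=1):
    """Single-channel 1D convolution in the CNN sense of equation (7), without padding."""
    width = len(kernel)
    outputs = []
    for start in range(0, len(signal) - width + 1, stride):
        region = signal[start:start + width]                        # j-th local region
        outputs.append(sum(k * x for k, x in zip(kernel, region)))  # sum over the kernel
    return outputs

signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 0.0, -1.0]          # shared weights: the same kernel is used in every region
features = conv1d(signal, kernel, stride=1)
print(features)                    # [-2.0, -2.0, -2.0, -2.0]
```

The same three weights are reused at every position, which is the weight-sharing property described above. A larger stride shortens the output: with `stride=2` the same input yields two values instead of four.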

(3) Pooling layer: as the convolution operation increases the channels of the output, the dimension of the convolutional layer output increases sharply, leading to the curse of dimensionality. The pooling layer carries out the down-sampling operation, reducing the network parameters without losing the original features. The common pooling methods include maximum pooling and average pooling. The formula of the pooling layer is described as follows:

p_{i, j}^{l} = down (X_{i, j}^{l - 1})

(8)

where $down (\cdot)$ is the down-sampling function (pooling method), $X_{i, j}^{l - 1}$ is the j-th local region of channel i in layer l−1 and $p_{i, j}^{l}$ is the corresponding output.

(4) Full-connected layer: after the convolutional and the pooling operations, one or two full-connected layers integrate different types of local features usually. The formula of a full-connected layer is shown as follows:

f_{n}^{l + 1} = \sum_{m = 1}^{M} W_{m, n}^{l} x_{m}^{l} + b_{n}^{l}

(9)

where $W_{m, n}^{l}$ is the weight between the m-th neuron in layer l and n-th neuron in the next layer, $x_{m}^{l}$ is the m-th neuron in layer l, $b_{n}^{l}$ is the bias value of all neurons in layer l to n-th neuron in next layer, M is the number of neurons and $f_{n}^{l + 1}$ is the corresponding output in layer l + 1.

(5) Output layer: the output layer produces the final target result. Usually, its input is the output of the last full-connected layer.

Proposed fault diagnosis method based on IPSO-WCNN

The CNNs have already been applied to fault diagnosis. However, most of the CNNs are built based on two-dimensional image data as the input, which requires pre-processing original signals before fault diagnosis. The diagnosis process is complex, and the results are unsatisfactory. Therefore, a rolling bearing fault diagnosis method based on IPSO-WCNN is proposed. First, a deep network based on bearing signals named WCNN is designed. Second, the IPSO algorithm is proposed to improve the optimisation ability of the PSO algorithm. Finally, IPSO is used to optimise the parameters of WCNN, and the adaptability of the proposed method is improved.

Architecture of the proposed WCNN model

In two-dimensional image recognition, VGGnet has higher recognition accuracy with the convolutional kernels of $3 \times 3$ .³⁰ However, it is unfeasible for the one-dimensional vibration signal to use the convolutional kernels of $3 \times 3$ for all the convolutional layers. This will result in a very deep network, which is difficult to train, and the result is unsatisfactory. Besides, the vibration signal usually contains high-frequency noises in the industrial environment, and small convolutional kernels are easily disturbed. The function of the wide kernels is similar to that of Short-Time Fourier Transform (STFT). The difference lies in that the window function of STFT is a sine function, while the convolutional kernels of WCNN are obtained through optimisation algorithm by training. This makes the algorithm automatically learn effective features and delete interfering features for diagnosis. Therefore, the wide kernels in the first convolutional layer are used when designing the CNN model for the vibration signals. That’s why in the proposed model named WCNN, W means wide kernels in the first convolutional layer. The framework of the proposed WCNN is shown in Figure 1.

Figure 1.

Framework of proposed WCNN model.

The proposed WCNN model uses the vibration signal as input. The overall convolutional layers and pooling layers are designed based on one-dimensional size. There are four convolutional layers in the proposed WCNN model. The first convolutional layer adopts wide kernels to extract characteristics in the low-frequency signal, and the others adopt small kernels.

After the convolutional operation, activation functions are applied to obtain a nonlinear expression of the input, which enhances the representation ability. The sigmoid function, Hyperbolic Tangent (Tanh) function and Rectified Linear Unit (ReLU) function are widely used as activation functions in CNNs.³¹ The ReLU function can overcome the gradient diffusion (vanishing gradient) problem in CNN and accelerate its convergence. The formula of the ReLU function is described as follows:

a_{i, j}^{l} = f (x_{i, j}^{l}) = \max (0, x_{i, j}^{l})

(10)

where $x_{i, j}^{l}$ is the j-th local region of channel i in layer l, and $a_{i, j}^{l}$ is the corresponding output.

There are four pooling layers in the proposed WCNN model. Max-pooling is used in the pooling operation, which outputs the maximum from the perceptual domain to reduce the parameters. The max-pooling operation is described as follows:

p_{i, j}^{l} = \max_{t \in S} \{ x_{i, j}^{l - 1} (t) \}

(11)

where $x_{i, j}^{l - 1} (t)$ is the t-th value of the j-th local region of channel i in layer l−1. Further, S is the size of the pooling region, and $p_{i, j}^{l}$ is the corresponding output.
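Equations (10) and (11) can be illustrated on a short feature map. In this minimal sketch the pooling windows do not overlap (the stride equals the pooling size) and a trailing partial window is dropped; both are assumptions, as are the function names `relu` and `max_pool1d`.

```python
def relu(values):
    """Equation (10): element-wise max(0, x)."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, size):
    """Equation (11): maximum of each non-overlapping window of length `size`."""
    return [max(values[s:s + size]) for s in range(0, len(values) - size + 1, size)]

feature_map = [-1.5, 0.2, 3.0, -0.7, 0.9, 0.4, -2.0, 1.1]
activated = relu(feature_map)            # negative responses are set to zero
pooled = max_pool1d(activated, size=2)   # length is halved
print(pooled)                            # [0.2, 3.0, 0.9, 1.1]
```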

Batch normalisation is used after the convolutional layers or full-connected layers before the activation operation to reduce the variance deviation of the input and accelerate the training process of the network. The batch normalisation is described as follows:

μ_{x} = \frac{1}{m} \sum_{i = 1}^{m} x_{i}

(12)

σ_{x} = \sqrt{\frac{1}{m} \sum_{i = 1}^{m} {(x_{i} - μ_{x})}^{2}}

(13)

\hat{x_{i}} = \frac{x_{i} - μ_{x}}{\sqrt{σ_{x}^{2} + ε}}

(14)

y_{i} = γ \hat{x_{i}} + β

(15)

where $x_{i}$ is the i-th value of input x, m is the batch size, $μ_{x}$ is the mean of x, $σ_{x}$ is the standard deviation of x, $ε$ is a small value that prevents the divisor from being zero, $\hat{x_{i}}$ is the normalised value of $x_{i}$ , $γ$ and $β$ are the scale and offset parameters in batch normalisation, respectively, and $y_{i}$ is the output of the batch normalisation.
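The four steps in equations (12) to (15) can be traced on a small batch of scalar activations. The sketch below is a minimal illustration of the training-time computation only; the function name `batch_norm` and the default values of `gamma`, `beta` and `eps` are assumptions, and running statistics for inference are not covered.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one batch of scalar activations following equations (12)-(15)."""
    m = len(batch)
    mean = sum(batch) / m                                   # eq. (12)
    var = sum((x - mean) ** 2 for x in batch) / m           # sigma_x squared, eq. (13)
    normed = [(x - mean) / math.sqrt(var + eps) for x in batch]   # eq. (14)
    return [gamma * x_hat + beta for x_hat in normed]       # eq. (15)

normalised = batch_norm([1.0, 2.0, 3.0, 4.0])
print(normalised)
```

With `gamma = 1` and `beta = 0` the output has zero mean and a variance very close to one; the learned scale and offset then let the network restore any other range it needs.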

After the multiple convolutional and pooling layers, a full-connected layer is used to integrate the features. However, when there are large numbers of parameters for training with fewer samples, the trained model of CNN is weak in its generalisation ability, known as overfitting. The dropout layer is added after the full-connected layer to reduce the overfitting.³² The formula of the dropout layer is described as follows:

y^{l} = r^{l} x^{l - 1}

(16)

where $x^{l - 1}$ denotes the neurons in layer l−1, $r^{l}$ is the random dropout function of layer l, $r^{l} \sim Bernoulli (p^{l})$ , $p^{l}$ is the dropout rate in layer l and $y^{l}$ is the corresponding output.

The softmax function normalises the output layer to ensure the output of the model conforms to the form of the probability distribution for bearing health conditions. This function solves the multiple classification problems. The softmax function is described as follows:

y_{i} = softmax (x_{i}) = \frac{e^{x_{i}}}{\sum_{k} e^{x_{k}}}

(17)

where $x_{i}$ is the i-th neuron in the output layer, and $y_{i}$ is the corresponding output.

The cross-entropy between the estimated softmax output distribution and the target class distribution is adopted as the loss function of the proposed WCNN model. The cross-entropy is described as follows:

Loss = H (P, Q) = - \sum_{x} P (x) \log Q (x)

(18)

where $P (x)$ denotes the target class distribution, and $Q (x)$ denotes the estimated distribution.
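Equations (17) and (18) are easy to check numerically. The sketch below uses three hypothetical output values and a one-hot target; subtracting the maximum before exponentiation and adding a small `eps` inside the logarithm are common numerical safeguards introduced here as assumptions, not steps stated in the paper.

```python
import math

def softmax(logits):
    """Equation (17), computed with the usual max-shift for numerical stability."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, predicted, eps=1e-12):
    """Equation (18): H(P, Q) = -sum P(x) log Q(x)."""
    return -sum(p * math.log(q + eps) for p, q in zip(target, predicted))

logits = [2.0, 1.0, 0.1]          # hypothetical outputs of the last layer
probs = softmax(logits)
target = [1.0, 0.0, 0.0]          # one-hot target: the first class is correct
loss = cross_entropy(target, probs)
print(probs, loss)
```

For a one-hot target the loss reduces to the negative logarithm of the probability assigned to the correct class, so it falls towards zero as that probability approaches one.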

The proposed IPSO algorithm

The PSO algorithm can search for the optimal solution quickly by updating the position and velocity. However, the basic PSO algorithm has some problems, such as slow convergence and a tendency to become trapped in local optima.³³ The IPSO algorithm is proposed with the particle self-adaptive jump out algorithm, self-adaptive inertia weight and time-varying acceleration coefficients to improve the optimisation performance of the PSO algorithm. The procedure of the IPSO algorithm is shown in Figure 2.

(1) Particle self-adaptive jump out algorithm: the particle self-adaptive jump out algorithm is proposed to prevent the swarm from falling into a local optimum. The algorithm calculates the distance between each particle and the global best particle. If the distance is small, the particle should jump out from its current position. Generally, the particles are far apart at the beginning, and the swarm needs a larger search range. Later, the particles are closer together, and some of them need to jump out to prevent the swarm from falling into a local optimum. The particle self-adaptive jump out algorithm is described as follows:

L_{i}^{t} = \| X_{i}^{t} - P_{gbest}^{t} \|

(19)

R_{i}^{t} = \begin{cases} 0, & L_{i}^{t} \geq (1 - \frac{t}{t_{\max}}) \times \sum_{j = 1}^{N} \frac{L_{j}^{t}}{N} \\ 1, & L_{i}^{t} < (1 - \frac{t}{t_{\max}}) \times \sum_{j = 1}^{N} \frac{L_{j}^{t}}{N} \end{cases}

(20)

Figure 2.

Procedure of IPSO algorithm.

where $X_{i}^{t}$ is the position of particle i in iteration t, $P_{gbest}^{t}$ is the global best position in iteration t, $L_{i}^{t}$ is the distance between particle i and global best particle in iteration t, N is the population size of the swarm, $t_{\max}$ is the maximum iterations and $R_{i}^{t}$ is the jumping out decision of particle i in iteration t. If the value is 1, the particle needs to jump out from the current position. If not, the particle does not need to jump out.
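The jump out decision in equations (19) and (20) compares each particle's distance to the global best with a threshold that shrinks as the iterations proceed. The sketch below assumes the distance is Euclidean and that the threshold uses the mean distance over the whole swarm; the function name `jump_out_flags` is hypothetical. How a flagged particle is relocated is not specified by these equations, so that step is left out.

```python
import math

def jump_out_flags(positions, gbest, t, t_max):
    """Return R_i^t for every particle: 1 means 'jump out', 0 means 'stay'."""
    distances = [math.dist(x, gbest) for x in positions]            # eq. (19), Euclidean (assumed)
    threshold = (1 - t / t_max) * sum(distances) / len(distances)   # right-hand side of eq. (20)
    return [1 if dist < threshold else 0 for dist in distances]

positions = [[0.1, 0.0], [2.0, 0.0], [0.0, 4.0]]
gbest = [0.0, 0.0]
early = jump_out_flags(positions, gbest, t=0, t_max=20)
late = jump_out_flags(positions, gbest, t=19, t_max=20)
print(early, late)   # [1, 1, 0] [1, 0, 0]
```

In this example the mean distance is about 2.03, so at the first iteration both particles closer than that are flagged, while near the end only the particle almost on top of the global best is flagged.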

(2) Self-adaptive inertia weight: appropriate parameters can improve the convergence of the PSO algorithm. The inertial weight is very important for obtaining the optimal solution rapidly. At the beginning of the iteration, a greater weight of inertia should be used to make the particles fly to the global best position faster. Meanwhile, the inertia weight should be decreased later in the iteration to prevent the particles from missing the optimal position. Therefore, the self-adaptive inertia weight is adopted as follows:

ω (t) = (ω_{\max} - ω_{\min}) \times \sum_{i = 1}^{N} \frac{p_{i}^{t}}{N} + ω_{\min}

(21)

p_{i}^{t} = {\begin{matrix} 0 & f (P_{ibest}^{i}) < f (P_{ibest}^{i}) \\ 1 & f (P_{ibest}^{i}) \geq f (P_{ibest}^{i}) \end{matrix}

(22)

where $ω_{\min}$ and $ω_{\max}$ are the minimum and maximum of inertial weight, respectively, $p_{i}^{t}$ is the self-adaptive coefficient of the particle i in iteration t, $P_{ibest}^{i}$ is the local best position of the particle i in iteration t, N is the population size of the swarm, $f (\cdot)$ denotes the fitness function and $ω (t)$ is the inertia weight in iteration t.
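Equation (21) maps the fraction of particles with $p_{i}^{t} = 1$ linearly onto the interval between the minimum and maximum inertia weight. The sketch below takes the binary coefficients as given, assuming they have already been evaluated for the current iteration; the function name `adaptive_inertia_weight` is hypothetical, and the defaults 0.5 and 0.8 are the values used later in experiment 1.

```python
def adaptive_inertia_weight(success_flags, w_min=0.5, w_max=0.8):
    """Equation (21): inertia weight from the mean of the 0/1 coefficients p_i^t."""
    success_rate = sum(success_flags) / len(success_flags)
    return (w_max - w_min) * success_rate + w_min

w_all = adaptive_inertia_weight([1] * 15)               # every coefficient is 1
w_none = adaptive_inertia_weight([0] * 15)              # every coefficient is 0
w_third = adaptive_inertia_weight([1] * 5 + [0] * 10)   # one third are 1
print(w_all, w_none, w_third)
```

The weight therefore stays at its maximum while all coefficients are 1 and falls back to the minimum when none are, with intermediate swarms interpolating between the two.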

(3) Time-varying acceleration coefficients: acceleration constants (cognitive component and social component) affect the information exchange between the particles. At the beginning of the iteration, the cognitive component should be greater, enabling particles to find the local best position quickly. Meanwhile, later in the iteration, the social component should be greater, strengthening the information exchange between the particles. Therefore, the time-varying acceleration coefficients are adopted as follows:

c_{1} (t) = c_{1 \max} - (c_{1 \max} - c_{1 \min}) \times \frac{t}{t_{\max}}

(23)

c_{2} (t) = c_{2 \min} + (c_{2 \max} - c_{2 \min}) \times \frac{t}{t_{\max}}

(24)

where $c_{1 \min}$ and $c_{1 \max}$ are the minimum and maximum of the cognitive component, respectively, $t_{\max}$ is the maximum iterations, $c_{1} (t)$ is the cognitive component in iteration t, $c_{2 \min}$ and $c_{2 \max}$ are the minimum and maximum of the social component, respectively, and $c_{2} (t)$ is the social component in iteration t.
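The linear schedules in equations (23) and (24) can be evaluated directly. This minimal sketch assumes both components share the same range, with the defaults 0.5 and 2.5 taken from the values selected later in experiment 1; the function name `acceleration_coefficients` is hypothetical.

```python
def acceleration_coefficients(t, t_max, c_min=0.5, c_max=2.5):
    """Return (c1, c2) at iteration t following equations (23) and (24)."""
    frac = t / t_max
    c1 = c_max - (c_max - c_min) * frac   # cognitive component decreases linearly
    c2 = c_min + (c_max - c_min) * frac   # social component increases linearly
    return c1, c2

for t in (0, 10, 20):
    print(t, acceleration_coefficients(t, t_max=20))
# 0 (2.5, 0.5)
# 10 (1.5, 1.5)
# 20 (0.5, 2.5)
```

With a shared range the two coefficients cross at the midpoint of the run and their sum stays constant, so the schedule only shifts emphasis from individual search to information sharing.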

IPSO-WCNN intelligent diagnosis method

The key parameters of CNNs are usually determined by repeated trial and error based on experience, which weakens their adaptability. In this paper, the key parameters of the WCNN are optimised by IPSO to improve adaptability. The overall framework of IPSO-WCNN is shown in Figure 3 and mainly includes the following three steps: data processing, model building and model testing.

(1) Data processing: data processing contains vibration signals collection and making samples. Compared with the other operational parameters, such as temperature, voltage and pressure signals, vibration signals are widely applied in fault diagnosis due to the monitoring requirements.³⁴ In signals collection, many factors need to be considered, such as sensor type, installation location and collection parameters, which are related to the accuracy of the measurement results. After signals collection, vibration signals should be sliced into samples for the model. There are three types of samples required in the WCNN model. Therefore, the collected vibration signals are sliced into training samples, verification samples and testing samples.

(2) Model building: model building contains WCNN parameters optimisation and WCNN model training. The WCNN parameters optimisation is the key of the IPSO-WCNN method. The parameters of the WCNN model are set as the particle position in the IPSO algorithm. Further, the WCNN method uses the parameters in the particle position for training with the training samples and verification samples. The verification accuracy in the WCNN model is adopted as fitness for the iteration. After the stop conditions are reached, the IPSO algorithm stops the iteration and outputs the optimised parameters. The WCNN model uses the optimised parameters for training.

(3) Model testing: the WCNN model uses the testing samples for testing after model training. The test accuracy in the WCNN model is adopted for the IPSO-WCNN method evaluation.

Figure 3.

Overall framework of IPSO-WCNN method.

Validation of the proposed IPSO-WCNN method

Case 1: Validation based on CWRU bearing dataset

The Case Western Reserve University (CWRU) bearing dataset is used to investigate the effectiveness of the proposed IPSO-WCNN method. The CWRU dataset is a well-known public fault diagnosis dataset.³⁵ The data are collected from a test motor driving system, as shown in Figure 4. The original data are obtained from the accelerometers on the drive end, and the specification of the bearing is 6205-2RS JEM SKF. The faults of the bearing are made by electrical discharge machining. There are four health types of bearing: normal, ball fault, inner raceway fault and outer raceway fault. Each fault type contains fault diameters of 0.007, 0.014 and 0.021 inches, respectively. Therefore, there are 10 health conditions in the dataset. The vibration signals are recorded under motor loads and motor speeds of 0 hp/1797 rpm, 1 hp/1772 rpm, 2 hp/1750 rpm and 3 hp/1730 rpm. The sampling frequency is 12 kHz. The samples are obtained by data augmentation with overlap.²⁰ In the experiment, each sample contains 2048 data points, and a total of 1000 samples are obtained for each vibration signal. There are four datasets (Dataset A, Dataset B, Dataset C and Dataset D), one for each load condition. Each dataset contains 10 health conditions of bearings with 10,000 samples. Each dataset is divided into the training set, verification set and testing set according to the ratio of 0.3:0.3:0.4. The details of all the datasets are described in Table 1.
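The overlapping slicing and the 0.3:0.3:0.4 split can be sketched as follows. This is only an illustration under stated assumptions: the signal length of 120,000 points is a placeholder, the stride is derived so that exactly 1000 windows of 2048 points fit into the signal, and the function names `slice_with_overlap` and `split_counts` are hypothetical. The stride actually used in the experiments is not stated above.

```python
def slice_with_overlap(signal, sample_len=2048, n_samples=1000):
    """Cut `n_samples` overlapping windows of `sample_len` points from one signal."""
    stride = (len(signal) - sample_len) // (n_samples - 1)   # assumed stride rule
    if stride < 1:
        raise ValueError("signal too short for the requested number of samples")
    return [signal[i * stride:i * stride + sample_len] for i in range(n_samples)]

def split_counts(n, ratios=(0.3, 0.3, 0.4)):
    """Sizes of the training, verification and testing sets for n samples."""
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    return n_train, n_val, n - n_train - n_val

signal = [float(i) for i in range(120_000)]   # placeholder signal, not real data
samples = slice_with_overlap(signal)
print(len(samples), len(samples[0]))          # 1000 2048
print(split_counts(10_000))                   # (3000, 3000, 4000)
```

With these placeholder numbers the stride is 118 points, so consecutive samples share most of their content, which is what allows 1000 samples to be drawn from a signal that holds fewer than 60 non-overlapping windows.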

Experiment 1: The details of the optimised parameters in the WCNN model are shown in Table 2. To evaluate the performance of the proposed IPSO algorithm, the PSO-WCNN algorithm with the same initial parameters is selected for a comparison of convergence. The convergence of the PSO algorithm is insensitive to the population size of the swarm.³⁶ The population size of N = 15 is adopted in the experiment. Usually, the maximum inertia weight is between 0.7 and 0.9, and the minimum inertia weight is between 0.4 and 0.6.³⁶ The maximum inertia weight ( $ω_{\max} = 0.8$ ) and minimum inertia weight ( $ω_{\min} = 0.5$ ) are adopted. The PSO-WCNN algorithm adopts the linearly decreasing inertia weight.³⁷ $c_{1}$ and $c_{2}$ are set to 2.0 in the PSO-WCNN algorithm to balance the cognitive and social components.³⁸ Usually, the cognitive component $c_{1}$ and social component $c_{2}$ are set with the same ranges.^39,40 Simulations with different acceleration coefficients are carried out on Dataset A to find the best performance of the IPSO algorithm (Table 3). Each range of acceleration coefficients is evaluated over 20 trials. The results show that the best range of the acceleration coefficients is from 0.5 to 2.5. Hence, the maximum of the cognitive component $c_{1 \max}$ and social component $c_{2 \max}$ are set to 2.5, and the minimum of the cognitive component $c_{1 \min}$ and social component $c_{2 \min}$ are set to 0.5 in the IPSO-WCNN algorithm. The maximum number of iterations $t_{\max}$ is set to 20, and the comparison of the PSO-WCNN and IPSO-WCNN algorithms is shown in Figure 5.

Figure 4.

Testbed for the CWRU dataset.

Table 1.

Description of CWRU rolling bearing datasets.

Health type	Fault diameter (inch)	Dataset A (0 hp/1797 rpm)	Dataset B (1 hp/1772 rpm)	Dataset C (2 hp/1750 rpm)	Dataset D (3 hp/1730 rpm)
Normal	0	1000	1000	1000	1000
Ball fault	0.007	1000	1000	1000	1000
	0.014	1000	1000	1000	1000
	0.021	1000	1000	1000	1000
Inner raceway fault	0.007	1000	1000	1000	1000
	0.014	1000	1000	1000	1000
	0.021	1000	1000	1000	1000
Outer raceway fault	0.007	1000	1000	1000	1000
	0.014	1000	1000	1000	1000
	0.021	1000	1000	1000	1000

Table 2.

The optimised parameters in the WCNN model.

Parameters	Scope	Description
Kernel number	1–128	The number of kernels in convolutional layers
Kernel width	1–128	The width of kernels in convolutional layers
Kernel stride	1–128	The strides of kernels in convolutional layers
Pooling size	1–8	The pooling size in pooling layers
Dropout rate	0–1	The rate in the dropout layer
Output of full-connected	10–200	The neurons of the first dense layer
Epochs	0–50	The epochs of training in the network

Table 3.

Results of different acceleration coefficients for IPSO algorithm with Dataset A.

Acceleration coefficients	Average of iterations	Acceleration coefficients	Average of iterations
c = 2.0	13.65	c = 0.5–2.5	11.45
c = 0–2.0	13.85	c = 1.0–2.5	11.65
c = 0.5–2.0	12.85	c = 0–3.0	12.70
c = 1.0–2.0	13.25	c = 0.5–3.0	11.95
c = 0–2.5	12.10	c = 1–3.0	12.30

Figure 5.

The iteration process of IPSO-WCNN and PSO-WCNN in experiment 1: (a) Dataset A and Dataset B and (b) Dataset C and Dataset D.

Figure 5 shows that as the iterations increase, IPSO-WCNN has better fitness than PSO-WCNN in the four datasets. Therefore, IPSO-WCNN converges faster than PSO-WCNN. The IPSO algorithm has a better optimisation performance than the PSO algorithm.

Experiment 2: In the application, fault diagnosis needs high diagnostic accuracy with an acceptable operation time. The diagnostic accuracy can be used for the stop condition to accelerate the convergence of the model. If the fitness (verification accuracy) reaches 99%, the model stops the iteration. Other parameters are set the same as in experiment 1. The comparison of the PSO-WCNN and IPSO-WCNN algorithms is shown in Figure 6.

Figure 6.

The iterations of IPSO-WCNN and PSO-WCNN in experiment 2: (a) Dataset A and Dataset B and (b) Dataset C and Dataset D.

As Figure 6 shows, when using the fitness of 99% as the stop condition, PSO-WCNN and IPSO-WCNN have different iterations. In Dataset A, the average iterations of the PSO-WCNN algorithm and IPSO-WCNN algorithm are 17.0 and 11.4, respectively. In Dataset B, the average iterations of the PSO-WCNN algorithm and IPSO-WCNN algorithm are 17.5 and 10.5, respectively. In Dataset C, the average iterations of the PSO-WCNN algorithm and IPSO-WCNN algorithm are 16.8 and 11.4, respectively. In Dataset D, the average iterations of the PSO-WCNN algorithm and IPSO-WCNN algorithm are 15.5 and 10.9, respectively. The IPSO algorithm has a faster optimising speed.

Experiment 3: To evaluate the performance of the proposed IPSO-WCNN method, the PSO-CNN method²² and WDCNN method²⁰ are selected for a comparison of diagnostic accuracy. The IPSO-WCNN and PSO-CNN methods are stopped when the iterations reach 20 or the fitness reaches 99%. Other parameters of IPSO-WCNN are set the same as in experiment 1. Other parameters of PSO-CNN are set the same as those of PSO-WCNN in experiment 1. The WDCNN method contains five convolutional layers and pooling layers. The parameters of the WDCNN method are as follows: the width of kernels in the first convolutional layer is 64, and the width of kernels in the other convolutional layers is 3. The kernel stride in the first convolutional layer is 16, and the kernel stride in the other convolutional layers is 1. The numbers of kernels in the five convolutional layers are 16, 32, 64, 64 and 64, respectively. The pooling size is 2 in all pooling layers. The results are shown in Figure 7.

Figure 7.

Diagnostic accuracy of the three methods: (a) Dataset A, (b) Dataset B, (c) Dataset C and (d) Dataset D.

The results show that the average diagnostic accuracies of the IPSO-WCNN, PSO-CNN and WDCNN methods are 99.3%, 94.6% and 95.0% in Dataset A; 99.4%, 93.1% and 94.1% in Dataset B; 99.3%, 93.8% and 93.8% in Dataset C; and 99.2%, 93.0% and 93.7% in Dataset D, respectively. The IPSO-WCNN method has higher diagnostic accuracy than the other two methods in all datasets.

Case 2: Validation based on self-made rolling bearing dataset

For further validation of the effectiveness of the proposed IPSO-WCNN method on other experimental devices, a self-made bearing dataset is collected on the UT6618 fault testbed (Figure 8).

Figure 8.

UT6618 fault testbed.

The bearing fault testbed mainly consists of a motor, a console, a belt drive, couplings, a loading device, a bearing test device and an acceleration sensor. The console controls the speed of the drive motor via a pulse frequency modulation control system. The drive motor drives the bearing test device through the synchronous belt and couplings. The acceleration sensor is a CA-YD-187T02 piezoelectric acceleration sensor located directly above the bearing seat. The data collector is a UT3408FRS-ICP 24-bit collector. The motor is a frequency-controlled three-phase asynchronous motor with a power of 0.75 kW and a maximum speed of 1500 rpm. The bearing specification is #6205. The faults of the bearing are processed by a co-packer. There are two fault types, pitting corrosion and cracks. Each fault type occurs at three fault locations: the outer raceway, the inner raceway and the rolling balls. The fault degree is divided into three grades at each fault location. Together with the normal condition, there are therefore 2 × 3 × 3 + 1 = 19 health conditions of bearings in the experiment. Types and positions of bearing faults are shown in Figure 9.
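The count of 19 health conditions follows from enumerating the fault combinations and adding the normal condition. The short sketch below makes that explicit; the label strings are illustrative, and the grade names are those used in Table 4.

```python
from itertools import product

fault_types = ["pitting", "crack"]
locations = ["outer raceway", "inner raceway", "rolling balls"]
degrees = ["slight", "moderate", "severe"]

# One label per (type, location, degree) combination, plus the normal condition.
conditions = ["normal"] + [
    f"{location} {fault_type} ({degree})"
    for fault_type, location, degree in product(fault_types, locations, degrees)
]
print(len(conditions))
```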

Figure 9.

Types and positions of bearing faults: (a) outer raceway pitting, (b) inner raceway pitting, (c) rolling balls pitting, (d) outer raceway crack, (e) inner raceway crack and (f) rolling balls crack.

In the experiment, the motor speed is maintained at 1500 rpm, and the sampling frequency of the accelerometer is 20,480 Hz. Vibration signals are recorded under loads of 0, 300 and 600 N. The samples are obtained from the recorded vibration signals by data augmentation. Each sample contains 2048 data points, and 1000 samples are obtained from each vibration signal. With 19 health conditions, there are therefore 19 × 1000 = 19,000 samples in each dataset. Each dataset is divided into a training set, a verification set and a testing set according to the ratio of 0.3:0.3:0.4. The details of all the datasets are described in Table 4.
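The sample counts and the split can be illustrated with a short sketch. It assumes that the data augmentation is an overlapping sliding window, which is a common choice for vibration signals but is not specified in this passage; the stride of 100 points, the synthetic signal and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

def sliding_windows(signal: Sequence[float], length: int,
                    count: int, stride: int) -> List[Sequence[float]]:
    """Cut `count` overlapping windows of `length` points, `stride` points apart."""
    needed = (count - 1) * stride + length
    if len(signal) < needed:
        raise ValueError(f"signal needs at least {needed} points, got {len(signal)}")
    return [signal[k * stride:k * stride + length] for k in range(count)]

def split_sizes(total: int,
                ratios: Tuple[float, float, float] = (0.3, 0.3, 0.4)) -> Tuple[int, int, int]:
    """Training / verification / testing set sizes for the given ratios."""
    train = round(total * ratios[0])
    verify = round(total * ratios[1])
    return train, verify, total - train - verify

# Synthetic stand-in for one recorded vibration signal (not real data).
signal = [float(i) for i in range(2048 + 999 * 100)]
samples = sliding_windows(signal, length=2048, count=1000, stride=100)
print(len(samples), len(samples[0]))  # 1000 samples of 2048 points each
print(split_sizes(19 * 1000))         # sizes of the three subsets of one dataset
```

Under the 0.3:0.3:0.4 ratio, the 19,000 samples of one dataset split into 5700 training, 5700 verification and 7600 testing samples. At 1500 rpm (25 revolutions per second) and 20,480 Hz, one shaft revolution covers 819.2 data points, so a 2048-point sample spans 2.5 revolutions.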

Experiment 4: The self-made rolling bearing datasets are used to evaluate the adaptability of the proposed IPSO-WCNN method. IPSO-WCNN stops when the number of iterations reaches 20 or the fitness reaches 99%. Other parameters in IPSO-WCNN are set the same as in experiment 1. The results are shown in Figure 10.

Table 4.

Description of self-made rolling bearing datasets.

Fault type	Fault degree	Dataset E (load 0 N), samples	Dataset F (load 300 N), samples	Dataset G (load 600 N), samples
Normal	Normal	1000	1000	1000
Outer raceway pitting	Slight	1000	1000	1000
	Moderate	1000	1000	1000
	Severe	1000	1000	1000
Inner raceway pitting	Slight	1000	1000	1000
	Moderate	1000	1000	1000
	Severe	1000	1000	1000
Rolling balls pitting	Slight	1000	1000	1000
	Moderate	1000	1000	1000
	Severe	1000	1000	1000
Outer raceway crack	Slight	1000	1000	1000
	Moderate	1000	1000	1000
	Severe	1000	1000	1000
Inner raceway crack	Slight	1000	1000	1000
	Moderate	1000	1000	1000
	Severe	1000	1000	1000
Rolling balls crack	Slight	1000	1000	1000
	Moderate	1000	1000	1000
	Severe	1000	1000	1000

Figure 10.

Fault diagnosis results of the proposed method with self-made dataset.

As Figure 10 shows, the average diagnostic accuracy of the proposed IPSO-WCNN method is 99.7% in Dataset E, 99.4% in Dataset F and 99.5% in Dataset G. The proposed IPSO-WCNN method therefore maintains high diagnostic accuracy and shows good adaptability between different experimental devices.

Conclusions

In this paper, the IPSO-WCNN method is proposed for rolling bearing fault diagnosis. First, the PSO algorithm is improved by using the particle self-adaptive jump out algorithm, a self-adaptive inertia weight and time-varying acceleration coefficients. Second, the WCNN model based on vibration signals is designed to simplify the diagnosis process, and IPSO optimises the key parameters of WCNN. Third, the proposed method is applied to diagnose rolling bearing faults, and the results show that it has higher diagnostic accuracy than the PSO-CNN and WDCNN methods and good adaptability between different experimental devices.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (Grant nos. 61463021 and 61963018) and the Key Natural Science Foundation of Jiangxi Province in China (Research on optimal feature extraction and intelligent visual diagnosis method of gearbox local fault under variable working conditions).

ORCID iD

Yingkui Gu

References

1. Rai, Upadhyay SH. A review on signal processing techniques utilized in the fault diagnosis of rolling element bearings. Tribol Int 2016; 96: 289–306.

2. Wei, Wang, et al. A novel intelligent method for bearing fault diagnosis based on affinity propagation clustering and adaptive feature selection. Knowl Based Syst 2017; 116(15): 1–12.

3. Jiang. An improved EEMD with multiwavelet packet for rotating machinery multi-fault diagnosis. Mech Syst Signal Process 2013; 36(2): 225–239.

4. Jia, Lei, Lin, et al. Deep neural networks: a promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data. Mech Syst Signal Process 2016; 72–73: 303–315.

5. Kedadouche, Liu. Fault feature extraction and classification based on WPT and SVD: application to element bearings with artificially created faults under variable conditions. Proc IMechE, Part C: J Mechanical Engineering Science 2017; 231(22): 4186–4196.

6. Hui, Ooi, Lim, et al. A hybrid artificial neural network with Dempster-Shafer theory for automated bearing fault diagnosis. J Vibroengineering 2016; 18(7): 4409–4418.

7. Gunerkar, Jalan, Belgamwar SU. Fault diagnosis of rolling element bearing based on artificial neural network. J Mech Sci Technol 2019; 33(2): 505–511.

8. Hinton, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science 2006; 313(5786): 504–507.

9. Shao, Jiang, et al. Intelligent fault diagnosis of rolling bearing using deep wavelet auto-encoder with extreme learning machine. Knowl Based Syst 2018; 140: 1–14.

10. Jiang, Shao, et al. Intelligent fault diagnosis of rolling bearings using an improved deep recurrent neural network. Meas Sci Technol 2018; 29: 065107.

11. Zhu, Jiang, et al. Intelligent bearing fault diagnosis using PCA–DBN framework. Neural Comput Appl 2020; 32(14): 10773–10781.

12. Tang, Yuan, Zhu. Convolutional neural network in intelligent fault diagnosis toward rotatory machinery. IEEE Access 2020; 8: 86510–86519.

13. Jiao, Zhao, Lin, et al. A comprehensive review on convolutional neural network in machine fault diagnosis. Neurocomputing 2020; 417: 36–63.

14. Guo, Chen, Shen. Hierarchical adaptive deep convolution neural network and its application to bearing fault diagnosis. Measurement 2016; 93: 490–502.

15. Ahmed HOA, Nandi. Connected components-based colour image representations of vibrations for a two-stage fault diagnosis of roller bearings using convolutional neural networks. Chin J Mech Eng 2021; 34: 1–21.

16. Guo, Yang, Gao, et al. An intelligent fault diagnosis method for bearings with variable rotating speed based on pythagorean spatial pyramid pooling CNN. Sensors 2018; 18: 3857.

17. Wen, Gao, et al. A new convolutional neural network-based data-driven fault diagnosis method. IEEE Trans Ind Electron 2018; 65(7): 5990–5998.

18. Eren. Bearing fault detection by one-dimensional convolutional neural networks. Math Probl Eng 2017; 2017: 1–9.

19. Chen, Liu, Yang, et al. An improved fault diagnosis using 1D-convolutional neural network model. Electronics 2021; 10: 59.

20. Zhang, Peng, et al. A new deep learning model for fault diagnosis with good anti-noise and domain adaptation ability on raw vibration signals. Sensors 2017; 17: 425.

21. Syulistyo, Jati Purnomo, Rachmadi, et al. Particle swarm optimization (PSO) for training optimization on convolutional neural network (CNN). J Comput Sci Inf 2016; 9(1): 52–58.

22. Wang, Jiang, Shao, et al. An adaptive deep convolutional neural network for rolling bearing fault diagnosis. Meas Sci Technol 2017; 28: 095005.

23. Chen, Jiang, Guo, et al. A self-adaptive CNN with PSO for bearing fault diagnosis. Syst Sci Control Eng 2021; 9(1): 11–22.

24. Khodja, Aimer, Boudinar, et al. Bearing fault diagnosis of a PWM inverter fed-induction motor using an improved short time Fourier transform. J Electr Eng Technol 2019; 14(3): 1201–1210.

25. Kennedy, Eberhart. Particle swarm optimization. In: Proceedings of IEEE international conference on neural networks, Perth, WA, Australia, 1995, pp.1942–1948. IEEE.

26. Liu, Lou, et al. A fault detection method based on CPSO-improved KICA. Entropy 2019; 21: 668.

27. LeCun, Bengio, Hinton. Deep learning. Nature 2015; 521: 436–444.

28. Krizhevsky, Sutskever, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM 2017; 60(6): 84–90.

29. Waziralilah, Abu, Lim, et al. A review on convolutional neural network in bearing fault diagnosis. MATEC Web Conf 2019; 255: 06002.

30. Yang. Classification of picture art style based on VGGNET. J Phys Conf Ser 2021; 1774: 012043.

31. Wang, Song, et al. The influence of the activation function in a convolution neural network model of facial expression recognition. Appl Sci 2020; 10: 1897.

32. Poernomo, Kang DK. Biased dropout and crossmap dropout: learning towards effective dropout regularization in convolutional neural network. Neural Netw 2018; 104: 60–67.

33. Benuwa, Ghansah, Wornyo, et al. A comprehensive review of particle swarm optimization. Int J Eng Res Afr 2016; 23: 141–161.

34. Chen, Wang, Qiao, et al. Basic research on machinery fault diagnostics: past, present, and future trends. Front Mech Eng 2018; 13(2): 264–291.

35. Smith, Randall RB. Rolling element bearing diagnostics using the Case Western Reserve University data: a benchmark study. Mech Syst Signal Process 2015; 64–65: 100–131.

36. Shi, Eberhart RC. Empirical study of particle swarm optimization. In: Proceedings of the 1999 congress on evolutionary computation, 1999, pp.1945–1950. IEEE Computer Society.

37. Rathore, Sharma. Review on inertia weight strategies for particle swarm optimization. In: Proceedings of sixth international conference on soft computing for problem solving, 2017, vol. 546, no. 4, pp.73–83.

38. Takei, Yasuda, Ishigame. Particle swarm optimization with diverse parameters. IEEJ Trans Electr Electron Eng 2008; 3(4): 449–451.

39. Chen, Zhou, Yin, et al. A hybrid particle swarm optimizer with sine cosine acceleration coefficients. Inf Sci 2018; 422: 218–241.

40. Ratnaweera, Halgamuge, Watson HC. Self-organizing hierarchical particle swarm optimizer with time-varying acceleration coefficients. IEEE Trans Evol Comput 2004; 8(3): 240–255.

Rolling bearing intelligent fault diagnosis method based on IPSO-WCNN
