Abstract
Bearing fault diagnosis with convolutional neural networks (CNNs) suffers from complex signal pre-processing and complex network parameter setting. To solve these problems, a rolling bearing fault diagnosis method is proposed based on improved particle swarm optimisation and a convolutional neural network with wide kernels in the first layer (IPSO-WCNN). A particle self-adaptive jump out algorithm is proposed to overcome the shortcomings of particle swarm optimisation (PSO), and an adaptive inertia weight and linearly changing acceleration coefficients are adopted to form the improved particle swarm optimisation (IPSO). A fault diagnosis method based on a convolutional neural network with wide kernels in the first layer (WCNN) is proposed for one-dimensional rolling bearing vibration signals, and the parameters of the WCNN are optimised by IPSO. In the verification experiments, the proposed method achieves higher diagnostic accuracy than the compared methods and shows good adaptability.
Introduction
Rolling bearings are widely used in rotating machinery. About 50% of rotating machinery failures are related to bearings.1,2 The failure and degradation of bearings can cause unexpected machine failures, leading to large economic losses and even serious human injuries. Effective fault diagnosis technology can therefore help identify bearing faults at an early stage and improve the reliability of mechanical equipment.
The complex operating environment and huge data size of modern equipment have made traditional methods unable to meet the requirements of fault diagnosis. 3 With the rapid development of machine learning, intelligent fault diagnosis methods based on data-driven technologies have been widely used in recent years. Limited by their shallow network structure, traditional intelligent fault diagnosis methods have difficulty extracting deep, subtle fault features. 4 Researchers have improved the performance of intelligent fault diagnosis with better fault feature extraction methods and shallow intelligent learning algorithms.5–7 However, the diagnostic results remain unsatisfactory, and these methods are difficult to apply to huge data sizes.
Hinton 8 proposed the theory of deep learning, which can overcome the shortcomings of traditional intelligent diagnosis methods. Recently, deep learning has become a hotspot in the field of fault diagnosis. Shao et al. 9 designed a deep wavelet auto-encoder network for rolling bearing fault diagnosis, in which signal features are captured by an auto-encoder built on a wavelet function and an extreme learning machine serves as the classifier. Jiang et al. 10 designed an improved deep recurrent neural network for rolling bearing fault diagnosis, using spectrum sequences as inputs. Zhu et al. 11 combined principal component analysis with a deep belief network for bearing fault diagnosis, using principal component analysis to reduce the dimension of the raw bearing vibration signals.
As an important deep learning algorithm, CNN has advantages in feature extraction and in processing big data.12–14 Ahmed and Nandi 15 converted vibration signals into 2D grayscale images as CNN inputs to classify bearing faults. Guo et al. 16 processed vibration signals by wavelet transform and simplified parameter tuning by optimising the pooling layer of CNN. Wen et al. 17 constructed a CNN based on LeNet-5 and converted the original signals into two-dimensional images as inputs for fault diagnosis. These cases show that CNN is an effective method for fault diagnosis. However, there are two drawbacks. First, the original CNN model is designed for image recognition and classification, so when it is applied to bearing fault diagnosis the inputs must be two-dimensional images. Second, a CNN has a large number of parameters. Designing the model requires repeated trial-and-error tuning to find reasonable settings, so its self-adaptability is poor.
To solve the first problem, researchers have constructed new CNNs that match the structure of the original signals. Eren 18 proposed a one-dimensional CNN detection system for motor bearings with better detection accuracy. Chen et al. 19 designed a one-dimensional CNN based on original signals and introduced a dropout layer to improve the accuracy of bearing fault diagnosis. Zhang et al. 20 designed a deep one-dimensional CNN and improved the fault diagnosis accuracy by increasing the width of the convolution kernels in the first convolutional layer.
To solve the second problem, researchers have studied parameter optimisation in CNNs. Syulistyo et al. 21 used PSO to optimise the output vector of CNN, improving the recognition accuracy of handwritten digits from MNIST. Wang et al. 22 established a CNN with two convolutional and pooling layers for bearing fault diagnosis and used PSO to optimise its key parameters, improving the adaptability of the model. Chen et al. 23 constructed a CNN with three convolutional and pooling layers for bearing fault diagnosis, using the short-time Fourier transform for signal processing and PSO to determine the key parameters.
The IPSO-WCNN method for bearing fault diagnosis is proposed to solve these problems simultaneously. The main contributions of this study are as follows:
(1) WCNN is proposed, which is adapted to the original signals to simplify the process of bearing fault diagnosis. The original signal data can be taken directly as the input of WCNN.
(2) IPSO, which combines a particle self-adaptive jump out algorithm, an adaptive inertia weight and linearly changing acceleration coefficients, is proposed to improve the optimisation performance of the PSO algorithm.
(3) IPSO is used to optimise the key parameters of WCNN to improve the adaptability of WCNN.
The remainder of this paper is organised as follows. Section 2 introduces the vibratory mechanism of rolling bearing and the basic theory of PSO and CNN. In Section 3, the IPSO-WCNN intelligent diagnostic method is proposed. In Section 4, the IPSO-WCNN method is verified by experiments, and the results are discussed. Conclusions are drawn in Section 5.
Related theory
Vibratory mechanism of rolling bearing
A rolling bearing always vibrates during operation, and the vibration is related to the structure, working environment, assembly, running characteristics and other factors of the bearing. The main causes of vibration are as follows. First, the rolling balls and the inner or outer ring of the bearing press against each other when the bearing is loaded, which can lead to a vibration impact. Generally, the load on a rolling ball becomes smaller the farther it is from the load centre. Second, manufacturing problems such as diameter errors, axis offset, excessive clearance and poor surface quality can lead to bearing vibration. Third, elastic deformation caused by improper bearing installation leads to shaft misalignment and an unsatisfactory interference fit between the shaft and the inner ring, both of which can cause the bearing to vibrate. Fourth, when faults such as cracks and peeling occur in a rolling bearing, an impact is generated each time a rolling ball passes the fault location. Fifth, foreign bodies and external forces in the working environment lead to abnormal vibration signals.
When a bearing fails, new frequency characteristics appear in the vibration signals, so the fault information can be diagnosed by spectral analysis. When the contact surface of the rolling balls or of the inner or outer ring is damaged, an impact is generated each time the damaged part is passed. The impact excites the bearing system and produces vibration. This kind of vibration is characterised as a low-frequency periodic impact, and the corresponding frequency is the fault frequency. The fault type of the bearing can be identified from the theoretical characteristic fault frequencies. 24 The characteristic fault frequencies are given as follows:
where
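For reference, the widely used textbook expressions for the characteristic fault frequencies can be evaluated as in the following Python sketch. It is an illustration rather than a reproduction of the expressions used in this paper: the function name, the argument names and the 6205 geometry values (9 balls, ball diameter 0.3126 in, pitch diameter 1.537 in, zero contact angle) are assumptions that do not come from the text above.

```python
import math


def bearing_fault_frequencies(shaft_hz, n_balls, ball_d, pitch_d, contact_angle_rad=0.0):
    """Textbook characteristic fault frequencies (Hz) for a fixed outer ring.

    shaft_hz is the shaft rotation frequency; ball_d and pitch_d share one unit.
    """
    ratio = (ball_d / pitch_d) * math.cos(contact_angle_rad)
    return {
        # ball pass frequency, outer race
        "outer": 0.5 * n_balls * shaft_hz * (1.0 - ratio),
        # ball pass frequency, inner race
        "inner": 0.5 * n_balls * shaft_hz * (1.0 + ratio),
        # ball spin frequency
        "ball": 0.5 * (pitch_d / ball_d) * shaft_hz * (1.0 - ratio ** 2),
        # cage (fundamental train) frequency
        "cage": 0.5 * shaft_hz * (1.0 - ratio),
    }


# Assumed geometry for a 6205 bearing; 1797 rpm is one of the speeds used in Case 1.
shaft_hz = 1797 / 60
freqs = bearing_fault_frequencies(shaft_hz, n_balls=9, ball_d=0.3126, pitch_d=1.537)
for name, value in freqs.items():
    print(f"{name}: {value:.1f} Hz ({value / shaft_hz:.4f} x shaft frequency)")
```

Under these assumed dimensions the outer-race and inner-race frequencies fall at roughly 107 Hz and 162 Hz, which is the low-frequency range that spectral analysis would inspect.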
Basic theory of PSO
The PSO algorithm is simple to implement, practical and fast to compute, and it is widely used for solving optimisation problems.25,26 Each particle includes the position
where
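The canonical PSO update, in which each particle's velocity is pulled towards its personal best and towards the swarm's global best, can be sketched as follows. This is a minimal sketch of the standard algorithm, not the authors' implementation. The names `pso_step` and `minimise`, the sphere test function and the values w = 0.7, c1 = c2 = 1.5 are illustrative assumptions. The sketch minimises a cost, whereas the method in this paper maximises a fitness.

```python
import numpy as np


def pso_step(pos, vel, pbest, gbest, w, c1, c2, rng):
    """One standard PSO update of velocity and position for the whole swarm."""
    r1 = rng.random(pos.shape)  # random weights for the cognitive term
    r2 = rng.random(pos.shape)  # random weights for the social term
    new_vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    return pos + new_vel, new_vel


def minimise(cost, dim=2, n_particles=15, iters=50, seed=0):
    """Minimise `cost` with plain PSO; returns the best position and its cost."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5.0, 5.0, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([cost(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        pos, vel = pso_step(pos, vel, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=rng)
        val = np.array([cost(p) for p in pos])
        better = val < pbest_val          # particles that improved their own best
        pbest[better] = pos[better]
        pbest_val[better] = val[better]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(pbest_val.min())


def sphere(p):
    """Toy cost with its minimum at the origin."""
    return float(np.sum(p ** 2))


best_pos, best_val = minimise(sphere)
print(best_pos, best_val)
```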
Basic theory of CNN
CNN has a strong feature extraction capability, which comes from constructing multiple filters and extracting the input features layer by layer.27,28 Generally, a CNN is composed of an input layer, convolutional layers, pooling layers, fully connected layers and an output layer. 29
(1) Input layer: the original CNN model is designed for image recognition and classification, so the input of a CNN is usually two-dimensional image data.
(2) Convolutional layer: the convolutional operation is the core of CNN. The convolutional layer traverses the local input regions with convolutional kernels and outputs the corresponding features. The number of parameters in the convolutional layer is reduced by weight sharing, in which each filter uses the same kernel to extract the features of every local input region. 27 The operation of the convolutional layer is described as follows:
where
(3) Pooling layer: as the convolution operation increases the number of output channels, the dimension of the convolutional layer output increases sharply, leading to the curse of dimensionality. The pooling layer carries out a down-sampling operation, which reduces the feature dimension and the number of network parameters while retaining the main features. The common pooling methods are maximum pooling and average pooling. The formula of the pooling layer is described as follows:
where
(4) Fully connected layer: after the convolutional and pooling operations, one or two fully connected layers are usually used to integrate the different types of local features. The formula of a fully connected layer is shown as follows:
where
(5) Output layer: the output layer produces the final target result. Usually, the input of the output layer is the output of the last fully connected layer.
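The convolution, pooling and fully connected operations listed above can be sketched in NumPy for a one-dimensional, single-channel input. This is an illustrative sketch under simplifying assumptions (no padding, one input channel, hypothetical function names) and is not the formulation used in the paper. As in most CNN frameworks, the "convolution" is computed without flipping the kernel.

```python
import numpy as np


def conv1d(x, kernels, bias, stride=1):
    """Slide each kernel over x. x: (length,), kernels: (n_kernels, width)."""
    n_kernels, width = kernels.shape
    out_len = (x.size - width) // stride + 1
    out = np.empty((n_kernels, out_len))
    for j in range(out_len):
        window = x[j * stride : j * stride + width]
        # the same kernel weights are reused at every position (weight sharing)
        out[:, j] = kernels @ window + bias
    return out


def max_pool1d(features, size):
    """Non-overlapping max pooling along the last axis."""
    n_kernels, length = features.shape
    out_len = length // size
    trimmed = features[:, : out_len * size]
    return trimmed.reshape(n_kernels, out_len, size).max(axis=2)


def fully_connected(features, weights, bias):
    """Flatten the feature maps and apply one dense layer."""
    return weights @ features.ravel() + bias


x = np.arange(8.0)                          # toy signal 0, 1, ..., 7
kernels = np.array([[1.0, 0.0, -1.0]])      # one difference kernel of width 3
feat = conv1d(x, kernels, bias=np.zeros(1))
pooled = max_pool1d(feat, size=2)
logits = fully_connected(pooled, weights=np.ones((2, pooled.size)), bias=np.zeros(2))
print(feat.shape, pooled.shape, logits)
```

With the toy signal, every convolution output equals x[j] - x[j + 2] = -2, so the eight input points shrink to six features and then to three after pooling.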
Proposed fault diagnosis method based on IPSO-WCNN
CNNs have already been applied to fault diagnosis. However, most CNNs take two-dimensional image data as the input, which requires pre-processing of the original signals before fault diagnosis. The diagnosis process is complex, and the results are unsatisfactory. Therefore, a rolling bearing fault diagnosis method based on IPSO-WCNN is proposed. First, a deep network based on bearing signals, named WCNN, is designed. Second, the IPSO algorithm is proposed to improve the optimisation ability of the PSO algorithm. Finally, IPSO is used to optimise the parameters of WCNN, which improves the adaptability of the proposed method.
Architecture of the proposed WCNN model
In two-dimensional image recognition, VGGnet has higher recognition accuracy with the convolutional kernels of

Figure 1. Framework of the proposed WCNN model.
The proposed WCNN model uses the vibration signal as input. All convolutional layers and pooling layers are one-dimensional. There are four convolutional layers in the proposed WCNN model. The first convolutional layer adopts wide kernels to extract the low-frequency characteristics of the signal, and the others adopt small kernels.
After the convolutional operation, an activation function is applied to obtain a nonlinear expression of the input, which enhances the representation ability. The sigmoid function, Hyperbolic Tangent (Tanh) function and Rectified Linear Unit (ReLU) function are widely used as activation functions in CNNs. 31 The ReLU function can alleviate the vanishing gradient problem in CNN and accelerate the convergence of CNN. The formula of the ReLU function is described as follows:
where
There are four pooling layers in the proposed WCNN model. Max-pooling is used in the pooling operation, which outputs the maximum value within the receptive field to reduce the parameters. The max-pooling operation is described as follows:
where
Batch normalisation is applied after the convolutional layers or fully connected layers and before the activation operation to reduce the shift in the distribution of the layer inputs and accelerate the training of the network. The batch normalisation is described as follows:
where
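The usual form of batch normalisation followed by ReLU can be sketched as below. This is a minimal training-mode sketch of the standard operations, given for orientation only. The function names, the small constant `eps` and the scale and shift parameters `gamma` and `beta` follow common convention and are assumptions rather than the notation of this paper.

```python
import numpy as np


def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalise each feature over the batch axis, then scale and shift.

    x has shape (batch, features); statistics are taken from the current batch.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta


def relu(x):
    """Rectified linear unit: negative inputs become zero."""
    return np.maximum(x, 0.0)


rng = np.random.default_rng(0)
batch = rng.normal(loc=3.0, scale=2.0, size=(32, 4))   # toy activations
normalised = batch_norm(batch, gamma=1.0, beta=0.0)
activated = relu(normalised)                            # activation comes after normalisation
print(normalised.mean(axis=0), normalised.var(axis=0))
```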
After the multiple convolutional and pooling layers, a fully connected layer is used to integrate the features. However, when a large number of parameters is trained with few samples, the trained CNN generalises poorly, which is known as overfitting. A dropout layer is added after the fully connected layer to reduce overfitting. 32 The formula of the dropout layer is described as follows:
where
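One common way to implement dropout is the "inverted" variant sketched below, in which the surviving activations are rescaled during training so that nothing needs to change at test time. Whether the paper uses this variant is not stated; the function name, the rescaling and the rate of 0.5 are assumptions made for illustration.

```python
import numpy as np


def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during training."""
    if not training or rate == 0.0:
        return x                              # dropout is disabled at test time
    keep = 1.0 - rate
    mask = rng.random(x.shape) < keep         # Bernoulli mask, True means keep
    return x * mask / keep                    # rescale so the expected value is unchanged


rng = np.random.default_rng(0)
activations = np.ones(1000)
dropped = dropout(activations, rate=0.5, rng=rng)
print(np.unique(dropped), dropped.mean())
```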
The softmax function normalises the output layer so that the output of the model takes the form of a probability distribution over the bearing health conditions. It is suited to multi-class classification problems. The softmax function is described as follows:
where
The cross-entropy between the estimated softmax output distribution and the target class distribution is adopted as the loss function of the proposed WCNN model. The cross-entropy is described as follows:
where
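The standard softmax and cross-entropy computations can be sketched as follows. The sketch assumes integer class labels and adds the usual numerical safeguards (subtracting the row maximum, a small constant inside the logarithm); these details and the function names are assumptions, not taken from the paper.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; subtracting the maximum avoids overflow in exp."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class of each sample."""
    n = probs.shape[0]
    return float(-np.log(probs[np.arange(n), labels] + 1e-12).mean())


# Ten classes, matching the ten health conditions of the CWRU case.
uniform_logits = np.zeros((4, 10))
labels = np.array([0, 3, 5, 9])
probs = softmax(uniform_logits)
print(cross_entropy(probs, labels))   # an uninformed model gives ln(10)
```

A model that spreads its probability evenly over ten classes has a loss of ln 10, about 2.30, and the loss approaches zero as the probability of the correct class approaches one.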
The proposed IPSO algorithm
The PSO algorithm can search for the optimal solution quickly by updating the position and velocity. However, the basic PSO algorithm has some problems, such as slow convergence and being easily trapped in local optima. 33 The IPSO algorithm is proposed with the particle self-adaptive jump out algorithm, self-adaptive inertia weight and time-varying acceleration coefficients to improve the optimisation performance of the PSO algorithm. The procedure of the IPSO algorithm is shown in Figure 2.
(1) Particle self-adaptive jump out algorithm: the particle self-adaptive jump out algorithm is proposed to prevent the swarm from falling into a local optimum. The algorithm calculates the distance between each particle and the global best particle. If the distance is small, the particle should jump out from its current position. Generally, the particles are far apart at the beginning, and the swarm needs a large search range. Later, the particles move closer together, and some particles need to jump out to prevent the swarm from falling into a local optimum. The particle self-adaptive jump out algorithm is described as follows:

Figure 2. Procedure of the IPSO algorithm.
where
(2) Self-adaptive inertia weight: appropriate parameters can improve the convergence of the PSO algorithm. The inertia weight is very important for obtaining the optimal solution rapidly. At the beginning of the iteration, a larger inertia weight should be used so that the particles fly to the global best position faster. Later in the iteration, the inertia weight should be decreased to prevent the particles from overshooting the optimal position. Therefore, the self-adaptive inertia weight is adopted as follows:
where
(3) Time-varying acceleration coefficients: the acceleration coefficients (cognitive component and social component) affect the information exchange between the particles. At the beginning of the iteration, the cognitive component should be greater, enabling particles to find their local best positions quickly. Later in the iteration, the social component should be greater, strengthening the information exchange between the particles. Therefore, the time-varying acceleration coefficients are adopted as follows:
where
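The three ingredients described above can be sketched as below. The exact expressions of the paper are not reproduced here, so every formula in this sketch is a stand-in: a linearly decreasing inertia weight, linearly changing acceleration coefficients, and a jump-out rule that re-draws any particle lying within a fixed distance `threshold` of the global best. The function names, the end values 2.5 and 0.5, the weights 0.9 and 0.4 and the threshold are all assumptions chosen for illustration.

```python
import numpy as np


def inertia_weight(t, t_max, w_max=0.9, w_min=0.4):
    """Assumed schedule: large weight early (exploration), small weight late."""
    return w_max - (w_max - w_min) * t / t_max


def acceleration_coefficients(t, t_max, c1_start=2.5, c1_end=0.5, c2_start=0.5, c2_end=2.5):
    """Linearly changing coefficients: cognitive c1 shrinks, social c2 grows."""
    frac = t / t_max
    c1 = c1_start + (c1_end - c1_start) * frac
    c2 = c2_start + (c2_end - c2_start) * frac
    return c1, c2


def jump_out(pos, gbest, lower, upper, threshold, rng):
    """Re-draw particles that crowd the global best (assumed distance rule)."""
    dist = np.linalg.norm(pos - gbest, axis=1)
    # a particle sitting exactly on the global best is left where it is
    crowded = (dist < threshold) & (dist > 0.0)
    new_pos = pos.copy()
    new_pos[crowded] = rng.uniform(lower, upper, (int(crowded.sum()), pos.shape[1]))
    return new_pos, crowded


rng = np.random.default_rng(0)
swarm = np.array([[0.0, 0.0], [0.01, 0.0], [3.0, 3.0]])
moved, crowded = jump_out(swarm, gbest=np.zeros(2), lower=-5.0, upper=5.0,
                          threshold=0.1, rng=rng)
print(crowded, inertia_weight(10, 20), acceleration_coefficients(10, 20))
```

In the toy swarm only the second particle is re-drawn: the first coincides with the global best and the third is still far away and exploring.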
IPSO-WCNN intelligent diagnosis method
The key parameters of CNNs are usually determined by repeated trials based on experience, which weakens the adaptability of CNNs. In this paper, the key parameters of the WCNN are optimised by IPSO to improve adaptability. The overall framework of IPSO-WCNN is shown in Figure 3 and mainly includes three steps: data processing, model building and model testing.
(1) Data processing: data processing comprises vibration signal collection and sample making. Compared with the other operational parameters, such as temperature, voltage and pressure signals, vibration signals are widely applied in fault diagnosis due to the monitoring requirements. 34 In signal collection, factors such as sensor type, installation location and collection parameters need to be considered because they affect the accuracy of the measurement results. After signal collection, the vibration signals are sliced into samples for the model. The WCNN model requires three types of samples, so the collected vibration signals are sliced into training samples, verification samples and testing samples.
(2) Model building: model building comprises WCNN parameter optimisation and WCNN model training. Parameter optimisation is the key of the IPSO-WCNN method. The parameters of the WCNN model are set as the particle position in the IPSO algorithm. For each particle, the WCNN is trained with the parameters given by the particle position, using the training samples and verification samples, and the verification accuracy of the WCNN model is adopted as the fitness for the iteration. After the stop conditions are reached, the IPSO algorithm stops the iteration and outputs the optimised parameters, which the WCNN model then uses for training.
(3) Model testing: after model training, the WCNN model is tested with the testing samples. The test accuracy of the WCNN model is used to evaluate the IPSO-WCNN method.
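The model-building step can be outlined as a search loop in which each particle position encodes candidate WCNN parameters and the fitness is the verification accuracy. The outline below is a sketch under stated assumptions rather than the authors' code: `toy_accuracy` is a hypothetical stand-in for "train the WCNN and return its verification accuracy", the two encoded parameters and their bounds are invented for the example, and the coefficient schedules are illustrative. The swarm size of 15, the limit of 20 iterations and the 99% target are the values quoted in the experiments.

```python
import numpy as np


def optimise(fitness, lower, upper, n_particles=15, max_iter=20, target=0.99, seed=0):
    """Maximise `fitness`; stop at max_iter or once the best fitness reaches target."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    pos = rng.uniform(lower, upper, (n_particles, lower.size))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    best = int(np.argmax(pbest_fit))
    gbest, gbest_fit = pbest[best].copy(), float(pbest_fit[best])
    for it in range(1, max_iter + 1):
        if gbest_fit >= target:
            return gbest, gbest_fit, it - 1        # early stop on accuracy
        frac = it / max_iter
        w = 0.9 - 0.5 * frac                        # assumed inertia schedule
        c1, c2 = 2.5 - 2.0 * frac, 0.5 + 2.0 * frac  # assumed coefficient schedule
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)      # keep parameters in range
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        best = int(np.argmax(pbest_fit))
        gbest, gbest_fit = pbest[best].copy(), float(pbest_fit[best])
    return gbest, gbest_fit, max_iter


def toy_accuracy(params):
    """Hypothetical stand-in for the verification accuracy of a trained WCNN."""
    ideal = np.array([64.0, 16.0])                  # invented optimum, for the demo only
    return 1.0 / (1.0 + np.sum((params - ideal) ** 2) / 1000.0)


# Invented search ranges for two parameters, e.g. a kernel width and a kernel count.
best_params, best_fit, n_iter = optimise(toy_accuracy, lower=[8, 4], upper=[128, 64])
print(best_params, best_fit, n_iter)
```

In a real run each fitness evaluation would train a network, so the cost of the search is dominated by the number of particles multiplied by the number of iterations, which is why the early stop on accuracy matters.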

Figure 3. Overall framework of the IPSO-WCNN method.
Validation of the proposed IPSO-WCNN method
Case 1: Validation based on CWRU bearing dataset
The Case Western Reserve University (CWRU) bearing dataset is used to investigate the effectiveness of the proposed IPSO-WCNN method. The CWRU dataset is a well-known public fault diagnosis dataset. 35 The data are collected from a test motor driving system, as shown in Figure 4. The original data are obtained from the accelerometers on the drive end, and the specification of the bearing is 6205-2RS JEM SKF. The faults of the bearing are made by electrical discharge machining. There are four health types of bearing: normal, ball fault, inner raceway fault and outer raceway fault. Each fault type contains fault diameters of 0.007, 0.014 and 0.021 inches. Therefore, there are 10 health conditions in the dataset: three fault types with three fault diameters each, plus the normal condition. The vibration signals are recorded under motor loads and speeds of 0 hp/1797 rpm, 1 hp/1772 rpm, 2 hp/1750 rpm and 3 hp/1730 rpm. The sampling frequency is 12 kHz. The samples are obtained by data augmentation with overlap. 20 In the experiment, each sample contains 2048 data points, and a total of 1000 samples are obtained from each vibration signal. There are four datasets (Dataset A, Dataset B, Dataset C and Dataset D), one for each load condition. Each dataset contains 10 health conditions of bearings with 10,000 samples in total. Each dataset is divided into a training set, verification set and testing set according to the ratio of 0.3:0.3:0.4. The details of all the datasets are described in Table 1.
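The overlapping slicing and the 0.3:0.3:0.4 split described above can be sketched as follows. The sample length of 2048, the 1000 samples per signal and the split ratio come from the text; the signal length of 120,000 points, the evenly spaced window starts, the random shuffling and the function names are assumptions made for the example.

```python
import numpy as np


def slice_with_overlap(signal, sample_len=2048, n_samples=1000):
    """Cut n_samples overlapping windows of sample_len points from one signal."""
    stride = (signal.size - sample_len) // (n_samples - 1)
    if stride < 1:
        raise ValueError("signal is too short for the requested number of samples")
    starts = np.arange(n_samples) * stride      # evenly spaced window starts
    return np.stack([signal[s : s + sample_len] for s in starts])


def split_indices(n, ratios=(0.3, 0.3, 0.4), seed=0):
    """Shuffle sample indices and split them into train, verification and test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(round(ratios[0] * n))
    n_val = int(round(ratios[1] * n))
    return idx[:n_train], idx[n_train : n_train + n_val], idx[n_train + n_val :]


# Synthetic stand-in for one recorded vibration signal (length is assumed).
signal = np.random.default_rng(0).normal(size=120_000)
samples = slice_with_overlap(signal)
train_idx, val_idx, test_idx = split_indices(len(samples))
print(samples.shape, len(train_idx), len(val_idx), len(test_idx))
```

With a 120,000-point record the stride works out to 118 points, so neighbouring samples share most of their data; this overlap is what lets 1000 samples of 2048 points be drawn from a single recording.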
Experiment 1: The details of the optimised parameters in the WCNN model are shown in Table 2. To evaluate the performance of the proposed IPSO algorithm, its convergence is compared with that of the PSO-WCNN algorithm under the same initial parameters. The convergence of the PSO algorithm is insensitive to the population size of the swarm. 36 The population size of N = 15 is adopted in the experiment. Usually, the maximum inertia weight is between 0.7 and 0.9, and the minimum inertia weight is between 0.4 and 0.6. 36
The maximum inertia weight (

Figure 4. Testbed for the CWRU dataset.
Table 1. Description of CWRU rolling bearing datasets.
Table 2. The optimised parameters in the WCNN model.
Table 3. Results of different acceleration coefficients for the IPSO algorithm with Dataset A.

Figure 5. The iteration process of IPSO-WCNN and PSO-WCNN in experiment 1: (a) Dataset A and Dataset B and (b) Dataset C and Dataset D.
Figure 5 shows that, as the iterations increase, IPSO-WCNN reaches better fitness than PSO-WCNN on all four datasets. IPSO-WCNN therefore converges faster than PSO-WCNN, and the IPSO algorithm has better optimisation performance than the PSO algorithm.
Experiment 2: In practice, fault diagnosis needs high diagnostic accuracy with an acceptable operation time. The diagnostic accuracy can be used as a stop condition to shorten the optimisation. If the fitness (verification accuracy) reaches 99%, the model stops the iteration. Other parameters are set the same as in experiment 1. The comparison of the PSO-WCNN and IPSO-WCNN algorithms is shown in Figure 6.

Figure 6. The iterations of IPSO-WCNN and PSO-WCNN in experiment 2: (a) Dataset A and Dataset B and (b) Dataset C and Dataset D.
As Figure 6 shows, when a fitness of 99% is used as the stop condition, PSO-WCNN and IPSO-WCNN need different numbers of iterations. The average numbers of iterations of the PSO-WCNN and IPSO-WCNN algorithms are 17.0 and 11.4 in Dataset A, 17.5 and 10.5 in Dataset B, 16.8 and 11.4 in Dataset C, and 15.5 and 10.9 in Dataset D, respectively. The IPSO algorithm therefore optimises faster on every dataset.
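The size of the speed-up can be read directly from the reported averages. The short calculation below uses only the iteration counts quoted above; the variable names are illustrative.

```python
# Average iterations to reach 99% fitness: (PSO-WCNN, IPSO-WCNN), as reported.
avg_iterations = {
    "A": (17.0, 11.4),
    "B": (17.5, 10.5),
    "C": (16.8, 11.4),
    "D": (15.5, 10.9),
}

# Relative reduction in iterations achieved by IPSO-WCNN on each dataset.
reduction = {name: (pso - ipso) / pso for name, (pso, ipso) in avg_iterations.items()}

for name, value in reduction.items():
    print(f"Dataset {name}: {100 * value:.1f}% fewer iterations")
```

On these figures IPSO-WCNN needs roughly 30% to 40% fewer iterations than PSO-WCNN, with the largest saving on Dataset B.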
Experiment 3: To evaluate the performance of the proposed IPSO-WCNN method, its diagnostic accuracy is compared with that of the PSO-CNN method 22 and the WDCNN method 20. The IPSO-WCNN and PSO-CNN methods are stopped when the iterations reach 20 or the fitness reaches 99%. Other parameters of IPSO-WCNN are set the same as in experiment 1. Other parameters of PSO-CNN are set the same as those of PSO-WCNN in experiment 1. The WDCNN method contains five convolutional layers and pooling layers. The parameters of the WDCNN method are as follows: the width of the kernels in the first convolutional layer is 64, and the width of the kernels in the other convolutional layers is 3. The kernel stride in the first convolutional layer is 16, and the kernel stride in the other convolutional layers is 1. The numbers of kernels in the five convolutional layers are 16, 32, 64, 64 and 64, respectively. The pooling size is 2 in all pooling layers. The results are shown in Figure 7.

Figure 7. Diagnostic accuracy of the three methods: (a) Dataset A, (b) Dataset B, (c) Dataset C and (d) Dataset D.
The results show that the average diagnostic accuracies of the IPSO-WCNN, PSO-CNN and WDCNN methods are 99.3%, 94.6% and 95.0% in Dataset A, 99.4%, 93.1% and 94.1% in Dataset B, 99.3%, 93.8% and 93.8% in Dataset C, and 99.2%, 93.0% and 93.7% in Dataset D, respectively. The IPSO-WCNN method has higher diagnostic accuracy than the other two methods in all datasets.
Case 2: Validation based on self-made rolling bearing dataset
For further validation of the effectiveness of the proposed IPSO-WCNN method on other experimental devices, a self-made bearing dataset is collected on the UT6618 fault testbed (Figure 8).

Figure 8. UT6618 fault testbed.
The bearing fault testbed mainly consists of a motor, a console, a belt driving part, couplings, a loading device, a bearing test device and an acceleration sensor. The console controls the speed of the drive motor via a pulse frequency modulation control system. The drive motor drives the bearing test device through the synchronous belt and couplings. The acceleration sensor is a CA-YD-187T02 piezoelectric acceleration sensor, which is located directly over the bearing seat. The data collector is a UT3408FRS-ICP 24-bit collector. The motor is a frequency-controlled three-phase asynchronous motor with a power of 0.75 kW and a maximum speed of 1500 rpm. The specification of the bearing is #6205. The faults of the bearing are processed by co-packer. The fault types of the bearing are pitting corrosion and cracks. Each fault type contains fault locations of the outer raceway, inner raceway and rolling balls. The fault degree is divided into three grades in each fault location. Therefore, counting the normal condition, there are 19 health conditions of bearings in the experiment (two fault types, three locations and three grades give 18 fault conditions). Types and positions of bearing faults are shown in Figure 9.

Figure 9. Types and positions of bearing faults: (a) outer raceway pitting, (b) inner raceway pitting, (c) rolling balls pitting, (d) outer raceway crack, (e) inner raceway crack and (f) rolling balls crack.
In the experiment, the motor speed is maintained at 1500 rpm, and the sampling frequency of the accelerometer is 20,480 Hz. Vibration signals are recorded under loads of 0, 300 and 600 N, which give Dataset E, Dataset F and Dataset G. The samples are obtained from the recorded vibration signals by data augmentation. Each sample contains 2048 data points, and 1000 samples are obtained from each vibration signal. Therefore, there are 19,000 samples in each dataset in total. Each dataset is divided into a training set, verification set and testing set according to the ratio of 0.3:0.3:0.4. The details of all the datasets are described in Table 4.
Experiment 4: To evaluate the adaptability of the proposed IPSO-WCNN method, the method is used to diagnose bearing faults on the self-made rolling bearing datasets. IPSO-WCNN is stopped when the number of iterations reaches 20 or the fitness reaches 99%. Other parameters in IPSO-WCNN are set the same as in experiment 1. The results are shown in Figure 10.
Table 4. Description of self-made rolling bearing datasets.

Figure 10. Fault diagnosis results of the proposed method with the self-made dataset.
As Figure 10 shows, the average diagnostic accuracy of the proposed IPSO-WCNN method is 99.7% in Dataset E, 99.4% in Dataset F and 99.5% in Dataset G. The proposed IPSO-WCNN method therefore maintains an average diagnostic accuracy above 99% on a different experimental device, which indicates good adaptability between different experimental devices.
Conclusions
In this paper, the IPSO-WCNN method is proposed for rolling bearing fault diagnosis. First, the PSO algorithm is improved by using the particle self-adaptive jump out algorithm, self-adaptive inertia weight and time-varying acceleration coefficients. Second, the WCNN model based on vibration signals is designed to simplify the diagnosis process, and IPSO optimises the key parameters of WCNN. Third, the proposed method is applied to diagnose rolling bearing faults, and the results show that it has higher diagnostic accuracy than the compared PSO-CNN and WDCNN methods and good adaptability between different experimental devices.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by National Natural Science Foundation of China (Grant nos. 61463021 and 61963018), Key Natural Science Foundation of Jiangxi Province in China (Research on optimal feature extraction and intelligent visual diagnosis method of gearbox local fault under variable working conditions).
