Abstract

Existing transfer diagnosis methods based on entropy minimization are prone to trivial solutions. To solve this problem, a deep diversity maximization-based adversarial transfer diagnosis approach for rotating machinery is presented in this paper. First, a deep convolutional neural network is used as the feature encoder to learn the characteristics of vibration signals under different working conditions. A diversity maximization strategy is adopted to balance the entropy minimization and thereby avoid trivial local minima, so that the categories predicted by the resulting nontrivial domain adaptation method are more diverse. Moreover, entropy is used to evaluate the uncertainty of the classifier's predictions, and this deterministic strategy based on entropy is used to adjust the domain discriminator. The experimental study demonstrates the effectiveness of the developed method.

Keywords

Transfer learning, fault diagnosis, variable working conditions, condition monitoring

Introduction

Rotating machinery is widely used in important engineering fields such as aerospace, automobile manufacturing, rail transit, and wind power generation. The working conditions of rotating machinery are complicated and changeable. Mechanical equipment often breaks down during a long period of operation. Unexpected damage may lead to long downtime and high maintenance costs, and even pose a huge threat to security. It is crucial to carry out condition monitoring and diagnosis to ensure the reliable, continuous, and stable operation of machinery.^1,2

Deep learning has achieved great success in fault diagnosis owing to its strong feature learning ability. It provides an end-to-end solution for mechanical equipment fault diagnosis and jointly optimizes feature extraction and fault recognition without tedious manual feature extraction steps. Deep learning has built a bridge between mechanical big data and intelligent operation and maintenance, and has promoted the development of intelligent health monitoring of mechanical equipment. Deep learning models such as the deep belief network,³ stacked auto-encoder,⁴ convolutional neural network (CNN),⁵ and recurrent neural network⁶ have been successfully applied in intelligent fault diagnosis. However, their good classification performance usually relies on two basic assumptions: (1) the test data and training data are independent and identically distributed; (2) the task to be diagnosed has sufficient labeled fault samples.^7,8 As the discrepancy between training samples and test samples increases, the generalization performance of the trained model is significantly reduced.

Transfer learning relaxes the restriction in traditional machine learning that test data and training data must be independent and identically distributed. In transfer learning, the source domain and target domain do not need to follow the same distribution. It can mine domain-invariant features between different but related domains, so that labeled data and other supervised information can be transferred between the two domains.^9,10 In the area of fault diagnosis, transfer diagnosis methods based on traditional techniques have been studied, such as transfer component analysis,¹¹ singular value decomposition and TrAdaBoost.¹² Kang et al.¹³ introduced a transfer fault diagnosis method based on multifeature construction and variable mode decomposition.

In addition, deep transfer learning methods that combine deep learning and transfer learning have also been explored. Yang et al.¹⁴ developed a transfer diagnosis model called polynomial kernel maximum mean discrepancy (PK-MMD). Shao et al.¹⁵ described a transfer diagnosis method based on a fine-tuning transfer network. Jiao et al.¹⁶ designed two distinct one-dimensional convolution networks as the basic structure to learn discriminative and domain-invariant representations. Li et al.¹⁷ proposed training the model with a multi-layer equalization domain adaptation method. Wang et al.¹⁸ illustrated a transfer learning approach that uses a pre-trained CNN to extract features from different data sets, with an online incremental support vector machine used to classify the various cases. Hasan et al.¹⁹ presented a reliable transfer fault diagnosis approach using acoustic spectral imaging. Zhou et al.²⁰ gave a multi-level deep convolution transfer learning scheme to transfer the fault diagnosis ability to other instruments. Wang et al.²¹ introduced a deep multi-scale intra-class transfer diagnosis approach, aiming to reduce the distribution difference between working conditions. Liu et al.²² designed a joint distribution optimal domain adaptation approach for transfer fault diagnosis. An et al.²³ exploited self-learning transferable networks for mechanical fault diagnosis with unlabeled and unbalanced data. Similar research on deep transfer diagnosis can be found in Wu et al.²⁴ and Han et al.²⁵ Qian et al.²⁶ combined an improved DenseNet with joint distribution adaptation for transfer diagnosis. The core idea of the methods above is to learn the feature information of the two working conditions automatically with a deep learning model, and then to achieve knowledge transfer by shortening the gap between the two working conditions.

Generative adversarial networks (GANs) are widely used in unsupervised and semi-supervised learning to improve generalization ability.^27,28 The idea of the unsupervised GAN helps the target model learn domain-invariant representations from unlabeled data. Li et al.²⁹ described an adversarial domain adaptation approach using knowledge mapping. Jiao et al.³⁰ put forward a double-level adversarial model for cross-domain diagnosis. Chai and Zhao³¹ presented a fine-grained network for industrial fault diagnosis. Guo et al.³² developed an intelligent scheme named deep convolutional transfer learning network to realize cross-domain diagnosis. Li et al.³³ introduced a two-stage transfer diagnosis method for multi-fault detection, which can effectively separate multiple new unlabeled fault types. Si et al.³⁴ proposed an unsupervised deep network based on moment matching, in which the grayscale time-frequency image was used as the network input and two adaptation methods were applied to reduce the distribution difference. Jia et al.³⁵ designed a joint distribution adaptation-based transfer network to enhance feature extraction ability. Li et al.³⁶ described a class-weighted adaptive neural network that encourages positive transfer of the shared classes and ignores the source outliers. Shao et al.³⁷ gave an adversarial domain adaptation method combining the MMD and a domain confusion function. Zhang et al.³⁶ presented a small-sample intelligent fault diagnosis method using a multi-modal gradient penalty generative adversarial network. Li et al.³⁸ illustrated a new partial transfer fault diagnosis approach using a weighted adversarial transfer network, in which a weighted learning strategy was used to weight the contribution of samples to the domain discriminator and the source classifier (Figure 1).

Figure 1.

The network of DMATD.

The studies above successfully realized transfer diagnosis without labels in the target domain. However, they have the following deficiencies:

(1) They do not consider the problem of trivial solutions in the process of transfer adaptation, so the predicted categories are not diverse enough.

(2) They do not consider the uncertainty of the classifier's predictions, which degrades the performance of the discriminator.

To tackle these challenges, a deep diversity maximization-based adversarial transfer diagnosis (DMATD) approach for rotating machinery is proposed. Three key improvements are made as follows:

(1) The diversity maximization strategy is adopted to balance the entropy minimization, so as to avoid trivial local minima. The categories predicted by the nontrivial domain adaptation method are more diverse.

(2) The entropy of the prediction category vector is used to evaluate the uncertainty of the classifier's predictions, so as to adjust the domain discriminator. The entropy strategy and the designed network structure can be extended to other application scenarios.

(3) The input is the original vibration signal, which realizes end-to-end fault diagnosis. Experiments on rolling bearings under variable working conditions were designed to verify the approach.

The rest of this article is arranged as follows: Section “Preliminaries” gives the problem definition and preliminaries. Section “Proposed fault diagnosis approach” presents the DMATD diagnosis network in detail. In Section “Case study,” the case study is analyzed to verify the DMATD. Finally, main conclusions and future work are given in Section “Conclusion.”

Preliminaries

Problem definition

Suppose there is a labeled monitoring dataset $\{x_{i}^{s}, y_{i}^{s}\}_{i = 1}^{n_{s}}$ of mechanical equipment in the source working condition, where $y_{i}^{s} \in Ψ$ is the label of $x_{i}^{s}$, and the samples $x_{i}^{s}$ follow the distribution $P_{s} (X_{s})$. The monitoring dataset $\{x_{i}^{t}\}_{i = 1}^{n_{t}}$ in the target working condition follows the distribution $P_{t} (X_{t})$. $n_{s}$ and $n_{t}$ represent the number of source samples and target samples, respectively. The operating conditions of the two domains are quite different, that is, $P_{s} (X_{s}) \neq P_{t} (X_{t})$. The two domains must nevertheless share relevant typical fault information: the label space $Ψ^{s}$ of the source working condition must cover the target label space $Ψ^{t}$, that is, $Ψ^{t} \subseteq Ψ^{s} \subseteq Ψ$.³⁹ This paper aims to design a domain adaptation scheme that uses the rich knowledge and discrimination performance of the source working condition to identify the health state of the target working condition.

Feature encoder

The CNN has become an excellent feature encoder and has obtained outstanding performance in many fields.^40,41 Because the vibration signal is a one-dimensional time series, a one-dimensional deep convolution neural network (1D-DCNN) is adopted in this paper as the feature encoder to obtain the feature information of the source working condition and the target working condition.

Proposed fault diagnosis approach

The input x is a batch of source samples or target samples, and f is the feature obtained by the feature learning process. The feature encoder F, with parameters $θ_{f}$, maps x to a $d_{f}$-dimensional feature, that is, $f = F (x)$. The classifier G maps the feature f to the category prediction vector g, that is, $g = G (f)$. The discriminator D, with parameters $θ_{d}$, judges whether each sample belongs to the source working condition or the target working condition. Denoting the whole classification network by $Φ_{θ}$, the category prediction vector is

g = G (f) = Φ_{θ} (x) = [P (y = 1 | x), \cdots, P (y = K | x)]

(1)
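Equation (1) does not state how the classifier normalizes its outputs into probabilities. A minimal sketch, assuming a softmax output layer (a common choice, not confirmed by the text) and hypothetical raw scores for K = 4 health states:

```python
import math

def softmax(logits):
    """Turn raw classifier scores into a category prediction vector g."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the four health states (N, IF, OF, IOF).
g = softmax([2.0, 0.5, 0.1, -1.0])
print(g)  # K probabilities that sum to one
```

Each entry of `g` plays the role of $P (y = k | x)$ in equation (1).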

For supervised learning in the source domain, the loss function of the classifier is

L_{y} = \frac{1}{| D_{s} |} \sum_{(x, y) \in D_{s}} L (Φ_{θ} (x), y)

(2)

where $L (Φ_{θ} (x), y) = - \sum_{k = 1}^{K} y_{k} \log [Φ_{θ} (x)]_{k}$ is the cross-entropy loss, $y_{k}$ is the k-th entry of the one-hot label, and $| D_{s} |$ is the cardinality of the data set in the source working condition. For the unlabeled data $x_{t} \in D_{t}$ in the target working condition, the prediction category vector $Φ_{θ} (x_{t})$ of the target domain can be obtained by using the network $Φ_{θ}$. The prediction distribution of the target working condition is

\hat{q} (D_{t}) = \frac{1}{| D_{t} |} \sum_{x_{t} \in D_{t}} Φ_{θ} (x_{t}) \overset{Δ}{=} [{\hat{q}}_{1}, {\hat{q}}_{2}, \cdots, {\hat{q}}_{K}]

(3)

where

{\hat{q}}_{k} = \frac{1}{| D_{t} |} \sum_{x_{t} \in D_{t}} P (y_{t} = k | x_{t})

(4)

and $\sum_{k = 1}^{K} {\hat{q}}_{k} = 1$. $| D_{t} |$ is the cardinality of the data set in the target working condition. $\hat{q} (D_{t})$ changes dynamically during batch training. In order to adapt to the unlabeled target working condition, entropy minimization on the target domain is used for domain adaptation. The entropy minimization loss of the target working condition is as follows:

L_{e} (θ, D_{t}) = - \frac{1}{| D_{t} |} \sum_{x_{t} \in D_{t}} \langle Φ_{θ} (x_{t}), \log Φ_{θ} (x_{t}) \rangle

(5)
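The target-domain quantities in equations (3) to (5) can be sketched in a few lines. The batch values below are hypothetical, the small constant `EPS` is an assumed implementation detail to avoid `log(0)`, and the batch mean is taken with a positive sign so that its entries sum to one, as the text requires.

```python
import math

EPS = 1e-12  # assumed guard against log(0); not part of the paper's equations

def mean_prediction(preds):
    """Equations (3)-(4): average the prediction vectors over a target batch."""
    n = len(preds)
    return [sum(column) / n for column in zip(*preds)]

def entropy(p):
    """Shannon entropy -sum_k p_k log p_k of one probability vector."""
    return -sum(pk * math.log(pk + EPS) for pk in p)

def entropy_min_loss(preds):
    """Equation (5): mean per-sample prediction entropy over the batch."""
    return sum(entropy(p) for p in preds) / len(preds)

# Hypothetical target batch: two samples, K = 4 classes.
batch = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1]]
q_hat = mean_prediction(batch)
l_e = entropy_min_loss(batch)
print(q_hat, l_e)
```

Minimizing `l_e` pushes every individual prediction toward a one-hot vector, which is what makes the target predictions confident.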

Compared with using a more complex cross-domain discrepancy, end-to-end training of the network parameters $θ$ by entropy minimization is more direct and simpler. However, using entropy minimization alone may yield trivial solutions when minimizing the target risk. A trivial solution assigns all target samples to a single prediction category, whereas the categories predicted by a nontrivial domain adaptation method are more diverse. A new regularization term is therefore needed to steer the optimizer away from trivial local minima and toward the global minimum.

The entropy of $\hat{q} (D_{t})$ is calculated to evaluate the diversity of the prediction categories in the target domain, and the additional regularization term is

L_{d} (θ, D_{t}) \overset{Δ}{=} E (\hat{q} (D_{t})) = - \sum_{k = 1}^{K} {\hat{q}}_{k} \log {\hat{q}}_{k}

(6)
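A small numerical check illustrates why the diversity term in equation (6) separates a trivial solution from a useful one. In this sketch (hypothetical batches, K = 4, with an assumed `eps` guard against `log(0)`), both batches have zero entropy minimization loss, but only the diverse batch has a large $L_{d}$.

```python
import math

def entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector; eps guards log(0)."""
    return -sum(pk * math.log(pk + eps) for pk in p)

def losses(preds):
    """Return (L_e, L_d) for a batch of prediction vectors."""
    n = len(preds)
    l_e = sum(entropy(p) for p in preds) / n        # equation (5)
    q_hat = [sum(col) / n for col in zip(*preds)]   # equations (3)-(4)
    l_d = entropy(q_hat)                            # equation (6)
    return l_e, l_d

K = 4
# Trivial solution: every target sample is confidently assigned to class 0.
collapsed = [[1.0, 0.0, 0.0, 0.0] for _ in range(K)]
# Nontrivial solution: confident predictions spread over all K classes.
diverse = [[1.0 if j == i else 0.0 for j in range(K)] for i in range(K)]

le_c, ld_c = losses(collapsed)
le_d, ld_d = losses(diverse)
print(le_c, ld_c)  # both close to zero
print(le_d, ld_d)  # L_e close to zero, L_d close to log(K)
```

Because $L_{d}$ is maximized, the collapsed batch is penalized relative to the diverse one even though entropy minimization alone cannot tell them apart.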

The discriminator D is constructed to distinguish the features of the source working condition from those of the target working condition, while the generator G is trained in the min-max adversarial mode. Learning domain-invariant features means finding the parameter $θ_{f}$ that maximizes the loss of the domain classifier, so that the two feature distributions become as similar as possible, and finding the parameter $θ_{d}$ that minimizes the loss of the domain classifier.⁴² Conventional discriminators treat all samples as equally transferable. However, different samples have different transferability, and samples with poor transferability may have an adverse impact on adaptation. To reduce this impact, the entropy of the class prediction vector $E (g) = - \sum_{k = 1}^{K} g_{k} \log (g_{k})$ is used to evaluate the uncertainty of the classification result, and the weight of the predicted result is calculated as $ω (E (g)) = 1 + e^{- E (g)}$. Using this deterministic strategy based on entropy to adjust the domain discriminator, the final minimax objective is

\begin{aligned} \min_{G} \; & L_{y} + λ L_{e} (θ, D_{t}) - η L_{d} (θ, D_{t}) + \frac{1}{| D_{s} |} \sum_{x_{s} \in D_{s}} ω (E (g_{s})) \log [D (F (x_{s}))] \\ & + \frac{1}{| D_{t} |} \sum_{x_{t} \in D_{t}} ω (E (g_{t})) \log [1 - D (F (x_{t}))] \end{aligned}

(7)

\begin{aligned} \max_{D} \; & \frac{1}{| D_{s} |} \sum_{x_{s} \in D_{s}} ω (E (g_{s})) \log [D (F (x_{s}))] \\ & + \frac{1}{| D_{t} |} \sum_{x_{t} \in D_{t}} ω (E (g_{t})) \log [1 - D (F (x_{t}))] \end{aligned}

(8)

where $λ$ and $η$ are tradeoff coefficients, and $g_{s}$ and $g_{t}$ are the prediction category vectors of the source and target working conditions, respectively.
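The entropy-aware weighting and the objective in equations (7) and (8) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: each sample is weighted by the entropy of its own prediction vector, `d` stands for the discriminator output $D (F (x))$ interpreted as the probability of belonging to the source domain, and all numbers are hypothetical. The default tradeoff values 0.1 and 0.4 are the ones reported in the case study.

```python
import math

LAMBDA, ETA = 0.1, 0.4  # tradeoff coefficients reported in the case study

def entropy(g, eps=1e-12):
    """E(g) = -sum_k g_k log g_k; eps guards log(0)."""
    return -sum(gk * math.log(gk + eps) for gk in g)

def weight(g):
    """Entropy-aware weight 1 + exp(-E(g)); confident samples weigh more."""
    return 1.0 + math.exp(-entropy(g))

def adversarial_value(src, tgt, eps=1e-12):
    """Weighted adversarial term shared by equations (7) and (8).

    src and tgt are lists of (g, d) pairs, where g is the category
    prediction vector and d is the discriminator output for that sample.
    """
    s = sum(weight(g) * math.log(d + eps) for g, d in src) / len(src)
    t = sum(weight(g) * math.log(1.0 - d + eps) for g, d in tgt) / len(tgt)
    return s + t

def generator_objective(l_y, l_e, l_d, adv, lam=LAMBDA, eta=ETA):
    """Quantity minimized over G in equation (7)."""
    return l_y + lam * l_e - eta * l_d + adv

# Hypothetical batch: one confident source sample, one uncertain target sample.
src = [([1.0, 0.0, 0.0, 0.0], 0.5)]
tgt = [([0.25, 0.25, 0.25, 0.25], 0.5)]
adv = adversarial_value(src, tgt)
print(weight(src[0][0]), weight(tgt[0][0]), adv)
```

The weight ranges from 2 for a one-hot prediction down to $1 + 1 / K$ for a uniform one, so uncertain samples still contribute to the discriminator but with less influence.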

The general procedure of the DMATD is displayed in Figure 2, and the complete steps are as follows:

(1) Collect the labeled monitoring data $\{x_{i}^{s}, y_{i}^{s}\}_{i = 1}^{n_{s}}$ in the source working condition and the unlabeled monitoring data $\{x_{i}^{t}\}_{i = 1}^{n_{t}}$ in the target working condition.

(2) Design the DMATD model and initialize it randomly, then input the constructed training datasets to the DMATD. Calculate the classification loss of the source working condition, and calculate the entropy minimization loss and the diversity maximization loss to avoid trivial local minima. Furthermore, calculate the discrimination loss to make the feature distributions of the two domains as similar as possible. Finally, optimize the DMATD by using the total loss.

(3) Input the monitoring data $\{x_{i}^{t}\}_{i = 1}^{n_{t}}$ of the target working condition to the trained DMATD and identify the health status of the target working condition.

Figure 2.

The general procedure of the DMATD.

Case study

In this part, the experiments of rolling bearings under variable working conditions were carried out and six transfer tasks were created. The case study is analyzed to verify the DMATD.

Dataset description

The case data came from the acceleration bearing life test (ABLT-1A) bench, as displayed in Figure 3(a). Figure 3(b) shows the data acquisition system. The tested bearing was a 6204 single-row deep groove ball bearing. In order to effectively detect the running state of each bearing, the vibration of the four bearings was measured by four single-axis acceleration sensors. Four thermocouple sensors and an acceleration sensor were used to detect the temperature of the bearing outer rings and the vibration of the whole test bench, respectively. The four installed bearings were numbered bearing 1, bearing 2, bearing 3, and bearing 4 from left to right. Bearing 1 was set to one of four states: normal (N), inner and outer race compound fault (IOF), outer race fault (OF), and inner race fault (IF). The other three bearings were normal.

Figure 3.

The experiments of rolling bearings: (a) The test bench, (b) data acquisition system, (c) bearing installation, and (d) four states of bearing 1.

The faults on the rolling bearings were 1.8 mm deep and 1.2 mm wide. The installation of the four bearings and the tested bearings are shown in Figure 3(c) and (d), respectively. The radial load was applied by adding weights, and the pressure was transmitted to the test head at a ratio of 100:1. The working conditions designed in this experiment were A, 1800 rpm, 2.5 kN; B, 2100 rpm, 2.5 kN; and C, 2100 rpm, 5 kN. For example, working condition A means that the speed of the bearing was 1800 rpm and the designed working load was 2.5 kN. The data acquisition card in the experiments was a National Instruments 9234. The sampling frequency of the whole experiment was 12.8 kHz. The bearing dataset is described in Table 1.

Table 1.

Description of the bearing dataset.

Experimental conditions	N (no. of samples)	IF (no. of samples)	OF (no. of samples)	IOF (no. of samples)	Speed (rpm)	Load (kg)
Condition A	4275	5775	4350	4725	1800	5
Condition B	4312	4237	3962	4075	2100	5
Condition C	4187	4237	4025	4212	2100	10

N: normal; IOF: inner and outer race compound fault; OF: outer race fault; IF: inner race fault.

Experimental results

The acquisition of data set and the construction of network

To verify the DMATD, six transfer tasks were designed. For example, transfer task A → C indicates that 100% of the labeled source data from condition A and 50% of the unlabeled target data from condition C were used for training, and 50% of the target data from condition C was used for testing. The target domain did not have health labels. Figure 4 shows the vibration signals of working conditions A, B, and C.
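The six transfer tasks correspond to the ordered pairs of the three working conditions. The sketch below enumerates them and computes the data volumes of one task from the sample counts in Table 1. It assumes that the 50% of target data used for testing is the half not used for adaptation; the text does not state how the split was drawn, so the helper `task_sizes` is purely illustrative.

```python
from itertools import permutations

# Sample counts per health state, copied from Table 1.
COUNTS = {
    "A": {"N": 4275, "IF": 5775, "OF": 4350, "IOF": 4725},
    "B": {"N": 4312, "IF": 4237, "OF": 3962, "IOF": 4075},
    "C": {"N": 4187, "IF": 4237, "OF": 4025, "IOF": 4212},
}

def task_sizes(source, target, adapt_fraction=0.5):
    """Data volumes for one transfer task under the assumed disjoint split."""
    n_source = sum(COUNTS[source].values())
    n_target = sum(COUNTS[target].values())
    n_adapt = int(n_target * adapt_fraction)
    return {
        "source_labeled": n_source,     # 100% of the source data, with labels
        "target_unlabeled": n_adapt,    # target data used for adaptation
        "target_test": n_target - n_adapt,
    }

tasks = list(permutations("ABC", 2))  # ordered (source, target) pairs
print(tasks)
print(task_sizes("A", "C"))
```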

Figure 4.

The vibration signals in the time domain: (a) Designed work state A, (b) designed work state B, and (c) designed work state C.

The feature encoder network in this paper was constructed with the 1D-DCNN, and its structure is given in Table 2. The original vibration data of the rolling bearings were input into the 1D-DCNN, and each sample had 1024 data points. The learning rate was set to $10^{- 3}$. The tradeoff parameters $λ$ and $η$ were each selected from {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1}. Each parameter pair was run 10 times, and the average accuracy was taken. The results are shown in Figure 5. For instance, when $λ$ is 0.1 or 0.2, the model achieves stable recognition accuracy with $η \in [0.2, 0.5]$. Finally, $λ$ and $η$ were set to 0.1 and 0.4, respectively.
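As a quick sanity check on the input length, the following sketch combines the 12.8 kHz sampling frequency from the dataset description with the 1024-point sample length to estimate how much shaft rotation one sample covers. The function names are illustrative only.

```python
SAMPLING_HZ = 12_800   # sampling frequency from the dataset description
WINDOW_POINTS = 1024   # data points per input sample

def window_seconds(points=WINDOW_POINTS, fs=SAMPLING_HZ):
    """Duration of one input sample in seconds."""
    return points / fs

def revolutions_per_window(rpm, points=WINDOW_POINTS, fs=SAMPLING_HZ):
    """Number of shaft revolutions covered by one input sample."""
    return rpm / 60.0 * window_seconds(points, fs)

for rpm in (1800, 2100):
    print(rpm, window_seconds(), revolutions_per_window(rpm))
```

Each sample lasts 0.08 s, which spans roughly 2.4 revolutions at 1800 rpm and 2.8 revolutions at 2100 rpm, so every sample contains more than two full shaft rotations.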

Table 2.

The structure of feature encoder.

Layer	Parameters
Convolution-Pooling 1	KW (Kernel length) = 101; KC (Kernel channel) = 12; SC (Strides of convolution) = 1; Padding = 0; KP (Kernel size of pooling) = 2; SP (Strides of pooling) = 2; BN (Batch normalization); ReLU (Rectified Linear Unit)
Convolution-Pooling 2	KW = 101; KC = 18; SC = 1; Padding = 0; KP = 2; SP = 2; BN; ReLU
Convolution-Pooling 3	KW = 101; KC = 24; SC = 1; Padding = 0; KP = 2; SP = 2; BN; ReLU
Fully connected layer	Nodes: 960; BN; ReLU; Dropout rate: 0.5
Fully connected layer	Nodes: 500; BN; ReLU; Dropout rate: 0.5
Classification	Classes
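The layer settings in Table 2 determine the length of the feature map after each convolution-pooling stage. The sketch below applies the usual output-length formulas, assuming unpadded pooling with floor rounding (an assumption, since the table does not specify the rounding mode), to a 1024-point input.

```python
def conv_out(length, kernel, stride=1, padding=0):
    """Output length of a 1D convolution."""
    return (length + 2 * padding - kernel) // stride + 1

def pool_out(length, kernel=2, stride=2):
    """Output length of a 1D pooling layer without padding (floor rounding)."""
    return (length - kernel) // stride + 1

def encoder_flat_size(length=1024, channels=(12, 18, 24), kernel=101):
    """Final feature-map length and flattened size after the three stages."""
    for _ in channels:
        length = pool_out(conv_out(length, kernel))
    return length, length * channels[-1]

print(encoder_flat_size())  # lengths go 1024 -> 462 -> 181 -> 40
```

Under these assumptions the flattened feature has 40 × 24 = 960 values, which coincides with the 960 nodes listed for the first fully connected layer.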

Figure 5.

The analysis of tradeoff parameters.

The training of transfer diagnosis network

For the transfer task C → A, the changes in the total training loss, the classification loss, and the diversity maximization loss are shown in Figure 6. The total training loss and the classification loss converge well, and the diversity loss dynamically fine-tunes the total training loss. The transfer diagnosis model was thus built after training. To visualize the features learned from the source data and target data, t-SNE⁴³ is used for dimension reduction of the learned features. Figure 7 shows the visualization for C → A. It is evident that the features of the source and target working conditions that belong to the same category are well clustered.

Figure 6.

Three losses in the training process.

Figure 7.

The t-SNE visualization of the two domains.

Health status identification of working conditions in the target domain

The data set $\{x_{i}^{t}\}_{i = 1}^{n_{t}}$ of the target working condition was input to the trained transfer diagnosis model to identify the health status of the target working condition.

Comparison with other methods

Five other methods are compared to verify the effectiveness of the DMATD. In method 1, the CNN is trained directly on the source domain without domain adaptation and then used to diagnose the target working condition. In method 2, the difference between the two domains is reduced by MMD.⁴⁴ In method 3, domain adaptation is performed with the correlation alignment (CORAL) loss.⁴⁵ In method 4, the conditional domain adversarial network (CDAN)⁴⁶ is used. In method 5, entropy minimization without diversity maximization is adopted to achieve domain adaptation. Table 3 displays the comparison between the DMATD and these methods. The average accuracy of the DMATD is 30.5, 19.9, 24.3, 10.0, and 7.5 percentage points higher than that of CNN, MMD, CORAL, CDAN, and entropy minimization only (EMO), respectively.

Table 3.

Comparison with five other methods.

Methods	A → B	B → A	A → C	C → A	B → C	C → B	Average
CNN	79.6 ± 6.2	78.9 ± 3.6	42.6 ± 7.3	59.9 ± 14.4	57.1 ± .7	58.3 ± 9.3	62.7
MMD	99.3 ± 0.0	87.1 ± .0	56.1 ± 0.0	63.1 ± 0.0	63.2 ± .0	70.8 ± 0.0	73.3
CORAL	81.4 ± 1.4	81.2 ± .0	51.2 ± 2.9	61.4 ± 4.5	59.5 ± .0	78.6 ± 3.3	68.9
CDAN	99.9 ± 0.0	83.2 ± 7.6	61.9 ± 5.7	92.8 ± 16.0	63.6 ± .8	97.9 ± 3.9	83.2
EMO	100 ± 0.0	95.2 ± .7	59.3 ± 1.2	90.5 ± 2.8	70.3 ± .6	98.9 ± 1.2	85.7
DMATD	100 ± 0.0	100 ± .0	82.9 ± 0.5	100 ± 0.0	76.1 ± .3	100 ± 0.0	93.2

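CNN: convolutional neural network; MMD: maximum mean discrepancy; CORAL: correlation alignment; CDAN: conditional domain adversarial network; EMO: entropy minimization only; DMATD: deep diversity maximization-based adversarial transfer diagnosis.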
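The Average column of Table 3 and the margins quoted in the text can be reproduced from the per-task mean accuracies. The sketch below copies only the mean values from the table (the standard deviations are not needed) and recomputes both.

```python
# Mean accuracies (%) per task, copied from Table 3 in the order
# A→B, B→A, A→C, C→A, B→C, C→B.
ACCURACY = {
    "CNN":   [79.6, 78.9, 42.6, 59.9, 57.1, 58.3],
    "MMD":   [99.3, 87.1, 56.1, 63.1, 63.2, 70.8],
    "CORAL": [81.4, 81.2, 51.2, 61.4, 59.5, 78.6],
    "CDAN":  [99.9, 83.2, 61.9, 92.8, 63.6, 97.9],
    "EMO":   [100.0, 95.2, 59.3, 90.5, 70.3, 98.9],
    "DMATD": [100.0, 100.0, 82.9, 100.0, 76.1, 100.0],
}

# Average accuracy per method, rounded to one decimal as in the table.
averages = {m: round(sum(v) / len(v), 1) for m, v in ACCURACY.items()}

# Margin of DMATD over each baseline, in percentage points.
margins = {
    m: round(averages["DMATD"] - a, 1)
    for m, a in averages.items()
    if m != "DMATD"
}

print(averages)
print(margins)
```

The recomputed margins are differences between average accuracies, so they are expressed in percentage points rather than as relative improvements.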

The t-SNE is used to visualize the features of the health states of the target working condition, as displayed in Figure 8. In Figure 8(a), the scattered points of the same class are well clustered together. The DMATD not only shortens the gap between the two domains but also expands the distance between different categories, indicating that the DMATD can identify the target working condition well. In Figure 8(b), there are still some wrongly identified points. In Figure 8(c), the scatter diagram is chaotic and misclassification is obvious. The methods with domain adaptation raise the identification accuracy of the target working condition, whereas the method without domain adaptation identifies the target working condition poorly. The confusion matrices are depicted in Figure 9. Due to the limited length of the paper, the visualizations of the MMD, CORAL, and CDAN methods are not presented here. The accuracy of the MMD, CORAL, and CDAN methods lies between that of the DMATD and that of the CNN.

Figure 8.

The t-SNE dimension reduction of the different methods: (a) DMATD, (b) EMO, and (c) CNN.

Figure 9.

The corresponding confusion matrix: (a) DMATD, (b) EMO, and (c) CNN.

The performance analysis of the DMATD

The training time of different methods

Figure 10 displays the training time of the different methods. The time consumption of the DMATD is in the middle of the compared methods and is acceptable.

Figure 10.

The training time of different methods.

The comparison of the sensitivity and specificity

The DMATD is further assessed by its sensitivity and specificity. True positive (TP) denotes the number of positive samples correctly identified as positive. True negative (TN) denotes the number of negative samples correctly identified as negative. False positive (FP) denotes the number of negative samples misidentified as positive. False negative (FN) denotes the number of positive samples misidentified as negative. The sensitivity $V_{SE}$ and specificity $V_{SP}$ are therefore defined as:

V_{SE} = TP / (TP + FN)

(9)

V_{SP} = TN / (TN + FP)

(10)

where $V_{SE}$ is the proportion of actual positives that are correctly classified, and $V_{SP}$ is the proportion of actual negatives that are correctly classified. These measures extend to multi-class tasks by treating each category in turn as the positive class and the remaining categories as negative classes. The comparison results of sensitivity and specificity are displayed in Figure 11. The $V_{SE}$ and $V_{SP}$ of the DMATD are 96.1% and 97.8%, respectively, which are the best among the compared methods.
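The one-versus-rest extension of equations (9) and (10) can be sketched from a confusion matrix. The matrix below is hypothetical and is not taken from Figure 9, and averaging the per-class values (macro averaging) is an assumption, since the text does not say how the per-class rates were aggregated.

```python
def one_vs_rest_rates(confusion):
    """Per-class (sensitivity, specificity) from a confusion matrix.

    confusion[i][j] is the number of class-i samples predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    rates = []
    for c in range(k):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # class c missed
        fp = sum(confusion[r][c] for r in range(k)) - tp  # others called c
        tn = total - tp - fn - fp
        rates.append((tp / (tp + fn), tn / (tn + fp)))
    return rates

def macro_average(rates):
    """Average sensitivity and specificity over classes (assumed aggregation)."""
    n = len(rates)
    return (sum(r[0] for r in rates) / n, sum(r[1] for r in rates) / n)

# Hypothetical 4-class confusion matrix (rows: true N, IF, OF, IOF).
confusion = [
    [50, 0, 0, 0],
    [0, 45, 5, 0],
    [0, 0, 50, 0],
    [0, 0, 0, 50],
]
rates = one_vs_rest_rates(confusion)
v_se, v_sp = macro_average(rates)
print(rates, v_se, v_sp)
```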

Figure 11.

The analysis of sensitivity and specificity.

Conclusion

A DMATD approach for rotating machinery is presented in this paper to solve the problem that existing transfer diagnosis methods based on entropy minimization are prone to trivial solutions. The average accuracy of the DMATD method is 30.5, 19.9, 24.3, 10.0, and 7.5 percentage points higher than that of CNN, MMD, CORAL, CDAN, and EMO, respectively. The diversity maximization strategy balances the entropy minimization effectively, so as to avoid trivial local minima, and the categories predicted by the nontrivial domain adaptation method are more diverse. Moreover, entropy is able to evaluate the uncertainty of the predicted results, and the deterministic strategy based on entropy can adjust the domain discriminator. Finally, the DMATD can achieve fault diagnosis of rotating machinery under variable working conditions.

The author’s research in the future will mainly focus on domain generalization.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is supported by the National Natural Science Foundation of China (Grant No. 52105517).

ORCID iDs

Daoming She

Xiaoan Yan

References

1. Pan, Hong, Chen, et al. Performance degradation assessment of wind turbine gearbox based on maximum mean discrepancy and multi-sensor transfer learning. Struct Health Monit Int J 2021; 20: 118–138.

2. Zhang. Data privacy preserving federated transfer learning in machinery fault diagnostics using prior distributions. Struct Health Monit Int J 2022; 21: 1329–1344.

3. Shao, Jiang, Zhang, et al. Electric locomotive bearing fault diagnosis using a novel convolutional deep belief network. IEEE Trans Ind Electron 2018; 65: 2727–2736.

4. Sun, Zhao, et al. Sparse deep stacking network for fault diagnosis of motor. IEEE Trans Ind Inf 2018; 14: 3261–3270.

5. Guo, Chen, Shen. Hierarchical adaptive deep convolution neural network and its application to bearing fault diagnosis. Measurement 2016; 93: 490–502.

6. Zhang, Sun, et al. Deep hybrid state network with feature reinforcement for intelligent fault diagnosis of delta 3-D printers. IEEE Trans Ind Inf 2020; 16: 779–789.

7. Niu, Liu, Wang, et al. A decade survey of transfer learning (2010–2020). IEEE Trans Artif Intell 2020; 1: 151–166.

8. Bao, Gao, et al. A new dam structural response estimation paradigm powered by deep learning and transfer learning techniques. Struct Health Monit Int J 2022; 21: 770–787.

9. Tan, Guo, Gao, et al. Deep coupled joint distribution adaptation network: a method for intelligent fault diagnosis between artificial and real damages. IEEE Trans Instrum Meas 2021; 70: 1–12.

10. Zhang, Wang. Blockchain-based decentralized federated transfer learning methodology for collaborative machinery fault diagnosis. Reliab Eng Syst Saf 2023; 229: 12.

11. Xie, Zhang, Duan, et al. On cross-domain feature fusion in gearbox fault diagnosis under various operating conditions based on transfer component analysis. In: 2016 IEEE international conference on prognostics and health management (ICPHM), Ottawa, ON, Canada, 2016, pp. 1–6. IEEE.

12. Zhang, Tao, et al. Transfer learning with neural networks for bearing fault diagnosis in changing working conditions. IEEE Access 2017; 5: 14347–14357.

13. Kang, Wang, et al. Fault diagnosis method of a rolling bearing under variable working conditions based on feature transfer learning. Zhongguo Dianji Gongcheng Xuebao/Proc Chin Soc Electr Eng 2019; 39: 764–772.

14. Yang, Lei, Jia, et al. A polynomial kernel induced distance metric to improve deep transfer learning for fault diagnosis of machines. IEEE Trans Ind Electron 2019; 67: 9747–9757.

15. Shao, McAleer, Yan, et al. Highly accurate machine fault diagnosis using deep transfer learning. IEEE Trans Ind Inf 2019; 15: 2446–2455.

16. Jiao, Zhao, Lin, et al. Classifier inconsistency-based domain adaptation network for partial transfer intelligent diagnosis. IEEE Trans Ind Inf 2019; 16: 5965–5974.

17. Tang, Deng, et al. Deep balanced domain adaptation neural networks for fault diagnosis of planetary gearboxes with limited labeled data. Measurement 2020; 156: 107570.

18. Wang, Zhang, et al. Ensemble diagnosis method based on transfer learning and incremental learning towards mechanical big data. Measurement 2020; 155: 107517.

19. Hasan, Islam MMM, Kim J-M. Acoustic spectral imaging and transfer learning for reliable bearing fault diagnosis under variable speed conditions. Measurement 2019; 138: 620–631.

20. Zhou, Zheng L-Y, Wang, et al. A multistage deep transfer learning method for machinery fault diagnostics across diverse working conditions and devices. IEEE Access 2020; 8: 80879–80898.

21. Wang, Shen, Xia, et al. Multi-scale deep intra-class transfer learning for bearing fault diagnosis. Reliab Eng Syst Saf 2020; 202: 107050.

22. Liu Z-H, B-L, Wei H-L, et al. Fault diagnosis for electromechanical drivetrains using a joint distribution optimal deep domain adaptation approach. IEEE Sens J 2019; 19: 12261–12270.

23. Jiang, Cao, et al. Self-learning transferable neural network for intelligent fault diagnosis of rotating machinery with unlabeled and imbalanced data. Knowledge-Based Syst 2021; 230: 107374.

24. Tang, Chen, et al. A study on adaptation lightweight architecture based deep learning models for bearing fault diagnosis under varying working conditions. Expert Syst Appl 2020; 160: 113710.

25. Han, Liu, Yang, et al. Deep transfer network with joint distribution adaptation: a new intelligent fault diagnosis framework for industry application. ISA Trans 2020; 97: 269–281.

26. Qian, Jiang, Shen, et al. An intelligent fault diagnosis method for rolling bearings based on feature transfer with improved DenseNet and joint distribution adaptation. Meas Sci Technol 2022; 33: 025101.

27. Ganin, Ustinova, Ajakan, et al. Domain-adversarial training of neural networks. J Mach Learn Res 2016; 17: 2096–2030.

28. Information generative Bayesian adversarial networks: a representation learning model for transmission gear parameters. IEEE/ASME Trans Mechatronics 2019; 24: 1998–2007.

29. Shen, Chen, et al. Knowledge mapping-based adversarial domain adaptation: a novel fault diagnosis method with high generalizability under variable working conditions. Mech Syst Signal Process 2021; 147: 107095.

30. Jiao, Lin, Zhao, et al. Double-level adversarial domain adaptation network for intelligent fault diagnosis. Knowledge-Based Syst 2020; 205: 106236.

31. Chai, Zhao. A fine-grained adversarial network method for cross-domain industrial fault diagnosis. IEEE Trans Autom Sci Eng 2020; 17: 1432–1442.

32. Guo, Lei, Xing, et al. Deep convolutional transfer learning network: a new method for intelligent fault diagnosis of machines with unlabeled data. IEEE Trans Ind Electron 2018; 66: 7316–7325.

33. Huang, et al. A two-stage transfer adversarial network for intelligent fault diagnosis of rotating machinery with multiple new faults. IEEE/ASME Trans Mechatronics 2021; 26: 1591–1601.

34. Shi, Chen, et al. Unsupervised deep transfer learning with moment matching: a new intelligent fault diagnosis approach for bearings. Measurement 2021; 172: 108827.

35. Jia, Deng, et al. Joint distribution adaptation with diverse feature aggregation: a new transfer learning framework for bearing diagnosis across different machines. Measurement 2022; 187: 110332.

36. Zhang, et al. Partial transfer learning in machinery cross-domain fault diagnostics using class-weighted adversarial networks. Neural Networks 2020; 129: 313–322.

37. Shao, Huang, Zhu. Transfer learning method based on adversarial domain adaption for bearing fault diagnosis. IEEE Access 2020; 8: 119421–119430.

38. Chen. A novel weighted adversarial transfer network for partial domain fault diagnosis of machinery. IEEE Trans Ind Inf 2021; 17: 1753–1762.

39. Liu, Qin, Shi, et al. TScatNet: an interpretable cross-domain intelligent diagnosis model with antinoise and few-shot learning capability. IEEE Trans Instrum Meas 2021; 70: 1–10.

40. Hoang, Kang. A motor current signal-based bearing fault diagnosis using deep learning and information fusion. IEEE Trans Instrum Meas 2020; 69: 3325–3333.

41. She, Jia. Wear indicator construction of rolling bearings based on multi-channel deep convolutional neural network with exponentially decaying learning rate. Measurement 2019; 135: 368–375.

42. Zhang, et al. Universal domain adaptation in fault diagnostics with hybrid weighted deep adversarial learning. IEEE Trans Ind Inf 2021; 17: 7957–7967.

43. Van der Maaten, Hinton. Visualizing data using t-SNE. J Mach Learn Res 2008; 9: 2579–2605.

44. Ghifary, Kleijn, Zhang. Domain adaptive neural networks for object recognition. In: Pacific Rim international conference on artificial intelligence, Queensland, Australia, 2014, pp. 898–904. Springer.

45. Sun, Saenko. Deep coral: correlation alignment for deep domain adaptation. In: European conference on computer vision, Amsterdam, Netherlands, 2016, pp. 443–450. Springer.

46. Long, Cao, Wang, et al. Conditional adversarial domain adaptation. In: Advances in neural information processing systems, Montréal, Canada, 2018, pp. 1640–50.

Diversity maximization-based transfer diagnosis approach of rotating machinery
