Abstract
This article deals with incipient faults of insulated-gate bipolar transistors to improve the safety of the traction systems of China Railway High-speed 5. Exploiting the pulse width modulated strategy, which makes signals vary periodically, a multi-mode kernel principal component analysis algorithm is proposed. It can effectively capture the tiny changes caused by incipient faults and also detect short-circuit faults of insulated-gate bipolar transistors in electrical systems. In the feature space of every mode, a separate threshold is formed for the corresponding mode. The proposed scheme is tested on an experimental setup of the traction system of China Railway High-speed 5 with an incipient fault and a short-circuit fault, and the experimental results show that the multi-mode kernel principal component analysis has superior monitoring performance compared to five other methods.
Keywords
Introduction
The three-phase pulse width modulated (PWM) voltage source inverter is one of the most important pieces of equipment in the traction system of high-speed railway trains.1,2 Because of performance degradation of inverter components such as the insulated-gate bipolar transistor (IGBT), faults of IGBTs may deteriorate the whole operating condition1,3 and even lead to emergency power shutdown or unplanned standstill.4 Thus, real-time fault detection and diagnosis (FDD) that can explore fault features and types is urgently needed for the inverters in high-speed railway trains to improve their stability and reliability.
For the inverters in variable speed drives, there are many components that should be diagnosed and many types of faults that need to be determined. As a rough classification, around 38% of faults are due to failures of IGBTs.3,4 Faults in IGBTs are usually divided into three types,5 namely short-circuit faults, open-circuit faults, and intermittent faults. Many methods have been proposed to deal with open-circuit and short-circuit faults of IGBTs. For example, the sliding mode observer,6 wavelet neural networks,7 and the model reference adaptive system8 were used to detect open-circuit faults; de-saturation detection,9 snubbers and clamp circuits,10 and vector composition of the inverter output voltage11 were used to handle short-circuit faults; for intermittent fault detection, time domain methods are more popular, because the fault features are insignificant in steady state. Besides that, graph theoretic approaches, pattern recognition, and intelligent algorithms are usually considered.12–16 Although the above-mentioned methods can be useful for detecting and locating some inverter faults, they are difficult to implement in real systems because some information, such as the accurate stray inductance and external noises, is unavailable. In addition, most of the common methods are model-based or knowledge-based, which may be less powerful for systems with complicated working environments and faults of various types. For the model-based methods, it is hard to set up exact mathematical models of the three-phase inverter because of the complex structure of the inverter in high-speed railway trains. The main challenge for the knowledge-based methods is that expertise may be difficult to acquire. Therefore, traditional FDD struggles to perform well.
With the development of sensor technology, data storage, and analysis technologies, many advanced data-driven FDD algorithms17–21 have been developed in the past 20 years, which provide new ways to solve FDD problems in complex electrical systems, such as the traction system of China Railway High-speed 5 (CRH5).
Different from traditional schemes,1–16 data-driven methods have also been broadly investigated in FDD of industrial processes.17–23 As pointed out in Yin et al.,24 many multivariate statistical methods, such as principal component analysis (PCA) and its variants, have been developed according to the systems being monitored. In Lee et al.,17 kernel principal component analysis (KPCA) was suggested to monitor a nonlinear biological wastewater treatment process. In semiconductor manufacturing, multiblock PCA was proposed to detect abnormal process operations. Likewise, multiblock PCA with an adaptive strategy was developed for FDD in a sequencing batch reactor. Aiming at applications with nonlinear and multi-mode features, a multi-mode FDD scheme based on similarities was presented in Zhang et al.21 Besides that, some improved methods were motivated by other aspects; for example, a fault-relevant PCA that uses the available fault information was proposed;20 the multivariate exponentially weighted moving average (MEWMA) PCA, which keeps a memory of the variation trend, was investigated for incipient abnormalities.
For the traction system of CRH5, signals are usually periodic, highly correlated, and nonlinear, which makes the existing data-driven FDD methods not applicable. Moreover, data-driven FDD methods for common faults in the inverter of an electrical traction system have not been intensively investigated yet. The incipient fault of an IGBT usually goes unnoticed at its early stage but may evolve into a serious fault and increase the risk of possible hazards; it is rarely considered in the existing literature. These issues motivate us to focus on the study of data-based incipient FDD for the inverter of CRH5.
An effective fault detection scheme based on multi-mode KPCA is proposed in this article. First, the multi-mode feature of the PWM inverter is analyzed in detail, based on which a multi-mode KPCA algorithm is proposed. The proposed fault detection method consists of off-line analysis and on-line detection. For the off-line part, the collected data set is divided into six modes according to the gate trigger pulses of the IGBTs, and then an off-line model is established for the data of each of the six modes. For the on-line part, the real-time collected data are assigned to the corresponding modes based on the gate trigger signals, and the evaluation functions are then calculated to determine whether the inverter is running healthily. To illustrate its effectiveness relative to other data-driven methods, PCA,25 KPCA,17 multiblock PCA,18 and MEWMA PCA26 are considered; besides that, a graph theoretic approach, the bond graph,14 is adopted. The experimental results show that the proposed multi-mode KPCA is capable of detecting the incipient and common faults efficiently.
This article is organized as follows: section “PWM inverter of CRH5” gives a brief introduction on the traction system of CRH5, the modeling of a three-phase PWM voltage source inverter, and IGBT short-circuit faults and incipient faults. In section “Fault detection based on multi-mode KPCA,” a traditional algorithm KPCA is introduced, followed by a novel multi-mode KPCA algorithm; on-line fault detection strategy is also given in this section. Experimental results for several faults using six different methods are presented in section “Experimental results and analysis.” Finally, the conclusion is given in section “Conclusion.”
PWM inverter of CRH5
A three-phase PWM voltage source inverter
The traction drive system of CRH5 has eight major parts: vacuum circuit breaker, transformer, pulse rectifier, intermediate links, inverter, motor, gear box, and controller. The input voltage of the transformer is about single phase 25 kV/50 Hz from the high-voltage power supply network. Through the pulse rectifier and intermediate links, a single-phase 3600-V direct current (DC) voltage is obtained and fed to the three-phase PWM voltage source inverter. The output voltage of the inverter is 0–2.3 kV/0–220 Hz for the three-phase squirrel cage induction motor. The motor is adjusted by variable voltage variable frequency (VVVF) direct vector control.
The inverter in traction drive system of CRH5 has four modulation types, namely asynchronous modulation, subsection and synchronization modulation, calculation in advance angle modulation, and square modulation. Modulation mode changes with the stator voltage frequency fs. When 90 Hz ≤ fs < 110 Hz, the inverter works under square modulation. All data sets for fault diagnosis in this article are obtained under this modulation condition. A three-phase PWM voltage source inverter system of CRH5 is mainly composed of a DC supply, a PWM inverter, a filter circuit, and four squirrel cage induction motors.
In Figure 1, S1 to S6 are the IGBT switches, Lf is the filter inductor, and Cf is the filter capacitor. The three-phase alternating current (AC)

Main circuit of three-phase PWM voltage source inverter.
According to Kirchhoff’s laws, the following relationships can be obtained
For the variables in equations (1)–(3), many are highly nonlinear, such as
As shown in Figure 1 and Table 1, 15 variables, 9 current and 6 voltage variables, are selected which are named as
The selected variables in inverter.
IGBT short-circuit fault and incipient fault
In the traction system of CRH5, faults can be classified into three types:4 faults with drive breakdown, faults with emergency operation or performance reduction, and incipient faults with unnoticed effects. IGBT short-circuit faults belong to the second type and may happen due to an intrinsic failure or wrong gate trigger voltages.5 These kinds of faults are difficult to deal with, as the time span between the alarming time

Illustration of an incipient fault evolving.
For IGBT fault signal, its deterioration, as shown in Figure 2, has two stages: incipient fault and short-circuit fault. In traction system, its current curve is under normal condition between 0 and
Incipient faults and short-circuit faults were not considered sufficiently in the design of the traction system; therefore, both kinds of faults deserve attention in the real traction system of CRH5.
Multi-mode feature of PWM inverter
Figure 3 shows the current curve

Line voltage

Six modes based on gate trigger pulse.
In Table 1, the lag degrees of the 15 selected measurement signals in the inverter are presented in the last column. It can be seen that the measurements do not obey the Gaussian distribution. Therefore, the existing multivariate statistical methods, such as the most popular PCA and its numerous variants,17–19 cannot be used directly. A method that can handle the multi-mode feature is therefore desirable for a PWM inverter.
From Figures 3 and 4, it is obvious that all the signals can be split into segments based on six modes in every period, even though they are irregular in every period. Within each mode, the existing algorithm can be used directly to process the collected nonlinear data. The signals obey different distributions in different modes, so a separate statistical model can be developed for each mode for fault detection. Therefore, higher performance can be achieved, especially for detecting the incipient faults.
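As an illustration of this segmentation, the following minimal Python sketch groups samples by mode using the states of the three upper-arm gate signals. It assumes 180-degree (six-step) conduction, in which exactly six gate combinations occur, and it assumes that S1, S3, and S5 are the upper-arm switches; the mode numbering and the names `MODE_OF_GATES` and `split_by_mode` are hypothetical and are not taken from this article.

```python
# Hypothetical mapping from upper-arm gate states (S1, S3, S5) to a mode index.
# Assumes six-step operation, so (0, 0, 0) and (1, 1, 1) never occur.
MODE_OF_GATES = {
    (1, 0, 1): 1,
    (1, 0, 0): 2,
    (1, 1, 0): 3,
    (0, 1, 0): 4,
    (0, 1, 1): 5,
    (0, 0, 1): 6,
}


def split_by_mode(samples, gates):
    """Group samples by mode; gates[i] is the (S1, S3, S5) state at sample i.

    Raises KeyError if a gate state outside the six assumed ones appears.
    """
    groups = {mode: [] for mode in MODE_OF_GATES.values()}
    for sample, gate in zip(samples, gates):
        groups[MODE_OF_GATES[tuple(gate)]].append(sample)
    return groups
```

Each group can then be modeled separately, which is the idea behind building one statistical model per mode.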
Fault detection based on multi-mode KPCA
KPCA
PCA is a popular data-driven technique which can deal with high-dimensional and highly correlated data by projecting the original data onto a new lower dimensional subspace which contains most of the original features.25,27 It has been widely applied in data-driven on-line fault diagnosis.22,24 However, as is well known, PCA can only be used effectively when the variables are linearly correlated and follow a normal distribution.
The KPCA method, an extension of PCA,28 has been proposed to deal with nonlinear data. Through some simple kernel functions, the original data set is mapped into a higher dimensional space with linear characteristics. Its applications can be found in Hu et al.29
Considering that the collected data from a PWM inverter are non-Gaussian and nonlinearly correlated, KPCA is adopted in this article. Assume that all time series of signals in a PWM inverter are arranged in the data matrix
With
where
The eigenvectors
where
where ∥·∥ denotes the 2-norm. In addition, all kernel functions should satisfy Mercer’s condition. Preliminary studies29 pointed out that the key factors in choosing the type of kernel function and the corresponding parameters are the amount of training data and the complexity of the data space. Optimal selection of the kernel function is still a difficult issue.
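To make the role of the kernel concrete, the sketch below builds a kernel matrix in plain Python. It assumes a Gaussian (radial basis) kernel, a common choice that satisfies Mercer's condition; the parameter name `width` and the function names are assumptions for illustration, and the kernel actually used in the experiments may differ.

```python
import math


def gaussian_kernel(x, y, width):
    """k(x, y) = exp(-||x - y||^2 / width); `width` is an assumed tuning parameter."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / width)


def kernel_matrix(samples, width):
    """Return the N x N matrix K with K[i][j] = k(samples[i], samples[j])."""
    return [[gaussian_kernel(xi, xj, width) for xj in samples] for xi in samples]
```

The resulting matrix is symmetric with ones on the diagonal, and a larger `width` makes distant samples look more similar.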
Centralization of the kernel matrix K can be done by
Assume that there are l principal components in the feature space; the matrices
For a new sample
For on-line fault diagnosis based on KPCA, T2-test statistic and Q statistic are usually used. T2-test statistic is also called Hotelling’s T2-test statistic with l and
where
where
Its control limit can be calculated by
where
Except
On-line fault detection of multi-mode KPCA
Preprocessing for data
As described in section “Multi-mode feature of PWM inverter,” the electrical PWM inverter system contains six operation modes according to the gate trigger pulse in every period. The data can therefore be regarded as having two more dimensions than the original data collected from the inverter of CRH5. Signals can be described in four dimensions: samples, amplitudes, periods, and modes. Figure 5 shows a signal

IGBT1 circuit
In every part, the data share the same characteristic or tendency, although they differ from period to period. Therefore, a more effective fault detection method can be achieved if the inverter signals are considered separately for the different operation modes. In the experiment in section “Experimental results and analysis,” there are 400 samples in one period, and the six parts should be split as provided in Table 2, where
Samples in every mode.
Data normalization is needed before multi-mode modeling and fault detection. Its purpose is to simplify the algorithm and eliminate the influence of different scales. Normalization is formulated in Algorithm 1.
Step 1: Define
where
Step 2: With
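A minimal sketch of such a normalization is given below, assuming zero-mean, unit-variance scaling of each variable with statistics estimated from normal operating data of one mode. The use of the population standard deviation and the function names `fit_scaler` and `normalize` are assumptions for illustration only.

```python
import statistics


def fit_scaler(rows):
    """Return per-variable means and standard deviations from training rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(column) for column in columns]
    # Population standard deviation; constant columns fall back to 1.0
    # so that the later division is always defined.
    stds = [statistics.pstdev(column) or 1.0 for column in columns]
    return means, stds


def normalize(row, means, stds):
    """Scale one sample with the stored training statistics."""
    return [(value - mean) / std for value, mean, std in zip(row, means, stds)]
```

On-line samples would be scaled with the stored statistics of their mode rather than with statistics recomputed from the new data.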
Off-line modeling for fault detection
Step 1: Collect normal operating data and assign them to corresponding mode based on the sample number. The obtained data can be denoted as
Step 2: Initialize j = 1. Data normalization for Calculate the kernel matrix
where Centralize
where matrix Compute
Compute the component through
Compute Save If
Step 3: Establish the normal operating condition models.
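The centering operation used in the off-line steps can be sketched in plain Python as follows. It uses the standard identity for centering a kernel matrix in feature space (subtract the row mean and the column mean, then add the grand mean); the function name `center_kernel` is hypothetical, and the sketch covers only this one step, applied to the kernel matrix of a single mode.

```python
def center_kernel(K):
    """Center an N x N kernel matrix in feature space.

    Element-wise form of K - 1K - K1 + 1K1, where 1 is the N x N matrix
    whose entries all equal 1/N.
    """
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - col_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]
```

After centering, every row and every column of the matrix sums to zero, which is a quick sanity check before the eigenvalue decomposition.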
On-line computation for fault detection
Step 1: Obtain a new data
Step 2: Determine the current operation mode by Table 2 according to the sample number. Call the model parameters of the corresponding mode. Normalize Calculate kernel matrix
where Centralize
where matrix Obtain
Compute the monitoring statistics
where m and l are the numbers of variables and principal components, respectively. Make a decision for fault detection according to the following criterion
Back to Step 1.
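The mode lookup and the final decision in the on-line loop can be sketched as follows. The segment boundaries in `MODE_ENDS` are placeholders chosen only to divide a 400-sample period into six parts and should be replaced by the split given in Table 2. The decision rule, an alarm when either statistic exceeds its control limit, is the usual convention and is assumed here.

```python
from bisect import bisect_right

# Placeholder exclusive upper bounds of the six modes within one period;
# substitute the actual split from Table 2.
MODE_ENDS = [67, 133, 200, 267, 333, 400]


def mode_of(sample_number, period=400):
    """Return the mode index (1..6) for a 0-based running sample number."""
    return bisect_right(MODE_ENDS, sample_number % period) + 1


def is_faulty(t2, spe, t2_limit, spe_limit):
    """Raise an alarm when either monitoring statistic exceeds its limit."""
    return t2 > t2_limit or spe > spe_limit
```

In use, `mode_of` selects which stored model and which pair of control limits apply to the current sample, and `is_faulty` is then evaluated with the statistics computed from that model.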
Experimental results and analysis
To verify the effectiveness of the proposed multi-mode KPCA-based incipient fault detection algorithm for PWM inverter, an IGBT short-circuit fault and an IGBT incipient fault are injected into a CRH traction system simulation platform;30,31 the experimental setup is shown in Figure 6.

The experimental setup used for data generation.
This platform allows different types of faults in the PWM inverter to be set and generated easily from the host computer. The IGBT short-circuit fault and incipient fault are injected by changing the turn-on resistance
where
Some popular data-driven FDD methods, such as PCA, 25 KPCA, 17 multiblock PCA, 18 and MEWMA PCA, 26 are simulated and compared with the proposed method. Besides that, a bond graph–based method proposed in our research group is also considered. In the following parts, we concentrate on incipient fault and short-circuit fault to illustrate the superiority of the proposed algorithm over the other five approaches.
Fault detection
For simplicity and without loss of generality, IGBT1 is chosen as the degrading IGBT in this article. Its historical data are generated in the three-phase PWM inverter of CRH5 under normal condition. Before the on-line test, the collected historical data are input into the off-line modeling. The on-line signals can be analyzed in real time. To obtain the on-line data, an incipient fault is first injected by changing the turn-on and turn-off resistance of IGBT1 between samples 400 and 1000. After sample 1000, the incipient fault develops into a short-circuit fault.
As shown in Figure 7, the 15 selected signals in the inverter, as listed in Table 1, are described in three-dimensional space. There are five periods in the on-line test data set, and every period has 400 samples. The data under normal and incipient fault conditions show no significant changes before the 1000th step. But after the 1000th step, the IGBT switches

The on-line data set in three-dimensional space.
Figure 8 shows the results using a PCA-based fault detection method. It is clear that the short-circuit fault can be detected, but the

Fault detection using
In the implementation of KPCA, we choose

Fault detection using
By regarding a period as a batch or a block in this application, multiblock PCA-based method is conducted, which shows similar results as shown in Figure 10. The reason for this phenomenon is that the score and loading matrices of both methods in principal and residual space are the same when the samples’ number

Fault detection using
As pointed out in Zhang et al.,26 MEWMA and PCA can be combined for fault detection to enhance the detectability of incipient faults. When this MEWMA PCA-based method is implemented in this article, a scalar constant

Fault detection using
The results using bond graph–based FDD are shown in Figure 12, where the residuals for FDD are obtained from the global analytical redundancy relations of the three-phase PWM inverter. It can be observed that the incipient fault can be detected to some degree after sample 400. However, the false alarm rate is 0.225, which is unacceptable for practical application.

Fault detection using
Finally, the results using the proposed multi-mode KPCA-based method are shown in Figure 13. The SPE statistic values exceed the control limits between samples 400 and 1000; for the incipient fault, the missing alarm rate is 0.1083 and the false alarm rate is 0.0075. The proposed method can successfully detect both the incipient fault and the short-circuit fault. The missing alarm rates and false alarm rates are summarized in Tables 3 and 4. The proposed method clearly outperforms the traditional algorithms.
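For reference, the two rates can be computed from a sequence of per-sample alarm flags as in the sketch below. It assumes the usual definitions: the false alarm rate is the fraction of fault-free samples that raise an alarm, and the missing alarm rate is the fraction of faulty samples that raise none. The function name and the half-open fault interval are assumptions for illustration.

```python
def alarm_rates(alarms, fault_start, fault_end):
    """Return (missing_alarm_rate, false_alarm_rate).

    alarms[i] is True when sample i raised an alarm; the fault is assumed
    to be present for samples fault_start <= i < fault_end, and the samples
    before fault_start are assumed fault-free.
    """
    normal = alarms[:fault_start]
    faulty = alarms[fault_start:fault_end]
    false_alarm_rate = sum(1 for a in normal if a) / len(normal)
    missing_alarm_rate = sum(1 for a in faulty if not a) / len(faulty)
    return missing_alarm_rate, false_alarm_rate
```

As a purely hypothetical example, 3 false alarms among 400 fault-free samples and 65 undetected samples among 600 faulty ones would give rates of 0.0075 and about 0.1083.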

Fault detection using
Missing alarm rates for the six methods.
PCA: principal component analysis; KPCA: kernel principal component analysis; MEWMA: multivariate exponentially weighted moving average.
False alarm rates for the six methods.
PCA: principal component analysis; KPCA: kernel principal component analysis; MEWMA: multivariate exponentially weighted moving average.
Discussions
The traditional PCA-based FDD methods have poor missing and false alarm rates for detection of the incipient fault. The key reason is that those methods do not consider the characteristics of electrical systems. In equations (28) and (29), it should be pointed out that
Besides that, the short-circuit fault of the IGBT appears from the 1000th step, and all six methods can detect this fault quickly. But if this fault happened between the 800th and 1000th steps, PCA, KPCA, multiblock PCA, and MEWMA PCA would lose this rapidity. The reason for this phenomenon is that IGBT1 and IGBT4 should work in turn-on and turn-off conditions, respectively, between the 800th and 1000th steps. In other words, although the IGBT switches
It should be pointed out that
Therefore, based on Algorithm 2 and Algorithm 3, the incipient fault can be detected correctly. Moreover, the experimental results confirm that applying the other five methods to incipient fault detection leads to erroneous detection. If an algorithm can detect incipient faults, more pronounced faults can clearly be detected as well. In real traction systems of CRH5, the proposed algorithm can perform satisfactorily when some monitored variables show a tiny irregular deviation.
Conclusion
Real-time incipient fault detection for the PWM inverter of CRH5 is investigated in this article. An improved approach based on KPCA is proposed. In the proposed method, the original data are given two more dimensions, namely period and mode. In this way, the method can capture the characteristics of the electrical system, so that the proposed algorithm detects both incipient faults and common faults of IGBTs quickly and correctly. Many parameters of the proposed algorithm change when the mode switches, so the fault detection is updated adaptively. The proposed method is effective in detecting faults for nonlinear electrical systems such as the three-phase PWM voltage source inverter system of CRH5. Moreover, the method can be applied in the traction system of high-speed railway trains without additional hardware or controller design. This study is carried out under a multivariate statistical analysis framework. Based on the multi-mode idea, the proposed data-driven method can be extended, from both theoretical and practical viewpoints, to handle other fault types and other electrical systems.
Footnotes
Handling Editor: Chenguang Yang
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by National Natural Science Foundation of China (grant no. 61490703) and Funding of Jiangsu Innovation Program for Graduate Education (grant no. KYLX16_0378).
