Abstract

The recognition of the partial discharge mode is an important indicator of the insulation condition in transformers and provides a basis for arranging maintenance. Discharge feature extraction is the key to recognizing the discharge mode. To address the poor stability and low recognition rate of partial discharge mode recognition, this paper proposes a feature extraction method based on the synchrosqueezed windowed Fourier transform and multi-scale dispersion entropy. First, four types of partial discharge signals collected under laboratory conditions are decomposed by the synchrosqueezed windowed Fourier transform into a number of band-limited intrinsic mode type functions, and the original feature quantities of the partial discharge signals are obtained by calculating the multi-scale dispersion entropy of each intrinsic mode type function. The original feature quantities are then optimized using the maximum relevance and minimum redundancy criterion. Finally, classification is performed by a support vector machine. Experimental results show that, under noise interference, the proposed synchrosqueezed windowed Fourier transform–multi-scale dispersion entropy method can still accurately describe the features of different discharge signals and achieves a higher recognition rate than both the empirical mode decomposition–multi-scale dispersion entropy method and the direct multi-scale dispersion entropy method.

Keywords

Partial discharge, feature extraction, synchrosqueezed windowed Fourier transform, dispersion entropy

Introduction

The operation status of power transformers is directly related to the safe operation of the entire power system, and partial discharge (PD) is an important symptom and manifestation of transformer insulation degradation.¹ The feature extraction of the PD signal is a key step of transformer fault diagnosis.^1,2 Since the electromagnetic environment in the substation field has a great influence on the recognition accuracy, research has focused on finding more effective feature extraction methods for the PD signal that eliminate noise interference while fully reflecting the essential information of the signal. The traditional phase statistics methods based on atlas analysis are widely used, but they do not fully utilize the time-domain information.^3–6 The joint time–frequency analysis methods of the PD signal compensate for the deficiency of the atlas method to a certain extent but are still at the research stage.^7,8

At present, the commonly used methods for feature extraction of PD signals include the statistical feature extraction method,^3,7 fractal feature extraction method,⁸ waveform feature extraction method,⁹ wavelet feature method,¹⁰ empirical mode decomposition (EMD) method,^11,12 and so forth. Although these methods can extract the characteristics of the PD signal to some extent, each has shortcomings. The statistical feature extraction method needs a large number of samples, and the dimension of the extracted features is high, which leads to information redundancy and makes subsequent recognition difficult. The fractal dimension calculated by the fractal feature extraction method is affected by the signal length and the number of effective discharge signal points, and the extracted feature dimension is relatively high. The waveform feature extraction method requires highly accurate signal acquisition, and it is difficult to accurately extract characteristics that represent the non-stationary dynamic PD signal. With the wavelet feature method, it is very hard to choose a proper wavelet basis and number of decomposition layers. EMD can decompose the PD signal adaptively, but the decomposition results are heavily affected by noise and suffer from serious mode aliasing. The synchrosqueezed windowed Fourier transform (SSWFT) is a new non-recursive time–frequency analysis method.¹³ On the basis of the windowed Fourier transform, SSWFT minimizes the sum of the bandwidths of the modes by ridge extraction and synchrosqueezing and achieves adaptive signal decomposition. SSWFT solves the mode aliasing of EMD, so components with similar frequencies can be separated correctly. SSWFT uses the optimal filter to process the coefficients during synchrosqueezing, so it has better noise robustness.¹⁴

Entropy is a measure of the complexity of a time series and has been widely used in feature extraction and fault diagnosis.¹⁵ In the literature,^16,17 multi-scale sample entropy (MSE) is applied to the fault diagnosis of rolling bearings. Studies in the literature^18,19 have applied multi-scale permutation entropy (MPE) to feature extraction of electroencephalogram signals and system mutation detection and achieved good results. Nevertheless, these methods have their own shortcomings. For example, MSE is slow in processing long data, poor in real-time performance, and prone to abrupt changes in its similarity measure. Although the calculation speed of MPE is high, it does not consider the difference between the average amplitude and the individual amplitude values.²⁰

In order to overcome the inherent defects of the above two methods, in 2017, Azami et al.²¹ proposed a new irregularity index, multi-scale dispersion entropy (MDE). MDE is fast to calculate and is less affected by abrupt changes in the signal. It takes into account the relationship between the amplitudes and overcomes the shortcomings of MSE and MPE to a certain extent. Compared with MSE and MPE, MDE has been shown to have better stability, higher calculation speed, and advantages in error and feature extraction.²¹

In view of the advantages of MDE in extracting features of non-linear dynamic complexity, this manuscript introduces MDE into the field of PD feature extraction. However, due to the large randomness of PD and the noise in the discharge signal, MDE alone is not enough to represent the multi-scale complex characteristics of the PD signal,²² which affects the accuracy of the characteristic quantity. SSWFT can decompose a complex multi-component signal into a series of intrinsic mode type functions (IMTFs), that is, it can realize the multi-scale decomposition of the PD signal adaptively. Based on this, by combining SSWFT with MDE, a new feature extraction and quantitative description method for the PD signal (SSWFT–MDE) is proposed.

First, the PD signal is decomposed into a series of IMTFs by SSWFT. Then, the MDE values of each IMTF are calculated, and the obtained MDE values are taken as the characteristic vectors of PD signals. Finally, a classifier based on support vector machine (SVM) is established to recognize the characteristic vectors of the PD signal, realizing the intelligent diagnosis of PD types. The proposed method is applied to the pattern recognition of PD signals in substations and compared with the MDE method and EMD–MDE method. The results show that the proposed method can recognize the types of PD faults more accurately than EMD–MDE and MDE.

Synchrosqueezed windowed Fourier transform

The PD signal $f (t)$ often contains multiple components, and each component has its own vibration, and $f (t)$ can be written as

f (t) = \sum_{k = 1}^{K} f_{k} (t) + r (t) = \sum_{k = 1}^{K} A_{k} (t) \cos (ϕ_{k} (t)) + r (t)

(1)

where $A_{k} (t) = A_{k} e^{- λ_{k} t}$ is the instantaneous amplitude of the kth component, and $ϕ_{k} (t) = ω_{k} t + ψ_{k}$ is the instantaneous phase of the kth component. $r (t)$ is the noise, and K is the number of components of the signal. The windowed Fourier transform of $f (t)$ with a window function $g (t)$ is defined by

V_{f}^{g} (t, η) = \int_{R} f (τ) g^{*} (t - τ) e^{- 2 i π η (t - τ)} d τ

(2)

Existing studies^13,14 have shown that in the time–frequency diagram of the windowed Fourier transform (WFT), the spectral distribution of the signal is wide and the boundary is blurred. For more complex multi-component signals, there is often serious spectrum aliasing between the WFT spectra of the components.
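As an illustration of equations (1) and (2), the following minimal sketch synthesizes a damped multi-component signal and evaluates the windowed Fourier transform at one time–frequency point by a Riemann sum. The function names, the test frequencies, and the window length are hypothetical choices made for this example, and $ω_{k} = 2 π f_{k}$ with $f_{k}$ in Hz is an assumption about units; the Blackman window is the one used later in the experiments.

```python
import cmath
import math
import random


def pd_like_signal(n, fs, components, noise_std=0.0, seed=0):
    """Eq. (1): sum of A_k * exp(-lambda_k t) * cos(omega_k t + psi_k) plus noise r(t).

    components: list of (A_k, lambda_k, f_k in Hz, psi_k); omega_k = 2*pi*f_k (assumed).
    """
    rng = random.Random(seed)
    samples = []
    for i in range(n):
        t = i / fs
        clean = sum(a * math.exp(-lam * t) * math.cos(2 * math.pi * f_k * t + psi)
                    for a, lam, f_k, psi in components)
        samples.append(clean + rng.gauss(0.0, noise_std))
    return samples


def blackman(length):
    """Symmetric Blackman window with `length` samples."""
    return [0.42 - 0.5 * math.cos(2 * math.pi * j / (length - 1))
            + 0.08 * math.cos(4 * math.pi * j / (length - 1))
            for j in range(length)]


def wft_at(signal, fs, window, n, eta):
    """Riemann-sum approximation of Eq. (2) at sample index n and frequency eta (Hz)."""
    half = (len(window) - 1) // 2
    acc = 0j
    for j, g in enumerate(window):       # g is real, so g* = g
        lag = j - half                   # (t - tau) expressed in samples
        m = n - lag                      # sample index of tau
        if 0 <= m < len(signal):
            acc += signal[m] * g * cmath.exp(-2j * math.pi * eta * lag / fs)
    return acc / fs                      # d(tau) = 1 / fs
```

Scanning `eta` over a frequency grid at a fixed `n` gives one column of the time–frequency diagram; the width of the resulting peak around each component is the spectral spreading described above.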

SSWFT is a new time–frequency analysis method based on the windowed Fourier transform.¹³ By refining the time–frequency curve of WFT, SSWFT can effectively calculate the amplitude $A_{k} (t)$ and instantaneous frequency $ϕ'_{k} (t)$ $(k = 1, 2, \dots, K)$ of each component. As a special reassignment method, SSWFT re-allocates the coefficient $V_{f}^{g} (t, η)$ of WFT to different points $[t, {\hat{ω}}_{f} (t, η)]$ ( ${\hat{ω}}_{f} (t, η)$ represents the instantaneous frequency of the signal at $(t, η)$ ) in the time–frequency plane according to the local property of the coefficient $V_{f}^{g} (t, η)$ near $(t, η)$ . This makes the time–frequency curve finer and clearer, improves the frequency resolution, and reduces the mode aliasing. Thus, the reconstruction precision of the modal components is higher.

The instantaneous frequency at any point $(t, η)$ of the signal can be calculated by differentiating the WFT coefficient¹³

{\hat{ω}}_{f} (t, η) = \frac{1}{2 π} \partial_{t} \arg \{V_{f}^{g} (t, η)\} = R \left\{\frac{\partial_{t} V_{f}^{g} (t, η)}{2 i π V_{f}^{g} (t, η)}\right\}

(3)

where $\arg \{Z\}$ and $R \{Z\}$ represent the argument and the real part of the complex number Z, respectively, and $\partial_{t}$ represents the partial derivative with respect to t. SSWFT establishes a mapping $(t, η) \to [t, {\hat{ω}}_{f} (t, η)]$ based on the instantaneous frequency, which transforms the WFT coefficient $V_{f}^{g} (t, η)$ from the " $t - η$ " plane to the " $t - {\hat{ω}}_{f} (t, η)$ " plane. In SSWFT, the coefficients of WFT in the interval $[{\hat{ω}}_{f} - \frac{1}{2} Δ ω, {\hat{ω}}_{f} + \frac{1}{2} Δ ω]$ are squeezed onto the center frequency ${\hat{ω}}_{f}$ to obtain the synchrosqueezing value $T_{f}^{g, γ} (t, ω)$ , so as to improve the frequency resolution and reduce the spectrum aliasing.

Let $ε > 0$ and $Δ \in (0, 1)$ . The set $B_{Δ, ε}$ of multicomponent signals with modulation $ε$ and separation $Δ$ is the set of all signals $f (t) = \sum_{k = 1}^{K} f_{k} (t)$ where

$f_{k} (t) = A_{k} (t) e^{i 2 π ϕ_{k} (t)}$ satisfies $A_{k} \in C^{1} (R) \cap L^{\infty} (R)$ , $ϕ_{k} \in C^{2} (R)$ , $\sup_{t} ϕ'_{k} (t) < \infty$ , and for all t, $A_{k} (t) > 0$ , $ϕ'_{k} (t) > 0$ , $| A'_{k} (t) | \leq ε$ , and $| ϕ''_{k} (t) | \leq ε$ ;

the $f_{k}$ are separated with resolution $Δ$ , that is, for all $k \in \{1, \dots, K - 1\}$ and all t, $ϕ'_{k + 1} (t) - ϕ'_{k} (t) > 2 Δ$ .

For multi-component signals $f (t)$ satisfying these conditions, the SSWFT coefficient $T_{f}^{g, γ} (t, ω)$ is calculated as

T_{f}^{g, γ} (t, ω) = \frac{1}{g^{*} (0)} \int_{\{η : | V_{f}^{g} (t, η) | > γ\}} V_{f}^{g} (t, η) δ [ω - {\hat{ω}}_{f} (t, η)] d η

(4)

where the threshold is $γ = \sqrt{2 \log N} \cdot \frac{median \{W_{f}^{(1)} (t)\}}{0.6745}$ ,¹⁴ $median \{\cdot\}$ denotes the median, N is the length of the signal, and $W_{f}^{(1)} (t)$ denotes the first-level wavelet coefficients of $f (t)$ . For multi-component signals, each component $f_{k} (t)$ can be accurately reconstructed from $T_{f}^{g, γ} (t, ω)$ . Assuming that $(t, ϕ'_{k} (t))$ is the k-th ridge of $f (t)$ , the reconstruction formula of $f_{k} (t)$ is¹⁴

f_{k} (t) \approx \mathrm{imf}_{k} (t) = \int_{\{ω : | ω - φ_{k} (t) | < ξ\}} T_{f}^{g, γ} (t, ω) d ω

(5)

where $φ_{k} (t)$ is an approximate estimate of $ϕ'_{k} (t)$ that is usually calculated by a ridge extraction method,¹³ and $ξ$ is the given error threshold of the ridge; in this paper, the greedy algorithm provided in reference²³ is used to extract the ridges. By using SSWFT, the time–frequency domain of the PD signal can be divided more accurately, and a set of intrinsic mode type functions (IMTFs) $\{\mathrm{imf}_{k} (t), k = 1, 2, \dots, K\}$ can be obtained, which are the components of the PD signal. The IMTFs $\{\mathrm{imf}_{k} (t)\}_{1 \leq k \leq K}$ of the PD signal f(t) are taken as the input signals when calculating the MDE.
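The chain from equation (3) to equation (5) can be sketched on a discrete grid as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the WFT coefficients are passed in as a precomputed grid `V[n][k]`, the time derivative in equation (3) is replaced by a finite difference of the phase, the delta function in equation (4) is replaced by nearest-bin assignment, the first-level wavelet coefficients are taken from a Haar wavelet with absolute values inside the median, and the ridge is supplied by the caller instead of being extracted by the greedy algorithm. All function names are hypothetical.

```python
import cmath
import math
import statistics


def noise_threshold(signal):
    """gamma = sqrt(2 ln N) * median(|W|) / 0.6745, with W the first-level
    Haar detail coefficients (wavelet family and |.| are assumptions)."""
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal) - 1, 2)]
    mad = statistics.median(abs(w) for w in detail)
    return math.sqrt(2 * math.log(len(signal))) * mad / 0.6745


def instantaneous_frequency(V, dt):
    """Eq. (3) by finite differences: phase increment of V along t over 2*pi*dt.

    Only valid while |omega_hat| * 2 * dt < 0.5, otherwise the phase wraps.
    """
    n_t, n_f = len(V), len(V[0])
    omega = [[0.0] * n_f for _ in range(n_t)]
    for n in range(n_t):
        a, b = max(n - 1, 0), min(n + 1, n_t - 1)
        for k in range(n_f):
            if V[a][k] != 0 and V[b][k] != 0:
                dphi = cmath.phase(V[b][k] * V[a][k].conjugate())
                omega[n][k] = dphi / (2 * math.pi * (b - a) * dt)
    return omega


def synchrosqueeze(V, freqs, dt, gamma, g0=1.0):
    """Eq. (4): move every coefficient with |V| > gamma to the bin nearest omega_hat.

    T[n][j] stores the integral of T(t, omega) over frequency bin j.
    """
    omega = instantaneous_frequency(V, dt)
    d_eta = freqs[1] - freqs[0]          # uniform frequency grid assumed
    T = [[0j] * len(freqs) for _ in V]
    for n, row in enumerate(V):
        for k, coeff in enumerate(row):
            if abs(coeff) > gamma:
                j = round((omega[n][k] - freqs[0]) / d_eta)
                if 0 <= j < len(freqs):
                    T[n][j] += coeff * d_eta / g0
    return T


def reconstruct_mode(T, freqs, ridge, xi):
    """Eq. (5): sum the squeezed coefficients within xi of the ridge at each time."""
    return [sum((T[n][j] for j, f in enumerate(freqs) if abs(f - ridge[n]) < xi), 0j)
            for n in range(len(T))]
```

For a single stationary component, all coefficients above the threshold are reassigned to the bin of the true frequency, which is the sharpening effect that distinguishes the squeezed representation from the plain WFT.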

MDE theory

Dispersion entropy (DE) is an algorithm for measuring the complexity or irregularity of a time series. For a given time series $x = \{x_{j}, j = 1, 2, \dots, N\}$ of length N, the calculation steps of DE are as follows:^18,19

The time series x is mapped to $y = \{y_{j}, j = 1, 2, \dots, N\}$ by using the normal cumulative distribution function

y_{j} = \frac{1}{σ \sqrt{2 π}} \int_{- \infty}^{x_{j}} e^{- \frac{{(t - u)}^{2}}{2 σ^{2}}} dt

(6)

where u and $σ^{2}$ represent the mean and the variance of the time series x, respectively.

y is mapped to the integer range $[1, 2, \dots, c]$ by using a linear transformation

z_{j}^{c} = R (c \cdot y_{j} + 0.5)

(7)

where R is the rounding function, and c is the number of categories. In fact, steps 1 and 2 map each element in the time series x to $[1, 2, \dots, c]$ .

The embedding vector $z_{i}^{m, c}$ is constructed from the classes $z_{j}^{c}$ given by equation (7) as

z_{i}^{m, c} = \{z_{i}^{c}, z_{i + d}^{c}, \dots, z_{i + (m - 1) d}^{c}\}, \quad i = 1, 2, \dots, N - (m - 1) d

where m and d are the embedding dimension and the time delay, respectively.

Each embedding vector is mapped to a dispersion pattern $π_{v_{0}, v_{1}, \dots, v_{m - 1}}$ $(v = 1, 2, \dots, c)$ .²¹ If $z_{i}^{c} = v_{0}, z_{i + d}^{c} = v_{1}, \dots, z_{i + (m - 1) d}^{c} = v_{m - 1},$ then the corresponding dispersion pattern of $z_{i}^{m, c}$ is $π_{v_{0}, v_{1}, \dots, v_{m - 1}}$ . Since $π_{v_{0}, v_{1}, \dots, v_{m - 1}}$ is composed of m digits and each digit has c possible values, there are a total of $c^{m}$ possible dispersion patterns.

The probability $p (π_{v_{0}, v_{1}, \dots, v_{m - 1}})$ of each dispersion pattern $π_{v_{0}, v_{1}, \dots, v_{m - 1}}$ is calculated by

p (π_{v_{0}, v_{1}, \dots, v_{m - 1}}) = \frac{Number (π_{v_{0}, v_{1}, \dots, v_{m - 1}})}{N - (m - 1) d}

(8)

where $Number (π_{v_{0}, v_{1}, \dots, v_{m - 1}})$ is the number of embedding vectors $z_{i}^{m, c}$ mapped to $π_{v_{0}, v_{1}, \dots, v_{m - 1}}$ , that is, $p (π_{v_{0}, v_{1}, \dots, v_{m - 1}})$ is equal to the number of embedding vectors mapped to $π_{v_{0}, v_{1}, \dots, v_{m - 1}}$ divided by the total number of embedding vectors, $N - (m - 1) d$ .

According to the definition of Shannon entropy, the DE of the original signal x is defined as

DE (x, m, c, d) = - \sum_{π = 1}^{c^{m}} p (π_{v_{0}, v_{1}, \dots, v_{m - 1}}) \ln [p (π_{v_{0}, v_{1}, \dots, v_{m - 1}})]

(9)

Similar to sample entropy and permutation entropy, DE is also a way to characterize the irregularity of a time series. The larger the DE value, the higher the degree of irregularity; the smaller the DE value, the lower the degree of irregularity. It can be seen from the DE algorithm that when all the dispersion patterns have the same probability, as for a noise signal, DE takes its maximum value $\ln (c^{m})$ . Conversely, when only one $p (π_{v_{0}, v_{1}, \dots, v_{m - 1}})$ value is not equal to zero, the time series is completely regular or predictable, and DE takes its minimum value of zero.
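The DE steps above, equations (6) to (9), can be written compactly as the following sketch. The function name is hypothetical, the rounding function R is taken to be round-half-up with the result clipped to $[1, c]$, and a constant series is mapped to $y_{j} = 0.5$ to avoid division by zero; these details are assumptions of the example rather than part of the definition.

```python
import math
from collections import Counter


def dispersion_entropy(x, m=3, c=6, d=1):
    """Dispersion entropy of the sequence x (Eqs. (6)-(9))."""
    n = len(x)
    total = n - (m - 1) * d                      # number of embedding vectors
    if total <= 0:
        raise ValueError("series too short for the chosen m and d")
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    # Eq. (6): normal CDF using the mean and standard deviation of x.
    if std == 0:
        y = [0.5] * n                            # assumption for constant input
    else:
        y = [0.5 * (1 + math.erf((v - mean) / (std * math.sqrt(2)))) for v in x]
    # Eq. (7): round c*y + 0.5 to the nearest integer (half-up), clipped to 1..c.
    z = [min(c, max(1, math.floor(c * yj + 0.5 + 0.5))) for yj in y]
    # Embedding vectors and their dispersion patterns.
    patterns = Counter(tuple(z[i + k * d] for k in range(m)) for i in range(total))
    # Eq. (8) gives the pattern probabilities, Eq. (9) their Shannon entropy.
    return -sum((cnt / total) * math.log(cnt / total) for cnt in patterns.values())
```

With the defaults `m=3` and `c=6` the value is bounded above by $\ln (6^{3})$ , and a constant input yields zero, matching the two limiting cases described in the text.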

MDE is defined as the dispersion entropy of a sequence after multi-scale coarse graining. Let the original sequence be $x = \{x_{i}, i = 1, 2, \dots, N\}$ , and let $u_{j}^{(τ)}$ be the coarse-grained sequence

u_{j}^{(τ)} = \frac{1}{τ} \sum_{i = (j - 1) τ + 1}^{j τ} x_{i}, j = 1, 2, \dots, [N / τ], 1 \leq τ \leq τ_{max}

(10)

where $τ$ is the scale factor, $τ_{max}$ is the largest scale factor, and the degree of coarse graining of the sequence is determined by the scale factor. When $τ$ = 1, the coarse-grained sequence degenerates into the original sequence. By calculating the dispersion entropy of each $u^{(τ)}$ ( $1 \leq τ \leq τ_{max}$ ), we obtain the multi-scale dispersion entropy of the original sequence x.
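The coarse-graining step of equation (10) and the loop over scales can be sketched as below. The names are hypothetical, and `entropy` stands for any callable that maps a sequence to a number, for example a dispersion entropy routine; it is passed in as a parameter so that the sketch stays independent of a particular entropy implementation.

```python
def coarse_grain(x, tau):
    """Eq. (10): average consecutive, non-overlapping blocks of length tau."""
    n_out = len(x) // tau                # floor(N / tau) coarse-grained points
    return [sum(x[j * tau:(j + 1) * tau]) / tau for j in range(n_out)]


def multiscale_entropy(x, entropy, tau_max=14):
    """Apply `entropy` to the coarse-grained sequence at each scale 1..tau_max."""
    return [entropy(coarse_grain(x, tau)) for tau in range(1, tau_max + 1)]
```

Because the coarse-grained sequence at scale $τ$ has only $[N / τ]$ points, the signal must be long enough that the largest scale still leaves enough embedding vectors for a meaningful pattern histogram.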

Feature extraction process based on SSWFT–MDE

The ability of MDE to characterize signal complexity depends on the selection of the embedding dimension m, the number of categories c, the time delay d, and the maximum scale factor $τ_{max}$ .^20,21 Azami et al.²¹ suggest that the embedding dimension m and the number of categories c should be neither too small nor too large: m usually takes 2 or 3, c takes an integer in [4, 8], and the delay d generally takes 1. Therefore, we take m = 3, c = 6, and d = 1. If the maximum scale factor $τ_{max}$ is too small, MDE cannot fully reflect the complexity information of the sequence at each scale. If $τ_{max}$ is too large, it increases the amount of calculation and causes information redundancy. Therefore, referring to Azami et al.,²¹ we set $τ_{max} = 14$ . The process of PD feature extraction based on SSWFT–MDE is shown in Figure 1. In the SSWFT decomposition, the number of decomposition layers is set to 6.

Figure 1.

Flow-chart of synchrosqueezed windowed Fourier transform–multi-scale dispersion entropy features extraction of partial discharge.

In this paper, the maximum scale factor of MDE is set to 14. If the MDE values of all six modes were directly combined into one feature vector, its dimension would be as high as 84 (6 modes × 14 scales), which would cause a “curse of dimensionality” for subsequent pattern recognition. In order to reduce the redundant information in the feature vector and improve the computational efficiency, the optimal features are selected according to the criterion of maximum relevance and minimum redundancy (MRMR).²⁴

For a given two random variables x and y, supposing that $p (x)$ and $p (y)$ represent the distribution probability density of random variables x and y, respectively, $p (x, y)$ is the joint probability density of two random variables, and $I (x, y)$ is denoted as mutual information between x and y, the calculation formula of $I (x, y)$ is as follows

I (x, y) = \int \int p (x, y) \log \frac{p (x, y)}{p (x) p (y)} dxdy

(11)

In order to analyze the relevance and redundancy of features, D and R are respectively used to represent the correlation and redundancy indexes of the feature subset. Supposing C is the target category, $| S |$ is the number of feature vectors in feature set S, $I (s_{i}; C)$ is the mutual information between the feature vector $s_{i}$ and the target category C, and $I (s_{i}; s_{j})$ is the mutual information between the feature vector $s_{i}$ and the feature vector $s_{j}$ , then according to the principle of MRMR, the maximum relevance can be given as

max D (S, C) = \frac{1}{| S |} \sum_{s_{i} \in S} I (s_{i}; C)

(12)

The minimum redundancy can be given as

min R (S) = \frac{1}{| S |^{2}} \sum_{s_{i}, s_{j} \in S} I (s_{i}; s_{j})

(13)

Combining equations (12) and (13), the MRMR criterion is

max Φ (D, R) = D (S, C) - R (S)

(14)

Using the MRMR criterion, three of the 14 scale factors are selected; with six modes, this gives an 18-dimensional feature subset.
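A minimal sketch of how equations (11) to (14) can be applied in practice is given below. It makes two assumptions that go beyond the text: the features are already discretized, so the integrals in equation (11) become sums over observed value pairs, and the subset is built by the common incremental (greedy) search, adding at each step the candidate whose relevance minus mean redundancy with the already selected features is largest. The function names and the dictionary layout are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Discrete form of Eq. (11), estimated from paired samples (natural log)."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((nxy / n) * math.log(nxy * n / (px[a] * py[b]))
               for (a, b), nxy in pxy.items())


def mrmr_select(features, labels, k):
    """Greedy search for Eq. (14). features: dict name -> list of discrete values."""
    k = min(k, len(features))
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected = []
    while len(selected) < k:
        best_name, best_score = None, -math.inf
        for name, col in features.items():
            if name in selected:
                continue
            # Mean mutual information with the features chosen so far.
            redundancy = (sum(mutual_information(col, features[s]) for s in selected)
                          / len(selected)) if selected else 0.0
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected
```

In the toy case where two features are identical, the second copy is penalized by its full redundancy and a weaker but complementary feature is chosen instead, which is the behaviour the criterion is designed to produce.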

PD pattern recognition based on SSWFT–MDE

Experimental model

According to the form and characteristics of the PD of the transformer, four discharge models are constructed in the laboratory. In order to make the collected discharge signals more representative, discharge models of various sizes and parameters are designed, including floating discharge (FD), needle-plate discharge (ND), surface discharge (SD), and corona discharge (CD).

The PD models are shown in Figure 2. The diameter and thickness of all circular plate electrodes are 80 mm and 10 mm, respectively, and the thickness of all cardboards is 0.5 mm. Figure 2(a) shows the electrode structure for simulating the FD in oil, and a metal particle with a diameter of 0.3 mm is placed at the edge of the epoxy plate. Figure 2(b) shows a needle plate structure for simulating the CD in the oil. The diameter of the needle neck is 0.2 mm, the thickness of the epoxy plate between the needle and the plate electrode is 0.5 mm, and the diameter is 1 mm. Figure 2(c) shows the simulation of the discharge along the surface of oil. Figure 2(d) simulates the model structure of CD in insulation. The corona consists of three layers of epoxy plates with a diameter of 60 mm and a thickness of 0.5 mm, and the diameter of the center circular hole is 20 mm.

Figure 2.

Partial discharge (PD) models: (a) floating discharge (FD); (b) needle-plate discharge (ND); (c) surface discharge (SD); and (d) corona discharge (CD).

The experimental setup is shown in Figure 3. All the models are placed in a tank filled with transformer oil, and the PD signal is detected in this simulated transformer tank in the laboratory. The test standard is IEC 60270-2000, and the test circuit is a parallel test circuit based on the pulse current method. The discharge signal was collected using a TWPD-2F PD analyzer with an acquisition frequency of 20 MHz and a sensor bandwidth of 40 to 300 kHz. The high-voltage test platform model is TWI5133-10/100 am. PD signals are extracted under different voltage conditions; the experimental conditions are shown in Table 1. For each discharge model, 300 experimental samples were taken at each test voltage, and a discharge signal of one power frequency cycle was taken as one sample.

Figure 3.

Test setup in the laboratory.

Table 1.

Test conditions of the partial discharge model.

Discharge type	Initial voltage/kV	Breakdown voltage/kV	Experimental voltage / kV	Samples number
Floating discharge	3.2	8.5	5/6	60/60
Needle-plate discharge	10	14	12.2/13.1	60/60
Surface discharge	4.5	13	5.9/8.6	60/60
Corona discharge	6	11.5	7.2/9.7	60/60

One set of simulation results is shown in Figure 4. Figure 4(a) shows the waveform of the FD collected in the experiment, Figure 4(b) shows the waveform of the ND, Figure 4(c) shows the waveform of the SD, and Figure 4(d) shows the waveform of the CD.

Figure 4.

Measured signal waveforms of the simulated partial discharges: (a) Waveform of the simulated floating discharge signal. (b) Waveform of the simulated needle-plate discharge signal. (c) Waveform of the simulated surface discharge signal. (d) Waveform of the simulated corona discharge signal.

Feature extraction comparison

In this paper, we choose the number of decomposition layers of SSWFT to be 6. In order to verify the effectiveness of the SSWFT–MDE method, 400 PD samples (100 for each type) were randomly selected, and the feature extraction of the PD signals was then performed by MDE, EMD–MDE, and SSWFT–MDE. In the MDE method, the number of scale factors is 18. In the EMD method, the first five IMFs are retained, and the rest are merged as the sixth component. In SSWFT, the Blackman window is selected as the window function, and the width of the window is set to 30. All experiments are carried out in Matlab 2016a.

Figures 5–7 show the feature extraction results based on MDE, EMD–MDE, and SSWFT–MDE, respectively. It can be seen from Figure 5 that the MDEs of the four kinds of PD signals deviate significantly on some large scales (scales above 14), but they cross and overlap on most scales, which inevitably affects the recognition accuracy of PD types. So, it is difficult to distinguish the types of PD by MDE alone. It can be seen from Figures 6 and 7 that the SSWFT–MDE features of each discharge type have obvious differences, which supports a high recognition rate. However, except for FD, the EMD–MDE features of the other three discharges are very similar in trend and steepness, and the feature intervals overlap with each other, which makes it difficult to distinguish between them.

Figure 5.

The feature extraction results based on multi-scale dispersion entropy.

Figure 6.

The feature extraction results based on empirical mode decomposition–multi-scale dispersion entropy: (a) Floating discharge. (b) Needle-plate discharge. (c) Surface discharge. (d) Corona discharge.

Figure 7.

The feature extraction results based on synchrosqueezed windowed Fourier Transform–multi-scale dispersion entropy: (a) Floating discharge. (b) Needle-plate discharge. (c) Surface discharge. (d) Corona discharge.

Also, as shown in Figure 6, when the scale is lower than 4, there are different features between EMD–MDE, but when the scale is higher than 4, the differences between MDEs of the IMFs of EMD reduce gradually. It shows that the modal components obtained by EMD are simple and only few IMFs with small scale contain the discharge information. As can be seen from Figure 7, SSWFT overcomes the disadvantages of mode aliasing in EMD, and the modal components with different scales of SSWFT contain more detailed information of PD. Therefore, SSWFT–MDE features have better discrimination.

MDE reflects the intrinsic characteristics of the signal from the aspects of uncertainties and complexity. By observing the MDE of different discharge types, it can be seen that whether it is the SSWFT method or EMD method, the entropy of CD is the smallest and that of FD is the largest on most scales. According to the discharge process of different discharge types, the floating particulate matter during floating discharge has many states, such as static state, moving state, and so on. Therefore, the discharge process of FD has great randomness.

In SD and ND, the position of the initial discharge channel is not fixed. After many discharges, the discharge location mostly appears in the carbonization of the insulated cardboard. Compared with the other three discharge types, the CD pulses mostly appear near 270 degrees of the power frequency cycle, showing obvious polarity effect and strong regularity.

Pattern recognition of different discharge types

In order to analyze the influence of noise on the proposed method, Gaussian white noise is added to the collected PD signals to give signal-to-noise ratios (SNRs) of 10 dB and 5 dB. The experiment selected 400 samples (100 randomly selected for each type of discharge), and the SSWFT–MDE, EMD–MDE, and 14-dimensional MDE of the signal itself are taken as feature vectors. Then, the SVM classifier was used to recognize PD signals based on the three feature vectors.
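One common way to add white Gaussian noise at a prescribed SNR is sketched below. It assumes that the SNR is defined as the ratio of the mean signal power to the noise variance, which the text does not state explicitly; the function name and the `seed` parameter are hypothetical.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=None):
    """Return signal + zero-mean Gaussian noise scaled to the requested SNR in dB."""
    rng = random.Random(seed)
    signal_power = sum(v * v for v in signal) / len(signal)
    # SNR_dB = 10 * log10(signal_power / noise_power)
    noise_std = math.sqrt(signal_power / 10 ** (snr_db / 10))
    return [v + rng.gauss(0.0, noise_std) for v in signal]
```

Under this definition, lowering the SNR from 10 dB to 5 dB raises the noise power by a factor of about 3.16.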

PD recognition based on multi-scale entropy has the following characteristics: (1) PD recognition is a small-sample recognition problem. (2) The PD signal is nonlinear, and the multi-scale entropy feature is high-dimensional. (3) In practical applications, the PD signal should be recognized as quickly as possible. Although the artificial neural network has strong self-learning and nonlinear mapping abilities, it is sensitive to the selection of initial weights and thresholds, needs a large number of training samples, and easily falls into local minima. In particular, when the number of PD signal samples is insufficient, it is difficult to obtain high-precision classification results with an artificial neural network. As a machine learning method with a complete statistical basis, SVM avoids shortcomings of the artificial neural network such as network structure selection, under-learning, and over-learning. SVM can not only deal with nonlinear data effectively but also limit over-learning, so it is especially suitable for small-sample, nonlinear, and high-dimensional pattern recognition and regression analysis problems.²⁵ Since only limited sample data can be obtained in practical PD problems, SVM is chosen as the classifier in this paper. In the experiments, the kernel function of SVM is the Gaussian radial basis function. The width parameter of the kernel function is $σ = 5$ , the penalty factor is $C = 25$ , and fourfold cross validation was used. The extracted feature vector is input to the SVM classifier to realize PD-type recognition.
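The two classifier settings named above can be made concrete with a short sketch. It assumes the usual parametrization of the Gaussian kernel, $K (x, y) = \exp (- \| x - y \|^{2} / (2 σ^{2}))$ , which the text does not spell out, and it assigns cross-validation folds so that every fold contains the same share of each discharge type; this stratification is an assumption of the example. Both function names are hypothetical.

```python
import math


def rbf_kernel(x, y, sigma=5.0):
    """Gaussian radial kernel exp(-||x - y||^2 / (2 sigma^2)) (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def stratified_folds(labels, k=4):
    """Give each sample a fold index 0..k-1, cycling separately within each class."""
    seen = {}
    folds = []
    for lab in labels:
        count = seen.get(lab, 0)
        folds.append(count % k)
        seen[lab] = count + 1
    return folds
```

With this kernel form, $σ = 5$ corresponds to an exponent coefficient of $1 / (2 σ^{2}) = 0.02$ , which is the quantity many SVM libraries expose as a separate kernel-width parameter.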

Table 2 shows that the recognition accuracy of the SSWFT–MDE method is higher than those of the EMD–MDE method and the direct MDE method, whether the signal contains noise or not. Since SSWFT essentially applies an optimal threshold filter when selecting modal components, it is more robust to noise. Also, there is no spectrum aliasing or energy leakage between the modal components obtained by SSWFT decomposition, so the extracted multi-scale features can describe the time and frequency characteristics of the PD signal more accurately than the EMD–MDE method and the direct MDE method.

Table 2.

The recognition accuracy of the PD signal.

Method	Signal type	Floating (%)	Needle plate (%)	Surface (%)	Corona (%)
SSWFT–MDE	Original	98.34	97.57	98.63	97.13
SSWFT–MDE	Noisy signal 1	96.49	94.75	95.70	95.41
SSWFT–MDE	Noisy signal 2	93.19	92.35	93.58	92.68
EMD–MDE	Original	94.61	92.47	93.42	92.45
EMD–MDE	Noisy signal 1	91.97	90.64	89.32	85.97
EMD–MDE	Noisy signal 2	82.92	83.04	81.62	80.37
MDE	Original	91.17	89.86	90.74	90.07
MDE	Noisy signal 1	83.19	79.93	82.31	82.72
MDE	Noisy signal 2	78.27	74.12	69.49	73.35

PD: partial discharge; SSWFT: synchrosqueezed windowed Fourier transform; MDE: multi-scale dispersion entropy; EMD: Empirical mode decomposition.

Noisy signal 1 represents the noisy PD signal with SNR of 10 dB, and noisy signal 2 represents the noisy PD signal with SNR of 5 dB.

As the noise level increases, the recognition accuracy of the EMD–MDE method decreases markedly. The SSWFT–MDE method shows good stability and a higher recognition rate both before and after noise is added. Compared with the direct MDE method, the SSWFT–MDE method can more accurately describe the complexity information of the discharge signal at different resolutions and has better noise robustness. It can be seen from Table 2 that the average recognition rate of the SSWFT–MDE method is higher than 92%, and when compared with MDE and EMD–MDE, the recognition accuracy of SSWFT–MDE is improved by about 4% and 5%, respectively. From the experimental results, it can also be seen that in the case of small samples, SVM can still effectively identify the type of the PD signal. This shows that SVM can be used for PD-type recognition and has good application prospects.
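The averages discussed here can be recomputed directly from Table 2. The short sketch below copies the table values and averages them per method over all discharge types and noise conditions; the dictionary layout and function name are hypothetical, and averaging with equal weights over the twelve cells is an assumption about how the average rate is defined.

```python
from statistics import mean

# Recognition accuracy (%) from Table 2.
# Columns: floating, needle plate, surface, corona.
TABLE_2 = {
    "SSWFT-MDE": {
        "original": [98.34, 97.57, 98.63, 97.13],
        "10 dB": [96.49, 94.75, 95.70, 95.41],
        "5 dB": [93.19, 92.35, 93.58, 92.68],
    },
    "EMD-MDE": {
        "original": [94.61, 92.47, 93.42, 92.45],
        "10 dB": [91.97, 90.64, 89.32, 85.97],
        "5 dB": [82.92, 83.04, 81.62, 80.37],
    },
    "MDE": {
        "original": [91.17, 89.86, 90.74, 90.07],
        "10 dB": [83.19, 79.93, 82.31, 82.72],
        "5 dB": [78.27, 74.12, 69.49, 73.35],
    },
}


def average_accuracy(method, condition=None):
    """Mean accuracy of a method, over one noise condition or over all of them."""
    rows = TABLE_2[method]
    if condition is not None:
        return mean(rows[condition])
    return mean(v for row in rows.values() for v in row)


for name in TABLE_2:
    print(name, round(average_accuracy(name), 2))
```

Passing a `condition` such as `"original"` or `"5 dB"` restricts the average to one row group, which makes the growth of the gap between the methods under noise easy to inspect.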

Conclusion

In this paper, a method of PD feature extraction based on SSWFT and MDE is proposed for recognition of transformer PD type. The features extracted by the proposed method can effectively characterize the uncertainty and complexity of PD signals in different frequency bands and have strong robustness to noise. The experimental results show that the SSWFT–MDE feature can effectively identify four types of discharge in the presence of noise. The average correct rate is over 92%, which is better than those of the EMD–MDE method and direct MDE method. However, there are still some drawbacks in the SSWFT–MDE method of PD feature extraction, such as parameter selection relying on prior knowledge and slow calculation speed, which need to be further improved.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was financially supported by National Natural Science Fund (No.61671338, 51877161), fund of Open Research Fund Program of Key Laboratory of Digital Mapping and Land Information Application Engineering, NASG (No. GCWD201805), National Engineering Research Center for Water Transport Safety (No. A2019009), Hubei Key Laboratory of Transportation Internet of Things (NO.2018IOT006), the Natural Science Foundation of Guangdong Province (2016A030313710, 2015A030313624); the Science and Technology Program of Guangzhou (201607010170).

ORCID iD

Wang Wenbo

References

1. Koo, Jung, Ryu, et al. Identification of insulation defects in gas-insulated switchgear by chaotic analysis of partial discharge. IET Sci Meas Technol 2010; 4(3): 115–124.

2. Pompili. Partial discharge development and detection in dielectric liquids. IEEE Trans Dielectr Electr Insul 2009; 16(6): 1648–1654.

3. Beltle, Müller, Tenbohlen. Statistical analysis of online ultrahigh-frequency partial-discharge measurement of power transformers. IEEE Electr Insul Mag 2012; 28(6): 17–22.

4. Rostaminia, Saniei, Vakilian, et al. An efficient partial discharge pattern recognition method using texture analysis for transformer defect models. Int T Electr Energ Syst 2018; 28(7): e2558.

5. Wang, Liao, Yang, et al. Optimal features selected by NSGA-II for partial discharge pulses separation based on time frequency representation and matrix decomposition. IEEE Trans Dielectr Electr Insul 2013; 20(3): 825–836.

6. Majidi, Fadali, Etezadi Amoli, et al. Partial discharge pattern recognition via sparse representation and ANN. IEEE Trans Dielectr Electr Insul 2015; 22(2): 1061–1070.

7. Darabad, Vakilian, Phung, et al. An efficient diagnosis method for data mining on single PD pulses of transformer insulation defect models. IEEE Trans Dielectr Electr Insul 2013; 20(6): 2061–2072.

8. Yang, Kearns, et al. Fractal-based autonomous partial discharge pattern recognition method for MV motors. High Volt 2018; 3(2): 103–114.

9. Rostaminia, Saniei, Vakilian, et al. Evaluation of transformer core contribution to partial discharge electromagnetic waves propagation. Int J Elec Power 2016; 83: 40–48.

10. Evagorou, Kyprianou, Lewin, et al. Feature extraction of partial discharge signals using the wavelet packet transform and classification with a probabilistic neural network. IET Sci Meas Technol 2009; 4(3): 177–192.

11. Yang, Sheng, et al. Application of EEMD and high-order singular spectral entropy to feature extraction of partial discharge signals. IEEJ Trans Electr Electr Eng 2018; 13(7): 1002–1010.

12. Jia, Xie, et al. Power transformer partial discharge fault diagnosis based on multidimensional feature region. Math Probl Eng 2016; 2016: 4835694.

13. Oberlin, Meignen, Perrier. The Fourier-based synchrosqueezing transform. In: Acoustics, speech and signal processing (ICASSP), Florence, 4–9 May 2014, pp. 315–319. New York: IEEE.

14. Ratikanta, Sylvain, Thomas. Theoretical analysis of the second-order synchrosqueezing transform. Appl Comput Harmonic Anal 2018; 45(2): 379–404.

15. Bandt, Pompe. Permutation entropy: a natural complexity measure for time series. Phys Rev Lett 2002; 88: 1–4.

16. Wang, Liu, et al. The entropy algorithm and its variants in the fault diagnosis of rotating machinery: a review. IEEE Access 2018; 6: 66723–66741.

17. Cui, Zheng, Xin, et al. Feature extraction and classification method for switchgear faults based on sample entropy and cloud model. IET Gener Transm Distrib 2017; 11(11): 2938–2946.

18. Azami, Escudero. Improved multiscale permutation entropy for biomedical signal analysis: interpretation and application to electroencephalogram recordings. Biomed Signal Pr Control 2016; 23: 28–41.

19. Humeau-Heurtier. Refined scale-dependent permutation entropy to analyze systems complexity. Phys A Statist Mech Appl 2016; 450: 454–461.

20. Rostaghi, Azami. Dispersion entropy: a measure for time-series analysis. IEEE Signal Pr Lett 2016; 23(5): 610–614.

21. Azami, Rostaghi, Abasolo, et al. Refined composite multiscale dispersion entropy and its application to biomedical signals. IEEE T Biomed Eng 2017; 64(12): 2872–2879.

22. Haikun, Lun, Feng. Partial discharge feature extraction based on ensemble empirical mode decomposition and sample entropy. Entropy 2017; 19(9): 439.

23. Meignen, Oberlin, McLaughlin. A new algorithm for multicomponent signals analysis based on synchrosqueezing: with an application to signal sampling and denoising. IEEE T Signal Pr 2012; 60(11): 5787–5798.

24. Peng, Long, Ding. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE T Pattern Anal Mach Intel 2005; 27(8): 1226–1238.

25. Hao, Lewin PL. Partial discharge source discrimination using a support vector machine. IEEE Trans Dielectr Electr Insul 2010; 17(1): 189–197.

Partial discharge feature extraction based on synchrosqueezed windowed Fourier transform and multi-scale dispersion entropy
