
Abstract

Monitoring the degradation state of hydraulic pumps is of great significance to the safe and stable operation of equipment. As an important step, feature extraction has always been challenging. The non-stationary and nonlinear characteristics of vibration signals are likely to weaken the performance of traditional features. The two-dimensional image representation of vibration signals can provide more information for feature extraction, but it is challenging to obtain sufficient information based on small-size images. To solve these problems, a method for feature extraction based on modified hierarchical decomposition (MHD) and image processing is proposed in this paper. First, the set of signals obtained by MHD is converted into gray-scale images. Second, the features from accelerated segment test (FAST) algorithm is applied to detect the feature points of each gray-scale image. Third, the real part of a Gabor filter bank is used to convolve the images, and the responses of the feature points are used to calculate histograms that are regarded as feature vectors. The method for feature extraction fully acquires the multi-layered texture information of small-size images and removes the redundant information. Furthermore, support vector machine (SVM) and nondominated sorting genetic algorithm II (NSGA-II) are introduced to conduct feature selection and state identification. NSGA-II and SVM can conduct the joint optimization of these two goals. The details of the proposed method are validated using experimental data, and the results show that the highest recognition rate of our proposed method can reach 100%. The results of the comparison among the proposed method, local binary pattern (LBP), and one-dimensional ternary patterns (1D-TPs) confirm the superiority of the proposed method. It obtains the highest classification accuracy (99.7%–98%) and the lowest feature set dimension (13–10).

Keywords

modified hierarchical decomposition, image processing, hydraulic pump, degradation state identification

Introduction

The hydraulic pump, which has the advantages of high pressure and high efficiency, is the heart of a hydraulic system. As a component for transferring and converting energy, hydraulic pumps are widely used in the fields of transportation and industry.¹ Failure to troubleshoot hydraulic pumps in time will threaten the safe operation and reliability of the hydraulic system and even cause terrible accidents and economic losses.² Therefore, degradation state identification for hydraulic pumps is conducive to the reliable and long-term operation of the system. However, the structure of a hydraulic pump is more complex than that of other rotating machinery such as bearings. The thermal-solid-fluid coupling inside the hydraulic pump hides its fault information.³ Consequently, a novel feature extraction method is needed for the degradation state identification of hydraulic pumps.

The traditional features of the time domain, the frequency domain, and the time–frequency domain are widely used in the fields of fault diagnosis and prognosis. Applying frequency domain analysis and time domain analysis to non-stationary signals is not effective.⁴ Although time–frequency analysis can extract the features of a non-stationary signal more effectively, its stability is weakened when the signal has weak fault characteristics and nonlinear characteristics.⁵ In recent years, some novel methods for feature extraction have been proposed in the fields of fault diagnosis and condition monitoring. Hongru Li et al.⁶ extracted the power entropy and singular entropy of the modified composite spectrum from multi-channel vibration signals of the hydraulic pump and proposed the relative entropy method to fuse the initial features. Ying Jiang et al.⁷ proposed a feature indicator called hierarchical entropy based on sample entropy and hierarchical decomposition. Since then, some scholars have combined the hierarchical decomposition with other complexity indicators and proposed some novel feature extraction methods. With the support of hierarchical decomposition, Bing Han et al.⁸ developed the single-scale Lempel–Ziv complexity into hierarchical Lempel–Ziv complexity and applied it to fault feature extraction for rotating machinery. In the process of hierarchical decomposition, the length of the time series is shortened. Modified hierarchical decomposition (MHD), proposed by Yongbo Li et al.,⁹ overcomes this drawback. The moving-difference and moving-averaging procedures were used to improve hierarchical decomposition, and modified hierarchical permutation entropy was proposed based on permutation entropy and MHD. Cheng Yang et al.¹⁰ extended MHD to multiple scales and proposed hierarchical multi-scale symbolic dynamic entropy, which is a tensor feature extraction method based on MHD, multi-scale analysis, and symbolic dynamic entropy. The above studies related to the hierarchical decomposition show that it has the advantage of obtaining the information of signals from both high- and low-frequency components, so that it can reflect the fault information more accurately than traditional multi-scale analysis.

Inspired by the rapidly developing image processing technology, some scholars also applied it to extracting fault features. Kaplan et al.¹¹ applied texture analysis to diagnosing bearing faults. The one-dimensional (1D) vibration signals were converted into two-dimensional (2D) gray-scale images. Local binary pattern (LBP), a classic texture descriptor, was used to extract image features. Hao Zheng et al.¹² used a corner detection method called features from accelerated segment test (FAST) to identify the feature points in gray-scale images. They then used the unoriented scale-invariant feature transform (unoriented-SIFT) to describe the feature points. Melih Kuncan et al.¹³ developed the local ternary patterns for image processing into one-dimensional ternary patterns (1D-TP), which was used to extract bearing fault features from vibration signals. 1D-TP can be considered as a variant of image processing technology. Since deep learning can learn the deep abstract features of data and has a strong data expression, it has also been widely used in the field of mechanical fault diagnosis. Although deep learning can process 1D signals, images have been more widely used as its input. Jianyu Wang et al.¹⁴ proposed a bearing fault diagnosis model transferred from an AlexNet model. The images created by eight time–frequency analysis methods, including short-time Fourier transform, and fast kurtogram, were used as the input of the model. Yongliang Bai et al.¹⁵ proposed an algorithm to represent the frequency spectrum characteristics of vibration signals in image form, and a deep neural network (DNN) classified the images to diagnose the faults of wheelset-axlebox assemblies. Duy-Tang Hoang et al.¹⁶ converted the vibration signals into gray-scale images, and a deep convolutional model automatically extracted features from the images and recognized the bearing faults. 
Compared with a 1D signal, a 2D image representation has three main advantages: (1) For machinery working in a noisy environment, vibration signals are usually contaminated by noise. However, when the data is transformed into images, the added noise can be regarded as a change in the illumination of the image. Therefore, the effect of noise is suppressed.¹⁷ (2) Image representation can provide a comprehensive, detailed, and nonlinear description of the data.¹⁵ (3) From the perspective of human vision, 2D images are easier to distinguish. With a simple transformation of the signal, people can more easily classify each signal from the images.¹⁸

Although the application of image processing technology is novel and effective in the fields of fault diagnosis and prognosis, there are still several problems that need to be considered. First, an image often contains a large number of pixels in the field of image processing, while a fault sample is likely to contain a small number of data points in the field of fault diagnosis, which makes a 1D signal into a small-size image. It contains less information, and its local and global information needs to be considered comprehensively. Second, it is challenging to obtain sufficient information based on small-size images. Third, deep learning requires a large number of data samples to meet good recognition results. To extract the degradation features of hydraulic pumps more effectively, it is promising to combine hierarchical decomposition with image feature extraction. In this paper, an approach based on MHD, FAST algorithm, and Gabor filter is proposed to extract features. FAST is a feature point detection algorithm widely used in computer vision.¹⁹ The points with large difference in gray value of surrounding pixels are identified as feature points by FAST algorithm; thus, FAST is a local feature detector and suitable for signals with non-stationary characteristics.¹² The role of the Gabor filter in this paper is to describe the detected feature points. As a powerful tool in the field of computer vision, the Gabor filter proposed by Dennis Gabor is one of the best approaches for texture analysis.²⁰ Because of its high sensitivity to describing directional features, it is suitable for calculating the gradient of feature points. Wei Jia et al.²¹ used histogram of oriented gradients (HOG) descriptor improved by the Gabor filter to extract palmprint features. In this paper, a Gabor filter bank with one scale and several directions is applied to all feature points, and the response histogram is used as the feature vector of a gray-scale image.

In this study, after feature extraction, nondominated sorting genetic algorithm II (NSGA-II)²² and support vector machine (SVM)²³ are introduced to conduct feature selection and identify different degradation states of a hydraulic pump. Overall, a new strategy based on MHD, FAST algorithm, Gabor filter, NSGA-II, and SVM is proposed to identify degradation states of hydraulic pumps. The results of two experiments show that the proposed method is effective and reasonable. Compared with LBP and 1D-TP, it can obtain the highest classification accuracy with the fewest features.

The rest of this paper is organized as follows. The details of model establishment for the proposed degradation state identification method are presented in the second section. Experimental test results and discussion are given in the third section. The fourth section summarizes the full paper and gives conclusions.

Model establishment

Feature extraction

Modified hierarchical decomposition (MHD)

Compared with the conventional hierarchical decomposition, MHD proposed by Yongbo Li et al. has better stability.⁹ The calculation process of MHD includes four steps as follows:

Step 1: For a given time series $Q = \{q(i), i = 1, 2, \dots, L\}$, an averaging operator $O_{0}$ and a high operator $O_{1}$ can be expressed according to equations (1) and (2)

$$O_{0}(q) = \frac{q(i) + q(i + 1)}{2}, \quad i = 1, 2, \dots, L - 1 \tag{1}$$

$$O_{1}(q) = \frac{q(i) - q(i + 1)}{2}, \quad i = 1, 2, \dots, L - 1 \tag{2}$$

Step 2: The operator $O_{j}^{h}$ ($j = 0$ or $1$) at the hierarchical layer $h$ can be calculated according to equation (3)

$$O_{j}^{h} = \begin{bmatrix} \frac{1}{2} & \underbrace{0 \cdots 0}_{2^{h - 1} - 1} & \frac{(-1)^{j}}{2} & 0 & \cdots & 0 & 0 & 0 \\ 0 & \frac{1}{2} & \underbrace{0 \cdots 0}_{2^{h - 1} - 1} & \frac{(-1)^{j}}{2} & \cdots & 0 & 0 & 0 \\ & & & & \vdots & & & \\ 0 & 0 & 0 & 0 & \cdots & \frac{1}{2} & \underbrace{0 \cdots 0}_{2^{h - 1} - 1} & \frac{(-1)^{j}}{2} \end{bmatrix}_{(L - 2^{h} + 1) \times (L - 2^{h - 1} + 1)} \tag{3}$$

Step 3: For a given positive integer $h$ and unique vector $[v_{1}, v_{2}, \dots, v_{h}]$, the integer $e$ is expressed according to equation (4)

$$e = \sum_{m = 1}^{h} 2^{h - m} v_{m} \tag{4}$$

where $\{v_{m}, m = 1, \dots, h\} \in \{0, 1\}$ determines the selection of the high or averaging operator at the $m^{th}$ layer.

Step 4: Calculate the component at the node $e$ of the layer $h$ in the hierarchical structure according to the following equation

$$Q_{h, e} = O_{v_{h}}^{h} \cdot O_{v_{h - 1}}^{h - 1} \cdot \dots \cdot O_{v_{1}}^{1} \cdot Q \tag{5}$$

To illustrate the MHD, an example is presented in Figure 1. The time series $Q$ is decomposed into three layers, and the last layer has eight components. Take the case of h = 3 as an example: $Q_{3,2}$, the low-frequency component of $Q_{2,1}$, is the component at node 2 of layer 3. When e = 2 and h = 3, the unique vector $[v_{1}, v_{2}, v_{3}]$ = [0, 1, 0] according to equation (4). Consequently, $Q_{3,2}$ can be obtained as $O_{0}^{3} \cdot O_{1}^{2} \cdot O_{0}^{1} \cdot Q$ according to equation (5). Similarly, $Q_{3,3}$ is the high-frequency component of $Q_{2,1}$.

Figure 1.

An example of MHD with three layers.
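The four steps above can be sketched in a few lines of Python. This is only an illustrative reading of equations (1)–(5), not the authors' code: the function names `mhd_operator`, `mhd_component`, and `mhd_all_components` are hypothetical, and the matrix product of equation (3) is replaced by an equivalent array operation that pairs samples $2^{h-1}$ positions apart.

```python
import numpy as np

def mhd_operator(x, h, j):
    """Apply O_j^h: combine samples 2^(h-1) apart (j=0: average, j=1: difference)."""
    gap = 2 ** (h - 1)
    sign = 1.0 if j == 0 else -1.0
    # Row i of the matrix in equation (3) holds 1/2 at column i
    # and (-1)^j / 2 at column i + gap.
    return 0.5 * (x[:-gap] + sign * x[gap:])

def mhd_component(q, h, e):
    """Component Q_{h,e}: decode e into [v_1, ..., v_h] and chain the operators."""
    v = [(e >> (h - m)) & 1 for m in range(1, h + 1)]  # inverse of equation (4)
    x = np.asarray(q, dtype=float)
    for layer, v_m in enumerate(v, start=1):           # equation (5), applied from layer 1 upward
        x = mhd_operator(x, layer, v_m)
    return x

def mhd_all_components(q, h_max):
    """All components of layers 0..h_max; the signal itself is node (0, 0)."""
    comps = {(0, 0): np.asarray(q, dtype=float)}
    for h in range(1, h_max + 1):
        for e in range(2 ** h):
            comps[(h, e)] = mhd_component(q, h, e)
    return comps
```

For a signal of length $L$, a component at layer $h$ has $L - 2^{h} + 1$ points, so the series is shortened by only a few samples per layer rather than halved.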

FAST feature detection

As shown in Figure 2, a pixel $p$ is selected from an image, and the calculation method of FAST is as follows: The gray value of $p$ is denoted $I_{p}$, and an appropriate threshold $t$ is set. There are a series of pixels on a discretized circle with $p$ as the center and $r$ as the radius. For instance, there are 16 pixels on the circle with $r$ = 3 in Figure 2. Any pixel on the circle is denoted $z$, and its gray value is denoted $I_{z}$. If $I_{z}$ is not greater than $I_{p} - t$, $z$ is darker than $p$, and the status of $p \to z$ (denoted $S_{p \to z}$) is $d$. In general, the value of $S_{p \to z}$ is calculated according to equation (6)

$$S_{p \to z} = \begin{cases} b, & I_{p} + t \leq I_{z} \ \text{(brighter)} \\ s, & I_{p} - t < I_{z} < I_{p} + t \ \text{(similar)} \\ d, & I_{z} \leq I_{p} - t \ \text{(darker)} \end{cases} \tag{6}$$

Equation (6) is applied to all pixels on the circle, and two sets are defined as follows

$$Y_{b} = \{z : S_{p \to z} = b\} \tag{7}$$

$$Y_{d} = \{z : S_{p \to z} = d\} \tag{8}$$

The center $p$ is identified as a feature point if the number of elements in $Y_{b}$ or $Y_{d}$ is greater than $n$. The parameter $n$ is assigned by the user.

Figure 2.

FAST feature detection.
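As a rough illustration of equations (6)–(8), the test for a single candidate pixel might look as follows. The sketch follows the counting rule exactly as stated in the text (the size of $Y_{b}$ or $Y_{d}$ must exceed $n$); published FAST implementations usually also require the qualifying pixels to be contiguous on the circle, which is not modeled here. The offset table `CIRCLE_R3` (the 16-pixel circle of radius 3) and the function name `is_feature_point` are assumptions made for the example.

```python
import numpy as np

# Offsets (dx, dy) of the 16 pixels on a discretized circle of radius 3.
CIRCLE_R3 = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
             (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_feature_point(img, row, col, t, n):
    """Return True if pixel (row, col) passes the counting test of equations (6)-(8).

    The pixel must lie at least 3 pixels away from every image border.
    """
    i_p = int(img[row, col])  # cast to int so uint8 arithmetic cannot wrap around
    ring = [int(img[row + dy, col + dx]) for dx, dy in CIRCLE_R3]
    brighter = sum(i_z >= i_p + t for i_z in ring)  # |Y_b|
    darker = sum(i_z <= i_p - t for i_z in ring)    # |Y_d|
    return brighter > n or darker > n
```

With this rule, an isolated bright pixel on a flat background is detected because all 16 circle pixels fall into $Y_{d}$, while a pixel inside a uniform region is rejected.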

Gabor filter

In a two-dimensional space, the Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave,²⁴ which can be expressed as equation (9)

$$G(x, y; λ, θ, σ, γ) = \frac{1}{2 π σ^{2}} \exp\left(- \frac{x'^{2} + γ^{2} y'^{2}}{2 σ^{2}}\right) \exp\left(2 π i \frac{x'}{λ}\right) \tag{9}$$

Its real part is expressed as

$$G_{r}(x, y; λ, θ, σ, γ) = \frac{1}{2 π σ^{2}} \exp\left(- \frac{x'^{2} + γ^{2} y'^{2}}{2 σ^{2}}\right) \cos\left(2 π \frac{x'}{λ}\right) \tag{10}$$

$$\begin{cases} x' = x \cos θ + y \sin θ \\ y' = - x \sin θ + y \cos θ \end{cases} \tag{11}$$

where $i^{2} = -1$, $λ$ is the wavelength of the sinusoidal function, $θ$ controls the direction of the parallel strips in the Gabor kernel, $σ$ is the standard deviation of the Gaussian factor, and $γ$ is the spatial aspect ratio, which controls the ellipticity of the Gabor function. If $γ$ = 1, the image of the Gabor function is circular. A Gabor filter bank can be created by changing $θ$ and fixing the other parameters. The angle $θ_{c}$ is calculated according to equation (12)

$$θ_{c} = \frac{π (c - 1)}{n'}, \quad c = 1, 2, \dots, n' \tag{12}$$
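Equations (10)–(12) translate fairly directly into array code. The sketch below samples the real part of the Gabor function on an integer grid centered at the origin, which assumes an odd kernel size; the function names `gabor_real` and `gabor_bank` are hypothetical and are not taken from any library.

```python
import numpy as np

def gabor_real(ksize, lam, theta, sigma, gamma=1.0):
    """Real part of the Gabor function, equation (10), on a ksize x ksize grid."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_r = x * np.cos(theta) + y * np.sin(theta)    # equation (11)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + gamma ** 2 * y_r ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * x_r / lam) / (2.0 * np.pi * sigma ** 2)

def gabor_bank(ksize, lam, sigma, n_dirs, gamma=1.0):
    """One kernel per direction theta_c = pi * (c - 1) / n_dirs, equation (12)."""
    return [gabor_real(ksize, lam, np.pi * (c - 1) / n_dirs, sigma, gamma)
            for c in range(1, n_dirs + 1)]
```

Calling `gabor_bank(11, 2.13, 4.0, 6)` would give six 11×11 kernels for the directions 0, π/6, …, 5π/6.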

The proposed feature extraction method

For a given signal segment $Q = \{q(i), i = 1, 2, \dots, L\}$, the proposed feature extraction method consists of six steps. An example is shown in Figure 3. In this case, $Q$ is decomposed into three layers, and the Gabor filter bank has six different directions: (0, $π/6$, $π/3$, $π/2$, $2π/3$, $5π/6$).

Figure 3.

Illustration of the proposed feature extraction method (an example).

Step 1: Apply MHD to $Q$. Then, $(2^{h_{max} + 1} - 1)$ hierarchical components are obtained ($Q$ itself is treated as a component), where $h_{max}$ is the maximum value of the layer index $h$, which is also the total number of layers (Figure 3(b)).

Step 2: Transform each hierarchical component into a gray-scale image. First, each component is converted into an M × N matrix according to the method shown in Figure 4. The size of the matrix is chosen so that the difference between M and N is minimized. In other words, a square matrix is the best result. Note that several data points may be discarded. For example, for a given signal with 4088 data points, (M, N) is set to (68, 60), and 8 points are discarded. Then, all the elements of the matrix are converted to values between 0 and 255 according to equation (13)

$$\text{New } a_{i} = \left[\frac{a_{i} - \min(A)}{\max(A) - \min(A)}\right] \times 255 \tag{13}$$

where $a_{i}$ is any element in matrix A, max(A) is the maximum value of the elements in matrix A, and min(A) is the minimum value of the elements in matrix A. Finally, each matrix is converted to the uint8 type, and the gray-scale images are obtained (Figure 3(c)).

Figure 4.

Illustration of converting a one-dimensional signal into a two-dimensional matrix.

Step 3: FAST feature detection: The feature points in each gray-scale image are detected according to the method described in the subsection “FAST feature detection” (Figure 3(d)).

Step 4: A Gabor filter bank is created according to the method described in the subsection “Gabor filter”. Then, each gray-scale image is convolved with the real part of the filter bank. Finally, the gradient magnitude $m_{p}$ and orientation $θ_{p}$ corresponding to each feature point are calculated according to equations (14) and (15), respectively

$$m_{p} = \max_{c} [I_{p} * G_{r}(θ_{c})] \tag{14}$$

$$θ_{p} = \arg\max_{c} [I_{p} * G_{r}(θ_{c})] \tag{15}$$

where $c = 1, 2, \dots, n'$, $G_{r}(θ_{c})$ is the real part of the filter bank obtained according to equations (10)–(12), $I_{p}$ is the feature pixel, and $*$ means the convolution operation (Figure 3(e)).

Step 5: Create $n'$ bins with different $θ_{c}$ and calculate the histogram within each gray-scale image (HI) as follows (Figure 3(f))

$$HI(θ_{c})_{i} = HI(θ_{c})_{i} + m_{p}, \quad \text{if } θ_{p} = θ_{c} \tag{16}$$

Step 6: The histogram corresponding to the signal segment $Q$ (HS) can be obtained by integrating the histograms of the $(2^{h_{max} + 1} - 1)$ gray-scale images as follows (Figure 3(g))

$$HS = \{HI_{1}, HI_{2}, \dots, HI_{(2^{h_{max} + 1} - 1)}\} \tag{17}$$

Thus, HS can be considered as an $n' \times (2^{h_{max} + 1} - 1)$ dimensional vector.
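Steps 2 and 4–6 can be pictured with the following sketch. Several details are assumptions rather than statements from the paper: the signal is written into the matrix row by row, the cast to uint8 truncates instead of rounding, image borders are zero-padded, and all kernels are square with the same odd size. The function names and the arguments `m`, `n`, `points`, and `bank` are illustrative only; the feature points and the filter bank are expected to come from Step 3 and from the Gabor filter subsection, respectively.

```python
import numpy as np

def signal_to_gray_image(signal, m, n):
    """Step 2: keep the first m*n points, reshape, and scale to 0..255 (equation (13))."""
    a = np.asarray(signal, dtype=float)[: m * n].reshape(m, n)  # row-by-row fill (assumed)
    scaled = (a - a.min()) / (a.max() - a.min()) * 255.0        # assumes a non-constant signal
    return scaled.astype(np.uint8)                              # truncation (assumed)

def orientation_histogram(img, points, bank):
    """Steps 4-5: add the largest response of each feature point to its direction bin."""
    k = bank[0].shape[0]
    half = k // 2
    padded = np.pad(img.astype(float), half)  # zero padding at the borders (assumed)
    hist = np.zeros(len(bank))
    for row, col in points:
        patch = padded[row:row + k, col:col + k]
        # Flipping the kernel turns the element-wise product into a convolution.
        responses = [float(np.sum(patch * g[::-1, ::-1])) for g in bank]
        c = int(np.argmax(responses))  # equation (15): index of the winning direction
        hist[c] += responses[c]        # equations (14) and (16)
    return hist

def segment_histogram(images, points_per_image, bank):
    """Step 6: concatenate the per-image histograms into HS (equation (17))."""
    return np.concatenate([orientation_histogram(im, pts, bank)
                           for im, pts in zip(images, points_per_image)])
```

Evaluating the response only at the detected feature points, rather than convolving the whole image, gives the same values at those points and keeps the example short.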

Feature selection and state recognition based on NSGA-II and SVM

After performing feature extraction, a feature pool that contains a large amount of information is created. However, the feature pool generally includes redundant information, which impairs the performance of degradation state identification.²⁵ In this paper, NSGA-II and SVM are introduced for joint optimization of feature selection and degradation state identification.

Nondominated sorting genetic algorithm-II

NSGA is one of the earliest multi-objective evolutionary algorithms.²⁶ NSGA-II is an improved version that addresses the shortcomings of NSGA, such as lack of elitism and high computational complexity. As an algorithm based on nondominated sorting, NSGA-II has the same basic process as the original genetic algorithm.²⁷ With an elitist strategy, a fast nondominated sorting procedure, a simple yet efficient constraint-handling method, and a parameterless approach, NSGA-II has found increasing application in the fields of fault prognosis and diagnosis for equipment.²⁸,²⁹ The procedure of NSGA-II is summarized as follows:

Step 1: Initialize a random population, which consists of N individuals.

Step 2: Sort the population based on the nondominated sorting procedure and assign ranks to all individuals.

Step 3: Utilize selection, crossover, and mutation operators to create the offspring population $Q_{t}$ from the parent population $P_{t}$.

Step 4: Combine the offspring population and the parent population to form a whole and sort the whole according to the nondominated sorting procedure.

Step 5: Successively add the nondominated fronts in order of rank ($F_{1}$, $F_{2}$, …, $F_{k}$, …, $F_{n}$) to generate a new population $P_{t+1}$. The selection continues until adding a certain front $F_{k}$ would make the size of $P_{t+1}$ exceed the population size N. Then select individuals from $F_{k}$ according to the crowding distance sorting until the size of $P_{t+1}$ is equal to N.

Step 6: Go to step 2 if the stop condition is still not met.

For a better understanding, Step 4 and Step 5 are illustrated in Figure 5. For further details about NSGA-II, refer to Deb et al.²²

Figure 5.

Step 4 and 5 of NSGA-II.
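The survivor selection of Steps 4 and 5 can be sketched as below. This is a simplified illustration, not the reference implementation: fronts are found by repeatedly removing the nondominated individuals instead of using the fast bookkeeping of Deb et al., every objective is assumed to be minimized (so an accuracy would be passed as, for example, 1 − accuracy), and all function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated_fronts(objs):
    """Split individual indices into fronts F1, F2, ... by repeated peeling."""
    remaining = set(range(len(objs)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts

def crowding_distance(objs, front):
    """Crowding distance of each individual in one front (larger = less crowded)."""
    dist = {i: 0.0 for i in front}
    for m in range(len(objs[0])):
        order = sorted(front, key=lambda i: objs[i][m])
        low, high = objs[order[0]][m], objs[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")  # boundary points are always kept
        if high == low:
            continue
        for prev, cur, nxt in zip(order, order[1:], order[2:]):
            dist[cur] += (objs[nxt][m] - objs[prev][m]) / (high - low)
    return dist

def select_next_population(objs, n):
    """Fill P_{t+1} front by front; truncate the last front by crowding distance."""
    chosen = []
    for front in nondominated_fronts(objs):
        if len(chosen) + len(front) <= n:
            chosen.extend(front)
        else:
            dist = crowding_distance(objs, front)
            ranked = sorted(front, key=lambda i: dist[i], reverse=True)
            chosen.extend(ranked[: n - len(chosen)])
            break
    return chosen
```

Here `objs` would hold one objective vector per individual of the combined parent and offspring population, and the returned list contains the indices that survive into the next generation.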

Support vector machine

As one of the classic machine learning techniques, SVM is widely used in pattern recognition. It learns by example to assign labels to objects.³⁰ SVM finds an optimal classification hyperplane so that the margin, that is, the sum of the distances from this hyperplane to the nearest sample in each class, is maximized. Take the binary classification problem as an example. For n linearly separable samples $(x_{1}, y_{1})$, $(x_{2}, y_{2})$, …, $(x_{n}, y_{n})$, $x \in R^{m}$ (where m is the sample dimension, and $y_{1}, y_{2}, \dots, y_{n}$ are classification labels), finding the hyperplane can be transformed into solving the following problem

$$\max_{λ} \sum_{i = 1}^{n} λ_{i} - \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} λ_{i} λ_{j} y_{i} y_{j} (x_{i} \cdot x_{j}) \quad \text{s.t.} \ \sum_{i = 1}^{n} λ_{i} y_{i} = 0, \ λ_{i} \geq 0 \tag{18}$$

As the only variable in problem (18), $λ$ can be solved by quadratic programming. We denote the optimal solution by $λ^{*} = (λ_{1}^{*}, λ_{2}^{*}, \dots, λ_{n}^{*})$. For the classification of nonlinear data, the commonly used method is to map the data into a high-dimensional space through a nonlinear mapping function $φ$ and to construct the classification hyperplane in that space. The kernel function that replaces the dot product operation is defined as

$$K(x_{i}, x_{j}) = φ(x_{i}) \cdot φ(x_{j}) \tag{19}$$

Nonlinear data is mapped to a high-dimensional space, and the problem (18) can be transformed into the following expression

$$\max_{λ} \sum_{i = 1}^{n} λ_{i} - \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} λ_{i} λ_{j} y_{i} y_{j} K(x_{i}, x_{j}) \quad \text{s.t.} \ \sum_{i = 1}^{n} λ_{i} y_{i} = 0, \ 0 \leq λ_{i} \leq C \tag{20}$$

where C is the penalty factor. Consequently, the classification function is expressed as equation (21)

$$f(x) = \text{sgn}\left[\sum_{i = 1}^{n} λ_{i}^{*} y_{i} K(x_{i}, x) + b^{*}\right] \tag{21}$$

In this paper, the radial basis function (RBF) kernel is selected due to its good performance and universal application,³¹ and the RBF kernel of samples $x$ and $x'$ is defined as follows

$$K(x, x') = \exp\left(- \frac{\| x - x' \|^{2}}{2 σ^{2}}\right) \tag{22}$$

where $f = 1 / (2 σ^{2})$ is the kernel parameter.

The details of SVM are described in Hearst et al.³²
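As a small worked example of equation (22), the RBF kernel can be written in terms of the kernel parameter $f = 1/(2σ^{2})$. The function name `rbf_kernel` is hypothetical, and the snippet only evaluates the kernel for one pair of samples; it does not train an SVM.

```python
import numpy as np

def rbf_kernel(x, x_prime, f):
    """RBF kernel of equation (22), written with f = 1 / (2 * sigma**2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(x_prime, dtype=float)
    return float(np.exp(-f * np.dot(diff, diff)))

# Example: two 2-dimensional samples at squared distance 2, with f = 0.5.
value = rbf_kernel([0.0, 0.0], [1.0, 1.0], f=0.5)  # equals exp(-1)
```

A sample compared with itself always gives a kernel value of 1, and the value decays toward 0 as the samples move apart; a larger $f$ makes the decay faster.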

The proposed method for feature selection and state recognition

In this study, NSGA-II has two objective functions: the recognition accuracy and the total number of features fed into SVM. The recognition accuracy is the classification accuracy provided by SVM. The chromosomes of individuals in the population are encoded in binary form: “0” means the corresponding feature is not selected, and “1” means it is selected. When NSGA-II runs, the algorithm maximizes the recognition accuracy and minimizes the total number of features simultaneously.

A multi-objective optimization problem yields multiple optimal solutions, which together form the “Pareto front.”³³ The solution on the Pareto front with the highest classification accuracy is selected. In other words, by performing this step, we obtain the recognition result with the highest accuracy and its corresponding lowest-dimensional feature subset. The detailed process is shown in Figure 6.

Figure 6.

Flowchart of joint optimization of feature selection and state recognition using NSGA-II and SVM.
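The binary encoding and the final choice from the Pareto front can be illustrated as follows. The callable `accuracy_fn` stands in for training and testing an SVM on the selected feature columns and is purely hypothetical, as are the function names; assigning zero accuracy to an empty feature subset and breaking accuracy ties by the smaller subset are assumptions of this sketch.

```python
def objectives(chromosome, accuracy_fn):
    """Decode a 0/1 chromosome and return (accuracy, number of selected features)."""
    selected = [i for i, bit in enumerate(chromosome) if bit == 1]
    if not selected:
        return (0.0, 0)  # assumed convention: no features means no recognition
    return (accuracy_fn(selected), len(selected))

def pick_final_solution(pareto_front):
    """From (chromosome, accuracy, n_features) tuples, take the most accurate one.

    Ties in accuracy are broken in favor of fewer features (assumed).
    """
    return max(pareto_front, key=lambda s: (s[1], -s[2]))
```

In a full run, `objectives` would be evaluated for every individual in every generation, and `pick_final_solution` would be applied once to the final nondominated set.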

Suitable parameters determination

Several parameters in the algorithms described above need to be determined: the hierarchical layer h; the radius r and the two thresholds (t, n) of FAST feature detection; the shared $(λ, σ, γ)$ and the different angles $θ_{c}$ ($c = 1, 2, \dots, n'$) of the Gabor filter bank; the kernel parameter $f$ of SVM; and several parameters of NSGA-II. Note that the size of the filter kernel needs to be determined before using a Gabor filter bank. For the selection of h, a value that is too small causes insufficient decomposition, and a value that is too large is time-consuming. We set h = 3. In our code, the parameter $f$ of SVM is set to “auto,” which enables $f$ to change adaptively with the feature subset during the operation of NSGA-II. That is, $f = 1 / N_{f}$, where $N_{f}$ is the number of features. According to the recommendation in Rosten et al.,³⁴ we choose n = 12 and r = 4. The more angle parameters $θ_{c}$ we set, the more detailed the description of the image we obtain. However, this also takes more time. Furthermore, when the number of angles changes in the interval [4, 6], it has little effect on the classification accuracy.³⁵ Consequently, we take the empirical value as $(θ_{1}, θ_{2}, θ_{3}, θ_{4}, θ_{5}, θ_{6})$ = (0, $π/6$, $π/3$, $π/2$, $2π/3$, $5π/6$). According to the recommendation in Zhang et al.,³⁶ we choose $γ = 1$. The parameters of NSGA-II are set to the following relatively balanced values: the total number of evolutionary generations = 500, crossover probability = 0.6, mutation probability = 0.6, and population size = 30. Compared with the parameters not yet set, they have weaker effects on the recognition accuracy. Then, four more important parameters, t, $λ$, $σ$, and the Gabor kernel size (Ksize), need to be fixed. In fact, the relationship between $λ$ and $σ$ can be expressed as follows³⁷

$$\frac{λ}{σ} = π \frac{2^{b} - 1}{2^{b} + 1} \cdot \sqrt{\frac{2}{\ln 2}} \tag{23}$$

where b is the half-response spatial frequency bandwidth of the filter. Specifically, inspired by Bianconi et al.,³⁵ we use equation (24) to express the relationship between $λ$ and $σ$

$$λ = \frac{2 (σ + (\sqrt{\ln 2} / π))}{σ} \tag{24}$$

Therefore, we still need to assign values to three independent parameters. The four parameters are selected as follows based on experimental results: t = 3, $λ$ = 2.13, $σ$ = 4, and Ksize = (11×11). The specific details are presented in the section “Experimental validation.”
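The value $λ$ = 2.13 follows from equation (24) with $σ$ = 4, which can be checked numerically. The function name `wavelength_from_sigma` is hypothetical.

```python
import math

def wavelength_from_sigma(sigma):
    """Equation (24): lambda = 2 * (sigma + sqrt(ln 2) / pi) / sigma."""
    return 2.0 * (sigma + math.sqrt(math.log(2.0)) / math.pi) / sigma

lam = wavelength_from_sigma(4.0)
print(round(lam, 2))  # prints 2.13
```

Because $λ$ is fully determined by $σ$ in this way, only t, $σ$, and Ksize remain as independent parameters to tune.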

The proposed degradation state identification method for hydraulic pump

The proposed method consists of the following four steps:

Step 1: Acquire the vibration signals for various degradation states of the hydraulic pump and divide them into several samples.

Step 2: Utilize the method described in the subsection “The proposed feature extraction method” to extract features from each sample. With $n' = 6$ and $h_{max} = 3$, 90 features are acquired according to the formula $n' \times (2^{h_{max} + 1} - 1)$.

Step 3: Integrate the features of all samples into a new dataset, and partition it into a training dataset and a testing dataset with a ratio of 1:1.

Step 4: Utilize the method described in the subsection “The proposed method for feature selection and state recognition” to identify the different degradation states of the hydraulic pump, and the best feature subset is also obtained.
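The feature count in Step 2 and the 1:1 partition in Step 3 can be checked with a short sketch. The paper does not state how the partition is drawn, so the random permutation used here (and the `seed` argument) is an assumption; both function names are hypothetical.

```python
import numpy as np

def feature_count(n_dirs, h_max):
    """Number of features: n' * (2^(h_max + 1) - 1)."""
    return n_dirs * (2 ** (h_max + 1) - 1)

def split_half(n_samples, seed=0):
    """Random 1:1 partition of sample indices into training and testing sets (assumed)."""
    order = np.random.default_rng(seed).permutation(n_samples)
    half = n_samples // 2
    return order[:half], order[half:]

n_features = feature_count(6, 3)      # 6 directions, 3 layers -> 90 features
train_idx, test_idx = split_half(600)  # 600 samples -> 300 training, 300 testing
```

With six directions and three layers, the 15 hierarchical components (including the original signal) each contribute six histogram bins, which gives the 90 features quoted in Step 2.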

Experimental validation

A hydraulic pump test platform shown in Figure 7 was set up for data acquisition, which consists of a cooling system, a control system, a signal monitoring, acquisition and display system, a pressure regulating system, and a drive system.

Figure 7.

Hydraulic pump test platform.

The hydraulic pump for this study is an axial piston pump with the following parameters: type: L10VS028DFR; displacement at the rated working condition: 28 mL/r; rated pressure: 22 MPa; and rated rotation speed: 1480 r/min. Three acceleration sensors are installed in three mutually orthogonal directions, as shown in Figure 8. They acquire vibration signals at a sampling frequency of 50 kHz; each sampling lasts for 1 s, and the interval between two samplings is 30 s.

Figure 8.

Layout of the three vibration sensors.

The single loose boot is studied in this paper because the loose boot is one of the common fault patterns of hydraulic pump. The gap between the plunger and the boot will increase when the loose boot occurs. To acquire vibration signals close to the actual situation, the normal plungers are replaced with failed plungers obtained after equipment maintenance. Five different degrees of the loose boot, as shown in Figure 9, are considered. Vernier caliper is used to measure the maximum radial distance between the boot and the plunger under the five different degrees. The five measurements considered as the loose degree are 0.12 mm, 0.18 mm, 0.3 mm, 0.42 mm, and 0.64 mm, respectively. Including the normal state which is considered as a special degradation state, a total of six different degradation states are considered. The data of each degradation state is divided into 100 samples, and each sample consists of 4095 data points. The time domain waveforms of several samples are shown in Figure 10. Furthermore, one sample of each degradation state is decomposed into two sub-signals using MHD, and their time domain waveforms and gray-scale image representations are shown in Figure 11. When the figure is analyzed, it is difficult to separate the degradation states from each other using the time domain waveforms. However, all images appear to be differentiated from each other. Therefore, the conclusion that the image representation of the signal is useful is reinforced.

Figure 9.

Five different degrees of loose boot: (a) 0.12 mm; (b) 0.18 mm; (c) 0.3 mm; (d) 0.42 mm; and (e) 0.64 mm.

Figure 10.

Time domain waveforms of several vibration signal samples.

Figure 11.

Time domain waveforms and image representations of the sub-signals: (a) normal; (b) loose degree: 0.12 mm; (c) loose degree: 0.18 mm; (d) loose degree: 0.3 mm; (e) loose degree: 0.42 mm; and (f) loose degree: 0.64 mm.

Experiment 1: Selection of parameters $t$ , $λ$ , $σ$ , and Ksize

In this experiment, 600 data samples are directly converted into 600 images for testing. The threshold $t$ affects the number of detected feature points. One sample of each degradation state is taken to analyze the relationship between $t$ and the number of feature points, as shown in Figure 12. It can be seen that the number of feature points decreases as $t$ increases. Too few feature points lose too much information, whereas too many feature points cost more computation time and capture useless information. Therefore, our goal is to choose a greater $t$ while guaranteeing the recognition accuracy. In addition, when $t$ is greater than 20, the number of feature points in each state is less than 200 except for the loose degree 0.64 mm, and there are fewer than 100 feature points in the normal state, which loses too much information. Therefore, the $t$ value should be selected in the interval [1, 20].

Figure 12.

The number of feature points of different degradation states under different t.

The Ksize value should be generally an odd number so that the filter kernel is centered on a pixel. As the Ksize value increases, neighbors in a larger neighborhood of a pixel participate in the convolution operation on the pixel. Some unrelated neighbors may participate if the Ksize value is too great. Likewise, some useful neighbors may be discarded if the Ksize value is too small.³⁸

The parameter $σ$ should match the texture boundary. The window overlaps different texture regions if $σ$ is too great. However, a $σ$ that is too small causes window-position perturbations and makes the output unstable.³⁹

To verify the relationship between these three parameters and classification accuracy, we assign values to them for testing. The result is shown in Figure 13. It can be seen that the distribution of scattered points is not messy but regular. The scattered points with high classification accuracy are clustered together. The following phenomena have been noticed: (1) When Ksize varies within the range above 10, it has little effect on accuracy. The scattered points with high classification accuracy are basically in this range. (2) The scattered points with high classification accuracy are mainly gathered in the open interval $σ \in (2.5, 5)$. (3) The ideal value of $t$ is in the interval [1, 7.5].

Figure 13.

Two angles of the visualized relationship between classification accuracy and the parameters Ksize, σ, and t: (a) angle 1 and (b) angle 2.

To finally determine the parameter values, we set the ranges of the three parameters to $σ \in [2, 6]$, $t \in [1, 7]$, and Ksize $\in [11, 16]$, each with an incremental step size of 1. The test result is visualized in Figure 14. The two points with the highest classification accuracy can be easily found (surrounded by red dotted boxes). Therefore, the ideal parameter settings are $t$ = 3, $σ$ = 4, Ksize = (11×11), and $λ$ = 2.13 according to equation (24).

Figure 14.

Classification accuracy under different values of Ksize, σ, and t.
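The exhaustive search over the three ranges can be sketched as follows. This is only an outline of the procedure. The callable `evaluate` stands in for the full pipeline (feature extraction followed by classification), and `toy_evaluate` is a hypothetical placeholder whose peak was placed at the reported optimum purely for illustration. It does not reproduce any measured accuracy.

```python
from itertools import product

# Parameter ranges from the text, each with a step size of 1.
SIGMAS = range(2, 7)      # sigma = 2, ..., 6
THRESHOLDS = range(1, 8)  # t = 1, ..., 7
KSIZES = range(11, 17)    # Ksize = 11, ..., 16


def grid_search(evaluate):
    """Return (best_score, best_params) over the full parameter grid."""
    best_score, best_params = float("-inf"), None
    for sigma, t, ksize in product(SIGMAS, THRESHOLDS, KSIZES):
        score = evaluate(sigma, t, ksize)
        if score > best_score:
            best_score, best_params = score, (sigma, t, ksize)
    return best_score, best_params


def toy_evaluate(sigma, t, ksize):
    # Placeholder only: NOT the paper's classifier. A real run would extract
    # features with these parameters and return the classification accuracy.
    return -((sigma - 4) ** 2 + (t - 3) ** 2 + (ksize - 11) ** 2)


n_points = len(SIGMAS) * len(THRESHOLDS) * len(KSIZES)
best_score, best_params = grid_search(toy_evaluate)
print(n_points, best_params)
```

The grid contains 5 × 7 × 6 = 210 parameter combinations, each of which requires one full evaluation of the pipeline. A step size of 1 also places even Ksize values (12, 14, 16) in the grid.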

Experiment 2: Verification for the necessity of modified hierarchical decomposition

The following four methods are used to identify the degradation states of the hydraulic pump.

Method 1: The steps of Method 1 are the same as those described in subsection 2.4 except that the MHD is not applied. That is, only the 6 features of the original signal are extracted.

Method 2: The $H O L_{G a b o r}$ descriptor described in Jia et al.²¹ is modified with FAST feature detection and used for feature extraction. The gradients of the detected feature points instead of all pixels are calculated. The gradient magnitudes of the detected feature points in the cell are used to vote into the orientation histogram. The other steps of this method are the same as those of Method 1. According to the results of multiple tests, cell size = $(30 \times 30)$ pixels and block size = $(2 \times 2)$ cells are selected in this method. The image corresponding to each sample is divided into 4 cells, and 24 features are acquired.

Method 3: The proposed method described in subsection 2.4, which yields 90 features.

Method 4: Method 2 is applied after the MHD has been applied to the original signal, which yields 360 features.
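The feature counts quoted for the four methods can be cross-checked with simple arithmetic. The check below assumes that the count with MHD equals the per-signal count multiplied by the number $N$ of signals available after decomposition, a number that is not restated in this section.

$$
N = \frac{90}{6} = \frac{360}{24} = 15, \qquad \frac{24}{4} = 6 .
$$

Under that assumption, both pairs of methods give the same value $N = 15$, and Method 2 produces 6 features per cell, which equals the 6 features that Method 1 extracts from the undivided image.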

From the above description of the four methods, it can be seen that the main difference between Method 1 and Method 2 is as follows: the image window in Method 2 is divided into small spatial regions called “cells,” and several adjacent “cells” are merged into a “block,” whereas the image window in Method 1 is not divided into cells. Each method is implemented 10 times to reduce random effects. The highest classification accuracy obtained in each run is shown in Figure 15. Table 1 gives the statistical results of accuracy.

Figure 15.

Classification accuracy of the 4 methods in experiment 2.

Table 1.

Classification accuracy statistics of the 4 methods in experiment 2.

Method	Max	Min	Mean
1	0.88	0.867	0.876
2	0.9	0.88	0.888
3	1	0.987	0.997
4	0.982	0.967	0.977

The following information is drawn from Figure 15 and Table 1. The classification accuracy obtained by Method 2 is higher than that obtained by Method 1. More noticeably, the classification accuracy obtained by the proposed method (Method 3) is higher than that obtained by Method 4 (mean 0.997 versus 0.977), and the proposed method achieves the highest classification accuracy (98.7%–100%). There are three reasons for this result. First, in Method 2, more detailed information can be obtained since the image is divided into cells. However, each sample consists of only 4095 data points, which results in a small gray-scale image, so the contribution of dividing the image into cells to the classification accuracy is limited. Second, dividing the image into cells and applying the MHD together improves the classification accuracy, but it also produces more redundant information. Third, for samples with a small number of data points, applying MHD and FAST feature detection without dividing the image into cells is more beneficial for obtaining useful information.

Experiment 3: Comparison among the proposed method, LBP, and 1D-TP

As mentioned in Kaplan et al.,¹¹ when LBP is applied to texture feature extraction, all non-uniform patterns share a single bin in the histogram, and each uniform pattern is assigned its own bin. We set 8 neighbors for the central pixel, so 59 features are acquired. For 1D-TP, we set 8 neighbors for the central point and the threshold $β = 4.5$, which yields two feature sets of 256 low features and 256 up features, respectively.¹³ For a fair comparison, the four feature sets obtained by the three methods should have the same dimension. Therefore, principal component analysis (PCA)⁴⁰ is introduced to reduce the dimensionality of the feature sets generated by the two methods other than LBP, so that all feature sets are 59-dimensional. To facilitate the comparison of the identification results, the method for feature selection and state recognition described in subsection 2.2.3 is also used after LBP (abbreviated as LBP-NSGA-II-SVM) and 1D-TP (abbreviated as 1D-TP-NSGA-II-SVM). Each method is run 15 times to reduce random effects. The classification accuracy and the corresponding number of features obtained in each run are shown in Figure 16. Table 2 gives the statistical results of accuracy; the minimum numbers of features corresponding to the highest and lowest accuracy are shown in parentheses after them. First, the proposed method achieves the highest identification accuracy (98%–99.7%). Second, the proposed method requires the fewest features (10–13). Third, the other methods achieve lower identification accuracy (95.8%–99.3%) and require more features (13–20). Therefore, the superiority of the proposed method is confirmed.
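The feature dimensions of the two baseline methods (59 for LBP, 256 + 256 for 1D-TP) follow from how their patterns are binned. The sketch below illustrates both counts under stated assumptions. The uniform-pattern rule (at most two 0/1 transitions in the circular 8-bit code) is the standard LBP definition. For 1D-TP, taking 4 points on each side of the center and the bit ordering are assumptions, and the helper names and the sample signal are hypothetical.

```python
def transitions(pattern, bits=8):
    """Count 0/1 transitions in the circular binary pattern."""
    count = 0
    for i in range(bits):
        a = (pattern >> i) & 1
        b = (pattern >> ((i + 1) % bits)) & 1
        count += a != b
    return count


# LBP with 8 neighbours: one bin per uniform pattern, one shared bin for the rest.
uniform = [p for p in range(256) if transitions(p) <= 2]
bin_of = {p: i for i, p in enumerate(uniform)}
shared_bin = len(uniform)
n_bins = len(uniform) + 1


def lbp_bin(pattern):
    """Histogram bin of an 8-bit LBP code."""
    return bin_of.get(pattern, shared_bin)


def one_d_tp_histograms(signal, beta, half=4):
    """Up and low 256-bin histograms of a 1D ternary pattern (sketch)."""
    up_hist, low_hist = [0] * 256, [0] * 256
    for i in range(half, len(signal) - half):
        center = signal[i]
        # Assumption: 4 neighbours before and 4 after the centre point.
        neighbors = signal[i - half:i] + signal[i + 1:i + half + 1]
        up = low = 0
        for k, v in enumerate(neighbors):
            if v > center + beta:      # ternary value +1 goes to the up code
                up |= 1 << k
            elif v < center - beta:    # ternary value -1 goes to the low code
                low |= 1 << k
        up_hist[up] += 1
        low_hist[low] += 1
    return up_hist, low_hist


signal = [0.0, 5.0, 10.0, 3.0, 8.0, 20.0, 1.0, 7.0, 12.0, 6.0, 9.0, 2.0, 15.0, 4.0]
up_hist, low_hist = one_d_tp_histograms(signal, beta=4.5)
print(n_bins, len(up_hist), len(low_hist))
```

There are 58 uniform 8-bit patterns, so the LBP histogram has 58 + 1 = 59 bins. Each 1D-TP code is an 8-bit number, which gives the 256 up features and 256 low features that PCA then reduces to 59 dimensions.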

Figure 16.

Comparison results of three methods in experiment 3: (a) classification accuracy and (b) number of features corresponding to classification accuracy.

Table 2.

Degradation state identification of the 3 methods in experiment 3.

The proposed method			LBP-NSGA-II-SVM			1D-TP-NSGA-II-SVM (up features)			1D-TP-NSGA-II-SVM (low features)
Max	Min	Mean	Max	Min	Mean	Max	Min	Mean	Max	Min	Mean
0.997(13)	0.98(10)	0.991	0.977(20)	0.958(16)	0.969	0.993(17)	0.97(13)	0.982	0.99(16)	0.971(14)	0.979
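Because feature selection here pursues two goals at once (higher accuracy, fewer features), the entries of Table 2 can be compared with the dominance relation used in nondominated sorting. The sketch below applies that relation to the eight (accuracy, number of features) pairs in Table 2. It is an illustration of the comparison only, since these pairs are per-method extremes over 15 runs and not a population from a single NSGA-II run. The labels and the function name `dominates` are hypothetical.

```python
# (accuracy, number of features) pairs copied from Table 2.
points = {
    "proposed max": (0.997, 13),
    "proposed min": (0.98, 10),
    "LBP max": (0.977, 20),
    "LBP min": (0.958, 16),
    "1D-TP up max": (0.993, 17),
    "1D-TP up min": (0.97, 13),
    "1D-TP low max": (0.99, 16),
    "1D-TP low min": (0.971, 14),
}


def dominates(a, b):
    """a dominates b if it is no worse in both objectives and better in at least one."""
    acc_a, n_a = a
    acc_b, n_b = b
    no_worse = acc_a >= acc_b and n_a <= n_b
    strictly_better = acc_a > acc_b or n_a < n_b
    return no_worse and strictly_better


# Entries that no other entry dominates.
front = sorted(
    name for name, p in points.items()
    if not any(dominates(q, p) for q in points.values())
)
print(front)  # ['proposed max', 'proposed min']
```

Both entries of the proposed method are nondominated, while every LBP and 1D-TP entry is dominated by (0.997, 13). This restates the two observations above in a single comparison.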

There are three reasons for these results. First, the proposed method for feature extraction combines the merits of FAST and the Gabor filter, so it can comprehensively reflect the local and global information of images. Second, the MHD enhances the integrity of the mined information. Third, in contrast, LBP and 1D-TP are insufficient in reflecting the information of signals.

Summary

To effectively extract the degradation features of hydraulic pumps, a novel method based on MHD and image feature extraction is proposed. The original signals are decomposed and transformed into gray-scale images. The FAST algorithm and Gabor filters are applied to the images, and the histograms constructed from the filter responses at the feature points are used as feature vectors. They can fully reflect the local and global information of small-size samples. The effectiveness of the novel method and its superiority in degradation feature extraction are validated using the experimental data. Furthermore, a new strategy for identifying the degradation states of hydraulic pumps is proposed based on our feature extraction method, NSGA-II, and SVM. The experimental results show that the six different degradation states of the hydraulic pump can be successfully identified by the proposed method. This paper has four major contributions:

(1) The combination of FAST feature detection and Gabor filters can effectively obtain local and global information of the image.

(2) MHD was introduced to increase the ability to obtain information.

(3) A novel method based on our feature extraction method, NSGA-II and SVM was proposed, which was applied to degradation state identification for hydraulic pumps.

(4) The superiority of the method proposed in this paper was demonstrated with experimental signals.

In this study, the experimental data come from a hydraulic pump with prefabricated faults. The purpose of the experiment is to verify the effectiveness of the proposed method in identifying the degradation state of the hydraulic pump. For condition monitoring of an actual hydraulic pump, the real-time vibration data of the pump are acquired first. After the data are divided into samples, the feature representations are calculated according to the feature extraction method proposed in this paper. The features of the vibration data from the prefabricated faults are extracted in advance. In this way, two sample sets are obtained. The sample set belonging to the actual hydraulic pump is used as the testing set, and the other is used as the training set. These two sample sets are fed to the NSGA-II-SVM proposed in this paper to conduct feature selection and state recognition, so that the degradation state of the actual hydraulic pump can be identified. As the identification continues, the training set can be continuously updated, which benefits subsequent identification.

In previous studies, we proposed some effective feature indicators. Fusing the degradation features proposed in this paper with them is the focus of further studies.

Footnotes

Acknowledgments

The authors gratefully acknowledge the support of National Natural Science Foundation of China (Grant No. 51275524).

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project is supported by National Natural Science Foundation of China (Grant No. 51275524).

ORCID iDs

Hong-ru Li

References

1. Wang, Zhang, et al. A load sequence design method for hydraulic piston pump based on time-related Markov matrix. IEEE Trans Reliab 2018; 67: 1237–1248.

2. A conditional factor VAE model for pump degradation assessment under varying conditions. Appl Soft Comput 2021; 100: 106992.

3. Vibration signal fusion using improved empirical wavelet transform and variance contribution rate for weak fault detection of hydraulic pumps. ISA Trans 2020; 107: 385–401.

4. Chen, Cheng, et al. Fault diagnosis of planetary gear based on entropy feature fusion of DTCWT and OKFDA. J Vib Control 2017; 24(21): 5044–5061.

5. Kumar. Time-frequency analysis and support vector machine in automatic detection of defect from vibration signal of centrifugal pump. Measurement 2017; 108: 119–133.

6. Sun, Tian. A novel method based upon modified composite spectrum and relative entropy for degradation feature extraction of hydraulic pump. Mech Syst Signal Process 2019; 114: 399–412.

7. Jiang, Peng. Hierarchical entropy analysis for biological signals. J Comput Appl Math 2011; 236: 728–742.

8. Han, Wang, Zhu, et al. Intelligent fault diagnosis of rotating machinery using hierarchical Lempel-Ziv complexity. Appl Sci 2020; 10: 4221.

9. Yang, et al. A fault diagnosis scheme for planetary gearboxes using adaptive multi-scale morphology filter and modified hierarchical permutation entropy. Mech Syst Signal Process 2018; 105: 319–337.

10. Yang, Jia. Health condition identification for rolling bearing based on hierarchical multiscale symbolic dynamic entropy and least squares support tensor machine–based binary tree. Struct Health Monit 2020; 20: 151–172.

11. Kaplan, Kaya, Kuncan, et al. An improved feature extraction method using texture analysis with LBP for bearing fault diagnosis. Appl Soft Comput 2020; 87: 106019.

12. Zheng, Cheng, et al. A new fault diagnosis method for planetary gear based on image feature extraction and bag-of-words model. Measurement 2019; 145: 1–13.

13. Kuncan, Kaplan, Mi Naz, et al. A novel feature extraction method for bearing fault classification with one dimensional ternary patterns. ISA Trans 2020; 100: 346–357.

14. Wang, Zhang, et al. A deep learning method for bearing fault diagnosis based on time-frequency image. IEEE Access 2019; 7: 42373–42383.

15. Jya, et al. Image representation of vibration signals and its application in intelligent compound fault diagnosis in railway vehicle wheelset-axlebox assemblies. Mech Syst Signal Process 2021; 152: 107421.

16. Hoang, Kang. Rolling element bearing fault diagnosis using convolutional neural network and vibration image. Cogn Syst Res 2018; 53: 42–50.

17. Ruiz, Mujica, Alferez, et al. Wind turbine fault detection and classification by means of image texture analysis. Mech Syst Signal Process 2018; 107: 149–167.

18. Zhang, Peng. Bearings fault diagnosis based on convolutional neural networks with 2-D representation of vibration signals as input. MATEC Web of Conferences 2017; 95: 13001.

19. Bhat. Makeup invariant face recognition using features from accelerated segment test and eigen vectors. Int J Image Grap 2017; 17: 1750005.

20. Zuñiga, Florindo, Bruno. Gabor wavelets combined with volumetric fractal dimension applied to texture analysis. Pattern Recognit Lett 2014; 36: 135–143.

21. Jia, R-X, Lei Y-K, et al. Histogram of oriented lines for palmprint recognition. IEEE Trans Syst Man Cybern Syst 2014; 44: 385–395.

22. Deb, Pratap, Agarwal, et al. A fast and elitist multiobjective genetic algorithm NSGA-II. IEEE Trans Evol Comput 2002; 6: 182–197.

23. Noble. What is a support vector machine? Nat Biotechnol 2006; 24: 1565–1567.

24. Humeau-Heurtier. Texture feature extraction methods: a survey. IEEE Access 2019; 7: 8975–9000.

25. X-S, Qin A-S, et al. Machinery fault diagnosis scheme using redefined dimensionless indicators and mRMR feature selection. IEEE Access 2020; 8: 40313–40326.

26. Srinivas, Deb. Multiobjective optimization using nondominated sorting in genetic algorithms. Evol Comput 1994; 2: 221–248.

27. Mirjalili, Dong, Sadiq, et al. Genetic algorithm: theory, literature review, and application in image reconstruction. In: Mirjalili, Song Dong, Lewis (eds) Nature-inspired Optimizers. Cham: Springer, 2020, pp. 69–85, Vol. 811.

28. Wang, Zhao, Yuan, et al. Application of NSGA-II Algorithm for fault diagnosis in power system. Electric Power Syst Res 2019; 175: 105893.

29. Srivastava, Bhat, Singh. Fault diagnosis, service restoration, and data loss mitigation through multi-agent system in a smart power distribution grid. Energy Sourc Recovery, Util, Environ Eff 2020: 1–26.

30. Boser. A training algorithm for optimal margin classifiers. Proc Annu ACM Workshop Comput Learn Theor 2008; 5: 144–152.

31. Scholkopf, Kah-Kay, Burges CJC, et al. Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Trans Signal Process 1997; 45: 2758–2765.

32. Hearst, Dumais, Osuna, et al. Support vector machines. IEEE Intell Syst Their Appl 1998; 13: 18–28.

33. Kim, de Weck. Adaptive weighted-sum method for bi-objective optimization: Pareto front generation. Struct Multidiscip Optim 2004; 29: 149–158.

34. Rosten, Porter, Drummond. FASTER and better: a machine learning approach to corner detection. IEEE Trans Softw Eng 2010; 32: 105–119.

35. Bianconi, Fernández. Evaluation of the effects of Gabor filter parameters on texture classification. Pattern Recognit 2007; 40: 3325–3335.

36. Zhang, et al. Adaptive learning Gabor filter for finger-vein recognition. IEEE Access 2019; 7: 159821–159830.

37. Wei, Yuan, et al. A new Gabor filter-based method for automatic recognition of hatched residential areas. IEEE Access 2019; 7: 40649–40662.

38. Chen, Zhang. Effects of different Gabor filters parameters on image retrieval by texture. In: 10th International Multimedia Modelling Conference, Brisbane, Australia, 5–7 January 2004, pp. 273–278. New York: IEEE.

39. Dunn, Higgins. Optimal Gabor filters for texture segmentation. IEEE Trans Image Processing 1995; 4: 947–964.

40. Abdi, Williams. Principal component analysis. Wiley Interdiscip Rev Comput Stat 2010; 2: 433–459.

Degradation state identification for hydraulic pumps using modified hierarchical decomposition and image processing


Experiment 1: Selection of parameters $t$ , $λ$ , $σ$ , and Ksize