Abstract
Fault diagnosis is, by nature, a problem of processing varied information obtained from different sources. Evidence theory, which handles information viewed as evidence efficiently, is widely used in fault diagnosis. However, a shortcoming of existing fault diagnosis methods is that they yield only a probability distribution rather than a basic probability assignment. This article proposes a novel method of generating basic probability assignments that takes information quality into account. The probability distribution is determined from the preliminary matrix and the sampling matrix, both of which are constructed from sensor data. The quality of the probability distribution is then taken as the discount factor, and the remaining belief is assigned to the universal set, which yields the basic probability assignment. The basic probability assignments are then combined using Dempster–Shafer evidence theory to determine the status of the engine. An engine fault application illustrates the practicability of the proposed method. A comparison between the method that takes information quality into account (the proposed method) and the one that does not shows that the former performs better. Finally, the reliability analysis shows that the proposed method is highly reliable: performance accuracy is 100% when the error rate is less than 10%.
Introduction
Fault diagnosis is important for a system to be corrected in time and to work smoothly. It has been applied extensively in many fields, such as mechanics,1–4 chemistry,5 nuclear engineering,6 and electrical engineering.7 A number of approaches to optimizing fault diagnosis algorithms have been proposed, such as the average current Park’s vector approach,8 fuzzy approaches,9,10 and an optimized threshold de-noising method.2
In a real system, a large quantity of information can be obtained from all kinds of sensors that measure concrete values. Consequently, decision-makers need to reach a rational result by considering all of this complicated information,11–15 whose certainty may be very high or very low. Several information fusion methods16 used in fault diagnosis and other fields have been proposed, such as the Kalman filter,17–19 neural networks,20–22 and fuzzy logic.23–28
In addition, evidence theory29,30 is efficient in dealing with uncertainty,31–37 and Dempster’s rule can combine evidence from different sources without prior information. Building on evidence theory, D-numbers,38–40 Z-numbers,41,42 and related extensions have been studied by many scholars. As a result, evidence theory is used not only in medical diagnosis,43–45 dependence assessment,46 correlation,47,48 data streams,49 forensic crime investigations,50 traffic,51 and target recognition52,53 but also in fault diagnosis.54–59 There is much research on applying evidence theory to fault diagnosis. For example, a new combination rule57 was built to allocate the conflicting information from multiple sensors based on the support degree of the focal elements. A novel weighted evidence combination rule60 based on evidence distance and an uncertainty measure was proposed. A novel method61 that comprehensively analyzes vibration and temperature signals to diagnose bearing faults based on improved Dempster–Shafer (D-S) evidence theory was presented. A novel dissolved gas analysis (DGA) method62 for power transformer incipient fault diagnosis based on an integrated adaptive neuro-fuzzy inference system (ANFIS) and Dempster–Shafer theory (DST) was presented. A weak thruster fault detection method63 was developed based on the combination of an artificial immune system and signal pre-processing. The architecture of an expert system64 that uses flame images captured during the combustion process in an experimental oil furnace as input parameters was presented. An effective method65 for precise fault diagnosis of planetary gearboxes based on the fusion of vibration and acoustic data using DST was proposed. A new transformer fault diagnosis method66 based on a wavelet neural network optimized by an adaptive genetic algorithm (AGA) and an improved D-S evidence theory fusion technique was proposed.
In Basir and Yuan,67 a method for making rational decisions about engine quality is proposed, together with a criterion for evaluating the performance of the proposed information fusion system. However, two issues need to be improved. First, the mass function obtained by calculating the distance between the measured features and the fault prototypes is only a probability distribution over singletons and, strictly speaking, cannot be called a basic probability assignment. Second, information quality is not taken into consideration.
To handle these issues, this article proposes a new method in which the Shannon entropy68,69 of each probability distribution obtained from each feature is used as the discount factor70–72 to acquire the basic probability assignment. All the measurements obtained from sensors can characterize two faults, (1) X1: exhaust valve fault and (2) X2: piston ring fault, but not all of the measurements can distinguish the faults accurately in every situation; in other words, measurements differ in information quality. Therefore, the mass function is obtained on the basis of the probability distribution while taking information quality into account. The main advantages of this method are that using the Shannon entropy as the discount factor, with the discounted belief assigned to the universal set, yields a basic probability assignment instead of a probability distribution and reduces conflict so that the evidence can be fused efficiently.
This article is organized as follows. Section “Preliminary” introduces the concepts and rules of evidence theory and the notation and formulation of the Shannon entropy. Section “Proposed method” presents the frame of discernment and the new evidence combination. Section “Application in fault diagnosis” uses an application to illustrate the efficiency and reliability of the proposed method. Section “Conclusion” briefly concludes the study.
Preliminary
Preliminary notion of the D-S evidence theory
Evidence theory is the classical mathematical theory of evidence, which originated in Dempster’s work on lower and upper probability distribution families and was extended by Shafer. Some basic concepts and functions are introduced as follows:
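1. The frame of discernment: Let
$$\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\} \quad (1)$$
be a finite set of $N$ mutually exclusive and exhaustive hypotheses, called the frame of discernment. The set consisting of all the subsets of $\Theta$ is the power set
$$2^{\Theta} = \{\emptyset, \{\theta_1\}, \ldots, \{\theta_N\}, \{\theta_1, \theta_2\}, \ldots, \Theta\} \quad (2)$$
where $\emptyset$ is the empty set.
2. Mass function: A key concept defined on the frame of discernment is the basic probability assignment (BPA), which is a mapping of the power set $2^{\Theta}$ to the interval $[0, 1]$
$$m: 2^{\Theta} \rightarrow [0, 1] \quad (3)$$
that satisfies
$$m(\emptyset) = 0, \qquad \sum_{A \subseteq \Theta} m(A) = 1 \quad (4)$$
where $A$ is any subset of $\Theta$ and $m(A)$ is the belief committed exactly to $A$.
3. Rules of evidence combination: Assuming $m_1$ and $m_2$ are two BPAs on the same frame of discernment obtained from independent sources, Dempster’s rule of combination is
$$m(A) = \frac{1}{1 - K} \sum_{B \cap C = A} m_1(B)\, m_2(C), \quad A \neq \emptyset, \qquad m(\emptyset) = 0 \quad (5)$$
with
$$K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C) \quad (6)$$
The coefficient $K$ measures the degree of conflict between the two pieces of evidence, and the rule is defined only when $K < 1$.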
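As an illustration of the combination rule, the following is a minimal Python sketch of Dempster’s rule for two BPAs. The dictionary representation (focal elements as `frozenset` keys), the function name `dempster_combine`, and the numerical masses are assumptions made for this example only; they are not taken from the article.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two BPAs with Dempster's rule.

    Each BPA is a dict mapping a focal element (frozenset) to its mass.
    """
    combined = {}
    conflict = 0.0  # the conflict coefficient K
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        intersection = b & c
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Normalize by 1 - K so that the combined masses sum to one.
    return {a: mass / (1.0 - conflict) for a, mass in combined.items()}


# Hypothetical masses on a two-fault frame {X1, X2}.
theta = frozenset({"X1", "X2"})
m1 = {frozenset({"X1"}): 0.6, frozenset({"X2"}): 0.1, theta: 0.3}
m2 = {frozenset({"X1"}): 0.5, frozenset({"X2"}): 0.2, theta: 0.3}
fused = dempster_combine(m1, m2)
```

Because the rule is associative, more than two BPAs can be fused by applying the function repeatedly.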
Shannon entropy
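Entropy, of which the Shannon entropy is the most widely accepted form, is an important concept for describing a probability distribution. For a discrete probability distribution $P = \{p_1, p_2, \ldots, p_n\}$, the Shannon entropy is defined as
$$E(P) = -\sum_{i=1}^{n} p_i \log_b p_i \quad (7)$$
where $b$ is the base of the logarithm; when $b = 2$, the unit of entropy is the bit.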
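For concreteness, the Shannon entropy of a discrete distribution can be computed with a few lines of Python. This is a small sketch; the function name `shannon_entropy` and the choice of base 2 as the default are assumptions for illustration.

```python
import math


def shannon_entropy(probs, base=2):
    """Shannon entropy of a discrete distribution given as a sequence of probabilities.

    Terms with zero probability contribute nothing (0 * log 0 is taken as 0).
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)


uniform = shannon_entropy([0.5, 0.5])  # maximal uncertainty for two outcomes
certain = shannon_entropy([1.0, 0.0])  # no uncertainty
skewed = shannon_entropy([0.9, 0.1])   # lies between the two extremes
```

A uniform distribution gives the largest entropy, so a feature whose probabilities over the faults are nearly equal carries little discriminating information.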
Proposed method
The proposed method is shown in Figure 1 and detailed as follows:

Figure 1. The flowchart of the proposed method.
Faults in the frame of discernment
To solve a fault diagnosis problem using evidence theory, a frame of discernment is necessary. Each element of the frame of discernment represents a fault that the machine may have when working. For example, we take
where
In fault diagnosis, possessing information is the prerequisite of all work. Accordingly, various sensors, such as vibration sensors and acoustic sensors, are used to detect the characteristic values that describe the features used to determine the status of the engine and to decide which fault has taken place. Besides the information obtained from sensors while the engine is working, the reference parameters of the specific engine are also needed; otherwise, there is nothing to compare the measurements against. The feature values that correspond to the engine being in each specific fault must therefore be obtained. Consequently, a preliminary feature matrix of size $N \times M$ is constructed
$$H = \begin{bmatrix} h_{11} & h_{12} & \cdots & h_{1M} \\ h_{21} & h_{22} & \cdots & h_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ h_{N1} & h_{N2} & \cdots & h_{NM} \end{bmatrix}$$
where $N$ represents the number of faults that may take place in the engine, $M$ represents the number of feature values that the sensors can obtain, and $h_{ij}$ is the reference value of the $j$th feature under the $i$th fault.
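After the preliminary feature matrix is established, the values obtained from all sensors while the engine is working are used to construct a sampling matrix, which is compared with the preliminary feature matrix. Let
$$S = [s_{kj}]$$
where $k$ represents the $k$th sampling and $s_{kj}$ is the value of the $j$th feature in the $k$th sampling, so that the $k$th row $S^k = (s_{k1}, s_{k2}, \ldots, s_{kM})$ is the $k$th measurement vector.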
A key problem is choosing the method used to calculate the basic probability assignment. Clearly, the more similar the measurement vector $S^k$ obtained from the sensors is to the row vector $(h_{i1}, h_{i2}, \ldots, h_{iM})$ of the preliminary matrix, the more probable it is that the corresponding fault $X_i$ has occurred; conversely, the less similar it is, the less probable the fault. There are many measures for quantifying the distance between a measured feature obtained from the sensors and the corresponding reference value. The absolute distance is used in this article:
$$d_{kij} = \left| s_{kj} - h_{ij} \right|$$
where $k$ is the $k$th sample, $i$ is the $i$th fault, and $j$ is the $j$th feature that the sensors detect.
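The absolute distance between one measurement vector and every fault prototype can be sketched in Python as follows. The function name `absolute_distances` and all numerical values are hypothetical and serve only to show the indexing (rows are faults, columns are features); they are not the article’s data.

```python
def absolute_distances(sample, prototypes):
    """Return d[i][j] = |s_j - h_ij| for one measurement vector.

    sample:     list of M measured feature values (one sampling)
    prototypes: N rows of M reference values, one row per fault
    """
    return [[abs(s - h) for s, h in zip(sample, row)] for row in prototypes]


# Hypothetical reference values: 2 faults x 3 features.
H = [
    [10.0, 2.0, 5.0],  # prototype of fault X1
    [4.0, 8.0, 5.5],   # prototype of fault X2
]
S_k = [9.0, 3.0, 5.2]  # hypothetical k-th sampling

D_k = absolute_distances(S_k, H)
```

Repeating the computation for every sampling stacks the resulting tables along a third index, one table per sample.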
The distances between all sensor measurements and the reference parameter values can be arranged in matrix form 73
Each slice along the third dimension of the matrix holds the distances between the measurements obtained from the sensors and all fault states, including the non-fault state. The smaller the distance
After normalization, a probability matrix of
Accordingly, the probabilities are obtained from the above matrix. Each vector in the matrix, obtained from a measurement and the preliminary parameters, can be used as a probability distribution because each feature can detect the faults. According to the preceding analysis, each fault is determined by many features. However, some features cannot correctly distinguish the types of faults in some situations. For example, the feature value is equal to 10 when fault
To make
Then, for
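The discounting step described in this section, in which each probability is scaled by a discount factor and the remaining belief is assigned to the universal set, can be sketched in Python as follows. The function names are hypothetical. In particular, `entropy_quality`, which maps the Shannon entropy to a factor in [0, 1] as one minus the normalized entropy, is only an assumed stand-in for illustration and should not be read as the article’s exact formula.

```python
import math


def entropy_quality(probs):
    """ASSUMED quality measure: 1 - (Shannon entropy / maximum entropy).

    Equals 1 for a fully decisive distribution and 0 for a uniform one.
    """
    n = len(probs)
    if n < 2:
        return 1.0
    entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
    return 1.0 - entropy / math.log2(n)


def discount_to_bpa(probs, alpha):
    """Turn a probability distribution over singletons into a BPA.

    Each singleton keeps alpha * p; the rest (1 - alpha) goes to the universal set.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    universal = frozenset(probs)
    bpa = {frozenset({fault}): alpha * p for fault, p in probs.items()}
    bpa[universal] = bpa.get(universal, 0.0) + (1.0 - alpha)
    return bpa


# Hypothetical probability distribution produced by one feature.
p_feature = {"X1": 0.9, "X2": 0.1}
alpha = entropy_quality(p_feature)
bpa = discount_to_bpa(p_feature, alpha)
```

Under this assumed mapping, a feature that cannot separate the faults (a nearly uniform distribution) transfers almost all of its belief to the universal set, which reduces its influence and its conflict with other features during fusion.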
Evidence combination in fault diagnosis
Many rules based on D-S evidence theory have been proposed to fuse information. Although these rules74,75 measure and reduce conflict or make the result intuitively rational, they also have drawbacks, such as increased computational complexity or the loss of associativity. Because the basic probability assignments above have less conflict once information quality is taken into consideration, the classic Dempster’s rule, which has lower computational complexity, is the best choice for fusing information. Consequently, according to equations (5) and (6), a mass function can be obtained as follows
The decision is made on the basis of the mass function. Decision-makers use the maximum support rule, in which the hypothesis with the maximum belief is chosen to represent the state of the engine. Intuitively, if
In addition, the situation in which an unknown fault occurs while the engine is working is analyzed qualitatively. If such a fault occurs, the quality of the information obtained by the sensors is reduced, because the fault has not been included in the frame of discernment. Then, based on
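The maximum support rule can be illustrated with a short Python sketch. It is restricted to singleton hypotheses, for which the belief equals the mass; the function name `max_support` and the fused masses are hypothetical values chosen for the example.

```python
def max_support(bpa, faults):
    """Pick the single fault with the largest belief.

    For a singleton {x}, the belief Bel({x}) equals the mass m({x}),
    so only the singleton masses need to be compared.
    """
    support = {x: bpa.get(frozenset({x}), 0.0) for x in faults}
    return max(support, key=support.get)


# Hypothetical fused mass function on the frame {X1, X2}.
fused_bpa = {
    frozenset({"X1"}): 0.76,
    frozenset({"X2"}): 0.13,
    frozenset({"X1", "X2"}): 0.11,
}
decision = max_support(fused_bpa, ["X1", "X2"])
```

The mass left on the universal set does not enter the comparison; a large value there signals that the fused evidence is of low quality, which is the situation discussed above for an unknown fault.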
Application in fault diagnosis
A practical application of the proposed method
In this section, an application for detecting the state of the engine76,77 is given to illustrate the proposed fault diagnosis method. Three sensors (two acceleration sensors and one acoustic sensor) are used to detect the state of the engine. One acceleration sensor is mounted on the cylinder cover near the outlet valve. The other is located on the cylinder cover near the inlet valve. Their peak-to-peak value (P-to-P) in the time domain and the frequency of the maximum spectrum
Then, the power set of X is established as follows
Six features are taken into consideration for each state of the engine and their characteristic values are obtained as follows
The values of H, taken from Basir and Yuan,67 are shown in Table 1.
Table 1. Features according to previous knowledge.
Then, the characteristic values of the features are collected, and a feature matrix is constructed from the data obtained from the sensors after four samplings as follows
The values of S, taken from Song and Jiang,78 are shown in Table 2.
Table 2. Features according to samplings.
According to equations (9)–(12), the probability matrix can be established as follows
The result of the calculation is shown in Table 3.
Table 3. Probability distribution.
All the measurements obtained from the sensors can characterize the two faults, but not all of the measurements can distinguish the faults accurately in every situation; in other words, measurements differ in information quality. The discount factor obtained from the Shannon entropy, which represents information quality, is used to construct the basic probability assignments. Therefore,
The value of
Table 4. The discount factor.
After the basic probability assignment matrix has been constructed by considering information quality, Dempster’s rule of evidence combination is used to obtain the mass function. According to equations (5) and (6), the final result is as follows
To show the difference between the mass functions intuitively, a histogram is shown in Figure 2.
Decision-makers use the maximum support rule, in which the hypothesis with the maximum belief is chosen to represent the state of the engine. Consequently, from Figure 2, the basic probability assignment (0.9981) of

Figure 2. The values of the mass function for the methods that do and do not take information quality into account.
In Figure 2, both methods reach the effective result that the fault is
Reliability analysis and comparison with the typical method
In this section, the reliability of the proposed method is analyzed by simulating random errors in actual measurements.67 The maximum support rule is used as the decision criterion in this analysis.
When the sensors perform a measurement, the data obtained contain a certain error because of the measurement error of the sensors themselves. On this basis, a random error rate of no more than a certain percentage (5%, 10%, 15%, and 20%) is applied to the data in Table 2 for the reliability analysis. The proposed method is used to process the new data, and the performance accuracy is then obtained according to the maximum support rule; it is shown in Table 5. When the error rate is less than 10%, the performance accuracy is 100%, so the proposed method has strong reliability.
Table 5. Performance accuracy.
Basir and Yuan67 used three sensors (two acceleration sensors and one acoustic sensor) to test the efficiency of the typical method. Its performance accuracy is displayed on the left side of Table 5. A comparison of the performance of the new method and the typical method shows that the former is slightly better than the latter.
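The perturbation procedure used in this reliability analysis can be sketched in Python as follows. Several points are assumptions made for illustration: the error is modeled as a uniformly distributed relative error bounded by the stated rate, the diagnosis procedure is passed in as a callable named `diagnose`, and the toy nearest-prototype classifier and all numbers are hypothetical rather than the article’s method or data.

```python
import random


def perturb(samples, max_error_rate, rng):
    """Apply a random relative error of at most max_error_rate to every value.

    ASSUMPTION: the error is uniform on [-max_error_rate, +max_error_rate].
    """
    return [
        [v * (1.0 + rng.uniform(-max_error_rate, max_error_rate)) for v in row]
        for row in samples
    ]


def performance_accuracy(samples, true_fault, diagnose, max_error_rate,
                         trials=1000, seed=0):
    """Fraction of noisy trials in which diagnose() still returns the true fault."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if diagnose(perturb(samples, max_error_rate, rng)) == true_fault:
            hits += 1
    return hits / trials


# Hypothetical prototypes and samplings (2 features) for a toy classifier.
PROTOTYPES = {"X1": [10.0, 2.0], "X2": [4.0, 8.0]}


def toy_diagnose(samples):
    """Stand-in diagnosis: choose the prototype with the smallest total absolute distance."""
    def total_distance(proto):
        return sum(abs(s - h) for row in samples for s, h in zip(row, proto))
    return min(PROTOTYPES, key=lambda fault: total_distance(PROTOTYPES[fault]))


samples = [[9.5, 2.2], [10.4, 1.9]]
accuracy_10 = performance_accuracy(samples, "X1", toy_diagnose, 0.10)
```

Replacing `toy_diagnose` with a full pipeline (distance, normalization, discounting, fusion, and the maximum support rule) and repeating the call for each error rate would give one accuracy value per rate.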
Conclusion
This article proposes a new method in which the entropy of each probability distribution obtained from each feature is used as the discount factor to obtain the basic probability assignment. First, the frame of discernment is constructed. Second, the preliminary matrix is established from expert knowledge. Third, the basic probability assignment matrix is obtained by calculating the distance between the values from the samplings and the preliminary matrix. Last, Dempster’s rule is used to combine the evidence and obtain the final result. The preceding example demonstrates that the proposed method is efficient, and the reliability analysis shows that it is also highly reliable. A limitation of this article is that subsets containing multiple faults are not taken into account, although the engine may have multiple faults at the same time. Therefore, the future research direction is to extend the method from single-fault subsets to multiple-fault subsets in order to identify one or more possible faults of the engine more effectively at the same time.
Footnotes
Acknowledgements
The authors greatly appreciate the reviewers’ suggestions and the editor’s encouragement.
Handling Editor: ZW Zhong
Data availability statement
The authors confirm that the data sources in this paper are public.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by National Natural Science Foundation of China (Grant Nos 61573290 and 61503237).
