An improved method to determine basic probability assignment with interval number and its application in classification

Abstract

Owing to its efficiency in handling uncertain information, Dempster–Shafer evidence theory has become an important tool in many information fusion systems. However, how to determine the basic probability assignment, which is the first step in evidence theory, is still an open issue. In this article, a new method integrating interval number theory and the k-means++ clustering method is proposed to determine the basic probability assignment. First, the k-means++ clustering method is used to calculate the lower and upper bounds of the interval numbers from training data. Then, the degree of differentiation between the test sample and the constructed models, based on the distance and similarity of interval numbers, is defined to generate the basic probability assignment. Finally, Dempster’s combination rule is used to combine multiple basic probability assignments into the final basic probability assignment. Experiments on the Iris data set, which is widely used in classification problems, illustrate that the proposed method is effective both in determining the basic probability assignment and in classification, with a classification accuracy of up to 96.7%.

Keywords

Dempster–Shafer evidence theory, basic probability assignment, interval number, recognition

Introduction

Information fusion is a process to combine information from the multi-source of same object or scene to obtain more complex, reliable, and accurate information. Multi-source information fusion technology plays a significant role in real applications, such as classification problem,^1–6 fault diagnosis,^7–9 medical diagnosis,¹⁰ risk and reliability analysis,¹¹ decision-making,^12,13 tracking problem,^14,15 and online estimation of batteries state-of-charge.¹⁶ There are many methods to analyze fused data from multi-sources including Dempster–Shafer evidence theory (DS theory),^1–4 principal component analysis (PCA),^17,18 independent component analysis (ICA),^19,20 and Z-number.^21–23 As an effective tool,²⁴ DS theory has been widely used especially for classification problem.^25–27 Compared with ICA and PCA in combination with another kind of approaches, Dempster’s combination rule can fuse multi-source information without depending on prior information²⁸ and it has tremendous advantages in uncertainty modeling and evidence combination, that is, it allows for the allocation of probability mass to sets or intervals.²⁹ Many scholars have made great contribution to improve evidence theory, for example, Su and Xu extended the evidence theory which can combine dependent evidence,^30,31 Gong et al.³² proposed a new basic probability assignment (BPA) construction method, and Jiang and Hu³³ extended Yager’s soft likelihood function to combine BPA. The first step of this evidence theory is to obtain BPA. But how to determine BPA is still an open issue and there is no general method.

Recently, interval numbers have become more popular in the field of uncertainty measurement, because only the bounds of the uncertain coefficients are required, not their probability distributions or membership functions.^34,35 The interval method provides a natural way to directly incorporate measurement uncertainty into computation,^36,37 and the combination of the interval method and DS theory has been demonstrated to be convenient and comprehensive.^38,39 Kang et al.⁴⁰ proposed a method to determine BPA based on interval numbers. In Kang et al.’s method, the relationship between the interval number model of the training sample and the test sample is used to obtain the BPA. Their approach has a weakness, however: because Kang et al. used the maximum and minimum values of the sample as the upper and lower bounds of the interval number, the approach cannot handle environmental noise and human disturbances well in real applications. In engineering applications, the data reported by sensors may be imprecise due to noise, that is, the maximum and minimum values may be much larger or much smaller than all other data in the same sample. In that case, the maximum and minimum values are not precise enough to describe the detailed information of the target.

To address this issue, an improved method to determine BPA is proposed in this article and its reasonability is verified on classification problems. The focus is first on the process of constructing the interval number model. A clustering method divides data points into clusters, and each cluster center can be regarded as the mean of the points belonging to that cluster. If the sample is divided into two clusters, the two cluster centers are the mean points of the upper and lower parts of the sample. The two cluster centers can then serve as the upper and lower bounds of the interval number. Based on this idea, the k-means++ method is used to find the cluster centers. After the improved interval number model is constructed, the distance and similarity between the test sample and the model are calculated. Finally, the similarity is normalized and the value of BPA is obtained. Experiments show that the improved interval number model handles noise well and that the proposed method gives more precise results in determining BPA than the related work.

The contributions of this article are as follows:

The proposed method offers a new perspective on the problem of BPA determination in DS evidence theory.

A new interval number model is built based on k-means++, which is effective in determining the BPAs.

The proposed method is data-driven, thus can reduce the subjectivity.

The proposed method can be easily used in many engineering applications to accomplish classification problem in reality.

The article is organized as follows. Section “Preliminaries” introduces the preliminaries of Dempster–Shafer evidence theory, interval numbers, and the k-means++ clustering method. The proposed method to determine BPA is presented in section “The proposed method.” Section “Numerical example” illustrates a numerical example. Experiments and discussion are presented in section “Experiments and analysis.” Conclusions are presented in section “Conclusion.”

Preliminaries

Dempster–Shafer evidence theory

Researchers have put forward many powerful methods to handle uncertainty, including evidence theory,⁴¹ D numbers,^42–44 and so on. Dempster–Shafer evidence theory was proposed by Dempster²⁶ and later developed by Shafer.²⁷ It is a complete theory of dealing with uncertainty. It can handle not only epistemic uncertainty but also aleatoric uncertainty. It has abilities to deal with uncertainty and unknown information and requires fewer conditions than probability theory.⁴⁵ It is hence widely applied in the field of information fusion, such as sensor data fusion,^46–48 decision-making,^49–51 multicriteria decision-making,⁵² risk and reliability analysis,⁵³ target recognition,⁵⁴ failure mode and effects analysis,⁵⁵ online energy management strategy,⁵⁶ and conflict evidence combination.⁵⁷ Formally, DS theory concerns the following preliminary notations.

Frame of discernment

DS evidence theory first supposes the definition of a set of hypotheses $θ$ that represents the complete answer collection a system can give when a certain problem remains to be solved, which is called the frame of discernment. For instance, in a real application of Iris classification problem, the hypotheses stand for three classes of Iris data sets, such as Setosa, Versicolor, and Virginica, and it is defined as follows

θ = \{H_{1}, H_{2}, \dots, H_{N}\}

(1)

where the set $θ$ is composed of N exhaustive and mutually exclusive hypotheses $H_{i} (i = 1, 2, 3, \dots, N)$ and each hypothesis $H_{i}$ represents one possible answer that a system can give. Any subset that contains exactly one of the N hypotheses is called a singleton.

BPA

Denote $P (θ)$ , the power set composed of the $2^{N}$ propositions of $θ$ , by

\begin{matrix} P (θ) = {Ø, {H_{1}}, {H_{2}}, \dots, {H_{N}}, \\ {H_{1} \cup H_{2}}, {H_{1} \cup H_{3}}, \dots, θ} \end{matrix}

(2)

where Ø denotes the empty set and each subset in $P (θ)$ represents a proposition of the problem. When the frame of discernment is determined, the BPA m is defined as a mapping of the power set $P (θ)$ to a number between 0 and 1, that is

m : P (θ) \to [0, 1]

(3)

which satisfies the following conditions

\sum_{G \in P (θ)} m (G) = 1

(4)

m (Ø) = 0

(5)

The BPA m is also called the mass function. $m (G)$ expresses the proportion of all relevant and available evidence that supports the claim that a particular element of $θ$ belongs to the set G but to no particular subset of G. Any element G of $P (θ)$ , such that $m (G) > 0$ , is called a focal element.
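As an illustration of equations (2) to (5), the following minimal Python sketch stores a BPA as a dictionary from subsets of the frame to masses and checks the two defining conditions. The function names `power_set` and `is_valid_bpa`, the tolerance, and the example masses are assumptions made for this sketch, not part of the original method.

```python
from itertools import combinations

def power_set(frame):
    """All subsets of the frame of discernment as frozensets (2**N of them)."""
    items = sorted(frame)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_valid_bpa(m, frame, tol=1e-9):
    """Check equations (3)-(5): masses in [0, 1], m(empty) = 0, total mass 1."""
    subsets = set(power_set(frame))
    if any(g not in subsets for g in m):
        return False                      # a key that is not in P(theta)
    if any(not 0.0 <= v <= 1.0 for v in m.values()):
        return False
    if m.get(frozenset(), 0.0) != 0.0:
        return False
    return abs(sum(m.values()) - 1.0) <= tol

# Hypothetical masses on the Iris frame {Se, Ve, Vi}
frame = {"Se", "Ve", "Vi"}
m = {frozenset({"Se"}): 0.6,
     frozenset({"Ve", "Vi"}): 0.3,
     frozenset(frame): 0.1}

# Focal elements are the subsets with strictly positive mass
focal_elements = [g for g, v in m.items() if v > 0]
print(is_valid_bpa(m, frame), len(focal_elements))
```

Using `frozenset` keys keeps subsets hashable, so set operations such as intersection can be applied directly to the keys.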

Belief and plausibility functions

The belief function Bel is defined as follows

Bel (G) = \sum_{J \subseteq G} m (J)

(6)

The plausibility function Pl is defined as follows

Pl (G) = \sum_{J \cap G \neq Ø} m (J)

(7)

and

Pl (G) = 1 - Bel (\bar{G})

(8)

Pl (Ø) = 0

(9)

The function Bel is the lower limit function of proposition G and the function Pl is the upper limit function of proposition G.
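Equations (6) to (8) can be checked numerically with a short sketch. The BPA below is a hypothetical example on the Iris frame; `bel` and `pl` are illustrative names.

```python
def bel(m, g):
    """Equation (6): total mass of the focal elements contained in g."""
    return sum(v for j, v in m.items() if j <= g)

def pl(m, g):
    """Equation (7): total mass of the focal elements that intersect g."""
    return sum(v for j, v in m.items() if j & g)

# Hypothetical BPA on the frame {Se, Ve, Vi}
frame = frozenset({"Se", "Ve", "Vi"})
m = {frozenset({"Se"}): 0.6,
     frozenset({"Ve", "Vi"}): 0.3,
     frame: 0.1}

g = frozenset({"Ve", "Vi"})
lower, upper = bel(m, g), pl(m, g)      # belief interval [Bel(G), Pl(G)]
# Equation (8): Pl(G) = 1 - Bel(complement of G)
complement_check = 1.0 - bel(m, frame - g)
print(lower, upper, complement_check)
```

For this example the mass on the whole frame counts toward the plausibility of G but not toward its belief, which is why the two bounds differ.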

Dempster’s combination rule

Suppose $m_{1}$ and $m_{2}$ are two BPAs obtained from two different evidence in the frame of discernment $θ$ ; Dempster’s rule of combination, denoted by $m = m_{1} \oplus m_{2}$ , also known as the orthogonal sum, is to combine two BPAs $m_{1}$ and $m_{2}$ to a new BPA

m (G) = \frac{\sum_{J \cap E = G} m_{1} (J) m_{2} (E)}{1 - z}

(10)

and

z = \sum_{J \cap E = Ø} m_{1} (J) m_{2} (E)

(11)

where z represents the conflict between the two BPAs. The larger the value of z, the more the two pieces of evidence conflict.
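A minimal sketch of equations (10) and (11) is given below, assuming BPAs are stored as dictionaries keyed by `frozenset`. The function name `dempster_combine` and the two example BPAs are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two BPAs with Dempster's rule; returns (combined BPA, conflict z)."""
    combined = {}
    conflict = 0.0                           # z in equation (11)
    for (j, vj), (e, ve) in product(m1.items(), m2.items()):
        g = j & e                            # intersection of the two focal elements
        if g:
            combined[g] = combined.get(g, 0.0) + vj * ve
        else:
            conflict += vj * ve              # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Equation (10): renormalize by 1 - z
    return {g: v / (1.0 - conflict) for g, v in combined.items()}, conflict

# Hypothetical BPAs from two information sources on the frame {Se, Ve, Vi}
Se, Ve, Vi = "Se", "Ve", "Vi"
m1 = {frozenset({Se}): 0.6, frozenset({Ve}): 0.1, frozenset({Ve, Vi}): 0.3}
m2 = {frozenset({Se}): 0.7, frozenset({Vi}): 0.2, frozenset({Se, Ve, Vi}): 0.1}

m12, z = dempster_combine(m1, m2)
print(z, m12[frozenset({Se})])
```

Because the rule is associative, more than two BPAs can be combined by applying the function repeatedly, for example with `functools.reduce`.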

Pignistic probability

Let m be a BPA on $θ$ . Its associated pignistic probability function BetP(G) is defined as follows

BetP (G) = \sum_{J \subseteq θ} \frac{| G \cap J |}{| J |} \cdot \frac{m (J)}{1 - m (Ø)}, \forall G \subseteq θ

(12)

where $| G |$ is the cardinality of subset G. BetP(G) is the pignistic probability transform proposed by Smets and Kennes.⁵⁸ The transferable belief model (TBM) is based on the assumption that beliefs exist at two levels: the “credal” level, where beliefs are entertained and combined, and the “pignistic” level, where beliefs are used to make decisions. The function of the pignistic probability transform is to transform a BPA into a probability distribution.
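For singletons, equation (12) amounts to splitting the mass of every focal element evenly among its elements. The sketch below shows this under the same dictionary representation; the name `pignistic` and the example BPA are assumptions of this sketch.

```python
def pignistic(m, frame):
    """Equation (12) on singletons: share each mass equally among its elements."""
    norm = 1.0 - m.get(frozenset(), 0.0)
    betp = {h: 0.0 for h in frame}
    for j, v in m.items():
        for h in j:                           # the empty set contributes nothing
            betp[h] += v / (len(j) * norm)
    return betp

# Hypothetical BPA on the frame {Se, Ve, Vi}
frame = frozenset({"Se", "Ve", "Vi"})
m = {frozenset({"Se"}): 0.6,
     frozenset({"Ve", "Vi"}): 0.3,
     frame: 0.1}

betp = pignistic(m, frame)
decision = max(betp, key=betp.get)            # maximum pignistic probability
print(betp, decision)
```

Taking the hypothesis with the largest value of `betp` corresponds to the maximum pignistic probability criterion used later in the article.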

Interval number

Interval analysis is an approach to putting bounds on rounding errors and measurement errors in mathematical computation and thus developing numerical methods that yield reliable results.⁵⁹ Archimedes was the first to use an interval with lower and upper bounds, $[223 / 71, 22 / 7]$ , to represent the value of $π$ in the third century BC.⁶⁰ Rules for calculating with intervals and other subsets of the real numbers were first published by Young,⁶¹ and modern interval arithmetic was developed by Moore⁶² in 1963. Since then, many scholars have contributed to modern interval arithmetic.^59,63,64 Interval arithmetic is one of the mathematical theories of uncertainty.⁶³ Almost any scientific calculation starts with inaccurate initial data, and the interval method provides a natural way to directly incorporate measurement uncertainty into computation. Interval computation is usually designed for machine implementation. In applications, interval analysis provides rigorous enclosures of solutions to model equations. In this way, what a mathematical model predicts can be determined with certainty, and from that it can be decided whether the model adequately represents reality. Interval analysis is used in many fields, such as parameter estimation problems,⁶⁴ set inversion, and set estimation.⁶⁵

Definition 1

If $\tilde{a} = [a^{-}, a^{+}] = \{a | a^{-} \leq a \leq a^{+}\}, a^{-}, a^{+} \in R$ , $\tilde{a}$ is an interval number. In particular, if $a^{-} = a^{+}$ , $\tilde{a}$ reduces to a real number.

Definition 2

The intersection of two interval numbers $A (a_{1}, a_{2})$ and $B (b_{1}, b_{2})$ is empty if either $a_{2} < b_{1}$ or $b_{2} < a_{1}$ . In this case

A \cap B = Ø

Otherwise, the intersection $A \cap B$ is defined as follows

A \cap B = [max (a_{1}, b_{1}), min (a_{2}, b_{2})]
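Definition 2 translates directly into code. The sketch below represents an interval as a `(lower, upper)` tuple and returns `None` for an empty intersection; both choices, and the name `intersect`, are assumptions of this sketch. The sepal length intervals are those listed in Table 1.

```python
def intersect(a, b):
    """Definition 2: overlap of two closed intervals, or None if it is empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

# Sepal length intervals from Table 1
setosa, versicolor, virginica = (4.90, 5.40), (5.60, 6.35), (6.30, 7.35)

print(intersect(versicolor, virginica))   # overlap of Ve and Vi
print(intersect(setosa, versicolor))      # no overlap
```

Table 2 writes an empty intersection as [0, 0]; returning `None` instead makes it harder to confuse an empty intersection with the degenerate interval at zero.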

Definition 3

Let $A (a_{1}, a_{2})$ and $B (b_{1}, b_{2})$ be two interval numbers and the distance between them is defined as follows⁴⁰

\begin{matrix} D^{2} (A, B) & = \int_{- 1 / 2}^{1 / 2} \int_{- 1 / 2}^{1 / 2} \{[(\frac{a_{1} + a_{2}}{2}) + x (a_{2} - a_{1})] - [(\frac{b_{1} + b_{2}}{2}) + y (b_{2} - b_{1})]\}^{2} dxdy \\ & = [(\frac{a_{1} + a_{2}}{2}) - (\frac{b_{1} + b_{2}}{2})]^{2} + \frac{(a_{2} - a_{1})^{2} + (b_{2} - b_{1})^{2}}{12} \end{matrix}

(13)

Definition 4

Let $A (a_{1}, a_{2})$ and $B (b_{1}, b_{2})$ be two interval numbers. Their similarity $S (A, B)$ is defined as follows⁴⁰

S (A, B) = \frac{1}{1 + α D (A, B)}

(14)

where $α > 0$ is support coefficient and its function is to increase the discreteness of data preventing errors due to accuracy according to Kang et al.⁴⁰ The value of $α$ is chosen as 5 in this article, and the reason is discussed in section “Experiments and analysis.” In this section, A and B are interval numbers.
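Definitions 3 and 4 can be sketched as follows. The distance uses the closed form of the double integral in equation (13), that is, the squared difference of the midpoints plus one twelfth of the sum of the squared widths. The function names are illustrative, and the example compares the sepal length 5.1 of the test sample used later with the intervals of Tables 1 and 2.

```python
import math

def interval_distance(a, b):
    """Definition 3: distance between intervals a = (a1, a2) and b = (b1, b2)."""
    mid_a, mid_b = (a[0] + a[1]) / 2, (b[0] + b[1]) / 2
    wid_a, wid_b = a[1] - a[0], b[1] - b[0]
    return math.sqrt((mid_a - mid_b) ** 2 + (wid_a ** 2 + wid_b ** 2) / 12)

def similarity(a, b, alpha=5.0):
    """Definition 4: similarity with support coefficient alpha (5 in the article)."""
    return 1.0 / (1.0 + alpha * interval_distance(a, b))

sample = (5.1, 5.1)                       # a crisp value as a degenerate interval
s_se = similarity(sample, (4.90, 5.40))   # Setosa, Table 1
s_ve = similarity(sample, (5.60, 6.35))   # Versicolor, Table 1
s_ve_vi = similarity(sample, (6.30, 6.35))  # Ve and Vi intersection, Table 2
print(round(s_se, 4), round(s_ve, 4), round(s_ve_vi, 4))
```

These three values agree with the corresponding entries of Table 3 to four decimal places.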

k-means++ method

k-means clustering is a method of vector quantization that is popular for cluster analysis in data mining. Given an integer k and a set $χ$ of n data points in an I-dimensional space, the k-means method divides the n points into k clusters and chooses k centers C so that the total squared distance between each point and its closest center is minimized,⁶⁶ that is, it aims to minimize the potential function

ϕ = \sum_{x \in χ} min_{c \in C} ‖ x - c ‖^{2}

(15)

where x is a data point and c is a cluster center. However, the k-means method has at least two major theoretic shortcomings. First, it has been shown that the worst case running time of this method is super-polynomial in the input size.⁶⁷ Second, the approximation found can be arbitrarily bad with respect to the objective function compared to the optimal clustering.

To mitigate these issues, the k-means++ method was proposed by Arthur and Vassilvitskii.⁶⁷

The k-means++ method addresses the second shortcoming by specifying a procedure to initialize the cluster centers before the standard k-means iterations. The main idea is that the first cluster center is chosen uniformly at random from the given data points, after which each subsequent cluster center is chosen from the remaining data points with probability $D (x)^{2} / \sum_{x \in χ} D (x)^{2}$ , where $D (x)$ is the shortest distance between a data point x and its closest center already chosen. This seeding yields considerable improvement in the final error of k-means and lowers the computation time. In this section, C is the set of cluster centers.

Algorithm 1. k-means++ method.⁶⁷

Input: k, the number of clusters; $χ = \{x_{1}, x_{2}, \dots, x_{n}\}$ , a set of data points.
Output: $C = \{c_{1}, c_{2}, \dots, c_{k}\}$ , the cluster centers; $L = \{l (x_{i}) | i = 1, 2, \dots, n\}$ , the cluster labels.

$C \leftarrow \emptyset$ ;
Choose one center x uniformly at random from $χ$ ; $C \leftarrow C \cup \{x\}$ ;
while $| C | < k$ do
    Sample $x \in χ$ with probability $\frac{D (x)^{2}}{\sum_{x \in χ} D (x)^{2}}$ ; $C \leftarrow C \cup \{x\}$ ;
end
foreach $x_{i} \in χ$ do
    $l (x_{i}) \leftarrow \arg min_{j} \| x_{i} - c_{j} \|, j \in \{1, \dots, k\}$ ;
end
$iter \leftarrow 0$ ;
repeat
    $changed \leftarrow false$ ;
    foreach $c_{i} \in C$ do
        $UpdateCluster (c_{i})$ ;
    end
    foreach $x_{i} \in χ$ do
        $minDist \leftarrow \arg min_{j} \| x_{i} - c_{j} \|, j \in \{1, \dots, k\}$ ;
        if $minDist \neq l (x_{i})$ then
            $l (x_{i}) \leftarrow minDist$ ; $changed \leftarrow true$ ;
        end
    end
    $iter \leftarrow iter + 1$ ;
until $changed = false$ or $iter \geq MaxIters$
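The k-means++ procedure can be sketched in plain Python for one-dimensional data, which is the case needed when each attribute is clustered separately. The function name `kmeans_pp`, the fixed random seed, and the sample readings are assumptions of this sketch; it also assumes that the data contain at least k distinct values.

```python
import random

def kmeans_pp(points, k, max_iters=100, seed=0):
    """k-means++ seeding followed by standard k-means iterations (1-D data)."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]            # first center: uniform at random
    while len(centers) < k:
        # D(x)^2: squared distance from x to its closest chosen center
        d2 = [min((x - c) ** 2 for c in centers) for x in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    labels = None
    for _ in range(max_iters):
        # Assignment step: index of the closest center for every point
        new_labels = [min(range(k), key=lambda j: abs(x - centers[j]))
                      for x in points]
        if new_labels == labels:              # no label changed: converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members
        for j in range(k):
            members = [x for x, l in zip(points, labels) if l == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers, labels

# Hypothetical sepal length readings for one class
readings = [4.8, 4.9, 5.0, 5.0, 5.4, 5.5, 5.3]
centers, labels = kmeans_pp(readings, k=2)
lower, upper = sorted(centers)
print(lower, upper)
```

With k = 2 the two sorted centers can be read directly as the lower and upper bounds of an interval number, which is how the clustering is used in the proposed method.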

The proposed method

In this section, a new method for determining BPA is presented. In a multi-sensor system, the specific value of a target property is hard to obtain. Measurement error, errors introduced in data processing, and lack of information often cause objective error. A feature is therefore better represented by an interval range, that is, an interval number, than by a specific value. Using interval numbers to handle the uncertainty can reduce both subjective and objective error, and the result accords with practical needs. In applications, however, two problems arise. First, the procedure of constructing the interval number model may add or lose target information compared with the original data. Second, in engineering applications, environmental noise and human disturbances often lead to conflict among the reports of multiple sensors. Data reported by multiple sensors may therefore be imprecise, that is, the maximum value may be much larger than all other data and the minimum much smaller. A model that uses the maximum and minimum values as the upper and lower bounds of the interval number is thus not precise enough. Based on this analysis, an improved method that constructs the interval number model using the k-means++ clustering method is proposed. The data of each attribute reported by the sensors are divided into two clusters according to their values, and each cluster center is found with the k-means++ method. The two cluster centers are taken as the upper and lower bounds of the interval number. In this way, a new and more accurate model is constructed.

A flow chart of the proposed method is illustrated in Figure 1 and details are shown as follows. Suppose that there are n classes ${c_{1}, c_{2}, \dots, c_{n}}$ in the frame of discernment. Each class $c_{i}$ has k attributes $c_{i 1}, c_{i 2}, \dots, c_{ik}$ . If the test sample is $ξ$ , it also has k attributes $ξ_{1}, ξ_{2}, \dots, ξ_{k}$ . m instances for each class $c_{i}$ are randomly chosen and a model is built

T^{i} = (t_{1}^{i}, t_{2}^{i}, \dots, t_{m}^{i})

where $T^{i}$ is a $k \times m$ matrix and $t_{m}^{i}$ is a column vector which denotes the mth instance of class $c_{i}$ . The jth row of $T^{i}$ contains the observed values of the jth attribute of class $c_{i}$ . Using a different number of training instances per class does not influence the result, provided that the number of instances is sufficient to construct the model. For simplicity and without loss of generality, m instances are used for each class.

Step 1. The multi-attribute data set is divided into two parts: the training set and the test set. The training set is used to build interval number model and the test set is used to evaluate the performance of proposed method.

Step 2. Construct improved interval number model.

For each row of $T^{i}$ , sort the values in ascending order and divide them into two parts, each of which is regarded as a cluster. The k-means++ method finds the center of each of the two clusters, and the two centers give an improved interval number, where

A_{ij} = [a_{ij}, b_{ij}]

$A_{ij}$ denotes the interval number model of the jth attribute of class $c_{i}$ . Then, the improved interval number model for all attributes of class $c_{i}$ can be obtained, where

M_{i} = (A_{i 1}, A_{i 2}, \dots, A_{ik})

Furthermore, an $n \times k$ matrix $M = (M_{1}, M_{2}, \dots, M_{n})'$ can represent the improved interval number models of all classes, where each row of M corresponds to one class and each column of M contains the interval numbers of the same attribute for the different classes.

Step 3. Select a test instance and construct interval number.

Select a test instance $ξ$ . Each attribute value $ξ_{j}$ of the instance is a number and can be regarded as a degenerate interval number $[ξ_{j}, ξ_{j}]$ , whose lower and upper bounds are equal.

Step 4. Measure the distance between test sample and constructed model.

The distance between two interval numbers can be calculated according to equation (13)

\begin{array}{l} D (i, j) = D (ξ_{j}, A_{i j}) \\ = \sqrt{[(\frac{ξ_{j} + ξ_{j}}{2}) - (\frac{a_{i j} + b_{i j}}{2})]^{2} + \frac{(ξ_{j} - ξ_{j})^{2} + (b_{i j} - a_{i j})^{2}}{12}} \end{array}

(16)

Step 5. Measure the similarity between test sample and constructed model.

The similarity can be obtained according to equation (14)

S (i, j) = \frac{1}{1 + α D (i, j)}

(17)

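Step 6. Normalize the similarity and generate BPA. For each attribute j, the similarities are normalized over all hypotheses i, that is, the single classes and their non-empty intersections

BPA (i, j) = \frac{S (i, j)}{\sum_{i} S (i, j)}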

(18)

Step 7. So far, for a test sample $ξ$ , k BPAs have been determined, one for each attribute. Combine the k BPAs using Dempster’s combination rule, equations (10) and (11), to get the final BPA of sample $ξ$ .
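Step 2 can be sketched as follows. For one-dimensional data with two clusters, the sketch replaces k-means++ by an exhaustive search over all split points of the sorted values, which returns the two-cluster partition with the smallest within-cluster squared error; this substitution, the function names, and the toy training data are assumptions made to keep the example deterministic and self-contained.

```python
def two_cluster_interval(values):
    """Interval bounds as the two cluster means of the best 1-D two-way split."""
    xs = sorted(values)
    best = None
    for cut in range(1, len(xs)):
        low, high = xs[:cut], xs[cut:]
        m_low, m_high = sum(low) / len(low), sum(high) / len(high)
        sse = (sum((x - m_low) ** 2 for x in low)
               + sum((x - m_high) ** 2 for x in high))
        if best is None or sse < best[0]:
            best = (sse, m_low, m_high)
    return best[1], best[2]

def build_model(training):
    """training maps a class name to its instances (lists of k attribute values).

    Returns, for every class, one interval per attribute (a row of M).
    """
    return {cls: [two_cluster_interval(col) for col in zip(*rows)]
            for cls, rows in training.items()}

# Hypothetical training data: two classes, two attributes, four instances each
training = {
    "A": [[1.0, 10.0], [1.2, 10.4], [2.0, 11.0], [2.2, 11.2]],
    "B": [[5.0, 3.0], [5.2, 3.1], [6.0, 3.9], [6.2, 4.0]],
}
model = build_model(training)
print(model["A"])
```

Each entry of `model` corresponds to one row of the matrix M, and `zip(*rows)` plays the role of reading the rows of the matrix $T^{i}$ attribute by attribute.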

Figure 1.

The flow chart of the new method.

Numerical example

Step 1. For each class of the Iris data set, 40 instances are randomly selected as the training set, and the remaining 10 instances serve as the test set.

Step 2. Construct the improved interval number model. For instance, for the attribute sepal length in the class Setosa, the k-means++ method is first used to find the two cluster centers, which are 4.90 and 5.40, respectively, as shown in Figure 2. A similar procedure is used for the other two classes and the result is shown in Figure 3. In order to reflect the relationship between the attribute intervals of different classes, the intersections between them should be taken into account. The intersection of the intervals of single-class propositions represents the corresponding multi-class proposition. (Based on the evidence theory, if the frame of discernment is {Setosa, Versicolor, Virginica}, its subset propositions are {Setosa}, {Versicolor}, {Virginica}, {Setosa, Versicolor}, {Versicolor, Virginica}, {Setosa, Virginica}, and {Setosa, Versicolor, Virginica}.) Their relationships are shown in Figure 4. Tables 1 and 2 show the numerical results.

Step 3. Select a test sample instance (5.1, 3.5, 1.4, 0.2) which belongs to species Setosa. For the attribute sepal length = 5.1 cm, calculate it as an interval number [5.1, 5.1] (Figure 5).

Step 4. Calculate the similarity between the test sample and the constructed model. The distance between the two interval numbers is first calculated according to equation (13). Then, the support coefficient $α$ is set to 5 and the similarity is calculated according to equation (14). The results are shown in Table 3; the similarity between the test sample and class Setosa is much larger than the others, which indicates that the test sample belongs to class Setosa with high probability.

Step 5. Normalize the similarity and generate BPA (Table 4).

Step 6. Improved interval number model for the four attributes of the three species is constructed using the proposed method as shown in Table 5.

Step 7. In this step, BPAs for each attribute of each test sample are obtained and combined using Dempster’s combination rule to get the final BPA. For an arbitrary instance, its result is shown in Table 6. Obviously, the final BPA of hypothesis {Se} is the biggest and the result is almost equal to 1.

Step 8. The final BPA can be transformed to pignistic probability and it can be obtained as follows

BetP (\{Se\}) = 0.9881, \quad BetP (\{Ve\}) = 0.0072, \quad BetP (\{Vi\}) = 0.0047

Figure 2.

The process of finding cluster center using k-means++.

Figure 3.

Interval numbers of sepal length of each species.

Figure 4.

Intersection of Versicolor and Virginica.

Figure 5.

Relationship between the selected test sample and the interval number model.

Table 1.

Interval numbers of sepal length (SL) of each species.

Hypothesis	Attribute (SL)
Setosa (Se)	[4.90, 5.40]
Versicolor (Ve)	[5.60, 6.35]
Virginica (Vi)	[6.30, 7.35]

Table 2.

Interval numbers of the intersection.

Hypothesis	Attribute (SL)
Se, Ve	[0, 0]
Se, Vi	[0, 0]
Ve, Vi	[6.30, 6.35]
Se, Ve, Vi	[0, 0]

SL: sepal length.

Table 3.

Similarity between sample’s sepal length and interval number model.

Hypothesis	Similarity
Se	0.5670
Ve	0.1816
Vi	0.1039
Se, Ve	0
Se, Vi	0
Ve, Vi	0.1403
Se, Ve, Vi	0
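The Se entry of Table 3 can be checked by hand from the sepal length 5.1 of the test sample and the Setosa interval [4.90, 5.40] of Table 1, with $α = 5$. The intermediate values below are rounded to four decimal places.

```latex
D = \sqrt{\left(5.1 - \frac{4.90 + 5.40}{2}\right)^{2} + \frac{(5.40 - 4.90)^{2}}{12}}
  = \sqrt{0.0025 + 0.0208} \approx 0.1528

S = \frac{1}{1 + 5 \times 0.1528} \approx 0.5670
```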

Table 4.

BPA between sample’s sepal length and interval number model.

Hypothesis	BPA
Se	0.5711
Ve	0.1829
Vi	0.1047
Se, Ve	0
Se, Vi	0
Ve, Vi	0.1413
Se, Ve, Vi	0

BPA: basic probability assignment.
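The step from Table 3 to Table 4 is a normalization of the similarities of one attribute over all hypotheses. A minimal sketch, with the illustrative name `similarities_to_bpa` and the similarities copied from Table 3, is given below.

```python
def similarities_to_bpa(sims):
    """Normalize the similarities of one attribute so that the masses sum to 1."""
    total = sum(sims.values())
    # Hypotheses with zero similarity are not focal elements and are dropped
    return {h: s / total for h, s in sims.items() if s > 0}

# Similarities for sepal length, taken from Table 3
sims = {
    frozenset({"Se"}): 0.5670,
    frozenset({"Ve"}): 0.1816,
    frozenset({"Vi"}): 0.1039,
    frozenset({"Se", "Ve"}): 0.0,
    frozenset({"Se", "Vi"}): 0.0,
    frozenset({"Ve", "Vi"}): 0.1403,
    frozenset({"Se", "Ve", "Vi"}): 0.0,
}
bpa = similarities_to_bpa(sims)
print({tuple(sorted(h)): round(v, 4) for h, v in bpa.items()})
```

The resulting masses agree with Table 4 up to rounding in the last digit.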

Table 5.

Interval number model for each attribute of the three species.

Species	SL	SW	PL	PW
Se	[4.90, 5.40]	[3.10, 3.50]	[1.40, 1.50]	[0.20, 0.40]
Ve	[5.60, 6.35]	[2.40, 2.95]	[3.90, 4.60]	[1.20, 1.50]
Vi	[6.30, 7.35]	[2.80, 3.10]	[5.00, 5.80]	[1.80, 2.20]

Table 6.

BPAs for each attribute and final combined BPA.

Item	SL	SW	PL	PW	Combined BPA
{Se}	0.5711	0.3102	0.8650	0.7408	0.9881
{Ve}	0.1829	0.1285	0.0782	0.1843	0.0072
{Vi}	0.1047	0.1767	0.0568	0.1110	0.0047
{Se, Ve}	0	0	0	0	0
{Se, Vi}	0	0.2228	0	0	0
{Ve, Vi}	0.1414	0.1618	0	0	0
{Se, Ve, Vi}	0	0	0	0	0

BPA: basic probability assignment.

The maximum pignistic probability is taken as the decision-making criterion. As the result shows, the test sample is assigned to Setosa, which is consistent with its actual class, and the pignistic probability of the true class is almost equal to 1, which illustrates the effectiveness of the proposed method.

Experiments and analysis

Data set

Data set employed in this article is from UCI repository of machine learning databases (https://archive.ics.uci.edu/ml/index.php). This database contains many kinds of multi-attribute data and the Iris data set is one of the classic data sets used in the classification problem.⁶⁸ Iris data set contains the three classes (Setosa, Versicolor, and Virginica) and there are 50 instances for each of the three classes. Each type of iris plant contains the four attributes, namely, sepal length, sepal width, petal length, and petal width.

Experiments

Several experiments are conducted in this section to demonstrate the proposed approach. Multi-attribute data are common in many application systems and each attribute can be considered as an information source. Iris data set is a typical multi-attribute data set so it is suitable to do classification recognition. The whole data set is divided into training set and test sample as mentioned above. First, training set is used to build interval number model. The proposed method can be used to obtain BPAs for different attributes of a test sample. Then, these BPAs can be combined to get a final BPA for the test samples. The maximum pignistic probability can be regarded as the decision-making criterion. Therefore, when the final BPA is obtained, it can be transformed to pignistic probability by equation (12) and its classification can be determined according to its belief.

Experiment without noise

In order to demonstrate that the proposed method has a better recognition rate than Kang et al.’s method, 50 instances of each class are tested to obtain their belief, and the results are shown in Figure 6(a)–(c). In addition, the training percentage is varied from 50% to 100% to test the classification accuracy of the two methods. Figure 6(a) shows the probability result of class Setosa, from which it is clear that the proposed method identifies the correct category with a much higher probability. Figure 6(b) shows the probability result of class Versicolor; the proposed method also performs better than Kang et al.’s method in most cases. Figure 6(c) shows the probability result of class Virginica; for some instances, Kang et al.’s method fails to recognize the correct class as its results fall below 0.5, whereas the proposed method can still identify the correct class. Figure 6(d) shows the classification accuracy averaged over the three classes: our result is 95.33%, whereas Kang et al.’s result is 92.80%. Overall, the proposed method performs better and more stably.

Figure 6.

(a–c) Probability results of three classes and (d) classification accuracy without noise.

Experiment with noise

As mentioned above, environmental noise and human disturbances often lead to conflict among the reports of multiple sensors in engineering applications. In this experiment, these circumstances are simulated by adding Gaussian noise to the Iris data set to test the ability of the proposed method in application. First, Gaussian noise is randomly added to the training set and the interval number model is constructed. Then, the same steps are conducted and the results are shown in Figure 7. Generally speaking, the proposed method performs better than Kang et al.’s method even in the presence of noise, as it is more effective and stable in determining BPA and in classification. Figure 7(a) shows that the probability result of class Setosa by Kang et al.’s method drops by 0.5 on average, whereas the proposed method still performs as well as it does without noise. The probability result of class Versicolor in Figure 7(b) shows that the proposed method is still better, but the advantage is not as clear as without noise. The probability result of class Virginica in Figure 7(c) is similar to the case without noise: for some instances the probability results of Kang et al.’s method fall below 0.5, and the number of such instances increases. In addition, the results of some instances are less than 0.4, which loses much accuracy, whereas the proposed method remains convincing. Similarly, Kang et al.’s method does not perform well in classification in the noisy environment, as its average accuracy decreases from 92.80% to 91%, whereas the accuracy of the proposed method increases from 95.33% to 96% (Figure 7(d)). In this experiment, the accuracy of the proposed method is not degraded by noise and even improves slightly.
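The effect of noise on the two ways of building interval bounds can be illustrated with a small simulation. The readings, the noise level `sigma`, the random seed, and the function names are all assumptions of this sketch, since the noise parameters of the experiment are not stated; the two-cluster bounds are computed with an exhaustive one-dimensional split search as a stand-in for k-means++.

```python
import random

def minmax_interval(values):
    """Interval bounds as the sample minimum and maximum (Kang et al.'s model)."""
    return min(values), max(values)

def cluster_interval(values):
    """Interval bounds as the two cluster means of the best 1-D two-way split."""
    xs = sorted(values)
    best = None
    for cut in range(1, len(xs)):
        low, high = xs[:cut], xs[cut:]
        m_low, m_high = sum(low) / len(low), sum(high) / len(high)
        sse = (sum((x - m_low) ** 2 for x in low)
               + sum((x - m_high) ** 2 for x in high))
        if best is None or sse < best[0]:
            best = (sse, m_low, m_high)
    return best[1], best[2]

rng = random.Random(42)
clean = [4.6, 4.8, 4.9, 5.0, 5.0, 5.1, 5.2, 5.4, 5.5, 5.7]  # hypothetical readings
sigma = 0.3                                                 # assumed noise level
noisy = [x + rng.gauss(0.0, sigma) for x in clean]

for name, data in (("clean", clean), ("noisy", noisy)):
    print(name, minmax_interval(data), cluster_interval(data))
```

Because the cluster means are averages, they always lie inside the minimum and maximum and react less to a single noisy reading. A very extreme outlier can still end up alone in its own cluster, so this sketch does not show that the bounds are immune to noise.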

Figure 7.

Probability results of three classes (a–c) and (d) classification accuracy with noise.

Classification test on different data sets

There are other well-known classifier algorithms in the Waikato Environment for Knowledge Analysis (WEKA),⁶⁹ including naive Bayes (NB), support vector machine (SVM), SVM with radial basis function (SVM-RBF), decision tree learner (REPTree), 1 nearest neighbor (IB1), multilayer perceptron (MP), RBF network (RBFN), and so on. These are popular machine learning and data mining algorithms which perform well in classification problems. The comparison of classification accuracy between the proposed method and these classifiers is presented in Table 7.

Table 7.

Classification accuracy (%) of different classifiers.

Data	NB	SVM	SVM-RBF	REPTree	IB1	MP	RBFN	Proposed method
Iris	96	96.7	92.7	94.7	94	96	96.7	96.7

About the support coefficient

In the previous section, a support coefficient $α$ is used to determine the similarity of two interval numbers. This part examines the influence of the support coefficient. Its function is to increase the discreteness of the data, preventing errors due to accuracy. For all test samples, the assignment to the singleton of the correct class is recorded, and the average assignment and the classification accuracy are used for judgment. The experimental results for different values of $α$ are shown in Figure 8. When $α$ is set to 1, the scaled distances between interval numbers are not well spread and their values are relatively concentrated. Therefore, some errors are caused in the calculation of similarity due to the influence of data accuracy, and the results in Figure 8(a) and (b) are consistent with this. As $α$ increases, the scaled distances become more spread out and the judgment performs better. Through statistical analysis, the conclusions are summarized as follows: when $α$ varies from 1 to 7, increasing it may improve the accuracy of detecting the class that a test sample belongs to. If the support coefficient increases further, the membership degree has a stable performance but the classification of the Iris data becomes much fuzzier, which means that the evidence after combination will not give a correct judgment. Thus, the value of the support coefficient $α$ is set to 5 in the experiments.
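How $α$ changes the BPA of a single attribute can be explored with a few lines of code. The distances below are rounded values obtained from equation (16) for the sepal length 5.1 and the intervals of Tables 1 and 2; the function name and the chosen values of $α$ are assumptions of this sketch. It only looks at one attribute of one sample and is not meant to reproduce Figure 8.

```python
def bpa_for_alpha(distances, alpha):
    """Similarity 1 / (1 + alpha * D) per hypothesis, normalized to a BPA."""
    sims = {h: 1.0 / (1.0 + alpha * d) for h, d in distances.items()}
    total = sum(sims.values())
    return {h: s / total for h, s in sims.items()}

# Distances from the sepal length 5.1 to each interval (rounded)
distances = {"Se": 0.1528, "Ve": 0.9014, "Vi": 1.7514, "Ve,Vi": 1.2251}

mass_on_se = {alpha: bpa_for_alpha(distances, alpha)["Se"]
              for alpha in (1, 5, 20)}
for alpha, mass in mass_on_se.items():
    print(alpha, round(mass, 4))
```

For this attribute, a larger $α$ moves more mass to the closest hypothesis, so small values of $α$ give flatter, less discriminative BPAs. The effect on the combined result over all attributes and samples depends on the data and has to be evaluated experimentally, as in Figure 8.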

Figure 8.

Judgment performance of support coefficient (a) without noise and (b) with noise.
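The dispersing effect of the support coefficient described above can be illustrated with a minimal sketch. It assumes a similarity of the form S = 1/(1 + αD), where D is the distance between two interval numbers; this form, the function names, and the distance values are illustrative assumptions, not a restatement of the definitions given in the previous section.

```python
def similarity(distance, alpha):
    # Assumed similarity form: S = 1 / (1 + alpha * D).
    return 1.0 / (1.0 + alpha * distance)


def normalized_similarities(distances, alpha):
    # Convert the distances to similarities and normalize them to sum to 1.
    sims = [similarity(d, alpha) for d in distances]
    total = sum(sims)
    return [s / total for s in sims]


def spread(values):
    # Gap between the largest and the smallest normalized similarity.
    return max(values) - min(values)


# Hypothetical distances from one test sample to three class models.
distances = [0.10, 0.20, 0.30]

for alpha in (1, 5):
    masses = normalized_similarities(distances, alpha)
    print(alpha, [round(m, 3) for m in masses], round(spread(masses), 3))
```

Under these assumptions, the normalized similarities for the same set of distances are further apart for α = 5 than for α = 1, so the closest class model stands out more clearly.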

Discussion

The first two experiments indicate that the proposed method performs better both in generating BPA and in classification, whether or not there is noise in the data. For class Setosa, the belief of each instance is larger than 0.9, and some are almost equal to 1, both with and without noise. For classes Versicolor and Virginica, the results are better than those of Kang et al.’s method, although the improvement is not as pronounced as for class Setosa. Kang et al.’s method loses considerable accuracy in the noisy environment: most beliefs for class Virginica fall below 0.5, and its classification rate decreases by 1.8%. The reason is that their interval number model considers only the maximum and minimum values; the proposed method addresses this shortcoming by using the k-means++ method. The proposed method can handle imprecise data and uncertain information because it constructs the model from sensor data. In the third experiment, the classification accuracy of our method is among the highest, which shows that it performs as well as other well-known classifiers. Based on these results, the proposed method has a significant capacity to determine BPA compared with Kang et al.’s method.

Conclusion

In this article, a method based on interval numbers and the k-means++ method was proposed to obtain BPA. The proposed method effectively avoids the high conflict in sensor data caused by environmental noise and human disturbance, which helps to build a reasonable model. Since interval numbers can be obtained from relatively few data from the information source, the method is easy to apply in many engineering applications to accomplish multi-source data fusion and classification. Moreover, it is data-driven and can reduce the uncertainty introduced by subjectivity. Finally, the experimental results support that the proposed method is superior in determining BPA and is simple and practical in actual engineering applications. However, an object lying in the overlapping borders of several classes can be hard to classify, and its classification result is considered to have low reliability. This issue will be addressed in further research. Some researchers compared different classification algorithms and found that DS theory yields a marginal improvement in classification performance in fusion algorithms.⁷⁰ In future work, the new BPA determination method will be applied in fusion algorithms to improve classification performance.

Footnotes

Acknowledgements

The authors greatly appreciate the reviewers’ suggestions and the editor’s encouragement.

Handling Editor: Daming Zhou

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the Chongqing Overseas Scholars Innovation Program (no. cx2018077), the National Natural Science Foundation of China (grant numbers 61672435, 61702427, and 61702426), and the 1000-Plan of Chongqing by Southwest University (grant number SWU116007).

References

1. Guan, Bell. The combination of multiple classifiers using an evidential reasoning approach. Artif Intell 2008; 172: 1731–1751.

2. Denœux. A k-nearest neighbor classification rule based on Dempster-Shafer theory. In: Yager, Liu (eds) Classic works of the Dempster-Shafer theory of belief functions. Berlin: Springer, 2008, pp.737–760.

3. Liu Z-G, Pan, Dezert. A new belief-based k-nearest neighbor classification method. Pattern Recog 2013; 46: 834–844.

4. Tabassian, Ghaderi, Ebrahimpour. Combining complementary information sources in the Dempster–Shafer framework for solving classification problems with imperfect labels. Knowl-Based Syst 2012; 27: 92–102.

5. Denoeux. A k-nearest neighbor classification rule based on Dempster-Shafer theory. IEEE T Syst Man Cy 1995; 25: 804–813.

6. Denoeux, Smets. Classification using belief functions: relationship between case-based and model-based approaches. IEEE T Syst Man Cy B 2006; 36: 1395–1406.

7. Jiang, Xie, Zhuang, et al. Failure mode and effects analysis based on a novel fuzzy evidential method. Appl Soft Comput 2017; 57: 672–683.

8. Deng. Generalized ordered propositions fusion based on belief entropy. Int J Comput Commun Contr 2018; 13: 792–807.

9. Xiao. A novel evidence theory and fuzzy preference approach-based multi-sensor data fusion technique for fault diagnosis. Sensors 2017; 17: E2504.

10. Xiao. A hybrid fuzzy soft sets decision making method in medical diagnosis. IEEE Access 2018; 6: 25300–25312.

11. Han, Deng. An enhanced fuzzy evidential DEMATEL method with its application to identify critical success factors. Soft Comput 2018; 22: 5073–5090.

12. Sohn, Jeon, Han EJ. A new cost of ownership model for the acquisition of technology complying with environmental regulations. J Clean Prod 2015; 100: 269–277.

13. Moon, Sohn. Case-based reasoning for predicting multiperiod financial performances of technology-based SMEs. Appl Artif Intell 2008; 22: 602–615.

14. Xue-bo, You-xian. Optimal fusion estimation covariance of multisensor data fusion on tracking problem. In: Proceedings of the international conference on control applications, Glasgow, 18–20 September 2002, vol. 2, pp.1288–1289. New York: IEEE.

15. Jin X-B, Dou T-L, et al. Parallel irregular fusion estimation based on nonlinear filter for indoor RFID tracking system. Int J Distrib Sens N. Epub ahead of print 23 May 2016. DOI: 10.1155/2016/1472930.

16. Zhou, Zhang, Ravey, et al. On-line estimation of lithium polymer batteries state-of-charge using particle filter based data fusion with multi-models approach. IEEE Trans Indus Appl 2016; 52: 2582–2595.

17. Vitola, Pozo, Tibaduiza, et al. A sensor data fusion system based on k-nearest neighbor pattern classification for structural health monitoring applications. Sensors 2017; 17: E417.

18. Arredondo MAT, Sierra-Pérez, Zenuni, et al. A pattern recognition approach for damage detection and temperature compensation in acousto-ultrasonics. In: EWSHM-7th European workshop on structural health monitoring, Nantes, 8–11 July 2014. Nantes, France: IFFSTTAR, Inria, Université de Nantes.

19. Tibaduiza, Mujica, Anaya, et al. Independent component analysis for detecting damages on aircraft wing skeleton. In: Proceedings of the 5th European conference on structural control (EACS 2012), Genoa, Italy, 18–20 June 2012, pp.18–20.

20. Anaya Vejar, Ceron, Vitola Oyaga, et al. Damage classification based on machine learning applications for an unmanned aerial vehicle. In: Proceedings of the 11th international workshop on structural health monitoring (IWSHM), Stanford, CA, 12–14 September 2017, pp.2042–2049.

21. Kang, Chhipi-Shrestha, Deng, et al. Stable strategies analysis based on the utility of Z-number in the evolutionary games. Appl Math Comput 2018; 324: 202–217.

22. Kang, Deng, Hewage, et al. Generating Z-number based on OWA weights using maximum entropy. Int J Intell Syst 2018; 33: 1745–1755.

23. Kang, Deng, Hewage, et al. A method of measuring uncertainty for Z-number. IEEE Trans Fuzzy Syst. Epub ahead of print 3 September 2018. DOI: 10.1109/TFUZZ.2018.2868496.

24. Frikha, Moalla. Analytic hierarchy process for multi-sensor data fusion based on belief function theory. Euro J Operat Res 2015; 241: 133–147.

25. Beynon, Curry, Morgan. The Dempster–Shafer theory of evidence: an alternative approach to multicriteria decision modelling. Omega 2000; 28: 37–50.

26. Dempster. Upper and lower probabilities induced by a multivalued mapping. Ann Math Stat 1967: 325–339.

27. Shafer. A mathematical theory of evidence, vol. 42. Princeton, NJ: Princeton University Press, 1976.

28. Jones, Lowe, Harrison MJ. A framework for intelligent medical diagnosis using the theory of evidence. Knowl-Based Syst 2002; 15: 77–84.

29. Yin, Deng. The negation of a basic probability assignment. IEEE T Fuzzy Syst. Epub ahead of print 24 September 2018. DOI: 10.1109/TFUZZ.2018.2871756.

30. Deng. Dependent evidence combination based on Shearman coefficient and Pearson coefficient. IEEE Access 2018; 6: 11634–11640.

31. Fei, Deng. A new divergence measure for basic probability assignment and its applications in extremely uncertain environments. Int J Intell Syst. Epub ahead of print 26 October 2018. DOI: 10.1002/int.22066.

32. Gong, Qian, et al. Research on fault diagnosis methods for the reactor coolant system of nuclear power plant based on D-S evidence theory. Ann Nucl Energy 2018; 112: 395–399.

33. Jiang. An improved soft likelihood function for Dempster–Shafer belief structures. Int J Intell Syst 2018; 33: 1264–1282.

34. Jiang, Han, Liu, et al. A nonlinear interval number programming method for uncertain optimization problems. Euro J Operat Res 2008; 188: 1–13.

35. Zhang, Olson DL. The method of grey related analysis to multiple attribute decision making problems with interval numbers. Math Comput Model 2005; 42: 991–998.

36. Ferson, Kreinovich, Hajagos, et al. Experimental uncertainty estimation and statistics for data having interval uncertainty. Report SAND2007-0939, Sandia National Laboratories, Albuquerque, NM, 2007.

37. Kreinovich, Lakeyev, Rohn, et al. Computational complexity and feasibility of data processing and interval computations, vol. 10. Berlin: Springer Science+Business Media, 2013.

38. Ferson, Kreinovich, Grinzburg, et al. Constructing probability boxes and Dempster-Shafer structures. Technical report, Sandia National Lab, Albuquerque, NM, 2015.

39. Nassreddine, Abdallah, Denoux. State estimation using interval analysis and belief-function theory: application to dynamic vehicle localization. IEEE T Syst Man Cy B 2010; 40: 1205–1218.

40. Kang B-Y, Deng, et al. Determination of basic probability assignment based on interval numbers and its application. Dianzi Xuebao (Acta Electronica Sinica) 2012; 40: 1092–1096.

41. Han, Deng. A novel matrix game with payoffs of Maxitive Belief Structure. Int J Intell Syst. Epub ahead of print 29 November 2018. DOI: 10.1002/int.22072.

42. Deng. D-AHP method with different credibility of information. Soft Comput. Epub ahead of print 29 December 2017. DOI: 10.1007/s00500-017-2993-9.

43. Xiao. A novel multi-criteria decision making method for assessing health-care waste treatment technologies based on D numbers. Eng Appl Artif Intell 2018; 71: 216–225.

44. Deng. A new MADA methodology based on D numbers. Int J Fuzzy Syst 2018; 20: 2458–2469.

45. Chen, Deng. A modified method for evaluating sustainable transport solutions based on AHP and Dempster–Shafer evidence theory. Appl Sci 2018; 8: 563.

46. Xiao, Bowen. A weighted combination method for conflicting evidence in multi-sensor data fusion. Sensors 2018; 18: E1487.

47. Xiao. Multi-sensor data fusion based on the belief divergence measure of evidences and the belief entropy. Inform Fusion 2019; 46: 23–32.

48. Jiang, Wei, Qin, et al. Sensor data fusion based on a new conflict measure. Math Probl Eng 2016; 2016: 5769061.

49. Rikhtegar, Mansouri, Ahadi Oroumieh, et al. Environmental impact assessment based on group decision-making methods in mining projects. Econ Res-Ekon Istraž 2014; 27: 378–392.

50. Jiang, Wei, Liu, et al. Intuitionistic fuzzy power aggregation operator based on entropy and its application in decision making. Int J Intell Syst 2018; 33: 49–67.

51. Jiang. An evidential dynamical model to predict the interference effect of categorization on decision making. Knowl-Based Syst 2018; 150: 139–149.

52. Jiang, Wei. Intuitionistic fuzzy evidential power aggregation operator and its application in multiple criteria decision-making. Int J Syst Sci 2018; 49: 582–594.

53. Deng, Jiang. Dependence assessment in human reliability analysis using an evidential network approach extended by belief rules and uncertainty measures. Ann Nucl Energy 2018; 117: 183–193.

54. Han, Deng. An evidential fractal analytic hierarchy process target recognition method. Defence Sci J 2018; 68(4): 367–373.

55. Chen, Deng. A new failure mode and effects analysis model using Dempster–Shafer evidence theory and grey relational projection method. Eng Appl Artif Intell 2018; 76: 13–20.

56. Zhou, Al-Durra, Gao, et al. Online energy management strategy of fuel cell hybrid electric vehicles based on data fusion approach. J Power Sources 2017; 366: 278–291.

57. Zhang, Deng. Combining conflicting evidence using the DEMATEL method. Soft Comput. Epub ahead of print 14 August 2018. DOI: 10.1007/s00500-018-3455-8.

58. Smets, Kennes. The transferable belief model. Artif Intell 1994; 66: 191–234.

59. Moore, Lodwick. Interval analysis and fuzzy set theory. Fuzzy Set Syst 2003; 135: 5–9.

60. Hansen, Walster. Global optimization using interval analysis: revised and expanded, vol. 264. Boca Raton, FL: CRC Press, 2003.

61. Young RC. The algebra of many-valued quantities. Math Ann 1931; 104: 260–290.

62. Moore RE. Interval arithmetic and automatic error analysis in digital computing. PhD Thesis, Stanford University, Stanford, CA, 1963.

63. Alefeld, Mayer. Interval analysis: theory and applications. J Comput Appl Math 2000; 121: 421–464.

64. Dreyer. Interval analysis of analog circuits with component tolerances. Aachen: Shaker, 2005.

65. Jaulin, Kieffer, Didrit, et al. Applied interval analysis: with examples in parameter and state estimation, robust control and robotics, vol. 1. Berlin: Springer Science+Business Media, 2001.

66. Hartigan, Wong MA. Algorithm AS 136: a k-means clustering algorithm. J R Stat Soc C Appl 1979; 28: 100–108.

67. Arthur, Vassilvitskii. k-means++: the advantages of careful seeding. In: Proceedings of the eighteenth annual ACM-SIAM symposium on discrete algorithms, New Orleans, LA, 7–9 January 2007, pp.1027–1035. Philadelphia, PA: Society for Industrial and Applied Mathematics.

68. Xia, Feng, Liu, et al. An evidential reliability indicator-based fusion rule for Dempster–Shafer theory and its applications in classification. IEEE Access 2018; 6: 24912–24924.

69. Hall, Frank, Holmes, et al. The WEKA data mining software: an update. ACM SIGKDD Explorat Newslett 2009; 11: 10–18.

70. Sohn, Lee SH. Data fusion, ensemble and clustering to improve the classification accuracy for the severity of road traffic accidents in Korea. Safety Sci 2003; 41: 1–14.