
Abstract

Multi-sensor information fusion arises in a wide variety of applications, including medical diagnosis, autonomous driving, and speech recognition. Such problems can often be modeled by Dempster–Shafer theory. In Dempster–Shafer theory, the most basic processing unit is the basic probability assignment, which describes objective information about the real world. How to make this description more effective is a vital but open issue. This article proposes a novel basic probability assignment generation model, whose objective is to show how basic probability assignments can be determined with learning algorithms. First, the generation model is constructed by clustering with the K-means method, and basic probability assignments are determined from the resulting model. The generation method is then extended with the K-nearest neighbor (K-NN) algorithm. The detailed implementation of the proposed method is demonstrated by several numerical examples. As an extension, a classifier called KKC is constructed according to the developed approach, and its classification performance is compared with that of several well-known classification algorithms. The experiments yield favorable classification accuracy, which illustrates the applicability of the proposed method for determining basic probability assignments.

Keywords

Dempster–Shafer evidence theory, basic probability assignment, K-means, K-nearest neighbor, multi-sensor information fusion, classification

Introduction

Information fusion, also known as data fusion, is an information processing technology.¹ It refers to the process of combining multiple groups of information from different sources with certain methods and rules to obtain a final fusion result, and is also called multi-source (multi-sensor) information fusion technology.² Dempster–Shafer evidence theory (DST), also called the theory of belief functions, is widely applied in information fusion. Its basic concept was put forward by Dempster in 1967,³ and then developed and improved by Shafer,⁴ making it a complete theory of uncertain reasoning. DST generalizes classical probability theory: it expands the basic event space into its power set and establishes the basic probability assignment (BPA) function. Dempster³ proposed a fusion rule that can combine evidence without prior information. In the past decades,⁵ DST has been applied in a large number of scientific fields, such as medical diagnosis,^6–10 multi-sensor fusion,^11–13 pattern recognition,^14–17 and intelligent decision-making.^18–22

How to generate BPA effectively is a crucial and open issue in DST.^23–26 BPA is the most basic processing unit and a way of information expression in DST. The fusion process in DST is based on the BPA function, so the rationality of the generated BPA directly affects the accuracy of fusion results, and also determines the efficiency of evaluation and decision. There are usually two perspectives to determine BPAs.^27,28 First, it is given subjectively by experts or decision makers based on their experience and survey results; second, a model is established based on collected data to automatically generate BPAs. To solve this problem, scholars have proposed various methods, which are described below.

On analysis, these methods of generating BPA can be divided into several categories: probability distribution-based methods, fuzzy theory-based methods, and other methods. Among the methods using probability distributions, Xu et al.²⁹ proposed a method to obtain BPA based on the normal distribution, and they also proposed a non-parametric method to determine BPA by Gaussian process regression.³⁰ A method to determine BPA using core samples was introduced by Zhang et al.³¹ Among the methods employing fuzzy theory, a few BPA generation algorithms based on (triangular) fuzzy numbers were provided in previous literature,^32–35 and the generalized BPA generation algorithm was proposed by Deng and colleagues.^36–38 In addition, there are other methods for generating BPAs of sample attribute values, such as the BPA generation algorithm based on interval numbers proposed by Qin and Xiao³⁹ and the algorithm based on the confusion matrix proposed by Deng et al.⁴⁰ Some BPA generation algorithms are presented in Table 1.

Table 1.

Overview of research on basic probability assignment generation methods.

Method	Core idea
Normal distributions²⁹	Normal distribution test is performed on the training set. Model is established according to the normal distribution curve. BPA is obtained from the intersection of the sample attribute values and the model.
Gaussian process regression³⁰	The main advantage of this method over Xu et al.²⁹ is that there is no requirement for data distribution, that is, there is no need to make any distribution assumption for data, so it is more suitable for engineering practice.
Core samples³¹	The core samples associated with each attribute are trained on the training set, and those conducive to the generation of BPA are selected. The distances between attribute values and the core samples are calculated to obtain BPAs.
Generalized fuzzy numbers³³	The fuzzy number model of attributes is established, and the similarity between attribute values to be tested and the generalized fuzzy number are calculated, so as to normalize the similarity and obtain the BPA of attributes.
Triangular fuzzy number³⁸	The triangular fuzzy number model under each attribute is established, and the BPA corresponding to each attribute value to be tested is generated according to the overlap between the attribute value and the model.
Confusion matrix⁴⁰	Based on classification problems, a BPA construction method using confusion matrix is proposed, which employed the accuracy and recall rate of each class to model and generate the BPA.
Interval number³⁹	The interval number model of attributes is established, the distance between the attribute values to be tested and the interval number is calculated, and the similarity measure is then obtained. The normalized similarity is used to obtain BPAs.
Triangular fuzzy number³²	An improved method to obtain basic belief assignment is proposed based on the triangular fuzzy number and k-means++ algorithm.
Minimum spanning tree³⁷	A method to determine GBPA in the open world is put forward. First, a MST is established. Then, the MST coverage of each class is established. Combine these coverages to form the MST covering model.
Triangular fuzzy number³⁴	Frame of discernment in an open world is established first, and then the triangular fuzzy number models to identify target are established. Pessimistic strategy based on the differentiation degree between model and sample is defined to determine BPAs.
Fuzzy sets⁴¹	This paper presents three methods to construct the BPA function. These methods are based on gray correlation analysis, fuzzy sets, and attribute measure, respectively.
Cloud model⁴²	First, the normal cloud model is constructed. Second, the average certainty of the test sample is obtained. Third, the method for measuring the similarity of normal cloud models is proposed. Finally, the certainty is normalized to obtain BPAs.

BPA: basic probability assignment; GBPA: generalized basic probability assignment; MST: minimum spanning tree.

The above introduces several popular methods of BPA generation. Different models can be established from different perspectives, and different results may be obtained. As for which method is employed to construct BPA, it needs to be selected according to the specific situation in practical application. There is no general rule for BPA determination, so it is an open issue.^32,39 Different from the traditional BPA method, in this article, a novel BPA generation algorithm is put forward from the perspective of machine learning (i.e. K-means and K–nearest neighbor (K-NN)).

First, the model to generate BPA is constructed based on the K-means method, where the concepts of centroid and radius are defined to represent the scope of the model. Then, the BPAs of the attribute values of the samples to be classified are determined from the built model, as introduced in section “Generate BPAs based on the constructed model.” A numerical example is provided to illustrate the use of the presented approach. In addition, the BPA generation method in section “Generate BPAs based on the constructed model” is extended by the K-NN algorithm: the K nearest neighbors of the sample to be identified are found, their BPAs are determined with the proposed generation model, and these BPAs are processed to obtain the final BPA of the sample to be tested. A numerical example is also given to demonstrate the procedure of this method. To make the proposed method easy to apply in different fields, a classifier is constructed based on the developed BPA determination method. The classification validity of the constructed classifier is verified by a comparative experiment with other state-of-the-art machine learning algorithms.

The remainder of this article is organized as follows. The “Preliminaries” section introduces the concept of DST, K-means method, and K-NN algorithm. “The model to determine BPA based on K-means method” section presents the model to determine BPA based on K-means method. The section “An improved method for BPA generation based on K-NN algorithm” introduces an improved method for BPA generation based on K-NN algorithm. “An application for classification based on multi-sensor information fusion” section constructs an application for classification based on the proposed BPA generation method. In the “Conclusions and future research” section, the conclusions and future research directions are provided.

Preliminaries

DST

DST was first introduced by Dempster³ and then developed by Shafer,⁴ and it is widely applied in uncertainty modeling.^43–46 It deals numerically with the “probabilities” of events that lack sharply defined boundaries. In DST, all the basic events constitute the frame of discernment (FOD), which is expressed as $Θ = {θ_{1}, θ_{2}, \dots, θ_{n}}$ . The elementary event is no longer a singleton in DST; all events are included in the power set of $Θ$ , denoted as $2^{Θ} = {ϕ, θ_{1}, θ_{2}, \dots, θ_{n}, θ_{1} \cup θ_{2}, θ_{1} \cup θ_{3}, \dots, Θ}$ . Each subset is assigned a mass, and the masses sum to 1. The BPA on FOD $Θ = {θ_{1}, θ_{2}, \dots, θ_{n}}$ can be defined as a mapping $m: 2^{Θ} \to [0, 1]$ which satisfies the following condition

m (ϕ) = 0, \sum_{A \in 2^{Θ}} m (A) = 1

(1)

where A denotes one of the propositions in $2^{Θ}$ and is called focal element if $m (A) > 0$ . $| A |$ is defined as the cardinality of A, which measures the number of elements in A. BPA is also called belief function, mass function, or a piece of evidence in DST.

DST is mostly employed in multi-source information fusion.^47–49 In such applications, multi-source information is usually expressed in the form of BPAs. These sources need to be combined for decision making and classification.^50–53 In DST, the concept of orthogonal sum was presented by Dempster³ as follows.

Definition 1. (Dempster’s rule of combination) Let $m_{1}$ and $m_{2}$ be two BPAs, the rule of Dempster denoted by $m = m_{1} \oplus m_{2}$ can be defined as

m (A) = \frac{\sum_{B \cap C = A} m_{1} (B) m_{2} (C)}{1 - K}

(2)

with

K = \sum_{B \cap C = ϕ} m_{1} (B) m_{2} (C)

(3)

Note that the rule of Dempster only works for two such BPAs where $K < 1$ .
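Definition 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the representation of a BPA as a dictionary from `frozenset` focal elements to masses, the function name `dempster_combine`, and the two example BPAs on the frame {a, b} are assumptions made here, not part of the original method.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two BPAs with Dempster's rule (Equations (2) and (3)).

    Each BPA is a dict mapping a frozenset (focal element) to its mass.
    """
    combined = {}
    conflict = 0.0  # K in Equation (3)
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        intersection = b & c
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c
    if not combined:
        # Every pair of focal elements is disjoint, i.e. K = 1.
        raise ValueError("total conflict (K = 1): Dempster's rule is undefined")
    return {a: mass / (1.0 - conflict) for a, mass in combined.items()}


# Hypothetical two-source example on the frame {a, b}.
m1 = {frozenset("a"): 0.6, frozenset("ab"): 0.4}
m2 = {frozenset("b"): 0.5, frozenset("ab"): 0.5}
fused = dempster_combine(m1, m2)
```

In this example the only conflicting pair is ({a}, {b}), so K = 0.3 and the remaining products are divided by 0.7.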

For decision or classification, the combination results of multi-source information usually need to be transformed into probability distribution.^54,55 The common method is pignistic probability,⁵⁶ which can be defined as follows.

Definition 2. (Pignistic probability) Let m be a BPA, the pignistic probability function is defined as

Bet P_{m} (A) = \sum_{B \subseteq Θ} \frac{| A \cap B |}{| B |} \frac{m (B)}{1 - m (ϕ)}, \forall A \subseteq Θ

(4)

where $| A |$ is the cardinality of focal element A.
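For singleton propositions, Equation (4) amounts to sharing the mass of each focal element equally among its members. The following sketch illustrates this under the same assumed dictionary representation of a BPA; the function name `pignistic` and the example masses are hypothetical.

```python
def pignistic(m, frame):
    """Pignistic probability of each singleton of `frame` (Equation (4)).

    `m` maps frozenset focal elements to masses.
    """
    empty_mass = m.get(frozenset(), 0.0)
    bet = {theta: 0.0 for theta in frame}
    for focal, mass in m.items():
        if not focal:
            continue  # the empty set only enters through the 1 - m(empty) factor
        share = mass / (len(focal) * (1.0 - empty_mass))
        for theta in focal:
            bet[theta] += share
    return bet


# Hypothetical BPA on the frame {a, b, c}.
m = {frozenset("a"): 0.5, frozenset("ab"): 0.3, frozenset("abc"): 0.2}
bet = pignistic(m, frame="abc")
```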

K-means clustering method

For most multi-sensor information fusion applications, it is suitable to employ clustering method to process the involved data, which can divide similar objects into the same or similar groups, so that the data objects in the same group have the same or similar characteristics. This data processing method can greatly reduce the difficulty in dealing with high-dimensional data, and subsequent processing can eliminate data redundancy to achieve the purpose of refining data sources.

The K-means algorithm is a relatively simple one among machine learning clustering algorithms. It is an iterative algorithm based on distance.⁵⁷ Its core idea is to partition n samples into K clusters so that each sample is closer to the center of its own cluster than to the center of any other cluster. The following is a brief introduction of the core idea of the K-means algorithm. First, K points are initialized as the centroids of the clusters, and a distance measure is selected to calculate the distance from each sample to each centroid. According to the obtained distances, each sample is assigned to the cluster of its nearest centroid. Next, the centroids are updated, the distances from the samples to the new centroids are measured again, and each sample is reassigned according to the nearest-distance principle. This step is repeated. When the number of cycles reaches the preset maximum number of iterations, or the change in the sum of squared errors between two consecutive iterations is less than the set threshold, the iteration stops, and the output is the final clustering result. The specific process of the K-means algorithm is described in Algorithm 1.

Algorithm 1. K-means algorithm.

Input: Sample set $D = {x_{1}, x_{2}, \dots, x_{m}}$ and the number of clusters K
Output: Cluster partition $C = {C_{1}, C_{2}, \dots, C_{K}}$
1: Randomly select K samples from D as the initial mean vectors ${μ_{1}, μ_{2}, \dots, μ_{K}}$
2: repeat
3:   Let $C_{i} = ϕ (1 \leq i \leq K)$
4:   for $j = 1$ to m do
5:     Calculate the distance $d_{ji} = \| x_{j} - μ_{i} \|_{2}$ between sample $x_{j}$ and each mean vector $μ_{i} (1 \leq i \leq K)$
6:     Determine the cluster $λ_{j} = \arg\min_{i \in {1, 2, \dots, K}} d_{ji}$ of $x_{j}$ according to the nearest mean vector
7:     Assign sample $x_{j}$ to cluster $C_{λ_{j}}$
8:   end for
9:   for $i = 1$ to K do
10:     Calculate the new mean vector $μ_{i}^{'} = \frac{1}{| C_{i} |} \sum_{x \in C_{i}} x$
11:     if $μ_{i}^{'} \neq μ_{i}$ then
12:       $μ_{i} \leftarrow μ_{i}^{'}$
13:     else
14:       Keep the current mean vector $μ_{i}$ unchanged
15:     end if
16:   end for
17: until no mean vector is updated
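Algorithm 1 can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the Euclidean distance is used, the iteration cap `max_iter` and the random `seed` are parameters introduced here for reproducibility, an empty cluster simply keeps its previous mean vector, and the six sample points are hypothetical.

```python
import math
import random


def kmeans(samples, k, max_iter=100, seed=0):
    """Plain K-means with Euclidean distance; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialisation: K samples chosen at random serve as the first mean vectors.
    centroids = [list(p) for p in rng.sample(samples, k)]
    labels = [0] * len(samples)
    for _ in range(max_iter):
        # Assignment step: each sample joins the cluster of its nearest mean vector.
        labels = [min(range(k), key=lambda i: math.dist(x, centroids[i]))
                  for x in samples]
        # Update step: recompute each mean vector from its current members.
        new_centroids = []
        for i in range(k):
            members = [x for x, lab in zip(samples, labels) if lab == i]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[i])  # assumption: keep an empty cluster's mean
        if new_centroids == centroids:  # no mean vector changed: converged
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical two-dimensional samples.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, labels = kmeans(points, k=2)
```

Because the initial mean vectors are drawn at random, different seeds can lead to different partitions; at convergence every sample is assigned to its nearest returned centroid.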

In the K-means clustering algorithm, the selection of distance measure is also important. There are many ways to calculate the distance, such as Euclidean distance, Manhattan distance, Chebyshev distance, cosine distance, Jaccard similarity coefficient, and so on. The selection of different distance measures will have a certain impact on the clustering results. Since this is not the focus of this article, the detailed introduction is not given. The K-means algorithm is selected in this article because it has the following advantages:

This algorithm is simple and fast

This algorithm converges after a certain number of iterations

The clustering effect is better when the clusters are spherical or compact, and the differences between clusters are obvious

This algorithm has certain advantages for big data processing.

Based on the above analysis, and aiming at how to effectively generate the BPA of sensor data, this article departs from the traditional statistical modeling approach and proposes a BPA generation algorithm that uses clustering ideas, building on the advantages of the K-means clustering algorithm. The specific algorithm flow is introduced in Algorithm I.

K-NN method

The K-NN algorithm is a classical classification algorithm in machine learning. It classifies objects by measuring the distance between samples in the feature space.⁵⁸ Its basic idea can be described as follows: the category of the sample to be tested is taken to be the category that is most common among its K most similar samples in the feature space. The K-NN method thus makes its decision by majority and avoids the uncertainty brought by a decision based on a single object, which is its main advantage. The K-NN method is briefly illustrated by an example below. In Figure 1, the coordinates of all sample points in the plane have been given, and the category of the purple triangle has to be determined. According to the sample distribution in the graph, the possible category of the purple triangle is one of rectangle, circle, or diamond. The K-NN method is employed to classify it. When $K = 4$ , the proportion of rectangle is the largest, which is $1 / 2$ , so the purple triangle is assigned to rectangle. If $K = 7$ , then the proportion of diamond is the largest, which is $3 / 7$ , so in this case the purple triangle is assigned to diamond. This example demonstrates the core idea of the K-NN method and also reflects the importance of the value of K: in the same problem, different K values may lead to different classification results.

Figure 1.

The illustration for K-NN algorithm.

The distance measure in the K-NN algorithm is also crucial, because the algorithm determines proximity by measuring the distance between the sample to be tested and each training sample. Common distance measures are the Euclidean distance and the Manhattan distance, and which one should be employed depends on the actual environment.
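The dependence of the K-NN decision on K can be reproduced with a short sketch. The coordinates below are hypothetical and are not read off Figure 1; they are chosen only so that the vote proportions match those described in the text ( $1 / 2$ for rectangle at $K = 4$ and $3 / 7$ for diamond at $K = 7$ ). The function name `knn_classify` and the use of the Euclidean distance are likewise assumptions.

```python
import math
from collections import Counter


def knn_classify(query, samples, labels, k):
    """Majority vote among the k training samples nearest to `query`."""
    order = sorted(range(len(samples)), key=lambda i: math.dist(query, samples[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training points, listed by increasing distance from the query.
samples = [(1.0, 0.0), (0.0, 1.1), (1.2, 0.0), (0.0, 1.3),
           (2.0, 0.0), (0.0, 2.1), (2.2, 0.0)]
labels = ["rectangle", "rectangle", "circle", "diamond",
          "diamond", "diamond", "circle"]
query = (0.0, 0.0)

with_k4 = knn_classify(query, samples, labels, k=4)  # 2 of 4 votes: rectangle
with_k7 = knn_classify(query, samples, labels, k=7)  # 3 of 7 votes: diamond
```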

The model to determine BPA based on K-means method

To obtain reasonable, effective and representative BPA, two aspects need to be taken into account: (1) What are the definitions and requirements for the rationality of BPA? (2) How to ensure the efficiency of BPA generation model? For the first aspect, the definition of rationality is given as follows: the generated BPA should contain as much effective information of the original data set as possible and minimize the information loss caused by the conversion of data type, which can serve as the basis for the information fusion and decision-making process. For the second aspect, the efficiency of the model is mainly concerned with the effectiveness of generating BPA and the complexity of the algorithm. To meet the above two requirements, the model developed in this article fully considers and utilizes the information of the original data set to ensure that the information loss can be minimized. The complexity of the algorithm is reduced as far as possible on the premise of rationality.

Based on the introduction and analysis of K-means algorithm in the “K-means clustering method” section, in this study, a model to generate BPA using the K-means clustering algorithm is developed, and the flow chart of the proposed model is manifested in Figure 2. The algorithm is described in detail below.

Figure 2.

The flowchart of K-means-based BPA generation model.

Construct the model to generate BPA based on K-means method

The BPA generation process generally takes place before multi-sensor information fusion. Based on the investigation of a large number of practical applications,^59,60 and without loss of generality, the problem is described as follows. For a given data set, the number of samples is assumed to be n, the number of attributes is m, and the number of categories is p, that is, the number of elements in the FOD in DST. Let us consider a classification problem. A certain proportion of samples are selected randomly, according to the practical application environment, as the training set, and the rest form the test set. Let $X = {x_{i} = (x_{i}^{1}, \dots, x_{i}^{m}) | i = 1, \dots, n_{t}}$ be the training data set of $n_{t}$ m-dimensional samples ( $n_{t}$ represents the number of samples in the training set), and let a set of p classes be denoted as $C = {C_{1}, \dots, C_{p}}$ . A class label $L_{i} \in {1, \dots, p}$ is assigned to each sample $x_{i}$ in $X$ . Generally, a pair $(x_{i}, L_{i})$ is employed to represent the ith training sample. Let $x_{t}$ be a sample whose BPAs need to be determined based on the information provided by $X$ . This means that all the attributes of $x_{t}$ will be expressed as BPAs.

In the constructed model, the m attributes are divided into $C_{m}^{s}$ subsets, where s denotes the number of attributes in each subset. For example, $m = 4$ and $s = 2$ mean that there are four attributes in the data set and that each subset contains two of them. If the four attributes are represented as A, B, C, and D, then the attribute set can be divided into $C_{4}^{2} = 6$ subsets ${A, B}$ , ${A, C}$ , ${A, D}$ , ${B, C}$ , ${B, D}$ , and ${C, D}$ . In particular, $s = m$ means that all attributes form a single subset, and $s = 1$ means that each attribute is a subset on its own. Based on this partition of the attribute set, $C_{m}^{s}$ new training sets are obtained, each composed of the training samples restricted to one subset of attributes. The K-means method is applied to every such training set, grouping its samples into p clusters. Two properties are employed to express the clustering results, the cluster centroid and the radius, where the radius is defined as the distance between the centroid and the farthest point of the cluster. Therefore, $p \times C_{m}^{s}$ cluster centroids and radii are obtained, denoted $κ_{i}^{j}$ and $ψ_{i}^{j}, i = 1, \dots, p, j = 1, \dots, C_{m}^{s}$ . Taking the jth training set as an example, all its samples are clustered into p clusters; for the ith cluster, let the true labels of the samples in it be $L_{i}^{υ}, υ = 1, \dots, p$ . The centroid and radius of a cluster define a region whose category label is the true label held by the largest number of samples in this region, that is, $L_{i}^{μ} = \max_{υ} L_{i}^{υ}$ , where $\max$ denotes the maximum in number. Now, $C_{m}^{s}$ s-dimensional clustering results are obtained, each composed of multiple labeled regions. The model to determine BPA is thereby established, and the detailed modeling process will be shown by an illustrative example.
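The construction just described can be sketched as two small helpers: one enumerates the $C_{m}^{s}$ attribute subsets, and the other turns the clustering result of one subset into labeled regions. This is a sketch under assumptions: the function names, the (centroid, radius, label) triple used to represent a region, and the toy data are introduced here for illustration, and the cluster ids are supplied by hand where a K-means run would normally provide them.

```python
import math
from collections import Counter
from itertools import combinations


def attribute_subsets(m, s):
    """All C(m, s) index subsets of the m attributes."""
    return list(combinations(range(m), s))


def describe_clusters(points, cluster_ids, true_labels):
    """Summarise each cluster as a (centroid, radius, label) region.

    radius: distance from the centroid to the farthest member of the cluster;
    label: the most frequent true label among the members.
    """
    regions = []
    for cid in sorted(set(cluster_ids)):
        idx = [i for i, c in enumerate(cluster_ids) if c == cid]
        members = [points[i] for i in idx]
        centroid = tuple(sum(col) / len(members) for col in zip(*members))
        radius = max(math.dist(centroid, p) for p in members)
        label = Counter(true_labels[i] for i in idx).most_common(1)[0][0]
        regions.append((centroid, radius, label))
    return regions


subsets = attribute_subsets(4, 2)  # the six attribute pairs for m = 4, s = 2

# Hypothetical two-dimensional projection with hand-written cluster ids.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
regions = describe_clusters(points, [0, 0, 1, 1], ["c1", "c1", "c2", "c2"])
```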

Generate BPAs based on the constructed model

The previous section built the BPA generation model; this section describes how BPAs are generated from it. For a given attribute value of a sample whose BPA is to be generated, the number of s-dimensional clustering results that involve this attribute is $C_{m - 1}^{s - 1}$ . For each of these $C_{m - 1}^{s - 1}$ clustering results, the corresponding combination of attribute values of the sample is projected as a point into the clustering result. If the point falls in the region identified by a class label, that label is taken as the category to which the point belongs. Note that the point may belong to one category or to several categories, the latter when it is covered by the regions of several categories simultaneously.

For example, in data set D, four attributes are represented as a, b, c, and d, respectively, and $s = 2$ . Let the attribute value of sample S under attribute a be $v_{S}^{a}$ ; the number of corresponding clustering results is $C_{3}^{1} = 3$ , namely those of the attribute pairs ${a, b}$ , ${a, c}$ , and ${a, d}$ . Let the coordinates of sample S under the three attribute pairs be $(v_{S}^{a}, v_{S}^{b})$ , $(v_{S}^{a}, v_{S}^{c})$ , and $(v_{S}^{a}, v_{S}^{d})$ , respectively. According to these coordinates, the region of each point in the clustering graph of the corresponding attribute pair is determined. If the point belongs to a single region, such as the region of category $c_{1}$ , then the BPA of this point is $m ({c_{1}}) = 1$ . If the point belongs to an intersection region, such as the overlapping region of categories $c_{1}$ and $c_{2}$ , then the BPA of this point is $m ({c_{1}, c_{2}}) = 1$ . In this way, the three BPAs corresponding to the attribute value $v_{S}^{a}$ of sample S can be obtained. The next step is to average the three BPAs to calculate the final BPA of the attribute value $v_{S}^{a}$ . The average operation is defined as follows:

Definition 3. Let the FOD be $C = {C_{1}, \dots, C_{p}}$ and the number of BPAs corresponding to the attribute value of sample S under attribute a be $C_{m - 1}^{s - 1}$ . The BPAs are denoted as $m_{1}, m_{2}, \dots, m_{C_{m - 1}^{s - 1}}$ , then the arithmetic mean value of the BPA of each focal element can be calculated as follows

m (B) = \frac{1}{C_{m - 1}^{s - 1}} \sum_{i = 1}^{C_{m - 1}^{s - 1}} m_{i} (B), \forall B \subseteq C

(5)

Perform this operation on the three BPAs corresponding to the attribute value $v_{S}^{a}$ under sample S to obtain the corresponding final BPA. All the BPAs of sample S can be obtained by performing the same operation on other attribute values in sample S as $v_{S}^{a}$ . The BPAs for each sample can be obtained by repeating this step for all samples in the test set.
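The two steps of this section, assigning all mass to the set of covering regions and then averaging, can be sketched as follows. The region representation, the function names, and the example regions are assumptions made for illustration. The text does not state what happens when a point lies outside every region; this sketch returns an empty assignment in that case, which is also an assumption.

```python
import math
from fractions import Fraction


def region_bpa(point, regions):
    """Categorical BPA for one projected point.

    `regions` is a list of (centroid, radius, label) triples. All mass goes to
    the set of labels whose regions cover the point.
    """
    covering = frozenset(label for centroid, radius, label in regions
                         if math.dist(point, centroid) <= radius)
    # Assumption: a point covered by no region yields an empty assignment.
    return {covering: Fraction(1)} if covering else {}


def average_bpas(bpas):
    """Arithmetic mean of several BPAs, focal element by focal element."""
    mean = {}
    for bpa in bpas:
        for focal, mass in bpa.items():
            mean[focal] = mean.get(focal, Fraction(0)) + Fraction(mass) / len(bpas)
    return mean


# Hypothetical regions of two categories that overlap between x = 1 and x = 2.
regions = [((0.0, 0.0), 2.0, "c1"), ((3.0, 0.0), 2.0, "c2")]
inside_c1 = region_bpa((-1.0, 0.0), regions)  # covered by c1 only
in_overlap = region_bpa((1.5, 0.0), regions)  # covered by c1 and c2
final = average_bpas([inside_c1, inside_c1, in_overlap])
```

Exact fractions are used so that the averaged masses can be compared directly with hand calculations.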

Numerical example

In this section, to demonstrate the BPA generation algorithm proposed in the article, we take the IRIS data set as an example and walk through the application of the constructed model in detail. First, we briefly introduce the basic structure of the IRIS data set. It is a very famous pattern recognition data set, which was established in the mid-1930s by R.A. Fisher, a distinguished statistician, and it is available from the UCI machine learning database, where it is widely cited. The IRIS data set contains three categories, namely Setosa, Versicolor, and Virginica, with a total of 150 samples and 50 samples in each category. The data set consists of four attribute variables, namely Sepal Length, Sepal Width, Petal Length, and Petal Width. As the data types in this data set are all numerical and the selection of attribute characteristics is representative, it is one of the most commonly used data sets in the field of data mining and analysis. The following example is based on this data set.

According to the process shown in Figure 2, we first divide the data set into a training set and a test set, with 80% of the samples used for training and the remaining 20% for testing. The following two steps demonstrate how to generate the BPAs associated with the samples in the test set.

1. Construct the BPA generation model

In this example, we set $s = 2$ ; since $m = 4$ , the attribute set can be divided into $C_{4}^{2} = 6$ subsets. Next, the K-means clustering method is employed to cluster the six resulting data subsets. The number of clusters is set equal to the number of categories in the data set, that is, $p = 3$ . The clustering results are shown in Figure 3. This completes the construction of the BPA generation model for the IRIS data set. The next step shows how to generate the BPAs associated with the samples in the test set.

2. Generate BPAs of test samples based on the established model

Figure 3.

BPA generation model for IRIS data set. From left to right and from top to bottom, the two-dimensional attributes are (SL, SW), (PL, PW), (SL, PW), (SL, PL), (SW, PW), and (SW, PL).

In this part, we take the second test sample of IRIS-setosa as an example to demonstrate how to generate the BPA associated with each attribute value of this sample. According to the IRIS data set, the four attribute values of Sepal Length, Sepal Width, Petal Length, and Petal Width of sample 2 are 4.9, 3.0, 1.4, and 0.2, respectively. Following the section “Generate BPAs based on the constructed model,” the four attribute values are divided into $C_{4}^{2} = 6$ subsets, namely $(4.9, 3.0)$ , $(1.4, 0.2)$ , $(4.9, 0.2)$ , $(4.9, 1.4)$ , $(3.0, 0.2)$ , and $(3.0, 1.4)$ . The next step is to calculate the corresponding BPA for each subset using the model represented by the matching subplot in Figure 3. Figure 4 shows the process of generating the BPAs of attribute sets $(4.9, 3.0)$ and $(4.9, 1.4)$ . As shown in the figure, the basic probability is assigned to the attribute value according to the label of the region in which it is located. For instance, point $(4.9, 3.0)$ is located in the intersection area of IRIS-setosa and IRIS-versicolor, so the corresponding BPA is $m_{2}^{12} ({Setosa, Versicolor}) = 1$ , where subscript 2 represents the sample number and superscript “12” represents the attributes. Similarly, the BPA of point $(4.9, 1.4)$ is $m_{2}^{13} ({Setosa}) = 1$ . Based on the above calculation process, the corresponding BPAs of the other four subsets can be obtained as

m_{2}^{14} ({Setosa}) = 1, m_{2}^{24} ({Setosa}) = 1, m_{2}^{23} ({Setosa}) = 1, m_{2}^{34} ({Setosa}) = 1

(6)

Figure 4.

The process of generating BPA from attributes $(4.9, 3.0)$ and $(4.9, 1.4)$ in sample 2.

After obtaining the corresponding BPA of each subset, Definition 3 is employed to average the three BPAs that involve the first attribute, namely $m_{2}^{12}$ , $m_{2}^{13}$ , and $m_{2}^{14}$ , and the final BPA of the first attribute value of sample 2 can be calculated as

\begin{matrix} m_{2}^{1} ({Setosa}) = \frac{1}{3} (m_{2}^{12} ({Setosa}) + m_{2}^{13} ({Setosa}) + m_{2}^{14} ({Setosa})) = \frac{2}{3} \\ m_{2}^{1} ({Versicolor}) = 0 \\ m_{2}^{1} ({Virginica}) = 0 \\ m_{2}^{1} ({Setosa, Versicolor}) = \frac{1}{3} (m_{2}^{12} ({Setosa, Versicolor}) + m_{2}^{13} ({Setosa, Versicolor}) + m_{2}^{14} ({Setosa, Versicolor})) = \frac{1}{3} \\ m_{2}^{1} ({Setosa, Virginica}) = 0 \\ m_{2}^{1} ({Virginica, Versicolor}) = 0 \\ m_{2}^{1} ({Setosa, Versicolor, Virginica}) = 0 \end{matrix}

(7)

Note that $m_{i}^{j} (A)$ denotes the mass of belief of focal element A of jth attribute value of ith sample. Therefore, the BPA of the attribute value 4.9 in sample 2 is

m_{2}^{1} ({Setosa}) = \frac{2}{3}, m_{2}^{1} ({Setosa, Versicolor}) = \frac{1}{3}

(8)

According to the above process, the BPAs of other attribute values 3.0, 1.4, and 0.2 in sample 2 are

m_{2}^{2} ({Setosa}) = \frac{2}{3}, m_{2}^{2} ({Setosa, Versicolor}) = \frac{1}{3}

(9)

m_{2}^{3} ({Setosa}) = 1

(10)

m_{2}^{4} ({Setosa}) = 1

(11)

According to the above process, the BPAs of all samples in the test set can be calculated. The BPAs of the samples of IRIS-setosa, IRIS-versicolor, and IRIS-virginica are listed in Tables 2–4, respectively.
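As a small worked illustration, which is not part of the original derivation, Definition 2 can be applied to the BPA of Equation (8) to show how the mass on the compound focal element is shared between its two members. The symbols are those defined above.

```latex
\begin{aligned}
BetP_{m_{2}^{1}} (\{Setosa\}) &= \frac{1}{1} \cdot \frac{2}{3} + \frac{1}{2} \cdot \frac{1}{3} = \frac{5}{6} \\
BetP_{m_{2}^{1}} (\{Versicolor\}) &= \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{6} \\
BetP_{m_{2}^{1}} (\{Virginica\}) &= 0
\end{aligned}
```

The three values sum to 1, and Setosa remains the most probable class for the attribute value 4.9.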

Table 2.

The BPAs of each sample to be tested in IRIS data set assigned to class “setosa.”

Sample	Attribute
	Sepal length	Sepal width	Petal length	Petal width
1	$m ({S}) = \frac{2}{3}, m ({S, C}) = \frac{1}{3}$	$m ({S}) = \frac{2}{3}, m ({S, C}) = \frac{1}{3}$	$m ({S}) = 1$	$m ({S}) = 1$
2	$m ({S}) = \frac{2}{3}, m ({S, C}) = \frac{1}{3}$	$m ({S}) = \frac{2}{3}, m ({S, C}) = \frac{1}{3}$	$m ({S}) = 1$	$m ({S}) = 1$
3	$m ({S}) = \frac{2}{3}, m ({S, C}) = \frac{1}{3}$	$m ({S}) = \frac{2}{3}, m ({S, C}) = \frac{1}{3}$	$m ({S}) = 1$	$m ({S}) = 1$
4	$m ({S}) = 1$	$m ({S}) = 1$	$m ({S}) = 1$	$m ({S}) = 1$
5	$m ({S}) = 1$	$m ({S}) = 1$	$m ({S}) = 1$	$m ({S}) = 1$
6	$m ({S}) = 1$	$m ({S}) = 1$	$m ({S}) = 1$	$m ({S}) = 1$
7	$m ({S}) = 1$	$m ({S}) = 1$	$m ({S}) = 1$	$m ({S}) = 1$
8	$m ({S}) = \frac{2}{3}, m ({S, C}) = \frac{1}{3}$	$m ({S}) = \frac{2}{3}, m ({S, C}) = \frac{1}{3}$	$m ({S}) = 1$	$m ({S}) = 1$
9	$m ({S}) = \frac{2}{3}, m ({S, C}) = \frac{1}{3}$	$m ({S}) = \frac{2}{3}, m ({S, C}) = \frac{1}{3}$	$m ({S}) = 1$	$m ({S}) = 1$
10	$m ({S}) = 1$	$m ({S}) = 1$	$m ({S}) = 1$	$m ({S}) = 1$

BPA: basic probability assignment. “S” denotes setosa, “C” denotes versicolor, and “V” denotes virginica. Unless otherwise stated below, the classes of IRIS data set are defined according to this table.

Table 3.

The BPAs of each sample to be tested in IRIS data set assigned to class “versicolor.”

Sample	Attribute
	Sepal length	Sepal width	Petal length	Petal width
1	$m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$	$m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{1}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({C, V}) = \frac{2}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({C, V}) = 1$
2	$m ({C, V}) = 1$	$m ({C, V}) = \frac{2}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({C, V}) = \frac{2}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({C, V}) = 1$
3	$m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$	$m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$	$m ({C, V}) = 1$	$m ({C, V}) = 1$
4	$m ({C}) = \frac{2}{3}, m ({S, V}) = \frac{2}{3}$	$m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{1}{3}, m ({V, S}) = \frac{1}{3}$	$m ({C}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$	$m ({C, V}) = 1$
5	$m ({C, V}) = 1$	$m ({C, V}) = 1$	$m ({C, V}) = 1$	$m ({C, V}) = 1$
6	$m ({C}) = \frac{2}{3}, m ({C, S, V}) = \frac{1}{3}$	$m ({C, V}) = \frac{2}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$	$m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$
7	$m ({C, V}) = \frac{2}{3}, m ({S, V}) = \frac{1}{3}$	$m ({C, V}) = \frac{1}{3}, m ({S, V}) = \frac{1}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({C, V}) = \frac{2}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({C, V}) = 1$
8	$m ({C}) = \frac{2}{3}, m ({C, S}) = \frac{1}{3}$	$m ({C}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$	$m ({C}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$	$m ({C}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$
9	$m ({C, V}) = 1$	$m ({C, V}) = \frac{2}{3}, m ({C, S, V}) = \frac{1}{3}$	$m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{1}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$
10	$m ({C}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$	$m ({C, V}) = \frac{2}{3}, m ({C, S, V}) = \frac{1}{3}$	$m ({C}) = \frac{2}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({C}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$

BPA: basic probability assignment.

Table 4.

The BPAs of each sample to be tested in IRIS data set assigned to class “virginica.”

Sample	Attribute
	Sepal length	Sepal width	Petal length	Petal width
1	$m ({V}) = \frac{1}{3}, m ({C, V}) = \frac{1}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({V}) = \frac{1}{3}, m ({C, V}) = \frac{1}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({V}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$	$m ({V}) = 1$
2	$m ({C, V}) = \frac{2}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({C, V}) = \frac{2}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({V}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$	$m ({V}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$
3	$m ({C}) = \frac{1}{3}, m ({V}) = \frac{2}{3}$	$m ({C}) = \frac{1}{3}, m ({V}) = \frac{2}{3}$	$m ({V}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$	$m ({V}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$
4	$m ({C, V}) = 1$	$m ({C, V}) = 1$	$m ({V}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$	$m ({V}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$
5	$m ({V}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$	$m ({C, V}) = 1$	$m ({V}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$	$m ({V}) = 1$
6	$m ({V}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$	$m ({C, V}) = \frac{2}{3}, m ({C, V, S}) = \frac{1}{3}$	$m ({V}) = \frac{2}{3}, m ({C}) = \frac{1}{3}$	$m ({V}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$
7	$m ({C}) = \frac{2}{3}, m ({S, V}) = \frac{1}{3}$	$m ({C, V}) = \frac{2}{3}, m ({S, V}) = \frac{1}{3}$	$m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$	$m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$
8	$m ({V}) = \frac{2}{3}, m ({C}) = \frac{1}{3}$	$m ({V}) = \frac{1}{3}, m ({C}) = \frac{1}{3}, m ({C, V}) = \frac{1}{3}$	$m ({V}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$	$m ({V}) = 1$
9	$m ({C, V}) = 1$	$m ({C, V}) = 1$	$m ({V}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$	$m ({V}) = \frac{1}{3}, m ({C, V}) = \frac{2}{3}$
10	$m ({V}) = \frac{2}{3}, m ({C}) = \frac{1}{3}$	$m ({V}) = \frac{2}{3}, m ({C}) = \frac{1}{3}$	$m ({V}) = \frac{2}{3}, m ({C, V}) = \frac{1}{3}$	$m ({V}) = 1$

BPA: basic probability assignment.

An improved method for BPA generation based on K-NN algorithm

In the section “Construct the model to generate BPA based on K-means method,” a BPA generation model was constructed with the K-means algorithm, and its use was demonstrated by a numerical example. In this section, inspired by the K-NN algorithm, the BPA generation method of the section “Generate BPAs based on the constructed model” is improved from a new perspective. For a sample to be tested, its K nearest “neighbors” are found through a distance measure, and the belief assigned to each focal element associated with the sample is determined from the information carried by these “neighbors.” The corresponding BPAs are then generated. The method is described in detail below.

The BPA generation method developed in this section can be regarded as an extension of the one in the section “Generate BPAs based on the constructed model,” and it relies on the BPA generation model built in the section “Construct the model to generate BPA based on K-means method.” In the section “Generate BPAs based on the constructed model,” the BPAs of samples are determined directly by their attribute values. In practical applications, however, the class domains of the samples to be classified often overlap. In this situation, the method in the section “Generate BPAs based on the constructed model” is no longer applicable, whereas the K-NN algorithm provides a suitable solution to such problems.

The K-NN-based BPA generation method

Based on the model constructed in the section “Construct the model to generate BPA based on K-means method,” let x denote the set of attribute values to be tested, which may contain between 1 and m elements, and let $Φ$ denote the set of the K NNs of x. For any $x_{i} \in Φ$, $i = 1, \dots, K$, its BPA can be calculated by the method in the section “Generate BPAs based on the constructed model” and is denoted as $m_{i}$. The K BPAs of the NNs of x can then be combined by Dempster’s rule. This is feasible because all BPAs share the same FOD $C = {C_{1}, \dots, C_{p}}$. In addition, the distance between the sample x to be identified and the neighbor $x_{i}$ must be taken into account. If x is far from $x_{i}$, then $x_{i}$ should have only a small influence (weight) on the determination of the BPA; the larger the distance, the smaller the weight of the neighbor. The weight of $m_{i}$ associated with $x_{i}$ can be defined as follows⁶¹

ω_{i} = \exp (- γ \cdot d_{i})

(12)

d_{i} = \frac{d (x, x_{i})}{\min_{j = 1, \dots, K} d (x, x_{j})}

(13)

where function $d (\cdot)$ is a distance measure, $γ$ is a tuning parameter employed to adjust the influence of distance, and $d_{i}$ is the relative distance of the sample to the neighbor $x_{i}$ with respect to the minimum distance to the NNs.

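Note that equation (13) is undefined if one or more neighbors coincide with the sample to be tested, that is, if $x = x_{i}$, because then $d (x, x_{i}) = 0$ and the minimum distance in the denominator is zero. In this case, let $ω_{i} = 1$ for each coinciding neighbor; in the first numerical example below, the remaining neighbors then receive zero weight, which corresponds to the limit of equation (12) as $d_{i}$ grows without bound. After all the $ω$ values are obtained, they are normalized to obtain the final weights of the neighbors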

w_{i} = \frac{ω_{i}}{\sum_{j = 1}^{K} ω_{j}}

(14)

which satisfies $\sum_{i = 1}^{K} w_{i} = 1$.
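As a rough illustration of equations (12)–(14), the weights can be computed as in the following Python sketch. The function name `neighbor_weights` and the sample distances are hypothetical, the default $γ = 8$ simply mirrors the value reported later in the article, and the handling of non-coinciding neighbors in the zero-distance case (weight 0) is an assumption rather than something the equations state.

```python
import math

def neighbor_weights(distances, gamma=8.0):
    """Normalized neighbor weights following equations (12)-(14)."""
    d_min = min(distances)
    if d_min == 0:
        # Degenerate case: coinciding neighbors get omega = 1.
        # Assumption: all other neighbors get omega = 0 (limit of exp(-gamma * d_i)).
        omega = [1.0 if d == 0 else 0.0 for d in distances]
    else:
        # Equation (13): relative distance; equation (12): exponential decay.
        omega = [math.exp(-gamma * d / d_min) for d in distances]
    total = sum(omega)
    # Equation (14): normalize so that the weights sum to 1.
    return [o / total for o in omega]

weights = neighbor_weights([0.1, 0.2, 0.4])     # hypothetical distances
coinciding = neighbor_weights([0.0, 0.0, 0.3])  # two neighbors coincide with x
print(coinciding)  # [0.5, 0.5, 0.0]
```

With a value of $γ$ as large as 8, the nearest neighbor dominates strongly unless several neighbors are at (almost) the same distance.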

To determine the final BPA of x, the BPAs of its K neighbors need to be combined by the rule of Dempster considering the weights. First, the weighted average BPA of x associated with K neighbors can be calculated as

m_{x}^{w} (A) = \sum_{i = 1}^{K} w_{i} \cdot m_{i} (A)

(15)

Then the weighted average BPA $m_{x}^{w}$ is combined with itself $K - 1$ times by Dempster’s rule, following the idea of Murphy,⁶² as

m_{x} = \underbrace{m_{x}^{w} \oplus m_{x}^{w} \oplus \dots \oplus m_{x}^{w}}_{K}

(16)

where $m_{x}$ is the determined BPA of x. Repeat this step to obtain the BPAs of all attribute values associated with samples in the test set.
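Equations (15) and (16) can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: focal elements are represented as `frozenset` objects, the function names are hypothetical, and the neighbor BPAs and weights are made-up values.

```python
from itertools import product

def dempster(m1, m2):
    """Dempster's rule for two BPAs whose focal elements are frozensets."""
    combined, conflict = {}, 0.0
    for (a, va), (b, vb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + va * vb
        else:
            conflict += va * vb  # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {A: v / (1.0 - conflict) for A, v in combined.items()}

def weighted_average(bpas, weights):
    """Equation (15): focal-element-wise weighted mean of the neighbor BPAs."""
    avg = {}
    for m, w in zip(bpas, weights):
        for A, v in m.items():
            avg[A] = avg.get(A, 0.0) + w * v
    return avg

def murphy_combine(bpas, weights):
    """Equation (16): combine the weighted average with itself K - 1 times."""
    m_w = weighted_average(bpas, weights)
    result = m_w
    for _ in range(len(bpas) - 1):
        result = dempster(result, m_w)
    return result

C, V = frozenset({"C"}), frozenset({"V"})
CV = C | V
neighbors = [{C: 1.0}, {CV: 1.0}, {V: 1.0}]  # hypothetical BPAs of K = 3 neighbors
weights = [0.6, 0.3, 0.1]                    # hypothetical normalized weights
m_w = weighted_average(neighbors, weights)
m_x = murphy_combine(neighbors, weights)
```

In this toy case the repeated combination sharpens the averaged BPA: the mass on {C} grows from 0.6 in `m_w` to more than 0.9 in `m_x`.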

Numerical examples

To demonstrate the improved BPA generation method based on K-NN, several numerical examples are given below. These examples are also based on the IRIS data set, which has been described in detail in “Numerical example” under the section “The model to determine BPA based on K-means method” and is not described again here. In this section, the parameter $γ$ in equation (12) is searched over the interval $[5, 20]$ using the training data and set to $γ = 8$.

In the first example, let $s = 1$ in the BPA generation model of the section “Construct the model to generate BPA based on K-means method,” which means that each set of attributes contains only one element. As before, 80% of the samples are selected as the training set and the rest as the test set. The resulting BPA generation model is shown in Figure 5. The attribute “Petal Width” is used to illustrate the K-NN-based BPA generation method. For the test sample $(5.5, 2.3, 4.0, 1.3)$, consider the attribute value 1.3 and let $K = 8$. The eight NNs of the attribute value 1.3 are $1.2 \in C$, $1.2 \in C$, $1.3 \in C$, $1.3 \in C$, $1.3 \in C$, $1.4 \in C$, $1.4 \in C$, and $1.4 \in V$, respectively. Their BPAs, determined by the method in the section “Generate BPAs based on the constructed model,” are $m_{1} ({C}) = 1$, $m_{2} ({C}) = 1$, $m_{3} ({C}) = 1$, $m_{4} ({C}) = 1$, $m_{5} ({C}) = 1$, $m_{6} ({C}) = 1$, $m_{7} ({C}) = 1$, and $m_{8} ({V}) = 1$, and their weights, calculated by equations (12)–(14), are $w_{1} = 0$, $w_{2} = 0$, $w_{3} = 1 / 3$, $w_{4} = 1 / 3$, $w_{5} = 1 / 3$, $w_{6} = 0$, $w_{7} = 0$, and $w_{8} = 0$. Only the three neighbors that coincide with the attribute value 1.3 receive a nonzero weight. The final BPA of attribute value 1.3 is then determined by equations (15) and (16) as $m_{1.3} ({C}) = 1$. Likewise, the BPAs of attribute values 5.5, 2.3, and 4.0 are obtained as $m_{5.5} ({C}) = 1$, $m_{2.3} ({C}) = 1$, and $m_{4.0} ({C}) = 1$. The true class of sample $(5.5, 2.3, 4.0, 1.3)$ is also C, which agrees with the generated BPAs and illustrates the effectiveness of the method.
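The arithmetic behind the weights and the final BPA for attribute value 1.3 can be written out as follows. This is a sketch of one reading of the degenerate case of equation (13), assuming that neighbors at nonzero distance receive $ω_{i} = 0$ when at least one neighbor coincides with the test value.

```latex
\begin{aligned}
ω_{3} = ω_{4} = ω_{5} &= 1, \qquad ω_{1} = ω_{2} = ω_{6} = ω_{7} = ω_{8} = 0 \\
w_{3} = w_{4} = w_{5} &= \frac{1}{1 + 1 + 1} = \frac{1}{3} \\
m_{1.3}^{w} (\{C\}) &= \sum_{i = 1}^{7} w_{i} \cdot 1 = 3 \cdot \frac{1}{3} = 1, \qquad m_{1.3}^{w} (\{V\}) = w_{8} \cdot 1 = 0 \\
m_{1.3} (\{C\}) &= (m_{1.3}^{w} \oplus \dots \oplus m_{1.3}^{w}) (\{C\}) = 1
\end{aligned}
```

Because the weighted average already places all mass on {C}, combining it with itself any number of times leaves it unchanged.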

Figure 5.

BPA generation model when the number of subsets is equal to the number of attributes.

In the second example, the effect of the K value in the K-NN algorithm on the generated BPAs is demonstrated. Let $s = 2$ in this case and take the set of attributes $(SL, SW)$ as an example; the BPA generation model constructed by the method in the section “Construct the model to generate BPA based on K-means method” is shown in Figure 6. Two test samples, $(4.7, 3.2)$ and $(4.9, 3.0)$, are selected to demonstrate the BPA generation process. For sample $(4.7, 3.2)$, when $K = 5$, four NNs belong to class “S” and one belongs to “$S \cap C$” (see the sub-figure on the left), and the final BPA is $m_{(4.7, 3.2)}^{5} ({S}) = 1$, where the superscript indicates $K = 5$. When $K = 10$, four NNs belong to class “S” and six to “$S \cap C$,” in which case the final BPA is $m_{(4.7, 3.2)}^{10} ({S, C}) = 1$. This case shows that different K values may generate different BPAs under otherwise identical conditions. For sample $(4.9, 3.0)$, when $K = 5$, all the NNs belong to “$S \cap C$” (see the sub-figure on the right), so the final BPA is $m_{(4.9, 3.0)}^{5} ({S, C}) = 1$. When $K = 10$, nine NNs belong to “$S \cap C$” and one belongs to class “S,” in which case the final BPA is again $m_{(4.9, 3.0)}^{10} ({S, C}) = 1$. This case shows that different K values may also generate the same BPA. The example indicates that the choice of K is crucial; it can be determined for a specific environment by a training-test procedure or from experience.

Figure 6.

The process of generating BPA with different K values.

An application for classification based on multi-sensor information fusion

In this section, a K-means and K-NN based classifier (KKC) is constructed based on the proposed method to determine BPAs. Real data sets from the UCI machine learning database (http://archive.ics.uci.edu/ml/datasets/) are employed to test the performance of the classifier, and the test results are compared with those of other classical classifiers to assess the effectiveness of KKC. The experiment is conducted using 10-fold cross-validation. The classifier KKC is constructed first, and then the experiment is described.

A classifier based on the proposed BPA generation method

A classifier called KKC is constructed in this section based on the proposed BPA generation method. Let the number of attributes of a sample S be $ε$; then $ε$ BPAs of S, denoted as $m_{1}, m_{2}, \dots, m_{ε}$, can be determined on the FOD $C = {C_{1}, \dots, C_{p}}$ by the constructed BPA generation model. The combined BPA of S is calculated with Dempster’s rule and denoted by $m = m_{1} \oplus m_{2} \oplus \dots \oplus m_{ε}$. The probability distribution of the combined BPA m is then calculated by the pignistic probability function in Definition 2. Finally, the class label of sample S is determined as $L_{S} = \arg\max_{C_{j} \in C} BetP_{m} (C_{j})$, that is, the class with the largest pignistic probability, and sample S is classified as $L_{S}$. The classifier KKC constructed in this way can be applied widely in multi-sensor information fusion.
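The decision step of KKC can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the pignistic transform is the standard one (each focal mass is split evenly among its elements, assuming no mass on the empty set), and the input BPAs are those of setosa test sample 1 in Table 2.

```python
from functools import reduce
from itertools import product

def dempster(m1, m2):
    """Dempster's rule for two BPAs whose focal elements are frozensets."""
    combined, conflict = {}, 0.0
    for (a, va), (b, vb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + va * vb
        else:
            conflict += va * vb
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {A: v / (1.0 - conflict) for A, v in combined.items()}

def pignistic(m):
    """Pignistic probability: split each focal mass evenly among its elements."""
    bet = {}
    for A, v in m.items():
        for c in A:
            bet[c] = bet.get(c, 0.0) + v / len(A)
    return bet

def kkc_label(attribute_bpas):
    """Fuse the attribute BPAs and return the class with the largest BetP."""
    fused = reduce(dempster, attribute_bpas)
    bet = pignistic(fused)
    return max(bet, key=bet.get), bet

S, SC = frozenset({"S"}), frozenset({"S", "C"})
# Attribute BPAs of setosa test sample 1, taken from Table 2.
sample = [{S: 2 / 3, SC: 1 / 3}, {S: 2 / 3, SC: 1 / 3}, {S: 1.0}, {S: 1.0}]
label, bet = kkc_label(sample)
print(label)  # S
```

For this sample the two petal attributes place all mass on {S}, so the fused BPA and the pignistic probability both concentrate on setosa.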

Experiment

The data sets employed in this experiment are all from the UCI machine learning database and are briefly described below. The Ionosphere data set, from the Johns Hopkins University Ionosphere database, was collected by a radar system in Goose Bay, Labrador. The system consists of 16 phased-array antennas with a total transmitting power of about 6.4 kW. The data set has two classes, “good” and “bad.” “Good” radar echoes are those that show some type of structure in the ionosphere, and “bad” echoes are those that show no such structure. This data set contains 351 samples with 34 attributes. The IRIS data set has been introduced in the section “The model to determine BPA based on K-means method.” The Heart data set, a heart disease database, records 13 attributes for 270 heart patients, which are used to determine whether a person has heart disease. The Wine data set records the results of a chemical analysis of three different wines produced in the same region of Italy; the content of 13 components is used to classify the wines. The Australian data set records information on credit approval in Australia; its attribute names and values have been replaced by meaningless symbols, and it contains 690 samples with 14 attributes. The Hepatitis data set records 19 attributes for 155 patients for hepatitis classification. The Connectionist Bench data set uses 60 attributes to classify sonar signals; a total of 208 samples are recorded, divided into two categories according to the collected sonar signals: metal cylinder reflection and rock cylinder reflection. These data sets cover the physical, life, and economics areas, which allows the effectiveness of KKC to be verified more comprehensively. The basic information is given in Table 5.

Table 5.

General information about the real data sets.

Data set	#Instance	#Class	#Attribute	AttrChar^a	Area
Ionosphere	351	2	34	R I	Physical
IRIS	150	3	4	R	Life
Heart	270	2	13	R C	Life
Wine	178	3	13	R	Physical
Australian	690	2	14	R I C	Economics
Hepatitis	155	2	19	R I C	Life
Connectionist Bench	208	2	60	R	Physical

^aThe suffix R is short for Real, I is short for Integer, and C is short for Categorical.

To assess the performance of the classifier proposed in this article, several classical classification algorithms are selected for comparison: Support Vector Machine (SVM), Decision Trees, Multi-layer Perceptron Classifier, Naive Bayes, SVM with Radial Basis Function (SVM-RBF), and RBF Network (RBFN). These algorithms serve only as comparison methods, so they are not described in detail here.

To carry out the comparative experiments, each data set is divided into a training set and a test set. The BPA generation models are constructed from the training sets using the method developed in the section “Construct the model to generate BPA based on K-means method.” For each sample in a test set, the BPAs of all the attributes are calculated by the methods presented in the sections “Generate BPAs based on the constructed model” and “The K-NN-based BPA generation method.” The classifier KKC is then applied to the attribute BPAs of the test samples, and every sample to be classified is assigned a class label. In this experiment, the parameters of KKC are set to $K = 8$ and $γ = 8$. If the predicted class is the same as the real class of the sample, the classification is considered accurate. The classification accuracy of a classifier is defined as the ratio of the number of accurately classified samples to the total number of samples tested. The experiment uses 10-fold cross-validation, a common method for testing the accuracy of classification algorithms. In this method, the original data set is divided into 10 parts; one part is selected as the test set and the other nine parts form the training set. The test set is then changed until each of the 10 parts has served as the test set once, so that a total of 10 experiments are conducted. The classification accuracy in each experiment is recorded, and the mean over the 10 experiments is taken as the measure of the accuracy of the classifier. The classification accuracy of the different classifiers on the different data sets is provided in Table 6.
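The 10-fold cross-validation procedure described above can be sketched as follows. The helper names are hypothetical, and the majority-label classifier is only a toy stand-in for KKC so that the example is runnable without any data files.

```python
import random

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle the sample indices and deal them into n_folds nearly equal parts."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]

def cross_validate(samples, labels, fit, predict, n_folds=10):
    """Mean accuracy over the folds; fit and predict are supplied by the caller."""
    folds = k_fold_indices(len(samples), n_folds)
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        model = fit([samples[i] for i in train_idx], [labels[i] for i in train_idx])
        hits = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies)

# Toy stand-in classifier (NOT KKC): always predicts the majority training label.
def fit_majority(xs, ys):
    return max(set(ys), key=ys.count)

def predict_majority(model, x):
    return model

xs = list(range(150))                            # placeholder feature values
ys = ["S"] * 50 + ["C"] * 50 + ["V"] * 50        # three balanced classes, as in IRIS
folds = k_fold_indices(len(xs))
acc = cross_validate(xs, ys, fit_majority, predict_majority)
```

Replacing `fit_majority` and `predict_majority` with functions that build the BPA generation model and apply the KKC decision rule would give the evaluation loop used for a table such as Table 6.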

Table 6.

Classification accuracy of different methods.

Data set	SVM	DT	MPC	NB	SVM-RBF	RBFN	KKC
Ionosphere	83.72	85.35	91.21	81.74	71.34	91.00	90.30
IRIS	91.43	90.54	92.03	91.53	90.23	92.70	92.43
Heart	81.41	75.26	74.59	82.65	81.54	81.29	82.94
Wine	90.44	90.43	92.05	93.13	73.94	90.72	92.30
Australian	82.35	85.21	92.64	75.84	81.59	80.34	90.21
Hepatitis	80.12	75.54	75.45	82.55	74.04	82.50	82.40
Connectionist Bench	75.28	70.56	80.18	69.12	72.14	72.60	80.23
Average	83.54	81.84	85.45	82.37	77.83	84.45	87.26

SVM: Support Vector Machine; DT: Decision Trees; MPC: Multi-layer Perceptron Classifier; NB: Naive Bayes; SVM-RBF: SVM with Radial Basis Function; RBFN: RBF Network; KKC: K-means and K-NN based classifier.
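As a quick arithmetic check of the “Average” row, the per-classifier means can be recomputed from the seven data-set rows of Table 6. The dictionary layout is just one convenient way to transcribe the table.

```python
# Accuracies from Table 6, in the row order Ionosphere, IRIS, Heart, Wine,
# Australian, Hepatitis, Connectionist Bench.
table6 = {
    "SVM":     [83.72, 91.43, 81.41, 90.44, 82.35, 80.12, 75.28],
    "DT":      [85.35, 90.54, 75.26, 90.43, 85.21, 75.54, 70.56],
    "MPC":     [91.21, 92.03, 74.59, 92.05, 92.64, 75.45, 80.18],
    "NB":      [81.74, 91.53, 82.65, 93.13, 75.84, 82.55, 69.12],
    "SVM-RBF": [71.34, 90.23, 81.54, 73.94, 81.59, 74.04, 72.14],
    "RBFN":    [91.00, 92.70, 81.29, 90.72, 80.34, 82.50, 72.60],
    "KKC":     [90.30, 92.43, 82.94, 92.30, 90.21, 82.40, 80.23],
}

averages = {name: round(sum(v) / len(v), 2) for name, v in table6.items()}
best = max(averages, key=averages.get)
print(best, averages[best])  # KKC 87.26
```

The recomputed means agree with the reported “Average” row, and KKC has the largest mean.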

As can be observed from Table 6, KKC is competitive with the compared classifiers on every data set and achieves the highest average accuracy (87.26, against 85.45 for the second-best method, MPC). It obtains the best accuracy on the Heart and Connectionist Bench data sets and stays within 2.5 points of the best method on the remaining five. A large number of practical applications can be cast as classification problems (e.g. decision making, medical diagnosis, and fault diagnosis), so the developed classifier can be readily employed where needed. The performance of KKC indicates that the constructed model to determine BPAs is effective.

Conclusions and future research

In this article, we tried to establish a model for determining BPA in DST. Such models, which employ two well-known machine learning algorithms, K-means and K-NN, can play an important part in multi-sensor information fusion. The implementation details of the presented method on real data sets are demonstrated with several numerical examples. The proposed approach is considered sufficiently general to be adapted to other situations that involve the BPA generation problem. To bring the BPA determination method to real environments, a classifier called KKC is constructed, which can be easily extended to other multi-source information fusion applications, such as evaluation, decision, and prediction. In the empirical part of our study, the classification accuracy of KKC is compared with that of other established classifiers on several real data sets. The experimental results indicate that KKC achieves the highest average classification accuracy among the compared algorithms, which can be attributed to the BPA generation method proposed in this article.

Note that some problems remain to be solved in future research. A few crucial points are summarized below. Using the model proposed in this article involves the determination of several parameters, namely K in K-means, K in K-NN, and the parameter $γ$. Practical solutions for determining these parameters should be provided in future studies. In addition, the classifier KKC can be enhanced by improving the fusion rule, for example by discounting the information source of each attribute or by amending Dempster’s combination rule.

In short, although the current version of the BPA generation model has some shortcomings, it is still an effective method to determine BPAs in DST. First, a model to generate BPAs is constructed based on the K-means method. Second, the BPAs of objects to be identified can be determined based on the constructed model. Third, the BPA generation method is improved by the K-NN algorithm. Moreover, an efficient classifier is built based on the proposed approaches. In future research, the theoretical framework of the BPA determination method could be further refined, and the presented methodology should be applied to a wider range of applications in multi-sensor information fusion.

Footnotes

Acknowledgements

The authors greatly appreciate the reviewers’ valuable comments and the editor’s encouragement.

Handling Editor: Miguel Ardid

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work is partially supported by National Natural Science Foundation of China (Grant Nos. 71472053, 71429001, 91646105).

ORCID iD

Liguo Fei

References

1. Hall, McMullen SA. Mathematical techniques in multisensor data fusion. Norwood, MA: Artech House, 2004.
2. Hall, Llinas. An introduction to multisensor data fusion. Proc IEEE 1997; 85(1): 6–23.
3. Dempster AP. Upper and lower probabilities induced by a multivalued mapping. Ann Math Stat 1967; 38: 325–339.
4. Shafer. A mathematical theory of evidence. Princeton, NJ: Princeton University Press, 1976.
5. Yager, Liu. Classic works of the Dempster-Shafer theory of belief functions, vol. 219. New York: Springer, 2008.
6. Fei, Wang, Chen, et al. A new vector valued similarity measure for intuitionistic fuzzy sets based on OWA operators. Iran J Fuzz Syst 2009; 16(3): 113–126.
7. Xiao, Ding. Divergence measure of Pythagorean fuzzy sets and its application in medical diagnosis. Appl Soft Comput 2019; 79: 254–267.
8. Xiao. A novel multi-criteria decision making method for assessing health-care waste treatment technologies based on D numbers. Eng Appl Artifi Intell 2018; 71: 216–225.
9. Fei, Feng, Liu, et al. On intuitionistic fuzzy decision-making using soft likelihood functions. Int J Intell Syst, https://www.researchgate.net/publication/331868935_On_interval-valued_fuzzy_decision-making_using_soft_likelihood_functions
10. Xiao. A hybrid fuzzy soft sets decision making method in medical diagnosis. IEEE Access 2018; 6: 25300–25312.
11. Xiao. Multi-sensor data fusion based on the belief divergence measure of evidences and the belief entropy. Inform Fus 2019; 46: 23–32.
12. Dong, Zhang, et al. Combination of evidential sensor reports with distance function and belief entropy in fault diagnosis. Int J Comput Commun Control 2019; 14(3): 293–307.
13. Song, Deng. A new method to measure the divergence in evidential sensor data fusion. Int J Distribut Sensor Netw 2019; 15(4), https://journals.sagepub.com/doi/10.1177/1550147719841295
14. Cui, Liu, Zhang, et al. An improved Deng entropy and its application in pattern recognition. IEEE Access 2019; 7: 18284–18292.
15. Xia, Feng, Liu, et al. An evidential reliability indicator-based fusion rule for Dempster-Shafer theory and its applications in classification. IEEE Access 2018; 6: 24912–24924.
16. Deng, Jiang. An evidential axiomatic design approach for decision making using the evaluation of belief structure satisfaction to uncertain target values. Int J Intell Syst 2018; 33(1): 15–32.
17. Fei, Deng. A new divergence measure for basic probability assignment and its applications in extremely uncertain environments. Int J Intell Syst 2019; 34(4): 584–600.
18. Xiao. A multiple-criteria decision-making method based on D numbers and belief entropy. Int J Fuzz Syst 2019; 205: 1–10.
19. Jiang. An evidential Markov decision making model. Inform Sci 2018; 467: 357–372.
20. Fei. On interval-valued fuzzy decision-making using soft likelihood functions. Int J Intell Syst 2019; 34(7): 1631–1652.
21. Chang, Xue, et al. Multiple criteria group decision making with belief distributions and distributed preference relations. Eur J Operat Res 2019; 273(2): 623–633.
22. Fei, Deng. DS-VIKOR: a new multi-criteria decision-making method for supplier selection. Int J Fuzz Syst 2019; 21(1): 157–175.
23. Gao, Deng. The negation of basic probability assignment. IEEE Access 2019; 7(1): 101109.
24. Kang, Zhang, Gao, et al. Environmental assessment under uncertainty using Dempster-Shafer theory and z-numbers. J Amb Intell Human Comput 2019, https://doi.org/10.1007/s12652-019-01228-y
25. Deng, Jiang, Wang. Zero-sum polymatrix games with link uncertainty: a Dempster-Shafer theory solution. Appl Math Comput 2019; 340: 101–112.
26. Deng, Jiang. D number theory based game-theoretic framework in adversarial decision making under a fuzzy environment. Int J Approx Reason 2019; 106: 194–213.
27. Jiang. A correlation coefficient for belief functions. Int J Approx Reason 2018; 103: 94–106.
28. Deng. Dependent evidence combination based on DEMATEL method. Int J Intell Syst 2019, https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8218753
29. Deng, et al. A new method to determine basic probability assignment from training data. Knowledge Based Syst 2013; 46: 69–80.
30. Mahadevan, et al. A non-parametric method to determine basic probability assignment for classification problems. Appl Intell 2014; 41(3): 681–693.
31. Zhang, Chan, et al. A new method to determine basic probability assignment using core samples. Knowledge Based Syst 2014; 69: 140–149.
32. Xiao. An improved method to transform triangular fuzzy number into basic belief assignment in evidence theory. IEEE Access 2019; 7: 25308–25322.
33. Jiang, Yang, Luo, et al. Determining basic probability assignment based on the improved similarity measures of generalized fuzzy numbers. Int J Comput Commun Control 2015; 10(3): 333–347.
34. Jiang, Zhan, Zhou, et al. A method to determine generalized basic probability assignment in the open world. Math Prob Eng 2016; 2016: 3878634.
35. Suh, Yook. A method to determine basic probability assignment in context awareness of a moving object. Int J Distribut Sensor Netw 2013; 9(8): 972641.
36. Sun, Deng. A new method to identify incomplete frame of discernment in evidence theory. IEEE Access 2019; 7(1): 15547–15555.
37. Sun, Deng. A new method to determine generalized basic probability assignment in the open world. IEEE Access 2019; 7: 52827–52835.
38. Zhang, Deng. A method to determine basic probability assignment in the open world and its application in data fusion and classification. Appl Intell 2017; 46(4): 934–951.
39. Qin, Xiao. An improved method to determine basic probability assignment with interval number and its application in classification. Int J Distribut Sensor Netw 2019; 15(1), https://journals.sagepub.com/doi/full/10.1177/1550147718820524
40. Deng, Liu, Deng, et al. An improved method to construct basic probability assignment based on the confusion matrix for classification problem. Inform Sci 2016; 340: 250–261.
41. Guan. Study on algorithms of determining basic probability assignment function in Dempster-Shafer evidence theory. In: Proceedings of the 2008 international conference on machine learning and cybernetics, vol. 1, Kunming, China, 12–15 July 2008, pp. 121–126. New York: IEEE.
42. Cui, Bicheng. Determination of basic probability assignment based on cloud model and application. J Data Acquisit Process 2015; 30(6): 1318–1324.
43. Chang, Zhou, Chen, et al. Akaike information criterion-based conjunctive belief rule base learning for complex system modeling. Knowledge Based Syst 2018; 161: 47–64.
44. Fan, Song, Lei, et al. Evidence reasoning for temporal uncertain information based on relative reliability evaluation. Exp Syst Appl 2018; 113: 264–276.
45. Zhou, Liu, et al. Fault-alarm-threshold optimization method based on interval evidence reasoning. Sci China Inform Sci 2019; 62(8): 89202.
46. Han, Dezert, Yang. Belief interval-based distance measures in the theory of belief functions. IEEE Trans Syst Man Cybernet Syst 2018; 48(6): 833–850.
47. Huang, Yang, Jiang. Uncertainty measurement with belief entropy on the interference effect in the quantum-like Bayesian networks. Appl Math Comput 2019; 347: 417–428.
48. Gao, Deng. The generalization negation of probability distribution and its application in target recognition based on sensor fusion. Int J Distribut Sensor Netw 2019; 15, https://journals.sagepub.com/doi/full/10.1177/1550147719849381
49. Xiao, Qin. A weighted combination method for conflicting evidence in multi-sensor data fusion. Sensors 2018; 18(5): 1487.
50. Deng. Analyzing the monotonicity of belief interval based uncertainty measures in belief function theory. Int J Intell Syst 2018; 33(9): 1869–1879.
51. Chang, Zhou, Liao, et al. Generic disjunctive belief rule base modeling, inferencing and optimization. IEEE Trans Fuzz Syst 2019, https://ieeexplore.ieee.org/document/8606961
52. Jiang. An evidential dynamical model to predict the interference effect of categorization on decision making. Knowledge Based Syst 2018; 150: 139–149.
53. Zhou, Liu X-B, Chen Y-W, et al. Evidential reasoning rule for MADM with both weights and reliabilities in group decision making. Knowledge Based Syst 2018; 143: 142–161.
54. Deng, Jiang. Dependence assessment in human reliability analysis using an evidential network approach extended by belief rules and uncertainty measures. Ann Nuclear Energy 2018; 117: 183–193.
55. Wang, Qiao, Zhang. Trust evaluation based on evidence theory in online social networks. Int J Distribut Sensor Netw 2018; 14(10), https://journals.sagepub.com/doi/full/10.1177/1550147718794629
56. Smets. Decision making in the TBM: the necessity of the pignistic transformation. Int J Approx Reason 2005; 38(2): 133–147.
57. Hartigan, Wong MA. A k-means clustering algorithm. J Roy Stat Soc Series C 1979; 28(1): 100–108.
58. Hellman ME. The nearest neighbor classification rule with a reject option. IEEE Trans Syst Sci Cybernet 1970; 6(3): 179–185.
59. Jiang. An improved soft likelihood function for Dempster-Shafer belief structures. Int J Intell Syst 2018; 33(6): 1264–1282.
60. Xia, Feng, Liu, et al. On entropy function and reliability indicator for D numbers. Appl Intell 2019, https://link.springer.com/article/10.1007/s10489-019-01442-3
61. Liu, Pan, Dezert, et al. Classifier fusion with contextual reliability evaluation. IEEE Trans Cybernet 2018; 48(5): 1605–1618.
62. Murphy CK. Combining belief functions when evidence conflicts. Decis Supp Syst 2000; 29(1): 1–9.

A novel method to determine basic probability assignment in Dempster–Shafer theory and its application in multi-sensor information fusion
