Hierarchical Aggregation of Uncertain Sensor Data for M2M Wireless Sensor Network Using Reinforcement Learning

Abstract

Communication among heterogeneous embedded devices can lead to correctness problems in an M2M environment. The data are sometimes hard to classify because the devices may provide wrong or uncertain information. The data from these devices should be gathered safely, efficiently, and correctly without the help of a server or human intervention, yet even the low-level information from each device causes interoperability problems. This data gathering or data fusion process is very important because a wrong data mapping result could be interpreted as a totally different situation and hence cause a different reaction, feedback, and control. In this paper, we propose a hierarchical aggregation for uncertain sensor data using reinforcement learning to obtain correct and efficient data gathering results for a reliable wireless sensor network. In our proposal, we add a new category for uncertain data and classify them through reinforcement learning using hierarchical subcategories. By adopting our proposed aggregation, false classification caused by uncertain data can be decreased and the correctness of data gathering can be enhanced.

1. Introduction

In a wireless sensor network, many sensor nodes are deployed in the field; they sense data and send them to the sink node. Sensor network technology has been utilized for military applications, environment monitoring, healthcare, agriculture, weather monitoring, home automation, vehicular automation, and so on. The application area of this machine-to-machine (M2M) communication will be further extended to various industries. To decrease the communication and computation overhead of dealing with the sensed data, we need aggregators or clusterheads which gather the data from each sensor node and aggregate them before sending the aggregated data to the server system. This process, which is called aggregation or data fusion, combines data from multiple sources and gathers that information to make inferences. With this process, we can decrease the amount of redundant data, filter out wrongly sensed results, which include false positives and false negatives, and increase the accuracy of the result through classification of the sensed data.

Several sensed data aggregation mechanisms have been proposed considering the efficiency and correctness of the result. However, they have disadvantages when some data are not easy to classify because they are located somewhere between the two groups and show uncertainty as in Figure 1. A linear function can be applied to data as in Figure 1(a). In some cases, we need a nonlinear function as in Figure 1(b) or a delimitation function to detect boundaries for uncertain data as in Figure 1(c).

Figure 1

Data classification problem with data uncertainty.

In this work, we propose an efficient data classification mechanism which can determine the uncertain data correctly based on the risk levels of the system purpose. Some sensed data streams and environment information can easily be classified into one of the groups, while others are hard to categorize because of their subtle and slight differences. Our proposed method assigns the uncertain data to the proper class through the reinforcement learning process together with class hierarchy and relevance information. The automatic detection of the uncertain area and its organization method are presented in our previous work [1]. In a ubiquitous environment, we need several preprocessing modules, namely, an acquisition module and a context management module, for service [2]. Based on this previous process, our proposal can be adopted as an aggregation module at the last phase of the processing modules for a mobile device. With our proposal, sensed data and environment information can be classified in a more accurate and efficient way so that the server system can analyze the status correctly and cope with the situation properly.

This paper is organized as follows. In Section 2, we present related work about general aggregation models and data classification mechanisms. In Section 3, we explain our proposed aggregation method based on hierarchical classification and reinforcement learning. Section 4 describes simulation results and analysis. We conclude the paper and describe our future work in Section 5.

2. Related Work

In a ubiquitous environment, the types of information, their complexity, and their similarity have increased. Even if similar features and common values are extracted from each datum, some data are hard to classify because it is uncertain to which group they should be assigned, as in Figure 1. Several studies dealing with this problem have been carried out.

Su et al. proposed a hierarchical aggregate classification mechanism in a tree structure where each sensor node locally performs cluster analysis and forwards only its decision to the parent node [3]. The decisions are aggregated along the tree, and eventually the global agreement is achieved at the sink node. Kunal and Mohan presented a distributed reinforcement learning (DReL) middleware that provides adaptive WSN management by applying techniques from reinforcement learning and utility theory and by exploiting a two-tier learning scheme [4]. Mal-Sarkar et al. proposed a soft computing approach to manage uncertainty by reasoning over inconsistent, incomplete, and fragmentary information using classical rough set and dominance-based rough set theories [5]. Yu et al. presented a comprehensive framework for managing continuously changing data objects with insights into the spatiotemporal uncertainty problem and presented an original parallel-processing solution for managing the uncertainty using the map-reduce platform of cloud computing [6]. Zhang et al. proposed a trust based framework, which is rooted in statistics and some other coupled techniques. The trustworthiness of each individual sensor node is evaluated by using an information theoretic concept, the Kullback-Leibler (KL) distance, to identify the compromised nodes through an unsupervised learning algorithm [7]. Ou et al. presented a reinforcement learning approach to multisensor fusion problems in which mapping multiple raw streams of sensory data to the appropriate actions is not easy, especially when multiple conflicting objectives are involved [8]. Savić and Limbourg proposed a reliable aggregation of sensor data for safety related systems, applying the Dempster-Shafer theory to combine multiple unreliable and uncertain knowledge sources [9]. Fong et al. presented a stream-based classification to handle continuous data streams, which are unbounded and unstructured, and simulated it on the real-time analysis of biological signals such as diagnostic tests [10]. In [11], Hońko proposed a framework for generating classification rules from relational data. The framework was intended for mining relational data and was defined in granular computing theory. Pedrycz and Bargiela designed granular prototypes that reflect the structure of data to a higher extent than the representation provided by their numeric counterparts [12]. The design was formulated as an optimization problem guided by the coverage criterion, meaning that it maximizes the number of data for which the granular realization includes the original data.

3. Hierarchical Classification and Reinforcement Learning for Data Uncertainty

In this work, our goal is to increase reasonability while minimizing predictive error by combining a class hierarchy with reinforcement learning.

Figure 2 shows the elements of reinforcement learning and how an action influences the next state in the transition model. Reward $r_{i}$ is the immediate value of a state-action transition, the policy maps states to actions $a_{i}$ , and the agent can use a model to predict how the environment will respond to its actions. Given a state and an action, the model produces a prediction of the resultant next state and next reward. If the model is stochastic, there are several possible next states and next rewards, each with some probability of occurring [13].

Figure 2

Elements of reinforcement learning.
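The stochastic transition model described above can be illustrated with a minimal sketch. The states, actions, probabilities, and rewards below are hypothetical values chosen only for illustration; they are not taken from our system.

```python
import random

# Hypothetical model: (state, action) -> list of (probability, next_state, reward).
TRANSITIONS = {
    ("normal", "wait"): [(0.9, "normal", 0.0), (0.1, "alert", -1.0)],
    ("normal", "report"): [(1.0, "normal", -0.1)],
    ("alert", "wait"): [(0.5, "alert", -1.0), (0.5, "normal", 0.0)],
    ("alert", "report"): [(1.0, "normal", 1.0)],
}


def step(state, action, rng):
    """Sample one (next_state, reward) pair from the stochastic model."""
    outcomes = TRANSITIONS[(state, action)]
    weights = [prob for prob, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


def expected_reward(state, action):
    """Probability-weighted immediate reward of a state-action pair."""
    return sum(prob * reward for prob, _, reward in TRANSITIONS[(state, action)])


if __name__ == "__main__":
    rng = random.Random(0)
    print(step("alert", "report", rng))
    print(expected_reward("normal", "wait"))
```

A deterministic model is the special case in which every state-action pair has a single outcome with probability 1.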

Most classification systems assume that all classes are at one flat level and that each document is labeled with one class. Some studies have examined hierarchically structured classes, where the classes are reorganized into a hierarchy to increase specificity (Figure 3). Although a document assigned to a child class is automatically considered as belonging to the parent class, a document is not allowed to belong to more than one class unless the classes lie on the same generalization path in the hierarchy. Hierarchical classification can achieve much better performance than flat classification; however, it is known to be complex with respect to class organization, and the cost of the combining functions for class assignment is also high. In general, the top-down method is known to be superior to the bottom-up method. However, the bottom-up method is more flexible because updating existing rules only requires adjusting the combining functions, without an entire retraining process.

Figure 3

Comparison of classifications depending on hierarchy: solid line represents the need to apply a classification algorithm.
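As a rough illustration of the bottom-up idea, the following sketch scores the subclasses first and then combines the scores into parent-class scores. The hierarchy, the scores, and the choice of summation as the combining function are assumptions made for this example, not the combining function used in our system.

```python
# Hypothetical hierarchy: subclass name -> parent target class.
PARENT = {"c11": "c1", "c12": "c1", "c21": "c2", "c22": "c2"}


def combine_bottom_up(sub_scores, parent=PARENT):
    """Sum subclass scores into parent-class scores (one possible combining function)."""
    totals = {}
    for sub, score in sub_scores.items():
        target = parent[sub]
        totals[target] = totals.get(target, 0.0) + score
    return totals


def assign(sub_scores, parent=PARENT):
    """Return the target class with the highest combined score."""
    totals = combine_bottom_up(sub_scores, parent)
    return max(totals, key=totals.get)


if __name__ == "__main__":
    scores = {"c11": 0.3, "c12": 0.3, "c21": 0.4, "c22": 0.0}
    # A flat classifier would pick c21 (highest single score);
    # the combined scores favour the parent class c1.
    print(assign(scores))
```

Only `combine_bottom_up` has to change when the combining rule is updated, which reflects the flexibility of the bottom-up method noted above.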

Figure 4 shows the overview of our proposed classification method based on hierarchical classification.

Figure 4

Aggregation method using hierarchical classification.

The training set construction for each target category is based on our previous work, which proposed the automatic finding of the uncertain boundary area X through clustering algorithms [2]. In our system, uncertain information collected from the sensor data stream and the current status is reorganized into subcategories based on class relevance and hierarchy. This information is used in the learning process, so that future data can be reassigned more correctly by a bottom-up hierarchical classification. New input data are also classified more accurately as they progress through multiple stages in the manner of reinforcement learning, and in the final stage they are decided under a predefined condition or action. After the aggregated data are sent to the server, the result is sent back to the learning process. The processes are described in more detail below.

3.1. Category Design and Definition

In Figure 1, the decision boundary is not a line but a region, and the data in this boundary region could be predicted as false positives. Thus we separate the training set into target and intermediate categories as in Figure 1(c). Figure 4 shows the categorization scheme for hierarchical training and the assignment rules of the reinforcement approach. For hierarchical training, we add category X in addition to the target category C. The set of target categories C consists of the predefined categories given by the user for the classification task, and data are ultimately classified into this set. The set of subcategories $sc_{i}$ divides a target category $c_{i}$ into relatively small classes, and its subcategories $c_{i j}$ are disjoint from one another. The set of intermediate categories X is for uncertain data. We need to analyze the relevance among target classes to categorize the data in the class X, and the unclassified data categorized in X need to be assigned to target classes later. Each category is defined as follows.

3.1.1. A Set of Target Categories

A set of target categories, $C = \{c_{1}, c_{2}, \dots, c_{n}\}$ , is a set of predefined categories given by the user for the classification task. Also, $Tr {C}$ is a set of training documents for the set of target categories, C. They are constructed from documents representing each $c_{i}$ .

3.1.2. A Set of Boundary Categories

A set of boundary categories, $X = \{x_{1}, x_{2}, \dots, x_{m}\}$ , is a conceptual region representing the uncertain boundary area that separates the target classes. It is used to analyze the relevance among target classes. The data located around the decision boundary belong to X. Figure 1(b) represents an area of confusion between $c_{i}$ and $c_{j}$ ; this area becomes meaningless when the two classes share few common features and the correlation between them is low.

3.1.3. A Set of Subclasses

A set of subclasses, $sc_{i} = \{c_{i 1}, c_{i 2}, \dots, c_{i j}\}$ , is a set of relatively small classes derived from a target category $c_{i}$ , where the subclasses $c_{i j}$ are disjoint from one another. The entire set of subclasses for the set of target categories, $SC$ , is $SC = sc_{1} \cup sc_{2} \cup \dots \cup sc_{n}$ .
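The definitions above can be made concrete with a small data-structure sketch. The class names and training items are hypothetical; the code only checks the disjointness condition on the subclasses of each target category and forms the union $SC$.

```python
from itertools import combinations


def subclasses_are_disjoint(sc):
    """Check that the subclasses of every target category share no items.

    sc maps a target class c_i to a dict {subclass c_ij: set of training items}.
    """
    for subclasses in sc.values():
        for items_a, items_b in combinations(subclasses.values(), 2):
            if items_a & items_b:
                return False
    return True


def all_subclasses(sc):
    """SC: the union of the subclass sets sc_i over all target categories."""
    return {name for subclasses in sc.values() for name in subclasses}


if __name__ == "__main__":
    # Hypothetical target categories c1, c2 with two subclasses each.
    sc = {
        "c1": {"c11": {1, 2}, "c12": {3}},
        "c2": {"c21": {4}, "c22": {5, 6}},
    }
    print(subclasses_are_disjoint(sc))
    print(sorted(all_subclasses(sc)))
```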

3.2. Reinforcement Learning and Classification for Data Uncertainty

In the aggregation module, sensed data are first checked for authenticity and then gathered according to the environment information representing the current status. After being gathered, they are aggregated and finally transmitted to the server system. Uncertain information in the data is assigned to the proper categories, and feedback rules are then derived as a guideline for the previous stages in a reinforcement manner.

Generally, a traditional classifier makes a list of candidate categories for the given data and simply assigns the data to the first candidate in the list, even though there may be only subtle differences among the candidates. This approach is prone to misclassification errors in an area of uncertain data. Figure 4(b) shows how the computation is adapted in the assigning process for uncertain data.

In stage 1, to classify the data into low-layer categories, we use a well-trained training dataset obtained by the reinforcement learning scheme. The bottom-up hierarchical classification based on this learning scheme yields the first candidate lists. These lists are sent to the next stage and compared using the candidates' scores given by the classifier. In stage 2, we define a simple rule with parameter values as thresholds to filter the uncertain data. When the data score satisfies the condition rule, including the parameter values, the data are classified into the final target category. When there are two categories, max_support and min_confidence represent the threshold scores to be considered for target categorization.

The time period in the learning policy is decided by experts. The constrained-decision is also set by experts for training; however, it can be computed dynamically depending on the environment information collected by the clusters as rounds are repeated. In the constrained-decision matrix, $D = \{D_{level_0}, D_{level_1}, \dots, D_{level_n}\}$ , level_i is the signal level to be monitored, and $D_{level_i} = [D (t, t + τ), w]$ is the pair of the data stream pattern in time interval τ and the related weight.

Figure 5 shows the learning result of modifying the sensed values with the weights applied. To compare the original data stream with the learning result obtained with our proposal, we performed some computations using basic data. We set the learning time to 5, 10, 15, 20, 25, 30, and 35 and show the difference between the learning results. $T_{1}$ shows that the current status is negative and $T_{3}$ shows that it is positive, whereas $T_{2}$ is not clear. The solid line shows the original values sensed by the nodes, and these values are adjusted to the dotted line according to $T^{'} (t, t + τ)$ in Algorithm 1 to obtain a better decision on the situation. Our proposal makes it clear whether the result is positive or negative. By adjusting the data, we can make more accurate decisions.

Algorithm 1: Finding optimal parameters using reinforcement learning.

Reinforcement Learning Phase:

(i) Goal: finding optimal parameters for signal boundaries, where the number of signal levels is n

(ii) Input: data stream of a sensor node gathered by learning policy P. P is composed of a learning schedule, i.e., the interval of learning time, and the constrained-decision matrix $D = \{D_{level_0}, D_{level_1}, \dots, D_{level_n}\}$ .

Stage 1:

(a) Learning Process: finding optimal parameters, max_support and min_confidence

We perform reinforcement learning with schedule S and decision D given by experts.

(1) making learning data sets from sensor data streams segmented into different classes and time intervals between t and $t + τ$ , $T (t, t + τ)$ , where $0 < t < \infty$

(2) analyzing segmented sensor data to be covered or diverged by the constrained-decision, $T^{'} (t, t + τ)$ , with the weight applied: $T^{'} (t, t + τ) = T (t, t + τ) + w$ , where w is the weight parameter given by the expert constraint in equation (1)

(3) aggregating values in segments: $A_{level_i} = | \min (T (t, t + τ)) - \max (T^{'} (t, t + τ)) | + σ (T (t, t + τ))$ , $level_i \in \{0, 1, 2\}$

(4) computing threshold parameters for setting the boundary region by equation (2): parameter(0) means max_support, the upper_limit value in signal level_0; parameter(1) means min_confidence, the lower_limit value in signal level_2. These approximate factors, max_support and min_confidence, are used in the next stage for hierarchical classification.

(b) Classifying Process: applying the category scheme segmented by the optimal parameters to sensor data

Figure 5

The effect of aggregation scheme by reinforcement learning (x-axis: time, y-axis: data values).

In the constrained-decision matrix $D = \{D_{level_0}, D_{level_1}, \dots, D_{level_n}\}$ , level_i is the signal level of the data to be monitored. $D_{level_i}$ is denoted by $[D (t, t + τ), w]$ , which is a set of pairs of a data pattern and the weight value corresponding to that pattern in time interval τ. The constrained-decision is set by experts at the initial stage and can be computed dynamically according to the aggregation result by aggregators or clusterheads:

$$w \begin{cases} < 1, & \text{if signal\_level} = 0 \\ = 1, & \text{if signal\_level} = 1 \\ > 1, & \text{if signal\_level} = 2 \end{cases} \quad \text{in \# of signal\_level} = 2, \qquad (1)$$

$$[\text{parameters}] = \operatorname{average} (A_{level_i} + A_{level_{i + 1}}) . \qquad (2)$$
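The learning steps of Algorithm 1 together with equations (1) and (2) can be sketched as follows. This is an illustrative reading, not the implementation used in our simulation: we assume one data segment per signal level, take σ as the population standard deviation, and read the average in equation (2) as the mean of two adjacent aggregated values. The function names and sample segments are hypothetical.

```python
import statistics


def aggregate_level(segment, weight):
    """A_level = |min(T) - max(T')| + sigma(T), with T' = T + w."""
    adjusted = [value + weight for value in segment]  # T'(t, t+tau) = T(t, t+tau) + w
    spread = abs(min(segment) - max(adjusted))
    # Assumption: sigma is the population standard deviation of the segment.
    return spread + statistics.pstdev(segment)


def threshold_parameters(segments, weights):
    """Equation (2), read as the mean of adjacent A_level values."""
    a = [aggregate_level(seg, w) for seg, w in zip(segments, weights)]
    return [(a[i] + a[i + 1]) / 2 for i in range(len(a) - 1)]


if __name__ == "__main__":
    # Hypothetical segments for signal levels 0, 1, 2 and weights obeying
    # equation (1): w < 1, w = 1, w > 1.
    segments = [[1, 1, 1], [3, 3, 3], [6, 6, 6]]
    weights = [0.3, 1.0, 1.6]
    p0, p1 = threshold_parameters(segments, weights)
    print(p0, p1)  # parameter(0) = max_support, parameter(1) = min_confidence
```

With three signal levels the function returns two values, matching the two parameters max_support and min_confidence of Algorithm 1.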

In the aggregation phase, we determine the values of certain data streams based on the parameters and aggregate the values of uncertain data in the intermediate segment (Algorithm 2). When the number of signal levels is 2, we get two parameters, $p_{0}$ and $p_{1}$ : $p_{0}$ is the upper_limit value in signal level_0 and $p_{1}$ is the lower_limit value in signal level_2. We finally send the aggregated data to the sink, and the feedback of these results comes back to the learning phase. The classified results analyzed through each phase serve as informative learning cases for uncertain data and are used for further analysis.

Algorithm 2: Assignment process in aggregation module.

Hierarchical Aggregation Phase:

Goal: send proper aggregated values for certain and uncertain data streams

Input: data stream of each cluster i, interval of sensing time τ, and parameters [p]

Stage 2 (Aggregation Process): aggregation of the data stream represented as each signal level_i, $i = 0, 1, 2$

We determine the values of certain data streams based on the parameters $p_{0}$ = max_support and $p_{1}$ = min_confidence and aggregate the values of uncertain data in the intermediate segment.

for $i = 0$ to ∞, $i$ += τ {

if (average ( $T (i, i + τ)$ ) < $p_{0}$ or average ( $T (i, i + τ)$ ) > $p_{1}$ ) {

$A_{i} = | \min (T (i, i + τ)) - \max (T (i, i + τ)) | + σ (T (i, i + τ))$

} else {

$A_{i} = argmax (T (i, i + τ))$

}

}

Stage 3 (Transmission Process): sending the set of aggregated data, A, to the sink and the feedback, with the classification result, to the learning process in Stage 1. The result provides new learning cases for uncertain data.
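Stage 2 of Algorithm 2 can be sketched in Python as below. The sketch follows the pseudocode literally, with three assumptions of ours: the stream is finite and is cut into non-overlapping windows of length τ, σ is the population standard deviation, and argmax is read as the largest value in the window. The function name and the sample stream are hypothetical.

```python
import statistics


def aggregate_stream(stream, tau, p0, p1):
    """Aggregate a data stream window by window using thresholds p0 and p1."""
    aggregated = []
    for i in range(0, len(stream) - tau + 1, tau):
        window = stream[i:i + tau]
        avg = statistics.fmean(window)
        if avg < p0 or avg > p1:
            # Certain data: the window lies clearly below p0 or above p1.
            value = abs(min(window) - max(window)) + statistics.pstdev(window)
        else:
            # Uncertain data in the intermediate segment.
            # Assumption: argmax is interpreted as the maximum value.
            value = max(window)
        aggregated.append(value)
    return aggregated


if __name__ == "__main__":
    stream = [1, 1, 1, 1, 4, 5, 4, 5, 8, 8, 8, 8]
    print(aggregate_stream(stream, tau=4, p0=2, p1=6))
```

In Stage 3 the returned list would be sent to the sink, and the classification result would be fed back to the learning process of Stage 1.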

4. Simulation and Analysis

4.1. Simulation Environment

In this section, we describe our simulation and analyze the result. We used MATLAB for our simulation. Figure 6 shows the experiment environment for earthquake sensing. We divided the area into six sections, each forming a cluster with many sensor nodes and one clusterhead. Sensor nodes sense the vibrations of the earth and send the data to their clusterhead, and the clusterheads aggregate the data before sending the result to the server system or base station. We assume that there is an earthquake in time slot 3.

Figure 6

Simulation environment for sensing earthquake.

In the experiment, we used 0.3 and 1.6 as the weight values for the reinforcement learning phase. We then obtained a set of threshold parameters for each location. The upper_limit value, $p_{0}$ , and the lower_limit value, $p_{1}$ , are given by the learning results. Sensing values are between 0 and 8; a value greater than or equal to 4 could indicate an earthquake, while 7 or 8 means a strong earthquake. We assume the tremor lasts more than 10 seconds. Sensor nodes sense the earth every second and clusterheads aggregate data every 20 seconds.
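The settings above can be summarized in a short sketch. It is not the MATLAB code of our simulation; the function names and labels are hypothetical, and only the value range, the level boundaries, and the 20-second aggregation period are taken from the description.

```python
def label_reading(value):
    """Map a sensed value in [0, 8] to a coarse level."""
    if not 0 <= value <= 8:
        raise ValueError("sensing values are between 0 and 8")
    if value >= 7:
        return "strong earthquake"
    if value >= 4:
        return "possible earthquake"
    return "no earthquake"


def aggregation_windows(readings, period=20):
    """Group one-reading-per-second data into windows of `period` seconds."""
    return [readings[i:i + period] for i in range(0, len(readings), period)]


if __name__ == "__main__":
    readings = [1] * 40 + [5] * 12 + [1] * 48  # hypothetical 100-second stream
    windows = aggregation_windows(readings)
    print(len(windows))
    print(label_reading(5))
```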

4.2. Simulation Result and Analysis

We compared our proposal with other methods on sample data streams. As shown in Figure 7, our proposal gives a more informative and correct result: it shows a higher sensing value when there is a shake in location 1 in time slot 3 and lower values in the other time slots.

Figure 7

Comparison of earthquake sensing results.

These data streams can give unclear results and are prone to analytical error. In our proposal, when the data delivered from the field to the clusterhead are uncertain, we categorize the data to classify them more correctly based on the threshold parameters given by hierarchical categorization and reinforcement learning. The result can become more accurate as more learning is carried out.

Figure 8 shows the aggregation result from six locations over 100 unit times. The solid lines are the aggregated values and the dotted lines are the values with our proposal adopted. The dotted lines show that aggregated values that differ only slightly in the solid lines become distinct enough to be categorized into different classes.

Figure 8

Simulation result of our aggregation process (unit time = 100).

5. Conclusion and Future Work

In a sensor network composed of many sensor nodes and a server system, the intermediate devices between these two elements need to communicate with one another to send and aggregate the sensed data before the data are delivered to the server system. As complex applications in specific domains increase, the management of data uncertainty is required, and it needs to be dealt with in the aggregation process.

In this work, we have proposed a hierarchical aggregation mechanism for classifying uncertain information from sensor data streams and environment information based on reinforcement learning, in order to obtain correct and efficient data gathering results for a reliable wireless sensor network. Unlike traditional aggregation mechanisms, our proposal gives feedback to the previous learning phase after sending the aggregated information to the server system. With this feedback, the data are reorganized and retrained, which helps the classification of uncertain data through the bottom-up hierarchical classification in a reinforcement manner. It also reduces the human effort needed to identify complex information.

Our proposal can be adopted in various application systems that involve uncertainty, and it can increase the reliability and robustness of the system. Our future work is to simulate the mechanism in multiple vector spaces such as physical sensor streams and context information of time and locations. We will show that our work can improve the efficiency of data processing and management.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science, and Technology (no. 2012R1A1A3019459).

References

1. Choi Y. J., J. G., Park S. S. Construction scheme of training data using automated exploring. Journal of Korea Information Processing Systems B. 2009;16(6):1479-1488.

2. Choi Y. J., Doh. Security based semantic context awareness system for M2M ubiquitous healthcare service. Ubiquitous Information Technologies and Applications. Lecture Notes in Electrical Engineering. 2013;214:187-196.

3. Su, Gao, Yang, Abdelzaher T. F., Ding, Han. Hierarchical aggregate classification with limited supervision for data reduction in wireless sensor networks. Proceedings of the 9th ACM Conference on Embedded Networked Sensor Systems (SenSys '11); November 2011; pp. 40-53. Scopus 2-s2.0-83455176305. doi: 10.1145/2070942.2070948.

4. Kunal, Mohan. DReL: a middleware for wireless sensor networks management using reinforcement learning techniques. Proceedings of the 5th International Workshop on Middleware Tools, Services and Run-Time Support for Sensor Networks (MidSense '10); 2010; pp. 1-7. doi: 10.1145/1890784.1890786.

5. Mal-Sarkar, Sikder I. U., Chansu, Konangi V. K. Uncertainty-aware wireless sensor networks. International Journal of Mobile Communications. 2009;7(3):330-345. Scopus 2-s2.0-62449290581. doi: 10.1504/IJMC.2009.023675.

6. Yu, Sen, Jeong D. H. An integrated framework for managing sensor data uncertainty using cloud computing. Information Systems. 2013;38(8):1252. doi: 10.1016/j.is.2011.12.003. Scopus 2-s2.0-84856250245.

7. Zhang, Das S. K., Liu. A trust based framework for secure data aggregation in wireless sensor networks. Proceedings of the 3rd Annual IEEE Communications Society on Sensor and Ad hoc Communications and Networks (Secon '06); September 2006; pp. 60-69. Scopus 2-s2.0-43849098907. doi: 10.1109/SAHCN.2006.288409.

8. Ou, Fagg A. H., Shenoy, Chen. Application of reinforcement learning in multisensor fusion problems with conflicting control objectives. Intelligent Automation and Soft Computing. 2009;15(2):223-235. Scopus 2-s2.0-63849192802.

9. Savić, Limbourg. Aggregating uncertain sensor information for safety related systems. Proceedings of the European Safety and Reliability Conference (ESREL '06); September 2006; pp. 1909-1913. Scopus 2-s2.0-51649118838.

10. Fong, Hang, Mohammed, Fiaidhi. Stream-based biomedical classification algorithms for analyzing biosignals. Journal of Information Processing Systems. 2011;7(2):717-732.

11. Hońko. Granular computing for relational data classification. Journal of Intelligent Information Systems. 2013;4(2):187-210.

12. Pedrycz, Bargiela. An optimization of allocation of information granularity in the interpretation of data structures: toward granular fuzzy clustering. IEEE Transactions on Systems, Man, and Cybernetics. 2012;42(3):582-590. Scopus 2-s2.0-81555230323. doi: 10.1109/TSMCB.2011.2170067.

13. Dayan, Balleine B. W. Reward, motivation, and reinforcement learning. Neuron. 2002;36(2):285-298. Scopus 2-s2.0-0037057808. doi: 10.1016/S0896-6273(02)00963-7.