Abstract
Wireless sensor network consists of a large number of resource-constrained sensor nodes and is usually deployed in unattended area to collect specific information. Energy consumption is always a major concern in the research field of wireless sensor network. Thus, data aggregation schemes emerged and were deployed for prolonging network lifetime by reducing data transmitted within the network. Meanwhile, along with the wide application of the data aggregation schemes, the security issues of data aggregation have been increasingly drawing attention and the designing of secure data aggregation scheme is becoming a hot spot. In this paper, we proposed an energy-efficient secure data aggregation scheme, ESMART: energy-efficient slice-mix-aggregate based on the technique of data slicing and mixing. The proposed scheme performs secure data aggregation in a more efficient way while keeping a good performance of privacy preservation. And the simulation result shows that the security performance of ESMART scheme is better than that of some existing and widely used schemes.
1. Introduction
A wireless sensor network is composed of a large number of resource-constrained sensor nodes. The base station of the network sends queries to the child nodes, and the sensor nodes that received the queries will collect the information following the request of the query. The data acquired by sensor nodes will be transmitted to the base station with multihop transmission scheme. Obviously, the energy consumption of transmission will affect the life cycle of the sensor nodes dramatically. Hence, the technology of data aggregation has been widely used in wireless sensor network as an effective mechanism to reduce the amount of data transfer in the network. In the mechanism of data aggregation, the nodes which aggregate raw data from their neighbor nodes and process it by specific algorithms are called aggregators, and the other sensor nodes which center on the aggregators and only perform information collecting and transmitting are called leaf nodes. The processed and aggregated data will be returned to the base station through the aggregation tree constructed dynamically. After the application of data aggregation technology, the amount of data transfer in WSNs has been reduced greatly and then prolonged the operating life of the network.
Meanwhile, some security problems have arisen because of the deployment of data aggregation scheme. As mentioned above, sensor nodes are usually resource constrained, the computing and communicating capacity of the nodes are very limited which makes the designing of the secure scheme in WSN more difficult than in traditional wired network, and most of the existing, mature secure protocols could not be applied to WSN directly. From the perspective of the adversary, the sensor nodes are usually deployed in unattended even hostile environment which makes sensor nodes very vulnerable to the attacks. We take a simple and widely used data aggregation scheme TAG: Tiny Aggregation proposed by Madden et al. in [1] as an example. We can see that there is no protection of data privacy in this scheme which means a captured node in the network can eavesdrop on and decrypt the raw readings of neighbor nodes easily without being detected and then lead to a serious security problem which is not acceptable [2, 3]. On the basis of the above analysis, a good secure data aggregation scheme for WSN should be energy efficient while ensuring data security. In this paper, we propose a novel secure data aggregation scheme for WSN, ESMART (energy-efficient slice-mix-aggregate), which addresses both energy-efficient and privacy preservation of the network. To achieve the privacy preservation of the network, we introduced data slicing and mixing technique into the scheme designing. A secure data aggregation scheme based on data slicing and mixing technique was detailed in [4]; in this scheme, in order to achieve the preservation of data privacy in the network, the raw readings of the sensor nodes will be sliced into pieces and part of the pieces will be encrypted and sent to the neighbor nodes. But this comes at the price of some additional communication cost which is shown in the simulation result of SMART scheme; the communication cost of SMART is about J times the communication cost of TAG, where J denotes the number of data pieces. Thus we introduced the dynamic J value into the scheme which can effectively decrease the communication overhead while maintaining the privacy preservation of the network. Meanwhile, a lower communication overhead means a lower transmission collision probability, which leads to a higher aggregation accuracy in the network. Moreover, in ESMART, the times of data slicing of a sensor node are only known by itself, so even if a malicious node eavesdropped on several data pieces from the node, it is not sure that all the data pieces of the node are eavesdropped on. Thus, it is more difficult for the malicious nodes to steal private data. Given the above, the use of data slicing and mixing technique in ESMART is more efficient which consequently reduces the communication overhead and increases the aggregation accuracy effectively.
Wireless sensor network is usually deployed in unattended environment and consisted of a large number of low-powered, resource-constrained sensor nodes. Each node in the network can sense the environment information as requested by the base station and perform wireless communication within a small range of its location. To reduce the energy consumption of the nodes during the transmission, data aggregation scheme has been widely used in WSN. Madden et al. proposed TAG scheme as a simple data aggregation scheme based on the assumption that every node in the network is trusted, and there is no protection of data privacy in this scheme. But in practice the application of WSN is facing various kinds of security threats. A good secure data aggregation scheme should possess the capability to resist the security threats while staying within energy constraint. To meet this requirement, some secure data aggregation schemes have been provided. He et al. proposed a novel data aggregation scheme with disjoint aggregation tree structure to ensure data integrity in [5]. Liu et al. proposed a secure data aggregation scheme HEEP with a novel formation method of aggregation tree in [6] in 2013. Li et al. proposed a secure data aggregation scheme based on the improved SMART scheme which is more efficient [7]. Both Li et al. [8] and Bista and Chang [9] proposed survey papers of the privacy-preserving techniques for WSN. Meanwhile, some researchers analyzed the existing network attack technologies against WSN and proposed corresponding solutions [10–12]. Zhu et al. proposed a secure data aggregation scheme, in which the base station composes a secret configuration matrix and each sensor node is preloaded with a limited part of the matrix known as a secret share containing certain local instructions [13]. Dong and Li presented a secure data aggregation approach based on monitoring in WSN [14]. In this paper, a grid-based network architecture for monitoring in WSN is designed and the algorithms for network division, initialization, and grid tree construction are proposed. Lei et al. proposed a secure data aggregation solution based on dynamic multiple cluster key management model in [15]; this key management model consumes little storage space and can resist the collusion attack effectively. Sun et al. proposed a lightweight secure data aggregation protocol to find the compromised nodes and help the base station to verify the final results [16]. Poornima and Amberker proposed a secure data aggregation scheme in [17]. This scheme provides end-to-end data privacy and can effectively reduce the energy consumption.
2. Network Model and the Requirement of Design
In this paper, we used aggregation tree to organize the sensor nodes in the network and consider three types of nodes in the network, base station, aggregator node, and leaf node. There is only one base station in the network which can be seen as the network control center with unlimited resources and the final destination for the aggregation results. As the root of the aggregation tree, it is responsible for broadcasting queries to all the other nodes and receiving and processing the aggregation results. A sensor node in the network can choose to be an aggregator node with the probability
2.1. Network Model
We consider a wireless network consisting of N nodes and one base station. Each node in the network has the functionalities of sensing, computing, and transmitting. The collected primitive data of node i can be expressed as
The formula above can also be interpreted as the probability that a secure link between two nodes is eavesdropped on by a third node and we can see from the formula that the larger the value of
2.2. The Design Goals of the Scheme
Efficiency. Due to the energy constrained character of sensor nodes, data aggregation scheme was applied to reduce the communication overhead, but consequently the application of secure scheme for data aggregation would inevitably cause some security problems. Therefore, how to maintain a balance between security and energy consumption has become one of the core issues in the designing of secure data aggregation scheme. In this paper, the proposed scheme improved the network security while keeping the network system at a low energy cost level.
Privacy Preservation. The preservation of private data is always a critical security issue in the design of secure data aggregation scheme. It means any node in the network could only know the private data of itself. Some of the widely used data aggregation schemes like TAG ignored this issue which lead to the fact that a malicious node can eavesdrops on the communications of its neighbors to steal the private data easily and then cause a major security problem. Thus, the preservation of data privacy is a key point in the designing of the proposed scheme in this paper.
Accuracy. The accuracy of the aggregation result is a crucial criterion in assessing the performance of data aggregation scheme. It is affected by the success rate of data transmission in the network directly, and the factors influencing the success rate of data transmission include data loss, transmitting collision, and so forth. In a certain time, the more the data is transmitted in the network, the higher the risk of data loss or transmitting collision. Therefore, in the designing of the proposed scheme, we need to reduce the data traffic in network while providing the preservation of data privacy. In other words, the more efficient the scheme is, the more accurate the aggregation result would be.
3. The Proposed Secure Data Aggregation Scheme
3.1. The Formation of Aggregation Tree
In the proposed scheme, an aggregation tree rooted at the base station needs to be formed for organizing the sensor nodes in the network. First of all, the base station broadcasts “Hi” message to all the neighbor nodes within one hop. Any nodes without parent that received “Hi” message should reply with “Join_Request” message to the originator, but if the node received multiple “Hi” messages from different senders, then it should randomly select one of them to be its parent node. Once a message of “Join_Request” is received, the BS accepts the node to be its child node by replying to the message of “Join_Accept.” The probability that a child node becomes an aggregator node is

The formation of aggregation tree.
3.2. The Overview of the Proposed Secure Data Aggregation Scheme
In this section, we present the work mechanism of ESMART that can be divided into three stages: data slicing, data mixing, and data aggregation. The detailed introduction is as follows.
Figure 2 describes an aggregation tree consisting of one base station, two aggregator nodes, and six leaf nodes. We set

Aggregation tree (

Data slicing.
After the stage of data slicing, the nodes decrypt all the received data pieces and sum them up with their raw data piece. The process can be described as

Data mixing.
When all the sent data pieces are received, leaf nodes begin to encrypt their aggregation results and send them to the aggregator nodes. The aggregators need to wait for a certain time to receive all the encrypted data pieces and then aggregate all the received data pieces with their own data. After the aggregation, the aggregators encrypt and send the aggregation results to their parent nodes. The process continues until all the aggregation results arrived at the base station. This procedure is described in Figure 5.

Data aggregation.
As we can see from the work process of ESMART, a leaf node needs to perform 2 encryption operations and 1 decryption operation during the process of data aggregation. For an aggregator node, only 1 encryption operation and 1 decryption operation are needed to be performed. Obviously, this degree of computational complexity has very little effect on the life cycle of sensor network and does not require a highly computational capacity. Meanwhile, compared to computational complexity, communication overhead has always been a major factor that affects the life cycle of sensor network. Thus, the analysis of the communication overhead of ESMART will be one of the focuses in Section 4.
4. Simulation and Analysis
In this section, simulations are carried out for analyzing and comparing the performance of TAG, SMART, and ESMART scheme. First, we set up the simulation environment in MATLAB which is an area of 400 m * 400 m and there are 600 sensor nodes deployed in this area randomly with
4.1. Privacy-Preservation Analysis of ESMART
In ESMART scheme, a sensor node which received a query from the base station will start a collection of target data and slice the collected data into j pieces and then encrypt
Table 1 illustrates the probability distribution of j when
The probability distribution of j with
As we can see from Table 1, the larger value of j, the lower probability of the value to be selected. That is, the larger value of j, the lower performance-cost ratio of the scheme. After the mathematical derivation, the simulations are carried out to compare the privacy-preservation performance of SMART and ESMART. TAG scheme was excluded from the comparison due to lacking protection mechanism of data privacy. The two schemes are simulated in the same network environment with

Privacy preservation of SMART and ESMART.
As we can see from Figure 6, the disclose probability of the private data in ESMART scheme is lower than that in SMART. That is because the amount of communication in ESMART is smaller than that in SMART, which means less chance to eavesdrop on the data pieces of one node, that is, the disclose probability of the private data is reduced. Meanwhile, the introduction of variable j brings another advantage: the times of data slicing of a sensor node in the network are only known by itself, so even if a malicious node eavesdropped on several data pieces from the node, it is not sure that all the data pieces of the node are eavesdropped on, unless the number of eavesdropped data pieces equals J which is the theoretical maximum value of j. Thus, it is more difficult for the malicious nodes to steal private data. Given the above, the privacy-preservation performance of ESMART is better than that of SMART.
4.2. Communication Overhead Analysis
How to make a reasonable tradeoff between security and communication overhead is always a critical issue in the designing of data aggregation scheme for wireless sensor network. As we can see from the work in [4], SMART scheme introduced the technique of data slicing and mixing to achieve privacy preservation of wireless sensor network but also brings some additional communication overhead. In our proposed scheme ESMART, we decrease the communication overhead during the procedure of aggregation while keeping a good performance of privacy preservation. In this section, to validate the efficiency of ESMART scheme, we conclude the calculating formulas of the communication overhead of TAG, SMART, and ESMART and then deploy these aggregation schemes in the same network environment separately to avoid the influence of other factors on the simulation results. First, we assume a network consisting of N sensor nodes. In TAG scheme, the nodes upload their raw data directly without any protection of private data; thus the communication overhead can be expressed as below:

Communication overhead of TAG, SMART, and ESMART.
It is concluded from the simulation results above that because of lacking protection of data privacy, the number of messages transmitted during the procedure of data acquisition and aggregation under the TAG scheme is very small. Meanwhile, owing to the introduction of data slicing technique, the communication overhead of the network deployed with SMART is much higher than that with TAG. And the communication overhead of the proposed scheme ESMART is about 37% to 46% lower than that of SMART which means that ESMART is more energy efficient than SMART while keeping a better performance of privacy preservation.
4.3. Accuracy Analysis
The accuracy of aggregation result is an important indicator of the performance of data aggregation scheme. Theoretically, the accuracy of aggregation result is 100% which means the final aggregation result equals the sum of the collected data of every node in the network. In reality, due to channel sharing character of wireless network, data loss is inevitable during the procedure of data aggregation. The more transmissions per unit time in the network, the more collisions and then the more data loss. In other words, the data accuracy in wireless network is inversely associated with the communication overhead. Thus, from the simulation results of communication overhead of the schemes, we can theoretically deduce that the accuracy performance of TAG is the best in these schemes, secondly ESMART, thirdly SMART. After the analysis, we deployed the schemes in the simulation environment with

Accuracy of TAG, SMART, and ESMART.
As we can see from the simulation results, due to the low amount of communication, the accuracy of TAG scheme is about 90% when the interval time is long enough. Instead, the accuracy of SMART scheme is approximately 67% which is caused by the large amount of data traffic. The proposed scheme ESMART can provide a better accuracy of 83% than SMART while keeping a good performance of privacy preservation. The simulation results confirmed the previous deduction that the aggregation scheme with low communication overhead can achieve a good performance of data accuracy.
5. Conclusions
Wireless sensor network consists of a large number of low-powered, resource-constrained sensor nodes and is usually deployed in unattended environment. Sensor nodes in the network can sense specific information and perform wireless communication within a small range of their location. To reduce the energy consumption of the nodes during the transmission, data aggregation scheme has been widely used in WSN. Meanwhile, how to provide the protection of data privacy during the aggregation is becoming a desiderative problem to be solved.
In this paper, we proposed an energy-efficient secure data aggregation scheme ESMART. The simulation results and analysis proved that whether in energy saving ability or data accuracy ESMART scheme is better than SMART scheme while keeping a good performance of privacy preservation which is not provided in TAG scheme. It is an efficient secure data aggregation scheme with a value of practical application and it achieved the expected design requirements. In future research work, we will focus on the designing of secure data aggregation scheme with the ability to guarantee data integrity.
Footnotes
Acknowledgments
This research is supported by Beijing Science and Technology Program under Grant Z121100007612003, Beijing Natural Science Foundation under Grant 4132057, and National Natural Science Foundation under Grant 61371071.
