Abstract
Wireless sensor networks (WSNs) are composed of sensor nodes whose limited energy is difficult to replenish. In-network data aggregation is the main technique for minimizing energy consumption and maximizing network lifetime, because it reduces communication overhead. However, performing data aggregation while preserving data confidentiality and integrity is challenging, because adversaries can easily eavesdrop on and modify the aggregation results through compromised aggregation nodes. In this paper, we propose an efficient confidentiality and integrity preserving aggregation protocol (ECIPAP) based on homomorphic encryption and a result-checking mechanism. We also implement ECIPAP on SimpleWSN nodes running TinyOS. Security and performance analysis show that our protocol is efficient while preserving both aggregation confidentiality and integrity.
1. Introduction
Wireless sensor networks consist of a large number of sensor nodes that are deployed to sense, process, and transmit data collected from the environment in applications such as battlefield surveillance, health care monitoring, and traffic regulation. The environment data is transmitted to the base station (BS) hop by hop. Sensor nodes have limited energy storage, and their computational ability is far weaker than that of the base station. Their energy cannot be replenished at will because they are usually deployed in areas that humans can hardly reach. If some key nodes fail, the whole network may break down. The energy of a node is mainly spent on data transmission. On TelosB nodes, sending and receiving one bit of data cost 0.72 μJ and 0.81 μJ, respectively, which is at least 600 times the 1.2 nJ that the processor spends on executing one instruction [1]. Therefore, reducing communication overhead is the key to prolonging the lifetime of wireless sensor networks [2, 3].
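The energy ratio quoted above can be checked with a few lines of arithmetic. The following Python sketch uses only the three figures cited from [1]; the constant and function names are illustrative.

```python
# Radio cost per bit versus CPU cost per instruction, from the figures cited in [1].
# All values are expressed in picojoules so the arithmetic uses exact integers.
SEND_PJ_PER_BIT = 720_000        # 0.72 uJ to send one bit
RECV_PJ_PER_BIT = 810_000        # 0.81 uJ to receive one bit
CPU_PJ_PER_INSTRUCTION = 1_200   # 1.2 nJ to execute one instruction


def instructions_per_bit(radio_pj_per_bit: int) -> float:
    """Number of CPU instructions that cost as much energy as one radio bit."""
    return radio_pj_per_bit / CPU_PJ_PER_INSTRUCTION


send_ratio = instructions_per_bit(SEND_PJ_PER_BIT)
recv_ratio = instructions_per_bit(RECV_PJ_PER_BIT)
print(send_ratio, recv_ratio)
```

Under these figures, every bit that aggregation removes from the radio link pays for several hundred instructions of local computation, which is why trading computation for communication is attractive.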
Rather than transmitting data directly to the base station, wireless sensor nodes can perform aggregation operations, such as SUM and AVERAGE, on aggregation nodes. This approach greatly reduces the communication overhead. An aggregation topology must be built before data aggregation; it can be tree-based or cluster-based depending on the application. The sensors then collect environment data from the monitored area and transmit the data to the base station hop by hop. When intermediate nodes receive data from their child nodes, they aggregate them using aggregation functions.
In some applications, such as battlefield surveillance and target tracking, sensor nodes must be deployed in hostile areas. If the sensed data were leaked to the enemy, heavy losses could result. Adversaries can easily gain physical access to the nodes and tamper with the original sensing data, so that the base station receives false aggregation results and, as a consequence, makes wrong decisions. Other attacks, such as denial of service (DoS) or selective forwarding, also affect the aggregation process. General data aggregation protocols do not consider these attacks, so they are unsuitable for military applications. The design of a data aggregation protocol should therefore take security into consideration. In summary, secure data aggregation requires the following three security properties.
1.1. Data Confidentiality
The original sensing data and the intermediate aggregation results should not be disclosed to unauthorized parties during transmission.
1.2. Data Integrity
If the original sensing data or the intermediate aggregation results are tampered with by adversaries or corrupted by poor communication quality, the base station should be able to detect the changes from the final aggregation results.
1.3. Data Freshness
Sensing data received by the base station should represent the most recent state of the environment. If the aggregation values are too old, the base station should discard them.
Researchers have proposed many data aggregation protocols that provide some security protection. Most protocols provide only confidentiality protection [4, 5] or only integrity protection [6, 7]. Some protocols claim to protect both data confidentiality and data integrity, but problems remain. Integrity protection requires data to be transmitted in particular forms, whereas the encryption algorithms used for confidentiality may change the form of the data. The two goals therefore conflict to some extent, and designing a data aggregation protocol that provides both data confidentiality and data integrity is a challenge.
In this paper, we propose an efficient confidentiality and integrity preserving aggregation protocol (ECIPAP) for wireless sensor networks, motivated by the protocols SIES [8] and SHIA [9]. Our scheme is based on a lightweight, energy-efficient homomorphic encryption. Through the result-checking phase, every node in the wireless sensor network can verify whether its data was added to the final aggregation result. We use a random number dissemination mechanism to update the keys stored in sensor nodes, which guarantees data freshness against replay attacks. The protocol also uses μTESLA [10] to broadcast authenticated queries along with the random numbers. We implement our protocol on TelosB physical nodes. Through the experimental results and the theoretical analysis, we show the practicability and high efficiency of our protocol.
The paper is organized as follows. In Section 2, we introduce the related work in this research area. Section 3 explains the system model in our protocol. Section 4 describes our protocol in detail. Section 5 analyses the security, experiment, and performance of ECIPAP. We conclude this paper in Section 6.
2. Related Work
Hu and Evans proposed the first integrity-preserving hierarchical data aggregation protocol in 2003 [11]. The main idea of this protocol is delayed aggregation and delayed authentication. Girao et al. proposed an aggregation scheme that guarantees end-to-end data confidentiality using symmetric-key-based homomorphic encryption [12]. Since then, a series of secure data aggregation protocols has been proposed. Existing protocols that claim to protect both data confidentiality and data integrity still carry potential risks. In general, a protocol that preserves both confidentiality and integrity should satisfy the following conditions.
(i) The encrypted sensing data should be decrypted only at the base station, not at intermediate nodes, so that end-to-end confidentiality is achieved. (ii) If adversaries tamper with the sensing data, the base station can verify data integrity by checking the final aggregation results. (iii) The energy consumption should be reasonable because each node has limited energy.
The protocols proposed in [13–16] add pairwise keys to existing integrity-preserving data aggregation protocols in order to encrypt the data transmitted between two nodes. These protocols achieve hop-by-hop confidentiality but not end-to-end confidentiality. Based on a confidentiality protection mechanism, paper [17] provides data integrity protection through using a global key
Chan et al. presented a provably secure tree-based in-network data aggregation protocol named SHIA [9]. Without assuming a particular data structure, SHIA can detect any manipulation of the in-network aggregation. SHIA has three phases: the query dissemination phase, the aggregation-commit phase, and the result-checking phase. After the queries are broadcast to the whole network, sensor nodes send their environment data and commitments upward. In the result-checking phase, the associated off-path values are passed down the aggregation tree. In this way, sensor nodes can verify whether their data was indeed added to the final aggregation results. After node i is convinced, it will send an authentication message
Combining homomorphic encryption and secret sharing, Papadopoulos et al. proposed a secure data aggregation protocol SIES which can protect both data confidentiality and data integrity [8]. In this protocol, each sensor node has a secret message
Our protocol improves SIES [8] by protecting data integrity with a result-checking mechanism motivated by SHIA [9] instead of secret sharing. We also use homomorphic encryption so that intermediate nodes can perform aggregation directly on ciphertexts, which achieves end-to-end data confidentiality. After the final aggregation result has been sent to the base station, the result-checking process detects whether all the data has been added to it.
3. System Model
Different secure data aggregation protocols support different aggregation functions. Their security guarantees also vary, and adversaries can launch a variety of attacks. This section gives the network assumption, problem definition, and attack model in detail.
3.1. Network Assumption
We assume that an aggregation tree is already set up in the deployment phase. If it is not, TAG [20] can be used to build such a tree-based network. In our protocol, the base station BS has unlimited energy and computing ability. BS can broadcast query messages to the whole network using the authenticated broadcast method μTESLA [10]. The aggregation tree contains two types of nodes: leaf nodes and intermediate nodes. Leaf nodes only sense environment data and encrypt them, while intermediate nodes both sense environment data and aggregate the data of their child nodes. The distance between two nodes is about 10 m, and each node can communicate only with its child nodes or its parent node.
Each sensor node has a unique ID and an initial key
3.2. Problem Definition
In our protocol, we use an efficient symmetric additively homomorphic encryption algorithm. A key distribution mechanism is also used to generate the keys in each sensor node.
3.2.1. Homomorphic Encryption
As shown in Algorithm 1, M is a large integer. Let
(i) Key Generation: Round key (ii) Encryption: Ciphertext (iii) Decryption: (iv) Addition:
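The four operations listed above can be illustrated with a minimal Python sketch. The formulas of Algorithm 1 are not reproduced in this text, so the sketch assumes the common additive symmetric construction in which a plaintext is masked by adding a round key modulo M. The HMAC-SHA256 key derivation, the `nonce` and `label` arguments, the choice M = 2^32, and all function names are illustrative assumptions rather than the authors' specification.

```python
import hashlib
import hmac

M = 2**32  # assumed modulus; must exceed the largest possible aggregate


def round_key(initial_key: bytes, nonce: bytes, label: bytes) -> int:
    """Derive a per-round key from a node's initial key (hypothetical derivation)."""
    digest = hmac.new(initial_key, label + nonce, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % M


def encrypt(m: int, k: int) -> int:
    """Mask plaintext m with round key k (assumed additive form)."""
    return (m + k) % M


def decrypt(c: int, k: int) -> int:
    """Remove the mask; for an aggregate, k is the sum of all contributing keys."""
    return (c - k) % M


def add(c1: int, c2: int) -> int:
    """Aggregate two ciphertexts without decrypting either one."""
    return (c1 + c2) % M


# Two nodes with made-up initial keys and readings, one round nonce from the BS.
nonce = b"round-0001"
k1 = round_key(b"initial-key-node-1", nonce, b"value")
k2 = round_key(b"initial-key-node-2", nonce, b"value")
aggregate = add(encrypt(21, k1), encrypt(25, k2))
recovered_sum = decrypt(aggregate, (k1 + k2) % M)
print(recovered_sum)
```

Deriving the round key from a fresh nonce means a ciphertext replayed from an earlier round decrypts to a meaningless value, which is one way a random number dissemination mechanism can support freshness.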
3.2.2. Message Format
The message m sent by sensor node has a fixed format which is a data tuple:
3.2.3. Aggregation Function
An intermediate node i has n child nodes
(i) Count Function: (ii) Sum Function: (iii) Average Function: (iv) MAC Aggregation:
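The count, sum, and average functions can be sketched on encrypted tuples as follows. This is a sketch under assumptions: it takes the additive masking form c = (m + k) mod M, gives each node one key per field, and omits MAC aggregation because its definition is not recoverable from this text. The keys, readings, and tree shape are invented for illustration.

```python
M = 2**32  # assumed modulus


def encrypt_tuple(reading: int, k_count: int, k_value: int) -> tuple:
    """Each node contributes a count of 1 and its own reading, both masked."""
    return ((1 + k_count) % M, (reading + k_value) % M)


def aggregate(tuples: list) -> tuple:
    """Intermediate-node step: add ciphertexts field by field, no decryption."""
    enc_count = sum(t[0] for t in tuples) % M
    enc_sum = sum(t[1] for t in tuples) % M
    return (enc_count, enc_sum)


def decrypt_aggregate(agg: tuple, count_keys, value_keys) -> tuple:
    """Base-station step: subtract the sum of all keys, then derive the average."""
    count = (agg[0] - sum(count_keys)) % M
    total = (agg[1] - sum(value_keys)) % M
    return count, total, total / count


# Hypothetical five-node tree: node 1 has children 3 and 4, node 2 has child 5.
readings = {1: 21, 2: 22, 3: 23, 4: 24, 5: 25}
count_keys = {1: 1111, 2: 2222, 3: 3333, 4: 4444, 5: 5555}
value_keys = {1: 9101, 2: 9202, 3: 9303, 4: 9404, 5: 9505}
enc = {i: encrypt_tuple(readings[i], count_keys[i], value_keys[i]) for i in readings}

at_node_1 = aggregate([enc[1], enc[3], enc[4]])
at_node_2 = aggregate([enc[2], enc[5]])
at_bs = aggregate([at_node_1, at_node_2])
count, total, average = decrypt_aggregate(at_bs, count_keys.values(), value_keys.values())
print(count, total, average)
```

The average is never computed inside the network: intermediate nodes only forward the encrypted count and sum, and the base station divides after decryption.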
3.3. Attack Model
The most threatening attack against data aggregation in wireless sensor networks is node compromise. Through this attack, adversaries obtain the keys of the compromised nodes and can then inject fake messages to distort the final aggregation results. Such stealthy attacks [12] can make the base station accept false aggregation results without detecting them. We assume that the adversaries can compromise a fraction of the sensor nodes in the network.
Denial-of-service attacks are outside the scope of this paper, because the base station can easily detect this type of attack when the network behaves improperly.
4. ECIPAP
We improve SIES [8] by using a result-checking mechanism instead of secret sharing, and on this basis we propose an efficient confidentiality and integrity preserving aggregation protocol (ECIPAP). The homomorphic encryption used in our protocol guarantees end-to-end data confidentiality. Our protocol has three phases: the query dissemination phase, the data aggregation phase, and the result-checking phase.
4.1. Network Deployment
Before the sensor nodes are deployed in the monitored area, each sensor node shares a private key
4.2. Query Dissemination Phase
In each aggregation round, base station
4.3. Data Aggregation Phase
Sensor node collects environment data
Then it encrypts these data as follows:
The reason why we use different keys to encrypt count, value, and
The data tuple can be created now as
Once the nodes have prepared their data tuples, they send them to their parent nodes. We show the aggregation process in Figure 1. There are

Figure 1: Data aggregation phase in ECIPAP.
4.4. Result-Checking Phase
When the final aggregation result is received, the base station broadcasts the aggregated data tuple to the whole network using an authenticated method. To enable result checking, each sensor node sends a checking message to its child nodes. Result checking can be regarded as the reverse of data aggregation. We show this in Figure 2. Base station sends the final data tuple

Figure 2: Result-checking phase in ECIPAP.
Every sensor node can verify if its own data was added to the aggregation data by comparing its own data to the data sent by parent nodes. If the result passes the verification, then every sensor node prepares an authentication message
5. Analysis
5.1. Security Analysis
Adversaries can compromise a fraction of the sensor nodes in the wireless sensor network. When a sensor node is compromised, its private information, such as its encryption keys, is leaked. Adversaries can sniff confidential data sent by the sensor nodes. They can also launch a stealthy attack to make the base station accept false data without detecting it.
We assume the adversaries can eavesdrop messages sent between sensor nodes and they already know the range of sensing data. If we use the same key to encrypt both value and
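One way to see why key reuse across fields is risky is the following Python sketch. It assumes the additive masking form c = (m + k) mod M and assumes that a leaf node always reports a count of 1; both are assumptions of this illustration, and the keys and reading are invented. Under these assumptions, an eavesdropper who sees a leaf's encrypted count can recover a shared key and strip it from the encrypted value.

```python
M = 2**32  # assumed modulus


def encrypt(m: int, k: int) -> int:
    """Assumed additive masking of plaintext m with key k."""
    return (m + k) % M


def attack_leaf(enc_count: int, enc_value: int) -> int:
    """Eavesdropper's guess at a leaf reading, betting that both fields share a key.

    A leaf's count is assumed to be 1, so enc_count - 1 reveals the count key.
    """
    guessed_key = (enc_count - 1) % M
    return (enc_value - guessed_key) % M


reading = 23

# Case 1: the same key masks both fields, so the reading leaks.
shared_key = 305419896
leaked = attack_leaf(encrypt(1, shared_key), encrypt(reading, shared_key))

# Case 2: independent keys per field, so the same attack yields garbage.
k_count, k_value = 305419896, 2596069104
not_leaked = attack_leaf(encrypt(1, k_count), encrypt(reading, k_value))

print(leaked, not_leaked)
```

With independent keys, the attacker's output is just the reading shifted by the unknown difference of the two keys, which carries no usable information as long as the keys are unpredictable.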
The authors of [9] define direct data injection and optimal security as follows.
5.1.1. Direct Data Injection
A direct data injection attack occurs when an adversary modifies the data readings reported by the nodes under its direct control, under the constraint that only legal readings in [
5.1.2. Optimally Secure
An aggregation algorithm is optimally secure if, by tampering with the aggregation process, an adversary is unable to induce the base station to accept any aggregation result which is not already achievable by direct data injection.
The message format in our protocol is the same as the message format
5.2. Experiment Analysis
We have implemented ECIPAP on the SimpleWSN experimental platform to measure the average temperature in a laboratory. The characteristics of the SimpleWSN nodes are shown in Table 1.
Table 1: Characteristics of SimpleWSN nodes.
The temperature data DataNum sensed by SimpleWSN nodes are 16-bit integers, displayed in hexadecimal. We can use the following formula to convert such data to readable degrees Celsius:
Table 2 describes the parameters used in our experiment. We use sensor nodes to form an aggregation tree whose root is a powerful PC.
Table 2: Parameters used in experiment.
Table 3 shows the data received by base station before decryption. The data in
Encrypted aggregation results received by
Then, we use the keys shared between the sensor nodes and the base station to decrypt the aggregation results and show them in Table 4. The total number of nodes used is 5, and the average temperature is about 23.2 degrees Celsius. The base station broadcasts query messages every 1000 milliseconds. Our experiment shows the practicability and high efficiency of ECIPAP.
The original environment data received by
5.3. Performance Analysis
We give a theoretical analysis of the performance in this section. By comparing the communication overhead of ECIPAP, SHIA, and SIES, we show that ECIPAP not only protects both data integrity and data confidentiality but also reduces the communication overhead in the result-checking phase.
We assume every intermediate node in the network has d child nodes and the height of the aggregation tree is h. The length of the data tuple sent by sensor nodes in ECIPAP, SHIA, and SIES are
Communication overhead of ECIPAP, SHIA, and SIES.
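The tree model assumed in this analysis can be made concrete with a short Python sketch. It assumes a complete tree in which the root is at depth 0, the leaves are at depth h, and every non-root node sends exactly one tuple to its parent during aggregation. The `tuple_bytes` parameter and the value 12 used in the example are placeholders, not the tuple lengths of ECIPAP, SHIA, or SIES.

```python
def nodes_at_depth(d: int, depth: int) -> int:
    """Number of nodes at a given depth of a complete d-ary tree (root is depth 0)."""
    return d**depth


def total_nodes(d: int, h: int) -> int:
    """All nodes of a complete d-ary tree of height h, root included."""
    return sum(nodes_at_depth(d, level) for level in range(h + 1))


def aggregation_bytes(d: int, h: int, tuple_bytes: int) -> int:
    """Bytes sent upward if every non-root node sends one tuple to its parent."""
    edges = total_nodes(d, h) - 1
    return edges * tuple_bytes


# Example with node degree 3, height 2, and a placeholder 12-byte tuple.
print(total_nodes(3, 2), aggregation_bytes(3, 2, 12))
```

Because the node count grows as d^h, any per-node saving in tuple length or in result-checking messages is multiplied across the whole tree, which is why the comparison is parameterized by d and h.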
In ECIPAP, we use 4-byte integers to present encrypted count, value, and

Communication overhead in authentication process when node degree
The total communication overhead is reduced by 79% compared with SHIA when

Total communication overhead in ECIPAP, SHIA, and SIES when node degree
6. Conclusion
It is difficult to design an aggregation protocol that protects both data confidentiality and data integrity in wireless sensor networks. Existing protocols that claim to achieve this goal still have problems. In this paper, we improve SIES by using a result-checking mechanism instead of secret sharing. We also use a homomorphic encryption algorithm to protect end-to-end data confidentiality. We concentrate on the communication overhead of the result-checking phase and propose an efficient confidentiality and integrity preserving aggregation protocol (ECIPAP). We implement our protocol on physical nodes running TinyOS. Through this experiment and the theoretical analysis, we show the practicability and high efficiency of ECIPAP.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (no. 61272512, no. 61003262, and no. 61300177), the Program for New Century Excellent Talents in University (NCET-12-0047), and the Beijing Natural Science Foundation (no. 4121001, no. 4132054).
