ID List Forwarding Free Confidentiality Preserving Data Aggregation for Wireless Sensor Networks

Abstract

Wireless sensor networks (WSNs) are composed of sensor nodes with limited energy that is difficult to replenish. Data aggregation reduces communication overhead through in-network processing, thus minimizing energy consumption and maximizing network lifetime. However, it also makes data confidentiality harder to protect. Many existing confidentiality preserving aggregation protocols have to transfer a list of sensor IDs so that the base station can tell explicitly which sensor nodes have actually provided measurements. Forwarding a large number of node IDs, however, brings overwhelming extra communication overhead. In this paper, we propose a provably secure aggregation scheme, the perturbation-based efficient confidentiality preserving protocol (PEC2P), which allows efficient aggregation of perturbed data without transferring any ID information. In general, environmental data is confined to a certain range; we exploit this feature and design an algorithm that helps the powerful base station retrieve the IDs of reporting nodes. We analyze the accuracy of PEC2P and conclude that the base station can retrieve the sum of environmental data with overwhelming probability. We also prove that PEC2P is CPA secure by security reduction. Experimental results demonstrate that PEC2P significantly reduces node congestion (especially for the root node) during the aggregation process in comparison with existing protocols.

1. Introduction

Wireless sensor networks (WSNs) integrate microelectromechanical systems (MEMS) technology, sensor technology, and communication technology. A WSN senses, transports, and processes environmental data in its deployment area using hundreds of sensor nodes with limited computation and energy capacities. WSNs have been extensively used in military surveillance, environmental monitoring, production control, and real-time traffic monitoring [1].

Because WSNs are usually deployed in remote, unattended, or even hostile environments, the energy of sensor nodes is not easy to replenish. Hence, reducing energy cost and prolonging network lifetime have become key issues for WSNs [2, 3]. It is generally believed that the power consumption of each sensor node tends to be dominated by data transmission. According to [4], the energy cost of transmitting a single bit of data is equivalent to that of 800 instructions. Data aggregation [2, 5] mechanisms avoid transmitting raw environmental data by summarizing and combining sensor data inside the network, thus reducing the amount of data transmitted and effectively maximizing network lifetime.

Data confidentiality [6–11] is crucial in many WSN applications, like military surveillance. If data confidentiality is compromised, the sensitive information collected will be leaked to adversary. However, there is a conflict between data aggregation and data confidentiality protocols [12]: data aggregation prefers to operate on plain data and confidentiality protection requires data to be encrypted. Extensive secure data aggregation research [6–11, 13–15] has been conducted. Data aggregation protocols usually cannot operate on encrypted data such that intermediate node has to decrypt packets received from downstream, aggregate the plaintext data with its own, encrypt the aggregated result, and forward to upstream.

Two common approaches to preserving data confidentiality without decryption/encryption at intermediate nodes are homomorphic encryption [7, 8, 10] and secret perturbation [6, 9, 11]. Homomorphic encryption is an encryption transformation that allows direct computation on encrypted data. However, the end-to-end security of symmetric homomorphism [7] is easily compromised if any node is corrupted, and the computational cost and communication overhead of asymmetric homomorphism [8, 10] are too high for sensor nodes. In comparison, secret perturbation-based schemes add a perturbation to the value of each reporting node using a secret key shared with the base station ( $B S$ ). $B S$ retrieves the final aggregation result by removing all these perturbations. Since the key shared between each node and $B S$ is unique, an adversary that compromises one key cannot compute other nodes' sensed data or intermediate aggregation results.

$B S$ has to know which sensor nodes have provided measurement before it can correctly remove the perturbations brought by these sensor nodes. A straightforward solution is to require every node participating/not participating in aggregation process to report its ID, according to the proportion of nodes satisfying $B S$ 's query.

However, this approach may bring high extra overhead. Feng et al. [9] proposed a family of secret perturbation-based schemes that can protect sensor data confidentiality while trying to minimize the number of IDs to be transferred. In the FSP scheme, every sensor node must reply with a perturbed actual or dummy data item, whether or not the node has data satisfying the query. $B S$ then simply subtracts the hash value of every sensor node to compute the final aggregation result, and the communication overhead caused by ID transmission is avoided. However, this requires all sensor nodes to report data regardless of whether they have data satisfying the query. When only a small number of sensor nodes have data to report, the communication overhead caused by the extra perturbed data can be much larger than that of forwarding IDs. Hence, in their ideal scheme O-ASP, an aggregating node first has to determine which is larger: the overhead of transmitting IDs and perturbed data, or the overhead of transmitting all perturbed data. Either way, O-ASP endures high communication overhead, and it is unrealistic to assume that each sensor node knows the membership and topology of the whole network and whether each of these nodes has data satisfying each particular query.

Moreover, the transmission of nodes' ID makes [9, 11] not suitable for the scenario shown in Figure 1, where we want to monitor the activities of tanks and battleships, and there is a long path to travel through before aggregation result gets to $B S$ . To achieve this goal, a cluster or a tree of sensor nodes is deployed in the battlefield, while $B S$ is in a secure location away to collect data reported by sensors. All data has to be forwarded on a long path from the targeted area to $B S$ . For [9, 11], ID list is transmitted such that the energy is wasted on the long path, and “single point of failure” could happen if there are not enough nodes on such path. The application scenarios in military surveillance also include the case that US army uses REMBASS to collect data (like ground motion, sound, infrareds, and magnetic fields) and forward the aggregation result to command center. PEC2P fits in this scenario and does not have any requirement on the type of data.

Figure 1

An example of environmental surveillance system in battlefield.

In this paper, we present the perturbation-based efficient confidentiality preserving protocol ( $PEC 2 P$ ), which can protect data confidentiality without transmitting any ID information. Generally, we use a one-way hash function as the perturbation added to the environmental data. Since $B S$ usually has powerful computational capability in WSNs, we propose to trade computation at $B S$ for energy cost at the sensors and introduce a new approach for $B S$ to compute and tell which nodes have actually sensed data and contributed to the aggregation process after receiving the final aggregation result. Our approach specifically fits scenarios where the aggregation result has to travel a long path before arriving at $B S$ . In summary, the contributions of this paper are as follows.

(1) We draw attention to the ID-list transmitting problem in WSNs and propose the first approach which does not require forwarding any node ID but relies instead on computation and selection by $B S$ . As a result, communication overhead is reduced and reporting nodes' information is further hidden.

(2) We avoid using the random number r verified by commonly applied authenticated broadcasting, thus reducing network delay. Instead, we update the secret key of all reporting nodes after each data aggregation to keep indistinguishability from the adversary.

(3) We prove that our protocol is CPA secure by security reduction.

(4) We measure the performance of our protocol through both theoretical analysis and experiments on TinyOS [16]. We analyze the accuracy of $PEC 2 P$ and compare its communication overhead with existing protocols.

2. Related Work

Girao et al. proposed CDA [7] using symmetric key-based privacy homomorphic encryption. In their approach, sensor nodes share a common symmetric key with the $B S$ which is hidden from aggregators, and aggregators can perform aggregation functions directly on the ciphertext instead of carrying out costly decryption and encryption operations. Symmetric homomorphism has the advantage of fast computation. However, secret key is shared among all nodes such that data confidentiality is lost once a sensor node and its shared key are compromised.

Mykletun et al. [8] investigated several additive homomorphic public-key encryption schemes and their applicability to WSNs. In general, these schemes preload public key in sensor nodes and aggregate encrypted data. Then $B S$ can decrypt aggregation result by its secret key. Albath and Madria [10] proposed an ECC-ElGamal based homomorphic encryption scheme to achieve confidentiality for in-network aggregation in wireless sensor networks. Even if the adversary compromises a node and obtains the public key, it cannot obtain the plaintext of intermediate aggregation results. Hence, public key-based homomorphic encryption schemes are resilient to node compromise attacks. However, the computational cost and communication overhead of public key encryption scheme are not quite tolerable for WSNs, especially when sensors are collecting diverse statistics (like temperature, humidity, and pressure).

Castelluccia et al. [6] first proposed an additively homomorphic encryption scheme which simply adds a secret key k to environmental data x to form the ciphertext $c = x + k$ . Each node has a unique secret key such that one node's corruption does not affect the data confidentiality of other nodes. Castelluccia et al. [11] improved their scheme in [6] by proposing a simple and provably secure encryption scheme that allows efficient additive aggregation of encrypted data. Each reporting node i encrypts plaintext data $m_{i}$ as $c_{i} = m_{i} + h (f_{e k_{i}} (r))$ . The security of their scheme is based on the indistinguishability property of a pseudorandom function (PRF). However, the ID list of sensors has to be transferred and cannot be aggregated.

Feng et al. [9] tried to alleviate the ID-list problem and proposed a family of secret perturbation-based schemes that can protect sensor data confidentiality without disrupting the additive data aggregation result. BSP and FSP are two basic schemes which take nonredundant reporting approach and fully reporting approach, respectively. The ideal scheme O-ASP assumes that each sensor node knows the membership and topology of the whole network, and it knows whether each of these nodes has data satisfying each query. Then, $B S$ computes aggregation and communication cost of two approaches for each cell before selecting one. To overcome the unrealistic assumption, D-ASP is proposed to enable nodes to make decisions based only on their locally available information, and interactions only take place within a cell or between neighboring cells. However, it is difficult for nodes to decide whether to report their ID with locally available information and it makes no difference when the number of reporting nodes is the same as nonreporting nodes. It also causes extra communication cost and network delay for waiting and deciding.

PRDA [15] pointed out that the transmission of sensor node IDs along with aggregated data packets increases the communication overhead of the network. Therefore, it keeps a table that consists of sensor node IDs and their corresponding small index numbers in each data aggregator. After the cluster forming, data aggregator generates the index table and sends it to $B S$ . During data aggregation, instead of sending 2-byte sensor node IDs, data aggregators send corresponding index numbers. $B S$ can find the ID of sensor nodes in the index table. However, index numbers are only used within clusters.

Although existing schemes tried to reduce the amount of IDs, they still suffer from related communication cost, and dropping ID or sending false ID will lead $B S$ to compute false aggregation result.

Our work requires no ID to be forwarded and achieves a good trade-off between confidentiality and efficiency by adopting perturbation. With this improvement, we manage to simultaneously preserve data confidentiality and significantly reduce overall communication overhead, avoiding high energy consumption in aggregation phase.

3. System Model

3.1. Network Assumption

We assume a multilevel sensor network tree that consists of N (less than 1000) sensor nodes and a certain number of relay nodes. Sensor nodes are deployed in areas of interest, and they can sense and aggregate data. Both tree and cluster topologies can be used as the aggregation structure. In this paper, we use an aggregation tree to illustrate our protocol. The aggregation tree could be formed as in TAG [4]. Relay nodes just forward messages, and they form a long path from targeted areas to $B S$ . The powerful $B S$ , whose transmission range covers the whole network, is capable of broadcasting messages to all nodes directly. Each sensor node has a unique ID picked from the set ${0,1, \dots, N - 1}$ . After the aggregation tree is formed, each sensor node monitors its surrounding environment to generate environmental data, which is an integer in the range $[v_{\min}, v_{\max}]$ . Environmental data (e.g., temperature) can be converted to integers if necessary. Each reporting node and aggregator sends its messages up the aggregation tree. The message has the following format:

\begin{matrix} 〈 c, h a x 〉, \end{matrix}

(1)

where c is the number of reporting nodes in the network and $h a x$ is the sum of environmental data and perturbation.

3.2. Design Goals

When designing confidentiality protection schemes, we aim to achieve the following goals.

(1) Data accuracy: $B S$ can correctly retrieve the sum of environmental data with an overwhelming probability.

(2) Data confidentiality: the aggregation result should only be known by $B S$ and $PEC 2 P$ is CPA secure.

(3) Efficiency: the protocol should help to reduce communication overhead and prolong the network lifetime.

Definition 1 (Chosen Plaintext Attack).

In this attack, the adversary has the ability to obtain the encryption of plaintexts of its choice. It then attempts to determine the plaintext that was encrypted in some other ciphertext [17].

Definition 2 (Negligible Function).

A function F is negligible if for every polynomial $p (\cdot)$ , there exists an N such that for all integers $n > N$ , it holds that $F (n) < 1 / p (n)$ . An equivalent formulation of the above is to require that for all constants c, there exists an N such that for all $n > N$ , it holds that $F (n) < n^{- c}$ .

We define an experiment for any private-key encryption scheme $Π = (Gen, Enc, Dec)$ , any adversary A, and any value n of security parameter.

The CPA Indistinguishability Experiment $P r i K_{A, Π}^{CPA} (n)$ .

(1) A random key k is generated by running $Gen (n)$ .

(2) The adversary A is given input $1^{n}$ and oracle access to $E n c_{k} (\cdot)$ , and outputs a pair of messages $m_{0}, m_{1}$ of the same length.

(3) A random bit $b \leftarrow {0,1}$ is chosen, and then a ciphertext $c \leftarrow E n c_{k} (m_{b})$ is computed and given to A. We call c the challenge ciphertext.

(4) A continues to have oracle access to $E n c_{k} (\cdot)$ , and outputs a bit $b^{'}$ .

(5) The output of the experiment is defined to be 1 if $b^{'} = b$ , and 0 otherwise. (A succeeds if $P r i K_{A, Π}^{CPA} (n) = 1$ ).
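The five steps of the experiment can be sketched as a small test harness. This is only an illustration under stated assumptions: the function names (`cpa_experiment`, `toy_gen`, `toy_enc`, `choose`, `guess`) are hypothetical, and the toy scheme is a deliberately weak deterministic XOR cipher, not $PEC 2 P$ . It shows how an adversary that reuses its encryption oracle wins every round against a deterministic scheme.

```python
import secrets
from typing import Callable, Tuple

Oracle = Callable[[int], int]


def cpa_experiment(gen, enc, choose, guess, n: int) -> int:
    """Run one round of the experiment; return 1 if the adversary wins."""
    k = gen(n)                                   # step (1): generate the key
    oracle: Oracle = lambda m: enc(k, m)         # oracle access to Enc_k
    m0, m1, state = choose(n, oracle)            # step (2): adversary picks m0, m1
    b = secrets.randbelow(2)                     # step (3): random challenge bit
    challenge = enc(k, (m0, m1)[b])
    b_guess = guess(state, challenge, oracle)    # step (4): adversary outputs b'
    return 1 if b_guess == b else 0              # step (5): experiment output


# Toy deterministic scheme (an assumption for illustration, NOT PEC2P).
def toy_gen(n: int) -> int:
    return secrets.randbits(n)


def toy_enc(k: int, m: int) -> int:
    return m ^ k


def choose(n: int, oracle: Oracle) -> Tuple[int, int, int]:
    # Ask the oracle for Enc_k(0) and remember it.
    return 0, 1, oracle(0)


def guess(state: int, challenge: int, oracle: Oracle) -> int:
    # The challenge equals Enc_k(0) exactly when b = 0.
    return 0 if challenge == state else 1


wins = sum(cpa_experiment(toy_gen, toy_enc, choose, guess, 128) for _ in range(100))
print(wins)  # the adversary wins all 100 rounds against the deterministic toy scheme
```

Because the toy cipher maps equal plaintexts to equal ciphertexts under a fixed key, the adversary's advantage is far from negligible, which is the kind of behavior the CPA definition rules out.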

Definition 3 (CPA secure).

A private-key encryption scheme $Π = (Gen; Enc; Dec)$ has indistinguishable encryptions under a chosen-plaintext attack (or is CPA secure) if for all probabilistic polynomial-time adversaries A there exists a negligible function negl such that:

\begin{matrix} P r [P r i K_{A, Π}^{CPA} (n) = 1] \leq \frac{1}{2} + negl (n) . \end{matrix}

(2)

3.3. Attacker Model

We assume the existence of a global probabilistic polynomial time (PPT) adversary, which can choose to compromise a small subset of nodes and obtain all secrets of these nodes. With oracle access, it can also obtain the ciphertext for any chosen plaintext from any of the uncompromised nodes. Once the adversary compromises a sensor node, it will obtain its secret key and may modify, forge or discard messages, or simply transmit false aggregation results.

In this paper, we do not consider stealthy attacks [18] where the attacker's goal is to make the $B S$ accept false aggregation results while not being detected. Also, we do not consider the denial-of-service (DoS) attack in various protocol layers [19, 20] where the adversary prevents the querier from getting any aggregation result at all. However, if a node does not respond to queries, it is clear that something is wrong, and solutions can be implemented to remedy this situation. Sybil/node replication attacks [21] or “wormhole” formation [22, 23] are beyond the scope of this paper.

4. PEC2P

The proposed scheme $PEC 2 P$ mainly consists of bootstrapping phase, data aggregation phase, and result retrieving phase.

4.1. Bootstrapping Phase

In the bootstrapping phase, the modulus $M = 2^{l}$ is stored in all nodes, and so are a collision-resistant cryptographic hash function $H : {0, 1}^{*} \rightarrow {0, 1}^{l}$ and a PRF $f : {0, 1}^{l} \rightarrow {0, 1}^{l}$ .

We further assume that $B S$ first runs Algorithm 1 such that a unique initialization vector $I V_{i}$ is generated, and secret key $k_{i} = I V_{i}$ is stored in $B S$ 's local record and node i.

Algorithm 1: Bootstrapping algorithm.

begin

randomly pick master key $S K \in {0,1}^{l}$ ;

for $i \leftarrow 0$ to $N - 1$ do

${I V}_{i} \leftarrow f_{S K} (i)$ ;

store ${I V}_{i}$ in $B S$ ;

store $k_{i} = {I V}_{i}$ in $B S$ and node i;

end
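The bootstrapping step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: HMAC-SHA256 is used as a stand-in for the keyed PRF $f_{S K}$ , the bit length `L_BITS = 256` is an assumed value for l, and the names `prf` and `bootstrap` are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional

L_BITS = 256  # assumed bit length l; the paper leaves l as a parameter


def prf(key: bytes, data: bytes) -> bytes:
    """Stand-in for the PRF f: HMAC-SHA256 (an assumption, not the paper's choice)."""
    return hmac.new(key, data, hashlib.sha256).digest()


def bootstrap(num_nodes: int, master_key: Optional[bytes] = None) -> Dict[int, bytes]:
    """Return the BS-side table {node ID: k_i}, where k_i = IV_i = f_SK(i)."""
    if master_key is None:
        master_key = secrets.token_bytes(L_BITS // 8)  # randomly pick master key SK
    # Each node i would be preloaded with its own entry of this table.
    return {i: prf(master_key, i.to_bytes(4, "big")) for i in range(num_nodes)}


keys = bootstrap(5, master_key=b"\x00" * 32)
print(len(keys), len(keys[0]) * 8)  # 5 keys of 256 bits each
```

Deriving every $I V_{i}$ from one master key means $B S$ could regenerate the initial keys on demand instead of storing them, although Algorithm 1 as written stores them explicitly.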

4.2. Data Aggregation Phase

Each sensor node in targeted area may behave as a sensing node, an aggregator, or combined. To simplify the discussion, we assume that each node can perform one role of sensing or aggregating without the loss of generality. Any node with combined role can be logically split into a sensing node and an aggregating node. As shown in Figure 2, aggregator C both senses data and aggregates data from downstream. It is divided into sensing node $C_{0}$ and aggregating node $C_{1}$ . After the transformation, only leaf nodes sense environmental data.

Figure 2

An example of data aggregation phase.

In the aggregation phase, when a targeted event happens or $B S$ disseminates a query, each leaf sensor node i with environmental data $x_{i}$ runs Algorithm 2 to compute its individual aggregation result $〈 c_{i}, h a x_{i} 〉$ . First, i inputs environmental data $x_{i}$ , then sets $c_{i} = 1$ and $h a x_{i} = (x_{i} + H (k_{i})) \mod M$ since it has no children nodes. Second, i forwards the result to its parent node for data aggregation. Finally, i updates its secret key $k_{i} = f (k_{i})$ . Other leaf sensor nodes remain hibernated.

Algorithm 2: Perturbation algorithm.

begin

Input: environmental data $x_{i}$ ;

$c_{i} \leftarrow 1$ ;

$H_{i} \leftarrow H (k_{i})$ ;

$h a x_{i} \leftarrow (x_{i} + H_{i}) \mod M$ ;

$k_{i} \leftarrow f (k_{i})$ ;

return $〈 c_{i}, h a x_{i} 〉$ ;

end
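A minimal Python sketch of Algorithm 2 is given below, under explicit assumptions: SHA-1 serves as H (so l = 160, matching the hash used as an example in the accuracy analysis), and the key update f is modeled with HMAC-SHA1 over a fixed label, which is a hypothetical stand-in for the PRF. The function name `perturb` is also illustrative.

```python
import hashlib
import hmac
from typing import Tuple

L_BITS = 160          # assumed l: SHA-1 output length
M = 1 << L_BITS       # modulus M = 2^l


def H(key: bytes) -> int:
    """Hash H as an integer in [0, 2^l - 1] (SHA-1 assumed)."""
    return int.from_bytes(hashlib.sha1(key).digest(), "big")


def f(key: bytes) -> bytes:
    """Key-update PRF stand-in: HMAC-SHA1 over a fixed label (an assumption)."""
    return hmac.new(key, b"update", hashlib.sha1).digest()


def perturb(x: int, key: bytes) -> Tuple[Tuple[int, int], bytes]:
    """Return the message <c_i, hax_i> and the updated key k_i <- f(k_i)."""
    c = 1                           # a leaf reports exactly one reading
    hax = (x + H(key)) % M          # perturb the reading with H(k_i)
    return (c, hax), f(key)         # the caller must store the new key


key = b"k" * 20
(c, hax), new_key = perturb(25, key)
print(c, (hax - H(key)) % M)  # 1 25
```

Returning the updated key alongside the message makes the key rotation explicit; a node that forgot to store it would fall out of step with the key $B S$ expects in the next round.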

During each aggregation, upon receiving a message from one of its children nodes for the first time, each aggregator i starts a timer t and collects other messages before t fires. Then, it runs Algorithm 3 to compute the partial aggregation result $〈 c_{i}, h a x_{i} 〉$ . First, i computes the partial count $c_{i} = \sum_{j \in S_{i}} c_{j}$ and the partial perturbed data $h a x_{i} = \sum_{j \in S_{i}} h a x_{j} \mod M$ . Then i forwards the result to its parent node. Aggregators receiving no messages from downstream just remain hibernated. Note that the count of reporting nodes is what $B S$ uses to trace the contributing nodes; hence, synchronization among sensors is not needed.

Algorithm 3: Aggregation algorithm.

begin

$c_{i} \leftarrow \sum_{j \in S_{i}} ‍ c_{j}$ ;

$h a x_{i} \leftarrow \sum_{j \in S_{i}} ‍ h a x_{j}$ mod M;

return $〈 c_{i}, h a x_{i} 〉$ ;

end

Definition 4.

$S_{i}$ is the set of IDs of the reporting nodes that are children of node i.
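Algorithm 3 amounts to two sums over the messages received from the children in $S_{i}$ . The sketch below assumes l = 160 and represents each message as a `(count, hax)` tuple; the function name `aggregate` is hypothetical.

```python
from typing import Iterable, Tuple

M = 1 << 160  # modulus M = 2^l, with l = 160 assumed


def aggregate(messages: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Combine child messages <c_j, hax_j> into the partial result <c_i, hax_i>."""
    count = 0
    hax = 0
    for c_j, hax_j in messages:
        count += c_j                 # partial count of reporting nodes
        hax = (hax + hax_j) % M      # partial perturbed sum, reduced mod M
    return count, hax


print(aggregate([(1, M - 1), (1, 5)]))  # (2, 4): the sum wraps around the modulus
```

Since modular addition is associative, aggregating partial results level by level gives the same pair as aggregating all leaf messages at once, which is why aggregators need no knowledge of keys or IDs.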

To show how our scheme works, we take Figure 2 as an example. Node B and D are leaf sensor nodes with their own environmental data $x_{B}$ and $x_{D}$ . Node C is divided into $C_{0}$ and $C_{1}$ such that $C_{0}$ runs Algorithm 2 and node $C_{1}$ runs Algorithm 3 respectively. Aggregator A just forwards messages after aggregating data received from B and C. $B S$ obtains the final aggregation result: $C_{B S} = 3$ and $H A X_{B S} = 0 x C B 66 E 916$ .

4.3. Result Retrieving Phase

In result retrieving phase, after receiving final aggregation result $〈 C_{B S}, H A X_{B S} 〉$ , $B S$ runs Algorithm 7 to retrieve ID list and actual aggregation result. First, $B S$ orderly selects a list $IDL$ of $C_{B S}$ nodes and corresponding shared keys $k_{j}$ from the N nodes, and $B S$ computes

\begin{matrix} A g g = (H A X_{B S} - \sum_{i \in I D L} ‍ H (k_{i})) \mod M \end{matrix}

(3)

If $A g g \in [C_{B S} * v_{\min}, C_{B S} * v_{\max}]$ , then $B S$ will admit that $A g g$ is the actual aggregation result $\sum_{i \in I D L} x_{i}$ and update the secret keys for the found $C_{B S}$ nodes. If not, $B S$ will continue searching.

To improve searching efficiency for $B S$ , we can first divide the network into clusters of trees each containing part of N nodes. Further analysis is in Section 5.3.
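The search performed by $B S$ can be sketched as a brute-force enumeration of candidate ID lists. This is an illustrative sketch under assumptions, not the paper's retrieval algorithm: SHA-1 is assumed for H with l = 160, candidates are enumerated in lexicographic order with `itertools.combinations`, and the key update for the found nodes is omitted. The names `retrieve`, `keys`, and `readings` are hypothetical.

```python
import hashlib
from itertools import combinations
from typing import Dict, Optional, Tuple

M = 1 << 160  # modulus M = 2^l, with l = 160 assumed


def H(key: bytes) -> int:
    """Hash H as an integer (SHA-1 assumed)."""
    return int.from_bytes(hashlib.sha1(key).digest(), "big")


def retrieve(count: int, hax: int, keys: Dict[int, bytes],
             v_min: int, v_max: int) -> Optional[Tuple[Tuple[int, ...], int]]:
    """Try ID lists of size `count` until Agg falls in the valid range."""
    for ids in combinations(sorted(keys), count):
        agg = (hax - sum(H(keys[i]) for i in ids)) % M   # equation (3)
        if count * v_min <= agg <= count * v_max:         # range check
            return ids, agg
    return None                                           # no candidate matched


# Usage: six nodes, of which nodes 1 and 4 report readings 20 and 31.
keys = {i: bytes([i]) * 20 for i in range(6)}
readings = {1: 20, 4: 31}
hax = sum(x + H(keys[i]) for i, x in readings.items()) % M
result = retrieve(len(readings), hax, keys, 0, 127)
print(result)  # with overwhelming probability: ((1, 4), 51)
```

In the worst case this enumeration examines every one of the $\binom{N}{C_{B S}}$ candidate lists, which illustrates why splitting the network into smaller clusters reduces the work at $B S$ .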

5. Analysis and Experiments

5.1. Accuracy Analysis

Theorem 5.

$PEC 2 P$ has a probability of at least

\begin{matrix} 1 - \frac{C_{B S} * (v_{\max} - v_{\min})}{M - 1 - C_{B S} * v_{\min}} \end{matrix}

(4)

in finding the correct combination in the result retrieving phase, given that the environmental data is in the range $[v_{\min}, v_{\max}]$ , the hash value is in the range $[0, 2^{l} - 1]$ , and the modulus M is $2^{l}$ .

Proof.

We assume that ${t \leftarrow {0, 1}^{λ} : H (t)}$ is the uniform distribution over ${0, 1}^{λ}$ , and then $H (x)$ is independent of $H (y) (x \neq y)$ . The probability of $H (x) = H (y)$ is $1 / 2^{λ}$ . When $c = N$ : $| T_{1} | = | T_{2} | = c$ , $T_{1} \neq T_{2}$ , then we believe the probability of $\sum_{i \in T_{1}} ‍ Hash (k_{i}) = \sum_{j \in T_{2}} ‍ Hash (k_{j})$ is $N / 2^{λ}$ .

Thus, adding $C_{B S}$ environmental data together will result in a number in the range $[C_{B S} * v_{\min}, C_{B S} * v_{\max}]$ , and the aggregation result is in the range $[C_{B S} * v_{\min}, C_{B S} * v_{\max} + M - 1] = [C_{B S} * v_{\min}, M - 1]$ . If the result is valid, it has to belong to range $[C_{B S} * v_{\min}, C_{B S} * v_{\max}]$ . Then, the probability that $B S$ accepts a false aggregation result is at most

\begin{matrix} \frac{C_{B S} * (v_{\max} - v_{\min})}{M - 1 - C_{B S} * v_{\min}} . \end{matrix}

(5)

Hence, (4) holds.

If we have 1024 nodes in the network and the data sensed from the environment is in the range $[0, 2^{32} - 1]$ , we use SHA-1 as our hash function, and the output is in the range $[0, 2^{160} - 1]$ . We can calculate the probability that $B S$ accepts a false aggregation result is $2^{- 118}$ which can be ignored.
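The figure $2^{- 118}$ can be reproduced directly from bound (5). The short check below uses exact rational arithmetic and takes the worst case $C_{B S} = 1024$ , $v_{\min} = 0$ , $v_{\max} = 2^{32} - 1$ , and $l = 160$ from the example above; the function name `false_accept_bound` is hypothetical.

```python
from fractions import Fraction
from math import log2


def false_accept_bound(c_bs: int, v_min: int, v_max: int, l_bits: int) -> Fraction:
    """Upper bound (5) on the probability that BS accepts a false result."""
    m = 1 << l_bits                                   # modulus M = 2^l
    return Fraction(c_bs * (v_max - v_min), m - 1 - c_bs * v_min)


bound = false_accept_bound(1024, 0, 2**32 - 1, 160)
print(log2(float(bound)))  # approximately -118
```

The numerator is about $2^{10} * 2^{32} = 2^{42}$ and the denominator about $2^{160}$ , so the ratio is about $2^{- 118}$ , in agreement with the text.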

We have implemented $PEC 2 P$ on a SimpleWSN experimental system to sense temperature in the lab. The characteristics of a SimpleWSN node are shown in Table 1.

Table 1

Characteristics of simple WSN node.

CPU 8-bit	8 MHz
Storage	10 Kbytes RAM
Storage	48 Kbytes FLASH
Communication	2.4 GHz
Bandwidth	250 Kbps
Operating system	TinyOS

Results are shown in Table 2. $B S$ has ID ′01′, and the sensor nodes' IDs are in {′02′, ′03′, ′04′, ′05′, ′06′}, as the ID list column of Table 2 shows. Column 1 displays the number of participating nodes C from the aggregation result $〈 C, H A X 〉$ . Column 2 displays the perturbed data $H A X$ from $〈 C, H A X 〉$ . Column 3 displays the sum of hash values computed by $B S$ . Column 4 displays the sum of environmental data obtained after $B S$ searches and subtracts the sum of hash values from $H A X$ . The IDs of the found nodes are shown in column 5. The sensed temperature is a hexadecimal integer. We use Temperature $({}^{°}C) = ((t / 4096) * 1.5 - 0.986) / (0.00355)$ , provided by the SimpleWSN experimental platform, to transform environmental data into a floating-point number that represents degrees Celsius. The average temperature is about 30 degrees Celsius in our experiment. The results confirm the accuracy of $PEC 2 P$ : subtracting the data in column 3 from the data in column 2 yields the data in column 4. The results verify that both the exact IDs and the actual aggregation result are retrieved correctly.

Table 2

Results of $B S$ running selection algorithm after receiving aggregation results.

C	HAX	Hash sum	Data sum	ID list
03	50 63 46 27	50 63 23 3D	22 EA	00 03 00 05 06
03	49 77 C2 A4	49 77 9F 8F	23 15	00 03 04 00 06
02	56 A4 A8 A3	56 A4 91 38	17 6B	02 00 04 00 00
04	E1 1D 4A 91	E1 1D 1B CA	2E C7	02 03 04 00 06
01	2D 48 11 FF	2D 48 06 4B	0B B4	00 00 04 00 00
00	00 00 00 00	00 00 00 00	00 00	00 00 00 00 00
04	BD 98 F6 19	BD 98 C7 8A	2E 8F	02 00 04 05 06
02	ED 48 FF EE	ED 48 E8 78	17 76	00 03 04 00 00
03	A3 A6 20 EB	A3 A5 FD F7	22 F4	02 00 04 05 00
01	5B AD 89 B0	5B AD 7D EB	0B C5	00 03 00 00 00
03	AE AF C4 01	AE AF A0 FF	23 02	02 03 00 05 00
01	EC 09 B6 05	EC 09 AA 6B	0B 9A	00 00 00 00 06
02	DB 23 9A C9	DB 23 83 76	17 53	02 00 00 00 06
03	15 8B 95 E4	15 8B 72 ED	22 F7	02 00 04 05 00
02	16 E3 FB 2A	16 E3 E3 EC	17 3E	00 00 04 05 00
01	64 91 C9 E1	64 91 BE 29	0B B8	02 00 00 00 00
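The relation between the columns of Table 2 can be checked mechanically. The sketch below assumes that each row splits into a 4-byte HAX, a 4-byte hash sum, and a 2-byte data sum, and that the experiment works modulo $2^{32}$ (inferred from the 4-byte width, not stated in the text). The helper names `check_row` and `to_celsius` are hypothetical; the conversion formula is the one quoted above.

```python
def check_row(hax_hex: str, hash_hex: str, data_hex: str, m_bits: int = 32) -> bool:
    """Return True if (HAX - hash sum) mod 2^m_bits equals the data sum."""
    hax = int(hax_hex.replace(" ", ""), 16)
    hash_sum = int(hash_hex.replace(" ", ""), 16)
    data_sum = int(data_hex.replace(" ", ""), 16)
    return (hax - hash_sum) % (1 << m_bits) == data_sum


def to_celsius(t: int) -> float:
    """Convert a raw reading t to degrees Celsius with the platform formula."""
    return ((t / 4096) * 1.5 - 0.986) / 0.00355


# First row of Table 2 (three reporting nodes).
print(check_row("50 63 46 27", "50 63 23 3D", "22 EA"))   # True

# Fifth row of Table 2 (a single reporting node, raw reading 0x0BB4).
print(check_row("2D 48 11 FF", "2D 48 06 4B", "0B B4"))   # True
print(round(to_celsius(0x0BB4), 1))                        # about 31.3
```

The single-node row converts to roughly 31 degrees Celsius, consistent with the reported average of about 30 degrees.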

5.2. Security Analysis

We assume that each sensor node shares a unique key with $B S$ and a common one-way hash function H is used. When an event happens, all nodes which are collecting environmental data will add the hash value computed on $f (k)$ to the environmental data x. Intuitively, since key $k_{i}$ is only shared between node i and $B S$ , other node $j (j \neq i)$ cannot successfully compute $H (k_{i})$ with the probability ϵ that is not negligible. And it is also difficult for adversaries to compute the correct hash value of any given x. Hence, both privacy and confidentiality are achieved. We will prove this by security reduction. First, we construct an encryption scheme (Algorithm 4).

Algorithm 4: Construction $Π^{*}$ .

/* Define a private-key encryption scheme for messages of length L and keys of length n as follows: */

(i) Gen: on input $1^{n}$ , choose $k \leftarrow {0,1}^{n}$ uniformly at random and output it as the key.

(ii) Enc: on input a key $k \in {0,1}^{n}$ and a message $m \in {0,1}^{L}$ , output the ciphertext:

$〈 c = 1, s = (m + H (k)) \mod M 〉$

(iii) Dec: on input a key $k \in {0,1}^{n}$ and a ciphertext $〈 c, s 〉$ , search for the matching set S and output the plaintext:

$m = (s - \sum_{i \in S} H (k_{i})) \mod M$

(iv) Addition of ciphertexts: given two ciphertexts $〈 c_{i}, s_{i} 〉$ and $〈 c_{j}, s_{j} 〉$ , output $〈 c_{l}, s_{l} 〉$ as the aggregation ciphertext:

$c_{l} = c_{i} + c_{j}$

$s_{l} = (s_{i} + s_{j}) \mod M$

Lemma 6.

Algorithm 4 is CPA secure if H is a pseudorandom function (PRF). One has

\begin{matrix} P r [P r i K_{A, Π^{*}}^{CPA} (n) = 1] \leq \frac{1}{2} + negl (n) . \end{matrix}

(6)

Proof.

If we replace the hash function H in Algorithm 4 with a truly random function F, we obtain a new construction $Π^{'}$ . It is obvious that

\begin{matrix} P r [P r i K_{A, Π^{'}}^{CPA} (n) = 1] \leq \frac{1}{2} + negl (n) . \end{matrix}

(7)

If H fulfills the requirement, then ${t \leftarrow {0, 1}^{λ} : H (t)}$ is the uniform distribution over ${0, 1}^{λ}$ . Therefore, (6) holds.

Theorem 7.

$PEC 2 P$ is CPA secure if the hash function H makes the following distributions identical:

\begin{matrix} {t \leftarrow {0, 1}^{λ} : H (t) + m_{0}}, {t \leftarrow {0, 1}^{λ} : H (t) + m_{1}} . \end{matrix}

(8)

Proof.

Proof for the nonhashed scheme. We assume that adversary A attacks (CPA) $PEC 2 P$ with success probability $(1 / 2) + ϵ (n)$ . Now, we can construct a fast algorithm $A^{'}$ to “break” Construction $Π^{*}$ , and $A^{'}$ tries to achieve its goal by running A as in Algorithm 5.

\begin{array}{l} P r_{A^{'}}^{H} [Success] \\ = \frac{1}{2} \{P r [b^{′′} = 0 ∣ b = 0] + P r [b^{′′} = 1 ∣ b = 1]\} \\ = \frac{1}{2} \{\frac{1}{N} P r [b^{′′} = 0 ∣ b = 0, b^{'} = 0] \\ + \frac{N - 1}{N} P r [b^{′′} = 0 ∣ b = 0, b^{'} = 1] \\ + \frac{1}{N} P r [b^{′′} = 1 ∣ b = 1, b^{'} = 1] \\ + \frac{N - 1}{N} P r [b^{′′} = 1 ∣ b = 1, b^{'} = 0]\} \\ = \frac{1}{2} \{\frac{1}{N} P r [P r i K_{A, PEC 2 P}^{CPA} (n) = 1] + \frac{N - 1}{N} * \frac{1}{2} \\ + \frac{1}{N} P r [P r i K_{A, PEC 2 P}^{CPA} (n) = 1] + \frac{N - 1}{N} * \frac{1}{2}\} \\ = \frac{N - 1}{N} * \frac{1}{2} + \frac{1}{N} P r [P r i K_{A, PEC 2 P}^{CPA} (n) = 1] \\ = \frac{N - 1}{N} * \frac{1}{2} + \frac{1}{N} (\frac{1}{2} + ϵ (n)) = \frac{1}{2} + \frac{ϵ (n)}{N} . \end{array}

(9)

According to Lemma 6, we should have

\begin{matrix} \frac{1}{2} + \frac{ϵ (n)}{N} \leq \frac{1}{2} + negl (n) . \end{matrix}

(10)

Therefore, $ϵ (n) \leq negl (n)$ .

Security of the Hashed Version. Only a few modifications to this security proof are needed in order to prove the security of the hashed variant. First, in Algorithm 5, all ciphertexts are now generated using the hashed values of k. Second, the security proof of the hashed scheme relies on the fact that ${t \leftarrow {0, 1}^{λ} : H (t) + m_{0}}$ and ${t \leftarrow {0, 1}^{λ} : H (t) + m_{1}}$ are identical distributions. If H fulfills the requirement, then ${t \leftarrow {0, 1}^{λ} : H (t)}$ is the uniform distribution over ${0, 1}^{λ}$ . Consequently, the two distributions are identical. This concludes the proof that the hashed scheme is semantically secure. Thus, $PEC 2 P$ is CPA secure.

Algorithm 5: $A^{'}$ .

/* $A^{'}$ tries to break $E n c_{k} (x) = x + H (k)$ */

(1) $A^{'}$ initiates other $N - 1$ nodes and has access to N oracle $E n c (\cdot)$ .

(2) A implements $P E C 2 P$ l times and obtains the ciphertext of message $x_{i}$ $(i = 1,2, \dots, l)$ .

(3) $A^{'}$ forwards the queries to the network and return $H (f (k_{i}))$ to A.

(4) A outputs two messages $m_{0}$ , $m_{1}$ , sending them to $A^{'}$ .

(5) A random bit $b \leftarrow {0,1}$ is chosen and $A^{'}$ makes an encryption query for $m_{b}$ to $E n c_{k} (\cdot)$ and gets back the challenge ciphertext $c_{b}$ $(b \in {0,1})$ .

(6) If $c_{b}$ is from the node which holds secret key k, then $A^{'}$ returns $c_{b}$ to A.

(7) A outputs a bit $b^{'}$ and returns it to $A^{'}$ .

(8) $A^{'}$ outputs $b^{''} = b^{'}$ .

(9) Else $A^{'}$ outputs $b^{''} = 0$ with the probability of $1 / 2$ and outputs $b^{''} = 1$ with the probability of $1 / 2$ .

(10) Output 1 if $b^{''} = b$ and output 0 otherwise.

5.3. Efficiency Analysis

For a reporting leaf node, the computational cost only consists of one hash computation and one modular addition. For an aggregator, the computational cost consists of the sum operation of count and sum of perturbed data. If an aggregator has reporting data, it also has one hash computation.

We assume that there are N sensor nodes in the reporting area and that the aggregation tree has a branching factor d of 3. Perturbed data $Per = header + data + append$ . We choose the packet format used in TinyOS [16], and the packet header is 56 bits. Data is in the range of $[0, 127]$ . Let the count length, ID length, and append length each be $\log_{2} N$ bits. We consider two different scenarios: (1) only nodes at the lowest level may have data satisfying $B S$ 's query and (2) nodes at each level may have data satisfying $B S$ 's query.

O-ASP [9] is designed based on an ideal and unrealistic assumption that each sensor node knows the membership and topology of the whole network and it knows whether each of these nodes has data satisfying each particular query. In each aggregation, a decision node (say $B S$ ) first compares the communication cost of [All-reporting] $(A)$ and [Non-redundant-reporting] $(N)$ for each cell and then decides which strategy will be chosen.

In Claude_09 [11], in the data aggregation phase, for scenario (1), each reporting node sends $(| Per |)$ bits of message to its parent node, and nodes at the second lowest level decide which group of IDs to send: the reporting nodes' IDs or the nonreporting nodes' IDs. For scenario (2), each reporting node sends $(| ID | + | Per |)$ bits of message.

For $PEC 2 P$ , in the data aggregation phase, for scenarios (1) and (2), each reporting node sends $(| count | + | Per |)$ bits of message to its parent node, and the same length of message will also be sent from aggregators. No ID is transmitted in the aggregation tree.

We show the number of bits sent by a leaf node in Table 3. Then, we calculate the average/maximum/minimum communication overhead $CO$ in the aggregation phase for Claude_09 and PEC2P in Table 4. In the minimum case, reporting nodes are located in the high levels of the aggregation tree, and we can find them through breadth-first search. In the maximum case, reporting nodes are located starting from the lowest level and moving toward higher levels. Tables 5 and 6 list the number of bits sent per node for each level with Claude_09 and PEC2P.

Table 3

Number of bits sent per node for leaf node.

Protocol	Number of bits
O-ASP [All]	$\| h \| + 2 * \| Per \|$
O-ASP [Non]	$\| h \| + 2 * \| Per \| + \| ID \|$
Claude_09	$\| h \| + \| Per \|$
PEC2P	$\| h \| + \log_{2} N + \| Per \|$

ID: node ID; h: header; Per: perturbed data; N: number of nodes in network.
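As a rough check of the field sizes stated above, the following Python sketch computes the message length in bits for a reporting node under Claude_09 (scenario (1), no ID attached) and under PEC2P. It is a sketch under assumptions: the helper names are hypothetical, $\log_{2} N$ is rounded up to a whole number of bits, and N is taken as the total of a 7-level tree with branching factor 3.

```python
def ceil_log2(n: int) -> int:
    """Smallest number of bits b with 2**b >= n (for n >= 1)."""
    return (n - 1).bit_length()


def message_bits(num_nodes: int, header_bits: int = 56, data_max: int = 127) -> dict:
    """Bits per message, using the field sizes assumed in the text."""
    field = ceil_log2(num_nodes)        # count, ID and append are each log2(N) bits
    data_bits = data_max.bit_length()   # data in [0, 127] fits in 7 bits
    per_bits = header_bits + data_bits + field  # Per = header + data + append
    return {"Claude_09": per_bits, "PEC2P": per_bits + field}  # PEC2P adds count


# 7-level tree with branching factor 3: 3 + 9 + ... + 2187 nodes.
N = sum(3 ** level for level in range(1, 8))
bits = message_bits(N)
print(N, bits)
```

Under these assumptions the two values come out as 75 and 87 bits, which agrees with the A (100%) columns of Tables 5 and 6.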

Table 4

Theoretical analysis of total communication overhead.

(a)
	Claude_09

Average	$\sum_{i = ⌈ \log_{d} N ⌉ - 1}^{1} ((1 - {(1 - P)}^{\sum_{j = 1}^{⌈ \log_{d} N ⌉ - i} d^{j}}) * \| H \| * d^{i}) + P * \| H \| * d^{⌈ \log_{d} N ⌉} + P^{'} * \| ID \| * d^{⌊ \log_{d} N ⌋} * (⌈ \log_{d} N ⌉ - 1)$
Minimum	$\| H \| * n + \| ID \| * (\sum_{i = 1}^{⌊ \log_{d} n ⌋} i * d^{i} + ⌊ \log_{d} n ⌋ * (n - d^{⌊ \log_{d} n ⌋}))$
Maximum
$C 1$	$\| H \| * (\sum_{i = 1}^{⌊ \log_{d} n ⌋} d^{i} + (n - d^{⌊ \log_{d} n ⌋}) * (⌊ \log_{d} N ⌋ - ⌊ \log_{d} n ⌋)) + \| ID \| * n * ⌊ \log_{d} N ⌋$
$C 2$	$\| H \| * (N - d^{⌊ \log_{d} N ⌋} + n) + \| ID \| * n * ⌊ \log_{d} N ⌋$
$C 3$	$\| H \| * N + \| ID \| * (\sum_{j = \log_{d} ⌈ \log_{d} N ⌉}^{i} j * d^{j} + (i - 1) * n - S^{i})$

	PEC2P

Average	$\sum_{i = ⌈ \log_{d} N ⌉ - 1}^{1} (1 - {(1 - P)}^{\sum_{j = 1}^{⌈ \log_{d} N ⌉ - i} d^{j}}) * \| m \| * d^{i} + P * \| m \| * d^{⌈ \log_{d} N ⌉}$
Minimum	$\| m \| * n$
Maximum
$C 1$	$\| m \| * (\sum_{i = 1}^{⌊ \log_{d} n ⌋} d^{i} + (n - d^{⌊ \log_{d} n ⌋}) * (⌈ \log_{d} N ⌉ - ⌊ \log_{d} n ⌋))$
$C 2$	$\| m \| * (N - d^{⌈ \log_{d} N ⌉} + n)$
$C 3$	$\| m \| * N$

(b)
	Claude_09

Average	$\sum_{i = ⌈ \log_{d} N ⌉ - 1}^{1} (P * \| ID \| * d^{i} * (i - 1) + (1 - {(1 - P)}^{\sum_{j = 0}^{⌈ \log_{d} N ⌉ - i} d^{j}}) * \| H \| * d^{i - 1}) + P * d^{⌈ \log_{d} N ⌉} * \| ID \| * (⌈ \log_{d} N ⌉ - 1) + P * \| H \| * d^{⌈ \log_{d} N ⌉}$
Minimum	$\| H \| * n + \| ID \| * (\sum_{i = 1}^{⌊ \log_{d} n ⌋} i * d^{i} + (⌊ \log_{d} n ⌋ + 1) * (n - d^{⌊ \log_{d} n ⌋}))$
Maximum
$C 1$	$\| H \| * (\sum_{i = 1}^{⌊ \log_{d} n ⌋} d^{i} + (n - d^{⌊ \log_{d} n ⌋}) * (⌊ \log_{d} N ⌋ - ⌊ \log_{d} n ⌋)) + \| ID \| * n * ⌊ \log_{d} N ⌋$
$C 2$	$\| H \| * (N - d^{⌊ \log_{d} N ⌋} + n) + \| ID \| * n * ⌊ \log_{d} N ⌋$
$C 3$	$\| H \| * N + \| ID \| * (\sum_{j = \log_{d} ⌈ \log_{d} N ⌉}^{i} j * d^{j} + (i - 1) * n - S^{i})$

	PEC2P

Average	$\sum_{i = ⌈ \log_{d} N ⌉ - 1}^{1} (1 - {(1 - P)}^{\sum_{j = 0}^{⌈ \log_{d} N ⌉ - i} d^{j}}) * \| m \| * d^{i - 1} + P * \| m \| * d^{⌈ \log_{d} N ⌉}$
Minimum	$\| m \| * n$
Maximum
$C 1$	$\| m \| * (\sum_{i = 1}^{⌊ \log_{d} n ⌋} d^{i} + (n - d^{⌊ \log_{d} n ⌋}) * (⌈ \log_{d} N ⌉ - ⌊ \log_{d} n ⌋))$
$C 2$	$\| m \| * (N - d^{⌈ \log_{d} N ⌉} + n)$
$C 3$	$\| m \| * N$

Note: in (a), only nodes at the lowest level may have data satisfying $BS$'s query; in (b), nodes at each level may have data satisfying $BS$'s query.

$n = P * N$; d: degree; N: number of nodes in network; and $P^{'} = P$ if $P \leq 0.5$, otherwise $P^{'} = 1 - P$.

ID: node ID; $H = header + data + appendedBit$; $m = count + header + data + appendedBit$.

$C_{1}$: $n < d^{⌊ \log_{d} N ⌋ - 1}$; $C_{2}$: $d^{⌊ \log_{d} N ⌋ - 1} < n < d^{⌊ \log_{d} N ⌋}$; $C_{3}$: $n > d^{⌊ \log_{d} N ⌋}$; $S_{j} = \sum_{\log_{d} ⌈ N ⌉}^{j} d^{j}$, and $S^{i} < n < S^{i - 1}$.
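To make the Minimum rows of Table 4(a) concrete, the Python sketch below evaluates both expressions as printed. It is only an illustration under assumptions: the function names are hypothetical, and the example values ($d = 3$, $n = 81$, $\| H \| = 75$, $\| m \| = 87$, $\| ID \| = 12$ bits) are chosen to match the 7-level tree setting rather than taken from a reported experiment.

```python
def floor_log(n: int, base: int) -> int:
    """Largest integer k with base**k <= n (for n >= 1, base >= 2)."""
    k = 0
    while base ** (k + 1) <= n:
        k += 1
    return k


def min_overhead_claude09(n: int, d: int, h_bits: int, id_bits: int) -> int:
    """Minimum row for Claude_09 in Table 4(a), evaluated as printed."""
    k = floor_log(n, d)
    id_hops = sum(i * d ** i for i in range(1, k + 1)) + k * (n - d ** k)
    return h_bits * n + id_bits * id_hops


def min_overhead_pec2p(n: int, m_bits: int) -> int:
    """Minimum row for PEC2P: every reporting node sends one |m|-bit message."""
    return m_bits * n


# Assumed example: 81 reporting nodes in a tree with branching factor 3.
claude = min_overhead_claude09(n=81, d=3, h_bits=75, id_bits=12)
pec2p = min_overhead_pec2p(n=81, m_bits=87)
print(claude, pec2p)
```

In this example the ID-forwarding term alone contributes 5112 bits to Claude_09, which is why PEC2P comes out lower despite its longer per-message format.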

Table 5

Number of bits sent per node for each level with Claude_09 scheme.

Level	Number of nodes	A (100%)	A (90%)	A (70%)	AV (100%)	AV (90%)	AV (70%)	HBH-A	HBH-AV	No-Agg
1	3	75	949.8	2699.4	100	974.8	2724.4	73	97	68859
2	9	75	366.6	949.8	100	391.6	974.8	72	94	22932
3	27	75	172.2	366.6	100	197.2	391.6	70	91	7623
4	81	75	107.4	172.2	100	132.4	197.2	68	87	2520
5	243	75	85.8	107.4	100	110.8	132.4	67	84	819
6	729	75	78.5	83.8	100	103.5	108.1	65	81	252
7	2187	75	67.5	52.5	100	90	70	63	63	63

Note: only the nodes in the lowest level may have data satisfying $B S$ 's query.

Table 6

Number of bits sent per node for each level with PEC2P scheme.

Level	Number of nodes	A (100%)	A (90%)	A (70%)	AV (100%)	AV (90%)	AV (70%)	HBH-A	HBH-AV	No-Agg
1	3	87	87	87	112	112	112	73	97	68859
2	9	87	87	87	112	112	112	72	94	22932
3	27	87	87	87	112	112	112	70	91	7623
4	81	87	87	87	112	112	112	68	87	2520
5	243	87	87	87	112	112	112	67	84	819
6	729	87	86.9	84.7	112	111.9	109	65	81	252
7	2187	87	78.3	60.9	112	100.8	78.4	63	63	63

Note: only nodes in the lowest level may have data satisfying $B S$ 's query.

Figures 3 and 4 show the trend of communication overhead in two different scenarios.

Figure 3

Communication overhead with different probability of reporting data when only nodes in the lowest level may have data satisfying $BS$'s query.

Figure 4

Communication overhead with different probability of reporting data when nodes at each level may have data satisfying $BS$'s query.

We assume that only the nodes in the lowest level have a probability of $P (= 0.1, 0.5, 0.9)$ to sense environmental data. Results are shown in Figures 5(a), 5(b), and 5(c).

Figure 5

Bandwidth consumption in data aggregation phase when only nodes in the lowest level may have data satisfying $BS$'s query.

We further assume that all nodes in the aggregation tree have a probability of $P (= 0.1, 0.5, 0.9)$ to sense environmental data. Results are shown in Figures 6(a), 6(b), and 6(c).

Figure 6

Bandwidth consumption in data aggregation phase when nodes at each level may have data satisfying $BS$'s query.

Results show that, compared with existing protocols, $PEC 2 P$ can greatly reduce communication overhead in aggregation phase. We notice that the major communication overhead is caused by transferring the hash value which was computed by SHA-1 in the comparison. Performance can be further optimized by choosing other hash functions with shorter output in case of lower security level requirement.

Result Retrieving Algorithm Test. We used a computer with a Pentium(R) D CPU at 3.40 GHz and 2.00 GB of memory to test Algorithm 7. Since sensor nodes are relatively uniformly distributed and their communication range is from 50 meters to 100 meters, a local event will be detected by a small group of sensor nodes. Therefore, we choose to use a small N. Results show that choosing 5 nodes from 10 nodes needs only 8 milliseconds and choosing 10 nodes from 20 nodes needs approximately 2 seconds. In WSNs, the $BS$ is typically more powerful than our experimental computer; thus, the searching time will be shorter in real applications. To make the search more efficient, we can first divide the network into clusters of trees.
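The growth in search time between the two test cases follows from the number of candidate ID lists the $BS$ may have to try in the worst case. The short Python sketch below counts them; the function name is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def candidate_id_lists(total_nodes: int, reporting_nodes: int) -> int:
    """Number of ways to choose which nodes reported (worst-case search size)."""
    return comb(total_nodes, reporting_nodes)


small = candidate_id_lists(10, 5)    # choosing 5 nodes from 10
large = candidate_id_lists(20, 10)   # choosing 10 nodes from 20
print(small, large)
```

The counts are 252 and 184756, so doubling N in this way enlarges the worst-case search space by a factor of several hundred, which is consistent in direction with the measured jump from milliseconds to seconds and motivates keeping N small through clustering.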

6. Conclusion

Confidentiality protection and energy efficiency are two conflicting but equally crucial requirements in WSNs. Achieving a trade-off between these two goals remains a challenge. We propose $PEC2P$, which protects data confidentiality while also achieving energy efficiency. Specifically, we need no ID list and use a one-way hash function to generate the perturbation added to the environmental data. Since the $BS$ usually has powerful computation capacities, we utilize the $BS$ to the fullest and let it compute which nodes have actually contributed to the aggregation process after receiving the final perturbed aggregation result. Consequently, we manage to preserve data confidentiality, avoid high energy consumption, and obtain lower overall communication overhead. Analysis and experiments have also been conducted to evaluate the proposed protocol. The results show that our protocol provides confidentiality protection for both raw and aggregated data with an overhead lower than that of the existing related protocols. $PEC2P$ can be adapted to tree/cluster-based aggregation and to any protocol using ID-list transmission. We focus on collecting the number of contributing nodes and their perturbed data, rather than on how the information is gathered. For uniformity, we use a tree topology in this paper. We also performed a cluster-based comparison with existing protocols, and the results show no significant difference.

Appendix

For more details, see Algorithms 6 and 7.

Algorithm 6: Matching algorithm.

Input: $\langle IDList, HAX_{BS} \rangle$:

begin

$Agg \leftarrow -1$;

while $IDL \neq \perp$ do

$hax_{BS} \leftarrow 0$; $j \leftarrow 0$;

for $i \leftarrow 0$ to $N - 1$ do

if $IDL[i] = 1$ then

$hax_{BS} \leftarrow hax_{BS} + H(k_{ri})$;

if $HAX_{BS} - hax_{BS} \in [C_{BS} * v_{\min}, C_{BS} * v_{\max}]$ then

$Agg \leftarrow HAX_{BS} - hax_{BS}$;

for $j \leftarrow 0$ to $C_{BS} - 1$ do

$k_{temp[j]} \leftarrow f(k_{temp[j]})$;

break;

return $Agg$;

end
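The range check at the heart of the matching step can be sketched in Python as follows. This is a minimal illustration under assumptions, not the authors' code: `H` is taken to be SHA-1 truncated to 32 bits, sums are reduced modulo `MODULUS = 2**32`, keys are arbitrary byte strings, and the key update $k \leftarrow f(k)$ after a successful match is omitted.

```python
import hashlib
from typing import Sequence

MODULUS = 2 ** 32  # assumed modulus for perturbed sums


def H(key: bytes) -> int:
    """Assumed perturbation: first 4 bytes of SHA-1(key) as an integer."""
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big")


def matching(id_list: Sequence[int], keys: Sequence[bytes], hax_bs: int,
             count: int, v_min: int, v_max: int) -> int:
    """Return the aggregate if the candidate ID list is plausible, else -1.

    The candidate is accepted when removing the hash masks of the flagged
    nodes leaves a value inside [count * v_min, count * v_max].
    """
    mask_sum = sum(H(keys[i]) for i, flag in enumerate(id_list) if flag == 1)
    candidate = (hax_bs - mask_sum) % MODULUS
    if count * v_min <= candidate <= count * v_max:
        return candidate
    return -1


# Hypothetical 4-node example: nodes 1 and 3 report readings 20 and 100.
keys = [b"k0", b"k1", b"k2", b"k3"]
readings = {1: 20, 3: 100}
hax = sum(value + H(keys[i]) for i, value in readings.items()) % MODULUS
agg = matching([0, 1, 0, 1], keys, hax, count=2, v_min=0, v_max=127)
print(agg)
```

A wrong ID list leaves an essentially random residue modulo `MODULUS`, so it falls inside the narrow valid range only with small probability; this is the property the accuracy analysis relies on.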

Algorithm 7: Result retrieving algorithm.

begin

$Agg \leftarrow -1$; $i \leftarrow 0$; $c \leftarrow 0$;

for $i \leftarrow 0$ to $C - 1$ do

$IDL[i] \leftarrow 1$;

for $i \leftarrow C$ to $N - 1$ do

$IDL[i] \leftarrow 0$;

$Agg \leftarrow$ Matching($IDL, HAX_{BS}$);

if $Agg \neq -1$ then

return $\langle IDL, Agg \rangle$;

/*search $C_{N}^{C} - 1$ times*/

for $order \leftarrow 1$ to $C_{N}^{C} - 1$ do

for $i \leftarrow 0$ to $N - 1$ do

if $IDL[i] = 1$ then

$c \leftarrow c + 1$;

/*find the last '1' in $IDL[]$*/

if $c = C$ and $i \leq N - 1$ then

$IDL[i + 1] \leftarrow 1$; $IDL[i] \leftarrow 0$;

$Agg \leftarrow$ Matching($IDL, HAX_{BS}$);

if $Agg \neq -1$ then

return $\langle IDL, Agg \rangle$;

$i \leftarrow N$; $c \leftarrow 0$;

/*the last '1' is in the last position, then move the last continuous '1's to the first found '1' before them*/

if $c = C$ and $i = N - 1$ then

/*how many '1's should be moved*/

$group \leftarrow 0$;

/*from behind*/

for $j \leftarrow N - 1$ to 0 do

if $IDL[j] = 1$ then

$group \leftarrow group + 1$;

/*found the empty position and move the newly found '1' and the continuous '1's*/

else

/*newly found '1'*/

$m \leftarrow 0$;

/*continuous '1's' new location*/

$NewLocation \leftarrow 0$;

for $m \leftarrow j - 1$ to 0 do

if $IDL[m] = 1$ then

$IDL[m] \leftarrow 0$;

$IDL[m + 1] \leftarrow 1$;

$NewLocation \leftarrow m + 2$;

/*searching ends*/

$m \leftarrow -1$;

/*move the continuous '1's*/

if $NewLocation < N$ then

for $p \leftarrow N - 1$ to $N - group$ do

$IDL[p] \leftarrow 0$;

for $q \leftarrow NewLocation$ to $group + NewLocation - 1$ do

$IDL[q] \leftarrow 1$;

$j \leftarrow -1$;

$Agg \leftarrow$ Matching($IDL, HAX_{BS}$);

if $Agg \neq -1$ then

return $\langle IDL, Agg \rangle$;

$i \leftarrow N$; $c \leftarrow 0$;

end
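The overall search can be sketched compactly in Python. This is an illustration under assumptions and swaps in a library routine for the manual bit shuffling: `itertools.combinations` enumerates the C-of-N candidate ID lists, starting from the list with the first C positions set. As before, `H` is assumed to be SHA-1 truncated to 32 bits, sums are taken modulo `MODULUS = 2**32`, and key updates are omitted.

```python
import hashlib
from itertools import combinations
from typing import List, Optional, Sequence, Tuple

MODULUS = 2 ** 32  # assumed modulus for perturbed sums


def H(key: bytes) -> int:
    """Assumed perturbation: first 4 bytes of SHA-1(key) as an integer."""
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big")


def retrieve(keys: Sequence[bytes], hax_bs: int, count: int,
             v_min: int, v_max: int) -> Optional[Tuple[List[int], int]]:
    """Search all ID lists with `count` reporting nodes.

    Returns (id_list, aggregate) for the first candidate whose unmasked
    sum lies in [count * v_min, count * v_max], or None if none matches.
    """
    masks = [H(k) for k in keys]  # hash each node key once
    for ids in combinations(range(len(keys)), count):
        candidate = (hax_bs - sum(masks[i] for i in ids)) % MODULUS
        if count * v_min <= candidate <= count * v_max:
            id_list = [1 if i in ids else 0 for i in range(len(keys))]
            return id_list, candidate
    return None


# Hypothetical 6-node example: nodes 1 and 4 report readings 30 and 90.
keys = [bytes([65 + i]) * 4 for i in range(6)]
readings = {1: 30, 4: 90}
hax = sum(value + H(keys[i]) for i, value in readings.items()) % MODULUS
result = retrieve(keys, hax, count=len(readings), v_min=0, v_max=127)
print(result)
```

Precomputing the masks keeps the cost of each candidate to C additions; the number of candidates, not the hashing, dominates the running time as N grows.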

Acknowledgments

This paper is supported by the National Natural Science Foundation of China (nos. 61272512, 61003262, and 61100172), Program for New Century Excellent Talents in University (NCET-12-0047), and Beijing Natural Science Foundation (no. 4121001).

References

1. Akyildiz I. F., Sankarasubramaniam, Cayirci. Wireless sensor networks: a survey. Computer Networks, vol. 38, no. 4, pp. 393-422, 2002.

2. Yick, Mukherjee, Ghosal. Wireless sensor network survey. Computer Networks, vol. 52, no. 12, pp. 2292-2330, 2008. Scopus: 2-s2.0-46449122114. DOI: 10.1016/j.comnet.2008.04.002.

3. Liu. Iso-Map: energy-efficient contour mapping in wireless sensor networks. IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 5, pp. 699-710, 2010. Scopus: 2-s2.0-77949914437. DOI: 10.1109/TKDE.2009.157.

4. Madden, Franklin M. J., Hellerstein J. M., Hong. Tag: a tiny aggregation service for ad-hoc sensor networks. Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI ′02), ACM SIGOPS Operating Systems Review, pp. 131-146, 2002.

5. Akkaya, Demirbas, Aygun R. S. The impact of data aggregation on the performance of wireless sensor networks. Wireless Communications and Mobile Computing, vol. 8, no. 2, pp. 171-193, 2008. Scopus: 2-s2.0-39649120580. DOI: 10.1002/wcm.454.

6. Castelluccia, Mykletun, Tsudik. Efficient aggregation of encrypted data in wireless sensor networks. Proceedings of the 2nd Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services (MobiQuitous ′05), pp. 109-117, July 2005. Scopus: 2-s2.0-33749525209. DOI: 10.1109/MOBIQUITOUS.2005.25.

7. Girao, Westhoff, Schneider. CDA: concealed data aggregation for reverse multicast traffic in wireless sensor networks. Proceedings of IEEE International Conference on Communications (ICC ′05), pp. 3044-3049, May 2005. Scopus: 2-s2.0-24144459865.

8. Mykletun, Girao, Westhoff. Public key based cryptoschemes for data concealment in wireless sensor networks. Proceedings of the 41st IEEE International Conference on Communications (ICC ′06), pp. 2288-2295, July 2006. Scopus: 2-s2.0-42549159545. DOI: 10.1109/ICC.2006.255111.

9. Feng, Wang, Zhang, Ruan. Confidentiality protection for distributed sensor data aggregation. Proceedings of the 27th IEEE International Conference on Computer Communications (INFOCOM ′08), pp. 68-76, 2008.

10. Albath, Madria. Secure hierarchical data aggregation in wireless sensor networks. Proceedings of IEEE Wireless Communications and Networking Conference (WCNC ′09), pp. 1-6, April 2009. Scopus: 2-s2.0-70349179495. DOI: 10.1109/WCNC.2009.4917960.

11. Castelluccia, Chan A. C. F., Mykletun, Tsudik. Efficient and provably secure aggregation of encrypted data in wireless sensor networks. ACM Transactions on Sensor Networks, vol. 5, no. 3, pp. 1-36, 2009. Scopus: 2-s2.0-67651030465. DOI: 10.1145/1525856.1525858.

12. Ozdemir, Xiao. Secure data aggregation in wireless sensor networks: a comprehensive overview. Computer Networks, vol. 53, no. 12, pp. 2022-2037, 2009. Scopus: 2-s2.0-67549118456. DOI: 10.1016/j.comnet.2009.02.023.

13. Evans. Secure aggregation for wireless networks. Proceedings of the 3rd IEEE Symposium on Applications and the Internet Workshops (SAINT ′03), Washington, DC, USA, IEEE Computer Society, p. 384.

14. Mahimkar, Rappaport T. S. SecureDAV: a secure data aggregation and verification protocol for sensor networks. Proceedings of the 47th IEEE Global Telecommunications Conference (GLOBECOM ′04), pp. 2175-2179, December 2004. Scopus: 2-s2.0-18144428307.

15. Ozdemir, Xiao. Polynomial regression based secure data aggregation for wireless sensor networks. Proceedings of the 54th IEEE Global Telecommunications Conference (GLOBECOM ′11), pp. 1-5, 2011.

16. Karlof, Sastry, Wagner. TinySec: a link layer security architecture for wireless sensor networks. Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems (SenSys ′04), pp. 162-175, November 2004. Scopus: 2-s2.0-26444574670.

17. Katz, Lindell A. Y. Modern Cryptography. Chapman & Hall, 2008.

18. Przydatek, Song, Perrig. SIA: secure information aggregation in sensor networks. Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys ′03), pp. 255-265, November 2003. Scopus: 2-s2.0-18844457825.

19. McCune J. M., Shi, Perrig, Reiter M. K. Detection of denial-of-message attacks on sensor network broadcasts. Proceedings of the 25th IEEE Symposium on Security and Privacy (SP ′05), pp. 64-78, May 2005. Scopus: 2-s2.0-27544432072.

20. Wood A. D., Stankovic J. A. Denial of service in sensor networks. Computer, vol. 35, no. 10, pp. 54-62, 2002. Scopus: 2-s2.0-0036793924. DOI: 10.1109/MC.2002.1039518.

21. Newsome, Shi, Song, Perrig. The Sybil attack in sensor networks: analysis and defenses. Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (IPSN ′04), pp. 259-268, April 2004. Scopus: 2-s2.0-3042785862.

22. Parno, Perrig, Gligor. Distributed detection of node replication attacks in sensor networks. Proceedings of the 25th IEEE Symposium on Security and Privacy (SP ′05), pp. 49-63, May 2005. Scopus: 2-s2.0-27544460282.

23. Brands, Chaum. Distance-bounding protocols. Proceedings of the Workshop on the Theory and Application of Cryptographic Techniques on Advances in Cryptology (EUROCRYPT ′93), vol. 839 of Lecture Notes in Computer Science, pp. 344-359, Springer, 1994.

C	HAX				Hash sum						ID list
03	50	63	46	27	50	63	23	3D	22	EA	00	03	00	05	06
03	49	77	C2	A4	49	77	9F	8F	23	15	00	03	04	00	06
02	56	A4	A8	A3	56	A4	91	38	17	6B	02	00	04	00	00
04	E1	1D	4A	91	E1	1D	1B	CA	2E	C7	02	03	04	00	06
01	2D	48	11	FF	2D	48	06	4B	0B	B4	00	00	04	00	00
00	00	00	00	00	00	00	00	00	00	00	00	00	00	00	00
04	BD	98	F6	19	BD	98	C7	8A	2E	8F	02	00	04	05	06
02	ED	48	FF	EE	ED	48	E8	78	17	76	00	03	04	00	00
03	A3	A6	20	EB	A3	A5	FD	F7	22	F4	02	00	04	05	00
01	5B	AD	89	B0	5B	AD	7D	EB	0B	C5	00	03	00	00	00
03	AE	AF	C4	01	AE	AF	A0	FF	23	02	02	03	00	05	00
01	EC	09	B6	05	EC	09	AA	6B	0B	9A	00	00	00	00	06
02	DB	23	9A	C9	DB	23	83	76	17	53	02	00	00	00	06
03	15	8B	95	E4	15	8B	72	ED	22	F7	02	00	04	05	00
02	16	E3	FB	2A	16	E3	E3	EC	17	3E	00	00	04	05	00
01	64	91	C9	E1	64	91	BE	29	0B	B8	02	00	00	00	00

