Abstract
We consider a hybrid two-tiered sensor network consisting of regular resource-limited sensor nodes and powerful master nodes with abundant resources. In the architecture, master nodes take charge of storing data collected by sensor nodes and processing queries from the base station. Due to the important role of master nodes, they might easily become the target for the adversary to compromise in an untrusted or hostile circumstance. A compromised master node may leak sensitive data in its storage to the adversary, which breaches the data privacy. This paper proposes EMQP, a novel and energy-efficient privacy-preserving MAX/MIN query protocol which is capable of preventing adversaries from obtaining sensitive data collected by sensor nodes. To preserve privacy, the 0-1 encoding verification, keyed-hash message authentication coding, and symmetric encryption are applied to achieve the secret comparison of data items without knowing their real values. On the basis of secret comparison mechanism, the data submission and query processing protocols are proposed to describe the details of EMQP. And the analyses on privacy protection and energy consumption are also given. Moreover, a hash-based optimization method is presented to save more energy of the resource-limited sensor nodes. The simulation result shows that EMQP is more efficient than the current work in energy consumption.
1. Introduction
Wireless sensor networks (WSNs) have been widely used in a variety of important areas, such as environment sensing, battlefield monitoring, and volcanic eruption predication. In this paper, we consider a two-tiered wireless sensor network (two-tiered WSNs) [1, 2] as shown in Figure 1, which consists of a large number of sensor nodes at the lower tier and relatively fewer master nodes at the upper tier. Sensor nodes are resource limited (computation, storage, energy, etc.) and take charge of collecting data and periodically submitting it to a nearby master node for storage, while master nodes have rich resources, and answer for the ad hoc data queries from the base station which are issued via an on-demand wireless (e.g., satellite) link. It is necessary to maintain such in-network data storage and query processing in remote and tough environments, where it is infeasible or difficult to keep connection between the sensor networks and the base station with the high-speed and always-on manner. The two-tiered architecture is also known to be indispensable for increasing network capacity and scalability, reducing system complexity, and prolonging network lifetime.

A two-tiered sensor network architecture.
As master nodes are responsible for data storage and query answering in networks, they are much more attractive and vulnerable to adversaries in a hostile environment. Once a master node is compromised, serious threats could be brought out. For example, adversaries could use compromised master nodes to steal information about patients in a human health monitoring sensor network, leading to the privacy breach of patients. It is a challenge for master nodes to process queries in such an environment with privacy, since they have to gain information about the collected data items for query result computing, which is conflictive with the privacy preserving objective.
Data query is an important operation for events monitoring or data analysis in sensors networks. Recently, privacy-preserving range query [3–8] and data aggregation [9–12] have been well addressed, however, research efforts on MAX/MIN query are limited, which is to query the maximum or minimum value in an interested area. In this paper, we focus on the MAX/MIN queries, which are important in many applications. For example, the MAX/MIN query can be applied to monitor the forest fires according to the maximum temperature acquiring.
To the best of our knowledge, only [13] proposed a preliminary solution to privacy-preserving MAX/MIN query in two-tiered WSNs, but it is still with the problem of inefficient energy consumption. This paper proposes an energy-efficient privacy-preserving MAX/MIN query processing (EMQP) for two-tiered WSNs. The basic idea is that sensor nodes first encode their collected data and send them to their nearby master nodes for storage, for the convenience that the master nodes can correctly process MAX/MIN queries over encoded data without knowing their real values. An adversary cannot steal any data items or query results in the master nodes, even when they were compromised. The main contribution of our work is that we introduce 0-1 encoding verification scheme to achieve the secret comparison between the collected data items, without knowing their real values. Based on that method, we propose a novel privacy-preserving MAX/MIN query protocol. To reduce the energy consumption of sensor nodes, we also give a hash-based optimization method, which demonstrates a significant energy-saving benefit. We evaluate EMQP by comprehensive simulation, and the results indicate that EMQP has a good performance compared with other methods.
The rest of this paper is organized as follows. Section 2 gives a brief review of the related work. Section 3 describes the models and the problem statement. In Section 4, we present the details of our energy-efficient privacy-preserving MAX/MIN query protocol. Section 5 gives an optimization for saving energy of sensor nodes. We evaluate the performance of our approach in Section 6 and conclude this paper in Section 7.
2. Related Work
Data storage models for sensor networks have drawn much attention in existing research work. In [14, 15], a novel data storage system is proposed by introducing an intermediate tier between the base station and sensor nodes, which can provide abundant storage for data caching and an efficient access to the data collected by sensor networks for query processing. We consider the same system model in this paper, in which some master nodes are deployed as the intermediate tier for data storage and query answering. In practice, several products of master nodes have been manufactured and are commercially available, such as StarGate [16] and RISE [17].
Recently, verifiable privacy-preserving range query in two-tiered WSNs has been widely studied [3–8], aiming to protect the privacy and integrity of range queries. Hacigümüş et al.firstly proposed a bucket partitioning [18] based scheme [3, 4], whose basic idea is to divide the domain of collected data values into multiple continuous but no overlapping buckets. In each epoch of time, sensor nodes collect data items, put them into corresponding buckets, encrypt them together in each bucket, and then send the ciphertext along with the corresponding bucket ID to a nearby master node. For each bucket without data items, an encoding number will be generated and transmitted to a nearby master node, which can be used by the base station to verify that the bucket is empty. When the base station executes a range query, it first generates the smallest set of bucket IDs covering the range in the query and then sends the ID set as the query to master nodes. Upon receiving the bucket IDs, the master nodes return the corresponding ciphertext in all those buckets. The base station can then decrypt the ciphertext to get the query result and verify its integrity by encoding numbers. Shi et al. proposed an optimized version [5, 6] of integrity verification scheme of [3, 4], with the objective to reduce the communication cost of sensor nodes. Since all the works in [3–6] are based on the bucket partitioning scheme, there is an inherited drawback that the bucket partitioning allows compromised master nodes to obtain a reasonable estimation on the real values of both data items and queried ranges [19]. To solve data estimation problem, Chen and Liu proposed a secure and efficient range query processing protocol, SafeQ [7, 8], which is based on Prefix Membership Verification (PMV) [20, 21] and neighborhood chains. The PMV scheme can be used to check a data item x whether it is in a range
For privacy-preserving MAX/MIN query in two-tiered WSNs, only [13] has presented a preliminary solution. In [13], the same PMV scheme as in [7, 8] is used to privately compute the maximum or minimum data item. However, it still has a problem of inefficient energy consumption of sensor nodes. This paper will propose an energy-efficient privacy-preserving MAX/MIN query, the evaluation results of which indicate that it has a better performance than [13] in energy consumption of sensor nodes.
3. Models and Problem Statement
3.1. Network Model
We consider a similar two-tiered sensor networks model as in [3–8]. As shown in Figure 1, the network is partitioned into multiple cells, each containing several sensor nodes and a master node. The two types of nodes are different in resource owning. In particular, the master nodes are powerful devices and have abundant resource in energy, storage, and computation, while the sensor nodes are cheap sensing devices with limited resource. Each sensor node periodically transmits its collected data to its nearby master node in the same cell. The base station is in charge of converting users' questions into queries and then disseminating the queries to the corresponding master nodes, which process the queries based on their stored data items and return the query results to the base station via an upper-tier multihop network formed by the resource-rich master nodes and an on-demand wireless (e.g., satellite) link between some master nodes and the base station.
As in [3–8], we assume that master nodes and sensor nodes know their respective locations and affiliated cells. The time is assumed to be divided into epochs. At the end of each epoch, each sensor node submits all its collected data items to the affiliated master node in its cell.
3.2. MAX/MIN Query Model
A MAX/MIN query is an operation to obtain the maximum or minimum value from an interested area. For simplicity, the following atomic MAX/MIN query will be considered, which is denoted as a four-element tuple:

A complicated MAX query example.
Since each master takes charge of a unique cell, the adversary will not gain more from the collaboration of multiple honest-but-curious masters. The subsequent discussion in this paper focuses on a cell C consisting of a master M and n sensor nodes
3.3. Threat Model and Problem Statement
In two-tiered WSNs, the master nodes are too attractive to be easily under attacks from adversaries, since they not only store all the data items collected by sensor nodes, but also take charge of processing queries received from the base station. We assume that the sensor nodes and the base station are trusted but the master nodes. And we adopt the same honest-but-curious threat model as [13], where master nodes may try to breach privacy to obtain sensitive data items but faithfully obey protocols during query processing.
In this paper, we focus on how to provide data privacy preservation and efficient query processing schemes for MAX/MIN queries, while confronting the honest-but-curious master nodes. In addition, we will use the metric of energy consumption of sensor nodes, which directly affects the lift time of the whole networks, to evaluate the performance of our proposed schemes.
4. 0-1 Encoding-Verification-Based MAX/MIN Query Processing
To preserve privacy, it seems natural to have sensor nodes encrypting their collected data items; however, the key challenge is how the master nodes process MAX/MIN queries over encrypted data without knowing their real values.
The basic idea for preserving privacy MAX/MIN query is as follows. We assume that each senor
4.1. 0-1 Encoding Verification
0-1 encoding verification was first introduced by Lin and Tzeng in [22] for solving the millionaires' problem [23], which is to find the richest from several millionaires without leaking the sensitive personal information of their properties.
Definition 1.
Let
According to Definition 1, we can get the properties as follows:
Theorem 2.
For two numbers x and y, if they are encoded into
The proof of Theorem 2 refers to [22]. In order to verify whether a number x is not greater than the other number y using Theorem 2, we can convert x and y to
To verify whether

0-1 encoding verification.
4.2. Data Submission Protocol
The data submission protocol (DSP) is concerned with how a sensor node transmits its collected data items to the master node M. For each sensor node Compute the maximum of Convert Compute the keyed-hash message authentication code (HMAC) [25] of each data item in Encrypt Submit the following message to M:
The above steps indicate that the aforementioned encoding function ℜ is defined as follows:
Since the HMAC function is with one-wayness and collision resistance properties, and sensor nodes only share the secret key with the base station, given
4.3. Query Processing Protocol
The query processing protocol (QPP) is concerned with how the master node M executes a query and returns response to the base station. When M receives a query
Lemma 3.
Given two data items x and y with corresponding 0-1 encoding c-factors,
We omit its proof here since it can be easily derived from the collision resistance property of
Lemma 3 shows that the aforementioned secret comparing function ℑ is defined as follows, where
Theorem 4.
Given n data items
Proof.
Suppose that the above condition is satisfied, for each
On the basis of Theorem 4, the master node M performs the following steps to implement query processing.
Load
Find the encrypted data
Upon receiving the above message, the base station loads the secret key
4.4. Privacy Protection Analysis
As the privacy protection is the focus in this paper, we propose the privacy analysis about EMQP on the following two aspects.
(1) Privacy of Collected Data. According to the data submission protocol, the submitted information from each sensor node to the affiliated master node is not plaintext but encrypted and HMAC data. Since the HMAC function is with one-wayness and collision resistance properties, and sensor nodes only share the secret key with the base station, given
(2) Privacy of Query Result. The query processing protocol shows that the master node can use the secret comparing function to obtain the query result, which is the maximum or minimum of data items collected by the queried sensor nodes. Because the secret comparing is built upon the HMAC data items and the collected data items are all encrypted for master node storage, it is also computationally infeasible for master nodes to obtain the value of the query result without keys. As a consequence, we have that EMQP is capable of preserving query result from master nodes.
Since [13] also uses similar HMAC and encryption to protect privacy, the capability of privacy preservation is the same between our work and [13].
4.5. Energy Consumption Analysis
In two-tiered WSNs, sensor nodes have limited energy resource while master nodes are abundant in energy. Therefore, the life time of network is mainly determined by the energy consumption of sensor nodes. In this section, we discuss the energy consumption of sensor nodes in our proposed schemes.
We assume that (1) a cell C have of n sensor nodes; (2) each epoch number and node ID are of
The total energy consumption of sensor nodes is composed of two aspects, one is communication cost including sending and receiving messages and the other is computation cost such as encryption and HMAC computing. We use
As shown in DSP, each sensor node will encrypt the maximum or minimum of its collected data in an epoch and generate its 0-encoding and 1-encoding c-factors having w HMAC data items in total. The encrypted data and c-factors will both be transmitted to M. Then, we have
According to (8), (9), we have
In [13], for each senor node
5. Energy Optimization
The above query scheme will consume much energy in sensor nodes because each sensor node needs to submit c-factors which consist of multiple HMAC data, and each HMAC data may have several bits such as 128 bits with HMAC-MD5 [26] or 160 bits with HMAC-SHA1 [27]. In this section, we focus on the c-factor compressing method with the basic idea to compress the HMAC data of c-factors, which can significantly reduce the communication cost in sensor nodes. As a result, the energy consumption of sensor nodes will be decreased and the lifetime will be promoted.
Assume that each HMAC data have τ bits and are randomly distributed in
After applying ℋ, each HMAC data of a c-factor can be converted to a fewer-bits number, which is called cHMAC data and is randomly distributed in
If the HMAC data is replaced with the corresponding cHMAC data in (5), we can get a similar secret comparing function
Lemma 5.
For two data items x and y, one has
If If
Proof.
For any
As shown in Lemma 5 and its proof, there could have false positive in secret comparing while using the compressed c-factors, which is represented as the probability of a false decision in data comparison. And only if
We assume that each collected data item is of w bits, X and Y are the c-factors of data items x and y where
If the 128-bit HMAC-MD5 is used

Impacts of w and μ on Pr.
6. Performance Evaluation
To evaluate the performance of our proposed EMQP and the current work [13], which is denoted as PMQP, we implement both schemes and perform energy consumption comparison on the simulator of [28] with the same data set as [13] which is from Intel Lab [29]. We use PMQP(bot) and PMQP(top) to represent the lower and upper bounds of energy consumption of PMQP, and EMQP(bas) and EMQP(opt) to represent energy consumption of EMQP before and after hash-based optimization. We carry out evaluations on a MAX query in a cell with n sensor nodes and a master node, and we consider the following two aspects: firstly, the total energy consumption
The evaluations are conducted on a PC with a P4 3.0 GHz CPU and 512 MB memory running Ubuntu operating system. The placement of sensors nodes of a cell follows a uniform distribution over a two-dimensional region covering a
Default evaluation parameters.
In each measurement, we randomly distribute the sensor nodes and generate 20 networks with different topologies which are represented by different network IDs. The total energy including communication and computation costs of each measurement is the average of 20 networks.
(1) Energy Consumption versus Network ID. Figure 5(a) shows that

Impact of network ID on energy consumption.
Figure 5(b) indicates that
(2) Energy Consumption versus
n
and
w. Figures 6(a) and 7(a) both show that

Impact of n on energy consumption.

Impact of w on energy consumption.
According to the above evaluations, we can conclude that our proposed EMQP are more energy efficient than the current PMQP. Particularly, the optimized EMQP has a significant saving (about 75%) in energy consumption even compared with the lower bound of PMQP. And communication costs much more energy than computation (more than 99%).
7. Conclusion
As the wireless sensor networks are deployed and used in many important areas, preserving the privacy of sensitive collected data items during query processing is a critical problem in sensor network applications. In this paper, we propose EMQP, a novel and energy-efficient protocol for handling privacy-preserving MAX/MIN queries in two-tiered sensor networks. To implement privacy-preserving MAX/MIN query processing without exposing the real value of collected data to master nodes, the technique of 0-1 encoding verification and encryption are applied. Furthermore, we also give a hash-based optimization for saving more energy of the resource-limited sensor nodes. The result of our evaluations shows that the proposed EMQP has a better performance than the current work in energy consumption. Under our optimized circumstance, in comparison with the lower bound of the current work, the EMQP has a significant improvement in energy saving. Last but not least, communication costs much more than computation in all consumptions.
Footnotes
Acknowledgments
This research is supported by the National Key Basic Research Program (973 Program) of China under the Grant no. 2011CB302903, the National Natural Science Foundation of China under the Grants nos. 61272084, 61202004, 61201163, and 61202353, the Specialized Research Fund for the Doctoral Program of Higher Education under the Grant nos. 20113223110003 and 20093223120001, the Key Project of Natural Science Research of Jiangsu University under the Grant no. 11KJA520002, and the Natural Science Foundation of Jiangsu Province under the Grants nos. BK2011754 and BK2011072, the Introduction of Talent Research Foundation of Nanjing University of Posts and Telecommunications under the Grant no. NY211043, and the Priority Academic Program Development Foundation of Jiangsu Higher Education Institutions (Information and Communication yx002001).
