Abstract
With the distributed sensors that are deployed to monitor the urban environment in smart cities, the sensed data are overwhelming and beyond the scope of the wireless sensor networks (WSNs) capability. Due to the communication range limits of the sensors, most of these data will be outsourced and stored on some untrusted servers. Thus, how to maintain the data confidentiality and integrity, as well as source authentication and data query privacy of the outsourced data, is a challenging problem. In this paper, we propose a secure searchable encryption scheme, named SSE, for urban sensing and querying to address the problem. Specifically, our SSE constructs a secure hidden vector encryption-(HVE-) based rang query predicate. The sensed data can be stored on an untrusted server in encrypted form. A requester can obtain the correct ciphertexts when his authorized range query matches the HVE-based encryption predicate. With the help of the base station, the ciphertexts can be decrypted and data integrity can be verified; then, the requester can obtain the correct original data. Security analysis demonstrates that; in the SSE, only the authorized requesters can obtain the query results, while the data confidentiality and integrity and source authentication are also preserved.
1. Introduction
Nowadays, with the continuous expansion of urban populations, there are increasing needs in many aspects related to urban living, such as environment monitoring, and public safety supervision [1]. Especially, a city can be defined as “smart” when modern information and communication infrastructures (ICI) communication can automatically control the urban living in a secure means. Wireless sensor networks (WSNs) increasingly become viable solutions to urban sensing and querying for smart cities. Widely distributed sensors monitor the urban environment in real time and collect data for intelligent decision making by multi-hop wireless transmission, as shown in Figure 1, which can facilitate various services and improve the quality of urban living [2]. However, deploying sensors without security in mind has often proved to be dangerous in hostile environments [3]. Especially, when there is terrorism, the detection of suspicious activities should be reported to the proper authorities in a secure approach.

The conceptional model of wireless sensor network.
In the worst case scenario, a single corrupted message can lead to the ICI systems in a total tizzy. For example, if a fake traffic accident information is broadcasted in rush hours, the ICI systems maybe make some misleading decisions to the drivers; thus, traffic jams will be caused in that city. This kind of fake data injection attacks not only prevent the authorities from collecting correct messages but also exhaust the energy of the forwarders [4]. Consequently, it is essential to prevent the malicious node from impersonating good nodes for spreading misleading information intentionally. Hence, sensitive monitoring data should be transmitted in encrypted form to achieve data confidentiality, integrity, and authentication between communicating parties. Furthermore, with the distributed sensors, the data collections are going to be several millions a day. In order to save energy and relieve the burden of data storage and maintenance [5], data outsourcing is the best way to let sensors store their sensed data on some untrusted servers (cloud or third party provided servers) in encrypted form and execute computation and queries using server's computational capabilities. In addition, universal data access with independent geographical locations, avoidance of capital expenditure on hardware, software, and personnel maintenances will make the smart city's ICI system more reliable and efficient.
Naturally, storing data in encrypted form on untrusted server can maintain data confidentiality for the stored data, as long as a secret key corresponding to a certain encryption is not given out to untrusted servers [6]. However, if these data are needed later for matters such as: calculation, analysis, or for intelligent decision-making, searchers (e.g., decision maker) are endowed with the task of querying these data, which can help them to gain organization-wide situational awareness. In this case, querying encrypted data is a major logistic problem. The server is anticipated to be able to evaluate various query predicates against the data without having to decrypt it [7]. Especially, when the query conditions are composed of many conjunctive range requirements (e.g., date ranges and geographic regions, etc.), rang query over encrypted data is much more difficult.
In this paper, we propose a secure searchable encryption scheme (SSE) for urban sensing and querying. The SSE addresses the data confidentiality, integrity, and source authentication by using a secure hidden vector encryption-(HVE-) based range query predicate. The main contributions of this paper are twofold.
Firstly, we propose a secure HVE-base range query predicate. Specifically, a sensor node encrypts its data by using the shared key with the Secondly, we analyze the security strengths of the SSE. The analysis results demonstrate that, compared with existing schemes, the SSE is the only one which can achieve data confidentiality and integrity, as well as source authentication and data query privacy.
The remainder of this paper is organized as follows. In Section 2, we discuss the related works. In Section 3, we introduce our system model and security requirements. Then, in Section 4, we review some preliminaries. In Section 5, we construct a secure HVE-based range query predicate. In Section 6, we present our SSE scheme, followed by its security analysis in Section 7. Finally, we conclude this paper in Section 8.
2. Related Works
2.1. Security in Sensor Network
Security in sensor network has been widely studied. Many security proposals for WSNs have focused on efficient key management in WSNs. Some are based on the symmetric cryptosystems. For instance, Perrig et al. [8] propose SPINS, a suite of efficient symmetric key-based security building blocks. Eschenauer and Gligor [9] look at random key predistribution schemes, which open the way to a large number of follow-up works [10]. Zhu et al. [11] proposed LEAP, a rather efficient scheme based on local distribution of secret keys among neighboring nodes. These strategies, however, do not provide perfect resilience, as after a certain percentage of nodes have been compromised, the whole network can be compromised as well. Others have employed public key cryptography (PKC) to adjust conventional algorithms (e.g., RSA [12]) to sensor nodes or to employ more efficient techniques (e.g., NTRU [3], ECC [13]) in this resource constrained environment. However, they all use interactive protocols and therefore nodes are required to exchange messages to agree on keys. Moreover, they are not suitable for data outsourcing and data query. He and Li schedule the sequence of query processing to achieve the optimized overall energy efficiency by fully utilizing the implications among sensor networks [14]. It also cannot be applied to query over encrypted data.
2.2. Range Query over Encrypted Data
The problem of range query over encrypted data is a hot issue in both cryptography and database communities [6, 15–17]. There are essentially four categories of solutions that have been developed for range queries. One general approach is to ensure that order amongst plaintext data is preserved in the ciphertext domain by using order-preserving encryption-based (OPE) techniques [15]. This allows direct translation of range predicates from the original domain to the domain of the ciphertext. As discussed in [18], the coupling distribution of plaintext and ciphertext domains might be exploited by attackers to guess the scope of the corresponding plaintext for a ciphertext. Another bucketization-based (Buck) technique [6] is to use distributional properties of the dataset to partition and index them for efficient querying while trying to keep the information disclosure to a minimum. Queries are evaluated in an approximate manner where the returned set of records may contain some false positives. These records will expose unrelated data's confidentiality and increase the computational and communication overhead of the system.
Thirdly, those that use some specialized data structure for range query evaluation while trying to preserve notions of semantic security of the encrypted data [17], such as B+tree [19]. In order to hide the access patterns, they run fake queries. In addition, they buffer the first few levels of the tree on the client side. Finally, they shuffle from time to time. The last approach is using a specialized type of predicate encryption, the HVE-based approach [20], where two vectors over attributes are associated with a ciphertext and a token, respectively. Under predicate translator, the ciphertext matches the token if and only if the two vectors are component-wise equal. Although most of the existing range query approaches can maintain part of the data confidentiality, all of them are not secure enough to keep the untrusted server from peering the original data when the requester's query is successful. Therefore, the goal of this paper is to propose a secure range query predicate to support data confidentiality and integrity, as well as source authentication and data query privacy.
3. System Model and Security Requirements
In this section, we formalize the system model and identify the security requirements and our design goals.
3.1. System Model
In this paper, we consider a large-scale WSNs consisting of a database server (DB), a base station (

System model of SSE.
All the SNs encrypt their data by using the shared keys with the
3.2. Security Requirements
We identify the security requirements for our SSE. In our security model, the Data Confidentiality. Sensed data should not be disclosed by unauthorized parties and it is the most important issue in mission critical applications. Moreover, in many applications, SNs transmit highly sensitive data, for example, terrorism supervision data, and therefore it is extremely important to build secure channels between SNs and the authorities. The standard approach for keeping sensitive data secret is to encrypt the data with a secret key that only intended authorized receivers possess, hence achieving confidentiality. Data Integrity. A malicious node may just corrupt messages to prevent network from functioning properly. Thus, sensitive data needs to be protected from being altered. Message authentication code (MAC) is a widely used approach to guarantee that the message being transferred is never corrupted. Therefore, data integrity can be achieved. Source Authentication. Since wireless sensor networks use a shared wireless medium, without source authentication, an adversary could masquerade a node, thus gaining unauthorized access to resource and sensitive information and interfering with the operation of other nodes. Moreover, a compromised node may send data to its CH under several fake identities so that the decision-making is misled. Data Query Privacy. Since SN's sensed data are stored on the untrusted DB in an encrypted form, generally, the server cannot know the original data. In this case, when a requester's query vectors in the query tokens match the encryption vectors, the original data can be exposed to the DB. This is also the security vulnerability of the data privacy. Therefore, double encryption is needed to ensure data privacy when range query succeeds.
4. Preliminaries
In this section, we will briefly describe the basic definition and properties of bilinear pairings, polynomial-based key predistribution, and HVE-based comparison query predicate.
4.1. Bilinear Pairing
Bilinear pairing is an important cryptographic primitive [22]. Let bilinear: nondegenerate: computable: there is an efficient algorithm to compute
Definition 1.
A bilinear parameter generator
4.2. Polynomial-Based Key Predistribution
In this section, we briefly review the polynomial-based key predistribution approach [23]. To predistribute shared keys, the TA randomly generates a bivariate t-degree polynomial
4.3. HVE-Based Comparison Query Predicate
HVE [16] is a type of predicate encryption where two vectors over attributes are associated with a ciphertext and a token, respectively. The ciphertext matches the token if and only if the two vectors are component-wise equal. There are two character sets
In key generation phase, a TA distributes the public/private key pair
Then, in the data encryption phase, a SN chooses a vector
Next, in the token generation phase, the requester chooses a vector
In the data query phase, let
5. Construction of Secure HVE-Based Range Query Encryption Predicate
In this section, we modified the HVE scheme in [7] to support data query privacy. To protect the original data from exposing to the untrusted server even when the range query succeeds, double encryption is needed. Firstly, sensed data m can be encrypted into a ciphertext C by using its shared secret key with the
Three entities: sender, server, and requester are involved in this section. It mainly consists of the following five phases: key generation phase, data encryption phase, token generation phase, data query phase, and data decryption phase.
In key generation phase, a TA also distributes the public/private key pair In the data encryption phase, the SN should define two encryption vectors: In the token generation phase, the requester's range query is defined with two vectors: In the data query phase, when the query tokens In the data decryption phase, with the help of the
The larger condition vectors.
The less condition vectors.
6. Proposed SSE Scheme
In this section, we describe the details of our SSE scheme. There are also five phases: key generation phase, data encryption phase, token generation phase, data query phase, and data decryption and verification phase.
6.1. Key Generation Phase
For our scheme, we assume that there is a TA (actually is the
6.2. Data Encryption Phase
If
In order to support range query “ The CH also adds the cluster number
6.3. Token Generation Phase
In this phase, firstly, the
The Select a random Compute the token
6.4. Data Query Phase
In this phase, S deposits its range query tokens to the DB to query their required data. After finish receiving the query tokens from S, the DB searches its database to find if there is a data ciphertext which matches S's query. For each data ciphertext, the DB checks that
Thus, the DB can recover
6.5. Data Decryption and Verification
Note that, there are more than one query results which match and the range query tokens. When S received the query results from DB, it will send them to the
Then, the
7. Security Analysis
In this section, we analyze the security properties of the proposed SSE scheme. In particular, following the security requirements discussed earlier, our analysis will focus on how the proposed SSE scheme can achieve the data confidentiality, data integrity, source authentication, and data query privacy.
The SN's data confidentiality is preserved in the proposed SSE scheme. In the SSE, each SN's data The SN's data integrity is preserved in the proposed SSE scheme. In the SSE, each SN's data packet is followed by an MAC. The MAC is generated by using the shared secret key The source authentication is preserved in the proposed SSE scheme. In the SSE, except the data authentication message, The data query privacy is achieved in the proposed SSE scheme. In the SSE, when the query vectors in the query tokens match the HVE-base encryption vectors, the original data will not expose to the DB. The reason is that the secret keys needed for data decryption phase is kept secret. As described in the key generation phase, the polynomial share
Compared with the original OPE-based scheme [15], Buck-based scheme [6], and the original HVE-based scheme [20], as shown in Table 3, our SSE is the only one which can achieve all of the data confidentiality and intergrity, source authentication, and data query privacy. Furthermore, our SSE has no false positive results when querying the DB, which makes it more secure and efficient than the Buck-based scheme [6].
Comparison of security properties.
8. Conclusion
In this paper, we propose a secure searchable encryption scheme, named SSE, for urban sensing and querying to address the data security, source authentication, and range query problems of the sensed data in smart cities. The SSE allows the sensed data to be stored on a untrusted server in encrypted form. An authorized requester can obtain the correct ciphertexts when his authorized range query matches the HVE-based encryption predicate. With the help of the base station, the data integrity and source authentication can be verified. Security analysis demonstrates that the SSE can achieve data confidentiality and integrity, source authentication, and data query privacy. For our future work, we intend to enhance our SSE to support fine-grained search with logical operations.
Footnotes
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grant nos. 61073189, 61272437, and 61202369, Innovation Program of Shanghai Municipal Education Commission nos. 13ZZ131 and 14YZ129, the Foundation Key Project of Shanghai Science and Technology Committee no. 12JC1404500, and the Project of Shanghai Science and Technology Committee no. 12510500700.
