Abstract
As a convergence of traditional power system engineering and information technology, smart grid, which can provide convenient environment of operation and management for the power provider, has attracted considerable interest recently. However, the flourish of smart grid is still facing many challenges in data security and privacy preservation. In this paper, we propose an efficient privacy-preserving multidimensional aggregation scheme for smart grid, called PAS. Without disclosing the privacy-sensitive information (e.g., identity and power consumption) of users, the operation center can obtain the number of users and power consumption at each step in different dimensions. Based on an improved Paillier cryptosystem, the operation center can acquire more valid information to regulate the generated energy, and an efficient anonymous authentication scheme is employed to protect the privacy of user's identity from the regional center. Detailed security analysis shows the security and privacy-preserving ability of PAS. In addition, performance evaluations via extensive simulations demonstrate that PAS is implemented with great efficiency for smart grid in terms of computation and communication overhead.
1. Introduction
The 22nd world energy congress under the theme of “Securing Tomorrow's Energy Today” was held on October 13, 2013, in Daegu, South Korea [1], which offered a unique opportunity for participants to get a better understanding of energy issues and solutions from a global perspective. Simultaneously, the optimization of energy structure and the improvement of energy utilization were taken into consideration inevitably. Especially in the electric area, the lack of reasonable electrical structure to meet growing demands has stretched the power grid to its limits. Furthermore, in the past 100 years, there has been no dramatic change in the basic structure of electrical power grid. Experience has shown that the hierarchical, centrally controlled power grid cannot meet the growing demands of the 21st century [2]. Therefore, researching for a new power grid, which can support the new power system and enhance the efficiency of power consumption, is currently an emergency for most countries.
Recently the appearance of smart grid has attracted plenty of countries' interest and has been regarded as the next generation of power grid [3–8]. Compared with traditional grid, smart grid supports centralized two-way transmission and efficiency-driven response. Besides, smart grid relies on cyber-physical systems or the Internet of Things to provide intelligent scheduling for power transmission and distribution [9]. For instance, smart meter equipped with network interfaces (e.g., wireless sensors) reports power consumption to the operation center via the advanced meter controlled by the regional center, as shown in Figure 1. The operation center spans different geographic regions and can transmit the result of data analysis to the power supplier in a distributed way, whose responsibility is to adjust power supply dynamically to meet demands and detect and respond to the weaknesses or failures of power system in real time.

The conceptual architecture of smart grid.
Smart meter (SM) which is two-way communication device and deployed at users premise is indispensable component in the smart grid, since it can record real-time power consumption periodically. With SM, smart grid is capable of collecting real-time information about grid operations and status at the operation center. SM will send the encrypted power consumption to the regional center after being authorized. Through the regional center, smart grid can transmit the user's detailed electricity information to the operation center, which may then change electricity price accordingly or even adjust user's usage. However, if there is an adversary in the regional area, he/she may infer user's physical activities based on the power consumption of a family. For instance, unusual low daily power consumption of a household indicates that the home owners are probably away from home. Therefore, such privacy-sensitive information must be protected from unauthorized access [10, 11].
However, the majority of existing schemes [12–15] can offer total power consumption to the power supplier; this user-centric information is far from enough for the service provider (e.g., the electric company) to allocate power resource. For example, many service providers adopt the multistep electricity pricing policy [16] where the price of electricity is based on the user's power consumption. In order to utilize and dispatch different kinds of step electricity for service providers, more information related to users (e.g., the number of users and power consumption at each step, etc.) is required. With regard to the power supplier, not only can it classify users depending on the power consumption, but also it works out a practical and reasonable business strategy.
In this paper, from the perspective of power supplier, under the promise of protecting the confidentiality of user's identity and power consumption, we propose an efficient privacy-preserving multidimensional aggregation scheme for smart grid, called PAS. The characteristics of proposed PAS are offering more valid user information to the power supplier and achieving user's identity privacy and data confidentiality. The main contributions of this paper are threefold.
First, PAS provides more valid user information without disclosing user's data and identity privacy. Upon the employment of an improved Paillier cryptosystem [17] technique, the operation center can acquire both the number of users and power consumption at each step by decrypting the preliminary aggregations from the regional center. Simultaneously, compared with the existing aggregation schemes [18–20], PAS provides more effective information to the power supplier to make more planed electricity strategy. Second, PAS can guarantee fine-grained privacy preserving. Although the aggregation scheme [12] can guarantee the privacy of user's information, the authenticator still knows the identity which corresponds to the signature. However, in our scheme, the improved group signature [21] scheme makes user's identity unrecognized by the regional center. Meanwhile, data privacy preservation requirements can be achieved through the improved Paillier cryptosystem. Third, PAS achieves the traceable anonymity of smart grid. We employ a specific group signature scheme between the smart grid and the regional center, which can obtain the authentication and traceability of message. The regional center checks out the message and signature without knowing the smart grid's identity. But if the smart grid's signature is proved to be invalid, the identity of malicious smart grid could be ferreted out with the help of TA.
The remainder of this paper is organized as follows. In Section 2, we formalize the system model and security requirements and identify our design goal. Then, we recall the discrete logarithm and Paillier cryptosystem as the preliminaries in Section 3. Then, we present our PAS scheme in Section 4, followed by the security analysis and performance evaluation in Sections 5 and 6, respectively. We also discuss some related works in Section 7. Finally, we draw our conclusions and identify some future work in Section 8.
2. System Model, Security Requirements, and Design Goal
In this section, we formalize the system model and security requirements and identify our design goal.
2.1. System Model
In our system model, we mainly focus on how to report user's privacy-preserving power consumption information to the operation center. Specifically, the system consists of four parts: trusted authority (TA), residential area (RA), operation center (OC), and regional center (RC), as shown in Figure 2.

System model under consideration.
TA: the trusted authority is fully trustable in the whole system, whose duty is to bootstrap the system initialization and distribute secret keys to the SM, RC, and OC (some parameters must be kept private, while others are public). In addition to system initialization, TA is offline in other phases. Only in the case that errors exist in the process of authentication, TA can respond to it in time; for example, when errors arise in the authentication, TA can recover the malicious identity of SM based on the group signature scheme. Therefore, TA is an indispensable part in this system.
RA: we consider a typical residential area, which mainly consists of a connection with RC and a great number of users that are equipped with smart meters
RC: RC is a powerful workshop, which mainly performs three functions: authentication, aggregation, and relaying. Some authentication operations must be performed to guarantee the privacy of user's identity before SM transmits user's encrypted data to RC. The aggregation component is responsible for aggregating user's multidimensional power consumption into a compressed one. The relaying component transmits aggregation data to OC.
OC: the responsibility of OC is to analyze the aggregated regional data. It can supply power supplier with valid user information which includes the number of users and power consumption at each step, so as to carry out the dynamic step tariff and locate the trouble spot.
2.2. Security Requirements
In our security model, we consider RA (e.g., a typical residential area) mainly consists of a connection with RC (e.g., the property management) and a great number of users that are equipped with SM. OC (e.g., the electric company) spans different RC and analyzes the aggregation regional data. Specifically, we assume that RC and OC are not in collusion with each other and both of them are honest but curious. That is, RC and OC aggregate the encrypted data from SM exactly, but it is curious to SM's real-time data and identity. In addition, the confidentiality and the integrity of data need be guaranteed. Therefore, in order to prevent the user's data in RC and OC, PAS must satisfy the following three security requirements.
Confidentiality. The scheme should protect user's data from being analyzed by RC, and guarantee the user's identity privacy. Though RC has all the data from Authentication. The scheme should guarantee that an encrypted report at RC is really sent by a legitimate residential user and has not been transformed during the transmission to RC; that is, if the adversary forges or modifies a report, the malicious operations should be detected at RC. Thus, only the correct user information can be received by RC. Anonymity. The identity of user should be unknown in the scheme. On one hand, when RC receives the messages from
2.3. Design Goal
According to the aforementioned system model and security requirements, our design goal is to develop an efficient privacy-preserving multidimensional aggregation scheme for smart grid. Specifically, the following three objectives should be achieved.
The scheme should provide more effective information to the power supplier and satisfy a classified management. The power supplier should obtain the total power usage information that is the number of users and power consumption at each step. With these valid data, not only can it develop a reasonable step tariff strategy, but also the power consumption in one RA can realize load balancing. Through the implementation of step tariff strategy, we can get energy conservation to some extent. Therefore, compared with electric energy production, system engineers can discover the trouble spot in the process of power transport. The confidentiality of user's identity and power consumption should be guaranteed. If the smart grid does not take the security of user information into account, the residential user's privacy information would be revealed, and the real-time power consumption data could be modified. Therefore, the proposed scheme should achieve identity and data privacy preservation requirements simultaneously. The proposed PAS scheme should be effective and practical to the smart grid. Although the performance of smart grid is continuously improved today, it still cannot afford the high computational overhead. The proposed scheme should also consider the effectiveness in terms of computation and communication to be acceptable to smart grid.
3. Preliminaries
In this section, we review the discrete logarithm problem and the Paillier cryptosystem [17], which support the security of signature and can serve as the basis of proposed PAS scheme.
3.1. Discrete Logarithm Problem
The difficulty of computing discrete logarithms [22] is the basic problem underlying the results of this thesis.
Definition 1.
Let G be a finite cyclic group and let
Usually, the discrete logarithm is also called index of an element a. If
Definition 2.
The discrete logarithm problem (DLP) is stated as follows: given a finite cyclic group G, a generator
Definition 3 (computational Diffie-Hellman (CDH)).
Let p and q be primes such that
3.2. Paillier Cryptosystem
The Paillier cryptosystem is widely used in privacy-preserving applications [23, 24], which can achieve the privacy-preserving properties. Specifically, the Paillier cryptosystem is comprised of three parts: key generation, encryption, and decryption.
Key generation ( Encryption: given a user's data Decryption: given the ciphertext
4. Proposed Schemes
In this section, we propose the efficient privacy-preserving multidimensional aggregation scheme for smart grid, called PAS, which mainly consists of the following four parts: system initialization, user data acquisition, privacy-preserving regional aggregation, and data analysis. For easier expression, we give the description of notations used in PAS in the following notations.
Notations Used in PAS
g: k: a security parameter k for ω: the maximum number of households in a residential area, l: the number of step electricity,
4.1. System Initialization
For a single authority smart grid system under consideration, a trusted TA bootstraps the whole system which is responsible for distributing secret keys to SM, RC, and OC. In particular, TA first chooses a security parameter k to obtain the Paillier cryptosystem's key
Assuming that the maximum number of SM in RC is less than ω, the power consumption at each step is
When a local
Finally, TA sends
4.2. User Data Acquisition
In order to achieve the nearly real-time user's power consumption at every 15 minutes, each user in the regional area equips a
Step 1.
Step 2.
Step 3.
Step 4.
4.3. Privacy-Preserving Regional Aggregation
While receiving
If the validity is checked, RC performs the following steps to achieve the privacy-preserving data aggregation.
Step 1.
RC computes the aggregated and encrypted data C as
Step 2.
RC chooses a random number
Step 3.
RC sends the preliminary aggregation data and the signature
If the above verification fails, RC will verify the signature with the help of TA and announce the malicious
4.4. Data Analysis
After receiving the message
Step 1.
Let M and R be
Step 2.
By executing Algorithm 1, OC can recover the aggregated data
set return
Step 3.
After OC achieving power information at each step, OC invokes Algorithm 2 to obtain the real power consumption
set return
Since
Finally, OC can obtain the valid information, which contains the true power consumption
5. Security Analysis
In this section, we analyze the security properties of PAS scheme. According to the security requirements discussed earlier, our analysis will focus on how PAS can achieve the privacy preservation of user's data and identity.
(i) The Confidentiality of User's Data and the Aggregated Data Are Obtained. In the proposed PAS scheme, each
(ii) The Authentication of User's Data and the Aggregated Data Are Achieved. In the proposed PAS scheme, each user's data and the aggregated data are signed by an improved group signature algorithm.
Definition 4 (the improved efficient group signature scheme).
The improved efficient group signature scheme is defined by the following algorithms.
Setup. Providing some security parameter k as input, the key generation center (KGC) runs this algorithm to create a master key
SetSecretValue. Taking params as input, this algorithm picks
SetPublicKey. Providing params and
PartialPrivateKeyExtract. Providing params and
Sign. Providing params,
Verify. Providing params,
Definition 5 (security of the signature scheme).
A signature scheme is said to achieve unforgeability against a chosen message attack (UF-CMA secure) if no polynomials bounded adversary
Setup. The challenger runs setup by taking a security parameter k as input to generate a master key
SecretValue Request. The challenger runs SetSecretValue to generate the secret value
PartialPrivateKey Request. The challenger runs PartialPrivateKeyExtract to generate the private key
PublicKeyReplacement. The adversary
Verify Queries. The challenger runs Verify with
Sign Queries. The challenger runs Sign with
Challenge Phase. Once
Guess.
We define
Then, we analyze the security of the improved group signature.
Theorem 6.
One's improved efficient signature scheme is EUF-CMA secure in the random oracle model, assuming that the CDH problem is intractable.
The above theorem is obtained by combining Lemma 7.
Lemma 7.
Suppose
Proof.
To prove the lemma, we first assume that the RSA signature scheme is UF-CMA secure with advantage ν
Let
H
1
Queries.
If there is a tuple
Otherwise,
H
2
Queries.
SecretValue Request.
If
Otherwise (
PartialPrivateKey Request.
If
Otherwise (
Public Key Replacement.
Verify Queries. Suppose the request is to verify signature If the public key has not been replaced and Otherwise, search the
Signature Queries. picks sets defines computes
Guess. Eventually,
Analysis. We first evaluate the simulations of the random oracles. From the construction of
By definition of ε, we have
The running time of the CDH bounded by
(iii) The Anonymity of Users and Traceability Can Be Guaranteed. In the proposed PAS scheme, each user's data is signed by an improved group signature.
Theorem 8.
Our improved efficient signature scheme satisfies the anonymity and traceability.
Proof.
Assume that the
The group manager (KGC) is able to open any valid group signature and provably identify the actual signer: assuming that the signature is valid, this implies that
According to the anonymity, it is impossible for the regional center to recognize user's identity. And if there are any errors in the process of verification, TA can recover the malicious SM's identity. Therefore, the identify privacy of user can be guaranteed.
From the above analysis, we can see that the proposed PAS scheme is secure and privacy preserving and achieves our security design goal.
6. Performance Evaluation
In this section, we evaluate the performance of proposed PAS scheme in terms of the computation complexity and overhead of the SM, RC, and OC using a simulator built in Java. We run it on a workstation with 2.0 GHz 6-core processor, 64 GB RAM. Concretely, with the parameter setting
6.1. Computation Complexity
We evaluate the proposed PAS scheme in the computation complexity of SM, RC, and OC. Above all, we denote the computation overhead of an exponentiation operation in
In the proposed PAS scheme,
Comparing with the EPPA [12], the proposed PAS scheme enables acquiring more effective information and guarantees the anonymity of user's identify simultaneously and has similar computation overhead as shown in Table 1, while
Comparisons between PAS and EPPA scheme.
6.2. Simulation Results
According to the algorithms in our scheme, there are two factors that may affect the computation overhead. One is the number of users whose power consumption exceeds the basic step in RC which is denoted by r and the other is the number of step electricity l. Therefore, r and l are chosen to illustrate the computation overhead of SM, RC, and OC, and we present the comparison between PAS and EPPA. Note that EPPA cannot disclose users' identify and cannot provide the detailed information (e.g., users' numbers and power consumptions at each step in different dimensions).
For the comparison with EPPA, we plot the computation overhead of SM, RC, and OC in basic step with 10 different r from 0 to 50 in Figures 3(a), 3(b), and 3(c). They have shown that both PAS's and EPPA's processing times increase negligibly with the increase of r. The average processing times of SM and RC in PAS are 16 ms and 460 ms, respectively, which are a little less than those in EPPA. Then, we consider the processing time of OC. The time used to recover M of OC in PAS approaches to 17 ms which is higher than in EPPA. From the figure, we can see that, by increasing r, the average processing times of SM, RC, and OC are barely growing, which means r has a little influence on the computation overhead of SM, RC, and OC.

The computation overhead of SM, RC, and OC.
As shown in Figures 3(d), 3(e), and 3(f), we plot the PAS and EPPA's computation overhead of SM, RC, and OC with the number of step consumption l selected from 1 to 10. We can see that, with the increase of l, the average processing times of SM also increase in PAS and EPPA. The reason is that SM's power consumption may be in a higher step with l becoming larger and the SM will take more calculation on average. Afterwards, the average processing times of RC and OC are given, and by increasing l, the average operation times of RC and OC are barely growing. Thus, the number of step consumption l has a great influence on SM, but little on RC and OC. It is shown that the average processing times of SM and RC in PAS are a little less than that in EPPA, but the average processing time of OC in PAS is higher than that of EPPA.
From the evaluation results above, the PAS scheme has similar computation overhead with the EPPA scheme. Furthermore, the processing time of users is less than 20 ms which is acceptable to SM. And for RC and OC, they only need 400 ms and 18 ms, respectively, which is very efficient. The results of experiment show that our proposed PAS scheme is not only privacy preserving but also efficient and has stable computation performance.
7. Related Works
The study of privacy-preserving multidimensional aggregation for smart grid has gained great interest from the research community recently, and we briefly review some of them closely related to ours.
Part of the existing schemes needs to decrypt the received data before aggregating them and then encrypt the aggregated result before forwarding it. This process is fairly expensive and risky when aggregator is not trusted. Castelluccia et al. [25] designed a symmetric-key aggregation scheme that blends inexpensive encryption techniques with simple aggregation methods to achieve very efficient aggregation of encrypted data. Although the aggregator is under attack, the adversary cannot acquire any privacy-sensitive information about residents.
Then, differential privacy techniques are introduced to achieve multidimensional aggregation in smart grid without decrypting data, and the middle aggregator can be untrusted [26, 27]. Shi et al. [26] proposed a construction that could allow a group of participants to periodically upload encrypted values to a data aggregator. First, it utilizes applied cryptographic techniques to allow the aggregator to decrypt the sum from multiple ciphertexts encrypted under different users' keys. Second, it describes a distributed data randomization procedure that guarantees the differential privacy of outcome statistic, even when a subset of participants might be compromised. Chan et al. [27] offered a protocol with an untrusted aggregator as well. Specifically, the protocol builds a binary interval tree over n users and allows the aggregator to estimate the sum of contiguous intervals of users represented by nodes in the interval tree. In order to achieve multidimensional privacy-preserving data aggregation, Kursawe et al. [14] designed a protocol that can be employed to compute aggregate meter measurements over defined sets of meters, allowing for fraud and leakage detection as well as further statistical processing of meter measurements, without revealing any additional information about the individual meter readings. Lin et al. [28] proposed a scheme to achieve privacy-preserving aggregation that can integrate the superincreasing sequence and perturbation techniques into compressed data aggregation and has the ability to combine more than one aggregated piece of data into one. Jia et al. [29] proposed a novel privacy-preserving data aggregation scheme which could support efficient data aggregation for time-series metering data without leaking the individual value. It could also thwart HAD attack by introducing some randomness to the aggregation result without affecting the aggregation utility. Lu et al. [12] thought the scheme in [28] is not practical to deploy and manage thousands of keys for users and RC and proposed the EPPA based on public cryptograph system.
However, almost all the existing aggregation schemes focus on the privacy of users' information, cannot afford more valid information to the service provider (e.g., the power supplier) for allocating resources, and neglect the anonymity of user's identify. In this paper, we propose an efficient privacy-preserving multidimensional aggregation scheme, which can achieve more information from the aggregation data without disclosing user's identity. Specifically, the employment of superincreasing sequence techniques and the importation of a big prime in the Paillier cryptosystem ensure that the scheme provides more effective information, and a group signature algorithm is adopted to protect the user's identify from RC.
8. Conclusions
In this paper, we propose an efficient privacy-preserving multidimensional aggregation scheme for smart grid, named PAS. Without disclosing the privacy of user, we can acquire more valid information about the number of users and power consumption at each step after the detailed analysis of preliminary aggregated data by the operation center. Specifically, we employ an improved Paillier cryptosystem technique to obtain more information at the operation center and an efficient anonymous authentication scheme to protect the privacy of user's identity from the regional center. Detailed security analysis shows the security strength and privacy-preserving ability of PAS. Moreover, with reasonable computation overhead, PAS can satisfy the high-frequency multidimensional data collection requirements in smart grid. For the future work, we will focus on reducing the computation overhead of smart meter and achieving a more efficient security scheme for smart grid.
Footnotes
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This research was supported by National Natural Science Foundation of China (nos. 61303218 and 61272457), Huawei Innovation Research Program, and the 111 Project (B08038).
