An assured deletion scheme for encrypted data in Internet of Things

Abstract

With the development of the Internet of Things, heterogeneous data from all kinds of sensors are processed and stored by cloud service providers. The cloud can be regarded as one of the important layers of the Internet of Things architecture, and the Internet of Things is the biggest customer of the cloud. Encryption is one of the main mechanisms for keeping Internet of Things data confidential. However, encryption introduces new challenges for assured deletion in the cloud, which is crucial for big data storage and processing in the cloud for the Internet of Things. Most existing assured deletion solutions for cloud data delete only the key and leave the ciphertext intact. In addition, the user's decryption time in these solutions grows with the size of the data. In this article, we propose a novel assured deletion scheme over encrypted cloud data that combines uploading by sampling slice with outsourcing the decryption of ciphertext policy attribute–based encryption. The analysis and simulation results show that our scheme is secure and efficient, especially in reducing the user's decryption time. We also design a ciphertext deduplication scheme based on this scheme and explain its flow.

Keywords

Internet of Things, cloud computing layer, data privacy, assured deletion, ciphertext deduplication

Introduction

Internet of Things (IoT) is a complex system with multiple heterogeneous networks.¹ Heterogeneous Internet of Things (HetIoT) is an emerging research field that has strong potential to transform both our understanding of fundamental computer science principles and our future living.² Qiu et al.² proposed a four-layer future HetIoT architecture as shown in Figure 1, which includes applications layer, cloud computing layer, networking layer, and sensing layer. The sensing data collected from various sensors are stored at cloud servers through efficient heterogeneous networking units. In this article, we focus on the cloud computing layer. Cloud computing layer in future HetIoT will receive and process data from other layers.³

Figure 1.

Future HetIoT architecture.

Several previous studies have addressed the integration of IoT and cloud computing.⁴ Suciu et al.⁵ proposed a new platform that uses cloud computing capacities to provide and support ubiquitous connectivity and real-time applications and services for smart cities. Zhou et al.⁶ proposed the CloudThings architecture, a cloud-based IoT platform that accommodates CloudThings IaaS, PaaS, and SaaS to accelerate IoT application development and management.

The huge amount of sensor data has become an important resource; at the same time, the data and the cloud face many kinds of security risks, including the lack of assured deletion of cloud data. IoT data can remain sensitive for a long time, even after it has expired, and an attacker can abuse it to harm a node or the IoT as a whole. Most data stored in the cloud is encrypted because cloud computing separates the ownership of data from its management, which may lead to security issues such as user data leakage, illegal migration across clouds, and unauthorized access at the cloud service provider (CSP). Furthermore, if an owner's data are stored at the CSP for a long time without an effective assured deletion mechanism, this not only wastes a large amount of CSP storage space but also leads to serious problems such as user data abuse and privacy leakage. Iqbal et al.⁷ present a taxonomy of cloud security attacks and potential mitigation strategies with the aim of providing an in-depth understanding of security requirements in the cloud environment. Therefore, it is necessary to study assured deletion technology for big data in cloud storage.⁸ Assured deletion ensures that expired or backup data are reliably deleted and remain permanently unrecoverable and inaccessible to any party. The explosive growth of data volume has introduced further challenges for cloud storage systems. Research shows that 60% of the data stored in cloud storage systems is redundant, and this share is increasing over time. To improve the efficiency of cloud storage services and reduce the waste of resources, it is especially important to study ciphertext deduplication protocols for cloud storage.

Xiong et al.⁹ divided the existing research into three categories: cloud data assured deletion based on trusted execution environments, cloud data assured deletion based on key managements, and cloud data assured deletion based on access control policies:

Cloud data assured deletion based on trusted execution environments.^10,11

The basic idea of this method is to build a trusted execution environment for secure deletion from both hardware and software. However, it requires trusted enhancements to the existing cloud computing infrastructure, so it is difficult to deploy.

Cloud data assured deletion based on key managements.

This method includes cloud data assured deletion based on centralized key management,^12,13 on distributed key management,^14–16 and on hierarchical key management.^17,18 The basic idea is to delete the key so that the user can no longer decrypt. However, these schemes delete only the key and leave the ciphertext intact. If the key has been compromised, the privacy of the sensitive data is seriously threatened. Therefore, they do not achieve assured deletion in the real sense.

Cloud data assured deletion based on access control policies.

Attribute-based encryption (ABE)¹⁹ is a prevailing technology for secure data access in the cloud environment; it includes key policy attribute–based encryption (KP-ABE) and ciphertext policy attribute–based encryption (CP-ABE). Xiong et al.²⁰ proposed the KP-ABE with time-specified attributes (KP-TSABE) scheme, in which every ciphertext is labeled with a time interval while the private key is associated with a time instant. The ciphertext can be decrypted only if the time instant lies in the allowed time interval and the attributes associated with the ciphertext satisfy the key's access structure. Zhang et al.²¹ proposed a scheme based on ciphertext sample slicing named Assured Deletion based on Ciphertext Sample Slice (ADCSS). By keeping the outsourced data incomplete through ciphertext sample slicing, the scheme keeps them confidential even if the key is obtained by accident or through a malicious attack.

Most existing cloud data assured deletion schemes rely on destroying the key. Zhang et al.²¹ were the first to delete part of the ciphertext; the idea is similar to the Secure Self-Destructing scheme for electronic Data (SSDD),²² the IBE based Secure Self-destruction scheme (ISS),²³ and the FULL lifecycle Privacy Protection scheme for sensitive data (FULLPP).²⁴ However, encryption and decryption in this method are based on bilinear pairings, which require a large amount of computation and time, especially when the data are very large. To solve this problem, we combine the idea of uploading by sampling slice with outsourcing the decryption of CP-ABE.^25,26 The CP-ABE encryption mechanism enables secure sharing of cloud data among multiple users and flexible, fine-grained access control. By associating user attributes with their private keys, it reduces the need for centralized key management and the risk of key theft. The new scheme achieves assured deletion, and its decryption time changes little as the size of the data increases.

Background

Access structures

Definition 1 (access structure²⁷) Let $\{P_{1}, P_{2}, \dots, P_{n}\}$ be a set of parties. A collection $A \subseteq 2^{\{P_{1}, P_{2}, \dots, P_{n}\}}$ is monotone if $\forall B, C$: when $B \in A$ and $B \subseteq C$ then $C \in A$. An access structure (respectively, monotone access structure) is a collection (resp., monotone collection) $A$ of non-empty subsets of $\{P_{1}, P_{2}, \dots, P_{n}\}$, that is, $A \subseteq 2^{\{P_{1}, P_{2}, \dots, P_{n}\}} \setminus \{\emptyset\}$. The sets in $A$ are called the authorized sets, and the sets not in $A$ are called the unauthorized sets.

In our context, the role of the parties is taken by the attributes. Thus, the access structure A will contain the authorized sets of attributes. We restrict our attention to monotone access structures. However, it is also possible to (inefficiently) realize general access structures using our techniques by defining the “not” of an attribute as a separate attribute altogether. Thus, the number of attributes in the system will be doubled. From now on, unless stated otherwise, by an access structure, we mean a monotone access structure.

Linear secret-sharing scheme

We will make essential use of linear secret-sharing schemes. We adapt our definitions from those in Beimel.²⁷

Definition 2 (linear secret-sharing schemes (LSSS))

A secret-sharing scheme $Π$ over a set of parties $P$ is called linear (over $\mathbb{Z}_{p}$) if

The shares of the parties form a vector over $\mathbb{Z}_{p}$.

There exists a matrix $M$ with $l$ rows and $n$ columns called the share-generating matrix for $Π$. There exists a function $ρ$ which maps each row of the matrix to an associated party. That is, for $i = 1, \dots, l$, the value $ρ(i)$ is the party associated with row $i$. When we consider the column vector $v = (s, r_{2}, \dots, r_{n})$, where $s \in \mathbb{Z}_{p}$ is the secret to be shared and $r_{2}, \dots, r_{n} \in \mathbb{Z}_{p}$ are randomly chosen, then $Mv$ is the vector of $l$ shares of the secret $s$ according to $Π$. The share $(Mv)_{i}$ belongs to party $ρ(i)$.

It is shown in Green et al.²⁶ that every LSSS according to the above definition also enjoys the linear reconstruction property, defined as follows: suppose that $Π$ is an LSSS for the access structure $A$. Let $S \in A$ be any authorized set, and let $I \subset \{1, 2, \dots, l\}$ be defined as $I = \{i : ρ(i) \in S\}$. Then, there exist constants $\{w_{i} \in \mathbb{Z}_{p}\}_{i \in I}$ such that if $\{λ_{i}\}$ are valid shares of any secret $s$ according to $Π$, then $\sum_{i \in I} w_{i} λ_{i} = s$. It is shown in Xiong et al.²⁰ that these constants $\{w_{i}\}$ can be found in time polynomial in the size of the share-generating matrix $M$.

Like any secret-sharing scheme, it has the property that for any unauthorized set $S \notin A$ , the secret $s$ should be information theoretically hidden from the parties in $S$ .
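As an illustration of Definition 2 and the linear reconstruction property, the following minimal sketch shares a secret for the policy "A AND B". The prime `P`, the matrix `M`, the weights `(1, 1)`, and all function names are assumptions chosen for this example and do not come from the scheme itself.

```python
import secrets

P = 2**61 - 1  # a Mersenne prime standing in for the group order p (assumption)

# Share-generating matrix for the policy "A AND B": l = 2 rows, n = 2 columns.
M = [[1, 1],
     [0, P - 1]]      # P - 1 represents -1 modulo P
RHO = ["A", "B"]      # rho(i): row 0 belongs to attribute A, row 1 to attribute B


def make_shares(secret, matrix=M, p=P):
    """Return the share vector M v for v = (s, r_2, ..., r_n)."""
    n = len(matrix[0])
    v = [secret % p] + [secrets.randbelow(p) for _ in range(n - 1)]
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) % p for row in matrix]


def reconstruct(shares, weights, p=P):
    """Linear reconstruction: sum of w_i * lambda_i modulo p."""
    return sum(w * lam for w, lam in zip(weights, shares)) % p


s = 123456789
lam = make_shares(s)                 # lambda_1 = s + r_2, lambda_2 = -r_2
recovered = reconstruct(lam, [1, 1]) # w = (1, 1) for the authorized set {A, B}
```

With this matrix, each single share is uniformly distributed on its own, so a party holding only A or only B learns nothing about `s`, which matches the hiding property stated above.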

System notation

Symbolic definition

Table 1 lists the definition of the related symbols in the scheme.

Table 1.

Symbolic definition.

Symbol	Description
$λ$	The security parameter
$U = \{1, \dots, |U|\}$	The attribute set description
$p$	The prime order
$G, G_{T}$	The multiplicative cyclic groups of prime order $p$
$\mathbb{Z}_{p}$	The integers modulo $p$
$g, h_{1}, \dots, h_{|U|}$	Group elements $g, h_{1}, \dots, h_{|U|} \in G$
$α, β, s, z$	Random numbers $α, β, s, z \in \mathbb{Z}_{p}$
$PK$	The public parameters
$MSK$	The master key
$M$	The owner data
$m_{1}$	The remaining plaintext
$M_{2}$	The sampled plaintext
$m_{2}$	The bit information of the sampled plaintext
$d_{2}$	The position information of the sampled plaintext
$S$	A set of attributes
$A = (A, ρ)$	The access structure $A$
$C_{m 1} = (A, C^{m 1}, C_{0}, C_{i}, D_{i})$	The ciphertext of $m_{1}$
$C_{m 2} = (A, C^{m 2}, C_{0}, C_{i}, D_{i})$	The ciphertext of $m_{2}$
$C_{d 2} = (A, C^{d 2}, C_{0}, C_{i}, D_{i})$	The ciphertext of $d_{2}$
$S K_{S} = (S, K, K_{0}, K_{i})$	The private key consists of $S, K, K_{0}, K_{i}$
$T K_{S} = (S, TK, T K_{0}, T K_{i})$	The transpose key consists of $S, TK, T K_{0}, T K_{i}$
$R K_{S}$	The recovery key
$TC$	The transpose ciphertext

Algorithm definition

This scheme includes six algorithms, which are defined as follows:

$Setup (1^{λ}, U)$ : The setup algorithm takes as input the security parameter $λ$ and the attribute set description $U$ . It outputs the public parameters $PK$ and a master key $MSK$ .

$KeyGen (PK, MSK, S)$ : The key generation algorithm takes as input the public parameters $PK$ , the master key $MSK$ and a set of attributes $S$ that describe the key. It outputs a private key $S K_{S}$ .

$Gen (PK, S K_{S})$ : The generation algorithm takes as input the public parameters $PK$ and the private key $S K_{S}$. It outputs a transpose key $T K_{S}$ and a recovery key $R K_{S}$.

$Encrypt (PK, M, A)$ : The encryption algorithm takes as input the public parameters $PK$ , a message $M$ , and an access structure $A$ over the universe of attributes. The algorithm will encrypt $M$ and produce a ciphertext $C_{m 1}, C_{m 2}, C_{d 2}$ such that only a user that possesses a set of attributes that satisfies the access structure will be able to decrypt the message. We will assume that the ciphertext implicitly contains $A$ .

$ODecrypt (PK, T K_{S}, C_{m 1})$ : The outsourcing decryption algorithm takes as input the public parameters $PK$, a transpose key $T K_{S}$, and the ciphertext of $m_{1}$. It outputs a transpose ciphertext $TC$.

$Decrypt (PK, R K_{S}, TC, C_{m 1, m 2, d 2})$ : The decryption algorithm takes as input the public parameters $PK$ , a recovery key $R K_{S}$ , the transpose ciphertext and the ciphertext, then the algorithm will decrypt the ciphertext and return three parts of a message $M$ .

System model and scheme

System model

As shown in Figure 2, the system contains five types of entities: (1) the CSP, which offers storage services and cannot be fully trusted because it is curious about the contents of stored data, but which stores data honestly in order to gain commercial profit; (2) the data owner (DO), which uploads and stores its data at the CSP; (3) the authorized user (AU), which is authorized to access documents at the CSP; (4) the trusted third party (TTP), which is managed by an authority such as the government; and (5) the trusted authorized party (TAP), which generates the keys used in the algorithms. They interact as follows:

The TAP runs the setup algorithm $Setup (1^{λ}, U)$ and sends the public parameters $PK$ and a master key $MSK$ to the DO.

The DO samples the message $M$ so that $M = m_{1} + M_{2}$, where $m_{1}$ is the remaining plaintext and $M_{2}$ is the sampled plaintext, which consists of the bit information $m_{2}$ and the position information $d_{2}$ of the sampled plaintext. The DO runs the encryption algorithm $Encrypt (PK, M, A)$ and sends $A, C_{m 1}, PK$ and other information to the CSP.

The DO sends $A, C_{m 2}, C_{d 2}, PK$ and other information to the TTP.

The TAP runs the key generation algorithm $KeyGen (PK, MSK, S)$ and the generation algorithm $Gen (PK, S K_{S})$ , then sends the private key $S K_{S}$ , the recovery key $R K_{S}$ , the transpose key $T K_{S}$ , and the public parameters $PK$ to the AU.

The AU sends the set of attributes $S$ and the transpose key $T K_{S}$ to the CSP in order to access the document.

The CSP sends the set of attributes $S$ and the transpose key $T K_{S}$ to the TTP.

The CSP runs the outsourcing decryption algorithm $ODecrypt (PK, T K_{S}, C_{m 1})$ and then sends the transpose ciphertext $TC$ and $C_{m 1}$ to the AU.

The TTP sends $C_{m 2}$ and $C_{d 2}$ to the AU.

Figure 2.

System model.

The AU then runs the decryption algorithm $Decrypt (PK, R K_{S}, TC, C_{m 1, m 2, d 2})$ to decrypt the ciphertexts and obtain the three parts of the message $M$, and restores the plaintext according to the position information of the sampled plaintext. Assured deletion is achieved by destroying the ciphertexts stored at the TTP. The scheme thus keeps outsourced data confidential even if the key is obtained by accident or through a malicious attack.
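The sampling step performed by the DO and the restoration step performed by the AU can be sketched as follows. This is only an illustration under stated assumptions: it samples whole bytes rather than bits, picks positions with Python's `random` module (which is not cryptographically secure), and the names `sample_slice` and `restore` are hypothetical.

```python
import random


def sample_slice(message: bytes, count: int, seed=None):
    """Split M into the remaining plaintext m1, the sampled values m2,
    and the sorted sampling positions d2."""
    rng = random.Random(seed)                      # assumption: illustrative RNG only
    d2 = sorted(rng.sample(range(len(message)), count))
    chosen = set(d2)
    m2 = bytes(message[i] for i in d2)
    m1 = bytes(b for i, b in enumerate(message) if i not in chosen)
    return m1, m2, d2


def restore(m1: bytes, m2: bytes, d2):
    """Re-insert the sampled values at their positions to rebuild M."""
    sampled = dict(zip(d2, m2))
    rest = iter(m1)
    out = bytearray()
    for i in range(len(m1) + len(m2)):
        out.append(sampled[i] if i in sampled else next(rest))
    return bytes(out)


message = b"sensor reading: 21.5C"
m1, m2, d2 = sample_slice(message, count=5, seed=1)
rebuilt = restore(m1, m2, d2)
```

In the scheme, `m1` corresponds to the part whose ciphertext goes to the CSP, while `m2` and `d2` correspond to the parts whose ciphertexts are held by the TTP; once those are destroyed, `m1` alone no longer determines the original message.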

Scheme

Our cloud data assured deletion approach is based on the CP-ABE construction of Waters.²⁸

$Setup (1^{λ}, U)$ : It outputs the public parameters $PK$ and a master key $MSK$

PK = (p, G, G_{T}, e, g, g^{β}, e(g, g)^{α}, h_{1}, \dots, h_{|U|})

(1)

MSK = α

(2)

$Encrypt (PK, M, A)$ : It outputs the ciphertext $(A, C, C_{0}, C_{i}, D_{i})$

The ciphertext of $m_{1}$ is $C_{m 1} = (A, C^{m 1}, C_{0}^{m 1}, C_{i}^{m 1}, D_{i}^{m 1})$

The ciphertext of $m_{2}$ is $C_{m 2} = (A, C^{m 2}, C_{0}^{m 2}, C_{i}^{m 2}, D_{i}^{m 2})$

The ciphertext of $d_{2}$ is $C_{d 2} = (A, C^{d 2}, C_{0}^{d 2}, C_{i}^{d 2}, D_{i}^{d 2})$

C = M e(g, g)^{α s}, C_{0} = g^{s}, C_{i} = g^{β λ_{i}} h_{ρ(i)}^{-r_{i}}, D_{i} = g^{r_{i}}

(3)

$KeyGen (PK, MSK, S)$ : It outputs the private key $S K_{S} = (S, K, K_{0}, K_{i})$

K = g^{α} g^{β t}, K_{0} = g^{t}, K_{i} = h_{i}^{t}

(4)

$Gen (PK, S K_{S})$ : It outputs the transpose key $T K_{S} = (S, TK, T K_{0}, T K_{i})$ and the recovery key $R K_{S}$

TK = K^{\frac{1}{z}}, T K_{0} = K_{0}^{\frac{1}{z}}, T K_{i} = K_{i}^{\frac{1}{z}}

(5)

R K_{S} = z

(6)

$ODecrypt (PK, {TK}_{S}, C_{m 1})$ : It outputs the transpose ciphertext $TC$

\begin{aligned}
TC &= \frac{e(C_{0}, TK)}{\prod_{i \in I} \left( e(C_{i}, T K_{0}) \, e(T K_{ρ(i)}, D_{i}) \right)^{w_{i}}} \\
&= \frac{e(g^{s}, g^{\frac{α}{z}} g^{\frac{β t}{z}})}{\prod_{i \in I} \left( e(g^{β λ_{i}} h_{ρ(i)}^{-r_{i}}, g^{\frac{t}{z}}) \, e(h_{ρ(i)}^{\frac{t}{z}}, g^{r_{i}}) \right)^{w_{i}}} \\
&= \frac{e(g, g)^{\frac{α s}{z}} e(g, g)^{\frac{β s t}{z}}}{\prod_{i \in I} \left( e(g^{β λ_{i}}, g^{\frac{t}{z}}) \, e(h_{ρ(i)}^{-r_{i}}, g^{\frac{t}{z}}) \, e(h_{ρ(i)}^{\frac{t}{z}}, g^{r_{i}}) \right)^{w_{i}}} \\
&= \frac{e(g, g)^{\frac{α s}{z}} e(g, g)^{\frac{β s t}{z}}}{\prod_{i \in I} e(g^{β λ_{i}}, g^{\frac{t}{z}})^{w_{i}}} = \frac{e(g, g)^{\frac{α s}{z}} e(g, g)^{\frac{β s t}{z}}}{e(g, g)^{\frac{β t}{z} \sum_{i \in I} λ_{i} w_{i}}} \\
&= \frac{e(g, g)^{\frac{α s}{z}} e(g, g)^{\frac{β s t}{z}}}{e(g, g)^{\frac{β s t}{z}}} = e(g, g)^{\frac{α s}{z}}
\end{aligned}

(7)

$Decrypt (PK, {RK}_{S}, TC, C_{m 1, m 2, d 2})$ : It outputs three parts of a message $M$

m_{1} = \frac{C^{m 1}}{T C^{z}}, m_{2} = \frac{C^{m 2}}{T C^{z}}, d_{2} = \frac{C^{d 2}}{T C^{z}}

(8)

Performance analysis

Theoretical analysis

Theorem 1

If an attacker A breaks the scheme in polynomial time with a non-negligible advantage $ε$, then a simulator B can be constructed that solves the Decisional Bilinear Diffie-Hellman (DBDH) problem with advantage $ε / 2$ in polynomial time.

Proof

Let $G$ and $G_{T}$ be two cyclic multiplicative groups of prime order $p$. Let $g$ be a generator of $G$, and let $e$ be a bilinear pairing. Given an instance of DBDH, $g^{a}, g^{b}, g^{c}, e(g, g)^{z}$, where $z = abc$ or $z$ is a random number, the interactive process between the simulator B and the attacker A is as follows:

Init: Attacker A selects an access policy $B^{*}$ and sends it to the simulator B.

Setup: B sets up a parameter $Y = e (g, g)^{ab}$ , selects the random parameters $α$ and $β$ , and generates the public parameters $PK$ to A as in equation (1).

KeyQuery 1: Attacker A selects attribute sets which will not satisfy the access policy $B^{*}$ . The attacker A asks B about the key of the attribute sets, and B generates the private key to A as in equation (4).

Challenge: Attacker A selects two plaintexts $M_{0}$ and $M_{1}$ of equal length and sends them to B. B chooses a bit $b \in \{0, 1\}$ at random and encrypts the message $M_{b}$ according to the value of $b$. B sends the ciphertext to A, where $C = M_{b} Z, C_{0} = g^{s}$.

If $b = 0$, then $Z = e(g, g)^{abc}$, and if $s = c$, then $Y^{s} = e(g, g)^{abc}$ and $C_{0} = g^{s} = g^{c}$. Therefore, the ciphertext is a valid random encryption of $M_{b}$.

If $b = 1$, then $Z = e(g, g)^{z}$ and $C = M_{b} Z = M_{b} e(g, g)^{z}$. Since $Z$ is random, the ciphertext is a random element of $G_{T}$ from the attacker's point of view and contains no information about $M_{b}$.

KeyQuery 2: The same as KeyQuery 1.

Guess: Attacker A outputs a guess $b'$ of $b$. The probability that A guesses correctly when $b = 0$ is $\Pr[b' = b = 0] = 1/2 + ε$, and the probability that A guesses correctly when $b = 1$ is $\Pr[b' = b = 1] = 1/2$. The advantage of the simulator B in the game is $Adv = \frac{1}{2}(\Pr[b' = b = 0] + \Pr[b' = b = 1]) - \frac{1}{2} = \frac{ε}{2}$.

Implementation

Experimental environment

The testing environment was an Intel Core i5-6200U CPU at 2.30 GHz, 2.40 GHz with 8.0 GB RAM running Windows 10 (64 bit). The software was MyEclipse 10, JDK 1.6.0, and JPBC 1.2.0.

The symmetric encryption algorithm used in CP-ABE was the Advanced Encryption Standard (AES) implementation from the Java Cryptography Extension (JCE) security component of the Java Development Kit (JDK), with a key length of 128 bits. The test file sizes were 8, 16, 32, 64, 128, 256, and 512 MB, and the local decryption time was recorded for each. The decryption time is the time taken to turn the ciphertext data stream into the plaintext data stream; it does not include the time needed to restore the decrypted data stream to a file.

Experimental results and performance analysis

The results of the experiment are shown in Tables 2 and 3. The decryption time is the time taken to decrypt the encrypted data stream into a plaintext data stream. As shown in Figure 3, without outsourced decryption, as in Zhang et al.,²¹ the decryption time increases as the file grows, while in our scheme the local decryption time remains almost unchanged at about 1 millisecond (0.001 s in Table 2).

Table 2.

Decryption time.

File size (MB)	8	16	32	64	128	256	512
Decryption time (s)²¹	0.08	0.11	0.129	0.17	0.251	0.412	0.88
Decryption time (s) (our scheme)	0.001	0.001	0.001	0.001	0.001	0.001	0.001

Table 3.

Total time of decryption.

File size (MB)	8	16	32	64	128	256	512
Total time (s)²¹	0.2	0.75	1	2	4.3	9.3	19.26
Total time (s) (our scheme)	0.12	0.64	0.871	1.83	4.049	8.888	18.38

Figure 3.

Decryption time.

The total decryption procedure includes the decryption and the coding. As shown in Figure 4, the total decryption time of our scheme is less than the scheme in Zhang et al.,²¹ and the larger the amount of data, the better our scheme performs. As the encoding time occupies most of the total decryption time, the study of efficient coding strategy helps us improve the efficiency of decryption.

Figure 4.

Total time of decryption.
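The relationship between the two tables can be checked with a few lines of arithmetic. The sketch below uses only the values from Tables 2 and 3; the variable names are illustrative.

```python
sizes_mb = [8, 16, 32, 64, 128, 256, 512]
baseline_decrypt = [0.08, 0.11, 0.129, 0.17, 0.251, 0.412, 0.88]   # Table 2, Zhang et al.
baseline_total = [0.2, 0.75, 1, 2, 4.3, 9.3, 19.26]                # Table 3, Zhang et al.
ours_total = [0.12, 0.64, 0.871, 1.83, 4.049, 8.888, 18.38]        # Table 3, our scheme

# Absolute saving in total time (seconds) for each file size.
saved = [round(b - o, 3) for b, o in zip(baseline_total, ours_total)]

# Saving relative to the baseline total time (percent).
relative = [round(100 * (b - o) / b, 1) for b, o in zip(baseline_total, ours_total)]
```

The absolute saving grows with the file size and, to the precision reported, coincides with the baseline decryption times in Table 2, while the saving relative to the total time shrinks from 40% at 8 MB to below 5% at 512 MB. This is consistent with the observation that encoding dominates the total time.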

Experiments show that this scheme can realize the assured deletion over encrypted cloud data and reduce local decryption time by outsourcing complex bilinear pairings to a cloud server. With the huge amount of data in the cloud storage, the scheme overcomes the shortcoming of traditional algorithm that the decryption time increases with the increase of data. Although this study shortens local decryption time significantly, the time to restore the data stream to the file is far greater, which will be further discussed in future research.

Ciphertext deduplication protocol

Building on the assured deletion scheme described in the section "System model and scheme", we extend it with a ciphertext deduplication protocol. The protocol includes two sub-protocols: a duplicate data detection protocol and a provable ownership protocol.

As shown in Figure 5, suppose the DO wants to upload new data $M_{*}$. First, the system runs the duplicate data detection protocol, which consists of three steps, labeled 1, 2, and 3 in the system model. If the detection result shows that the file already exists in the cloud server, the TTP further verifies the DO's ownership. The user ownership verification protocol consists of four steps, labeled 4–7 in the system model. The two protocols are described in detail in the following subsections, with the steps numbered 1–7 as in the figure.

Figure 5.

Data deduplication protocol system model.

Duplicate data detection protocol

The scheme proposed in this section improves on the one in the literature.²⁹ A decision tree is a tree structure that classifies data according to different attributes. Each internal node represents a judgment on an attribute, each branch represents one outcome of that judgment, and each leaf node represents a classification result.

The cloud server creates a decision tree, as shown in Figure 6, for the data it stores. Each node holds two pieces of information: the hash value of a file and a node parameter. The cloud server generates a seed parameter $s_{0}$, which is the root node parameter, and then derives the parameters of the nodes in the left and right subtrees from the seed according to the following calculation rules.

Figure 6.

Duplicate data detection protocol.

Left subtree parameter calculation rule

s_{0 b_{1} b_{2} \dots b_{i}} = H(s_{0 b_{1} b_{2} \dots b_{i-1}} \| 0)

(9)

Right subtree parameter calculation rule

s_{0 b_{1} b_{2} \dots b_{i}} = H(s_{0 b_{1} b_{2} \dots b_{i-1}} \| 1)

(10)
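The two rules can be read as a hash chain along the path from the root: each step appends the branch bit to the parent parameter and hashes the result. A minimal sketch, assuming SHA-256 for $H$ and a one-byte encoding of the appended bit (neither is specified in the text; `node_parameter` is a hypothetical name):

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()      # assumption: SHA-256 stands in for H


def node_parameter(seed: bytes, path_bits):
    """Parameter of the node reached from the root by following path_bits,
    where 0 selects the left child and 1 the right child."""
    s = seed
    for b in path_bits:
        s = H(s + bytes([b]))                 # s_{...b_i} = H(s_{...b_{i-1}} || b_i)
    return s


seed = b"s0-example-seed"
left = node_parameter(seed, [0])
right = node_parameter(seed, [1])
left_right = node_parameter(seed, [0, 1])
```

Because every parameter is derived from the seed, the cloud server only needs to store $s_{0}$ and can recompute any node parameter on demand.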

1. The DO sends the query information $τ$ to the cloud server to learn whether the ciphertext of $M_{*}$ has already been stored in the cloud server. The query information $τ$ consists of two parts, $H(M_{*})$ and $b_{i}$.

b = B(H(M) \| s_{0 b_{1} b_{2} \dots b_{i}})

(11)

b_{i} = B(H(M) \| s_{0 b_{1} b_{2} \dots b_{i-1}})

(12)

(In the initial case, $b = - 1$ , which means that the first node of the verification is the root node of the tree).

2. The cloud server starts from the root node of the decision tree and checks whether the equation $H(M_{0}) = H(M_{*})$ holds. If it does not, the left or right subtree is selected to continue the verification according to the value of $B(H(M_{*}) \| s_{01})$. When the value is 1, the right subtree is selected and the equation $H(M_{1}) = H(M_{*})$ is checked. If this equation does not hold either, the calculation of $b$ and the verification continue down the tree. When an equation $H(M_{i}) = H(M_{*})$ holds, the cloud server returns 1 to the DO; otherwise, it returns 0.

3. When the equation holds, it means that there are duplicate data. The cloud server will send the hash value of the file to the TTP.
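Steps 1 to 3 can be sketched as a walk down the tree. The text does not define the function $B$, so the sketch below assumes it returns one bit taken from a SHA-256 digest; it also lets the server compute the branch bits itself instead of receiving them from the DO, which is a simplification. `DedupTree` and its methods are hypothetical names.

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()          # assumption: SHA-256 stands in for H


def B(data: bytes) -> int:
    """Assumed one-bit function: lowest bit of a domain-separated SHA-256 digest."""
    return hashlib.sha256(b"B" + data).digest()[-1] & 1


class DedupTree:
    def __init__(self, seed: bytes):
        self.seed = seed                          # root parameter s_0
        self.nodes = {}                           # path (tuple of bits) -> file hash

    def _walk(self, file_hash: bytes):
        """Follow branch bits until the hash matches or a free slot is reached."""
        path, param = (), self.seed
        while path in self.nodes and self.nodes[path] != file_hash:
            b = B(file_hash + param)              # b = B(H(M) || s_path)
            param = H(param + bytes([b]))         # parameter of the chosen child
            path += (b,)
        return path

    def insert(self, file_hash: bytes):
        self.nodes[self._walk(file_hash)] = file_hash

    def contains(self, file_hash: bytes) -> int:
        """Return 1 if a node with this hash is found, 0 otherwise."""
        return 1 if self._walk(file_hash) in self.nodes else 0


tree = DedupTree(b"s0-example-seed")
stored = [H(b"file-a"), H(b"file-b"), H(b"file-c")]
for h in stored:
    tree.insert(h)
duplicate = tree.contains(H(b"file-b"))
fresh = tree.contains(H(b"file-z"))
```

Since the branch bit depends only on the file hash and the node parameter, inserting and querying the same file always follow the same path, which is the property used in the analysis of the protocol.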

Analysis of duplicate data detection protocols

First, the protocol follows the correct path through the tree during the verification process. Since the decision tree in the protocol is a self-generating tree, if $s_{α} = s_{β}$ with $α = 0 b_{1} \dots b_{i}$ and $β = 0 b'_{1} \dots b'_{i}$, then $α = β$. Let $b = B(H(M) \| s_{α})$ and $b' = B(H(M') \| s_{β})$; then $b = b'$ if and only if $M = M'$.

Second, each step of the verification process checks whether the equation $H(M) = H(M')$ holds. The verification passes only when $M = M'$.

Provable ownership protocol

If Challenger C wants to access a file stored in the cloud server, it first submits query information to the cloud server. Both parties then run the duplicate data detection protocol (steps 1–3 above) to verify whether such a file exists in the cloud server. If the equation holds, then:

4. On receiving $H(M_{*})$ from the cloud server as in step 3, the TTP finds ($C_{m 2}$, $d_{2}$, $H(m_{2})$) according to the assured deletion algorithm above and generates a random number $t$. The plaintext of $d_{2}$ is sent to Challenger C along with $t$.

5. Challenger C samples its own data $M'_{*}$ with $d_{2}$ to get the bit information $m'_{2}$ of the sampled plaintext. C then calculates the hash values $H(m'_{2})$ and $H(t \| H(m'_{2}))$ and sends them to the TTP together with the attribute set $S$.

6. The TTP calculates $H(t \| H(m_{2}))$ from $t$ and the locally stored hash value $H(m_{2})$, and verifies whether the equation $H(t \| H(m_{2})) = H(t \| H(m'_{2}))$ holds. If it does not, Challenger C has failed the ownership verification protocol and does not really own the data, and the TTP sends 0 to the authorized organization. Otherwise, the TTP sends 1 and the user attribute set $S$ to the authorized organization to prove that C owns the data.

7. If the authorized organization receives 0, nothing is done. If it receives 1, Challenger C is authenticated as a legitimate user, and the authorized organization runs the two algorithms $KeyGen (PK, MSK, S)$ and $Gen (PK, S K_{S})$ to generate a private key $S K_{S}$, a recovery key $R K_{S}$, and a transpose key $T K_{S}$ for the legitimate user according to its attribute set. The keys are sent to the legitimate user together with $PK$.
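The challenge and response in steps 4 to 6 can be sketched as follows. The sketch assumes SHA-256 for $H$, a 16-byte random value for $t$, and byte-level sampling; it sends only $H(t \| H(m'_{2}))$ to the TTP and leaves out the attribute set and the key generation of step 7. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()          # assumption: SHA-256 stands in for H


def sample(data: bytes, d2):
    """Values of `data` at the sampled positions d2 (the bit information m_2)."""
    return bytes(data[i] for i in d2)


def ttp_challenge(d2):
    """Step 4: the TTP sends d_2 together with a fresh random number t."""
    return d2, secrets.token_bytes(16)


def challenger_response(candidate: bytes, d2, t: bytes) -> bytes:
    """Step 5: sample the owned data with d_2 and return H(t || H(m_2'))."""
    return H(t + H(sample(candidate, d2)))


def ttp_verify(stored_hash: bytes, t: bytes, response: bytes) -> int:
    """Step 6: compare with H(t || H(m_2)); 1 means ownership is accepted."""
    return 1 if hmac.compare_digest(response, H(t + stored_hash)) else 0


original = b"temperature=21.5;humidity=40;node=17"
d2 = [1, 5, 8, 13, 21]
stored_hash = H(sample(original, d2))             # kept by the TTP at upload time

positions, t = ttp_challenge(d2)
ok = ttp_verify(stored_hash, t, challenger_response(original, positions, t))

forged = b"tXmperature=21.5;humidity=40;node=17"  # differs at sampled position 1
bad = ttp_verify(stored_hash, t, challenger_response(forged, positions, t))
```

The fresh value `t` ties each response to one challenge, so a response recorded in an earlier run does not verify in a later one. A file that differs from the original only at positions outside `d2` would still pass, which is why the number of sampled positions matters.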

Analysis of the proof of ownership protocol

Following the proof steps for the random detection of data blocks in Zhao,³⁰ we have the following verification. Suppose that the file to be verified contains $n$ data blocks, of which $t$ are inconsistent with the original file. The TTP verifies $c$ randomly selected data blocks, and $R$ is the number of inconsistent blocks among them. $P_{R}$ is the probability that at least one of the $c$ selected blocks is inconsistent, that is, the probability that the inconsistency is detected. Then

\begin{aligned}
P_{R} &= P\{R \geq 1\} = 1 - P\{R = 0\} \\
&= 1 - \frac{n - t}{n} \times \frac{n - 1 - t}{n - 1} \times \dots \times \frac{n - c + 1 - t}{n - c + 1}
\end{aligned}

(13)

Since

\frac{n - i - t}{n - i} \geq \frac{n - (i + 1) - t}{n - (i + 1)}

(14)

it follows that

1 - {(\frac{n - t}{n})}^{c} \leq P_{R} \leq 1 - {(\frac{n - c + 1 - t}{n - c + 1})}^{c}

(15)

The analysis shows that when 1% of the $n$ blocks are inconsistent, checking only 460 randomly selected blocks is enough to reach a detection probability above 99%. In other words, when more than 1% of the data blocks of the file under test differ from the original file, selecting 460 data blocks for detection finds an inconsistent one with probability greater than 99%.
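The figure of 460 blocks can be checked numerically from equations (13) and (15). The sketch below assumes a file of $n = 10\,000$ blocks with $t = 100$ inconsistent ones (1%); these values and the function names are chosen only for illustration, and the code assumes $c \leq n - t$.

```python
def detection_probability(n: int, t: int, c: int) -> float:
    """Exact P_R from equation (13): 1 minus the chance that all c blocks are consistent."""
    miss = 1.0
    for i in range(c):
        miss *= (n - i - t) / (n - i)
    return 1 - miss


def bounds(n: int, t: int, c: int):
    """Lower and upper bounds on P_R from equation (15)."""
    lower = 1 - ((n - t) / n) ** c
    upper = 1 - ((n - c + 1 - t) / (n - c + 1)) ** c
    return lower, upper


n, t, c = 10_000, 100, 460
exact = detection_probability(n, t, c)
lower, upper = bounds(n, t, c)
```

The lower bound depends only on the fraction $t/n$ and on $c$, so for 1% inconsistent blocks it equals $1 - 0.99^{460}$, which is slightly above 0.99 whatever the file size.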

Protocol workflow

The system should further implement the two functions of assured deletion and ciphertext deduplication on the basis of implementing the cloud storage service function. As shown in Figure 7, the workflow of the protocol is described below.

Figure 7.

Protocol workflow.

The DO U1 wants to upload a file to the cloud server and runs a duplicate data detection protocol with the cloud server first. The detection result will show whether the file already exists in the cloud server or not.

If the file already exists in the cloud server, the user ownership verification protocol is run between U1 and the TTP. If U1 passes the ownership verification protocol, it is a legitimate user who can access the cloud data. Otherwise, U1 is regarded as an illegitimate user of the cloud data.

If the file does not exist in the cloud server, U1 becomes the initial uploader and uploads the encrypted, sample-sliced file. The user can use the cloud storage service normally during the authorization period. When the authorization expires, the TTP performs the assured deletion protocol to delete the sampled data information.

Conclusion

The wireless sensor network (WSN) in the IoT is the data source of cloud computing, and cloud computing makes the heterogeneous data available to many services and applications in the IoT. Because both are based on the Internet, the combination of the IoT and cloud computing faces problems in practice. The assured deletion of encrypted cloud data is one of these issues. After reviewing the existing assured deletion methods, we adopt outsourced decryption to improve the assured deletion of cloud data based on ciphertext sampling. In this method, the complex bilinear pairings are computed by cloud servers instead of locally. Therefore, once the user gets the transposed ciphertext, only one simple calculation is needed to obtain the plaintext data stream. Performance analysis and tests show that our scheme effectively shortens the user's local decryption time and improves the security of confidential data while improving decryption efficiency. An attacker has to obtain four pieces of information to recover the relevant data: the transpose key, the remaining ciphertext, the bit information of the sampled ciphertext, and the position information of the sampled ciphertext. Future work will extend the assured deletion and deduplication functions for encrypted big data in the cloud.

Footnotes

Handling Editor: Fei Yu

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

1. Qiu, Chen, et al. Heterogeneous ad hoc networks: architectures, advances and challenges. Ad Hoc Netw 2016; 55: 143–152.

2. Qiu, Chen, et al. How can heterogeneous Internet of Things build our future: a survey. IEEE Commun Surv Tut 2018; 20: 2011–2027.

3. Maksymyuk, Strykhalyuk, et al. Device-to-device-based heterogeneous radio access network architecture for mobile cloud computing. IEEE Wirel Commun 2015; 22: 50–58.

4. Stergiou, Psannis, Kim, et al. Secure integration of IoT and cloud computing. Future Gener Comp Sy 2016; 78: 964–975.

5. Suciu, Vulpe, Halunga, et al. Smart cities built on resilient cloud computing and secure Internet of Things. In: International conference on control systems and computer science, Bucharest, 29–31 May 2013, pp.513–518. New York: IEEE.

6. Zhou, Leppanen, Harjula, et al. CloudThings: a common architecture for integrating the Internet of Things with cloud computing. In: 17th IEEE international conference on computer supported cooperative work in design, Whistler, BC, Canada, 27–29 June 2013, pp.651–657. New York: IEEE.

7. Iqbal, Kiah MLM, Dhaghighi, et al. On cloud security attacks: a taxonomy and intrusion detection and prevention as a service. J Netw Comput Appl 2016; 74: 98–120.

8. Mao, Yang, Mao. Survey on wireless sensor network applications. Comput Appl Softw 2008; 25: 179–181.

9. Xiong, Wang. Research progress on cloud data assured deletion based on cryptography. J Commun 2016; 37: 167–184.

10. Zhang, Chen. Lifetime privacy and self-destruction of data in the cloud. J Comput Res Dev 2011; 48: 1155–1167.

11. Zhao, Mannan. Gracewipe: secure and verifiable deletion under coercion. In: 22nd annual conference on network & distributed system security (ISOC NDSS), San Diego, CA, USA, 8–11 February 2015, pp.1–13. USA: Internet Society.

12. Perlman. File system design with assured delete. In: Third IEEE international security in storage workshop (SISW), San Francisco, CA, 13 December 2005, pp.83–88. New York: IEEE.

13. Perlman. File system design with assured delete. In: 14th annual network & distributed system security (ISOC NDSS), San Diego, CA, 28 February–2 March 2007, pp.1–7. CA: IEEE.

14. Geambasu, Kohno, Levy, et al. Vanish: increasing data privacy with self-destructing data. In: 18th USENIX security symposium, Montreal, QC, Canada, 10–14 August 2009, pp.299–315. USA: USENIX Association.

15. Reimann, Durmuth. Timed revocation of user data: long expiration times from existing infrastructure. In: ACM workshop on privacy in the electronic society (WPES), Raleigh, NC, 15 October 2012, pp.65–74.

16. Castelluccia, De Cristofaro, Francillon, et al. EphPub: toward robust ephemeral publishing. In: 19th IEEE international conference on network protocols (IEEE ICNP), Vancouver, BC, Canada, 17–20 October 2011, pp.165–175. New York: IEEE.

17. Atallah, Blanton, Fazio, et al. Dynamic and efficient key management for access hierarchies. ACM T Inform Syst Se 2009; 12: 1–43.

18. Wang, Owens, et al. Secure and efficient access to outsourced data. In: ACM workshop on cloud computing security, Chicago, IL, 13 November 2009, pp.55–66. New York: IEEE.

19. Sahai, Waters. Fuzzy identity-based encryption. In: Proceedings of the 24th annual international conference on theory and applications of cryptographic techniques (EUROCRYPT), Aarhus, 22–26 May 2005, pp.457–473. Berlin: Springer.

20. Xiong, Liu, Yao, et al. A secure data self-destructing scheme in cloud computing. IEEE T Cloud Comput 2014; 2: 448–458.

21. Zhang, Yang. Novel cloud data assured deletion approach based on ciphertext sample slice. J Commun 2015; 36: 108–117.

22. Wang, Yue, Liu. A secure self-destructing scheme for electronic data. J Comput Syst Sci 2013; 79: 279–290.

23. Xiong, Yao. A secure self-destruction scheme with IBE for the internet content privacy. Chin J Comput 2014; 37: 139–150.

24. Xiong, et al. A full lifecycle privacy protection scheme for sensitive data in cloud computing. Peer-to-Peer Networking & Applications 2015; 8: 1025–1037.

25. Liu. Research on outsourced CP-ABE in cloud computing. Guangzhou, China: Jinan University, 2016.

26. Green, Hohenberger, Waters. Outsourcing the decryption of ABE ciphertexts. In: USENIX conference on security, San Francisco, CA, 8–12 August 2011, p.34. Berkeley, CA: USENIX Association.

27. Beimel. Secure schemes for secret sharing and key distribution. PhD Dissertation, Israel Institute of Technology, Haifa, Israel, 1996.

28. Waters. Ciphertext-policy attribute-based encryption: an expressive, efficient, and provably secure realization. In: Catalano, Fazio, Gennaro, et al. (eds) Public key cryptography. Berlin: Springer, 2011, pp.53–70.

29. Jiang, Chen, et al. Secure and efficient cloud data deduplication with randomized tag. IEEE T Inf Foren Sec 2017; 12: 532–543.

30. Zhao. Application of third-party auditor in cloud storage data integrity verification. Chengdu, China: University of Electronic Science and Technology of China, 2015.