An efficient secure Internet of things data storage auditing protocol with adjustable parameter in cloud computing

Abstract

Nowadays, an increasing number of cloud users, including both individuals and enterprises, store their Internet of things data in the cloud for benefits such as cost saving. However, the cloud storage service is often regarded as untrusted because users lose direct control over their data. Hence, it is necessary to verify the integrity of their data on cloud storage servers via a third party. In real cloud systems, where the data size is very large, a well-designed and cost-effective auditing protocol is needed to meet the performance requirement. In this article, we propose an auditing protocol based on pairing-based cryptography, which reduces the computation cost compared with the state-of-the-art third-party auditing protocol. Moreover, we study how to determine the number of sectors that achieves the optimal performance of our auditing protocol for the same amount of challenged data, and we propose an equation for computing the optimal number of sectors to further improve performance. Both mathematical analysis and experiment results show that our solution is more efficient.

Keywords

Internet of things data, cloud storage, integrity auditing, third-party auditing, pairing-based cryptography

Introduction

Internet of things (IoT) has a deep impact on our daily lives.¹ IoT applications can generate a great deal of data, such as heart rate and blood pressure data, water quality data, and industrial big data.² Nowadays, an increasing number of cloud users, including both individuals and enterprises, store their IoT data in the cloud for benefits such as cost saving and ease of data sharing. However, the cloud storage service is often regarded as untrusted because users lose direct control over their data. Because management of the user data is delegated to untrusted cloud storage providers, the data on the cloud can be lost for several reasons. The disks of the storage servers could be damaged and the maintainer might not successfully restore the lost data, yet the cloud storage provider might not report this error to the affected users. Moreover, cloud storage providers could be dishonest: to reclaim cloud storage space, they could discard user data that have never or rarely been accessed. Thus, the users must be able to periodically verify the integrity of the data stored in the untrusted cloud storage.

It should be noted that we assume the verifier cannot use a local copy of the data when executing the provable data possession (PDP) protocol. An intuitive approach is for the client to store a file on the server and keep, as local metadata, a hash value of the file computed with a cryptographic hash function such as MD5 or SHA. Later, the server sends the entire file to the client, and the client recalculates the hash value of the received file and compares the two hash values to verify whether the file is stored correctly in cloud storage. But this approach has some critical issues. In a real cloud storage system, both the number of files and the data size are very large, so sending these files from the servers to the clients across the network requires a very large amount of bandwidth and time. Hence, traditional integrity checking techniques such as cryptographic hash functions cannot meet the real requirement. That is the first issue. Accordingly, some solutions have used techniques such as RSA-based hash functions^3,4 and homomorphic hash functions^5,6 to solve the first issue.

With these techniques, the communication cost can be $O(1)$. But these solutions still have a second issue: the server needs to access the entire file, which incurs expensive disk I/O and a large amount of computation. Accordingly, a sampling technique called spot checking was used to reduce the expensive disk I/O and computation costs while still achieving a high probabilistic possession guarantee.⁷ Solutions using the sampling technique need only a constant amount of data to execute the PDP protocol, independent of the total size of a file.

To solve the data privacy problem (i.e. to ensure that the auditor cannot recover the data blocks from the server’s data proof), auditing protocols have been constructed on cryptographic algorithms such as pairing-based cryptography⁸ and homomorphic encryption,⁹ so that the auditor can verify the server’s proof but cannot decrypt and recover the data blocks. To reduce the number of data tags and improve performance, these auditing protocols also use the data fragment technique to split each data block into sectors, as in Yang and Jia’s⁸ solution. However, in Yang’s auditing protocol the data fragment technique increases the number of bilinear pairing computations, so the computation cost of the algorithm increases with the number of bilinear pairing computations.

Our contributions can be summarized as follows:

We propose an improved secure data storage auditing protocol based on Yang’s solution. Our solution is more efficient than Yang’s because the bilinear pairing operation is precomputed on the user client.

We further improve the performance of our auditing protocol by making the number of sectors an adjustable parameter for the same amount of challenged data. The optimal number of sectors can be determined directly by an equation according to the amount of challenged data.

Related work

IoT data can be very important, and vast quantities of such data are increasingly outsourced to the cloud. Botta et al.¹⁰ and Díaz et al.¹¹ have discussed the integration of IoT and cloud computing. Integrity verification is therefore a very important problem for outsourced IoT data,² and many solutions have been proposed. In some previous solutions, the data owners are responsible for verifying the data integrity of cloud storage. To guarantee the fairness of data possession checking, third-party auditing was introduced.^12,13 It should meet three requirements (confidentiality, dynamic auditing, and batch auditing).^13–15 Recently, several new third-party auditing protocols were proposed to allow the auditor to verify the data integrity on the cloud storage server.^8,9,16,17 Wang et al.⁹ use homomorphic encryption to solve the data privacy problem. Yang and Jia⁸ proposed a new scheme based on pairing-based cryptography to solve the data privacy problem. Both of these auditing protocols are constructed on cryptographic algorithms so that the third-party auditor can verify the server’s proof but cannot decrypt and recover the data blocks. Yang and Jia⁸ also utilize the data fragment technique to split each data block into sectors to reduce the number of data tags and improve the performance. Yang et al.¹⁸ developed a novel error detection approach based on the scale-free network topology and most of detection operations to support fast detection and locating of errors in big sensor data on cloud. Because dynamic data supporting insert, delete, and modify operations is a very useful feature on the cloud, integrity verification should support dynamic data so that data owners can update their data on the cloud. Liu et al.¹⁹ proposed an auditing scheme that supports both public auditing and fine-grained updates over variable-sized file blocks instead of fixed-size blocks. Zhang et al.²⁰ proposed a novel approach based on an upper bound constraint to identify the intermediate data sets that need not be encrypted on the cloud and thus reduce the privacy-preserving cost.

Preliminaries

In this section, we describe the system model and security model for a secure data storage auditing system. More information about the secure data storage auditing can be found in Yang and Jia,⁸ Sookhak and colleagues,^21,22 and Shin and Kwon.²³

System model

As shown in Figure 1, a secure IoT data storage auditing system model includes three entities:

Owner who moves their files to the cloud server;

Cloud Server who stores the owner’s data and provides the data access to users;

Auditor who is a third party and responsible for checking the integrity of data stored in the cloud server on behalf of the owner.

Figure 1.

System model of the secure data storage auditing system.

A storage auditing protocol consists of five algorithms (KeyGen, TagGen, Chall, Prove, and Verify):

KeyGen is a key generation algorithm that can be used to generate a public key and a secret key by the owner.

TagGen is a tag generation algorithm that can be used to compute a data tag for each data block of an owner’s data file by the owner. And the owner moves the data blocks and the corresponding data tags to the cloud server.

Chall is a challenge algorithm that can be used to randomly choose some data blocks as the challenge data set by the auditor. And the auditor sends the challenge data set to the cloud server.

Prove is a proof algorithm that can be used to generate a data proof to the auditor by the cloud server.

Verify is a verification algorithm that can be used to check the integrity of data file by verifying the correctness of data proof.

Security model

We assume that the auditor is semi-trusted such that they will perform honestly during the whole auditing procedure, but will not be trusted for data confidentiality. And the cloud server may be fully dishonest in terms of the data confidentiality and integrity.

Thus, the secure data storage auditing system faces two main threats: a data privacy threat from both the cloud server and the auditor, and data integrity threats from the cloud server.²⁴ The cloud server and the auditor could be curious about the owner’s data file, so the auditing protocol should protect the data privacy against both of them. In addition, the cloud server may carry out the following attacks:⁸

Replace attack. The cloud server may choose another valid pair of data block and data tag to replace the challenged pair of data block and data tag that have been discarded.

Forge attack. The cloud server may forge the data tag of data block and deceive the auditor, if the owner’s secret tag keys are reused for the different versions of data.

Replay attack. The cloud server may generate the proof from the previous proof or other information, without retrieving the actual owner’s data.

Our efficient secure data auditing protocol

Table 1 lists some notations used by algorithms for our secure data storage auditing protocol.

Table 1.

Notations for our secure data storage auditing protocol.

Symbol	Physical meaning
$λ$	Security parameter
$s k_{t}$	Secret tag key
$p k_{t}$	Public tag key
$s k_{h}$	Secret hash key
$n$	Number of blocks in each component
$M$	Data component divided into n data blocks, $M = \{m_{1}, m_{2}, \dots, m_{n}\}$
$s$	Number of sectors in each data block
$m_{ij}$	The jth sector of data block $m_{i}$, where $1 \leq i \leq n$ and $1 \leq j \leq s$
$T$	Set of data tags, $T = \{t_{1}, t_{2}, \dots, t_{n}\}$
$W_{i}$	$W_{i} = FID ∥ i$, where “∥” denotes the concatenation operation, FID is the identifier of the data, and i represents the block number of $m_{i}$, where $1 \leq i \leq n$
$M_{info}$	Abstract information of M
$C$	Challenge generated by the auditor
$P$	Proof generated by the server

Let $G_{1}$, $G_{2}$, and $G_{T}$ be multiplicative groups of the same prime order p, and let $e : G_{1} \times G_{2} \to G_{T}$ be a bilinear map. Let $g_{1}$ be a generator of $G_{1}$ and $g_{2}$ be a generator of $G_{2}$. Let $h : \{0, 1\}^{*} \to G_{1}$ be a keyed secure hash function that maps $M_{info}$ to an element in $G_{1}$.

Algorithms for our secure data auditing protocol

Generally, the pairing computation is the most expensive part of pairing-based cryptography. Yang’s secure data storage auditing protocol protects the data privacy against the auditor using pairing-based cryptography, so the pairing computation is also a very expensive part of their solution. In this article, we propose an improved secure data storage auditing protocol based on Yang’s to reduce the total computation cost of one audit. We let the user precompute the pairing operation so as to avoid it on the cloud side during the auditing process. Yang’s secure data auditing protocol includes five algorithms: KeyGen, TagGen, Chall, Prove, and Verify. Our data auditing protocol also has five algorithms. They are defined as follows; our definitions of TagGen, Chall, and Prove differ from Yang’s.

KeyGen $(λ) \to (pk_{t}, sk_{t}, sk_{h})$

Input: $λ$

Output: $pk_{t}, sk_{t}, sk_{h}$

1. It chooses two random numbers $sk_{t}, sk_{h} \in Z_{p}$.

2. It computes $pk_{t} = g_{2}^{sk_{t}}$, where $pk_{t} \in G_{2}$.

TagGen $(M, sk_{t}, sk_{h}) \to (T, E)$

Input: $M = \{m_{1}, m_{2}, \dots, m_{n}\}, sk_{t}, sk_{h}$

Output: $T = \{t_{1}, t_{2}, \dots, t_{n}\}, E$

1. It chooses s random numbers $x_{1}, x_{2}, \dots, x_{s} \in Z_{p}$.

2. It computes $u_{j} = g_{1}^{x_{j}}$, where $1 \leq j \leq s$ and $u_{j} \in G_{1}$.

3. It computes $e_{g_{1}, g_{2}} = e (g_{1}, g_{2})$ and then computes $E_{j} = e_{g_{1}, g_{2}}^{x_{j} \cdot sk_{t}}$, where $1 \leq j \leq s$ and $E_{j} \in G_{T}$.

4. It computes a data tag $t_{i} = \left( h (sk_{h}, W_{i}) \cdot \prod_{j = 1}^{s} u_{j}^{m_{ij}} \right)^{sk_{t}}$ for each data block $m_{i}$, where $t_{i} \in G_{1}$.

Chall $(M_{info}, E) \to C$

Input: $M_{info}, E$

Output: $C$

1. It randomly chooses some data block numbers to construct a challenge set $Q$.

2. It chooses a random number $v_{i} \in Z_{p}^{*}$ for each data block $m_{i}$, where $i \in Q$.

3. It selects a random number $r \in Z_{p}^{*}$ and computes $E_{j}^{'} = E_{j}^{r}$ for each $E_{j}$, where $1 \leq j \leq s$.

4. It constructs $C = (\{i, v_{i}\}_{i \in Q}, \{E_{j}^{'}\}_{j \in [1, s]})$.

Prove $(M, T, C) \to P$

Input: $M, T, C$

Output: $P$

1. It computes the tag proof $TP$ as $TP = \prod_{i \in Q} t_{i}^{v_{i}}$, where $TP \in G_{1}$.

2. It computes $MP_{j} = \sum_{i \in Q} v_{i} \cdot m_{ij}$, where $1 \leq j \leq s$.

3. It computes the data proof $DP$ as $DP = \prod_{j = 1}^{s} (E_{j}^{'})^{MP_{j}}$, where $DP \in G_{T}$.

4. It constructs $P = (TP, DP)$.

Verify $(C, P, sk_{h}, pk_{t}, M_{info}) \to 0 / 1$

Input: $C, P, sk_{h}, pk_{t}, M_{info}$

Output: $0 / 1$

1. It computes the challenge hash $H_{chal} = \prod_{i \in Q} h (sk_{h}, W_{i})^{r \cdot v_{i}}$.

2. It checks the following equation

DP \cdot e (H_{chal}, pk_{t}) = e (TP, g_{2}^{r})

If the equation holds, it outputs 1. Otherwise, it outputs 0.

Correctness analysis

The correctness and verification equation of our improved auditing protocol can be written in detail as follows

\begin{aligned}
DP \cdot e (H_{chal}, pk_{t}) &= \prod_{j = 1}^{s} (E_{j}^{'})^{MP_{j}} \cdot e (H_{chal}, pk_{t}) \\
&= \prod_{j = 1}^{s} (E_{j}^{r})^{MP_{j}} \cdot e (H_{chal}, pk_{t}) \\
&= \prod_{j = 1}^{s} e (g_{1}, g_{2})^{x_{j} \cdot sk_{t} \cdot r \cdot MP_{j}} \cdot e (H_{chal}, pk_{t}) \\
&= \prod_{j = 1}^{s} e (g_{1}^{x_{j}}, g_{2}^{sk_{t}})^{r \cdot MP_{j}} \cdot e (H_{chal}, pk_{t}) \\
&= \prod_{j = 1}^{s} e (u_{j}, pk_{t})^{r \cdot MP_{j}} \cdot e (H_{chal}, pk_{t}) \\
&= \prod_{j = 1}^{s} e (u_{j}, pk_{t})^{r \cdot \sum_{i \in Q} v_{i} \cdot m_{ij}} \cdot e \left( \prod_{i \in Q} h (sk_{h}, W_{i})^{r \cdot v_{i}}, pk_{t} \right) \\
&= \prod_{j = 1}^{s} \prod_{i \in Q} e (u_{j}, pk_{t}^{r})^{v_{i} \cdot m_{ij}} \cdot \prod_{i \in Q} e (h (sk_{h}, W_{i}), pk_{t}^{r \cdot v_{i}}) \\
&= \prod_{j = 1}^{s} \prod_{i \in Q} e (u_{j}^{m_{ij}}, pk_{t}^{r \cdot v_{i}}) \cdot \prod_{i \in Q} e (h (sk_{h}, W_{i}), pk_{t}^{r \cdot v_{i}}) \\
&= \prod_{i \in Q} \prod_{j = 1}^{s} e (u_{j}^{m_{ij}}, pk_{t}^{r \cdot v_{i}}) \cdot \prod_{i \in Q} e (h (sk_{h}, W_{i}), pk_{t}^{r \cdot v_{i}}) \\
&= \prod_{i \in Q} \left[ e \left( \prod_{j = 1}^{s} u_{j}^{m_{ij}}, pk_{t}^{r \cdot v_{i}} \right) \cdot e (h (sk_{h}, W_{i}), pk_{t}^{r \cdot v_{i}}) \right] \\
&= \prod_{i \in Q} e \left( \prod_{j = 1}^{s} u_{j}^{m_{ij}} \cdot h (sk_{h}, W_{i}), g_{2}^{r \cdot v_{i} \cdot sk_{t}} \right) \\
&= \prod_{i \in Q} e \left( \left( h (sk_{h}, W_{i}) \cdot \prod_{j = 1}^{s} u_{j}^{m_{ij}} \right)^{sk_{t}}, g_{2}^{r \cdot v_{i}} \right) \\
&= \prod_{i \in Q} e (t_{i}, g_{2}^{r \cdot v_{i}}) \\
&= \prod_{i \in Q} e (t_{i}^{v_{i}}, g_{2}^{r}) \\
&= e \left( \prod_{i \in Q} t_{i}^{v_{i}}, g_{2}^{r} \right) \\
&= e (TP, g_{2}^{r})
\end{aligned}

(1)

From equation (1), the cloud server can pass the audit if the data blocks and the data tags are stored correctly. However, if any challenged data block or data tag is corrupted or modified, the verification equation will not hold and the cloud server cannot pass the audit.
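The algebra of equation (1) can be checked mechanically with a toy model. The sketch below is not a secure implementation and does not use a real pairing library: it assumes every group element is represented by its discrete logarithm modulo a prime, so that multiplying elements adds logarithms, exponentiation multiplies them, and the pairing multiplies two logarithms. The names `ORDER`, `keyed_hash`, and the use of SHA-256 for the keyed hash are illustrative assumptions, not part of the protocol definition.

```python
import hashlib
import random

# Toy model, NOT secure: each group element is stored as its discrete log
# modulo a prime, so multiplying elements adds logs, exponentiation multiplies
# the log, and e(g1^a, g2^b) = e(g1, g2)^(a*b) multiplies two logs.
ORDER = 2**61 - 1  # a prime standing in for the group order p


def keyed_hash(sk_h, i):
    """Hypothetical stand-in for h(sk_h, W_i) with W_i = FID || i."""
    digest = hashlib.sha256(f"{sk_h}|FID||{i}".encode()).digest()
    return int.from_bytes(digest, "big") % ORDER


def key_gen(rng):
    sk_t, sk_h = rng.randrange(1, ORDER), rng.randrange(1, ORDER)
    # In this toy model the log of pk_t = g2^sk_t is sk_t itself.
    return sk_t, sk_t, sk_h


def tag_gen(blocks, sk_t, sk_h, rng):
    x = [rng.randrange(1, ORDER) for _ in blocks[0]]  # logs of u_j = g1^x_j
    E = [xj * sk_t % ORDER for xj in x]               # E_j = e(g1, g2)^(x_j * sk_t)
    tags = []
    for i, block in enumerate(blocks, start=1):
        inner = keyed_hash(sk_h, i) + sum(xj * m for xj, m in zip(x, block))
        tags.append(inner * sk_t % ORDER)             # t_i = (h * prod u_j^m_ij)^sk_t
    return tags, E


def chall(n, E, b, rng):
    Q = {i: rng.randrange(1, ORDER) for i in rng.sample(range(1, n + 1), b)}
    r = rng.randrange(1, ORDER)                       # kept by the auditor
    return (Q, [Ej * r % ORDER for Ej in E]), r       # E'_j = E_j^r


def prove(blocks, tags, C):
    Q, E_prime = C
    TP = sum(v * tags[i - 1] for i, v in Q.items()) % ORDER
    MP = [sum(v * blocks[i - 1][j] for i, v in Q.items()) % ORDER
          for j in range(len(E_prime))]
    DP = sum(Ej * MPj for Ej, MPj in zip(E_prime, MP)) % ORDER
    return TP, DP


def verify(C, r, proof, sk_h, pk_t):
    Q, _ = C
    TP, DP = proof
    H_chal = sum(keyed_hash(sk_h, i) * r * v for i, v in Q.items()) % ORDER
    lhs = (DP + H_chal * pk_t) % ORDER  # DP * e(H_chal, pk_t)
    rhs = TP * r % ORDER                # e(TP, g2^r)
    return lhs == rhs


if __name__ == "__main__":
    rng = random.Random(0)
    blocks = [[rng.randrange(ORDER) for _ in range(4)] for _ in range(6)]
    pk_t, sk_t, sk_h = key_gen(rng)
    tags, E = tag_gen(blocks, sk_t, sk_h, rng)
    C, r = chall(len(blocks), E, 3, rng)
    print(verify(C, r, prove(blocks, tags, C), sk_h, pk_t))
```

Honest storage passes the check, while changing any sector of a challenged block shifts the data proof without a matching change in the tag proof, so the check fails. Because the toy model exposes discrete logarithms directly, it illustrates only correctness, not privacy or unforgeability.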

Security analysis

We first prove that our improved auditing protocol is provably secure under the security model. Then, we prove that it can guarantee the data privacy.

Theorem 1. Our improved auditing protocol can resist the Replace Attack, Forge Attack, and Replay Attack.

Proof. On the cloud server side, the received value $E_{j}^{'}$ satisfies

\begin{aligned}
E_{j}^{'} &= E_{j}^{r} \\
&= e (g_{1}, g_{2})^{x_{j} \cdot sk_{t} \cdot r} \\
&= e (g_{1}^{x_{j}}, g_{2}^{sk_{t} \cdot r}) \\
&= e (u_{j}, pk_{t}^{r}) \\
&= e (u_{j}, R)
\end{aligned}

where $R = pk_{t}^{r}$. Hence, $E_{j}^{'}$ equals the pairing value $e (u_{j}, R)$ used on the server side in Yang’s protocol, and the cloud server side of our improved auditing protocol is the same as Yang’s. Because Yang’s auditing protocol can resist the three attacks, our improved auditing protocol can also resist them.

Theorem 2. In our improved auditing protocol, neither the server nor the auditor can obtain any information about the data and the secret tag key during the auditing procedure.

Proof. The cloud server side of our improved auditing protocol is the same as Yang’s. Because Yang’s auditing protocol is confidential against the server during the auditing procedure, our improved auditing protocol also guarantees the data privacy against the server.

On the auditor side, the auditor can obtain the public key $pk_{t}$ but cannot deduce the secret key $sk_{t}$ from it. It can also obtain the pairing value $e (u_{j}, pk_{t})$ but cannot deduce the random number $x_{j}$: recovering $x_{j}$ from $e (u_{j}, pk_{t}) = e (g_{1}, pk_{t})^{x_{j}}$ and $e (g_{1}, pk_{t})$ is a discrete logarithm problem. From the tag proof TP, it can obtain the product of all the challenged data tags; because each tag $t_{i}$ is computed using the secret key $sk_{t}$, the auditor cannot deduce any information about the input data blocks. It can also obtain the data proof DP, but recovering the linear combinations of the chosen data sectors $\{MP_{j}\}_{j \in [1, s]}$ from DP is a discrete logarithm problem. Hence, the auditor cannot get any information about the data blocks, the random numbers $x_{j}$, or the secret key in our improved auditing protocol.

Performance improvement with adjustable parameter

Suppose that each sector of each block on the cloud storage is corrupted with probability $ρ$. For a confirmation auditing or sampling auditing, the verifier selects b challenged blocks, where each block is split into s sectors. Then, the verifier detects data corruption with a probability $\Pr (b, s)$, which can be expressed as

\Pr (b, s) = 1 - (1 - ρ)^{b \cdot s}

(2)

Interestingly, the verifier can choose a constant number of sectors and still detect data corruption with the same probability $\Pr (b, s)$, regardless of the total number of data blocks. Therefore, block sampling can reduce the expensive disk I/O and computation costs while still achieving a high probabilistic possession guarantee. For example, if $ρ = 0.01$, the verifier needs to choose only $b \cdot s = 460$ sectors to achieve $\Pr (b, s)$ of at least 99%.
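Equation (2) is easy to evaluate numerically. The following minimal sketch, with hypothetical function names `detection_probability` and `min_sectors`, computes the detection probability for a given number of challenged sectors and the smallest number of sectors that reaches a target probability.

```python
import math


def detection_probability(rho, sectors):
    """Equation (2): probability of hitting at least one corrupted sector."""
    return 1.0 - (1.0 - rho) ** sectors


def min_sectors(rho, target):
    """Smallest z = b * s with detection probability of at least `target`."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - rho))


if __name__ == "__main__":
    print(detection_probability(0.01, 460))
    print(min_sectors(0.01, 0.99))
```

For $ρ = 0.01$ and a 99% target, the exact minimum under this formula is 459 sectors, so the figure of 460 quoted in the text is a slightly rounded-up value that also satisfies the bound.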

Table 2 lists the notations for the computation cost of each basic computational operation used in the algorithms of Yang’s secure data storage auditing protocol.

Table 2

The computation cost of basic computational operations.

Symbol	Computational operation
$t_{pg}$	Exponentiation of an element of $G_{1}$ or $G_{2}$
$t_{pgt}$	Exponentiation of an element of $G_{T}$
$t_{mz}$	Multiplication of two numbers of $Z_{p}$
$t_{mg}$	Multiplication of two elements of $G_{1}$ or $G_{2}$
$t_{mgt}$	Multiplication of two elements of $G_{T}$
$t_{e}$	Pairing of two elements of $G_{1}$ and $G_{2}$
$t_{a}$	Addition of two numbers of $Z_{p}$
$t_{h}$	Keyed secure hash function h

According to the definitions of three algorithms of Chall, Prove, and Verify, we can formalize the expression of computation cost using some basic computational operations.

So, the total computation cost of one audit of Yang’s solution can be expressed as

\begin{aligned}
T_{all} = {} & (b \cdot t_{pg} + (b - 1) \cdot t_{mg}) \\
& + (s \cdot (b \cdot t_{mz} + (b - 1) \cdot t_{a})) \\
& + (s \cdot t_{e} + s \cdot t_{pgt} + (s - 1) \cdot t_{mgt}) \\
& + (t_{pg}) \\
& + (b \cdot t_{h} + b \cdot t_{pg} + b \cdot t_{mz} + (b - 1) \cdot t_{mg}) \\
& + (2 \cdot t_{e} + t_{mgt})
\end{aligned}

(3)

We evaluate the performance of Yang’s secure data storage auditing protocol by the total computation cost of one audit; the cost of each sub-operation is listed in Table 3.

Table 3

The computation cost of some sub-operations in Yang’s solution.

Server	Auditor
$t_{TP} = b \cdot t_{pg} + (b - 1) \cdot t_{mg}$	$t_{chal} = t_{pg}$
$t_{MP} = s \cdot (b \cdot t_{mz} + (b - 1) \cdot t_{a})$	$t_{H_{chal}} = b \cdot t_{h} + b \cdot t_{pg} + b \cdot t_{mz} + (b - 1) \cdot t_{mg}$
$t_{DP} = s \cdot t_{e} + s \cdot t_{pgt} + (s - 1) \cdot t_{mgt}$	$t_{verify} = 2 t_{e} + t_{mgt}$

According to the definitions of our three algorithms of Chall, Prove, and Verify, we also formalize the expression of computation cost in Table 4.

Table 4

The computation cost of some sub-operations in our solution.

Server	Auditor
$t_{TP} = b \cdot t_{pg} + (b - 1) \cdot t_{mg}$	$t_{chal} = t_{pg} + s \cdot t_{pgt}$
$t_{MP} = s \cdot (b \cdot t_{mz} + (b - 1) \cdot t_{a})$	$t_{H_{chal}} = b \cdot t_{h} + b \cdot t_{pg} + b \cdot t_{mz} + (b - 1) \cdot t_{mg}$
$t_{DP} = s \cdot t_{pgt} + (s - 1) \cdot t_{mgt}$	$t_{verify} = 2 t_{e} + t_{mgt}$

The total computation cost of one audit of our solution can be expressed as

\begin{aligned}
T_{all} = {} & (b \cdot t_{pg} + (b - 1) \cdot t_{mg}) \\
& + (s \cdot (b \cdot t_{mz} + (b - 1) \cdot t_{a})) \\
& + (s \cdot t_{pgt} + (s - 1) \cdot t_{mgt}) \\
& + (t_{pg} + s \cdot t_{pgt}) \\
& + (b \cdot t_{h} + b \cdot t_{pg} + b \cdot t_{mz} + (b - 1) \cdot t_{mg}) \\
& + (2 \cdot t_{e} + t_{mgt})
\end{aligned}

(4)

According to equations (3) and (4), the difference between the total computation cost of our solution and that of Yang’s solution can be expressed as

Δ T = s \cdot t_{pgt} - s \cdot t_{e}

(5)

Generally, a bilinear pairing of two elements of $G_{1}$ and $G_{2}$ is more expensive to compute than an exponentiation of an element of $G_{T}$. Hence, $Δ T$ in equation (5) is negative, and our solution is more efficient than Yang’s for the same parameters.

Moreover, for a confirmation auditing or sampling auditing with the same total number of sectors (i.e. the same probability $\Pr (b, s)$ and the same disk workload), the computation cost is influenced by the values of b and s according to equation (4). Hence, we further improve the performance of our solution by choosing the optimal number of sectors s for the same challenged data ($b \cdot s$ sectors).

We measure the computation cost of the basic computational operations listed in Table 2 on a Windows system with an AMD Phenom II X4 810 Processor at 2.60 GHz and 8 GB RAM, in the C language, based on the GNU Multiple Precision Arithmetic Library (GMP) version 4.1 and the pairing-based cryptography library version 0.4.7. We use an MNT d159 elliptic curve, which has a 160-bit group order and an embedding degree of 6.

The results are shown in Table 5.

Table 5

The computation cost results of some basic computational operations.

Symbol	Experiment value (ms)
$t_{pg}$	2.824
$t_{pgt}$	10.826
$t_{mz}$	0.00094
$t_{mg}$	0.01419
$t_{mgt}$	0.054
$t_{e}$	32.979
$t_{a}$	0.00016
$t_{h}$	0.2605
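With the measured values in Table 5, equations (3) and (4) can be evaluated directly. The sketch below is one possible encoding of the two cost models; the dictionary `COST_MS` and the function names are assumptions introduced here for illustration, and the result is a model estimate in milliseconds, not a measurement.

```python
# Measured operation costs from Table 5, in milliseconds.
COST_MS = {
    "pg": 2.824, "pgt": 10.826, "mz": 0.00094, "mg": 0.01419,
    "mgt": 0.054, "e": 32.979, "a": 0.00016, "h": 0.2605,
}


def _shared_cost(b, s, c):
    """Terms that are identical in equations (3) and (4)."""
    t_tp = b * c["pg"] + (b - 1) * c["mg"]
    t_mp = s * (b * c["mz"] + (b - 1) * c["a"])
    t_hchal = b * c["h"] + b * c["pg"] + b * c["mz"] + (b - 1) * c["mg"]
    t_verify = 2 * c["e"] + c["mgt"]
    return t_tp + t_mp + t_hchal + t_verify


def total_cost_yang(b, s, c=COST_MS):
    """Equation (3): the server computes s pairings inside t_DP."""
    t_dp = s * c["e"] + s * c["pgt"] + (s - 1) * c["mgt"]
    t_chal = c["pg"]
    return _shared_cost(b, s, c) + t_dp + t_chal


def total_cost_ours(b, s, c=COST_MS):
    """Equation (4): the auditor spends s exponentiations in G_T instead."""
    t_dp = s * c["pgt"] + (s - 1) * c["mgt"]
    t_chal = c["pg"] + s * c["pgt"]
    return _shared_cost(b, s, c) + t_dp + t_chal


if __name__ == "__main__":
    b, s = 100, 50
    print(total_cost_yang(b, s), total_cost_ours(b, s))
```

The difference between the two functions reduces to $s \cdot (t_{pgt} - t_{e})$, which matches equation (5) and is negative for the values in Table 5.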

In this section, our goal is to choose the number of sectors for each data block that achieves the optimal performance of the auditing protocol for the same challenged data in bytes. We define the total number of challenged sectors as $z = b \cdot s$, so that $b = z / s$. Thus, we obtain the following optimization problem

\min_{s} T_{all} (z, s)

The partial derivative of $T_{all} (z, s)$ with respect to s is $\frac{\partial T_{all} (z, s)}{\partial s}$. The following equation (6) gives the relationship between the optimal number of sectors s and the total number of challenged sectors z that minimizes the total computation cost $T_{all} (z, s)$

\frac{\partial T_{all} (z, s)}{\partial s} = 0

(6)

Solving equation (6) with the measured values in Table 5 gives the following two equations for the optimal numbers of sectors $s_{1}$ and $s_{2}$ for Yang’s and our solutions, respectively

s_{1} = 0.3679 \sqrt{z}

(7)

and

s_{2} = 0.523 \sqrt{z}

(8)
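The step from equation (6) to equations (7) and (8) can be reconstructed as follows. This is a sketch that assumes $b = z / s$ is substituted into equations (3) and (4) and that s is treated as a continuous variable; the shorthand constants $A$, $B_{1}$, and $B_{2}$ are introduced here and do not appear in the original notation.

```latex
\begin{aligned}
T_{all}(z, s) &= A \cdot \frac{z}{s} + B \cdot s + (\text{terms independent of } s),
  \qquad A = 2 t_{pg} + 2 t_{mg} + t_{mz} + t_{h} \\
B_{1} &= t_{e} + t_{pgt} + t_{mgt} - t_{a} \quad \text{(from equation (3))},
  \qquad B_{2} = 2 t_{pgt} + t_{mgt} - t_{a} \quad \text{(from equation (4))} \\
\frac{\partial T_{all}(z, s)}{\partial s} &= -A \cdot \frac{z}{s^{2}} + B = 0
  \;\Longrightarrow\; s = \sqrt{A / B} \cdot \sqrt{z} \\
A &= 5.93782, \quad B_{1} = 43.85884, \quad B_{2} = 21.70584
  \quad \text{(values from Table 5, in ms)} \\
s_{1} &= \sqrt{A / B_{1}} \cdot \sqrt{z} \approx 0.3679 \sqrt{z},
  \qquad s_{2} = \sqrt{A / B_{2}} \cdot \sqrt{z} \approx 0.5230 \sqrt{z}
\end{aligned}
```

Since the second derivative $2 A z / s^{3}$ is positive for $s > 0$, the stationary point is a minimum.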

Figure 2 shows the evaluated relationship between the computation cost and the number of sectors s when the total challenged data is 100, 200, 300, 400, and 500 Kbytes. It also shows that our solution is more efficient than Yang’s in each of these cases.

Figure 2.

The computation cost versus the number of sectors with a constant challenged data by evaluation: (a) Yang’s solution and (b) our solution.

Table 6 shows the optimal number of sectors s, obtained by evaluation, when the total challenged data is 100, 200, 300, 400, and 500 Kbytes.

Table 6

The optimal number of sectors s with the total challenged data by evaluation.

Challenged data (Kbytes)	Challenged sectors	$s_{1}$	$s_{2}$
100	5120	26	37
200	10,240	37	53
300	15,360	46	65
400	20,480	53	75
500	25,600	59	84
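The entries of Table 6 can be reproduced from the cost models. The sketch below assumes that minimizing equations (3) and (4) over s with $b = z / s$ yields $s = \sqrt{A z / B}$, where $A$ collects the per-block costs and $B$ the per-sector costs; the function name `optimal_sectors` and the dictionary `COST_MS` are hypothetical names introduced for this illustration.

```python
import math

# Measured operation costs from Table 5, in milliseconds.
COST_MS = {
    "pg": 2.824, "pgt": 10.826, "mz": 0.00094, "mg": 0.01419,
    "mgt": 0.054, "e": 32.979, "a": 0.00016, "h": 0.2605,
}


def optimal_sectors(z, ours, c=COST_MS):
    """Real-valued s that minimizes T_all(z, s) when b = z / s."""
    # Coefficient of b in both equations (3) and (4).
    per_block = 2 * c["pg"] + 2 * c["mg"] + c["mz"] + c["h"]
    # Coefficient of s: equation (4) for our solution, equation (3) for Yang's.
    if ours:
        per_sector = 2 * c["pgt"] + c["mgt"] - c["a"]
    else:
        per_sector = c["e"] + c["pgt"] + c["mgt"] - c["a"]
    return math.sqrt(per_block * z / per_sector)


if __name__ == "__main__":
    for z in (5120, 10240, 15360, 20480, 25600):
        s1 = round(optimal_sectors(z, ours=False))
        s2 = round(optimal_sectors(z, ours=True))
        print(z, s1, s2)
```

Rounding the real-valued optimum to the nearest integer gives the values of $s_{1}$ and $s_{2}$ listed in Table 6.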

Figure 3 shows the optimal number of sectors s versus the total challenged data for Yang’s and our solutions, evaluated according to equations (7) and (8), respectively. Moreover, Figure 4 shows the optimal total computation cost versus the total challenged data for Yang’s and our solutions, evaluated according to equations (3) and (4) with the optimal number of sectors from Figure 3. We also show the optimal number of sectors in Figure 4. In the two figures, the total challenged data goes up to 1000 Kbytes (i.e. the total number of challenged sectors is 51,200). From Figures 3 and 4, we can see that our solution achieves better performance than Yang’s, while the optimal number of sectors of our solution is greater than that of Yang’s solution.

Figure 3.

The optimal number of sectors s versus the total challenged data (Kbytes).

Figure 4.

The optimal total computation cost versus the total challenged data (Kbytes).

Experiment results and analysis

According to the computation cost of some basic computational operations, we can evaluate the total computation cost of Yang’s and our solutions using equations (3) and (4). Figure 5(a) and (b) shows the total computation cost evaluated using equations (3) and (4).

Figure 5.

The total computation cost of Yang’s and our solutions: (a) Yang’s solution by evaluation, (b) our solution by evaluation, (c) Yang’s solution by experiment, and (d) our solution by experiment.

We also implement Yang’s and our secure data storage auditing protocols on a Windows system with an AMD Phenom II X4 810 Processor at 2.60 GHz and 8 GB RAM, in the C language, based on the GNU Multiple Precision Arithmetic Library (GMP) version 4.1 and the pairing-based cryptography library version 0.4.7. The corresponding experiment results for Yang’s and our solutions are shown in Figure 5(c) and (d).

In Figure 5, as the amount of challenged data increases, the total computation cost of the two solutions increases linearly. However, the difference between Yang’s and our solutions cannot be seen clearly, so further figures are given to illustrate it. Figure 6 shows the total computation cost ratio between Yang’s and our solutions for the same number of challenged data blocks and the same number of sectors. For a fixed number of sectors, as the number of challenged data blocks increases, the total computation cost ratio decreases and stabilizes (the ratio is about 1).

Figure 6.

The total computation cost ratio between Yang’s and our solution: (a) by evaluation and (b) by experiment.

The trend is clearer in Figure 7(a) and (c), which are the projections of Figure 6(a) and (b) onto the plane of the number of challenged data blocks and the total computation cost. In Figure 7(a) and (c), for a fixed number of sectors, the total computation cost ratio decreases as the number of challenged data blocks increases. Moreover, when the number of challenged data blocks is large, the total computation cost ratio is constant (the ratio is about 1).

Figure 7.

Projections of Figure 6: (a) the number of challenged data blocks and the total computation cost by evaluation, (b) the number of sectors and the total computation cost by evaluation, (c) the number of challenged data blocks and the total computation cost by experiment, and (d) the number of sectors and the total computation cost by experiment.

A second trend is visible in Figure 7(b) and (d), which are the projections of Figure 6(a) and (b) onto the plane of the number of sectors and the total computation cost. In Figure 7(b) and (d), for a fixed number of challenged data blocks, the total computation cost ratio increases as the number of sectors increases.

The total computation cost ratio obtained by experiment in Figure 7 differs slightly from the one obtained by evaluation because the experiment includes some other computation costs. Overall, however, our experiment results are consistent with the evaluation results.

In summary, when the number of challenged data blocks is small, our solution is more efficient than Yang’s in some cases of the same number of challenged data blocks. As the number of challenged data blocks increases, the performance of our solution approaches that of Yang’s (the performance ratio is about 1). As mentioned above, the number of challenged sectors ($z = b \cdot s$) is determined by the probability $ρ$ that each sector of each block on the server is corrupted and by the required probability $\Pr (b, s)$ of detecting server misbehavior. With spot checking, the verifier needs to choose only $z = b \cdot s = 460$ sectors to achieve $\Pr (b, s)$ of at least 99% (99% confidence) if $ρ = 0.01$; the size of the challenged data is then about 10 Kbytes. Even when the size of the challenged data grows to 500 Kbytes, the verifier can still detect server misbehavior at 99% confidence if $ρ = 0.00018$, according to equation (2). Hence, the amount of challenged data sampled by the verifier will be small, which is the regime in which our solution is more efficient than Yang’s.
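The two numerical claims above can be checked by inverting equation (2). The sketch below assumes a sector size of 20 bytes, which is inferred from Table 6 (100 Kbytes corresponds to 5120 sectors) rather than stated explicitly in the text.

```latex
\begin{aligned}
\rho &= 1 - \left( 1 - \Pr(b, s) \right)^{1 / z} \\
z = 25600,\ \Pr(b, s) = 0.99: \quad \rho &= 1 - 0.01^{1 / 25600} \approx 1.8 \times 10^{-4} \\
z = 460: \quad 460 \times 20\ \text{bytes} &= 9200\ \text{bytes} \approx 9\ \text{Kbytes}
\end{aligned}
```

Here 500 Kbytes corresponds to $500 \times 1024 / 20 = 25600$ sectors under the same assumption.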

In Figure 8, we compare the optimal total computation cost versus the total challenged data in bytes for Yang’s and our solutions, by evaluation and by experiment. We also show the optimal number of sectors in Figure 8. In our experiment, the maximum number of sectors is 64. Therefore, the optimal total computation cost obtained by experiment in Figure 8 differs slightly from the one obtained by evaluation. Overall, however, our experiment results are consistent with the evaluation results. Figure 8 also shows that our solution is more efficient than Yang’s.

Figure 8.

The optimal total computation cost versus the total challenged data (Kbytes) for Yang’s and our solution.

To further illustrate the effectiveness of our solution, we also compare the computation cost versus the size of challenged data in Figure 9 by experiment.

Figure 9.

Comparison of the computation cost of our solution (s is determined by equation (8)) and Yang’s solution ($s = 50$).

In summary, our solution is more efficient than Yang’s. Equation (8) can be used to determine the optimal number of sectors per data block, which gives the best performance per execution of our auditing protocol for a given amount of challenged data.

Conclusion

In this article, we proposed an improved secure IoT data storage auditing protocol based on Yang’s solution. We first improved performance by precomputing the bilinear pairing. We then improved it further by selecting the optimal number of sectors for executing the auditing protocol. The experimental results are consistent with the mathematical analysis and show that our solution is more efficient than Yang’s.

Footnotes

Academic Editor: Xuyun Zhang

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported, in part, by National High Technology Research and Development Program of China (No. 2015AA016008), National Science and Technology Major Project (No. JC201104210032A), National Natural Science Foundation of China (No. 61402136), and Natural Science Foundation of Guangdong Province, China (No. 2014A030313697).

References

1. Kortuem, Kawsar, Sundramoorthy. Smart objects as building blocks for the internet of things. IEEE Internet Comput 2010; 14(1): 44–51.

2. Liu, Yang, Zhang. External integrity verification for outsourced big data in cloud and IoT: a big picture. Future Gener Comp Sy 2015; 49: 58–67.

3. Deswarte, Quisquater, Saidane. Remote integrity checking. In: Proceedings of the IFIP TC11/WG11.5 6th working conference on integrity and internal control in information systems, Boston, MA, 13–14 November 2003, pp.1–11. Berlin: Springer.

4. Gazzoni Filho, Barreto PSLM. Demonstrating data possession and uncheatable data transfer. IACR Cryptol ePr Arch 2006; 2006: 150.

5. Sebe, Martinez-Balleste, Deswarte. Time-bounded remote file integrity checking. Technical report 04429, July, 2004. Toulouse: LAAS.

6. Yamamoto, Oda, Aoki. Fast integrity for large data. In: Proceedings of the ECRYPT workshop on software performance enhancement for encryption and decryption, Amsterdam, 11–12 June 2007, pp.21–32. European Network of Excellence (ECRYPT).

7. Ateniese, Burns, Curtmola. Provable data possession at untrusted stores. In: Proceedings of the 14th ACM conference on computer and communications security, Alexandria, VA, 29 October–2 November 2007, pp.598–609. New York: ACM.

8. Yang, Jia. An efficient and secure dynamic auditing protocol for data storage in cloud computing. IEEE T Parall Distr 2013; 24(9): 1717–1726.

9. Wang, Ren. Enabling public auditability and data dynamics for storage security in cloud computing. IEEE T Parall Distr 2011; 22(5): 847–859.

10. Botta, De Donato, Persico. Integration of cloud computing and internet of things: a survey. Future Gener Comp Sy 2016; 56: 684–700.

11. Díaz, Martín, Rubio. State-of-the-art, challenges, and open issues in the integration of internet of things and cloud computing. J Netw Comput Appl 2016; 67: 99–117.

12. Zeng. Publicly verifiable remote data integrity. In: Proceedings of 10th international conference on information and communications security, Birmingham, 20–22 October 2008, pp.419–434. Berlin, Heidelberg: Springer.

13. Wang, Ren, Lou. Toward publicly auditable secure cloud data storage services. IEEE Network 2010; 24(4): 19–24.

14. Ateniese, Di Pietro, Mancini. Scalable and efficient provable data possession. In: Proceedings of the 4th international conference on security and privacy in communication networks, Istanbul, 22–25 September 2008, pp.9:1–9:10. New York: ACM.

15. Erway, Küpçü, Papamanthou. Dynamic provable data possession. In: Proceedings of the 16th ACM conference on computer and communications security, Chicago, IL, 9–13 November 2009, pp.213–222. New York: ACM.

16. Wang, Ren. Privacy-preserving public auditing for data storage security in cloud computing. In: Proceedings of the 29th conference on information communications (INFOCOM’10), San Diego, CA, 15–19 March 2010, pp.525–533. Piscataway, NJ: IEEE Press.

17. Zhu, Ahn. Cooperative provable data possession for integrity verification in multicloud storage. IEEE T Parall Distr 2012; 23(12): 2231–2244.

18. Yang, Liu, Zhang. A time efficient approach for detecting errors in big sensor data on cloud. IEEE T Parall Distr 2015; 26(2): 329–339.

19. Liu, Chen, Yang. Authorized public auditing of dynamic big data storage on cloud with efficient verifiable fine-grained updates. IEEE T Parall Distr 2014; 25(9): 2234–2244.

20. Zhang, Liu, Nepal. A privacy leakage upper bound constraint-based approach for cost-effective privacy preserving of intermediate data sets in cloud. IEEE T Parall Distr 2013; 24(6): 1192–1202.

21. Sookhak, Talebian, Ahmed. A review on remote data auditing in single cloud server: taxonomy and open issues. J Netw Comput Appl 2014; 43: 121–141.

22. Sookhak, Gani, Talebian. Remote data auditing in cloud computing environments: a survey, taxonomy, and open issues. ACM Comput Surv 2015; 47(4): 65:1–65:34.

23. Shin, Kwon. A survey of public provable data possession schemes with batch verification in cloud storage. J Internet Serv Inf Secur 2015; 5(3): 37–47.

24. Kim, Kwon, Hahn. Privacy-preserving public auditing for educational multimedia data in cloud computing. Multimed Tool Appl 2015; 75(21): 1–15.