Abstract
More and more organizations outsource their data to remote cloud servers (RCSs). Data owners rent the cloud service providers (CSPs) infrastructure to store their unlimited resources by paying fees metered in month or gigabyte. For increasing the availability and scalability of data, the data owners store their data replicas on multiple servers across multiple data centers. Data owners should ensure that the CSPs actually store all their copies according to the service contract. In this paper, we present an efficient dynamic multicopies possession checking scheme that has the following properties: (1) the data owner uses fully homomorphic encryption (FHE) algorithm to generate multiple copies; (2) the scheme supports data block dynamic operation; (3) the scheme supports third-party auditor's public validation. Finally, security analysis and experimental results show that our scheme can resist forgery, replacement, and replay attack and perform better than some other related scheme published recently.
1. Introduction
Storing abundant file resources in the cloud has become the current trend [1]. An increasing number of users send their data to the cloud service providers (CSPs) for storage. Storing data on the remote cloud server enables organizations to relieve the burden brought by server updating and other computing issues. Also, authorized users can conveniently use the stored data from different geographic locations. When data has been stored on a remote CS that may be unreliable, the data owners cannot control their privacy data directly. This lack of control for privacy data would exhibit data confidentiality and privacy protection problem. For the CSP, some errors may occur, such as software or hardware failures and operational errors of system administrator [2–5]. For maintaining the CSP's reputation, he/she usually tries to hide data loss incidents. Therefore, he/she may not be trustworthy. In order to deal with data confidentiality, the sensitive data is encrypted before it is outsourced to the CSP. For integrity protection, many researchers have proposed provable data possession (PDP) schemes to verify the integrity of data stored on the CSP.
For intensifying responsibility of the CSP, it is essential for the data owner to demand that the CSP provide evidence that the data owners' data are not discarded or damaged [6]. PDP [7] is a method to check integrity of the data stored on the CSP. To keep the security of data, the data owner encrypts the data file and calculates tag for encrypted data file. Then he/she stores the encrypted data blocks and the corresponding tags on the CSP and deletes them from the local computers. When the data owner (or verifier) wants to check the integrity of data file, he/she sends a challenge vector to the CSP. The CSP makes a response for challenge and sends the response to the data owner (or verifier).
How to efficiently guarantee that the CSP faithfully stores the data owner's data file is the main goal of remote PDP scheme. Nowadays, there exist two types of verification scheme: provable data possession (PDP) and Proof of Retrieval (POR). In addition, to validate the correctness of data, the POR scheme can also check out the corrupted data block and then recover the original data. In order to check the correctness of data stored on the CSP, Ateniese et al. [7, 8] proposed two provably secure PDP schemes. In the subsequent year, different variations of PDP schemes were proposed, such as [9–12] which are based on different cryptographic hard problem assumptions. In 2007, a model of POR was constructed in [13], Juels and Kaliski used error-correcting code to present a sentinel-based POR protocol. Various POR schemes that check static data file integrity can be found in [14–19].
In fact, it is not realistic to check static data file integrity, because the stored data block is usually modified by the data owner. Therefore, we consider that one of the basic requirements of storing data on the remote CSP is to support dynamic update operation of data block. The updated operation mainly refers to modification, insertion, or deletion of the stored data.
Another design goal is to support public validation. It is not appropriate to allow any side of the data owner or the CSP to verify the integrity of data stored on the CSP, because neither of them could be ensured to get unbiased verification results [20]. The third-party verifier is a suitable choice for the storage verification. A third-party verifier has the ability to do meaningful work and it is trusted by the data owners and CSPs. Under the influence of the distributed storage system, the data storage verification becomes more significant. Many schemes have been proposed: Proof of Retrieval (POR) protocols [21], remote integrity checking protocols [22, 23], and provable data possession protocols [24]. However, these protocols only apply to the scene that the data owner verifies the integrity of the data file stored on the CSP. In cloud computing, the data file storage verification is provided by a third-party rather than the data owners. Therefore, we propose a PDP scheme that supports public verifiability for data stored on the CSP and dynamic operation.
(1) Related Work. Ateniese et al. [7] first proposed a PDP paradigm and built two PDP schemes. However, they have not considered the multicopies storage system and dynamic data storage operation. To solve dynamic data operation problem, Zeng [10] and Wang et al. [20] proposed a dynamic version which can support limited data block operation but cannot support block insertions. In 2009, Erway et al. [25] modified the PDP scheme in [7] to achieve the dynamic updates of data files stored on the CSP applying rank-based authenticated skip lists. However, the computation and communication efficiency of this scheme achieves good only for a single data block. Wang et al. [20] verified the integrity of file copies using MHT. Although their scheme supports dynamic operation, their data files stored on the CSP are not encrypted and work for a single data block. In 2011, Hao et al. [26] proposed a PDP scheme which supports not only the dynamic operation for data file but also public audit. Public audit is an interaction protocol which allows any entity, not necessarily the data owner, to verify the correctness of data stored on the CSP, since their scheme also does not consider stored data file encryption and the scheme cannot be modified to support multicopies data file stored. Barsoum and Hasan [27] proposed a PDP scheme that encrypts a file F by using an encryption algorithm and generates different data file replicas. The encryption algorithm has a strong diffusion property, for example, AES. However, the data updating efficiency of their scheme is not good, for all the copies stored on the CSP will be encrypted again and updated on the CSP. To solve this problem, we use the FHE algorithm to encrypt the data file and generate multicopies stored on the CSP. The file copies need not be encrypted again when data updates. Therefore, the verification efficiency of our PDP scheme has been greatly improved.
The goal of our PDP scheme differs from the goal defined in the earlier researches. Our PDP scheme aims to keep confidentiality, integrity, and public verifiability of data copies stored on the CSP. Considering the FHE algorithm which will generate a random noise in each process, we encrypt a file F to get multiple file copies. The novelty of our scheme comes with ensuring data security and providing good efficiency for data updating while preserving the public verifiability. Because the file copies are stored on the distributed cloud server, we assume that all copies will not be damaged at the same time.
(2) Design Goal. The proposed scheme in our paper should hold the following properties simultaneously. (1) Correctness: the third-party auditor must accept all valid information proved by the cloud; (2) Public Auditing: the public auditor cannot get any information about the stored data block; (3) security goals: the proposed scheme can resist forgery attack, replacement attack, replay attack, and channel attack. The corrupted data block in corrupted copies can be checked out in this scheme. Thus, the data owner just needs to regenerate the new copy of corrupted data block instead of regenerating the whole new file copy. When the data owner needs to regenerate the valid form of corrupted data block copy, he/she first obtains the original data block from other uncorrupted data copies and then reencrypts the original data block to generate a new data block copy.
The rest of the paper is organized as follows. The preliminaries are introduced in Section 2, followed by the proposed scheme description in Section 3. Section 4 analyzes the security of our scheme. Section 5 presents the implementation and experimental result. Conclusions are given in Section 6.
2. Preliminary
Some preliminary knowledge used in our paper is introduced in this section. Let F be a file outsourced to the CSP. It is divided into a sequence of n blocks; that is,
Bilinear Map/Pairing. Denote bilinearity: nondegeneracy: computability:
Computational Diffie-Hellman (CDH) Problem.
Our protocol involves three cryptographic functions:
Fully Homomorphic Encryption (FHE) [28]. The FHE scheme includes the following three algorithms:
KeyGen: input a security parameter λ and output three parameters Enc: input a plaintext Dec: input a ciphertext C and the secret key sk and output the plaintext m:
On one hand, each homomorphic encryption generates a random noise, which is a variable. Therefore, encrypting one original file τ times with one public key will generate τ different file replicas. On the other hand, we use the homomorphic property that satisfies plaintext operation corresponding to ciphertext operation to realize dynamic operation for data replicas stored on the CSPs.
3. Dynamic Multireplicas Provable Remote Data Possession (DPRDP) Scheme
In this work, the cloud storage model considered in this paper consists of four entities as shown in Figure 1: (1) a data owner who can be an individual or an organization possesses privacy data information, (2) the CSPs provide paid storage space for storing the data owner's data files, (3) the third-party auditor is responsible for integrity verification of file copies stored on the CSP through challenge-response protocol, and (4) authorized users share the decryption key with the data owner and have the right to access the data copies from the CSP. In our paper, we mainly discuss the first three entities: the data owner, the CSP, and the third-party verifier.

Cloud storage system model.
3.1. DPRDP Scheme
In [29], Mukundan et al. used Paillier encryption to generate data copies. However, the encryption time complexity of their scheme is
The proposed MR-PDP scheme includes nine functions KeyGen, CopiesGen, TagGen, Challenge, Response, Verify, Request, Execution, and IndiceRetri.
(1) KeyGen. The data owner inputs a security parameter λ and generates a public/secret key pair
(2)CopiesGen. For a file F, the data owner creates τ differentiable copies
(3)TagGen. It is necessary to suppose that the data owner generates the tags sequentially in accordance with the index j. That is, the data owner generates a tag for a data block choosing a random element computing outputting computing the tag
Upon receiving the
(4)Challenge. The data owner assigns the third auditor to complete validation tasks. For challenging the CSP and verifying the possession and integrity of all copies, the third-party auditor sends c (number of data blocks to be challenged) and two challenge keys
(5)Response. The CSP receives the triple computing computing sending
(6)Verify. After receiving the proof
If the above equation holds, the third-party auditor outputs 1, otherwise 0.
The challenge-response protocol between the third-party auditor and the CSP in the DPRDP scheme is summarized in Figure 2.

The challenge-response protocol in the DPRDP scheme.
(7)Request. When the data owner wants to update the data block stored on the CSP, he/she would generate an update requirement including file name, update command, update index, and encrypted difference and update tag. The data owner sends the update request
(8)Execution. The CSP receives the update request and executes the update operation. He/she inputs the copies of file F, the tag T, and update request and outputs the copies of an updated file

Data block modification operation procedure in the DPRDP scheme.
The data block insertion and deletion operations are similar with the data block modification. In order to complete the data block insertion, the data owner inserts a new data block before position j in a copy file. Thus, the number of data blocks in copies will be changed from n to
(9)IndiceRetri. The indices of corrupted data blocks can be identified by using this retrieval algorithm. The response
the verifier computes the following τ verification equation: verify and where where verify and
The retrieve algorithm inputs four elements: T, ξ,
We give an example for Algorithm 1. As shown in Figure 4, Algorithm 1 first gets the corrupted data block copies

Example of Algorithm 1.
4. Security Analysis
Theorem 1.
If both the data owner and the CSP are honest to perform the DPRDP scheme, the response of CSP can pass the auditor's validation:
Theorem 1 is proved. The details of security analysis are given in the following.
(1) Scenario. In our protocol, there exist four entities in the security model: the data owner, verifier, CSP, and authorized user. The security analysis of protocol involves the first three participants, so we only consider the communication among the data owner, verifier, and CSP. The data owner encrypts data block and gets the replicas of each data block. Then the data owner generates the tags corresponding to the data block replicas and stores the block-tag pair on the CSP. When the verifier wants to validate the integrity of data copies, it sends to the CSP a challenge vector. Upon receiving the challenge vector, the CSP aggregates the tag of challenged data block and the challenged data blocks, respectively, and then sends the aggregation back to the verifier. That is, the verifier sends a challenge vector to the CSP, and then the CSP makes a response for the challenge. Finally, the verifier checks the aggregation information. The interaction among the data owner, CSP, and verifier is shown in Figure 5.

The interaction among the data owner, CSP, and verifier.
(2)Possible Attack
(a) Internal Attack. The main internal attack considered in our scheme refers to forgery attack, replacement attack, and replay attack. All these attacks initiated by dishonest CSP. The dishonest CSP may forge a single data block tag or replace a corrupted data block-tag pair with another valid block-tag pair. Even, he/she may also try to make a response to the current challenge of verifier with the previous proof, without querying the current data block actually.
(b) External Attack. An adversary may temper the messages transmitting on the channel.
(3)Interaction Protocol Design. In order to clearly present interaction between any two participants, a detailed communication process is given in Figure 6.

The communication process among the data owner, CSP, and verifier.
(4)Security Analysis
(a) External Attack. An adversary may temper the transmitted messages on channel shown in Figure 4. On the first transmission channel, the data owner sends
For the adversary, tempering with the transmitted messages on the channels ② and ③ can be regarded as the CSP tempering with these messages. In addition to tempering messages, the CSP can also initiate forgery, replacement, and replay attack. That is, the CSP has stronger attack ability than an external adversary. Therefore, for the messages transmitted on the second and the third channel, we just need to consider the internal attack initiated by the CSP.
(b) Internal Attack
Forgery. A dishonest CSP attempts to forge the tag for some data block and wishes that the value of forged tag is the same as the original tag. For a single tag
Replacement. Assuming that the CSP uses a valid block-tag pair
According the verification equation, the following equations are obtained:
If
Replay. The essence of replay attack is to use the previous proof to replace the current challenge's response. That is, the CSP does not actually access the latest data status. To some extent, the replay attack can be regarded as the multiple replacement attack. Therefore, we can use the same analysis method as replacement attack to analyze replay attack. If the challenged latest block-tag pairs are different with the previous block-tag pairs, the previous response will not pass the verifier verification.
5. Performance Analysis
In terms of our scheme implementation, we have simulated this MR-PDP protocol in C language. We conduct the following experiments on the Win7 system. To start, we perform several experiments on a system with an Intel(R) Core(TM) 3.10 GHZ processor and 4 GB RAM computer. We set the encryption security parameters
DPRDP scheme communication cost with file size 1 MB.
We give a performance comparison between the MR-PDP scheme proposed in our scheme and the DMR-PDP scheme proposed in [29].
Figure 7 presents the data setup computation time comparison between the DMR-PDP scheme and the DPRDP scheme using three copies for the file sizes 1, 5, 10, and 20 MB, respectively. Similarly with the scheme in [29], the data setup process is completed by the data owner only once. As shown in Figure 7, the DPRDP scheme performs faster than DMR-PDP scheme.

The Ccmparison of data setup time between DMR-PDP and DPRDP scheme.
We use the FHE encryption to generate the data block copy, but Raghul uses the Paillier encryption to generate the data block copies in DMR-PDP scheme. The following three elements are considered as the main influence factors for the efficiency of data setup. (1) The generation of encryption parameters: the time for generating parameter of FHE is much faster than that in Paillier encryption. In addition to generating random numbers, Paillier encryption also needs to compute the least common multiple which is a relatively consuming time process when both p and q are large number. However, this is not the main factor to affect the efficiency of data setup. (2) Encryption: the time complexity of FHE is

(a) The comparison of data owner computation time for signatures between DMR-PDP and DPRDP scheme with file size of 1 MB and 5 MB. (b) The comparison of data owner computation time for signatures between DMR-PDP and DPRDP scheme with file size of 10 MB and 20 MB.
In Figure 8, the signing efficiency comparison of data owner in DPRDP scheme and DMR-PDPD scheme is presented. We set the number of replicas as 3, 5, and 10. The solid line denotes the signing time of data owner in DMR-PDP scheme and the dotted line represents the signing time of data owner in DPRDP scheme.
The signing efficiency comparison of data owner in DPRDP scheme and DMR-PDPD scheme for file sizes 1 MB and 5 MB is shown in Figure 8(a). The red line denotes the data owner signing time comparison with file size 1 MB for the number of copies 3, 5, and 10. Although the algorithms for generating tags in DPRDP scheme and DMR-PDP scheme are two similar methods, but multiplication and addition must be needed in DMR-PDP scheme, the signing time will be slower than that in DPRDP scheme. The blue line denotes the data owner signing time comparison with file size 5 MB for the number of copies 3, 5, and 10. When there exist 3 data copies, the signing time in DMR-PDP scheme is 0.037 seconds slower than that in DPRDP scheme. In Figure 8(b), we show the signing efficiency comparison of data owner in DPRDP scheme and DMR-PDPD scheme for file sizes 10 MB and 20 MB. The green line denotes the data owner signing time comparison with file size 10 MB with the number of copies 3, 5, and 10. When the file size is 10 MB, the difference of signing time between these two schemes is small. In Figure 8(b), the purple line denotes the data owner signing time comparison with file size 20 MB with the number of copies 3, 5, and 10. For 20 MB file, the growth rate of signing time in DMR-PDPD and DPRDP scheme increases a little when the number of copies changes from 3 to 5.
The tag aggregation efficiency comparison of CSP in DPRDP scheme and DMR-PDPD scheme is presented in Figure 9. Let the number of replicas be 3, 5, and 10. The solid line denotes the tag aggregation time of CSP in DMR-PDP scheme and the dotted line represents the tag aggregation time of CSP in DPRDP scheme.

(a) The comparison of CSP computation time for aggregation tag between DMR-PDP and DPRDP scheme with file size of 1 MB and 5 MB. (b) The comparison of CSP computation time for aggregation tag between DMR-PDP and DPRDP scheme with file size of 10 MB and 20 MB.
In Figure 9(a), we show the tag aggregation efficiency comparison of CSP in DPRDP scheme and DMR-PDPD scheme for file sizes 1 MB and 5 MB. The red line denotes the CSP aggregation time comparison with file size 1 MB for the number of copies 3, 5, and 10. For the algorithms for aggregation tags in DPRDP scheme and DMR-PDP scheme are two similar methods, the difference of aggregation time between these two schemes is small when the file size is 1 MB. The blue line denotes the CSP aggregation time comparison with file size 5 MB with the number of copies 3, 5, and 10. When there exist 3 data copies, the aggregation time in DMR-PDP scheme is 2.2 seconds slower than that in DPRDP scheme. The aggregation efficiency difference between DMR-PDP scheme and DPRDP scheme narrows as the increase of number of copies. When the number of copies is 10, the aggregation efficiency of the two schemes is very close.
The aggregation efficiency comparison of CSP in DPRDP scheme and DMR-PDPD scheme for file sizes 10 MB and 20 MB is shown in Figure 9(b). The green line denotes the CSP aggregation time comparison with file size 10 MB for the number of copies 3, 5, and 10. When the file size is 10 MB, the difference of aggregation time between these two schemes is small. With the increase of number of copies, growth rate of aggregation time is getting bigger. The purple line denotes the CSP aggregation time comparison with file size 20 MB for the number of copies 3, 5, and 10. Also, the growth rate of aggregation time is getting bigger for both DMR-PDP scheme and DPRDP scheme. For 20 MB file, the growth rate of aggregation time in DMR-PDPD scheme is faster than that in DPRDP scheme when the number of copies changes from 5 to 10. Therefore, when both the number of data copies and the file size are large, the DPRDP scheme is more effective.
6. Conclusion
In this paper, we have presented a dynamic replicated data possession checking scheme for checking the data copies integrity in the CSP. The data owners encrypt their data using FHE algorithm, obtain the data replicas, and then store their encryption data replicas on the CSP. Firstly, the proposed scheme not only supports checking integrity of the data file copies anytime and anywhere but also satisfies public verification without leaking any information to the third-party auditor. Secondly, the authorized users can decrypt data copies received from the CSP using a single decryption secret key. Thirdly, the data owner can implement data block update operation on data replicas. Finally, the retrieval algorithms in this scheme can identify the corrupted data blocks in corrupted copies accurately. The corrupted data block copies can be reconstructed by the data owner, and thus he/she needs not to generate the whole corrupted copy file. Through security analysis, we have shown that our scheme can resist forgery attacks, replacement attack, and replay attack. At the same time, our DPRDP scheme achieves a higher efficiency than DMR-PDP scheme.
We consider that there exists data expansion in the encryption process, so how to reduce the data expansion is our future work. Thus, we can further improve the efficiency of the proposed scheme.
Footnotes
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors thank the editor and reviewers for their suggestions to improve the quality of paper. This work was supported by the NSF of China (nos. U1433105 and U1536118) and the Beijing Higher Education Young Elite Teacher Project (YETP0448).
