Abstract
Distributed data storage is a key technology in the data collection in wireless sensor networks. The storage scheme based on network coding is applied to data collection in wireless sensor networks because of its high reliability and low overhead. However, it is an open problem to reduce data repair communication overhead caused by the failure of storage nodes. This paper focuses on this issue and presents a two-layer distributed data storage scheme. The lower-layer nodes store the encoded data blocks and the upper-layer nodes store the re-encoded blocks that are responsible for failure data recovery. Based on the two-layer data storage scheme, a data repair method is proposed to decrease the repair communication overhead with only sacrificing lower storage overhead. Compared with MSR, interference alignment-based scheme and group interference alignment scheme, the proposed method has lower repair communication overhead. We prove that the proposed method can reduce the repair communication overhead to
1. Introduction
Wireless sensor networks, Internet of things and M2 M make the computer networks extended to things. The data collection in wireless sensor networks perform environmental monitoring, information transmission, data storage, and independent provided service. The collected data are distributed stored at the data storage nodes. In the case of node failure, high reliability and low-overhead distributed data storage is an open problem which has received widespread attention in recent years [1, 2]. With the consideration of resource-constrained property of the storage nodes in the data collection in wireless sensor networks, some effective distributed storage methods are proposed in [3–8]. The distributed storage scheme based on network coding is one of them and it has been researched extensively. It encodes the original data into a number of encoded data blocks and then stores them at different storage nodes. To reconstruct the original data, users only need to get a reasonable number of encoded data blocks (not less than the original data amount). Compared with the data backup method, the network coding technique has advantages of low storage overhead and robustness.
But it also brings the repair problem: When a storage node fails, the data on nonfailure nodes are used to repair the failure data to keep the same level reliability. To repair the failure encoded data blocks, the traditional method is that the newly added storage node collect sufficient encoded data (not less than the original data amount), then decode the encoded data to get the original data, and reencode the original data blocks to recover the failure data. As a result, traditional method causes great repair communication overhead which makes it not optimal for the wireless sensor networks because of the strict limitation of energy. Therefore, communication overhead becomes the primary factor of designing data repair algorithm. In order to reduce the repair communication overhead, some researchers focus their attention on designing data repair algorithms. Among these repair algorithms, the interference alignment repair algorithm presented in [9] and regenerating codes repair algorithm introduced in [7, 8] are outstanding.
For regenerating codes repair algorithm, there are two interesting points on its optimal tradeoff curve: minimum-bandwidth regenerating (MBR) codes and minimum-storage regenerating (MSR) codes [7, 8]. For MSR codes, its core constructive techniques are interference alignment and the network coding. With the interference alignment technique, MSR codes can reduce the repair communication overhead. The more interference elements are aligned, the more communication overhead is reduced. In order to decrease the interference dimension to a maximum extent, a common eigenvector should be computed and used. But when the interference dimension is more than 3, the complexity of computing the common eigenvector will increase greatly and that will become a significant challenge to the wireless network nodes with limited calculation ability. Dimakis et al. [7] have proved that the repair communication overhead of exact-MSR codes repair algorithm can be achieved with interference alignment only when the code rate is not higher than 1/2. For MBR codes, it can decrease the repair communication overhead to the minimum by sacrificing significant storage overhead. Its storage overhead is about 2 times of MSR. So for the given redundancy, the MBR codes are no longer optimal in terms of reliability.
The interference alignment repair algorithm is based on a hybrid storage model. In this model, the data consist of two parts. One is systematic part which is composed of the original data blocks. The other is nonsystematic part which is composed of the linear combination of the systematic part. Using the interference alignment algorithm to repair the failure data, the newly added node should first collect sufficient data to reduce the interference factors to 3 and then use the interference alignment technique to repair the failure data. The method proposed in [9] can reduce the repair communication overhead, but it is still high.
This paper focuses on optimizing the repair communication overhead of distributed storage. Having analyzed the tradeoff between storage overhead and repair communication overhead and turned the flat storage structure into hierarchical storage structure, this paper proposes a distributed storage method, which is based on two-layer storage structure, and a data repair algorithm. The proposed algorithm decreases the repair communication overhead by sacrificing lower storage overhead. The two-layer storage structure has two kinds of encoded data. They are the encoded network coding data and reencoded data. The lower-layer nodes are responsible for the original data reconstruction. The upper-layer nodes are responsible for failure data recovery. The data repair algorithm based on two-layer storage structure can ensure the restored data and the original encoded data have the recoverability property. Moreover, the two-layer storage structure keeps the storage system in a dynamic steady state all the time. That is, the data reliability of the entire system is stable. The analysis shows that the proposed method can greatly reduce the repair communication overhead by sacrificing lower storage overhead. This paper also proves that the proposed method can reduce the repair communication overhead to
This paper is organized as follows. Section 2 is related works. Section 3 proposes two-layer data storage scheme and data repair method. Section 4 evaluates the repair communication overhead of the proposed method. The conclusion is in Section 5.
2. Related Work
There is a tradeoff between the communication overhead of repairing failure data and the storage overhead in a distributed data storage system. To ensure the availability of the stored data, some methods to balance the storage overhead and the communication overhead of repairing the failure data are proposed in [1, 7, 9–14]. There are three kinds of recovery algorithms: regenerating codes recovery algorithm [1, 7, 11–14], interference alignment recovery algorithm [1, 9], and tree-structured recovery algorithm based on network topology [10].
For the regenerating codes recovery algorithm, MSR codes and MBR codes [7] can be its representation because of most researchers having a huge interest in them. With the consideration of the minimum storage overhead, MSR codes have minimal storage overhead on a single storage node. MSR codes use the interference alignment technique to reduce the repair communication overhead and the repair process can be seen in [7]. Compared with the traditional backup method, MSR codes repair algorithm can significantly reduce the repair communication overhead by interference alignment technique. But MSR codes have some deficiencies. (1) Using interference alignment technique to reduce the interference dimension to 1 should use the common eigenvectors of all the interfering elements. But the complexity of computing the common eigenvectors will greatly increase with the increase of interference elements. It will be a great challenge to the wireless storage nodes with limited calculation ability. (2) Repair communication overhead of exact-MSR codes repair algorithm can be achieved only when the coding rate is at most 1/2 [7], otherwise its desired results cannot be guaranteed.
MBR codes consider the repair problem with the view of minimum repair communication overhead. Repair communication overhead of MBR codes is equal to its single-node storage overhead. And its repair communication overhead is minimal among all the known data repair algorithms. The meticulous process of building MBR codes is in [7]. Nevertheless, the shortcoming of MBR codes is its great storage overhead. That is because each data block is stored twice. So for the given redundancy, the reliability of MBR codes is no longer optimal.
With minimal storage overhead for a single storage node, the data repair method based on interference alignment reduces repair communication overhead by merging interference elements. The interference alignment technique can reduce the interference elements to 1 by collecting sufficient data. Then the failure data can be repaired by solving linear equations. The details are shown in [9]. The shortcomings of the interference alignment repair algorithm are as follows: (1) Interference alignment repair algorithm is mainly acted on the systematic data. When non-systematic data are failure, they are turned into systematic. (2) Compared with the traditional method, the interference alignment repair algorithm does not decrease the repair communication overhead significantly.
Data repair algorithm based on network topology is named tree-structured data regeneration. This method views the data repair as recoding of the encoded data blocks. This method is based on the random network coding theory. And it is chiefly used to reduce the repair time. Compared with the traditional method, its repair communication overhead is not reduced.
For these shortcomings of the available data repair algorithms, this paper analyzes the tradeoff between the storage overhead and repair communication overhead, transforms the flat node storage method into hierarchical network coding storage method which is inspired by the tree-structured data regeneration, and then proposes the distributed network coding storage method and repair algorithm.
3. Two-Layer Storage Structure and Data Repair Method
In this section, two-layer storage structure and data repair method are proposed. In the two-layer storage structure, the encoded data blocks are organized into two layers.
3.1. The Construct of Two-Layer Data Storage Structure
There are two types of encoded data blocks in the two-layer storage structure and they constitute the lower-layer and the upper-layer of the two-layer data storage structure, respectively. The network coding data blocks of the original data consist of the lower-layer data blocks; while the upper-layer encoded data blocks are the linear combination of the lower-layer data blocks. The construction process is as follows.
The original data of size M bits are divided equally into k blocks (of size M/k bits each), represented by a k-dimension vector
The traditional data repair method illustrates that the data repair is essentially the process of solving linear equations. The communication overhead of the data recovery depends on the required data blocks, which are also the number of linear equations. Thus, reducing the communication overhead of data recovery is to decrease the number of equations.
To reduce the number of equations, the upper-layer encoded data scheme is proposed. The upper-layer nodes store the reencoded data from the lower-layer data by

Two-layer encoded data structure with (3, 11, 6) code.
3.2. The Methods of Repairing the Failure Data
The exact repair means that the recovered data are exactly the same as the failures. For all of the upper-layer nodes and the lower-layer nodes involved in re-encoding process, when any of them is failure, the failure data can be exactly repaired. For the rest of the lower-layer nodes, when one of them fails, there are two ways, exact repair and functional repair, to recover the failure data. The functional repair is that the newly generated block can contain the different failure data as long as the system maintains the network coding
3.2.1. Exact Repair
In exact repair, when the upper-layer nodes failed, the new joined node should only collect the lower-layer data blocks which are responsible for the previous re-encoded block to exact repair the failure data. For the lower-layer node participated in re-encoding, such as node
For the rest of the lower-layer nodes, if one of them failed, the upper-layer data blocks are used to realize the exact repair. The subsets of the upper-layer encoded blocks can exactly repair the failure data blocks stored at the lower layer nodes that are not involved in re-encoding process and the number of elements in each subset is
3.2.2. Hybrid Repair
Hybrid repair is a hybrid model of the exact repair and functional repair. The hybrid model is: If any of the upper-layer nodes or the lower-layer nodes involved in reencoding process failed, the data stored at them are exactly repaired; however, if the lower-layer nodes which are not involved in the reencoding process failed, the data stored at them are functionally repaired. The functional repair is actually the linear combination of
The functional repair of the data stored at the lower layer nodes which are not involved in re-encoding process is a linear combination of data blocks of the upper layer. Therefore, the functional repair can be represented by
4. Evaluation
In this section, we analyze the proposed data repair method and evaluate the communication overhead for data repair, namely, repair communication overhead. We also compare it with the existing data repair method. The communication overhead for data repair is represented by the amount of communication data in the data repair process.
4.1. Repair Communication Overhead Evaluation of the Exact Repair
4.1.1. Repair Communication Overhead
Whether the lower-layer data blocks that are involved in re-encoding process or not brings the difference of the repair schemes and even makes the repair communication overhead not the same. From the repair methods mentioned above, we know that the repair communication overhead of the upper-layer data blocks and the lower-layer data blocks involved in re-encoding process is p, and the repair communication overhead of the rest of the lower layer data blocks is
From the equation, we know that the value of E is related to
Firstly, if
Secondly, if
When x is maximal, E will be minimal. From (8), we can see that
Thirdly, if
Compare the value of E at these 3 situations and we can see that if
4.1.2. Evaluation of the Repair Communication Overhead
Theorem 1.
If
Proof.
If
Theorem 2.
If
Proof.
The repair communication overhead of the basic interference alignment repair algorithm proved by [9] is
For the repair algorithm based on group interference alignment, the repair communication overhead is
4.1.3. Numeral Result
In order to compare the repair overhead of these data repair method above and verify the correctness of the conclusions, the numeral result is shown in Figure 2. The histogram in Figure 2 shows, respectively, the repair overhead of MSR repair, exact repair based on two-layer storage structure, repair based on basic interference alignment and group interference alignment, where

The repair communication overhead evaluation of exact repair.
4.2. Repair Overhead Evaluation of the Hybrid Repair
4.2.1. Repair Communication Overhead
Similar to the exact repair, the communication overhead of hybrid repair can also be represented by its expectation value. Let m be x, let p be y, and let
Let
When
From (12), we can see that the value of E is related to
And the partial derivative of z is
Combine (23) and (18), we can have
Formula (18) can be rewritten as
Now, we discuss the value of E. According to the relationship between y and z, we will have the following 3 situations.
First, if
Second, if
Theorem 3.
Under the condition that the relationship between n and k is
Proof.
For
Therefore,
The minimum value of x can be drawn from Theorem 3 when the relationship between n and k is
If
Then, formula (12) turns into
We can compute the derivative of a by the equality (32). If the derivative is greater than 0, set
The repair communication overhead of hybrid repair based on two-layer storage structure is
Theorem 4.
The hybrid repair based on two-layer storage structure can reduce the repair communication overhead to
Proof.
From the analysis mentioned above, it can be concluded that the repair communication overhead of hybrid repair based on two-layer storage structure is at most
4.2.2. Evaluation of the Repair Communication Overhead
Theorem 5.
If the relationship between n and k is
Proof.
The repair communication overhead of hybrid repair is at most
Theorem 6.
For the repair method based on group interference, if
Proof.
For the repair method based on group interference, its repair communication overhead is
For the repair method based on basic interference alignment, since its repair communication overhead is
4.2.3. Numeral Result
To verify the correctness of the conclusions, the numeral result is shown in Figure 3. The histogram in Figure 3 shows, respectively, the repair communication overhead of MSR repair, exact repair based on two-layer storage structure, repair based on basic interference alignment and group interference alignment, where

The repair communication overhead evaluation of the hybrid repair.
Comparing the conclusions of this paper with the numeral results displayed in Figure 3, it can be seen apparently that they are consistent.
5. Conclusion
This paper analyzes the tradeoff between storage overhead and repair communication overhead in the distributed data storage. We turn the flat storage structure into hierarchical storage structure and present a two-layer distributed data storage scheme to improve the repair communication overhead. Based on the two-layer data storage scheme, a data recovery method is proposed to decrease the repair communication overhead with sacrificing lower storage overhead. The proposed method has lower repair communication overhead than that of MSR, basic interference alignment and group interference alignment schemes. We prove this method reduces the repair communication overhead to
Footnotes
Acknowledgments
The authors gratefully acknowledge inspiring discussions with Weisong Shi and Xiaohong Jiang. This work was supported by the National Natural Science Foundation of China (61100153, 61172068) and the Aviation Science Foundation of China (2009198101, 2010ZC31001, 2010ZC31002, 20101981015, and 2011ZC31006).
