
Abstract

This article investigates the distributed data storage problem with compressed sensing in the space information network. Because of the performance-energy trade-off, most existing strategies focus either on improving the compressed sensing reconstruction performance or on reducing the energy consumption, but not on both. To achieve a better balance, this article proposes a novel and efficient strategy, referred to as the distributed storage strategy based on compressed sensing. Unlike other strategies, which require source packets to visit the entire network, the proposed strategy is a “one-hop” method: information is exchanged only between neighbors. Therefore, the compressed sensing measurement matrix depends heavily on the degree of each space node. We prove that the proposed strategy guarantees the compressed sensing reconstruction performance under both sparse and dense orthonormal bases. Simulation results validate that, compared with the representative CStorage strategy and the compressive data persistence strategy, the proposed strategy consumes the least energy and incurs the lowest computational overhead, while sacrificing almost no compressed sensing reconstruction performance.

Keywords

Space information network, distributed data storage, compressed sensing, energy efficient

Introduction

The space information network (SIN) was proposed to solve the problems that different space systems are built separately and cooperate inconveniently. It can provide communication, navigation, and remote sensing services simultaneously using various space platforms, for example, satellites, aerial vehicles, high-altitude platforms (HAPs), and terrestrial terminals.^1,2 Unlike traditional wireless networks, the network status of the SIN changes dynamically because of distinguishing characteristics such as self-organization and large scale.³ In our prior work,⁴ we divided the SIN into a series of hierarchical autonomous system (AS) networks based on the properties of space nodes. In this way, the complex SIN is decoupled into quasi-static sub-networks, which makes control easier to carry out. Each AS network can be modeled as a homogeneous network similar to terrestrial wireless networks; the difference is that the nodes in the SIN are bandwidth-limited and energy-limited, limitations that can be jointly described as resource-limited.

In recent years, the amount of space information has increased significantly. According to the satellite database of the Union of Concerned Scientists (UCS), the amount of space information in 2020 will be 44 times that in 2010.⁵ However, space nodes in the SIN are resource-limited, and sometimes they may be invisible for a long time due to the shadowing effect. Therefore, increasing the lifetime of information in the SIN is challenging. Distributed storage is an effective solution: it adds redundancy to storage systems by combining them with network technology.⁶ In this way, the original information can be reconstructed by querying a small subset of space nodes even if some nodes are invisible.

The distributed data storage problem has been widely studied in wireless networks, where information is stored with redundancy across the entire network. At first, traditional erasure codes such as low-density parity-check (LDPC) codes⁷ and fountain codes⁸ were adopted in a decentralized manner to improve the storage reliability.^9–12 To reduce the repair bandwidth, Dimakis et al.¹³ and Lin et al.¹⁴ proposed regenerating codes by introducing network coding into storage systems. On the other hand, compressed sensing (CS) theory^15–17 has been applied in a variety of wireless distributed storage strategies by exploiting the correlation between typical information.^18–23 As a result, both the decoding ratio and data dissemination cost are greatly reduced for the same order of erasure codes. As presented in Wang et al.²⁴ and Huleihel et al.,²⁵ communication signals between space nodes are also compressible due to spatial correlation. Therefore, applying CS theory in the SIN is of significant value.

We investigate the distributed data storage problem in the SIN. The main objective is to improve the energy efficiency of the AS network, measured by, for example, the data dissemination cost and the computational overhead. Since there is a trade-off between the reconstruction performance and the energy consumption, we also aim to sacrifice as little performance as possible. In this article, storage packets are generated based on CS theory and the data dissemination process follows the “one-hop” broadcasting mechanism, that is, each node only broadcasts its own source packet once. Thus, information is only exchanged between neighboring nodes. We refer to our strategy as the distributed storage strategy based on compressed sensing (DSSCS). Detailed theoretical analysis and extensive simulations are provided to evaluate the proposed strategy against other representative strategies; the results show that the DSSCS strategy guarantees the CS reconstruction performance under both a sparse orthonormal basis (e.g. the canonical basis) and a dense orthonormal basis (e.g. the discrete cosine transform (DCT) basis), while improving the energy efficiency.

The remainder of this article is organized as follows. In section “Background and related work,” we briefly review the AS networks of the SIN, CS theory, and the related work on CS-based distributed storage strategies in wireless networks. In section “Network model and problem description,” we define the network model and describe the problem. In section “Proposed DSSCS strategy,” we propose the DSSCS strategy to balance CS reconstruction performance and energy efficiency. Then, the validity of the DSSCS strategy and its data dissemination cost are analyzed in section “Theoretical analysis.” In section “Simulation results and discussions,” simulation results and discussions are presented. Finally, we conclude the article in section “Conclusion.”

Background and related work

In this section, we briefly introduce the necessary background to design our strategy.

AS networks of SIN

The SIN is a space-based information infrastructure that contains different kinds of space nodes. As shown in Figure 1, these nodes are located at different orbital altitudes and work in various circumstances with either dynamic (e.g. low earth orbit satellites) or static (e.g. grid topology on the earth) statuses. A unified management approach is not efficient for the SIN because it induces a large number of control messages, which consume excessive energy and network capacity. Therefore, as shown in Figure 2, we decouple the SIN into a series of hierarchical AS networks; each AS network contains a collection of similar space nodes and can be modeled as a homogeneous network.⁴ For example, AS-1 contains the satellites, AS-2 contains the airplanes, AS-3 contains the HAPs, and AS-4 contains the ground terminals. In addition, an individual AS network can be further divided into sub-AS networks if necessary. An independent management strategy can be adopted in each AS (sub-AS) network. This decouples the complex SIN into quasi-static sub-networks, which makes control easier to carry out.

Figure 1.

The architecture of SIN.

Figure 2.

The whole SIN is divided into a series of AS networks based on the property of space nodes.

CS

CS theory^15–17 states that sparse or compressible information can be successfully reconstructed at a sub-Nyquist sampling rate with high probability (WHP). In particular, suppose an N-dimensional vector $x = (x_{1}, \dots, x_{N})^{T}$ can be expressed as

x = Ψ θ = \sum_{i = 1}^{N} θ_{i} ψ_{i}

(1)

where $θ = (θ_{1}, \dots, θ_{N})^{T}$ is the transform coefficient vector in a deterministic orthogonal basis $Ψ = (ψ_{1}, \dots, ψ_{N})$. We say $x$ is K-sparse if and only if (iff) $θ$ has at most K non-zero entries. Then, the N-dimensional vector $x$ can be successfully reconstructed through

y = Φ x = Φ Ψ θ

(2)

where $y$ is an M-dimensional measurement vector with $M \geq K \log (N / K)$, and $Φ$ is the $M \times N$ measurement matrix, which should satisfy the following restricted isometry property (RIP)

(1 - δ) {‖ θ^{'} ‖}_{2} \leq {‖ Φ θ^{'} ‖}_{2} \leq (1 + δ) {‖ θ^{'} ‖}_{2}

(3)

where $θ^{'}$ is any K-sparse vector, $δ \in (0, 1)$ is a constant, and $‖ α ‖_{2}$ is the 2-norm of vector $α$. Designing the measurement matrix $Φ$ is the key problem of CS theory, and the RIP refers to the necessary condition for successful CS reconstruction. Specifically, the random Gaussian and Bernoulli matrices have been shown to satisfy the RIP¹⁷ WHP.
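To make equations (1) and (2) concrete, the following minimal Python sketch builds a K-sparse vector in the canonical basis ($Ψ = I$) and takes M random ±1 measurements of it. The sizes N and K, the seed, the use of the natural logarithm in $M \geq K \log (N / K)$, and all function names are illustrative assumptions rather than values taken from this article.

```python
import math
import random


def min_measurements(n: int, k: int) -> int:
    """Smallest integer M with M >= K * log(N / K) (natural log assumed)."""
    return math.ceil(k * math.log(n / k))


def bernoulli_matrix(m: int, n: int, rng: random.Random) -> list[list[int]]:
    """Dense M x N matrix with i.i.d. entries +1 or -1 (equal probability)."""
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]


def measure(phi: list[list[int]], x: list[float]) -> list[float]:
    """Compute y = Phi x, i.e. equation (2) with Psi = I."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


rng = random.Random(0)          # fixed seed, chosen only for repeatability
N, K = 64, 4                    # illustrative sizes
M = min_measurements(N, K)      # ceil(4 * ln 16) = 12

x = [0.0] * N
for idx in rng.sample(range(N), K):   # K distinct positions -> x is K-sparse
    x[idx] = rng.uniform(-1.0, 1.0)

y = measure(bernoulli_matrix(M, N, rng), x)
print(M, len(y))
```

The sketch only covers the measurement step; recovering $x$ from $y$ requires a CS reconstruction algorithm.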

Related work

CS theory has been widely studied in the distributed storage of wireless networks, where it improves both storage reliability and energy efficiency for the same order of erasure codes. To the best of our knowledge, the works by Talari and Rahnavard¹⁸ and Liu et al.¹⁹ are the most representative CS-based strategies for distributed storage in wireless networks. However, they focus on different performance metrics because of the trade-off between the CS reconstruction performance and the energy efficiency. Aiming to reduce the data dissemination cost, Talari and Rahnavard¹⁸ propose the CStorage strategy for distributed data storage in wireless sensor networks. During the data dissemination process, only $N_{s}$ $(N_{s} \ll N)$ nodes run the probabilistic broadcasting (PBcast)²⁶ mechanism. Each node that receives a packet for the first time rebroadcasts it with probability p. However, the CS reconstruction performance is not proved, and they only consider the case in which signals are sparse in a dense orthonormal basis, for example, the DCT basis used in their signal model.

On the other hand, Liu et al.¹⁹ propose the compressive data persistence (CDP) strategy for improving the CS reconstruction performance on different types of signals. Random walk²⁷ is employed for disseminating packets throughout the entire network since it does not need any location information and can be used in an arbitrary topology. During the encoding process, each node launches r random walks whose length is set to the cover time so as to reach the equilibrium distribution, where the cover time is defined as the expected length that ensures a source packet visits all nodes at least once. As a result, each column of the measurement matrix has approximately r non-zero elements, and they prove that $Φ$ satisfies the RIP. However, the CDP strategy consumes excessive energy and takes a long time to converge.

Since space nodes in the SIN are resource-limited, reducing the energy consumption and computational overhead is important. Inspired by Talari and Rahnavard¹⁸ and Liu et al.,¹⁹ we propose a novel, simple, and efficient DSSCS strategy for the AS network to achieve a better balance between the CS reconstruction performance and energy efficiency. In our strategy, each node only exchanges information with its direct neighbors. Hence, the measurement matrix $Φ$ is determined by local information. As presented in Mobius et al.,²⁸ the data dissemination cost is proportional to the number of transmissions and receptions. Our intuition is that the energy consumption and computational overhead can be reduced in this way. In addition, we prove that the proposed DSSCS strategy guarantees CS reconstruction performance under both sparse and dense orthonormal bases.

Haupt et al.²⁰ first propose a decentralized CS-based strategy for networked storage, in which random gossiping is employed for disseminating source packets. As a result, the measurement matrix $Φ$ is fairly dense and each row has $O (N)$ non-zero elements, which consumes considerable energy and computational overhead. Lin et al.²¹ utilize a sparser measurement matrix to reduce the transmission cost: each node randomly combines $O (\log N)$ source packets during the encoding phase. However, the CS reconstruction performance is not proved.

Yang et al.²² propose the compressive network coding–based distributed data storage (CNCDS) strategy by introducing network coding into wireless storage systems. Although the storage reliability is improved, similar to Talari and Rahnavard,¹⁸ they ignore the universality across different orthonormal bases. Gong et al.²³ propose the spatiotemporal compressive network coding (ST-CNC) strategy by exploiting the spatial and temporal correlation simultaneously; however, a rigorous proof is not provided and the signal model is not suitable for the SIN.

Network model and problem description

Network model

In this article, we mainly focus on the distributed data storage problem in an AS network. Suppose that the AS network consists of N similar space nodes with omnidirectional antennas, distributed uniformly at random, and that they have an identical transmission radius R. Then, the AS network can be modeled as an undirected random geometric graph $G (N, R)$, as shown in Figure 3. Two nodes are called neighbors and can communicate directly iff their Euclidean distance is less than R. To maintain the connectivity of the AS network, R should satisfy $R^{2} = Θ (V \log N / N)$, where V is the area of the AS network.²⁹ Each node is assigned a unique ID based on its properties, for example, its MAC address. Without loss of generality, we assume that the node IDs are $ID_{i} = i$, $1 \leq i \leq N$.

Figure 3.

Example of AS network.
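As an illustration of the random geometric graph $G (N, R)$ described above, the following Python sketch places nodes uniformly at random and links every pair closer than R. The square region, the constant 2 in $R^{2} = c V \log N / N$, the seed, and the function name are assumptions made only for this example.

```python
import math
import random


def random_geometric_graph(n, radius, side, rng):
    """Place n nodes uniformly in a side x side square; link nodes closer than radius."""
    points = [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n)]
    neighbors = {u: set() for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if math.dist(points[u], points[v]) < radius:
                neighbors[u].add(v)   # links are symmetric (undirected graph)
                neighbors[v].add(u)
    return points, neighbors


rng = random.Random(1)                       # fixed seed for repeatability
N, SIDE = 200, 1.0                           # illustrative network size and region
V = SIDE * SIDE
R = math.sqrt(2.0 * V * math.log(N) / N)     # R^2 = c * V * log N / N with assumed c = 2
points, neighbors = random_geometric_graph(N, R, SIDE, rng)
```

Each entry `neighbors[u]` plays the role of the neighbor set of node u, which is the only topology information a node needs in a fully distributed strategy.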

In our AS model, we do not make any assumption about the routing table since the proposed strategy is fully distributed. Each space node only knows local information, which can be acquired from its neighbors. Global information (e.g. the network size) is not available to the space nodes, which is different from most existing strategies, for example, Talari and Rahnavard¹⁸ and Liu et al.¹⁹ For convenience of the theoretical analysis, we also assume that the links are symmetric and obstacle-free, and that there is a path (either direct or indirect) between any two space nodes. Then, we provide the definition of node degree.

Definition 1: node degree

Denote by $N (u)$ the set of neighbors of node u. The number of neighbors is defined as the node degree of node u and is denoted by $d_{n} (u)$, that is, $d_{n} (u) = | N (u) |$. The average node degree over all nodes in the AS network is called the density of the network and is denoted by d, that is

d = \frac{1}{N} \sum_{u = 1}^{N} d_{n} (u)

(4)

For an AS network with N nodes uniformly and independently distributed in the region V, the probability that two nodes are neighbors, that is, the link probability $p_{link}$ is

p_{link} = \frac{π R^{2}}{V}

(5)

Then, the density of network d can be expressed as

d = N \times p_{link} = N \times \frac{π R^{2}}{V} = Θ (\log N)

(6)

Therefore, WHP each node in the AS network has a node degree of the same order, $Θ (\log N)$.
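Equations (5) and (6) can be evaluated numerically as in the short Python sketch below. The values of N and V, the constant 1 in $R^{2} = c V \log N / N$, and the function names are illustrative assumptions; boundary effects of the region are ignored, as in equation (5).

```python
import math


def link_probability(radius: float, area: float) -> float:
    """Equation (5): p_link = pi * R^2 / V (boundary effects ignored)."""
    return math.pi * radius ** 2 / area


def expected_density(n: int, radius: float, area: float) -> float:
    """Equation (6): d = N * p_link."""
    return n * link_probability(radius, area)


N, V = 1000, 1.0
R = math.sqrt(V * math.log(N) / N)     # assumed constant c = 1 in R^2 = c V log N / N
d = expected_density(N, R, V)          # equals pi * ln(N) for this choice of R
print(round(d, 2))
```

With this choice of R the density grows as $π \log N$, which is one instance of the $Θ (\log N)$ order stated above.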

Problem description

As presented above, we mainly focus on the distributed data storage problem in an AS network, where N similar space nodes are distributed uniformly at random in the region V. Each node is responsible for generating, relaying, and storing information. For convenience of the theoretical analysis and for a fair comparison with other strategies, we assume that each node i has limited memory and generates one symbol $x_{i}$ independently every period. Then, the original data of the entire AS network can be denoted by a vector $x = (x_{1}, \dots, x_{N})^{T}$. As presented in Wang et al.,²⁴ $x$ is often spatially sparse in some orthonormal basis $Ψ$ based on the nature of space information. Without loss of generality, we assume that the vector $x$ is K-sparse in the canonical basis (i.e. $Ψ = I$) and the DCT basis, respectively. The forming process of the measurement matrix $Φ$ and the measurement vector y will be discussed in section “Proposed DSSCS strategy.” Hence, the signal model of the DSSCS strategy can be formulated as $y = Φ x = Φ Ψ θ$, which coincides with CS theory. The objective of our proposed strategy is to successfully reconstruct the vector $x = (x_{1}, \dots, x_{N})^{T}$ by querying a small subset of measurements in a decentralized manner.

Proposed DSSCS strategy

In this section, we present our strategy, DSSCS, for distributed data storage in the AS network. The proposed DSSCS strategy is divided into four phases: initialization, CS encoding, storage, and reconstruction. We aim for every space node in the AS network to store one CS measurement in a decentralized manner, that is, the encoding process is done without unified management. To reduce the data dissemination cost without sacrificing the CS reconstruction performance, each node broadcasts its own source packet only once. Therefore, the measurement matrix is obtained from the local information of space nodes. The detailed descriptions of the DSSCS strategy are given below.

Phase 1: initialization phase

We assume that the transmission in the AS network is synchronized and slotted. At the beginning, that is, in the initialization phase, each node $i$, $i = 1, \dots, N$, generates source data $x_{i}$ and adds its ID to the packet header to form its source packet $P_{i}$, that is, $P_{i} = [i, x_{i}]$. After that, each node $i$ forms the initial storage packet $r (i)$ with the following structure: $r (i) = [φ_{i}, ID_{i}, y_{i}]$. The fields are as follows: (1) $r (i).a1 = φ_{i}$: the measurement coefficient vector of node i, (2) $r (i).a2 = ID_{i}$: the set of IDs of the packets that node i has received, and (3) $r (i).a3 = y_{i}$: the measurement value of node i. We assume that each node takes its own source packet as the first visiting packet and stores it; thus the initial values of $r (i)$ are $φ_{i} = ϕ_{i, i}$, $ID_{i} = i$, and $y_{i} = ϕ_{i, i} \times x_{i}$, respectively, where $ϕ_{i, i}$ is the measurement coefficient and will be discussed later.

Phase 2: CS encoding phase

In the CS encoding phase, every node disseminates its source packet according to the “one-hop” broadcasting mechanism, that is, each node broadcasts its source packet only once. It is a special case of PBcast with the forward probability $p = 0$ . In this article, we do not consider the packet loss since it can be handled at the lower layer. In particular, as presented in Yang et al.,²² selecting proper nodes to broadcast packets in turn can significantly reduce packet collisions in wireless networks. During the encoding process, each node $i, i = 1, \dots, N$ that receives a packet $P_{j}$ will update its storage packet $r (i)$ as follows

\begin{array}{l} φ_{i} = φ_{i} \cup ϕ_{i, j} \\ {ID}_{i} = {ID}_{i} \cup j \\ y_{i} = y_{i} + ϕ_{i, j} x_{j} \end{array}

(7)

The measurement of each node depends on the source packets it receives. We assume that node i receives three packets during the CS encoding phase: $P_{1}$ , $P_{3}$ , and $P_{N}$ . Hence, the CS measurement $y_{i}$ can be calculated by

y_{i} = ϕ_{i, i} x_{i} + ϕ_{i, 1} x_{1} + ϕ_{i, 3} x_{3} + ϕ_{i, N} x_{N}

(8)

The above equation can be expressed in matrix form as

y_{i} = (\begin{matrix} ϕ_{i, 1} & 0 & ϕ_{i, 3} & \dots & ϕ_{i, i} & \dots & ϕ_{i, N} \end{matrix}) (\begin{matrix} x_{1} \\ x_{2} \\ ⋮ \\ x_{N} \end{matrix})

(9)

The final packet structure stored in node i is shown in Figure 4, which consists of the measurement coefficient set $φ_{i}$ , the node ID (of received source packets) set $I D_{i}$ , and the CS measurement $y_{i}$ .

Figure 4.

Storage packet structure of node i.
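The update rule of equation (7) and the example of equation (8) can be sketched in Python as follows. The class and field names, the dictionary used to keep each coefficient paired with its packet ID, the duplicate check, the coefficient signs, and the source values are all assumptions made for illustration; they are not part of the packet format defined in this article.

```python
from dataclasses import dataclass, field


@dataclass
class StoragePacket:
    """Storage packet r(i) = [phi_i, ID_i, y_i] held by one node."""
    coeffs: dict = field(default_factory=dict)   # r(i).a1: packet ID -> coefficient
    ids: set = field(default_factory=set)        # r(i).a2: IDs of received packets
    y: float = 0.0                               # r(i).a3: running CS measurement

    def absorb(self, packet_id: int, value: float, coeff: int) -> None:
        """Apply the update rule of equation (7) for one source packet."""
        if packet_id in self.ids:        # ignore duplicates (defensive assumption)
            return
        self.coeffs[packet_id] = coeff
        self.ids.add(packet_id)
        self.y += coeff * value


# Node i = 2 in a network of N = 5 nodes; the source values are made up.
x = {1: 0.5, 2: -1.0, 3: 2.0, 5: 4.0}
r = StoragePacket()
r.absorb(2, x[2], +1)    # initialization: own packet with phi_{i,i} = +1
r.absorb(1, x[1], -1)    # receives P_1
r.absorb(3, x[3], +1)    # receives P_3
r.absorb(5, x[5], -1)    # receives P_N
print(r.y)               # -1.0 - 0.5 + 2.0 - 4.0 = -3.5
```

Keying the coefficients by packet ID is a design choice of this sketch: a plain set of ±1 values would lose the pairing between coefficients and source packets that the reconstruction phase needs.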

During the CS encoding phase, each node independently runs the CS encoding procedure and updates its own storage packet as shown in equation (7). Therefore, the encoding process can be expressed as

y = Φ x

(10)

where $y = (y_{1}, y_{2}, \dots, y_{N})^{T}$ is the measurement vector of the AS network, $Φ = (φ_{1}, φ_{2}, \dots, φ_{N})^{T}$ is the measurement matrix, and node i computes the corresponding row vector $φ_{i}$ of $Φ$ . The non-zero elements of $φ_{i}$ denote the source packets that had visited node i (e.g. $ϕ_{i, 1}$ , $ϕ_{i, 3}$ , and $ϕ_{i, N}$ are non-zero elements in $φ_{i}$ since node i receives packets $P_{1}$ , $P_{3}$ , and $P_{N}$ ).

Here, we discuss the choice of the measurement coefficient $ϕ_{i, j}$. Let $q (i, j)$ denote the probability that $ϕ_{i, j} \neq 0$ after the CS encoding phase. According to equation (5), the probability that two nodes are neighbors is $p_{link} = π R^{2} / V$. Since each node independently runs the encoding process, we have

\begin{matrix} q (i, j) = p_{link} = \frac{π R^{2}}{V} = Θ (\frac{π \log N}{N}) \end{matrix}

(11)

When N is sufficiently large

\begin{matrix} q (i, j) \to \frac{1}{N} \end{matrix}

(12)

For convenience of the theoretical analysis and without sacrificing the CS reconstruction performance, we set each non-zero measurement coefficient $ϕ_{i, j}$ to +1 or −1 with equal probability, that is

ϕ_{i, j} = \begin{cases} + 1, & \text{w.p. } \frac{1}{2} q (i, j) \\ - 1, & \text{w.p. } \frac{1}{2} q (i, j) \\ 0, & \text{w.p. } 1 - q (i, j) \end{cases}

(13)

Hence, the measurement matrix $Φ$ can be described as a random Bernoulli matrix, and the number of non-zero elements in each row obeys the binomial distribution. As presented in Chang and Wu,¹⁷ the random Bernoulli matrix satisfies the RIP WHP. The CS reconstruction performance will be proved in section “Theoretical analysis.”
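The distribution in equation (13) can be sampled as in the Python sketch below. It treats all entries as independent draws with the same probability q, which is the modeling assumption of equation (13) rather than a simulation of an actual network; the function names, the seed, and the value of N are illustrative.

```python
import random


def sample_coefficient(q: float, rng: random.Random) -> int:
    """Draw phi_{i,j} as in equation (13): +1 or -1 w.p. q/2 each, 0 w.p. 1 - q."""
    u = rng.random()          # uniform on [0, 1)
    if u < q / 2:
        return 1
    if u < q:
        return -1
    return 0


def sample_row(n: int, q: float, rng: random.Random) -> list[int]:
    """One row of the measurement matrix with independent entries."""
    return [sample_coefficient(q, rng) for _ in range(n)]


rng = random.Random(7)                        # fixed seed for repeatability
N = 1000
row = sample_row(N, q=1.0 / N, rng=rng)       # q = 1/N as in equation (12)
nonzeros = sum(1 for c in row if c != 0)      # binomially distributed count
```

Here `nonzeros` is one realization of the binomially distributed number of non-zero elements in a row.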

According to equation (6), each node has the same order of node degree $Θ (\log N)$. Thus, each row (column) vector of the measurement matrix $Φ$ has $Θ (\log N)$ non-zero elements, that is, each source packet contributes to $Θ (\log N)$ CS measurements. Therefore, the measurement matrix $Φ$ is sparse and the storage overhead of each space node is small. Figure 5 shows that the measurement matrix $Φ$ can be described as a bipartite graph between N nodes (measurements) and N source packets, where the nodes on the left side denote the source packets and the nodes on the right side correspond to the measurements, with average node degree $Θ (\log N)$.

Figure 5.

The bipartite graph between N nodes and N source packets.

Phase 3: storage phase

During the CS encoding phase, each node $i, i = 1, \dots, N$ runs the CS encoding procedure (as shown in equation (7)) for the received source packets. Then, node i finishes its encoding process and stores the packet $r (i)$ in the storage phase.

Phase 4: reconstruction phase

To reconstruct the source vector $x = (x_{1}, \dots, x_{N})^{T}$ , a user node (either fixed or mobile) queries any $M (M \geq K \log (N / K))$ space nodes and collects their storage packets. Then, the vector x can be successfully reconstructed WHP by solving the following l₁-programming problem¹⁵

\min ‖ θ ‖_{l_{1}}, \quad \text{s.t.} \quad y_{M} = Φ_{M} x, \; x = Ψ θ

(14)

where $y_{M}$ is the M-dimensional measurement vector, and $Φ_{M}$ is the $M \times N$ measurement matrix. They can be obtained from the corresponding collected storage packets. Several CS reconstruction algorithms such as basis pursuit (BP) and orthogonal matching pursuit (OMP) algorithm have been introduced in Chu et al.¹⁶ and Chang and Wu.¹⁷ Without loss of generality, we use the BP algorithm to reconstruct the original data in this article.
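The reconstruction step can be illustrated with a small pure-Python sketch. Note the swap: this article uses BP, which needs a linear-programming solver, so the sketch instead implements OMP, the greedy alternative mentioned above, for the canonical basis ($Ψ = I$). The function names, the tolerance, and the toy 3 × 4 matrix are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve the square system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z


def omp(phi, y, k, tol=1e-9):
    """Greedy OMP: estimate an (at most) k-sparse x from y = phi x."""
    m, n = len(phi), len(phi[0])
    cols = [[phi[i][j] for i in range(m)] for j in range(n)]
    support, coef, residual = [], [], list(y)
    for _ in range(k):
        if max(abs(v) for v in residual) <= tol:
            break
        # Pick the unused column most correlated with the residual.
        best, best_score = None, 0.0
        for j in range(n):
            norm = sum(c * c for c in cols[j]) ** 0.5
            if j in support or norm == 0.0:
                continue
            score = abs(sum(c * r for c, r in zip(cols[j], residual))) / norm
            if score > best_score:
                best, best_score = j, score
        if best is None:
            break
        support.append(best)
        # Least squares on the selected columns via the normal equations.
        gram = [[sum(a * b for a, b in zip(cols[p], cols[q])) for q in support]
                for p in support]
        rhs = [sum(a * b for a, b in zip(cols[p], y)) for p in support]
        coef = solve_linear(gram, rhs)
        residual = [y[i] - sum(coef[t] * cols[s][i] for t, s in enumerate(support))
                    for i in range(m)]
    x_hat = [0.0] * n
    for s, v in zip(support, coef):
        x_hat[s] = v
    return x_hat


# Toy instance: M = 3 measurements of a 1-sparse vector of length N = 4.
phi_m = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1]]
x_true = [0.0, 3.0, 0.0, 0.0]
y_m = [sum(p * v for p, v in zip(row, x_true)) for row in phi_m]
x_hat = omp(phi_m, y_m, k=1)
```

In this toy instance the second column has the largest normalized correlation with $y_{M}$, so a single OMP iteration recovers the vector; this is an illustration only and says nothing about the recovery guarantees of BP discussed in this article.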

The pseudo-code of DSSCS strategy is illustrated in Algorithm 1.

Algorithm 1: DSSCS strategy
Input: source data $x_{i}, i = 1, \dots, N$
Output: storage packets $r (i), i = 1, \dots, N$
Begin:
/* Phase I: Initialization Phase */
for each node $i, i = 1, \dots, N$ do
    $P_{i} = [i, x_{i}]$;
    $r (i).a1 = ϕ_{i, i}$;
    % $ϕ_{i, i}$ is set to +1 or −1 with equal probability;
    $r (i).a2 = i$;
    $r (i).a3 = ϕ_{i, i} x_{i}$;
end
/* Phase II: CS Encoding Phase */
Each of the N nodes independently broadcasts its source packet once;
for $i = 1$ to $i = N$ do
    for $j = 1$ to $j = N$ do
        if node i receives packet $P_{j}$ then
            $r (i).a1 = [r (i).a1] \cup ϕ_{i, j}$;
            % $ϕ_{i, j}$ is set to +1 or −1 with equal probability;
            $r (i).a2 = [r (i).a2] \cup j$;
            $r (i).a3 = r (i).a3 + ϕ_{i, j} x_{j}$;
        else
            continue;
        end
    end
end
/* Phase III: Storage Phase */
for $i = 1$ to $i = N$ do
    return $r (i)$;
end
/* Phase IV: Reconstruction Phase */
The user node queries any M nodes at random to obtain the measurement matrix $Φ_{M}$ and the measurement vector $y_{M}$ from the storage packets $r$;
Use the BP algorithm to recover the original data vector x through $y_{M} = Φ_{M} x$.
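Phases I to III of Algorithm 1 can be sketched in runnable Python as follows. The function name, the 0-based node IDs, the dictionary-based row storage, the seed, and the toy path topology are assumptions made for illustration; packet loss and collisions are not modeled.

```python
import random


def dsscs_encode(x, neighbors, rng):
    """Phases I-III of Algorithm 1 for a given neighbor map.

    x[i] is the source symbol of node i; neighbors[j] is the set of nodes that
    hear node j's single broadcast (links assumed symmetric). Returns
    (phi, ids, y), where phi[i] is row i of the measurement matrix stored as
    {packet ID: coefficient}.
    """
    n = len(x)
    phi, ids, y = [], [], []
    # Phase I: every node stores its own source packet first.
    for i in range(n):
        c = rng.choice((-1, 1))
        phi.append({i: c})
        ids.append({i})
        y.append(c * x[i])
    # Phase II: each node broadcasts once; each neighbor applies equation (7).
    for j in range(n):
        for i in neighbors[j]:
            c = rng.choice((-1, 1))
            phi[i][j] = c
            ids[i].add(j)
            y[i] += c * x[j]
    # Phase III: the storage packets r(i) = [phi_i, ID_i, y_i] are kept as-is.
    return phi, ids, y


# Toy 4-node path graph 0 - 1 - 2 - 3 (illustrative topology).
neighbors = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
x = [1.0, 0.0, -2.0, 0.0]
phi, ids, y = dsscs_encode(x, neighbors, random.Random(3))

# Check y = Phi x row by row, as in equation (10).
for i in range(len(x)):
    assert abs(y[i] - sum(c * x[j] for j, c in phi[i].items())) < 1e-12
```

Each row `phi[i]` has one entry for node i itself plus one per neighbor, so the sparsity of the measurement matrix follows the node degrees directly.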

Theoretical analysis

In this section, we first prove that the proposed DSSCS strategy guarantees CS reconstruction performance and then investigate the expression for data dissemination cost.

Proof of CS reconstruction performance

As presented above, the measurement matrix $Φ$ of the DSSCS strategy depends heavily on the local information of space nodes in the AS network. In this subsection, we prove that the measurement matrix $Φ$ can be used to reconstruct source information. Recall that the RIP refers to the necessary condition for successful CS reconstruction in traditional CS theory. However, verifying the RIP for a given measurement matrix $Φ$ is NP-hard.¹⁷ Fortunately, Candes and Plan³⁰ propose an alternative mechanism to guarantee the CS reconstruction performance for an arbitrary measurement matrix $Φ$. First, we provide the definitions of the isotropy property and the incoherence property, respectively.

Definition 2: isotropy property

For an N-dimensional signal vector $x$ which is K-sparse in the $N \times N$ orthogonal basis $Ψ$, let the matrix $A = Φ Ψ$, where $Φ$ is the measurement matrix. We say that the matrix $A$ satisfies the isotropy property iff $E (A^{T} A) = I_{N}$, where $E (X)$ is the expectation of matrix $X$. The isotropy property means that the row vectors of matrix $A$ are independent and have unit variance.

Definition 3: incoherence property

Let $A$ be the product of the measurement matrix $Φ$ and the orthogonal basis $Ψ$, that is, $A = Φ Ψ$. We define the incoherence measure $μ (A)$ of matrix $A$ as the product of $\sqrt{N}$ and the maximum absolute value of the elements of matrix $A$, that is

\begin{matrix} μ (A) = \sqrt{N} \times max_{i, j} | a_{ij} | \end{matrix}

(15)

where $a_{ij}$ denotes the element in the $i$th row and $j$th column of the matrix $A$. The incoherence property is a relative property: the smaller the incoherence measure $μ (A)$, the more incoherent the matrix. In particular, when $μ (A) = O (1)$, that is, when $μ (A)$ is bounded by a positive constant, the matrix $A$ satisfies the incoherence property³⁰ WHP.
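Equation (15) translates directly into a few lines of Python. The function name and the two 4 × 4 test matrices (the identity and a scaled Hadamard matrix) are illustrative choices, not matrices used in this article.

```python
import math


def incoherence(a):
    """mu(A) = sqrt(N) * max |a_ij| for a matrix with N columns, per equation (15)."""
    n = len(a[0])
    return math.sqrt(n) * max(abs(v) for row in a for v in row)


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
# Orthonormal matrix whose entries all have magnitude 1/sqrt(N), with N = 4.
hadamard = [[0.5, 0.5, 0.5, 0.5],
            [0.5, -0.5, 0.5, -0.5],
            [0.5, 0.5, -0.5, -0.5],
            [0.5, -0.5, -0.5, 0.5]]

print(incoherence(identity))   # 2.0, i.e. sqrt(N)
print(incoherence(hadamard))   # 1.0
```

The two values mark the extremes for an orthonormal matrix: energy concentrated in single entries gives $\sqrt{N}$, while energy spread evenly over all entries gives 1.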

Candes and Plan³⁰ proved that, when the matrix $A = Φ Ψ$ satisfies both the isotropy property and the incoherence property, the K-sparse signal vector x can be successfully reconstructed from $M (M \geq K \log (N / K))$ measurements WHP. Next, we will prove that the matrix $A$ formed by the proposed DSSCS strategy satisfies both the isotropy property and the incoherence property.

Theorem 1

For an N-dimensional signal vector x which is K-sparse in an $N \times N$ deterministic orthogonal basis $Ψ$, let the matrix $A = Φ Ψ$, where $Φ$ is the $N \times N$ measurement matrix of the proposed DSSCS strategy as defined in equation (13). Then, the matrix A satisfies the isotropy property.

Proof of Theorem 1

Let $φ_{i}$ be the $i$th row vector of the measurement matrix $Φ$. During the CS encoding phase, each space node independently disseminates its source packet through the “one-hop” broadcasting mechanism. Hence, the columns of the measurement matrix $Φ$ are independent. In particular, each column has N random elements, each of which equals +1 or −1 with probability $1 / (2 N)$ each and zero with probability $1 - (1 / N)$. Thus

E (φ_{i} φ_{j}^{T}) = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}

(16)
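The step from the coefficient distribution to equation (16) can be spelled out as follows. This is a sketch under the assumptions that $q (i, j) = 1 / N$ as in equation (12), that the entries of $Φ$ are mutually independent as modeled in equation (13), and that $ϕ_{ik}$ denotes the $k$th entry of the row vector $φ_{i}$.

```latex
E (φ_{i} φ_{j}^{T}) = \sum_{k = 1}^{N} E (ϕ_{ik} ϕ_{jk}), \qquad
E (ϕ_{ik}) = (+1) \tfrac{1}{2 N} + (-1) \tfrac{1}{2 N} = 0, \qquad
E (ϕ_{ik}^{2}) = \tfrac{1}{2 N} + \tfrac{1}{2 N} = \tfrac{1}{N}

i = j: \quad \sum_{k = 1}^{N} E (ϕ_{ik}^{2}) = N \cdot \tfrac{1}{N} = 1

i \neq j: \quad \sum_{k = 1}^{N} E (ϕ_{ik}) \, E (ϕ_{jk}) = 0
```

The same calculation applied to pairs of columns gives $E (Φ^{T} Φ) = I_{N}$.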

Then, we have

\begin{aligned} E (A^{T} A) & = E ((Φ Ψ)^{T} (Φ Ψ)) \\ & = E (Ψ^{T} Φ^{T} Φ Ψ) \\ & = Ψ^{T} E (Φ^{T} Φ) Ψ \\ & = Ψ^{T} I_{N} Ψ \\ & = I_{N} \end{aligned}

(17)

In addition, the deviation of $A^{T} A$ is bounded by the positive constant 1 and is independent of N according to equation (13). Hence, the matrix $A$ is statistically orthonormal and satisfies the isotropy property.
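The isotropy property can also be checked empirically. The Python sketch below estimates $E (Φ^{T} Φ)$ by Monte Carlo for the canonical basis ($Ψ = I$, so $A = Φ$), drawing entries independently from equation (13) with $q = 1 / N$. The matrix size, the number of trials, the seed, and the function names are assumptions chosen to keep the example fast.

```python
import random


def sample_phi(n, rng):
    """N x N matrix with entries drawn as in equation (13), using q = 1/N."""
    q = 1.0 / n

    def draw():
        u = rng.random()
        return 1 if u < q / 2 else (-1 if u < q else 0)

    return [[draw() for _ in range(n)] for _ in range(n)]


def mean_gram(n, trials, rng):
    """Monte Carlo estimate of E(Phi^T Phi)."""
    acc = [[0.0] * n for _ in range(n)]
    for _ in range(trials):
        for row in sample_phi(n, rng):
            # Only non-zero entries of a row contribute to Phi^T Phi.
            nz = [(k, v) for k, v in enumerate(row) if v != 0]
            for k, vk in nz:
                for l, vl in nz:
                    acc[k][l] += vk * vl
    return [[v / trials for v in r] for r in acc]


est = mean_gram(n=8, trials=4000, rng=random.Random(11))
max_diag_err = max(abs(est[k][k] - 1.0) for k in range(8))
max_off_diag = max(abs(est[k][l]) for k in range(8) for l in range(8) if k != l)
print(max_diag_err, max_off_diag)
```

Both printed deviations should be small, consistent with the estimate approaching the identity matrix as the number of trials grows.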

It is evident from Theorem 1 that the isotropy property is mainly based on the measurement matrix $Φ$ , while the incoherence property depends heavily on the orthonormal basis $Ψ$ . Then, we prove that matrix $A$ satisfies the incoherence property under both sparse orthonormal basis and dense orthonormal basis. Without loss of generality, we assume that the signal vector x is K-sparse in the canonical basis $I$ and the DCT basis, respectively.

Theorem 2

For an N-dimensional signal vector x which is K-sparse in the canonical basis, that is, $Ψ = I$, let the matrix $A = Φ Ψ$, where $Φ$ is the $N \times N$ measurement matrix of the DSSCS strategy as defined in equation (13). Then, the matrix $A$ satisfies the incoherence property since the incoherence measure $μ (A) = \sqrt{N}$.

Before proving the correctness of Theorem 2, the following lemma is first provided.

Lemma 1 (Conditional incoherence property 1)

Let $Φ$ be the $N \times N$ measurement matrix of the DSSCS strategy as defined in equation (13). For an N-dimensional signal vector x which is K-sparse in an $N \times N$ deterministic orthogonal basis $Ψ$, let $A = Φ Ψ$. Then, there exists a positive constant $C_{a}$ such that $μ (A) \leq C_{a} \sqrt{N}$, that is, $μ (A) = O (\sqrt{N})$, WHP.

Proof of Lemma 1

Since $\sum_{k = 1}^{N} ψ_{kj}^{2} = 1$ , the element $a_{ij}$ can be expressed as follows

\begin{matrix} a_{ij} = \sum_{k = 1}^{N} ϕ_{ik} ψ_{kj} = \frac{\sum_{k = 1}^{N} ϕ_{ik} ψ_{kj}}{\sqrt{\sum_{k = 1}^{N} ψ_{kj}^{2}}} \end{matrix}

(18)

Based on equation (13), we have $E (ϕ_{ij}) = 0$ and $D (ϕ_{ij}) = 1 / N$ , where $D (x)$ is the variance of x. Hence

\begin{matrix} E (a_{ij}) = 0, D (a_{ij}) = \frac{1}{N} \end{matrix}

(19)

Based on the Chebyshev inequality,³¹ we have

\begin{matrix} \Pr (| a_{ij} | \geq C_{a}) \leq \frac{D (a_{ij})}{C_{a}^{2}} = \frac{1}{N \cdot C_{a}^{2}} \end{matrix}

(20)

where $C_{a}$ is a positive constant. Since $N \gg 1$, we have $| a_{ij} | < C_{a}$ WHP when $C_{a} \geq 5$. Based on equation (15), the incoherence measure $μ (A)$ can be expressed as follows

\begin{matrix} μ (A) = \sqrt{N} \times max_{i, j} | a_{ij} | \leq C_{a} \sqrt{N} = O (\sqrt{N}) \end{matrix}

(21)

Hence, Lemma 1 holds.
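As a worked instance of equation (20), take $C_{a} = 5$ and an illustrative network size $N = 1000$ (the value of N is an assumption chosen only for the arithmetic). The bound for a single element $a_{ij}$ then evaluates to:

```latex
\Pr (| a_{ij} | \geq 5) \leq \frac{1}{N \cdot C_{a}^{2}} = \frac{1}{1000 \times 25} = 4 \times 10^{-5}
```

The bound shrinks as N grows, since the right-hand side of equation (20) is inversely proportional to N.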

Then, we prove the correctness of Theorem 2, which is a special case of Lemma 1.

Proof of Theorem 2

Since $Ψ = I$, we have $A = Φ$. The incoherence measure $μ (A)$ can be expressed as

\begin{matrix} μ (A) = \sqrt{N} \times max_{i, j} | ϕ_{ij} | = \sqrt{N} \end{matrix}

(22)

Therefore, the matrix $A$ satisfies the incoherence property.

Next, we consider that the signal vector x is K-sparse in the DCT orthonormal basis.

Theorem 3

For an N-dimensional signal vector x which is K-sparse in the DCT orthonormal basis, let the matrix $A = Φ Ψ$, where $Φ$ is the $N \times N$ measurement matrix of the DSSCS strategy as defined in equation (13). Then, the matrix $A$ satisfies the incoherence property since the incoherence measure $μ (A) = O (1)$ WHP.

Before proving the correctness of Theorem 3, the following lemma is first provided.

Lemma 2 (Conditional incoherence property 2)

Let $Φ$ be the $N \times N$ measurement matrix of the DSSCS strategy as defined in equation (13). For an N-dimensional signal vector x which is K-sparse in an $N \times N$ deterministic orthogonal basis $Ψ$, let $A = Φ Ψ$. If the maximum element of $Ψ$ satisfies $\max | ψ_{ij} | = O (\sqrt{1 / N})$, then we have $μ (A) = O (1)$.

Proof of Lemma 2

Based on equation (15), the incoherence measure $μ (A)$ can be expressed as follows

\begin{matrix} μ (A) = \sqrt{N} \times max_{i, j} | \sum_{k} ϕ_{ik} ψ_{kj} | \end{matrix}

(23)

Since $Ψ$ satisfies $\max | ψ_{ij} | = O (\sqrt{1 / N})$, we have $\max | ψ_{ij} | < C_{b} \sqrt{1 / N}$, where $C_{b}$ is a positive constant. Then, equation (23) can be further expressed as follows

\begin{matrix} μ (A) \leq \sqrt{N} \times max_{i} \sum_{k} | ϕ_{ik} | \times max_{k, j} | ψ_{kj} | \\ \leq \sqrt{N} \times max_{i} {‖ φ_{i} ‖}_{1} \times C_{b} \sqrt{\frac{1}{N}} \end{matrix}

(24)

where $φ_{i} = (ϕ_{i 1}, ϕ_{i 2}, \dots, ϕ_{iN})$ is the $i$th row vector of the measurement matrix $Φ$ , and $‖ α ‖_{1}$ is the 1-norm of vector $α$ . According to equation (13), the elements in vector $φ_{i}$ are independent of each other. Hence, $‖ φ_{i} ‖_{1}$ obeys the binomial distribution, that is, $‖ φ_{i} ‖_{1} \sim b (N, 1 / N)$ , where $1 / N$ is the probability that each element of $φ_{i}$ is non-zero. Based on the property of the binomial distribution, we have $‖ φ_{i} ‖_{1} < C_{r}$ WHP when the positive constant $C_{r} \geq 4$ . Hence, we have

\begin{matrix} μ (A) \leq C_{b} C_{r} = O (1) \end{matrix}

(25)

Hence, Lemma 2 holds.
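As a quick numerical check of the binomial step in the proof, the tail probability $P (‖ φ_{i} ‖_{1} \geq C_{r})$ can be evaluated exactly for $b (N, 1 / N)$ . The sketch below assumes, as the proof does, that the row 1-norm simply counts the non-zero entries of a row; the function name is illustrative and not part of the original analysis.

```python
from math import comb

def binomial_tail(n, p, c):
    """Return P(X >= c) for X ~ b(n, p), computed from the exact pmf."""
    below = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c))
    return 1.0 - below

# Row weight of the measurement matrix as modelled in the proof: b(N, 1/N).
for n in (100, 200, 300, 800):
    tail = binomial_tail(n, 1.0 / n, 4)
    print(f"N = {n}: P(row weight >= 4) = {tail:.4f}")
```

For these network sizes the per-row tail probability stays below 2%, close to the Poisson limit with unit mean (about 0.019), which is the sense in which a single row satisfies $‖ φ_{i} ‖_{1} < 4$ with high probability.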

Then, we prove the correctness of Theorem 3.

Proof of Theorem 3

The elements of the DCT orthonormal basis are defined as follows

\begin{matrix} ψ_{kj} = \begin{cases} \sqrt{\frac{1}{N}} & k = 1 \\ \sqrt{\frac{2}{N}} \cos \left( \frac{(k - 1) (2 j - 1) π}{2 N} \right) & k > 1 \end{cases} \end{matrix}

(26)

Since $| \cos (\cdot) | \leq 1$ , we have $| ψ_{kj} | \leq \sqrt{2 / N}$ for all k and j, that is, $max | ψ_{kj} | = O (\sqrt{1 / N})$ . According to Lemma 2, we have $μ (A) = O (1)$ . Therefore, the matrix $A$ satisfies the incoherence property.
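The bound used in this proof can be checked numerically. The sketch below builds the DCT basis from equation (26), evaluates the incoherence measure with the index convention written in equation (23), and compares it with the right-hand side of equation (24). The measurement matrix here is only a hypothetical stand-in (independent 0/1 entries that are non-zero with probability $1 / N$ , following the binomial model in the proof of Lemma 2), not the matrix of equation (13); all function names are illustrative.

```python
import math
import random

def dct_basis(n):
    """Orthonormal DCT basis following equation (26); k and j are 1-based there."""
    psi = []
    for k in range(1, n + 1):
        scale = math.sqrt(1.0 / n) if k == 1 else math.sqrt(2.0 / n)
        psi.append([scale * math.cos((k - 1) * (2 * j - 1) * math.pi / (2 * n))
                    for j in range(1, n + 1)])
    return psi

def incoherence_measure(phi, psi):
    """sqrt(N) * max_{i,j} |sum_k phi[i][k] * psi[k][j]|, as written in equation (23)."""
    n = len(phi)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            a_ij = sum(phi[i][k] * psi[k][j] for k in range(n))
            worst = max(worst, abs(a_ij))
    return math.sqrt(n) * worst

n = 32
rng = random.Random(0)
# Hypothetical stand-in for the DSSCS matrix: i.i.d. 0/1 entries, non-zero w.p. 1/N.
phi = [[1.0 if rng.random() < 1.0 / n else 0.0 for _ in range(n)] for _ in range(n)]
psi = dct_basis(n)

max_psi = max(abs(v) for row in psi for v in row)
max_row_weight = max(sum(abs(v) for v in row) for row in phi)
mu = incoherence_measure(phi, psi)
bound = math.sqrt(n) * max_row_weight * max_psi  # right-hand side of equation (24)

print(f"max |psi| = {max_psi:.4f}  (sqrt(2/N) = {math.sqrt(2.0 / n):.4f})")
print(f"mu(A) = {mu:.4f} <= bound = {bound:.4f}")
```

The inequality between `mu` and `bound` holds for any pair of matrices, so the check mainly illustrates that the bound stays small when both the row weight of $Φ$ and the largest entry of $Ψ$ are small.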

Data dissemination cost

In this subsection, we derive the data dissemination cost of the proposed DSSCS strategy. As presented above, data dissemination is the most energy-consuming process, and its cost is proportional to the number of data transmissions and data receptions. According to Mobius et al.,²⁸ in wireless communications, almost identical energy is consumed for data transmission and data reception. We assume that each transmission or reception consumes unit energy, so the data dissemination cost $N_{tot}$ can be calculated as the number of data transmissions plus the number of data receptions.

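Let N be the number of space nodes in the AS network. During the data dissemination process, every node broadcasts its own source packet only once, so the number of transmissions is $N_{t} = N$ . Each broadcast is received only by the one-hop neighbors of the sender, so the number of receptions is N times the average node degree, which is $O (\log N)$ when the transmission radius satisfies $R^{2} = CV \log N / N$ . Therefore, the data dissemination cost of the proposed DSSCS strategy is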

\begin{matrix} N_{tot} = N_{t} + N_{r} \\ = O (N) + O (N \log N) \\ = O (N + N \log N) \end{matrix}

(27)

where $N_{t}$ and $N_{r}$ denote the number of transmissions and receptions during the proposed DSSCS strategy, respectively.

Note that the data dissemination cost of the DSSCS strategy depends only on the number of space nodes in the AS network. In contrast, in the CStorage strategy¹⁸ and the CDP strategy,¹⁹ each source packet should visit the entire network at least once, which is achieved by choosing appropriate parameters such as the forwarding probability p or the random walk length t. Table 1 shows the data dissemination cost for all three strategies, where $N_{s}$ is the number of nodes that broadcast their source packets in the CStorage strategy, and r is the number of random walks that each node launches in the CDP strategy.

Table 1.

Total data dissemination cost in each strategy.

Strategy	$N_{t}$	$N_{r}$	$N_{tot}$
DSSCS	$O (N)$	$O (N \log N)$	$O (N + N \log N)$
CStorage	$O (N_{s} + N_{s} (N - 1) p)$	$O (N_{s} (N - 1))$	$O (N_{s} + (N_{s} (N - 1) (1 + p)))$
CDP	$O (rtN)$	$O (rtN)$	$O (2 rtN)$

DSSCS: distributed storage strategy based on compressed sensing; CDP: compressive data persistence.

Simulation results and discussions

In this section, we first investigate the appropriate parameters of the proposed DSSCS strategy through extensive simulations. Then, to evaluate the effectiveness of DSSCS strategy, we compare it with the CStorage strategy¹⁸ and the CDP strategy¹⁹ using various performance metrics.

Simulation parameters and performance metrics

We consider an AS network consisting of N similar space nodes distributed uniformly at random in a normalized $V = 2000 \times 2000 \ \mathrm{km}^{2}$ region, where each node is responsible for generating, relaying, and storing information. All nodes have an identical transmission radius R, and we set $R^{2} = CV \log N / N$ to ensure the connectivity of the AS network, where C is a positive constant that is discussed later. We define the querying ratio as the ratio between the number of queried measurements M for CS reconstruction and the total number of space nodes N, that is, $η = M / N$ .
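The network setup described above can be sketched as follows. This is a minimal illustration under stated assumptions: the logarithm is taken as natural, the region is a flat square without wrap-around at the borders, and a node is not counted as its own neighbor. The function names are hypothetical, and the resulting average degree need not match the densities reported later in Tables 2 and 3.

```python
import math
import random

def transmission_radius(n, c, area):
    """R from R^2 = C * V * log(N) / N (natural logarithm assumed)."""
    return math.sqrt(c * area * math.log(n) / n)

def deploy_and_measure(n, c, side=2000.0, seed=0):
    """Drop n nodes uniformly in a side x side square; return (R, node degrees)."""
    rng = random.Random(seed)
    nodes = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
    radius = transmission_radius(n, c, side * side)
    degrees = [0] * n
    for a in range(n):
        for b in range(a + 1, n):
            if math.dist(nodes[a], nodes[b]) <= radius:
                degrees[a] += 1  # links are symmetric because all radii are equal
                degrees[b] += 1
    return radius, degrees

radius, degrees = deploy_and_measure(n=100, c=2)
print(f"R = {radius:.1f} km, average degree = {sum(degrees) / len(degrees):.2f}")
```

Because every node uses the same radius, the neighbor relation is symmetric, which is why a single pass over node pairs is enough to obtain all degrees.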

To account for the effect of the orthonormal basis, we use both a sparse basis and a dense basis in the following simulations. Without loss of generality, we consider signals that are sparse in the canonical basis $I$ and in the DCT basis, respectively. Similar to Liu et al.,¹⁹ for sparse signals in the canonical basis $I$ , that is, originally sparse signals, we randomly choose K space nodes with non-zero elements from all N space nodes. For sparse signals in the DCT basis, we use K non-zero transform coefficients to simulate an N-dimensional signal vector over the entire AS network. In particular, the non-zero elements are integers randomly chosen from 1 to 100. The sparsity S is defined as the ratio between the number of non-zero elements (or transform coefficients) K and the total number of space nodes N, that is, $S = K / N$ . The BP algorithm is employed for CS reconstruction, and the CS reconstruction error is defined as $e (x, \tilde{x}) = ‖ x - \tilde{x} ‖_{2} / ‖ x ‖_{2}$ , where $x \in R^{N}$ and $\tilde{x} \in R^{N}$ are the source signal vector and the reconstructed signal vector, respectively. We say that one CS reconstruction is successful if and only if $e (x, \tilde{x}) < e_{0}$ , where $e_{0}$ is the error threshold, which we set to $e_{0} = 10^{- 3}$ .
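The signal model and the success criterion for the canonical-basis case can be sketched as below. The reconstruction step itself (BP) is not implemented; the two reconstructed vectors are hypothetical placeholders used only to exercise the error metric, and the function names are illustrative.

```python
import math
import random

def sparse_signal(n, k, rng):
    """N-dimensional vector with K non-zero entries, each an integer from 1 to 100."""
    x = [0.0] * n
    for idx in rng.sample(range(n), k):
        x[idx] = float(rng.randint(1, 100))
    return x

def reconstruction_error(x, x_hat):
    """Relative error e(x, x~) = ||x - x~||_2 / ||x||_2."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))
    den = math.sqrt(sum(a * a for a in x))
    return num / den

def is_successful(x, x_hat, e0=1e-3):
    """Success criterion: the relative error is below the threshold e0."""
    return reconstruction_error(x, x_hat) < e0

rng = random.Random(1)
n, k = 100, 10                    # sparsity S = K / N = 0.1
x = sparse_signal(n, k, rng)
x_close = [v + 1e-6 for v in x]   # hypothetical near-perfect reconstruction
x_zero = [0.0] * n                # hypothetical failed reconstruction
print(is_successful(x, x_close), is_successful(x, x_zero))
```

Averaging the outcome of `is_successful` over many independent trials gives an estimate of the successful reconstruction probability used as a metric in this section.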

Three metrics are chosen to evaluate the CS reconstruction performance and efficiency of those strategies, that is, the successful CS reconstruction probability $\Pr$ , the total data dissemination cost $N_{tot}$ , and the computational overhead E. The successful CS reconstruction probability is defined as the ratio between the number of successful CS reconstructions $Z_{S}$ and the number of Monte Carlo trials Z, that is, $\Pr = Z_{S} / Z$ . The computational overhead E is defined as the average time for running one Monte Carlo trial. To average out the effect of network topology, each simulation is repeated 1000 times to calculate the mean value of the performance metrics. All results were obtained on a laptop with an Intel (R) Core (TM) i5 2.5 GHz CPU and 4 GB of RAM using the MATLAB R2010a simulator.

Performance on different parameters

In this subsection, we present simulations to investigate appropriate parameters of the proposed DSSCS strategy. Since the measurement matrix $Φ$ depends heavily on the density of AS network, that is, the average node degree, we consider the following three parameters which may affect the CS reconstruction performance:

N. The number of space nodes in the AS network.

C. The positive constant that affects the transmission radius R.

S. The sparsity of the signal vector.

First, we investigate how the number of space nodes N affects the CS reconstruction performance of the DSSCS strategy. We fix the transmission radius at $R^{2} = 2 V \log N / N$ and the sparsity at $S = 0.1$ , while the number of space nodes N takes the values 100, 200, 300, and 800. Figure 6 shows the successful CS reconstruction probability $\Pr$ versus the querying ratio $η$ for signals that are sparse in the canonical basis $I$ and the DCT basis, respectively. The corresponding density of the AS network, the data dissemination cost, and the computational overheads are shown in Table 2. The simulation results are summarized as follows:

The reconstruction probability increases as $η$ gets larger and converges to 1 when enough measurements are queried. This is consistent with the analysis.

There exists an intersection point between the curves for both types of signal, for example, $η \approx 0.4$ in Figure 6(a) and $η \approx 0.33$ in Figure 6(b). The CS reconstruction performance with a large network density (e.g. $N = 800$ ) is slightly worse when fewer measurements are queried, and vice versa.

The density of the AS network increases as N gets larger, which also induces higher data dissemination cost and computational overheads. In addition, the ratio between the density d and N decreases as N increases, which results in a relatively sparse measurement matrix. To ensure the CS reconstruction performance, and without loss of generality, we set $N = 100$ in the following simulations.

Figure 6.

The successful reconstruction probability $\Pr$ versus $η$ for different N: (a) canonical sparse signal and (b) DCT-sparse signal.

Table 2.

Total data dissemination cost and computational overheads in terms of N.

N	100	200	300	800
Density of AS network, d	29.31	37.12	41.74	52.76
Data dissemination cost, $N_{tot}$	3.03e3	7.62e3	1.28e4	4.29e4
Computational overheads, E	1.40e−1	2.98e−1	6.76e−1	2.91

AS: autonomous system.
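The entries of Table 2 appear consistent with the unit-energy cost model: if every node transmits once and each broadcast is received by d neighbors on average, the total cost is $N (1 + d)$ . This reading is an interpretation of the table rather than a formula stated in the text, and the helper name below is illustrative.

```python
# Values copied from Table 2: (N, average degree d, reported N_tot).
table2 = [
    (100, 29.31, 3.03e3),
    (200, 37.12, 7.62e3),
    (300, 41.74, 1.28e4),
    (800, 52.76, 4.29e4),
]

def one_hop_cost(n, d):
    """N transmissions plus N * d receptions, each costing one unit of energy."""
    return n * (1 + d)

for n, d, reported in table2:
    estimate = one_hop_cost(n, d)
    print(f"N = {n}: N(1 + d) = {estimate:.0f}, reported = {reported:.0f}")
```

The estimates agree with the reported costs to within the rounding of the table, which supports reading the density d as the average number of receptions per broadcast.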

Next, we investigate the relationship between the CS reconstruction performance and the transmission radius R. We fix the number of space nodes and the sparsity at $N = 100$ and $S = 0.1$ , respectively. The value of parameter C is varied among 1, 2, and 3. Figure 7 shows the successful CS reconstruction probability $\Pr$ versus the querying ratio $η$ for signals that are sparse in the canonical basis $I$ and the DCT basis, respectively. The corresponding data dissemination cost and computational overheads are shown in Table 3. For the originally sparse signals (Figure 7(a)), the CS reconstruction performance is better when C is larger, whereas for the DCT-sparse signals (Figure 7(b)), increasing C has almost no effect on the CS reconstruction performance. This is because the matrix $A = Ψ Φ$ with the canonical basis $Ψ = I$ depends heavily on the density of the AS network, whereas the matrix $A$ with the DCT basis has $O (N)$ non-zero elements in each row because the DCT basis is dense. In addition, there exists a trade-off between the CS reconstruction performance and the energy efficiency: as Table 3 shows, a larger C also increases the data dissemination cost and the computational overheads. To achieve a better balance, we set the parameter $C = 2$ for originally sparse signals and $C = 1$ for DCT-sparse signals in the following simulations.

Figure 7.

The successful reconstruction probability $\Pr$ versus $η$ for different parameter C: (a) canonical sparse signal and (b) DCT-sparse signal.

Table 3.

Total data dissemination cost and computational overheads in terms of parameter C.

C	1	2	3
Density of AS network, d	16.35	29.31	40.42
Data dissemination cost, $N_{tot}$	1.73e3	3.03e3	4.14e3
Computational overheads, E	9.26e−2	1.40e−1	1.85e−1

AS: autonomous system.

Finally, we investigate the effect of the sparsity S on the performance of the DSSCS strategy. We set $N = 100$ and fix the parameter $C = 2$ for originally sparse signals and $C = 1$ for DCT-sparse signals, respectively. The sparsity S is varied from 0.05 to 0.25. Figure 8 shows the successful CS reconstruction probability $\Pr$ versus the querying ratio $η$ for signals that are sparse in the canonical basis $I$ and the DCT basis, respectively. Note that the number of queried measurements should satisfy $M \geq K \log (N / K)$ for CS reconstruction. The reconstruction probability increases as $η$ gets larger and converges to 1 when enough measurements are queried. For both originally sparse and DCT-sparse signals, the CS reconstruction performance is better when the signal is sparser, that is, when S is small. For example, for a DCT-sparse signal with sparsity $S = 0.25$ , we should query $M \approx 0.7 N$ measurements to recover the signal successfully, whereas only $M \approx 0.3 N$ measurements are required when $S = 0.05$ . A sparser signal therefore considerably improves the CS reconstruction performance and reduces the computational overheads. However, we should notice that the sparsity S is determined by the nature of the space information, and the energy cost and computational overheads are approximately equal once the sparsity is fixed. To ensure the CS reconstruction performance, and without loss of generality, we set $S = 0.1$ in the following simulations.
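To give a feel for the condition $M \geq K \log (N / K)$ , the sketch below evaluates it for $N = 100$ over the sparsity range considered here. It assumes the natural logarithm and ignores any constant factor, so the values are order-of-magnitude guidelines only; the function name is illustrative.

```python
import math

def min_measurements(n, k):
    """Lower bound K * log(N / K) on the number of queried measurements."""
    return k * math.log(n / k)  # natural logarithm assumed

n = 100
for s in (0.05, 0.10, 0.15, 0.20, 0.25):
    k = round(s * n)
    m = min_measurements(n, k)
    print(f"S = {s:.2f}: K = {k}, K log(N/K) = {m:.1f}, eta >= {m / n:.2f}")
```

Under these assumptions the bound gives roughly $η \geq 0.15$ for $S = 0.05$ and $η \geq 0.35$ for $S = 0.25$ , which lies below the querying ratios of about 0.3 and 0.7 observed in the simulations, as expected for a bound stated without its constant.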

Figure 8.

The successful reconstruction probability $\Pr$ versus $η$ for different sparsity S: (a) canonical sparse signal and (b) DCT-sparse signal.

Comparison with other strategies

To evaluate the effectiveness of the proposed DSSCS strategy, in this subsection, we compare it with the CStorage strategy¹⁸ and the CDP strategy.¹⁹ As presented above, these are the two most representative strategies, although they focus on different performance metrics. In the comparison, $N = 100$ space nodes are randomly distributed in the region V, and we set the parameter $C = 2$ for originally sparse signals and $C = 1$ for DCT-sparse signals, respectively. The signal sparsity S is set to 0.1, that is, $K = 0.1 N$ . To ensure a fair comparison, in the CStorage strategy, the number of nodes that broadcast their packets is $N_{s} = M + 5$ , and the forwarding probability p is set to 0.3 to ensure overall coverage. In the CDP strategy, each node launches $r = 30$ random walks with length $t = 500$ to reach the equilibrium distribution. As a result, the three strategies have almost the same sparsity order of the measurement matrix $Φ$ .

Figure 9 shows the successful reconstruction probability of the proposed DSSCS strategy in comparison with the CStorage strategy and the CDP strategy for signals that are sparse in the canonical basis $I$ and the DCT basis, respectively. The results show that the proposed DSSCS strategy and the CDP strategy can be used in both scenarios, while the CStorage strategy cannot be used in the canonical basis since its successful reconstruction probability is close to zero. The performance differences stem from the way the measurement matrix $Φ$ is formed. The measurement matrix of the CStorage strategy has $N - N_{s}$ all-zero columns, so it only guarantees the CS reconstruction performance in a dense orthonormal basis, for example, the DCT basis as shown in Figure 9(b). In the DSSCS and CDP strategies, the non-zero elements are randomly distributed in each row vector of $Φ$ , which guarantees the CS reconstruction performance in both sparse and dense orthonormal bases.

Figure 9.

The successful reconstruction probability $\Pr$ versus $η$ for three strategies (DSSCS, CStorage, and CDP): (a) canonical sparse signal and (b) DCT-sparse signal.

We also notice that the CS reconstruction performance of the DSSCS strategy is slightly worse than that of the CDP strategy in the canonical basis, especially when $η$ is small, for example, $η \leq 0.5$ (Figure 9(a)). This is because, in the CDP strategy, each node launches multiple long random walks to reach the equilibrium distribution. However, this incurs a considerable dissemination cost and takes a long time to converge. In addition, all three strategies have almost the same performance in the DCT basis (Figure 9(b)). As presented above, this is because the matrix $A = Ψ Φ$ has $O (N)$ non-zero elements in each row since the DCT basis is dense.

The data dissemination cost and computational overheads of the three strategies are shown in Figures 10 and 11, respectively. The data dissemination cost is expressed in dB since the values differ by several orders of magnitude. The results show that the proposed DSSCS strategy has the lowest energy consumption and computational overheads, which is a benefit of its one-hop data dissemination process. In addition, we notice that the consumption and overheads of the DSSCS and CDP strategies are independent of the querying ratio $η$ , whereas those of the CStorage strategy increase as $η$ gets larger, because the number of broadcasting nodes (i.e. $N_{s}$ ) increases. In summary, we can conclude that the proposed DSSCS strategy is more efficient than the CStorage and CDP strategies, with almost no sacrifice in the CS reconstruction performance, which is significant for the resource-limited SIN.
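The orders of magnitude involved can be illustrated by evaluating the Table 1 expressions with the comparison parameters. The sketch below is a rough estimate under several assumptions: the constants hidden in the $O (\cdot)$ notation are set to 1, the DSSCS cost is taken as $N (1 + d)$ with the density d for $C = 2$ from Table 2, the number of queried measurements is a hypothetical $M = 50$ , and dB is taken to mean $10 \log_{10}$ of the cost. The function names are illustrative.

```python
import math

def cost_dsscs(n, d):
    """One broadcast per node, received by d neighbours on average."""
    return n + n * d

def cost_cstorage(n, n_s, p):
    """Table 1 expressions for CStorage with the hidden constants set to 1."""
    n_t = n_s + n_s * (n - 1) * p
    n_r = n_s * (n - 1)
    return n_t + n_r

def cost_cdp(n, r, t):
    """Table 1 expression for CDP: r*t*N transmissions plus r*t*N receptions."""
    return 2 * r * t * n

def to_db(value):
    """Assumed dB convention: 10 * log10 of the cost."""
    return 10.0 * math.log10(value)

n = 100
m = 50  # hypothetical number of queried measurements (eta = 0.5)
costs = {
    "DSSCS": cost_dsscs(n, d=29.31),  # d for C = 2, taken from Table 2
    "CStorage": cost_cstorage(n, n_s=m + 5, p=0.3),
    "CDP": cost_cdp(n, r=30, t=500),
}
for name, value in costs.items():
    print(f"{name}: {value:.0f} units, {to_db(value):.1f} dB")
```

Under these assumptions the CDP cost is about three orders of magnitude above the DSSCS cost, which illustrates why a logarithmic scale is convenient for comparing the strategies.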

Figure 10.

The data dissemination cost of three strategies (DSSCS, CStorage, and CDP).

Figure 11.

The computational overheads of three strategies (DSSCS, CStorage, and CDP).

Conclusion

We investigated the distributed data storage problem in the AS network of SIN and proposed a simple but efficient DSSCS strategy based on the CS theory. Compared with most existing strategies that use either broadcast or random walks to disseminate source symbols, DSSCS utilizes the local information through a “one-hop” broadcasting mechanism. In this way, the dissemination cost and computational overheads are reduced significantly, while the CS reconstruction performance is guaranteed under both sparse and dense orthonormal bases. Our simulation results validated the theoretical analysis and the effectiveness of the DSSCS strategy.

Although the assumptions presented in the network model have been widely used in existing distributed data storage strategies, some of them may not be practical in reality. Our future work will mainly focus on the storage strategy under more practical conditions. In addition, the proposed strategy is a general method and may be extended to other networks such as wireless sensor networks and the interplanetary Internet (IPN). We will also consider these issues in the near future.

Footnotes

Academic Editor: Jaime Lloret

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

This work was supported by NSF of China under Grant Nos 91338201, 91438109, and 61401507.

References

1. Dong, F-H, Huang, Q-F, H-J. A novel M2M backbone network architecture. Int J Distrib Sens N 2015; 15(11): 1–10.

2. Mukherjee, Ramamurthy. Communication technologies and architectures for space network and interplanetary internet. IEEE Commun Surv Tutor 2013; 15(2): 881–897.

3. Jessica, Reinert, Patrick. Challenges of integrating NASAs space communication networks. Int J Satell Comm N 2013; 31(5): 383–391.

4. Zhang, G-X, Gou. A hierarchical autonomous system based topology control algorithm in space information network. KSII Trans Internet Inform Syst 2015; 9(9): 3572–3593.

5. Zhang, Bian, D-M, Xie, Z-D. A novel space information network architecture based on autonomous system. In: Proceedings of the 2015 international conference on wireless communications and signal processing (WCSP ’15), Nanjing, China, 15–17 October 2015, pp.1–5. New York: IEEE.

6. Leong, Dimakis. Distributed storage allocations. IEEE T Inform Theory 2012; 58(7): 4733–4752.

7. Wei, Y-M, Foo, Lim. The auto-configurable LDPC codes for distributed storage. In: Proceedings of the IEEE international conference on computational science and engineering, Chengdu, China, 19–21 October 2014, pp.1332–1338. New York: IEEE.

8. Puducheri. Fountain codes. IEE P: Commun 2005; 152(6): 1062–1068.

9. Sun, W-D, Wang, Y-J, Xiao, X-Q. Tree-structured parallel regeneration for multiple data losses in distributed storage systems based on erasure codes. China Commun 2013; 10(4): 113–125.

10. Kong, Z-N, Aly, Soljanin. Decentralized coding algorithms for distributed storage in wireless sensor networks. IEEE J Sel Area Comm 2010; 28(2): 261–267.

11. Jafarizadeh, Jamalipour. Data persistency in wireless sensor networks using distributed luby transform codes. IEEE Sens J 2013; 13(12): 4880–4890.

12. X-C, Chen, W-T. LT codes based distributed coding for efficient distributed storage in wireless sensor networks. In: Proceedings of the IFIP networking conference, Toulouse, 20–22 May 2015, pp.1–9. New York: IEEE.

13. Dimakis, Godfery, Y-N. Network coding for distributed storage systems. IEEE T Inform Theory 2010; 56(9): 4539–4551.

14. Lin, S-J, Chung, W-H, Han. A unified form of exact-MSR codes via product-matrix frameworks. IEEE T Inform Theory 2015; 61(2): 873–886.

15. Malloy, Nowak. Near-optimal adaptive compressed sensing. IEEE T Inform Theory 2014; 60(7): 4001–4012.

16. Chu, X-Y, Stamm, Liu. Compressive sensing forensic. IEEE T Inf Foren Sec 2015; 10(7): 1416–1431.

17. Chang, L-H, J-Y. An improved RIP-based performance guarantee for sparse signal recovery via orthogonal matching pursuit. IEEE T Inform Theory 2014; 60(9): 5702–5715.

18. Talari, Rahnavard. CStorage: distributed data storage in wireless sensor networks employing compressive sensing. In: Proceedings of the IEEE global telecommunications conference (GLOBECOM ’11), Houston, TX, 5–9 December 2011, pp.1–5. New York: IEEE.

19. Liu, Lin. Design and analysis of compressive data persistence in large-scale wireless sensor networks. IEEE T Parall Distr 2014; 26(10): 2685–2698.

20. Haupt, Bajwa, Rabbat. Compressive sensing for networked data. IEEE Signal Proc Mag 2008; 25(2): 90–101.

21. Lin, Luo, Liu. Compressive data persistence in large-scale wireless sensor networks. In: Proceedings of the IEEE global telecommunications conference (GLOBECOM ’10), Miami, FL, 6–10 December 2010, pp.1–5. New York: IEEE.

22. Yang, X-J, Tang, X-F, Dutkiewize. Energy-efficient distributed data storage for wireless sensor networks based on compressed sensing and networking coding. IEEE T Wirel Commun 2013; 12(10): 5087–5099.

23. Gong, Cheng, Chen. Spatiotemporal compressive network coding for energy-efficient distributed data storage in wireless sensor networks. IEEE Commun Lett 2015; 19(5): 803–806.

24. Wang, Y-L, Zhang, G-X, Bian, D-M. Collaborative wideband compressed signal detection in interplanetary internet. Frequenz 2014; 68(6): 389–401.

25. Huleihel, Merhav, Shamai. On compressive sensing in coding problems: a rigorous approach. IEEE T Inform Theory 2015; 61(10): 5727–5744.

26. Liang, Y-B, Lai, L-F, Poor. A broadcast approach for fading wiretap channels. IEEE T Inform Theory 2014; 60(2): 842–858.

27. Zhang, Tian. Analysis of random walk mobility models with location heterogeneity. IEEE T Parall Distr 2015; 26(10): 2657–2670.

28. Mobius, Dargie, Schill. Power consumption estimation models for processors, virtual machines, and servers. IEEE T Parall Distr 2014; 25(6): 1600–1614.

29. Gupta, Kumar. The capacity of wireless networks. IEEE T Inform Theory 2000; 46(2): 388–404.

30. Candes, Plan. A probabilistic and RIPless theory of compressed sensing. IEEE T Inform Theory 2011; 57(11): 7235–7254.

31. Sybis. Log-MAP equivalent Chebyshev inequality based algorithm for turbo TCM decoding. Electron Lett 2011; 47(18): 1049–1050.

Efficient distributed storage strategy based on compressed sensing for space information network
