A DCT Regularized Matrix Completion Algorithm for Energy Efficient Data Gathering in Wireless Sensor Networks

Abstract

This paper presents a novel matrix completion algorithm to enable energy efficient data gathering in wireless sensor networks. The algorithm takes full advantage of both the low-rankness and the DCT compactness features in the sensory data to improve the recovery accuracy. The time complexity of the algorithm is analyzed, which indicates it has a low computational cost. Moreover, the recovery error is carefully analyzed and a theoretical upper bound is derived. The error bound is then validated by experimental results. Extensive experiments are conducted on three datasets collected from two testbeds. Experimental results show that the proposed algorithm outperforms state-of-the-art methods for low sampling rate and achieves a good recovery accuracy even if the sampling rate is very low.

1. Introduction

Wireless sensor networks (WSNs) have been widely used in a variety of applications such as precision agriculture [1], personal health monitoring [2], and environment surveillance [3], in which sensor nodes with limited battery energy are deployed and periodically report their sensor readings to the base station or sink node. Therefore, a key issue in wireless sensor networks is how to efficiently gather these data from sensors provided with only limited energy resources. In the classical data gathering approach [4], each sensor node simply forwards its sensor readings to the sink, resulting in a large amount of traffic and energy consumption.

Recently, Compressive Sensing (CS) [5, 6] has emerged as a new approach to tackle the efficient data gathering problem in WSNs. Taking advantage of the sparsity in sensor readings, CS based methods [7–9] require fewer data packets than the classical approach. However, there are many practical problems when applying CS to data gathering in WSNs. Firstly, CS based methods require a prior dictionary to sparsify the sensor readings. Secondly, the measurement matrix in CS is composed of independent and identically distributed random Gaussian entries, which is dense with very few zero elements. Therefore, sensor nodes need to sample all sensor readings and perform a considerable number of measurement operations, resulting in a large amount of energy waste. Thirdly, CS requires the number of measurements to exceed a certain threshold (depending on the sparsity level of sensor readings) to achieve exact recovery. However, the realistic sensor signal is not always exactly sparse as it should be. Therefore, low sampling rate may lead to insufficient measurements and result in a bad recovery accuracy.

As an extension of CS, matrix completion [10] has shown its potential for enabling efficient data gathering in WSNs. Because the sensor readings are highly correlated in time and space, the data matrix formed by the sensor readings approximates a low-rank matrix. Therefore, the sink node can gather only a small fraction of the sensor readings and use a matrix completion algorithm to reconstruct the missing data. Unlike CS based methods, matrix completion based methods do not require a prior dictionary to sparsify the original signal. Furthermore, the sampling matrix (or measurement matrix) in matrix completion is much sparser than that in CS, which makes it more suitable for wireless sensor networks.

Utilizing the low-rankness feature in sensory data, there are many pioneering works [11–13] on applying matrix completion to WSN, which adopt the alternating least squares technique to estimate the low-rank matrix. The recovery accuracy is further improved by utilizing the spatiotemporal structure in the WSN data. However, the improvement is limited because the spatiotemporal structure directly implies the low-rank feature, and in some sense, these two features are equivalent. Moreover, the alternating least squares algorithm does not scale to large rank.

Besides the low-rankness feature, we observe that the sensor readings in WSNs also exhibit Discrete Cosine Transform (DCT) compactness feature. In other words, the sensor readings can be approximated by only a small number of DCT coefficients. Therefore, by taking full advantage of the DCT compactness feature of the WSNs data, in this paper, we propose a DCT Regularized Matrix Completion (DRMC) algorithm. We analyze the time complexity of DRMC, which indicates that the proposed algorithm has a low computational cost. Moreover, we analyze the recovery error of DRMC and derive a theoretical upper bound. The error bound is then validated by experimental results. Extensive experiments are carried out on three datasets that are collected from two realistic WSN testbeds. We compare the performance of DRMC with state-of-the-art methods. Experimental results show that DRMC outperforms state-of-the-art methods for low sampling rate and achieves a good recovery accuracy even if the sampling rate is very low.

The main contributions of this paper are summarized as follows:

(i) We examine the sensor data collected from real-world WSNs, which reveal two data features: (1) low-rankness and (2) DCT compactness.

(ii) Inspired by these observations, we design a novel DCT Regularized Matrix Completion (DRMC) algorithm to estimate the missing sensory data. Experimental results indicate that DRMC outperforms state-of-the-art methods when the sampling rate is low.

(iii) We analyze the time complexity of the DRMC algorithm, which indicates that DRMC has a low computational cost.

(iv) We analyze the recovery error of the DRMC algorithm and derive a theoretical upper bound, which is then validated by experimental results.

The rest of this paper is organized as follows. Section 2 reviews the state-of-the-art methods. Section 3 models the problem. Section 4 examines the data features in WSNs. Section 5 proposes the DRMC algorithm. Section 6 analyzes the time complexity of the algorithm. Section 7 analyzes the recovery error of the algorithm. Section 8 evaluates the effectiveness of the proposed algorithm. Section 9 concludes this paper.

2. Related Works

In this section, we briefly review previous works related to the data gathering problem in wireless sensor networks.

2.1. Compressive Sensing

Compressive Sensing (CS) theory suggests that sparse signals can be accurately reconstructed from only a small number of measurements [5, 6]. It is a new paradigm for signal processing of networked data [7] and there are many CS based methods for data gathering in wireless sensor networks. Luo et al. proposed a data gathering scheme that applies Compressive Sensing theory to reduce communication cost [14]. Quer et al. presented a framework for data gathering and signals recovery in actual WSN deployments with the integration of CS [9]. Ebrahimi and Assi recently proposed a decentralized method to apply the Compressive Sensing to data gathering in wireless sensor networks [15].

2.2. Matrix Completion

Recently, there are many applications that apply matrix completion technique to wireless sensor networks. Utilizing the low-rankness and spatiotemporal correlation, Zhang et al. proposed a method to recover the lost data in internet traffic matrices [13]. Kong et al. designed an algorithm using the low-rank structure, time stability, space similarity, and multiattribute correlation to estimate the missing data in highly incomplete data matrix [12]. Cheng et al. presented a Spatiotemporal Compressive Data Collection (STCDG) algorithm that utilizes the low-rankness and short-term stability features to reduce data traffic in WSNs [11].

In our earlier work [16], we have studied the data recovery problem in wireless sensor networks when historical data are available and proposed a DCT-Regularized Partial Matrix Completion (DCT-RPMC) algorithm. However, the new algorithm proposed in this paper does not depend on any historical data, which greatly widens its applicability to more general scenarios.

3. Problem Formulation

In this section, we formally formulate the data gathering and recovery problem in wireless sensor networks and state the goal of this paper. The main notations that will be used in the rest of this paper are listed in Summary of Notations.

Assume that the network is composed of n sensor nodes. During a certain sampling period, the ith sensor node acquires m samples, which are modeled as an m-dimensional sensor vector ${\vec{x}}_{i}$ ,

\begin{matrix} {\vec{x}}_{i} = {[x (i, 0), x (i, Δ_{T}), \dots, x (i, (m - 1) Δ_{T})]}^{T}, \end{matrix}

(1)

where $Δ_{T}$ is the sampling period. Therefore, all samples in the network can be organized as an environment matrix $X \in R^{m \times n}$:

\begin{matrix} X = [{\vec{x}}_{1}, {\vec{x}}_{2}, \dots, {\vec{x}}_{n}] . \end{matrix}

(2)

In order to reduce energy consumption, only a fraction of the entries of X will be transmitted to the sink node. We then define a matrix $M \in R^{m \times n}$ as the sampling matrix to indicate which parts of X are transmitted to the sink:

\begin{matrix} M (i, j) = \begin{cases} 1, & \text{if } X (i, j) \text{ is transmitted}, \\ 0, & \text{otherwise} . \end{cases} \end{matrix}

(3)

And the sampling rate τ is defined in the following:

\begin{matrix} τ = \frac{\sum_{i, j}^{} M (i, j)}{m n} . \end{matrix}

(4)

Let $Y \in R^{m \times n}$ denote the data transmitted to the sink. Y is an incomplete version of X, with missing entries replaced by zeros. Therefore, we have

\begin{matrix} Y = M \circ X . \end{matrix}

(5)

The symbol ∘ denotes the element-wise (Hadamard) product of two matrices.

After obtaining Y, the sink can reconstruct the original environment matrix X with the algorithm proposed in Section 5. Our goal is to generate a reconstructed matrix $\hat{X}$ that approximates the original environment matrix X as closely as possible. We measure the recovery performance by the Normalized Mean Absolute Error (NMAE), which is computed over the missing entries only:

\begin{matrix} \mathrm{NMAE} = \frac{\sum_{i, j : M (i, j) = 0}^{} |X (i, j) - \hat{X} (i, j)|}{\sum_{i, j : M (i, j) = 0}^{} |X (i, j)|} . \end{matrix}

(6)
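To make the notation of this section concrete, the following minimal Python sketch builds a sampling matrix M, the observed matrix Y of (5), the sampling rate τ of (4), and the NMAE of (6). The use of NumPy, the function names, the Bernoulli sampling of entries, and the synthetic matrix are illustrative assumptions and are not taken from the experiments in this paper.

```python
import numpy as np

def sample_matrix(X, tau, rng):
    """Draw a 0/1 sampling matrix M and return (M, Y) with Y = M ∘ X.

    Assumption: each entry is kept independently with probability tau.
    """
    M = (rng.random(X.shape) < tau).astype(float)
    Y = M * X  # element-wise product; missing entries become zeros
    return M, Y

def sampling_rate(M):
    """Sampling rate of (4): fraction of entries that are transmitted."""
    return M.sum() / M.size

def nmae(X, X_hat, M):
    """NMAE of (6), evaluated on the missing entries (M == 0) only."""
    missing = (M == 0)
    return np.abs(X - X_hat)[missing].sum() / np.abs(X)[missing].sum()

# Hypothetical 96 x 54 environment matrix (m = 96 samples, n = 54 nodes).
rng = np.random.default_rng(0)
X = 20.0 + rng.random((96, 54))
M, Y = sample_matrix(X, tau=0.3, rng=rng)
```

With this definition, filling every missing entry with zero gives an NMAE of exactly 1, which is a convenient reference point when reading the recovery errors reported later.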

4. Exploring the Data Features

In this section, we examine the data features in real-world wireless sensor networks.

4.1. Experimental Datasets

We use three datasets, which are collected from two WSN testbeds, to serve as the ground truth. The summary of the datasets is shown in Table 1.

Table 1

Experimental datasets.

Data name	Matrix size	Time interval
Intel Temperature	54 nodes × 96 intervals	10 minutes
Intel Humidity	54 nodes × 96 intervals	10 minutes
PARED Temperature	50 nodes × 96 intervals	10 minutes

The first category of datasets is collected by 54 Mica2Dot nodes deployed in the Intel Berkeley Research Lab [17] between February 28 and April 5, 2004. The Mica2Dot node reports collected sensor data including humidity and temperature once every 30 seconds. However, we find that the raw dataset has considerable missing data. Therefore, we have rearranged the raw data (by changing the reporting interval from 30 seconds to 10 minutes) to avoid the missing data.

The second category of dataset consists of temperature readings, which are collected with a 10-minute interval by our own testbed, namely, PARED. PARED consists of 50 sensor nodes. More details about PARED can be found in [18].

4.2. Low-Rank Structure

We first examine the low-rank structure in WSN datasets using the Singular Value Decomposition (SVD). The environment matrix X can be decomposed into three matrices by SVD:

\begin{matrix} X = U Σ V^{T}, \end{matrix}

(7)

where U is an $m \times m$ orthonormal matrix, V is an $n \times n$ orthonormal matrix, and Σ is an $m \times n$ diagonal matrix with singular values $σ_{1}, σ_{2}, \dots, σ_{p}$ sorted in descending order ($p = \min (m, n)$).

Because sensor readings in a WSN are spatiotemporally correlated, the environment matrix X exhibits a low-rank feature. More precisely, X is well approximated by a low-rank matrix of rank r, so the first r singular values carry most of the energy of X. We use the following metric to examine the quality of the low-rank approximation:

\begin{matrix} g (r) = \frac{\sum_{i = 1}^{r} σ_{i}}{{‖X‖}_{*}} = \frac{\sum_{i = 1}^{r} σ_{i}}{\sum_{i = 1}^{p} σ_{i}}, \end{matrix}

(8)

where ${‖\cdot‖}_{*}$ is the nuclear norm and ${‖X‖}_{*} = \sum_{i = 1}^{p} σ_{i}$.

Figure 1 shows the low-rank approximation quality of the three datasets. We found that the largest 10 singular values account for 93%–99% of the total energy, which suggests that the WSN datasets exhibit a good low-rank feature.

Figure 1

Low-rankness feature.
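The metric g(r) in (8) can be computed directly from the singular values. The sketch below is only an illustration: the function name and the synthetic test matrices are assumptions, and it does not reproduce the curves of Figure 1.

```python
import numpy as np

def low_rank_quality(X, r):
    """g(r) of (8): share of the nuclear norm carried by the r largest singular values."""
    s = np.linalg.svd(X, compute_uv=False)  # returned in descending order
    return s[:r].sum() / s.sum()

# Hypothetical data: an exactly rank-3 matrix and a noisy version of it.
rng = np.random.default_rng(1)
A = rng.standard_normal((96, 3)) @ rng.standard_normal((3, 54))
X = A + 0.01 * rng.standard_normal(A.shape)

g_exact = low_rank_quality(A, 3)   # essentially 1 for an exactly rank-3 matrix
g_curve = [low_rank_quality(X, r) for r in range(1, min(X.shape) + 1)]
```

Because the singular values are nonnegative, g(r) is nondecreasing in r and reaches 1 at r = p.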

4.3. DCT Compactness

We also observe that the sensor readings in WSNs exhibit a DCT compactness feature. In other words, the first k DCT coefficients of the sensor vector ${\vec{x}}_{i}$ concentrate most of the energy of ${\vec{x}}_{i}$.

We first define D as the $m \times m$ Discrete Cosine Transform Matrix:

\begin{matrix} D (i, j) = \sqrt{\frac{2}{m}} \cos [\frac{π}{m} (i - \frac{1}{2}) (j - \frac{1}{2})] . \end{matrix}

(9)

Then, the $m \times m$ orthonormal matrix D can be divided into two submatrices:

\begin{matrix} D = \begin{bmatrix} D_{1} \\ D_{2} \end{bmatrix}, \end{matrix}

(10)

where $D_{1}$ consists of the first k rows of D and $D_{2}$ consists of the last $m - k$ rows of D.

Therefore, if the first k DCT coefficients carry most of the energy of ${\vec{x}}_{i}$, we will have ${‖D_{1} {\vec{x}}_{i}‖}_{2} / {‖{\vec{x}}_{i}‖}_{2} \approx 1$ and ${‖D_{2} {\vec{x}}_{i}‖}_{2} / {‖{\vec{x}}_{i}‖}_{2} \approx 0$. Similarly, for the matrix form, we will have ${‖D_{1} X‖}_{F} / {‖X‖}_{F} \approx 1$ and ${‖D_{2} X‖}_{F} / {‖X‖}_{F} \approx 0$, where ${‖\cdot‖}_{F}$ is the Frobenius norm with ${‖X‖}_{F} = \sqrt{\sum_{i, j} X (i, j)^{2}}$. So, we use the following function to examine the DCT compactness feature:

\begin{matrix} h (k) = \frac{{‖D_{1} X‖}_{F}}{{‖X‖}_{F}} . \end{matrix}

(11)

From Figure 2, we can see that the first 10 DCT coefficients concentrate 99% of the total energy, which suggests that these WSN datasets exhibit a good DCT compactness feature.

Figure 2

DCT compactness feature.
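As an illustration of (9)–(11), the sketch below builds the matrix D with the 1-based indices used in (9), which corresponds to the type-IV DCT, and evaluates h(k). The function names and the random-walk test signals are assumptions made only for this example.

```python
import numpy as np

def dct_matrix(m):
    """m x m matrix D of (9), with i, j = 1..m."""
    idx = np.arange(1, m + 1) - 0.5            # the terms (i - 1/2) and (j - 1/2)
    return np.sqrt(2.0 / m) * np.cos(np.pi / m * np.outer(idx, idx))

def dct_compactness(X, k):
    """h(k) of (11): Frobenius-norm share of the first k DCT coefficients."""
    D1 = dct_matrix(X.shape[0])[:k]            # first k rows of D
    return np.linalg.norm(D1 @ X, "fro") / np.linalg.norm(X, "fro")

# Hypothetical slowly varying signals: one random walk per sensor node.
rng = np.random.default_rng(2)
X = np.cumsum(rng.standard_normal((96, 5)), axis=0)

D = dct_matrix(96)
h_curve = [dct_compactness(X, k) for k in range(1, 97)]
```

Since D is orthonormal, h(k) is nondecreasing in k and h(m) = 1, so the interesting question for a dataset is how quickly the curve approaches 1.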

5. Algorithm

In this section, we propose a novel matrix completion algorithm, namely, DCT Regularized Matrix Completion (DRMC), to solve the data recovery problem in WSN data gathering. DRMC takes full advantage of the low-rankness and DCT compactness features to improve the recovery accuracy.

5.1. Utilization of Low-Rankness

As mentioned before in Section 3, the goal of the recovery problem is to estimate X from only a fraction of known entries. According to [10], we can recover X by solving the following rank optimization problem if X is a low-rank matrix:

\begin{matrix} \text{minimize} \quad \operatorname{rank} (X) \\ \text{subject to} \quad M \circ X = Y . \end{matrix}

(12)

However, the rank minimization problem (12) is NP-hard, so no polynomial-time algorithm is known for it. Since the nuclear norm is the tightest convex relaxation of the rank function, a reasonable approach is to solve the convex relaxation in which the rank function is replaced by the nuclear norm:

\begin{matrix} \text{minimize} \quad {‖X‖}_{*} \\ \text{subject to} \quad M \circ X = Y . \end{matrix}

(13)

However, in realistic settings, the environment matrix X is not exactly low-rank, and there may be no low-rank matrix that exactly satisfies the constraints in problem (13). So, we convert the constrained optimization problem (13) into the following nuclear norm regularized optimization problem:

\begin{matrix} \text{minimize} \quad \frac{1}{2} {‖M \circ X - Y‖}_{F}^{2} + λ {‖X‖}_{*}, \end{matrix}

(14)

where $λ > 0$ is the nuclear norm regularization parameter.

5.2. Utilization of DCT Compactness

Though we can estimate X by solving the optimization problem (14), it will overfit to the known entries of X when the sampling rate is low, which will lead to large recovery error in the estimation of the missing entries.

Therefore, to reduce the overfitting in (14), we exploit the DCT compactness feature of the sensor data. As mentioned in Section 4.3, ${‖D_{2} X‖}_{F} / {‖X‖}_{F} \approx 0$. So, we add the squared norm ${‖D_{2} X‖}_{F}^{2}$ to (14) as the DCT regularization term and obtain the following optimization problem:

\begin{matrix} \text{minimize} \quad \frac{1}{2} {‖M \circ X - Y‖}_{F}^{2} + λ {‖X‖}_{*} + μ {‖D_{2} X‖}_{F}^{2}, \end{matrix}

(15)

where μ is the DCT regularization parameter.

5.3. The DRMC Algorithm

We present the DRMC algorithm, which solves the optimization problem in (15). The pseudocode is shown in Algorithm 1. Next, we describe the design of DRMC in detail.

Algorithm 1: DRMC algorithm.

Input:

$Y \in R^{m \times n}$: collected data matrix

$M \in R^{m \times n}$: sampling matrix

$\bar{λ}$: nuclear norm regularization parameter

μ: DCT regularization parameter

Output:

$\hat{X} \in R^{m \times n}$: reconstructed environment matrix

Main procedure:

(1) $L \leftarrow 1 + 2 μ$;

(2) $X_{old} \leftarrow 0$; $X_{new} \leftarrow 0$;

(3) Select $λ_{1} > λ_{2} > \dots > λ_{K} = \bar{λ}$;

(4) for $λ = λ_{1}, λ_{2}, \dots, λ_{K}$ do

(5) $t_{old} \leftarrow 1$; $t_{new} \leftarrow 1$;

(6) repeat

(7) $Z \leftarrow X_{new} + \frac{t_{old} - 1}{t_{new}} (X_{new} - X_{old})$;

(8) $X_{old} \leftarrow X_{new}$;

(9) $G \leftarrow Z - L^{- 1} (M \circ Z - Y + 2 μ D_{2}^{T} D_{2} Z)$;

(10) $X_{new} \leftarrow S_{λ L^{- 1}} (G)$;

(11) $t_{old} \leftarrow t_{new}$;

(12) $t_{new} \leftarrow \frac{1 + \sqrt{1 + 4 t_{old}^{2}}}{2}$;

(13) until ${‖X_{new} - X_{old}‖}_{F} / {‖X_{old}‖}_{F} < ϵ$

(14) end for

(15) $\hat{X} \leftarrow X_{new}$;

(16) return $\hat{X}$
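The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: the NumPy code, the function names, the linear warm-start schedule with the values reported in Section 8.3, the tolerance `eps`, the cap `max_iter` on inner iterations, and the guard against a zero denominator in step (13) are all choices made for this example, and a full SVD is used in place of a partial one.

```python
import numpy as np

def dct_matrix(m):
    """Orthonormal m x m DCT matrix of (9), with 1-based indices i, j."""
    idx = np.arange(1, m + 1) - 0.5
    return np.sqrt(2.0 / m) * np.cos(np.pi / m * np.outer(idx, idx))

def shrink(G, thresh):
    """Singular value shrinkage operator of (26) with threshold thresh."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return (U * np.maximum(s - thresh, 0.0)) @ Vt

def drmc(Y, M, lam_bar=1e-3, mu=1.0, k=10, K=20, eps=1e-4, max_iter=500):
    m, n = Y.shape
    D2 = dct_matrix(m)[k:]                     # last m - k rows of D
    P2 = D2.T @ D2
    L = 1.0 + 2.0 * mu                         # step (1)
    X_old = np.zeros((m, n))                   # step (2)
    X_new = np.zeros((m, n))
    # Step (3), assumption: linear schedule from 10 * lam_bar + 1 to lam_bar.
    lambdas = np.linspace(10.0 * lam_bar + 1.0, lam_bar, K)
    for lam in lambdas:                        # step (4)
        t_old, t_new = 1.0, 1.0                # step (5)
        for _ in range(max_iter):              # steps (6)-(13), capped (assumption)
            Z = X_new + (t_old - 1.0) / t_new * (X_new - X_old)
            X_old = X_new
            grad = M * Z - Y + 2.0 * mu * (P2 @ Z)
            X_new = shrink(Z - grad / L, lam / L)
            t_old = t_new
            t_new = (1.0 + np.sqrt(1.0 + 4.0 * t_old ** 2)) / 2.0
            # Assumption: avoid dividing by zero while X_old is still all zeros.
            denom = max(np.linalg.norm(X_old), 1e-12)
            if np.linalg.norm(X_new - X_old) / denom < eps:
                break
    return X_new

# Hypothetical test case: a rank-2 matrix built from the first 5 DCT basis vectors.
rng = np.random.default_rng(0)
m, n = 48, 30
X = dct_matrix(m)[:5].T @ rng.standard_normal((5, 2)) @ rng.standard_normal((2, n))
M = (rng.random((m, n)) < 0.5).astype(float)
X_hat = drmc(M * X, M)
missing = (M == 0)
nmae = np.abs(X - X_hat)[missing].sum() / np.abs(X)[missing].sum()
```

The synthetic matrix is deliberately both low-rank and DCT-compact, so it satisfies the two assumptions that DRMC exploits; it says nothing about the accuracy on the datasets of Table 1.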

The objective function in (15) can be rewritten in the following form:

\begin{matrix} F (X) ≔ f (X) + P (X), \end{matrix}

(16)

with

\begin{matrix} f (X) = \frac{1}{2} {‖M \circ X - Y‖}_{F}^{2} + μ {‖D_{2} X‖}_{F}^{2}, \end{matrix}

(17)

\begin{matrix} P (X) = λ {‖X‖}_{*} . \end{matrix}

(18)

Note that $P (X)$ is a proper, convex, lower semicontinuous (lsc) [19] function but it is nonsmooth, while $f (X)$ is a convex smooth function and is continuously differentiable, with

\begin{matrix} \nabla f (X) = (M \circ X - Y) + 2 μ D_{2}^{T} D_{2} X . \end{matrix}

(19)

Furthermore, $\nabla f (X)$ is Lipschitz continuous with a positive constant L:

\begin{array}{l} {‖\nabla f (X_{1}) - \nabla f (X_{2})‖}_{F} \leq L {‖X_{1} - X_{2}‖}_{F}, \\ \forall X_{1}, X_{2} \in dom P, \end{array}

(20)

where $dom P = \{X | P (X) < \infty\}$. Proposition 2 indicates that the Lipschitz constant of $\nabla f (X)$ is $L = 1 + 2 μ$.

Since $P (X)$ is nonsmooth, it is difficult to directly minimize the objective function $F (X)$ . Instead, we choose to iteratively minimize a sequence of quadratic approximations of $F (X)$ , which is an effective way to minimize the unconstrained nonsmooth convex function [20–22]. The quadratic approximation of $F (\cdot)$ at point Z is defined as the following:

\begin{array}{l} Q_{L} (X, Z) ≔ f (Z) + 〈\nabla f (Z), X - Z〉 + \frac{L}{2} {‖X - Z‖}_{F}^{2} \\ + P (X) . \end{array}

(21)

The variable $X_{new}$ is repeatedly updated to the minimizer of $Q_{L} (X, Z)$ until ${‖X_{new} - X_{old}‖}_{F} / {‖X_{old}‖}_{F} < ϵ$. The convergence of this iterative process is well studied in [21].

We then introduce an auxiliary variable G to minimize $Q_{L} (X, Z)$. As suggested by Proposition 5, we can minimize $Q_{L} (X, Z)$ using the singular value shrinkage operator defined in (26). Thus, we have $X_{new} = S_{λ L^{- 1}} (G)$.

Moreover, we use a warm-start technique for the nuclear norm regularization parameter λ. Rather than remaining fixed, λ decreases monotonically during the iterative process. It starts with an initial value $λ_{1}$ and gradually declines to $\bar{λ}$, forming a sequence $λ_{1} > λ_{2} > \dots > λ_{K}$ with $λ_{K} = \bar{λ}$.

Lemma 1.

$D_{2}$ is defined as in (10). Then, one has

\begin{matrix} {‖D_{2}^{T} D_{2} X‖}_{F} \leq {‖X‖}_{F}, \forall X \in R^{m \times n} . \end{matrix}

(22)

Proof.

Since D is orthonormal, $D_{2} D_{2}^{T} = I$ and ${‖D X‖}_{F} = {‖X‖}_{F}$.

Note that ${‖X‖}_{F} = \sqrt{\operatorname{Tr} (X^{T} X)}$ and that ${‖D X‖}_{F}^{2} = {‖D_{1} X‖}_{F}^{2} + {‖D_{2} X‖}_{F}^{2} \geq {‖D_{2} X‖}_{F}^{2}$. Then we obtain

\begin{array}{l} {‖D_{2}^{T} D_{2} X‖}_{F} = \sqrt{\operatorname{Tr} (X^{T} D_{2}^{T} D_{2} D_{2}^{T} D_{2} X)} \\ = \sqrt{\operatorname{Tr} (X^{T} D_{2}^{T} D_{2} X)} = {‖D_{2} X‖}_{F} \\ \leq {‖D X‖}_{F} = {‖X‖}_{F} . \end{array}

(23)

Proposition 2.

Assume that $f (X)$ is defined as in (17); then, $\nabla f (X)$ is Lipschitz continuous with constant

\begin{matrix} L = 1 + 2 μ . \end{matrix}

(24)

Proof.

Note that ${‖M \circ X‖}_{F} \leq {‖X‖}_{F}$ because M has only 0/1 entries. Using Lemma 1, we obtain

\begin{array}{l} {‖\nabla f (X_{1}) - \nabla f (X_{2})‖}_{F} \\ = {‖M \circ (X_{1} - X_{2}) + 2 μ D_{2}^{T} D_{2} (X_{1} - X_{2})‖}_{F} \\ \leq {‖M \circ (X_{1} - X_{2})‖}_{F} + 2 μ {‖D_{2}^{T} D_{2} (X_{1} - X_{2})‖}_{F} \\ \leq {‖X_{1} - X_{2}‖}_{F} + 2 μ {‖X_{1} - X_{2}‖}_{F} \\ = (1 + 2 μ) {‖X_{1} - X_{2}‖}_{F} . \end{array}

(25)
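Lemma 1 and Proposition 2 can be checked numerically. In the sketch below, a random orthonormal matrix obtained from a QR factorization stands in for the DCT matrix, which is an assumption justified by the fact that the proofs use only the orthonormality of D; the matrix sizes, the sampling rate, and the variable names are also illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, k, mu = 40, 25, 10, 1.0

# Assumption: any orthonormal D works here, so use the Q factor of a random matrix.
D, _ = np.linalg.qr(rng.standard_normal((m, m)))
D2 = D[k:]                                     # last m - k rows of D
M = (rng.random((m, n)) < 0.3).astype(float)   # hypothetical sampling matrix
Y = M * rng.standard_normal((m, n))

def grad_f(X):
    """Gradient of f in (19)."""
    return M * X - Y + 2.0 * mu * (D2.T @ (D2 @ X))

L = 1.0 + 2.0 * mu
ratios = []
for _ in range(200):
    X1 = rng.standard_normal((m, n))
    X2 = rng.standard_normal((m, n))
    ratios.append(np.linalg.norm(grad_f(X1) - grad_f(X2)) / np.linalg.norm(X1 - X2))
max_ratio = max(ratios)                        # should never exceed L = 1 + 2 * mu
lemma1_lhs = np.linalg.norm(D2.T @ (D2 @ X1))  # should not exceed ||X1||_F
```

Random trials cannot prove the bound, but a ratio above L would immediately expose an error in the gradient or in the constant.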

Definition 3.

Decompose the matrix $X \in R^{m \times n}$ of rank r by SVD: $X = U Σ V^{T}$, where $U \in R^{m \times r}$ and $V \in R^{n \times r}$ have orthonormal columns and $Σ = \operatorname{diag} ({\{σ_{i}\}}_{1 \leq i \leq r})$. Define the singular value shrinkage operator [23] $S_{λ}$ as follows:

\begin{matrix} S_{λ} (X) = U S_{λ} (Σ) V^{T}, \\ S_{λ} (Σ) = \operatorname{diag} ({\{{(σ_{i} - λ)}_{+}\}}_{1 \leq i \leq r}), \end{matrix}

(26)

where $t_{+}$ is the positive part of t, that is, $t_{+} = \max (0, t)$.

Lemma 4.

Let $G \in R^{m \times n}$ . Then,

\begin{matrix} S_{λ} (G) \equiv \underset{X \in R^{m \times n}}{\operatorname{arg\,min}} \{\frac{1}{2} {‖X - G‖}_{F}^{2} + λ {‖X‖}_{*}\}, \end{matrix}

(27)

where $S_{λ} (G)$ is the singular value shrinkage operator applied to G.

Proof.

The proof of Lemma 4 can be found in [23].
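The following sketch illustrates Definition 3 and gives a numerical sanity check of Lemma 4: the shrinkage result should attain an objective value no larger than that of any perturbed candidate. The matrix size, the value of λ, and the perturbation scale are arbitrary choices for the example.

```python
import numpy as np

def shrink(G, lam):
    """Singular value shrinkage operator S_lam(G) of (26)."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt

def objective(X, G, lam):
    """Objective of (27): 0.5 * ||X - G||_F^2 + lam * ||X||_*."""
    return 0.5 * np.linalg.norm(X - G, "fro") ** 2 + lam * np.linalg.norm(X, "nuc")

rng = np.random.default_rng(3)
G = rng.standard_normal((8, 6))
lam = 0.7

X_star = shrink(G, lam)
best = objective(X_star, G, lam)
# Objective values at 100 randomly perturbed candidates around X_star.
others = [objective(X_star + 0.1 * rng.standard_normal(G.shape), G, lam)
          for _ in range(100)]
```

The singular values of `X_star` are those of `G` reduced by λ and clipped at zero, which is why the operator lowers the rank whenever some singular values fall below the threshold.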

Proposition 5.

Let $Z \in R^{m \times n}$ and $P (X) = λ {‖X‖}_{*}$. Assume that $\nabla f (X)$ is Lipschitz continuous with constant L and define $G \in R^{m \times n}$ with

\begin{matrix} G = Z - L^{- 1} \nabla f (Z) . \end{matrix}

(28)

Then, one has

\begin{matrix} \underset{X \in R^{m \times n}}{\operatorname{arg\,min}} Q_{L} (X, Z) = S_{λ L^{- 1}} (G) . \end{matrix}

(29)

Proof.

Consider

\begin{array}{l} Q_{L} (X, Z) = f (Z) + 〈\nabla f (Z), X - Z〉 + \frac{L}{2} {‖X - Z‖}_{F}^{2} + P (X) \\ = f (Z) + 〈\nabla f (Z), X - Z〉 + \frac{L}{2} {‖X - Z‖}_{F}^{2} + λ {‖X‖}_{*} \\ = \frac{L}{2} {‖X - Z + \frac{1}{L} \nabla f (Z)‖}_{F}^{2} + λ {‖X‖}_{*} + f (Z) - \frac{1}{2 L} {‖\nabla f (Z)‖}_{F}^{2} \\ = L \{\frac{1}{2} {‖X - G‖}_{F}^{2} + \frac{λ}{L} {‖X‖}_{*}\} + f (Z) - \frac{1}{2 L} {‖\nabla f (Z)‖}_{F}^{2} . \end{array}

(30)

Thus, combined with Lemma 4, we can obtain that

\begin{array}{l} \underset{X \in R^{m \times n}}{\operatorname{arg\,min}} Q_{L} (X, Z) \\ = \underset{X \in R^{m \times n}}{\operatorname{arg\,min}} \{L (\frac{1}{2} {‖X - G‖}_{F}^{2} + \frac{λ}{L} {‖X‖}_{*}) + f (Z) - \frac{1}{2 L} {‖\nabla f (Z)‖}_{F}^{2}\} \\ = \underset{X \in R^{m \times n}}{\operatorname{arg\,min}} \{\frac{1}{2} {‖X - G‖}_{F}^{2} + \frac{λ}{L} {‖X‖}_{*}\} = S_{λ L^{- 1}} (G) . \end{array}

(31)

6. Complexity Analysis

In this section, we discuss the time complexity of the proposed algorithm.

After analyzing the steps in Algorithm 1, we find that the most computationally intensive step is the singular value shrinkage operation that performs SVD on G, which dominates the computational complexity of this algorithm. The time complexity of SVD for an $m \times n$ matrix is $O (m n^{2})$ [24]. And for any $ϵ > 0$ , the iterative process in Algorithm 1 will terminate in $O (\sqrt{L / ϵ})$ iterations with an ϵ-optimal solution [21, 25]. Consequently, the time complexity of our algorithm is $O (K \sqrt{L / ϵ} m n^{2})$ .

We can further decrease the time complexity of the DRMC algorithm by applying the Partial Reorthogonalization Package (PROPACK) [26] in the singular value shrinkage operation. PROPACK uses the Lanczos method [24] to compute only a partial SVD of G. However, PROPACK cannot know in advance how many singular values are greater than $λ / L$; it only computes a requested number of the largest ones. Hence, we need to predetermine the number of singular values to be computed (denoted as $\mathrm{sv}_{i}$) at the beginning of the ith iteration, and PROPACK then computes the $\mathrm{sv}_{i}$ largest singular values and the corresponding singular vectors. We adopt the prediction rule proposed in [27]:

\begin{matrix} \mathrm{sv}_{i + 1} = \begin{cases} \mathrm{svp}_{i} + 1, & \text{if } \mathrm{svp}_{i} < \mathrm{sv}_{i}, \\ \min \{\mathrm{svp}_{i} + 10, m, n\}, & \text{if } \mathrm{svp}_{i} = \mathrm{sv}_{i}, \end{cases} \end{matrix}

(32)

where $\mathrm{sv}_{i}$ is the predicted number of singular values, $\mathrm{svp}_{i}$ is the actual number of singular values that are larger than $λ / L$, and $\mathrm{sv}_{0} = 10$.
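The prediction rule (32) amounts to a few lines of code. The function name and the example sizes below are hypothetical; the rule itself follows (32).

```python
def next_sv(svp, sv, m, n):
    """Predict how many singular values to request in the next iteration, per (32).

    svp: number of computed singular values that exceeded lambda / L
    sv:  number of singular values requested in the current iteration
    """
    if svp < sv:
        # Fewer values than requested passed the threshold: shrink the request.
        return svp + 1
    # Every requested value passed the threshold: enlarge the request, capped by m and n.
    return min(svp + 10, m, n)

# Hypothetical sizes matching a 96 x 54 environment matrix.
shrunk = next_sv(svp=4, sv=10, m=96, n=54)
grown = next_sv(svp=10, sv=10, m=96, n=54)
capped = next_sv(svp=50, sv=50, m=96, n=54)
```

The rule keeps the request just above the number of singular values that survive shrinkage, so the partial SVD rarely computes vectors that are discarded immediately.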

The time complexity of the Lanczos method is $O (r m n)$ for $m \times n$ matrix with rank of r [24]. Therefore, the time complexity of DRMC algorithm is $O (K \sqrt{L / ϵ} r m n)$ if PROPACK is used. For fixed number of iterations, the complexity of DRMC can be simplified as $O (r m n)$ , while the state-of-the-art matrix completion based methods [11–13] require a complexity of $O (r^{2} m n)$ . So, DRMC is more computationally efficient.

7. Error Analysis

In this section, we analyze the recovery error of the DRMC algorithm and present a theoretical upper bound.

Before starting the analysis, we first introduce some assumptions and lemmas.

Assumption 6.

The original data matrix X can be approximated by the first k DCT coefficients, with approximation error ξ:

\begin{matrix} ξ = {‖X - D_{1}^{T} D_{1} X‖}_{F} = {‖D_{2}^{T} D_{2} X‖}_{F} . \end{matrix}

(33)

Assumption 7.

Let $A : R^{m \times n} \to R^{m \times n}$ be the linear operator defined as

\begin{matrix} A (X) = 2 μ L^{- 1} D_{2}^{T} D_{2} X + L^{- 1} M \circ X \end{matrix}

(34)

and let $I$ be the identity operator.

Then, there exists a constant $0 \leq η < 1$ , such that

\begin{matrix} \sup_{A (X) \neq 0, {‖X‖}_{F} = 1} {‖(I - A) (X)‖}_{F} \leq η . \end{matrix}

(35)

Lemma 8.

Suppose that $S_{λ}$ is the singular value shrinkage operator; then,

\begin{array}{l} {‖S_{λ} (X_{1}) - S_{λ} (X_{2})‖}_{F} \leq {‖X_{1} - X_{2}‖}_{F}, \\ \forall X_{1}, X_{2} \in R^{m \times n} . \end{array}

(36)

Proof.

The detailed proof of Lemma 8 can be found in [28].

Lemma 9.

Suppose that $X \in R^{m \times n}$ with rank of r; then,

\begin{matrix} {‖S_{λ} (X) - X‖}_{F} \leq r λ . \end{matrix}

(37)

Proof.

Consider

\begin{array}{l} {‖S_{λ} (X) - X‖}_{F} = {‖U (S_{λ} (Σ) - Σ) V^{T}‖}_{F} \\ = {‖U (Σ - S_{λ} (Σ)) V^{T}‖}_{F} \\ = {‖U \operatorname{diag} ({\{σ_{i} - {(σ_{i} - λ)}_{+}\}}_{1 \leq i \leq r}) V^{T}‖}_{F} \\ = {‖U \operatorname{diag} ({\{d_{i}\}}_{1 \leq i \leq r}) V^{T}‖}_{F} = \sqrt{\sum_{i = 1}^{r} {d_{i}}^{2}}, \end{array}

(38)

where

\begin{matrix} d_{i} = \begin{cases} λ, & \text{if } σ_{i} \geq λ; \\ σ_{i}, & \text{if } σ_{i} < λ . \end{cases} \end{matrix}

(39)

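Since $0 \leq d_{i} \leq λ$ for every i, we have $\sqrt{\sum_{i = 1}^{r} {d_{i}}^{2}} \leq \sqrt{r} λ \leq r λ$. Therefore, we have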

\begin{matrix} {‖S_{λ} (X) - X‖}_{F} \leq r λ . \end{matrix}

(40)

The upper bound of the recovery error is then given in the following theorem.

Theorem 10.

Suppose that $\hat{X}$ is the estimate of X obtained by the DRMC algorithm; then, one has

\begin{matrix} {‖\hat{X} - X‖}_{F} \leq \frac{r \bar{λ} + 2 ξ μ}{(1 - η) (1 + 2 μ)} . \end{matrix}

(41)

Proof.

When the iterative process in Algorithm 1 has converged, $X_{n e w}$ will be equal to $X_{o l d}$ . Hence, $\hat{X}$ will be the fixed point of the following:

\begin{array}{l} \hat{X} \\ = S_{\bar{λ} L^{- 1}} [(I - 2 μ L^{- 1} D_{2}^{T} D_{2}) \hat{X} - L^{- 1} M \circ (\hat{X} - X)] . \end{array}

(42)

Therefore, we have

\begin{array}{l} {‖\hat{X} - X‖}_{F} = {‖S_{\bar{λ} L^{- 1}} [(I - 2 μ L^{- 1} D_{2}^{T} D_{2}) \hat{X} - L^{- 1} M \circ (\hat{X} - X)] - X‖}_{F} \\ = ‖S_{\bar{λ} L^{- 1}} [(I - 2 μ L^{- 1} D_{2}^{T} D_{2}) \hat{X} - L^{- 1} M \circ (\hat{X} - X)] - S_{\bar{λ} L^{- 1}} [(I - 2 μ L^{- 1} D_{2}^{T} D_{2}) X] \\ \quad + S_{\bar{λ} L^{- 1}} [(I - 2 μ L^{- 1} D_{2}^{T} D_{2}) X] - X‖_{F} \\ \leq {‖S_{\bar{λ} L^{- 1}} [(I - 2 μ L^{- 1} D_{2}^{T} D_{2}) \hat{X} - L^{- 1} M \circ (\hat{X} - X)] - S_{\bar{λ} L^{- 1}} [(I - 2 μ L^{- 1} D_{2}^{T} D_{2}) X]‖}_{F} \\ \quad + {‖S_{\bar{λ} L^{- 1}} (X) - X‖}_{F} + 2 μ L^{- 1} {‖D_{2}^{T} D_{2} X‖}_{F} . \end{array}

(43)

Combining (43) and Lemma 8, we have

\begin{array}{l} {‖\hat{X} - X‖}_{F} \\ \leq {‖(I - 2 μ L^{- 1} D_{2}^{T} D_{2}) (\hat{X} - X) - L^{- 1} M \circ (\hat{X} - X)‖}_{F} \\ + {‖S_{\bar{λ} L^{- 1}} (X) - X‖}_{F} + 2 μ L^{- 1} {‖D_{2}^{T} D_{2} X‖}_{F} \\ = {‖(I - A) (\hat{X} - X)‖}_{F} + {‖S_{\bar{λ} L^{- 1}} (X) - X‖}_{F} \\ + 2 μ L^{- 1} {‖D_{2}^{T} D_{2} X‖}_{F}, \end{array}

(44)

where $I$ is the identity operator and $A$ is the operator defined in (34).

Then, applying Assumption 6 and Lemma 9 to (44), we have

\begin{matrix} {‖\hat{X} - X‖}_{F} \leq {‖(I - A) (\hat{X} - X)‖}_{F} + \bar{λ} L^{- 1} r + 2 μ L^{- 1} ξ . \end{matrix}

(45)

Combining Assumption 7, (24), and (45), we finally obtain the following error bound:

\begin{matrix} {‖\hat{X} - X‖}_{F} \leq \frac{r \bar{λ} + 2 ξ μ}{(1 - η) L} = \frac{r \bar{λ} + 2 ξ μ}{(1 - η) (1 + 2 μ)} . \end{matrix}

(46)

Let E represent the upper bound of the recovery error. Then, according to Theorem 10, $E = (r \bar{λ} + 2 ξ μ) / [(1 - η) (1 + 2 μ)]$. Note that, with the other quantities fixed, $E (\bar{λ})$ is an increasing function of $\bar{λ}$. So, we expect the actual recovery error of the DRMC algorithm to increase with $\bar{λ}$, which is confirmed later by simulation results in Section 8.3.
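The bound of Theorem 10 is simple to evaluate once r, ξ, and η are known. The sketch below only illustrates how E depends on $\bar{λ}$; the numerical values of r, ξ, μ, and η are hypothetical and are not estimates for the datasets in this paper.

```python
def error_bound(r, lam_bar, xi, mu, eta):
    """Upper bound E of (41): (r * lam_bar + 2 * xi * mu) / ((1 - eta) * (1 + 2 * mu))."""
    if not 0.0 <= eta < 1.0:
        raise ValueError("Assumption 7 requires 0 <= eta < 1")
    return (r * lam_bar + 2.0 * xi * mu) / ((1.0 - eta) * (1.0 + 2.0 * mu))

# Hypothetical values chosen only to show the trend in lam_bar.
E_small = error_bound(r=10, lam_bar=0.001, xi=0.5, mu=1.0, eta=0.9)
E_large = error_bound(r=10, lam_bar=0.1, xi=0.5, mu=1.0, eta=0.9)
```

With all other quantities fixed, E grows linearly in $\bar{λ}$ with slope $r / [(1 - η) (1 + 2 μ)]$, so `E_large` exceeds `E_small`.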

8. Evaluation

We designed a data gathering scheme based on the proposed DRMC algorithm. The data gathering procedure is similar to that in [11]. Firstly, the sink node broadcasts a sampling rate to all sensor nodes. Secondly, each sensor node randomly and independently decides whether to forward its readings to the sink according to the sampling rate. Finally, the sink node collects the incomplete data matrix and uses DRMC to recover the missing data. After implementing this data gathering scheme in MATLAB, we carried out extensive experiments on three real-world datasets (as shown in Table 1) to evaluate the effectiveness of DRMC.

8.1. Baseline Methods

We select two state-of-the-art methods to compare with DRMC. The first method is Compressive Sensing (CS). We choose the DCT matrix defined in (9) to serve as the orthonormal basis in CS. The second method is Spatiotemporal Compressive Data Collection (STCDG). The parameters of STCDG are set to $λ = 0.5$ , $r = 10$ . Note that since our earlier work, namely, DCT-RPMC, depends on historical data while the proposed algorithm does not, we do not select DCT-RPMC as the baseline method.

8.2. Recovery Accuracy

Firstly, we compared the recovery accuracy of the proposed algorithm with two baseline methods described above. The parameters of DRMC are listed in Table 2.

Table 2

Parameter settings for DRMC.

Parameter name	$\bar{λ}$	μ	k
Default value	0.001	1	10

Simulation experiments are carried out on three real-world datasets. Each simulation is conducted for 100 independent trials. The recovery errors are computed according to (6) and are averaged over the 100 trials.

Comparison results are shown in Figures 3–5. For experiments on Intel Temperature Trace, all methods achieve nearly the same recovery accuracy when the sampling rate is high. When the sampling rate is below a certain value ($τ < 0.1$), the recovery performance of the baseline methods deteriorates quickly, while DRMC still achieves a good recovery accuracy. When the sampling rate is as low as 0.03, which means that 97% of the data are missing, DRMC can reconstruct the lost data with a recovery error of less than 10%, while the recovery error of CS and STCDG is close to 100%.

Figure 3

Recovery accuracy on Intel Temperature Trace.

Figure 4

Recovery accuracy on Intel Humidity Trace.

Figure 5

Recovery accuracy on PARED Temperature Trace.

Comparison results on Intel Humidity Trace are very similar to that on Intel Temperature Trace. Recovery error of DRMC is about 9% when the sampling rate is 0.03, which is noticeably better than that of baseline methods.

For experiments on PARED Temperature Trace, DRMC still outperforms the baseline methods at low sampling rates. The recovery error of DRMC is about $16\%$ when the sampling rate is 0.03, which is slightly worse than on the other two traces. This is because the low-rankness and DCT compactness features of PARED Temperature Trace are weaker than those of the other two traces, as shown in Figures 1 and 2.

8.3. Parameter Settings

The DRMC algorithm depends on several input parameters. Clearly, the choice of these parameters will affect the recovery performance of DRMC. In this subsection, we discuss how to choose the parameters for DRMC.

The nuclear norm regularization parameter λ is an important parameter that enforces the low-rank feature of the reconstructed data matrix. In DRMC, we adopted a warm-start strategy for λ, in which λ is linearly reduced from $λ_{1}$ to $λ_{K}$ . In the implementation, $λ_{1} = 10 \bar{λ} + 1$ , $λ_{K} = \bar{λ}$ , and $K = 20$ . We tested DRMC on a range of values of $\bar{λ}$ to investigate how $\bar{λ}$ affects the performance of DRMC. Figure 6 shows the experimental results. The recovery errors of DRMC increase with $\bar{λ}$ , as predicted in Section 7 by the theoretical error bound (41). Note that when $\bar{λ} < 0.1$ , the recovery performance of DRMC is not sensitive to $\bar{λ}$ . So, in practice, we simply set it to a sufficiently small value, $\bar{λ} = 0.001$ .
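The warm-start schedule described above can be written out as a short Python sketch. The function name `lambda_schedule` is hypothetical; the endpoints and the number of steps follow the values stated in the text.

```python
import numpy as np

def lambda_schedule(lam_bar, K=20):
    """Linearly decreasing warm-start schedule from lambda_1 to lambda_K."""
    lam_1 = 10.0 * lam_bar + 1.0   # starting value, as stated in the text
    lam_K = lam_bar                # final value
    return np.linspace(lam_1, lam_K, K)

schedule = lambda_schedule(0.001)  # default setting from Table 2
```

With the default $\bar{λ} = 0.001$, the schedule starts at 1.01 and ends at 0.001 in 20 equally spaced steps.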

Figure 6

Effect of parameter $\bar{λ}$ .

Parameter μ is a regularization parameter that enforces the DCT compactness feature of the recovered signal. Figure 7 shows the effect of μ on the recovery performance of DRMC. The recovery errors decrease as μ increases and become stable when $μ > 0.1$ . So, we use $μ = 1$ in our experiments.

Figure 7

Effect of parameter μ.

Recall that k represents the number of concentrated DCT coefficients. As discussed in Section 4.3, the first 10 DCT coefficients concentrate 99% of the total energy. As expected, Figure 8 shows that the recovery errors drop quickly as k grows and become stable when $k > 10$ . So, we use $k = 10$ in our experiments. Note that the recovery errors increase slightly with k when $k > 10$ . We explain this by considering the extreme case $k = m$ . If $k = m$ , $D_{1}$ is equal to D. As a result, the DCT regularization term ${‖D_{2} X‖}_{F}^{2}$ in (15) is identically zero. Therefore, the DCT compactness property of the sensory data is not utilized and the recovery errors increase, just as in Figure 7 when $μ = 0$ .
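The extreme case $k = m$ can be checked numerically with the following Python sketch. It assumes that D is the standard orthonormal DCT-II matrix of size m × m and that $D_{1}$ and $D_{2}$ are its first k and remaining m − k rows; the definition in (9) is not restated here, so this form is an assumption. The names `dct_matrix` and `dct_penalty` and the synthetic data are hypothetical.

```python
import numpy as np

def dct_matrix(m):
    """Orthonormal DCT-II matrix (assumed form of D)."""
    i = np.arange(m)[:, None]      # frequency index (row)
    j = np.arange(m)[None, :]      # sample index (column)
    D = np.sqrt(2.0 / m) * np.cos(np.pi * (2 * j + 1) * i / (2 * m))
    D[0, :] = np.sqrt(1.0 / m)     # DC row has a different scaling
    return D

def dct_penalty(X, k):
    """Squared Frobenius norm of D_2 X, where D_2 holds rows k..m-1 of D."""
    D2 = dct_matrix(X.shape[0])[k:, :]
    return float(np.sum((D2 @ X) ** 2))

m, n = 64, 8
t = np.linspace(0.0, 1.0, m)[:, None]
X = 20.0 + 2.0 * np.sin(2 * np.pi * t + np.arange(n)[None, :])  # smooth synthetic signals

tail_fraction = dct_penalty(X, 10) / float(np.sum(X ** 2))  # energy outside the first 10 coefficients
penalty_full = dct_penalty(X, m)                            # D_2 is empty when k = m
```

When `k` equals `m`, `D2` has no rows, so the penalty is zero for every `X` and the regularizer carries no information, which matches the explanation above.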

Figure 8

Effect of parameter k.

8.4. Energy Consumption and Network Lifespan

The DRMC-based data gathering protocol is more energy efficient because it transmits fewer packets than the classic one (receiving and forwarding). As a result, the DRMC-based protocol saves energy and prolongs the lifespan of wireless sensor networks. To verify this, simulation experiments are conducted with the configuration shown in Table 3.

Table 3

Simulation configuration for network lifespan.

Parameter name	Value
Number of nodes	1000
Initial energy	1 J
Sampling period	10 seconds
Data size	16 bits
$E_{Tx}$	100 nJ/bit
$E_{Rx}$	120 nJ/bit
$E_{Amp}$	0.01 nJ/(bit⋅m²)

In the simulation, sensor nodes are randomly deployed in a 500 m × 500 m area and the sink node is placed at the center. Each sensor node has an initial energy of 1 J. To evaluate the energy consumption, we adopt the following energy model [29]:

\begin{matrix} E_{T}(k, d) = \begin{cases} (E_{Tx} + d^{2} \times E_{Amp}) \times k, & \text{if } d < d_{Thres}, \\ (E_{Tx} + d^{4} \times E_{Amp}) \times k, & \text{if } d \geq d_{Thres}, \end{cases} \\ E_{R}(k) = k E_{Rx}. \end{matrix}

(47)

$E_{T}(k, d)$ represents the energy consumed in transmitting k bits of data over distance d. $E_{R}(k)$ denotes the energy consumed in receiving k bits of data. $E_{Tx}$ is the energy consumed by the transmitting circuit to process 1 bit of data. $E_{Rx}$ is the energy consumed by the receiving circuit to process 1 bit of data. $E_{Amp}$ is the energy consumed by the power amplifying circuit.
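The energy model (47) can be sketched in Python with the constants of Table 3. The threshold distance $d_{Thres}$ is not listed in Table 3, so it is left as a parameter here, and the value of 100 m used in the example is purely hypothetical. Following (47) as written, the same $E_{Amp}$ constant is used in both branches.

```python
E_TX = 100e-9    # J/bit, transmitting circuit (Table 3)
E_RX = 120e-9    # J/bit, receiving circuit (Table 3)
E_AMP = 0.01e-9  # J/(bit*m^2), power amplifier (Table 3)

def tx_energy(bits, d, d_thres):
    """Energy in joules to transmit `bits` bits over distance `d` metres, per (47)."""
    exponent = 2 if d < d_thres else 4   # d^2 below the threshold, d^4 at or above it
    return (E_TX + d ** exponent * E_AMP) * bits

def rx_energy(bits):
    """Energy in joules to receive `bits` bits, per (47)."""
    return bits * E_RX

# One 16-bit reading sent over 50 m; d_thres = 100 m is a hypothetical value.
e_tx = tx_energy(16, 50.0, d_thres=100.0)
e_rx = rx_energy(16)
```

Under these assumptions, forwarding a single 16-bit reading over 50 m costs about 2 µJ to transmit and 1.92 µJ to receive.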

Figure 9 shows the network lifespan of the DRMC-based protocol and the baseline protocols under different sampling rates. Here the network lifespan is defined as the time at which the first node exhausts its energy. The sampling rate plays no role in classic data gathering, which transmits all data without compression, so the lifespan curve of the classic protocol in Figure 9 is a horizontal line. For the CS method, the lower the sampling rate, the fewer measurements are taken, and, as shown in Figure 9, the lifespan of CS decreases as the sampling rate increases. Moreover, when the sampling rate exceeds a certain value, the lifespan of CS is even shorter than that of the classic protocol. The reason why CS performs poorly at high sampling rates is analyzed in [30]. Figure 9 also shows that the DRMC-based protocol achieves the longest lifespan. As with CS, the lifespan of DRMC decreases as the sampling rate increases. When the sampling rate increases to 1, the DRMC-based protocol is equivalent to the classic protocol and their lifespans are equal. Note that the lifespan of STCDG is exactly the same as that of DRMC, because both methods are based on matrix completion. The lifespan of DRMC is longer than that of CS because the sampling matrix in DRMC is much sparser than that in CS.

Figure 9

Relation of network lifespan to sampling rate.

9. Conclusion

In this paper, we studied the data gathering and reconstruction problem in WSNs. We modeled the problem as a matrix completion problem and investigated the data features of real WSN datasets. Then, by taking advantage of the low-rankness and DCT compactness features of WSN data, we proposed a DCT Regularized Matrix Completion (DRMC) algorithm to reconstruct the missing data. The recovery error of DRMC is carefully analyzed and a theoretical upper bound on the error is presented. Experimental results show that DRMC outperforms state-of-the-art methods at low sampling rates and achieves a good recovery accuracy even when the sampling rate is very low.

Summary of Notations

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work is funded by the National Natural Science Foundation of China under Grant no. 61371135.

References

1. Baggio. Wireless sensor networks in precision agriculture. Proceedings of the ACM Workshop on Real-World Wireless Sensor Networks (REALWSN ′05), Stockholm, Sweden, 2005.

2. Milenković, Otto, Jovanov. Wireless sensor networks for personal health monitoring: issues and an implementation. Computer Communications, vol. 29, no. 13-14, pp. 2521–2533, 2006. DOI 10.1016/j.comcom.2006.02.011, Scopus 2-s2.0-33746764996.

3. Liu, Zhao, Tang, S.-J., X.-Y., Dai. Canopy closure estimates with greenorbs: sustainable sensing in the forest. Proceedings of the 7th ACM Conference on Embedded Networked Sensor Systems, ACM, November 2009, pp. 99–112. DOI 10.1145/1644038.1644049, Scopus 2-s2.0-74549201690.

4. Madden, S. R., Franklin, M. J., Hellerstein, J. M., Hong. TinyDB: an acquisitional query processing system for sensor networks. ACM Transactions on Database Systems, vol. 30, no. 1, pp. 122–173, 2005. DOI 10.1145/1061318.1061322, Scopus 2-s2.0-23944487783.

5. Candes, E. J., Tao. Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Transactions on Information Theory, vol. 52, no. 12, pp. 5406–5425, 2006. DOI 10.1109/tit.2006.885507, MR2300700, Scopus 2-s2.0-33947416035.

6. Donoho, D. L. Compressed sensing. IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006. DOI 10.1109/tit.2006.871582, MR2241189, Scopus 2-s2.0-33645712892.

7. Haupt, Bajwa, W. U., Rabbat, Nowak. Compressed sensing for networked data. IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 92–101, 2008. DOI 10.1109/msp.2007.914732, Scopus 2-s2.0-41949106208.

8. Luo, Sun, Chen, C. W. Efficient measurement generation and pervasive sparsity for compressive data gathering. IEEE Transactions on Wireless Communications, vol. 9, no. 12, pp. 3728–3738, 2010. DOI 10.1109/TWC.2010.092810.100063, Scopus 2-s2.0-78650203708.

9. Quer, Masiero, Pillonetto, Rossi, Zorzi. Sensing, compression, and recovery for WSNs: sparse signal modeling and monitoring framework. IEEE Transactions on Wireless Communications, vol. 11, no. 10, pp. 3447–3461, 2012. DOI 10.1109/twc.2012.081612.110612, Scopus 2-s2.0-84867899154.

10. Candès, E. J., Recht. Exact matrix completion via convex optimization. Foundations of Computational Mathematics, vol. 9, no. 6, pp. 717–772, 2009. DOI 10.1007/s10208-009-9045-5, MR2565240, Scopus 2-s2.0-71049116435.

11. Cheng, Jiang, Wang. STCDG: an efficient data gathering algorithm based on matrix completion for wireless sensor networks. IEEE Transactions on Wireless Communications, vol. 12, no. 2, pp. 850–861, 2013. DOI 10.1109/twc.2012.121412.120148, Scopus 2-s2.0-84874989424.

12. Kong, Xia, Liu, X.-Y., Chen, M.-Y., Liu. Data loss and reconstruction in wireless sensor networks. IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 11, pp. 2818–2828, 2014. DOI 10.1109/TPDS.2013.269, Scopus 2-s2.0-84908083605.

13. Zhang, Roughan, Willinger, Qiu. Spatio-temporal compressive sensing and internet traffic matrices. ACM SIGCOMM Computer Communication Review, vol. 39, no. 4, pp. 267–278, 2009. DOI 10.1145/1594977.1592600.

14. Luo, Sun, Chen, C. W. Compressive data gathering for large-scale wireless sensor networks. Proceedings of the 15th Annual International Conference on Mobile Computing and Networking, ACM, September 2009, pp. 145–156. DOI 10.1145/1614320.1614337, Scopus 2-s2.0-70450284408.

15. Ebrahimi, Assi. A distributed method for compressive data gathering in wireless sensor networks. IEEE Communications Letters, vol. 18, no. 4, pp. 624–627, 2014. DOI 10.1109/LCOMM.2014.030114.132728, Scopus 2-s2.0-84899639020.

16. Wan, Yao, Bao. Partial matrix completion algorithm for efficient data gathering in wireless sensor networks. IEEE Communications Letters, vol. 19, no. 1, pp. 54–57, 2015. DOI 10.1109/lcomm.2014.2371998, Scopus 2-s2.0-84921419036.

17. Intel Lab Data. http://db.lcs.mit.edu/labdata/labdata.html

18. Feng, Chen. PARED: a testbed with parallel reprogramming and multi-channel debugging for WSNs. Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC ′13), IEEE, Shanghai, China, April 2013, pp. 4630–4635. DOI 10.1109/wcnc.2013.6555325, Scopus 2-s2.0-84881585001.

19. Boyd, Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, UK, 2009.

20. Beck, Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009. DOI 10.1137/080716542, MR2486527, ZBL1175.94009.

21. Toh, K.-C., Yun. An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems. Pacific Journal of Optimization, vol. 6, no. 3, pp. 615–640, 2010. MR2743047.

22. An accelerated gradient method for trace norm minimization. Proceedings of the 26th Annual International Conference on Machine Learning (ICML ′09), ACM, June 2009, pp. 457–464. DOI 10.1145/1553374.1553434, Scopus 2-s2.0-71149103464.

23. Cai, J.-F., Candès, E. J., Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010. DOI 10.1137/080738970, MR2600248, Scopus 2-s2.0-77951291046.

24. Golub, G. H., van Loan, C. F. Matrix Computations, vol. 3. JHU Press, 2012.

25. Tseng. On accelerated proximal gradient methods for convex-concave optimization. Submitted to The SIAM Journal on Optimization.

26. Larsen, R. M. Propack-software for large and sparse SVD calculations. 2004, http://sun.stanford.edu/~rmunk/PROPACK/

27. Lin, Chen. The augmented lagrange multiplier method for exact recovery of corrupted low-rank matrices. http://arxiv.org/abs/1009.5055

28. Goldfarb, Chen. Fixed point and Bregman iterative methods for matrix rank minimization. Mathematical Programming, vol. 128, no. 1-2, pp. 321–353, 2011. DOI 10.1007/s10107-009-0306-5, MR2810961, Scopus 2-s2.0-79957957723.

29. Heinzelman, W. B., Chandrakasan, A. P., Balakrishnan. An application-specific protocol architecture for wireless microsensor networks. IEEE Transactions on Wireless Communications, vol. 1, no. 4, pp. 660–670, 2002. DOI 10.1109/TWC.2002.804190, Scopus 2-s2.0-33646589837.

30. Luo, Xiang, Rosenberg. Does compressed sensing improve the throughput of wireless sensor networks? Proceedings of the IEEE International Conference on Communications (ICC ′10), IEEE, May 2010, pp. 1–6. DOI 10.1109/icc.2010.5502565, Scopus 2-s2.0-77955383779.