Sage Journals: Discover world-class research

Abstract

Data gathering is one of the most important operations in many wireless sensor networks (WSNs) applications. In order to implement data gathering, a tree structure rooted at the sink is usually defined. In most wireless sensor networks, nodes are powered by batteries with limited energy. Prolonging network lifetime is a critical issue for WSNs. As a technique for signal processing, compressed sensing (CS) is being increasingly applied to wireless sensor networks for saving energy. Compressive sensing can reduce the number of data transmissions and balance the traffic load throughout networks. In this paper, we investigate data gathering in wireless sensor networks using CS and aim at constructing a maximum-lifetime data-gathering tree. The lifetime of the network is defined as the number of data-gathering rounds until the first node depletes its energy. Based on the hybrid-CS data-gathering model, we first construct an arbitrary data-gathering tree and then use the random switching decision and optimal parent node selecting strategy to adjust the load of the bottleneck node and prolong the network lifetime. Simulation results show that the proposed algorithm outperforms several existing approaches in terms of network lifetime.

1. Introduction

Wireless sensor networks (WSNs) consist of a great number of nodes that sense the environment and collaboratively work to process and route the sensory data. They have a great application value in the fields of habitat monitoring, medical care, battlefield monitoring, and so on. Data gathering is a basic operation in WSNs, where sensors are responsible for collecting all sensory data and delivering them to the sink node [1]. Typically, sensors have strictly limited communication abilities and energy resources; therefore, how to reduce energy consumption in data gathering so as to prolong network lifetime is an important issue [2].

The problem of energy efficient data gathering in WSNs has been extensively investigated. Data-gathering methods based on cluster and tree are proposed in many literatures [3, 4]. The goal of such methods is to construct a network topology in order to use energy resources of sensor node effectively. However, these methods cannot overcome the “hot pot” problem; that is, nodes close to the sink would suffer from heavy loads and their energy would be exhausted soon, which would shorten the network lifetime. Data aggregation [5] and source coding technique [6] are two efficient methods to overcome the “hot pot” problem. However, data aggregation adopts a simple aggregation function that only extracts certain statistical information from the sensory data. The sink node cannot recover all sensory data. Hence, this technique can only be applied to particular applications that require partial information from WSNs. The distributed source coding technique performs noncollaborative data compression at the source nodes; it is not exactly practical in WSNs either due to the lack of prior knowledge of the data correlation or because of resulting in high communication load.

Compressed sensing is a signal processing technique for efficiently acquiring and reconstructing a signal, by finding solutions to underdetermined linear systems [7]. This technique promises to successfully recover the original signal from far fewer measurements than its original dimension, as long as the signal is sparse or compressible in some domain. Recently, the growing interest in CS has inspired many works in WSNs [8–11]. Many researches show that the CS technique suits well data gathering in WSNs because the sensory data is generally quite sparse in nature [8]. The authors in [12] made the first effort to speculate the potential of using CS theory for data gathering in multihop WSNs. However, they have not proposed a real scheme based on this initial idea. The authors in [8] firstly presented a complete framework for data gathering based on CS in large-scale WSNs where a large number of sensor nodes are densely deployed and sensor readings are spatially correlated. In Compressive Data Gathering (CDG) proposed in [8], the sensory data are transmitted as weighted sums on a routing path so as to reduce global communication cost without introducing intensive computation or complicated transmission control. However, CDG can result in many unnecessary data transmissions during data-gathering procedure.

In [13], the authors introduced two data-gathering models, that is, plain-CS and hybrid-CS. The plain-CS model which is the same as the CDG scheme uses CS encoding for all nodes in the network. The hybrid-CS model applies CS only to relay nodes that are overloaded. The hybrid-CS model marries the merits of nonaggregation data gathering and plain-CS scheme. Recently, Xiang et al. in [14] introduced a Minimum Energy Compressed Data Aggregation (MECDA) algorithm to implement the data gathering in WSNs based on hybrid-CS technique. The aim of MECDA is to allocate the traffic load properly in a given data-gathering round, through joint routing and aggregator assignment, such that the total energy consumption is minimized. Unfortunately, MECDA only takes into account the overall energy consumption of sensor network and does not consider balancing the energy consumption of each node; therefore, the energy of some sensor nodes may be exhausted quickly, which will affect the network lifetime eventually. Therefore, it is significant for the researchers to design a maximum-lifetime data-gathering method which balances the energy consumption of sensor nodes by using CS theory in designing of underlying routing.

In this paper, we propose a method of constructing maximum-lifetime data-gathering tree (MLDGT) in WSNs based on the hybrid-CS data-gathering model. The contributions of this paper are as follows: (1)

We first define the problem of maximum-lifetime data-gathering tree in WSNs based on compressed sensing, prove that it is NP-complete, and then analyze the characterizations of the optimal solutions.

(2)

We propose a novel algorithm, which used random switching and optimal parent node selecting strategy to keep transferring descendants of bottleneck node to realize load balancing and extending the lifetime of WSNs.

(3)

Extensive experiments have been conducted to verify that our algorithm achieves higher lifetime than existing approaches.

The rest of the paper is organized as follows. In Section 2, we give a brief overview of applying compressed sensing in data gathering of WSNs. Section 3 formally defines the problem of maximum-lifetime data-gathering tree and analyzes the complexity of the problem. The proposed algorithms are detailed and analyzed in Section 4. Simulation results are presented in Section 5 and conclusions are drawn in Section 6.

2. Preliminaries

2.1. Compressed Sensing Basics

In CS framework, the real-valued signal $X \in R^{N}$ can be reconstructed from a far fewer measurements than its original dimension N as long as the signal X is sparse or compressible in a certain domain. Let $X = [x_{1}, x_{2}, \dots, x_{N}]^{T}$ be N-dimensional original data that can be represented in some orthonormal basis $Ψ = {ψ_{i}}_{i = 1}^{N} (l e t ψ_{i}$ be N-dimensional column vector) as $X = \sum_{i = 1}^{N} θ_{i} ψ_{i} = Ψ θ$ . If the coefficient vector $θ = [θ_{1}, θ_{2}, \dots, θ_{N}]^{T}$ is k-sparse, that is, there are only k ( $k ≪ N$ ) nonzero entries in vector $θ$ , the signal is k-sparse in basis $Ψ$ . Let $Φ = [ϕ_{1}, ϕ_{2}, \dots, ϕ_{N}]$ be the $M \times N$ -dimensional measurement matrix whose row vectors are largely incoherent with $Ψ$ . Suppose Y is the measurements of the k-sparse X; that is, $Y = Φ X = Φ Ψ θ$ .

According to CS theory, we can recover X from Y by solving the following optimization problem:

\begin{matrix} \min_{θ \in R^{N}} {‖θ‖}_{0} \\ s.t. Y = Φ Ψ θ . \end{matrix}

(1)

Suppose

\hat{θ}

is the optimal solution of the optimization problem (1). The recovery signal of X can be obtained by

\begin{matrix} \hat{X} = Ψ \hat{θ} . \end{matrix}

(2)

The difference between the original signal X and recovery signal

\hat{X}

represents the practical performance of CS coding, which depends on the scarcity of the original signal, as well as on the reconstruction algorithm. Because the sensory data of WSNs is generally quite sparse in nature, CS suits well data gathering in WSNs [15, 16]. In this paper, we define the ratio of the dimension of measurement vector Y to the dimension of original sensory data vector X as sampling rate λ; that is,

λ = M / N

. The sampling rate is an important parameter that affects the energy consumption of nodes in a data-gathering tree that is constructed based on CS.

2.2. Overview of Data Gathering Based on Compressed Sensing in WSNs

For the traditional data-gathering method of WSNs, nodes around the sink need to transmit more raw data and consume more energy than the downstream nodes of the sensor network. The unbalanced energy consumption has a major impact on network lifetime. In [8], Luo et al. proposed CDG for data gathering in large-scale WSNs which is an efficient way to alleviate the bottleneck.

To illustrate the idea of data gathering based on CS, we rewrite the CS coding as

\begin{matrix} Y = Φ X = x_{1} ϕ_{1} + x_{2} ϕ_{2} + \dots + x_{i} ϕ_{i} + \dots + x_{n} ϕ_{n}, \end{matrix}

(3)

where

x_{i}

is sensory data of node

v_{i}

and

ϕ_{i}

i t h

column vector of measurement matrix

Φ

. In CDG, instead of individual readings, the weighted sums of sensor readings are delivered to the sink. In the process of data gathering, node

v_{i}

first multiplies its reading

x_{i}

with

ϕ_{i}

and then adds the product to the M-

d i m e n s i o n a l

column vector

\sum_{j = 1}^{i + 1} x_{j} ϕ_{j}

which is received from children nodes. The sum

\sum_{j = 1}^{i} x_{j} ϕ_{j} = \sum_{j = 1}^{i - 1} x_{j} ϕ_{j} + x_{i} ϕ_{i}

will be sent to the parent node of

v_{i}

. If each node performs this operation, the compressed measurement will be completed during the process of data gathering. Eventually, the sink collects the M-

d i m e n s i o n

vector Y, and then the decoding algorithm is used to recover the N original sensory data. The method mentioned above is referred to as plain-CS model. In plain-CS data gathering, each node needs to send exactly an N-

d i m e n s i o n a l

vector irrespective of what it has received; there is no “hot pot” phenomenon.

However, the plain-CS model may lead to unnecessary higher traffic at the earlier stage of data gathering. Therefore, Luo et al. in [13] proposed the hybrid-CS data-gathering model. The main idea of hybrid-CS model is that the non-CS scheme is applied in the earlier stages of data gathering, and the CS-based compression is only applied at node whose incoming traffic intensity becomes larger than M or equal to M. The data-gathering process based on hybrid-CS model is depicted in Figure 1 through a simple chain-type topology. There are N sensor nodes ${v_{1}, v_{2}, \dots, v_{N}}$ , and each node senses one piece of data in a round of data gathering. These data need to be transmitted to the sink node. During the transmission, sensor nodes $v_{1}, v_{2}, \dots, v_{M}$ transmit the raw sensory data $x_{1}, x_{2}, \dots, x_{M}$ directly, and the amount of transmitting data of each node is no more than M. But nodes $n_{M + 1}, n_{M + 2}, \dots, n_{N}$ transmit data based on CS; that is, node $v_{L} (M < L < N)$ transmits data $\sum_{i = 1}^{L} x_{i} ϕ_{i}$ , and the transmission load of each node in set ${n_{M + 1}, n_{M + 2}, \dots, n_{N}}$ is equal to M.

Figure 1

Hybrid-CS data gathering in a chain-type topology.

In this paper, we focus on maximizing the lifetime of data-gathering trees based on hybrid-CS model. As can be seen from the above analysis, the energy consumption characteristic of sensor node brings about a new challenge for constructing a maximum-lifetime data-gathering tree based on hybrid-CS model.

3. System Model and Problem Formulation

3.1. System Model

We consider a static wireless sensor network with N sensor nodes ${v_{1}, v_{2}, \dots, v_{N}}$ and one sink node S. All sensor nodes are randomly deployed over an $A \times A$ Euclidean plane to continuously monitor the environment and periodically report data to the sink. We use an undirected graph $G (V, E)$ to represent the WSNs, where V is the set of all nodes and E is the set of edges. The underlying communication graph is a unit disk graph. All nodes have the same transmission range r. There is an edge $(v_{i}, v_{j}) \in E$ whenever the Euclidean distance between nodes $v_{i}$ and $v_{j}$ satisfies $‖v_{i} - v_{j}‖ \leq r$ . Each node has unique initial energy. Assume that $E (v_{i})$ denotes the initial energy of node $v_{i}$ . The sink is AC-powered which has infinite power supply and powerful computation ability. Since data communication is the dominant energy consumption in sensor network, we can only take into account the energy consumption of communication. At each data-gathering round, each node generates a reading which is referred to as one-unit data. We denote by $E_{R x}$ and $E_{T x}$ the energy consumption of one-unit data transmission and reception, respectively.

The architecture for hybrid-CS data gathering that we refer to in [13, 14, 17] is shown in Figure 2. A spanning tree rooted at sink is adopted to gather the readings of the whole network. In Figure 2, the arrowed line represents the transmission link, and the data on the left side of each line is ready for transmitting. For leaf nodes, such as nodes $v_{1}$ ~ $v_{i}$ and $v_{i + 3}$ ~ $v_{i + 6}$ , the original sensory data will be transmitted to their parents. For other nodes, the outgoing traffic has two possibilities: (1)

When the number of data received from node v's children is less than M, where M is the dimension of column vector $ϕ_{i}$ , node v forwards the received and its owns data directly. The node v is referred to as forwarding node. In Figure 2, the outgoing traffic of node $v_{i + 7}$ is data set ${x_{i + 3}, \dots, x_{i + 6}, x_{i + 7}}$ , in which $x_{i + 7}$ is its own sensory data; the others are received from its children.

(2)

When the number of data received from node v's children nodes is more than M, node v performs the CS coding according to (3) and transmits exactly M encoded data corresponding to the aggregated column vector. We call node v the aggregating node. In Figure 2, node $v_{i + 1}$ is an aggregating node and performs the CS coding $\sum_{j = 1}^{i + 1} x_{j} ϕ_{j}$ . The outgoing traffic of $v_{i + 1}$ is always M.

Figure 2

The architecture for hybrid-CS data gathering.

Based on the above analysis, in a given data-gathering tree T, the outgoing traffic of node v is

\begin{matrix} R_{T x} (T, v) = \{\begin{cases} \sum_{u \in C h (v)} R_{T x} (T, u) + 1, & if \sum_{u \in C h (v)} R_{T x} (T, u) < M; \\ M, & otherwise, \end{cases} \end{matrix}

(4)

where

C h (v)

is the set of v's children nodes. Thus, the energy consumption of node v in each data-gathering round is

\begin{matrix} C (T, v) = \{\begin{cases} E_{T x} R_{T x} (T, v) + E_{R x} \sum_{u \in C h (v)} R_{T x} (T, u), & if node v is not a leaf node \\ E_{T x}, & otherwise. \end{cases} \end{matrix}

(5)

3.2. Problem Formulation

In this paper, we define a data-gathering round as the process of gathering all data from nodes to the sink, regardless of how much time it takes [18].

The lifetime of a node v in tree T is defined as the number of data-gathering rounds since it began to work until its death:

\begin{matrix} L_{n o d e} (T, v) = \frac{E (v)}{C (T, v)} . \end{matrix}

(6)

We also define the load,

l_{n o d e} (T, v)

, of node v as the ratio of

C (T, v)

E (v)

; that is,

\begin{matrix} l_{n o d e} (T, v) = \frac{C (T, v)}{E (v)} = \frac{1}{L_{n o d e} (T, v)} . \end{matrix}

(7)

The lifetime of a tree T is defined as the lifetime of the first dead node in the tree:

\begin{matrix} L_{t r e e} (T) = \min_{v \in V} \{L_{n o d e} (T, v)\} . \end{matrix}

(8)

For any network G, we refer to

S (G)

as the set of all possible trees for G. The problem of finding a maximum-lifetime data-gathering tree (MLDGT) based on hybrid-CS model can be defined as

\begin{matrix} \max_{T \in S (G)} L_{t r e e} (T) = \max_{T \in S (G)} \min_{v \in V} \{L_{n o d e} (T, v)\} . \end{matrix}

(9)

This problem that is similar to the maximum-lifetime problem proposed in [18–20] is an NP-complete problem. The unique property of the problem is the energy consumption model of the node in data-gathering tree based on hybrid-CS model, which brings about new challenges for constructing data-gathering tree. Since the problem is NP-complete, we try to find an approximate solution to solve the problem. Details of the solution techniques are shown in the next section.

4. Solution Techniques

4.1. Basic Ideal of Solution Techniques

To facilitate the analysis, we first provide some relevant assumptions and definitions in this section. We denote by $P (T, v_{i})$ the path from node $v_{i}$ to sink S in spanning tree T. The expression $v_{j} \in P (T, v_{i})$ represents the notion that node $v_{j}$ is a relaying node on the path $P (T, v_{i})$ . In a spanning tree T, the nodes on the path $P (T, v_{i})$ may have different loads. We define path load, $l_{p a t h} (T, v_{i})$ , of node $v_{i}$ as the maximum load of all nodes along the path from node $v_{i}$ to sink S; that is,

\begin{matrix} l_{p a t h} (T, v_{i}) = \max \{l_{n o d e} (T, v_{j}) ∣ v_{i} \in P (T, v_{j})\} . \end{matrix}

(10)

And the node with the largest load on the path

P (T, v_{i})

is called bottleneck node on the path

P (T, v_{i})

of node

v_{i}

Formula (9) can be transformed into an equivalent form:

\begin{matrix} \max_{T \in S (G)} L_{t r e e} (T) ⟺ \\ \min_{T \in S (G)} \max_{v \in V} \{\frac{1}{L_{n o d e} (T, v)}\} = \min_{T \in S (G)} \max_{v \in V} \{l_{n o d e} (T, v)\} . \end{matrix}

(11)

Let

Q (T)

denote the set of all leaf nodes in tree T. Assume that

q = |Q (T)|

is the number of nodes in set

Q (T)

, and the elements of set

Q (T)

are

u_{1}

u_{2}

,…, and

u_{q}

. Thus, the sensor node set V can be divided into q paths of q leaf nodes; that is,

\begin{matrix} V = P (T, u_{1}) \cup P (T, u_{2}) \cup \dots \cup P (T, u_{p}) = ⋃_{u_{i} \in Q (T)} P (T, u_{i}) . \end{matrix}

(12)

Based on the division of node set V, formula (11) is equal to

\begin{matrix} \underset{T \in S (G)}{m a x} L_{t r e e} (T) ⟺ \\ \min_{T \in S (G)} \max_{v \in V} \{l_{n o d e} (T, v)\} = \min_{T \in S (G)} \max \{l_{n o d e} (T, v) ∣ v \in ⋃_{u_{i} \in Q (T)} P (T, u_{i})\} = \min_{T \in S (G)} \max_{} \{\max_{} \{l_{n o d e} (T, v) ∣ v \in P (T, u_{1})\}, \max \{l_{n o d e} (T, v) ∣ v \in P (T, u_{2})\}, \dots, \max \{l_{n o d e} (T, v) ∣ v \in P (T, u_{p})\}\} = \min_{T \in S (G)} \max_{} \{l_{p a t h} (T, u_{1}), l_{p a t h} (T, u_{2}), \dots, l_{p a t h} (T, u_{p})\} = \min_{T \in S (G)} \max \{l_{p a t h} (T, u_{i}) ∣ u_{i} \in Q (T)\} . \end{matrix}

(13)

Therefore, the MLDGT problem is transformed to construct a spanning tree, in which the maximum path load of leaf nodes is minimal.

Inspired by [19], in order to solve the problem of formula (13), we need to find a spanning tree in which the difference between the maximum path load and the minimum path load of leaf node is minimal. Therefore, the MLDGT problem can be described as follows:

\begin{matrix} \underset{T \in S (G)}{m a x} L_{t r e e} (T) ⟺ \\ \min_{T \in S (G)} \underset{}{\max \{l_{p a t h} (T, u_{i}) ∣ u_{i} \in Q (T)\}} ⟺ \\ \min_{T \in S (G)} \{\max \{l_{p a t h} (T, v_{i}) ∣ v_{i} \in Q (T)\} - \min \{l_{p a t h} (T, v_{i}) ∣ v_{i} \in Q (T)\}\} . \end{matrix}

(14)

For a given connectivity group G, we assume that tree

T_{o p t}

is an optimal spanning tree which provides the maximum lifetime. Let

σ_{o p t}

be the difference between the maximum path load and the minimum path load of leaf node in tree

T_{o p t}

; that is,

\begin{matrix} σ_{o p t} = \max \{l_{p a t h} (T_{o p t}, v_{i}) ∣ v_{i} \in Q (T_{o p t})\} - \min \{l_{p a t h} (T_{o p t}, v_{i}) ∣ v_{i} \in Q (T_{o p t})\} . \end{matrix}

(15)

Each connectivity group G has a unique

σ_{o p t}

which depends on the node's initial energy, network deployment, and network connectivity of WSNs. It is an NP-complete problem for finding

σ_{o p t}

of any given connectivity group.

We can enumerate all spanning trees of graph G and find a spanning tree which minimizes the maximum path load of leaf nodes. However, the number of the spanning trees of graph G can be an exponential function of the number of nodes N. Therefore, we try to find an approximate solution to minimize the maximum path load of leaf node in the data-gathering tree. Details of the algorithm are described in the next section.

4.2. Algorithm Description

In this section, we design an approximate algorithm to deal with the problem raised by formula (13). Our algorithm uses randomized tree transformation and switching method and optimal parent node selecting strategy that make full use of the property of energy consumption in data gathering based on CS to improve convergence speed of the algorithm.

Our algorithm starts from an arbitrary tree T and then keeps transferring descendants of bottleneck node to nodes with smaller path load iteratively. In order to make the algorithm converge, we define a terminate parameter σ as the maximum allowable difference in the path loads among all leaf nodes. If the current spanning tree $T^{'}$ satisfies the condition

\begin{matrix} \max \{l_{p a t h} (T^{'}, v_{i}) ∣ v_{i} \in Q (T^{'})\} - \min_{} \{l_{p a t h} (T^{'}, v_{i}) ∣ v_{i} \in Q (T^{'})\} \leq σ, \end{matrix}

(16)

the iterative operation will be terminated, and

T^{'}

is the final result. It is obvious that the terminate parameter σ is more close to

σ_{o p t}

, and the algorithm result is more close to the optimal solution. Therefore, our algorithm may be superior well under proper choice of terminate parameter σ.

However, if the current spanning tree $T^{'}$ does not satisfy condition (16), we need to continue to adjust the load of bottleneck node. We use Figure 3 to illustrate the details of our algorithm. Figure 3 shows a part of the spanning tree $T^{'}$ . In Figure 3, the solid lines with arrow indicate the directions of data transmission in the tree $T^{'}$ while the dotted lines without arrow represent the edges in graph G. The omitted parts of the tree $T^{'}$ are represented by the dotted lines with arrow. $T_{v_{i}}$ denotes the subtree rooted at node $v_{i}$ .

Figure 3

A part of the spanning tree $T^{'}$ .

In Figure 3, suppose that node $v_{i}$ has the highest load in current tree $T^{'}$ . We need to switch some descendant nodes of $v_{i}$ to new parent node. Node $v_{i}$ has three children nodes, $v_{i + 4}$ , $v_{i + 5}$ , and $v_{i + 6}$ . Those nodes can be selected to switch to an alternate path. If node $v_{i + 4}$ is selected, it may have two potential parent nodes, $v_{i + 1}$ and $v_{i + 2}$ . Assume that node $v_{i + 4}$ is switched to node $v_{i + 1}$ . After switching, the load of node $v_{i + 1}$ increases and may become greater than the load of node $v_{i}$ by more than the terminate parameter σ. Consequently, the path load of leaf nodes in $T_{v_{i + 1}}$ may exceed the path load of leaf nodes in $T_{v_{i}}$ by more than σ. In the next iteration, node $v_{i + 4}$ may be switched to node $v_{i}$ again. In such case, we say that an oscillation has occurred. This phenomenon influences convergence and convergence speed of the algorithm. In order to reduce the influence of the oscillation, we adopt some mechanisms such as random switching decision and optimal parent node selecting strategy.

In the random switching decision, whether a node is selected for switching depends on its switching probability. For a considered node, we generated a random number, 0 or 1, according to its switching probability. If the random number is 1, the corresponding node is selected to switch; otherwise, the node is not selected. It is more likely to generate random number 1 when the switching probability is higher. The initial switching probability for each node is $1 / 2$ . Then, the switching probability $p (v_{i})$ of node $v_{i}$ is decided by the number of switching instances where node $v_{i}$ is selected. The switching probability can be expressed as

\begin{matrix} p (v_{i}) = 2^{- [n (v_{i}) + 1]}, \end{matrix}

(17)

where

n (v_{i})

denotes the switching numbers of node

v_{i}

. When a node has been selected for switching many times, there may be an oscillation that happened on such node. We use formula (17) to give such node with a smaller switching probability, reduce the chance of the node being selected, and give other nodes more opportunities. This method is beneficial to reduce the oscillation.

In Figure 3, if node $v_{i + 4}$ is switched many times and the algorithm cannot converge, the switching probability of node $v_{i + 4}$ will decrease, and the opportunities of other $v_{i}$ 's children node, such as $v_{i + 5}$ or $v_{i + 6}$ , will increase. This feature is advantageous to improve the convergence of the algorithm.

We have used a random switching decision to select a switching node. We will design an optimal parent node selecting strategy for assigning an optimal parent node to the selected switching node. Intuitively, in graph G, each neighbor of a selected switching node is likely to be the parent node of the switching node. However, in fact, only a part of the neighbor nodes are qualified to be the parent eventually. Because the path loads of some neighbor nodes are high originally, the node switching may increase the path load of such nodes, and the oscillation may happen. Therefore, we should choose some neighbor nodes as the potential parent nodes. Suppose that $v_{i}$ is a switching node and its original path load is $l_{p a t h} (T, v_{i})$ in tree T. If $v_{i}$ is switched to a neighbor node $v_{j}$ , the new tree $T^{'}$ is constructed, and the new path load of $v_{i}$ is $l_{p a t h}^{v_{j}} (T^{'}, v_{i})$ . If the new path load of $v_{i}$ is bigger than the original path load, the neighbor node $v_{j}$ cannot become a potential parent node. The set of $v_{i}$ 's potential parents can be determined as follows:

\begin{matrix} P_{v_{i}} = \{v_{j} ∣ l_{p a t h}^{v_{j}} (T^{'}, v_{i}) < l_{p a t h} (T, v_{i}), v_{j} \in {N e i}_{v_{i}}\}, \end{matrix}

(18)

where

{N e i}_{v_{i}}

is the set of

v_{i}

's neighbor nodes.

In neighbor nodes set ${N e i}_{v_{i}}$ , there may be two kinds of node: the first is the forwarding node and the other is the aggregating node. As described in formula (4), when neighbor node $v_{j}$ is an aggregating node, the outgoing traffic of aggregating node $v_{j}$ is a constant value, M. If $v_{i}$ switches to $v_{j}$ , the outgoing traffic of $v_{j}$ is also a constant value M; that is, the load of nodes on the path from $v_{j}$ to sink does not change, and the node load of $v_{j}$ changes. Therefore, the new path load $l_{p a t h}^{v_{j}} (T^{'}, v_{i})$ of $v_{i}$ can be determined as

\begin{matrix} l_{p a t h}^{v_{j}} (T^{'}, v_{i}) = \max (l_{p a t h} (T, v_{j}), l_{n o d e} (T^{'}, v_{j})), \end{matrix}

(19)

where

l_{n o d e} (T^{'}, v_{j})

is the node load of

v_{j}

in the new tree

T^{'}

. Otherwise, if neighbor node

v_{j}

is a forwarding node, the loads of nodes on the path from

v_{j}

to sink change, until a node becomes an aggregating node.

We need to choose a node v from the set $P_{v_{i}}$ to be the ultimate parent of $v_{i}$ . When node $v_{i}$ is switched to the ultimate parent, its new path load is the smallest; that is,

\begin{matrix} v = \arg \min_{v_{j} \in P_{v_{i}}} \{l_{p a t h}^{v_{j}} (T^{'}, v_{i})\} . \end{matrix}

(20)

In Figure 3, if node

v_{i + 5}

is selected to switch,

v_{i + 5}

has three neighbor nodes,

v_{i + 1}

v_{i + 2}

, and

v_{i + 8}

. If nodes

v_{i}

and

v_{i + 2}

satisfy

l_{p a t h}^{v_{i}} (T^{'}, v_{i + 5}) < l_{p a t h} (T, v_{i + 5})

and

l_{p a t h}^{v_{i + 2}} (T^{'}, v_{i + 5}) < l_{p a t h} (T, v_{i + 5})

, respectively, they may become

v_{i + 5}

's potential parents. Then, we select a node whose new path load is smaller from

v_{i}

and

v_{i + 2}

v_{i + 5}

's ultimate parent according to formula (20).

4.3. Details of the Algorithm

The core of our algorithm is shown in Algorithm 1. In our algorithm, $β_{v_{i}}$ counts how many times node $v_{i}$ is selected as the node with the highest load in tree T, and $p (v_{i})$ denotes switching probability of node $v_{i}$ . The initial values of $β_{v_{i}}$ and $p (v_{i})$ for each $v_{i} \in V$ are set to 0 and $1 / 2$ , respectively. In line (2), $v_{a}$ denotes the node with the highest load in tree T, whose children nodes will be selected for switching. $β_{m a x}$ denotes the maximum allowed number of times that a node can be selected for switching. It is an important parameter that affects the convergence of the algorithm. If the node $v_{a}$ with the highest load has been selected for $β_{m a x}$ times, the switching procedure will be terminated (line (3)) and the data-gathering tree T is returned (line (27)). Within the while loop, if the current spanning tree T satisfies inequality (16), the iterative operations of the algorithm will be terminated, and the data-gathering tree T is returned (lines (4)-(5)). Otherwise, the children of $v_{a}$ are inserted in queue α, and each node in queue α is qualified to be selected to switch. If queue α is not empty, we remove a node $v_{b}$ from queue α in FIFO order and then determine $v_{b}$ 's potential parents set $P_{v_{b}}$ using formula (18) (lines (9)-(10)). If node $v_{b}$ has no potential parent, the children nodes $C h (v_{b})$ of $v_{b}$ are added to the queue and considered for switching in subsequent rounds (line (12)). If $v_{b}$ has some potential parents, a random decision is made based on its current switching probability $p (v_{b})$ through the switching decision function. If the outcome of the decision is to switch, a node ${P a r}_{v_{b}}$ is chosen as the parent of node $v_{b}$ through formula (20), and then the tree T is updated by switching the subtree $T_{v_{b}}$ rooted at node $v_{b}$ to ${P a r}_{v_{b}}$ . The new loads $l_{p a t h} (T, v_{i})$ and $l_{n o d e} (T, v_{i})$ of each $v_{i} \in V$ are updated (lines (14)–(16)). Because node $v_{b}$ is switched, the subsequent switching probability becomes a half of its original; that is, $p (v_{b}) = p (v_{b}) / 2$ (line (17)). Since node $v_{b}$ is switched and the tree T is updated, we should terminate the while loop (line (18)) and then reselect a new node $v_{a}$ with the highest load (line (25)) for subsequent iterations. If the outcome of the decision is not to switch, node $v_{b}$ remains with its current parent and the descendants of $v_{b}$ are added to the queue for subsequent consideration (lines (19)-(20)).

Algorithm 1: Construction of data-gathering tree.

Input: graph G, arbitrary spanning tree T, $l_{p a t h} (T, v_{i})$ and $l_{n o d e} (T, v_{i})$ for each $v_{i} \in V$ , $β_{m a x}$

Output: Data-gathering tree

(1) Set $β_{v_{i}} \leftarrow 0$ and $p (v_{i}) = 1 / 2$ for each $v_{i} \in V$ ;

(2) Let $v_{a}$ be the node with the highest load in tree T;

(3) While $β_{v_{a}} \leq β_{m a x}$ do

(4) if $\max \{l_{p a t h} (T, v_{i}) ∣ v_{i} \in Q (T)\} - \min$ $\{$ $l_{p a t h} (T, v_{i}) ∣ v_{i} \in Q (T) \leq σ\}$ then

(5) Return T;

(6) else

(7) Set $α \leftarrow C h (v_{a})$ , $β_{v_{a}} = β_{v_{a}} + 1$ ;

(8) While $α \neq ⌀$ do

(9) Remove node $v_{b}$ from α in FIFO order;

(10) $P_{v_{b}} = \{v_{c} ∣ l_{p a t h}^{v_{c}} (T^{'}, v_{b}) < l_{p a t h} (T, v_{b}), v_{c} \in {N e i}_{v_{b}}\}$ ;

(11) if $P_{v_{b}} = ⌀$ then

(12) $α = α \cup C h (v_{b})$ ;

(13) else

(14) if SwitchingDecision ( $p (v_{b})$ ) == 1 then

(15) ${P a r}_{v_{b}} = \arg \underset{}{\min_{v \in P_{v_{b}}}} \{l_{p a t h}^{v} (T^{'}, v_{b})\}$ ; / $^{*} {P a r}_{v_{b}}$ is new parent of $v_{b}$

(16) Update T by switching $T_{v_{b}}$ to ${P a r}_{v_{b}}$ , update $l_{p a t h} (T, v)$ and $l_{n o d e} (T, v)$ for each $v \in V$ ;

(17) $p (v_{b}) = \frac{p (v_{b})}{2}$ ;

(18) break;

(19) else

(20) $α = α \cup C h (v_{b})$ ;

(21) end if

(22) end if

(23) end while

(24) end if

(25) Select $v_{a}$ as the node with the highest load;

(26) end while

(27) Return T

4.4. Analysis of Algorithm

In this section, we present the time complexity of our algorithm. The calculation of $l_{p a t h} (T, v_{i})$ and $l_{n o d e} (T, v_{i})$ for each $v_{i} \in V$ that is done during the initialization can be performed in $O (N)$ time. Lines (1)-(2) of our algorithm can be performed in $O (N)$ time. On line (4), since there can be at most $O (N)$ leaf nodes in current spanning tree, the calculation of the maximum and minimum path load of leaf node takes at most $O (N)$ time. The initialization of α on line (7) takes $O (C_{m a x})$ time, where $C_{m a x}$ is the maximum number of children of any node. On line (10), we determine the potential parents of the selected switching node $v_{b}$ . Because each neighbor of the switching node $v_{b}$ has the potential to become the parent, we need to calculate $v_{b}$ 's new path load $l_{p a t h}^{v_{c}} (T^{'}, v_{b})$ when $v_{b}$ switch to each neighbor. The calculations take $O (Q_{m a x})$ times, where $Q_{m a x}$ is the maximum number of neighbors of a node in graph G. In order to determine $l_{p a t h}^{v_{c}} (T^{'}, v_{b})$ , we need to calculate the load of each node on the path from the selected switching node to the sink. It takes at most $O (D)$ time where D is the diameter of the network. Therefore, determining the potential parents on line (10) takes $O (D Q_{m a x})$ time. Updating α on lines (12) and (20) takes $O (C_{m a x})$ time. Selecting the ultimate parent on line (15) takes at most $O (Q_{m a x})$ since the new path load has been calculated on line (10). On line (16), updating $l_{p a t h} (T, v)$ and $l_{n o d e} (T, v)$ takes $O (N)$ time. Since $C_{m a x} \leq N$ and $Q_{m a x} \leq N$ , the running time for operations from line (9) to line (22) is dominated by $O (D Q_{m a x} + N)$ . Because the subtree $T_{v_{a}}$ rooted at node $v_{a}$ can have at most N descendants, the second loop on line (8) can run for $O (N)$ times. Therefore, the running time for the second loop takes $O (N (D Q_{m a x} + N))$ time. If all nodes periodically participate in switching, the loop on line (3) runs at most $N β_{m a x}$ times. Hence, the total time complexity of our algorithm is $O (N^{2} (D Q_{m a x} + N))$ .

5. Performance Evaluation

In this section, we evaluate the performance of our algorithm through simulation experiments.

5.1. Simulation Setup

We evaluate the performance of our algorithm using Matlab simulations. In our simulation, we randomly deploy sensor nodes in a square region of size 100 m × 100 m. The number of sensors is varied from 100 to 300. We consider three scenarios associated with the different location of the sink in the network. In Scenario 1, the sink was placed at the corner of the deployment area, that is, at the (0 m, 0 m) coordinate. In Scenario 2, the sink was placed at one side of the deployment area, specifically at the (50 m, 0 m) coordinate. In Scenario 3, the sink was placed in the middle of the deployment area, namely, at the (50 m, 50 m) coordinate. Each sensor is randomly assigned an initial energy between 0.5 and 1 joule (J). We assumed that all nodes have the same energy consumption model. The radio transmission range of each node is set to 25 m. Similar to [18], the energy required to receive data is 50 nJ/bit and the energy required to transmit data is 100 nJ/bit. The size of one-unit data is 16 bytes. The energy consumption for one-unit data transmission and reception is 6400 nJ and 12800 nJ, respectively; that is $E_{R x} = 6400$ nJ and $E_{T x} = 12800$ nJ. We generate 100 random networks and present the average results for performing comparisons. We evaluate the lifetime performance measured as the number of data-gathering rounds until the first node runs out of energy.

The choice of the terminate parameter σ governs convergence speed of the algorithm and lifetime of the tree produced at converged point. However, it is not possible to know the optimal value of σ for a random connectivity graph. Intuitively, it may appear that choosing σ as low as possible results in a higher lifetime tree. When σ is too low, the algorithm will perform more switching to find a spanning tree to meet the terminate parameter. However, such spanning tree may not topologically exist in a given connectivity graph representing the sensor network. The oscillations may occur and diminish the lifetime of the final tree. Therefore, we need to learn the appropriate ranges of σ for different network configurations. To solve this problem, we generate 100 random deployed networks for each scale sensor network as training inputs to our algorithm. Then, we measure their lifetimes with varying σ values and select the optimal σ value for simulation experiments. The maximum allowed switching time $β_{m a x}$ is set to 1000.

5.2. Simulation Results

In this section, we first compare our algorithm with the MITT [18], RaSMaLai [19], and MECDA [14] data-gathering schemes for Scenario 2. The lifetime performance is compared as a function of the number of nodes. The sampling rate is set to 10% and 20%.

It can be seen from Figure 4 that our algorithm outperforms the compared algorithms. Furthermore, the lower the sampling rate, the better the lifetime performance of our algorithm. When the sampling rate is low, more nodes in the network can become the aggregating nodes, so as to save energy. The MITT algorithm is based on a min-max-weight spanning tree. The algorithm improves the lifetime of a given tree by switching the parent of the node under consideration. The MECDA algorithm uses hybrid-CS technology, and the objective of the algorithm is to minimize the total energy consumption in a data-gathering tree. It does not consider the balance of energy consumption and energy consumption of each node. The nodes near the sink may have higher energy consumption, which affects the lifetime of the network. The RaSMaLai algorithm uses randomized switching scheme to maximize the lifetime of data collection trees based on the concept of the bounded balanced trees. However, because the algorithm collects the raw data directly, the nodes close to the sink need to relay data of all the downstream nodes on the path, so that the energy consumption is higher, and the network lifetime is shorter. Our algorithm uses both the hybrid-CS strategy and the random switching method to adjust the load of bottleneck node, thus obtaining the best lifetime performance.

Figure 4

Lifetime of different algorithms.

Figure 5 shows the lifetime of different algorithms when the sampling rate is varied from 10% to 30%, and the number of nodes is 200 and 300, respectively. As shown in Figure 5, with the increase of the number of nodes, the nodes near the sink need to forward more data, the energy consumption increases, and the lifetime is reduced. With the increase of sampling rate, the lifetime for our algorithm and MECDA decreases. This is because when the sampling rate increases, the maximum data length M is increased, and the intermediate nodes need to gather more data to become aggregating nodes, the energy consumption of such nodes increases, and the network lifetime is reduced. However, the lifetime of RaSMaLai does not change with the varying of sampling rate, because it does not use the CS strategy. When the sampling rate is over 30%, the performance of our algorithm is more and more close to that of the RaSMaLai algorithm. The reason is that when the sampling rate is large, few nodes can become the aggregating node, and the characteristic of energy consumption is consistent with the RaSMaLai algorithm.

Figure 5

Lifetime with different sampling rates.

As shown in Figure 6, we compare our algorithm with MECDA in different scenarios. No matter which scenario is used, our algorithm outperforms the MECDA algorithm. In Scenario 1, the performance of our algorithm is more superior, because when the sink is placed at (0, 0) coordinate, the nodes around the sink need to forward more data of downstream nodes and have heavy loads, and thus our algorithm can make the load of such nodes become more balanced by switching. However, in Scenario 3, the performance improvement of our algorithm is relatively small; this happens since the sink located at the middle naturally gives better balancing for each algorithm.

Figure 6

Lifetime in different scenarios.

6. Conclusion

In this paper, we have studied the problem of constructing a maximum-lifetime data-gathering tree. We used hybrid-CS theory to data gathering in wireless sensor networks. We first constructed an arbitrary data-gathering tree, used a random switching decision to select a bottleneck node's child as switching node, and then designed an optimal parent node selecting strategy for assigning an optimal parent to the selected switching node. We kept transferring descendants of the bottleneck node to realize load balancing until the terminate parameter or maximum allowed switching time is satisfied. Simulation results show that the proposed algorithms can significantly increase the lifetime of WSNs and outperform several existing approaches in terms of network lifetime.

Footnotes

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This research is supported by the Natural Science Foundation of Jiangsu Province under Grants nos. BK20130096 and BK2011754, the National Natural Science Foundation of China under Grants nos. 61272084, 61202004, 61202353, 61300240, and 61302158, the Postdoctoral Science Foundation of China under Grant no. 2015M581794, the Project of Natural Science Research of Jiangsu University under Grants nos. 15KJB520027 and 14KJB520031, the Postdoctoral Science Foundation of Jiangsu Province under Grant no. 1501023C, the Key Project of Natural Science Research of Jiangsu University under Grant no. 11KJA520002, and the Natural Science Foundation of Anhui Province (no. 1608085MF127).

References

S. L.

Cai

Z. P.

Distributed data collection in large-scale asynchronous wireless sensor networks under the generalized physical interference model

IEEE/ACM Transactions on Networking 2013 21 4 1270 1283

10.1109/tnet.2012.2221165

2-s2.0-84882568097

Duong

Nguyen

Data collection in sensor networks via the novel fast markov decision process framework

IEEE Transactions on Wireless Communications 2015 14 8 4522 4533

10.1109/TWC.2015.2422307

2-s2.0-84939548713

Mao

Fahmy

Shroff

N. B.

Constructing maximum-lifetime data-gathering forests in sensor networks

IEEE/ACM Transactions on Networking 2010 18 5 1571 1584

10.1109/TNET.2010.2045896

2-s2.0-77958121214

Lin

Uster

Exact and heuristic algorithms for data-gathering cluster-based wireless sensor network design problem

IEEE/ACM Transactions on Networking 2014 22 3 903 916

10.1109/TNET.2013.2262153

2-s2.0-84903276145

X.-Y.

Wan

P.-J.

Tang

Efficient scheduling for periodic aggregation queries in multihop sensor networks

IEEE/ACM Transactions on Networking 2012 20 3 690 698

10.1109/TNET.2011.2166165

2-s2.0-84862584082

Chen

J. M.

Yau

D. K. Y.

Sun

Cross-layer optimization of correlated data gathering in wireless sensor networks

IEEE Transactions on Mobile Computing 2012 11 11 1678 1691

10.1109/tmc.2011.210

2-s2.0-84655170952

Candes

E. J.

Wakin

M. B.

An introduction to compressive sampling

IEEE Signal Processing Magazine 2008 25 2 21 30

10.1109/msp.2007.914731

Luo

Sun

Chen

C. W.

Compressive data gathering for large-scale wireless sensor networks

Proceedings of the 15th Annual International Conference on Mobile Computing and Networking (MobiCom '09)

September 2009

Beijing, China

145 156

10.1145/1614320.1614337

2-s2.0-70450284408

Quer

Masiero

Munaretto

Rossi

Widmer

Zorzi

On the interplay between routing and signal representation for Compressive Sensing in wireless sensor networks

Proceedings of the Information Theory and Applications Workshop (ITA '09)

February 2009

San Diego, Calif, USA

206 215

10.1109/ita.2009.5044947

2-s2.0-70349294164

10.

Wang

Tang

Yin

X.-Y.

Data gathering in wireless sensor networks through intelligent compressive sensing

Proceedings of the IEEE INFOCOM (INFOCOM '12)

March 2012

Orlando, Fla, USA

IEEE

603 611

10.1109/infcom.2012.6195803

2-s2.0-84861600807

11.

Tang

Zhang

Jing

Cheng

Robust compressive data gathering in wireless sensor networks

IEEE Transactions on Wireless Communications 2013 12 6 2754 2761

10.1109/TWC.2013.040413.120796

2-s2.0-84880169131

12.

Haupt

Bajwa

W. U.

Rabbat

Nowak

Compressed sensing for networked data: a different approach to decentralized compression

IEEE Signal Processing Magazine 2008 25 2 92 101

10.1109/msp.2007.914732

2-s2.0-41949106208

13.

Luo

Xiang

Rosenberg

Does compressed sensing improve the throughput of wireless sensor networks?

Proceedings of the IEEE International Conference on Communications (ICC '10)

May 2010

Cape Town, South Africa

1 6

10.1109/icc.2010.5502565

2-s2.0-77955383779

14.

Xiang

Luo

Vasilakos

Compressed data aggregation for energy efficient wireless sensor networks

Proceedings of the 8th Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON '11)

June 2011

Salt Lake City, UT, USA

46 54

10.1109/SAHCN.2011.5984932

15.

Zhou

Xiong

Global correlated data gathering in wireless sensor networks with compressive sensing and randomized gossiping

Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM '11)

December 2011

Houston, Tex, USA

IEEE

1 5

10.1109/GLOCOM.2011.6133960

16.

Zhang

B. W.

Cheng

X. Z.

Zhang

Cui

Liang

Sparse target counting and localization in sensor networks based on compressive sensing

Proceedings of the 30th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies (INFOCOM '11)

April 2011

Shanghai, China

IEEE

2255 2263

10.1109/infcom.2011.5935041

2-s2.0-79960852449

17.

Xiang

Luo

Rosenberg

Compressed data aggregation: energy-efficient and high-fidelity data collection

IEEE/ACM Transactions on Networking 2013 21 6 1722 1735

10.1109/tnet.2012.2229716

2-s2.0-84882961435

18.

Liang

J. B.

Wang

J. X.

Cao

J. N.

Chen

J. N.

M. M.

An efficient algorithm for constructing maximum lifetime tree for data gathering without aggregation in wireless sensor networks

Proceedings of the 29th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2010)

March 2010

San Diego, Calif, USA

1 5

19.

Imon

S. K. A.

Khan

Di Francesco

Das

S. K.

RaSMaLai: a randomized switching algorithm for maximizing lifetime in tree-based wireless sensor networks

Proceedings of the 32nd IEEE Conference on Computer Communications (INFOCOM '13)

April 2013

Turin, Italy

2913 2921

10.1109/infcom.2013.6567102

2-s2.0-84883096111

20.

Imon

S. K. A.

Khan

Di Francesco

Das

S. K.

Energy-efficient randomized switching for maximizing lifetime in tree-based wireless sensor networks

IEEE/ACM Transactions on Networking 2014 23 5 1401 1415

10.1109/tnet.2014.2331178

2-s2.0-84903859934

Constructing Maximum-Lifetime Data-Gathering Tree in WSNs Based on Compressed Sensing

Abstract

1. Introduction

2. Preliminaries

2.1. Compressed Sensing Basics

2.2. Overview of Data Gathering Based on Compressed Sensing in WSNs

3. System Model and Problem Formulation

3.1. System Model

3.2. Problem Formulation

4. Solution Techniques

4.1. Basic Ideal of Solution Techniques

4.2. Algorithm Description

4.3. Details of the Algorithm

Algorithm 1: Construction of data-gathering tree.

4.4. Analysis of Algorithm

5. Performance Evaluation

5.1. Simulation Setup

5.2. Simulation Results

6. Conclusion

Footnotes

Competing Interests

Acknowledgments

References