
Abstract

We present a Markov decision process (MDP) framework for multihop wireless sensor networks (MHWSNs) to bound the network performance of both energy constrained (EC) networks and energy harvesting (EH) networks, both with and without relay cooperation. The model provides the fundamental performance limit that a MHWSN can theoretically achieve under the general constraints from medium access control, routing, and energy harvesting. We observe that the analyses for EC and EH networks fall into two branches of MDP theory, namely, the finite-horizon process and the infinite-horizon process, respectively. The performance metrics for EC and EH networks are different. For EC networks, an appropriate metric is the network lifetime; for EH networks, an appropriate metric is, for example, the network throughput. To efficiently solve the high-dimensional models, for the EC networks, we propose a novel computational algorithm that takes advantage of the stochastic shortest path structure of the problem; for the EH networks, we propose a dual linear programming based algorithm that exploits the sparsity of the transition matrix. Under the unified MDP framework, numerical results demonstrate the advantages of cooperation for the optimal performance in both EC and EH networks.

1. Introduction

This paper considers modeling the optimal performance of multihop wireless sensor networks (MHWSNs), which can be energy constrained (EC) networks or energy harvesting (EH) networks, both with and without relay cooperation. The network performance of a MHWSN is a complex function of sensors' harvested energy, traffic volume, routing protocol, and medium access control (MAC) technique. An analytical approach that is suitable for WSN traffic and that derives the optimum would serve as a valuable benchmark for heuristic MAC and routing protocols. However, such a multihop framework has not been presented. In this paper, we explore the optimum by presenting a unified Markov Decision Process (MDP) analysis that can analyze both EC and EH networks. We observe that the treatments for the two networks fall into two branches of MDP theory, the finite-horizon process and the infinite-horizon process, respectively.

Existing analyses have mainly focused on a single node [1], a single link [2], or a single-hop network [3]. Departing from the literature, this paper addresses the general multihop network with neither assumptions on node distribution nor assumptions on total traffic pattern, motivated by the following reasons. First, since the communication range of a single node is limited, single-hop deployment [3] cannot always meet the requirements of many applications, such as environmental and structural monitoring, security and defense, and wireless body area networks [4]. Second, the traffic pattern of nodes in a MHWSN is unbalanced, invalidating the homogeneous or deterministic traffic assumptions, which are widely assumed for wireless ad hoc or mesh networks. In particular, some nodes near the Sink consume their energy faster because they must use their energy to relay other nodes' data. In EC networks, this unbalance creates the well-known “energy hole” problem [5]. In EH networks, the impact on energy consumption is still nontrivial. Moreover, the extent to which the traffic is unbalanced is not known a priori and is usually determined by a routing protocol. Therefore, towards quantifying the network performance, careful consideration should be taken to capture the peculiarity of the traffic pattern. Third, because the network's energy consumption rate is closely related to nodes' interaction in various aspects including channel access, routing, and cooperation, an optimal solution obtained by focusing only on a point-to-point link does not imply the global optimum for the network-wide objective. In fact, quantifying the latter problem is fairly challenging.

On another track, cooperative transmission (CT) is a mixture of communication protocol and physical (PHY) layer combining scheme that allows spatially separated nodes to collaborate to transmit a single message by forming a virtual multiple-input-single-output (VMISO) array. A receiver, gaining a signal-to-noise ratio (SNR) advantage of 10 dB to 20 dB through PHY combining [6], can decode the message at an extended range [7]. CT range extension has been experimentally demonstrated in [8], and it shows significant impact on Layer Two and Layer Three of a WSN by eliminating the “energy hole” in EC networks [9, 10] and by providing better Quality of Service in EH networks [11]. Time-division CT (TDCT) is one type of cooperation, in which the source node multicasts the packet, and the cooperators that decode the packet forward it in orthogonal time slots [12]. Our previous work OSC-MAC [13] shows notable lifetime improvement over state-of-the-art MAC protocols by using TDCT to balance energy. Yet, the performance gap between the CT MAC protocol [9, 10] with fixed routes and the theoretically optimal value is unknown. Also note that the analysis in [11], among others, assumes perfect link scheduling, which contrasts with the practical situation wherein link activities are subject to MAC constraints and packet collisions occur. All the above discussions have further motivated our analysis to comprehensively consider energy harvesting, routing, MAC, and cooperation from the network perspective. Our previous work [14] models the lifetime of battery based multihop sensor networks; that model, however, does not apply to energy unconstrained networks (e.g., through energy harvesting). This paper extends the model in [14] to provide an analysis framework that applies to both energy constrained (EC) networks and energy harvesting (EH) networks.

The rest of the paper is organized as follows. Section 2 presents the related work. Section 3 describes the system model and assumptions. An MDP formulation is detailed in Section 4. The computational method and numerical evaluations for non-CT and CT networks are presented in Sections 5 and 6, respectively. Finally, Section 7 concludes the paper.

2. Related Work

In this section, we describe existing works related to modeling for wireless sensor networks in terms of the scope of modeling, the assumptions, and the technical tools. Then we highlight the challenges solved and contributions made in this work in comparison.

Existing analyses are mainly limited to a single node, a single link, or a single-hop network [15]. For example, [1] focuses on energy management of a single energy harvesting (EH) node for achieving maximum throughput and minimizing delay, using an MDP [16] model. While it is assumed in [1] that the characteristic of traffic and energy is stationary (same as in this work), it also assumes infinite packet queue and energy queue, which is not practical. In [17], a method based on stochastic timed automata and statistical model checking is proposed to model a WSN protocol called timing-sync protocol, which however focuses only on node behavior. In [18], an MDP model is used to model the sensor processor (CPU) energy and Petri nets are used to model an energy constrained (EC) sensor node. In [2], an MDP model is presented for EH tags in a point-to-point link, assuming a time-slotted system and a stationary energy harvesting model. Transmission of EH nodes in a single link and in the context of a fading channel is tackled in [19]. In [3], an MDP model is used to formulate the lifetime of a single-hop EC network, which considers sensor scheduling in a scenario where only a fraction of sensors collect information and communicate directly with a one-hop Sink. In [20], considering nodes that are equipped with a hybrid energy storage system, the authors provide an MDP model for a single-hop EH network. The major limitations of these works are their inability to analyze the performance of a network that extends beyond one-hop deployment and the lack of analysis for cooperative transmission.

For multihop networks, the past analyses have exploited various approaches. The optimal lifetime of a multihop WSN is derived in [21] using a linear programming (LP) formulation based on traffic balance conditions. However, the analyses in [21] are limited to only the routing layer. In [22], a model based on mixed integer linear programming is proposed to identify an energy-efficient route from the source sensor to the Sink; yet, it considers only energy minimization and does not optimize a network-wide objective. In [23], the authors provide a Lyapunov analysis for multihop EH WSNs, which considers the dynamics of energy and queue, to maximize the utility. However, [23] is based on queue stability, which is not suitable for low-traffic WSNs. Reference [24] studies the upper bound on the lifetime of data gathering sensor networks, while [25] derives the upper bound on the lifetime of large-scale networks using the theory of coverage process. Both [24, 25] are based on the assumption that the data sources are deployed with a particular probability density function or process; thus they are unable to capture the inhomogeneous traffic characteristic in a multihop network. We note that all the aforementioned works fail to consider MAC constraints, as they assume perfect link scheduling and thus ignore packet losses and retransmissions, which contrasts with the practical situation wherein link activities are subject to MAC constraints and packet collisions occur. Moreover, none of these works, except for [21], considers cooperative transmission.

For cooperative networks, the existing analyses are typically limited to single-hop networks. In [26], the authors consider the lifetime maximization in an amplify-and-forward cooperative network and model the energy dissipation of nodes as a Markov chain. In [27], the authors develop a general probability model to study the outage performance of the best-relay selection with adaptive decode-and-forward cooperative network. Relatively few works have focused on the analysis of multihop cooperative networks. The LP model in [28] has been recently extended to multihop CT networks [29] by considering all single-input-single-output (SISO) and virtual multiple-input-single-output (VMISO) links. Again, none of these works has considered MAC layer link constraints, and therefore the bounds that they provide are optimistic. Moreover, because the interference range has a significant effect on the performance of the MAC, and the interference range of CT could differ from that of a SISO transmission, a complete evaluation of multihop cooperative networks is hard to obtain without the inclusion of MAC constraints.

The challenges in modeling a multihop WSN reside mainly in the scope of modeling and the computational complexity. In this paper, we report a unified modeling framework with reasonable complexity that provides the fundamental performance limit that a MHWSN can theoretically achieve. Our model applies to both energy constrained networks and energy harvesting networks, and it captures the general constraints from MAC, routing, and energy dynamics. We further present two computational methods that take advantage of the characteristics of EC networks and EH networks, respectively.

3. System Model and Assumptions

Each sensor node has two queues, the packet queue where data packets are stored and the energy queue where energy is stored. Both the packet queue and the energy queue have limited capacity. We assume that no dynamic transmit power control technique is used; that is, all the nodes use the same transmit power level, which means CT is used exclusively for range extension purpose instead of power reduction.

3.1. Time Slots in the System

Unlike [30] where the time slot is varied and is defined as the interval between two random events on a node, that is, between instances of energy arrivals, we consider a mini-time-slotted system as shown in Figure 2, where the slot duration is constant and slots are normalized to integral units $t \in \{1,2, 3, \dots\}$ . By time t, we refer to the duration within $[t, t + 1]$ . The randomness of events occurs during the slot. The reason to use constant slot interval is twofold: first, this provides the time from the perspective of a network that comprises many nodes; second, this allows us to model the packet transmission duration, against the instantaneous information capacity, because packetizing is very common in wireless standards, including IEEE 802.15.4, in which rate changing is infeasible in the duration of one-packet transmission.

In addition, we remark that the time slots in this paper are different from, and more general than, those in the link scheduling literature. In the latter case, the time is divided into slots equal to the length of the packet transmission time, and all nodes are synchronized to slot boundaries. Our model does not have such a constraint, so the packet length can span multiple slots and nodes are not necessarily synchronized to slot boundaries.

3.2. Topology and Link Models

3.2.1. Topology Model

The multihop network topology is modeled as a directed graph $G = (V, E)$ . $V = \{1, \dots, N\}$ is the set of nodes except the Sink, whose ID is $0$ , and E is the set of links comprising both single-input-single-output (SISO) links and VMISO links; that is, $E = E^{so} \cup E^{vo}$ . A SISO link l exists if its source and destination are within the maximum transmission range. Given the transmission power $P_{tx}$ and the required receiving power $P_{rx, \min}$ , the maximum transmission range $d_{\max}$ is defined as $d_{\max} = {(k P_{tx} / P_{rx, \min})}^{1 / ρ}$ , where ρ is the path loss exponent and k is the constant of proportionality [31]. A VMISO link exists if the source and the destination are within the maximum transmission range of CT with the cooperators. A VMISO link can be formed between the cooperating nodes (the initiator and cooperators) and the VMISO receiver. In the case of a VMISO link, we assume the destination is the Sink; for example, Nodes A and B form a VMISO link to the Sink in Figure 1. Therefore, a link is represented by the 3-tuple $\langle s (l), d (l), C (s (l)) \rangle \in E$ , where $s (l)$ and $d (l)$ are the source and the destination, respectively, and $C (s (l))$ is the set of cooperators of $s (l)$ . This link is a valid VMISO link if and only if

\begin{matrix} {(10^{G_{|C (s (l))|} / 10} \sum_{i \in s (l) \cup C (s (l))} D_{i, d (l)}^{- ρ})}^{- 1 / ρ} \leq d_{\max}, \end{matrix}

(1)

the same as in [21], where $D_{i, d (l)}$ is the distance between Node i and the VMISO receiver and $G_{n}$ is the SNR gain as a function of the number of cooperating nodes; for example, $G_{2} = 10$ dB and $G_{3} = 13.5$ dB according to [32] for BPSK modulation at a bit error rate of $10^{- 3}$ . Note that, in the case of a SISO link, $C (s (l)) = ⌀$ and, in the case of a VMISO link, $d (l) = 0$ .
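As an illustration of the range definition and condition (1), the following minimal Python sketch evaluates both. The function names and all numerical values (k, the powers, the path loss exponent, and the node distances) are assumptions chosen only for the example; the SNR gain in dB is supplied by the caller.

```python
import math


def max_range(k, p_tx, p_rx_min, rho):
    """Maximum transmission range d_max = (k * P_tx / P_rx_min)^(1/rho)."""
    return (k * p_tx / p_rx_min) ** (1.0 / rho)


def vmiso_link_valid(distances, gain_db, d_max, rho):
    """Evaluate condition (1).

    distances: D_{i,d(l)} for the source and each cooperator.
    gain_db:   SNR gain G_n in dB for the cooperating group.
    """
    linear_gain = 10.0 ** (gain_db / 10.0)
    path_sum = sum(d ** (-rho) for d in distances)
    effective_distance = (linear_gain * path_sum) ** (-1.0 / rho)
    return effective_distance <= d_max


# Hypothetical numbers: k = 1, P_tx = 1, P_rx_min = 1e-4, rho = 4.
d_max = max_range(k=1.0, p_tx=1.0, p_rx_min=1e-4, rho=4.0)

# Two nodes, each 15 units from the Sink, with a 10 dB combining gain.
print(vmiso_link_valid([15.0, 15.0], gain_db=10.0, d_max=d_max, rho=4.0))

# A single node at the same distance with no combining gain.
print(vmiso_link_valid([15.0], gain_db=0.0, d_max=d_max, rho=4.0))
```

In this toy setting the cooperating pair reaches the Sink while the single node does not, which is the range extension effect that CT is used for in the model.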

Figure 1

An illustration of interference model and VMISO link. Node v is the receiver of Node u. Node u's hidden nodes in the gray area will interfere with v's reception.

Figure 2

A timeline illustration of the decision process model for the network. The decision epochs are at the beginning of each minislot.

3.2.2. Interference Model

A node has three ranges associated with it: the transmission range ( $TX$ ), the interference range ( $IF$ ), and the carrier sensing range ( $CS$ ), as shown in Figure 1. Similar to [33], we assume that link $(i, j, \cdot)$ interferes with link $(u, v, \cdot)$ if either the distance $\mathrm{dist} (i, v) \leq IF (i)$ or $i = v$ . For instance, in Figure 1, Node u and any node in the gray area can start transmission at the same time because they are out of carrier sensing (CS) range of (i.e., hidden from) each other. Also, located in Node v's interference (IF) range, the hidden node's transmission interferes with Node v's reception. This phenomenon directly results from the CSMA mechanism of the MAC layer and the relationship between the three ranges (TX, IF, and CS) [34].

3.2.3. Medium Access Control Model

We assume that carrier sense multiple access (CSMA) is performed before a node attempts to transmit. Further, similar to [35], it is assumed that (i) a node cannot transmit and receive at the same time; (ii) a node can transmit if none of its neighbors (in $CS$ range) is transmitting; (iii) link errors result only from collisions due to hidden terminals; (iv) nodes receive with perfect capture; that is, a packet is successfully decoded if neither the receiver nor any of its neighbors is transmitting at the start of the packet; and (v) the propagation time is zero. Our model can be extended to incorporate fading and shadowing effects of the wireless channel in computing packet errors; however, in this paper we assume (iii) to simplify the presentation.

3.3. Traffic and Energy Models

3.3.1. Traffic Model

The transmission duration of a link $l \in E$ is assumed to be exponentially distributed with the expectation ${\bar{T}}_{l}$ , the same as in [35, 36]. Note that this assumption is inaccurate; however, it is necessary to make the MDP problem tractable [36]. The memoryless property of the exponential distribution indicates that, given a packet is being transmitted at the beginning of a time slot, it completes the transmission within the slot of length $Δ T$ with probability $β_{l} = 1 - e^{- Δ T / {\bar{T}}_{l}}$ , regardless of the number of slots it has been transmitting. Making $Δ T$ small enough allows us to assume that at most one link can finish transmission during a time slot; that is, the probability that more than one link finishes transmission is $o (Δ T)$ .
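To make the role of the slot length concrete, the sketch below computes $β_{l}$ and the probability that two or more active links finish in the same slot, which the model neglects. It is only an illustration: the function names, the number of active links, and the mean duration of 1.0 are assumptions, and the links are treated as finishing independently.

```python
import math


def completion_prob(delta_t, mean_duration):
    """beta_l = 1 - exp(-delta_t / mean_duration) for one active link."""
    return 1.0 - math.exp(-delta_t / mean_duration)


def prob_multiple_finish(betas):
    """Probability that two or more of the active links finish in one slot."""
    none = math.prod(1.0 - b for b in betas)
    exactly_one = sum(
        b * math.prod(1.0 - c for j, c in enumerate(betas) if j != i)
        for i, b in enumerate(betas)
    )
    return 1.0 - none - exactly_one


# Hypothetical setting: three active links, each with mean duration 1.0.
for delta_t in (0.1, 0.01):
    betas = [completion_prob(delta_t, 1.0)] * 3
    print(delta_t, betas[0], prob_multiple_finish(betas))
```

Shrinking the slot by a factor of ten reduces the neglected probability by roughly two orders of magnitude in this example, which is the behavior the $o (Δ T)$ statement describes.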

The number of exogenous (self-generated) packets at Node i of commodity d (destined to Sink d), $f_{i}^{d} (t)$ , during any slot t is modeled as a binary random variable (RV) with probability mass function (PMF) $\Pr [f_{i}^{d} (t) = 1] = α^{d} (i)$ and $\Pr [f_{i}^{d} (t) = 0] = 1 - α^{d} (i)$ , where the arrivals are assumed i.i.d. over time slots with rate $α^{d} (i)$ .

3.3.2. Energy Harvesting Model

A realistic power consumption model for a sensor node has been studied in [37], which decouples the power usage into baseband DSP circuit, the RF front-end circuit for transmitting and receiving, the power amplifier (PA), and low noise amplifier (LNA). A similar circuit-level analysis is also performed in [38] for energy modeling in cooperative multiple-input-multiple-output (MIMO) sensor networks. In [39], six different statistical models have been used to fit empirical datasets of a solar-powered energy harvesting sensor node. Our recent work [40] provides an energy model for energy harvesting node considering harvesting and leakage in a supercapacitor.

While we are aware of more complex energy models, in this paper we assume a simpler model. However, we note that it should be straightforward to incorporate complex energy consumption and harvesting models into our framework. In this study, the energy harvested at Node i during any slot t is modeled as a binary RV, which is denoted by $h_{i} (t) \in \{0, 1\}$ . This RV has PMF $\Pr [h_{i} (t) = 1] = γ (i)$ and $\Pr [h_{i} (t) = 0] = 1 - γ (i)$ , where $γ (i)$ represents the energy harvesting rate and the harvesting process is assumed i.i.d. over time slots. Thus, similar to [1, 41], the energy of Node i evolves as

\begin{matrix} e_{i} (t + 1) = \min \{{(e_{i} (t) - e_{com}^{(i, l)})}^{+} + h_{i} (t), e_{\max}\}, \end{matrix}

(2)

where $e_{com}^{(i, l)}$ is the energy consumption of Node i if it participates in a finished link transmission l and $e_{\max}$ is the battery capacity of a node. We note that other EH models can also be applied, for example, the correlated EH model [2] where the energy harvested in a slot is correlated to the previous slot. Further, we assume that the energy harvested in a particular slot is available at the end of the slot.
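The energy recursion in (2) can be sketched in a few lines of Python. The helper names, the harvesting rate, the battery capacity, and the per-slot consumption pattern below are assumptions made purely for illustration; energy is counted in integer units as in the binary harvesting model.

```python
import random


def energy_step(e, e_com, harvested, e_max):
    """Equation (2): e(t+1) = min{(e(t) - e_com)^+ + h(t), e_max}."""
    return min(max(e - e_com, 0) + harvested, e_max)


def simulate_energy(e0, gamma, e_max, consumption, seed=0):
    """Trace the energy of one node.

    consumption[t] is the energy spent in slot t (0 if the node is idle).
    """
    rng = random.Random(seed)
    trace = [e0]
    for e_com in consumption:
        harvested = 1 if rng.random() < gamma else 0  # binary harvest h_i(t)
        trace.append(energy_step(trace[-1], e_com, harvested, e_max))
    return trace


# Hypothetical node: capacity 4 units, harvesting rate 0.3.
print(simulate_energy(e0=2, gamma=0.3, e_max=4, consumption=[1, 0, 1, 0, 0, 1]))
```

The clipping at zero and at $e_{\max}$ keeps the energy state space finite, which is what allows the energy level to be a component of the MDP state.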

4. Markov Decision Process Formulation

In this section, we describe the Markov Decision Process formulation for the EC and EH networks. We model the network state space by a tuple consisting of the transmission set in the network considering the MAC constraints, the queuing level, and the energy level for each node. Next we specify the action space at each time slot, which affects the network state. Then we analyze the system transition dynamics by decoupling the system into the tuple components with condition and then combining them based on the chain rule.

4.1. Energy Constrained (EC) Networks

4.1.1. Network State Space

The network state is defined as $s ≜ \{L, q, e\}$ to include the transmission set $L$ , the queuing level q of each node, and the energy e of each node. Note that the state space is not the Cartesian product of each component's state space, because the components interact (e.g., an active link cannot have an empty queue at its source node).

The transmission set $L (t)$ includes the links that are active (in transmission) at time t, which must not violate the carrier sensing constraints of the MAC. Note that this does not mean that all the receivers of links in the transmission set will receive successfully. Denote the state space of $L$ (i.e., the set of all feasible transmission sets) as $\mathcal{L}$ . This component includes the collisions in the MAC layer. We denote the links that are free of collision as $Φ (L) \subset L$ . $Φ (L)$ is deterministic given $L$ , and $Φ = Φ^{so} \cup Φ^{vo}$ .

Finding the set $\mathcal{L}$ of feasible transmission sets is equivalent to finding the matchings of a graph. Given a graph $G = (V, E)$ , a matching $M (G)$ in G is a set of nonadjacent edges (i.e., no two edges share a common vertex), implying $L \in \{M (G)\}$ . Further, the MAC layer CSMA adds constraints on $L$ : for any two links $l_{1}, l_{2} \in L$ , the sources of the two links ( $s (l_{1}), s (l_{2})$ ) must be out of carrier sensing (CS) range of each other. Therefore, $\mathcal{L}$ is determined by graph G and the binary hearing matrix $H = [h_{i j}]_{i, j \in V}$ of the network, where the element $h_{i j} = 1$ if and only if (iff) Node i and Node j are within CS range of each other. Though enumerating matchings of a graph is NP-complete, the Bron-Kerbosch algorithm has been shown to be one of the fastest [42] and is used in this paper.
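For small topologies, the feasible transmission sets can be enumerated by brute force, as in the following Python sketch. This is not the Bron-Kerbosch procedure used in the paper; it is an exhaustive check of the two conditions stated above (no shared vertex, sources mutually out of CS range). The four-node line topology, the hearing matrix, and the restriction to SISO links written as (source, destination) pairs are assumptions for the example.

```python
from itertools import combinations


def compatible(link_a, link_b, hears):
    """Two SISO links may be active together if they share no node and
    their sources are out of carrier sensing range of each other."""
    (src_a, dst_a), (src_b, dst_b) = link_a, link_b
    if {src_a, dst_a} & {src_b, dst_b}:
        return False
    return not hears[src_a][src_b]


def transmission_sets(links, hears):
    """Enumerate every feasible transmission set, including the empty one."""
    feasible = [()]
    for size in range(1, len(links) + 1):
        for subset in combinations(links, size):
            if all(compatible(a, b, hears) for a, b in combinations(subset, 2)):
                feasible.append(subset)
    return feasible


# Hypothetical line topology 3 -> 2 -> 1 -> 0 (Sink); only adjacent nodes
# are within CS range of each other.
links = [(1, 0), (2, 1), (3, 2)]
hears = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
for candidate in transmission_sets(links, hears):
    print(candidate)
```

The exhaustive search grows exponentially with the number of links, which is why a dedicated enumeration algorithm is preferable for anything beyond toy networks.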

4.1.2. Decision Epochs and Action Space

A decision epoch corresponds to the beginning of a time slot. The set of decision epochs is denoted by $\{1, 2, \dots, T\}$ . When T is finite (infinite), the decision problem is referred to as a finite-horizon (infinite-horizon) problem. An action, $a (t)$ , is to admit a new link to the “remaining” transmission set $L (t)$ , which comprises those links that did not complete transmission during slot $t - 1$ . Note that $L (t) \subset L (t - 1) \cup a (t - 1)$ . The action space at time t when the system state is $s (t)$ is represented by

\begin{matrix} A (s (t)) = \{l \in E : q_{s (l)} (t) > 0, l \notin L (t), L (t) \cup l \in \mathcal{L}, E_{suf}\}, \end{matrix}

(3)

where $\mathcal{L}$ is the set of feasible transmission sets and $E_{suf}$ indicates $e_{i} \geq e_{com}^{(i, l)}$ , $\forall i \in l$ . Therefore, a link l can join $L (t)$ iff its source has a nonempty queue, $L (t) \cup l$ is also a transmission set, and the participating nodes have enough energy. Note that it is assumed that at most one link can be admitted to $L (t)$ , because $Δ T$ is small. Also, note that the null set $⌀ \in A (s (t))$ . As a result, an action can be either a “CT” link, a “non-CT” link, or null (no new transmission).
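The conditions in (3) translate directly into a filter over candidate links. The sketch below is a minimal illustration under several assumptions: links are written as (source, destination, cooperators) tuples, the feasibility test and the per-node energy cost are passed in as hypothetical callables, the null action is represented by `None`, and the Sink is treated as not energy constrained.

```python
def participants(link):
    """Nodes that spend energy on a link: source, cooperators, and a
    non-Sink receiver (assumption: the Sink, ID 0, has unlimited energy)."""
    src, dst, cooperators = link
    nodes = {src, *cooperators}
    if dst != 0:
        nodes.add(dst)
    return nodes


def action_space(links, queue, energy, active, is_transmission_set, energy_cost):
    """Return the admissible actions of (3); None is the null action."""
    actions = [None]
    for link in links:
        src = link[0]
        if queue[src] <= 0 or link in active:
            continue  # empty source queue, or link already in transmission
        if not is_transmission_set(active | {link}):
            continue  # would violate the MAC constraints
        if all(energy[n] >= energy_cost(n, link) for n in participants(link)):
            actions.append(link)
    return actions


# Hypothetical two-node example with unit energy cost per participant.
links = [(1, 0, ()), (2, 1, ())]
queue = {1: 1, 2: 0}
energy = {1: 2, 2: 2}
print(action_space(links, queue, energy, frozenset(),
                   lambda s: True, lambda node, link: 1))
```

Here only the link from Node 1 to the Sink is admissible besides the null action, because Node 2 has an empty queue.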

4.1.3. State Transition Dynamics

We first characterize the state transition of each component.

(1) Transmission Set Dynamics. Let action $a (t) \in A (s (t))$ denote the link admitted to the transmission set. Denote by $p \{z^{l} (t) ∣ L (t), a (t)\}$ the probability that only link $l \in L (t) \cup a (t)$ finishes transmission during slot t and by $p \{z^{⌀} (t) ∣ L (t), a (t)\}$ the probability that no link finishes transmission during slot t. The time index is dropped when there is no ambiguity, and the transition kernel is abbreviated as $p \{L^{'} ∣ L, a\} := p \{L (t + 1) ∣ L (t), a (t)\}$ . Then

\begin{matrix} p \{z^{l} ∣ L, a\} = β_{l} \prod_{k \in L \cup a, k \neq l} (1 - β_{k}), \forall l \in L \cup a, \\ p \{z^{⌀} ∣ L, a\} = \begin{cases} \prod_{l \in L \cup a} (1 - β_{l}) & \text{if } L \cup a \neq ⌀, \\ 1 & \text{if } L \cup a = ⌀ . \end{cases} \end{matrix}

(4)

Define $D^{(L)} ≜ \{L \cup a\} ∖ L^{'}$ ; then we can get

\begin{matrix} p \{L^{'} ∣ L, a\} = \begin{cases} p \{z^{l} ∣ L, a\} & \text{if } D^{(L)} = l, \\ p \{z^{⌀} ∣ L, a\} & \text{if } D^{(L)} = ⌀, \\ 0 & \text{otherwise} . \end{cases} \end{matrix}

(5)
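Equations (4) and (5) can be read as a lookup: the next transmission set determines which link, if any, finished, and the kernel returns the matching probability. The following Python sketch illustrates this under assumed link labels and completion probabilities; the function names are hypothetical. Because events in which two or more links finish are neglected, the returned probabilities sum to slightly less than one.

```python
import math


def finish_probs(betas):
    """Equation (4). betas maps each link in L ∪ a to beta_l.
    Key None holds the probability that no link finishes."""
    probs = {}
    for link, beta in betas.items():
        others = math.prod(1.0 - b for k, b in betas.items() if k != link)
        probs[link] = beta * others
    probs[None] = math.prod(1.0 - b for b in betas.values())  # empty product is 1
    return probs


def link_set_kernel(active, action, next_active, betas):
    """Equation (5): p{L' | L, a} with D = (L ∪ a) minus L'."""
    current = set(active) | ({action} if action is not None else set())
    if not set(next_active) <= current:
        return 0.0  # a link cannot appear without being admitted
    finished = current - set(next_active)
    probs = finish_probs({link: betas[link] for link in current})
    if not finished:
        return probs[None]
    if len(finished) == 1:
        return probs[next(iter(finished))]
    return 0.0  # two or more links finishing is neglected


# Hypothetical links "a" and "b" with completion probabilities 0.1 and 0.2.
betas = {"a": 0.1, "b": 0.2}
print(link_set_kernel({"a"}, "b", {"a"}, betas))       # link b finished
print(link_set_kernel({"a"}, "b", {"a", "b"}, betas))  # nothing finished
```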

(2) Queue Length Dynamics. Considering that EC networks are most suitable for light traffic applications, such as monitoring, we assume that at most one node self-generates a packet during a slot. Let $I_{i}$ represent a vector with its ith element being $1$ and other elements being $0$ , $1 \leq i \leq N$ , and let $I_{0}$ be the zero vector. The probability that only Node i or no node ( $i = 0$ ) self-generates a packet is given by

\begin{matrix} P \{I_{i}\} = \begin{cases} α (i) \prod_{j \in V, j \neq i} (1 - α (j)) & \text{if } i \in V, \\ \prod_{j \in V} (1 - α (j)) & \text{if } i = 0, \end{cases} \end{matrix}

(6)

where $α (i)$ is the probability of a self-generated packet arrival for Node i in a time slot and is assumed i.i.d. over time slots.

Then the queue evolution can be described by the following cases.

Case 1 (( $c_{1}$ ): $q^{'} = q$ ).

This can happen for two reasons: ( $c_{1 a}$ ) no link transmission affects the queue state and no new exogenous arrival affects the queues and ( $c_{1 b}$ ) some link $(i, 0, \cdot)$ destined to the Sink successfully transmits and the source self-generates a new packet:

\begin{matrix} P_{c_{1 a}} = (1 \{Δ L = ⌀\} + 1 \{Δ L \in \bar{Ψ} (L \cup a)\} + 1 \{Δ L \in Ψ (L \cup a), q_{d (Δ L)} = q^{\max}\}) \cdot (P \{I_{0}\} + \sum_{q_{k} = q^{\max}, k \in V} P \{I_{k}\}), \\ P_{c_{1 b}} = 1 \{Δ L \in Ψ (L \cup a), s (Δ L) = i, d (Δ L) = 0\} \cdot P \{I_{i}\} . \end{matrix}

(7)

Case 2 (( $c_{2}$ ): $q^{'} = q + I_{j}$ , $j \in V$ ).

This case includes two possibilities: ( $c_{2 a}$ ) some SISO link $(\cdot, j, ⌀)$ that is directed to Node j successfully transmits and the source self-generates a new packet and ( $c_{2 b}$ ) no link transmission affects the queue state and there is a new exogenous packet arrival to Node j:

\begin{matrix} P_{c_{2 a}} (j) = 1 \{Δ L \in Ψ (L \cup a), d (Δ L) = j\} \cdot P \{I_{s (Δ L)}\}, \\ P_{c_{2 b}} (j) = (1 \{Δ L = ⌀\} + 1 \{Δ L \in \bar{Ψ} (L \cup a)\} + 1 \{Δ L \in Ψ (L \cup a), q_{d (Δ L)} = q^{\max}\}) \cdot P \{I_{j}\} . \end{matrix}

(8)

Case 3 (( $c_{3}$ ): $q^{'} = q - I_{i}$ , $i \in V$ ).

This happens because some link $(i, 0, \cdot)$ destined to the Sink successfully transmits and no new exogenous arrival affects the queues:

\begin{matrix} P_{c_{3}} (i) = 1 \{Δ L \in Ψ (L \cup a), Δ L = (i, 0, \cdot)\} \cdot (P \{I_{0}\} + \sum_{q_{k}^{'} = q^{\max}, k \in V} P \{I_{k}\}) . \end{matrix}

(9)

Case 4 (( $c_{4}$ ): $q^{'} = q - I_{i} + I_{j}$ , $i, j \in V$ , $i \neq j$ ).

This case includes two possibilities: ( $c_{4 a}$ ) some SISO link $(i, j, ⌀)$ successfully transmits and there was no change in queue due to new exogenous arrival and ( $c_{4 b}$ ) link $(i, 0, \cdot)$ successfully transmits and Node j self-generates a new packet:

\begin{matrix} P_{c_{4 a}} (i, j) = 1 \{Δ L \in Ψ (L \cup a), Δ L = (i, j, ⌀)\} \cdot (P \{I_{0}\} + \sum_{q_{k}^{'} = q^{\max}, k \in V} P \{I_{k}\}), \\ P_{c_{4 b}} (i, j) = 1 \{Δ L \in Ψ (L \cup a), Δ L = (i, 0, \cdot)\} \cdot P \{I_{j}\} . \end{matrix}

(10)

Case 5 (( $c_{5}$ ): $q^{'} = q - I_{i} + 2 I_{j}$ , $i, j \in V$ , $i \neq j$ ).

This occurs because the SISO link $(i, j, ⌀)$ successfully transmits and Node j self-generates a new packet:

\begin{matrix} P_{c_{5}} (i, j) = 1 \{Δ L \in Ψ (L \cup a), Δ L = (i, j, ⌀)\} \cdot P \{I_{j}\} . \end{matrix}

(11)

Case 6 (( $c_{6}$ ): $q^{'} = q - I_{i} + I_{j} + I_{k}$ , $i, j, k \in V$ , $i \neq j \neq k$ ).

This occurs because the SISO link $(i, j, ⌀)$ successfully transmits and a different Node k self-generates a new packet:

\begin{matrix} P_{c_{6}} (i, j, k) = 1 \{Δ L \in Ψ (L \cup a), Δ L = (i, j, ⌀)\} \cdot P \{I_{k}\} + 1 \{Δ L \in Ψ (L \cup a), Δ L = (i, k, ⌀)\} \cdot P \{I_{j}\} . \end{matrix}

(12)

Therefore, the transition kernel of $q (t)$ is expressed as (13).

(3) Energy Evolution Dynamics. The process $e (t)$ is dictated by the transmission and receiving energy consumed during a finished link transmission, regardless of success or failure. In the CT case, the cooperators consume additional energy in receiving the broadcast packet initiated by the source and in conducting CT. The transition kernel of $e (t)$ is given in (14).

(4) System Dynamics. The transition matrix of the system states $s ≜ \{L, q, e\}$ can be obtained by the following theorem.

Theorem 1.

The transition kernel of the EC system, $p \{s^{'} ∣ s, a\}$ , is equal to the product of (5), (13), and (14):

\begin{matrix} P \{q^{'} ∣ L^{'}, L, q, a\} = \begin{cases} P_{c_{1 a}} + P_{c_{1 b}} & \text{if } q^{'} = q, \\ P_{c_{2 a}} (j) + P_{c_{2 b}} (j) & \text{if } q^{'} = q + I_{j}, j \in V, \\ P_{c_{3}} (i) & \text{if } q^{'} = q - I_{i}, i \in V, \\ P_{c_{4 a}} (i, j) + P_{c_{4 b}} (i, j) & \text{if } q^{'} = q - I_{i} + I_{j}, i, j \in V, \\ P_{c_{5}} (i, j) & \text{if } q^{'} = q - I_{i} + 2 I_{j}, i, j \in V, i \neq j, \\ P_{c_{6}} (i, j, k) & \text{if } q^{'} = q - I_{i} + I_{j} + I_{k}, i, j, k \in V, i \neq j \neq k, \\ 0 & \text{otherwise}, \end{cases} \end{matrix}

(13)

\begin{matrix} P \{e^{'} ∣ L^{'}, L, e, a\} = \begin{cases} 1 \{e^{'} = e - I_{i} e_{tx} - I_{j} e_{rx}\} & \text{if } Δ L = (i, j, ⌀) \in {(L \cup a)}^{so}, j \in V^{*}, \\ 1 \{e^{'} = e - I_{i} e_{init}^{CT} - \sum_{k \in H (i)} I_{k} e_{co}^{CT}\} & \text{if } Δ L = (i, 0, H (i)) \in {(L \cup a)}^{vo}, i \in V, \\ 1 \{e^{'} = e\} & \text{if } Δ L = ⌀, \\ 0 & \text{otherwise} . \end{cases} \end{matrix}

(14)

Proof.

According to the chain rule, $p \{s^{'} ∣ s, a\}$ can be expressed as $p (q^{'} ∣ A) \cdot p (e^{'} ∣ B) \cdot p (L^{'} ∣ C)$ , where $A = (L^{'}, e^{'}, L, q, e, a)$ , $B = (L^{'}, L, q, e, a)$ , and $C = (L, q, e, a)$ . Further, since $q^{'}$ and $(e^{'}, e)$ are conditionally independent, given $(L^{'}, L, q, a)$ , the first element in the product, $p (q^{'} ∣ A)$ , is reduced to (13). Applying similar arguments to $p (e^{'} ∣ B)$ and $p (L^{'} ∣ C)$ , Theorem 1 is proved.

4.1.4. Expected Total Rewards

During a time slot, the system obtains a reward $r = 1$ if a packet is delivered to the Sink, and $r = 0$ otherwise. Let $g (s, a)$ denote the expected reward when action a is taken in the current state s. Then, with the state space S and the termination states $S_{t}$ , for $s \in S ∖ S_{t}$ , we have

\begin{matrix} g (s, a) ≜ E [r] = \sum_{l \in Ψ (L \cup a), d (l) = 0} P \{z^{l} ∣ L, a\} 1 \{Ψ (L \cup a) \neq ⌀\} . \end{matrix}

(15)

And, for $s \in S_{t}$ , $g (s, a) = 0$ . The expected total rewards of the process starting from an initial state s, under the policy π, are denoted by

\begin{matrix} J^{π} (s) = \sum_{t = 1}^{\infty} g (s (t), a (t)) . \end{matrix}

(16)

4.1.5. MDP Formulation

A transmission policy is a series of decision rules $π = [a (1), a (2), \dots]$ , where $a (t) : S (t) \to A (S (t))$ . The maximum lifetime $J^{*} (s)$ is given by

\begin{matrix} J^{*} (s) = \max_{π} J^{π} (s) . \end{matrix}

(17)

The optimal lifetime $J^{*} (s)$ is the unique solution of Bellman's optimality equation [16]:

\begin{matrix} J (s) = \max_{a} \{g (s, a) + \sum_{s^{'}} P \{s^{'} ∣ s, a\} J (s^{'})\} . \end{matrix}

(18)

A policy $π^{*}$ is optimal if it achieves the maximum expected lifetime for all starting states; that is,

\begin{matrix} J^{π^{*}} (s) = J^{*} (s), \forall s \in S ∖ S_{t} . \end{matrix}

(19)
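To illustrate how (18) can be solved numerically, the sketch below applies plain value iteration to a toy single-node problem. This is a generic textbook procedure shown only for intuition, not the stochastic-shortest-path-based algorithm proposed in this paper; the toy state space (remaining energy units), the two actions, and the delivery probability 0.8 are assumptions.

```python
def value_iteration(states, actions, transition, reward, terminal,
                    tol=1e-9, max_iter=10_000):
    """Iterate J(s) <- max_a { g(s, a) + sum_s' P{s' | s, a} J(s') }."""
    J = {s: 0.0 for s in states}
    for _ in range(max_iter):
        delta = 0.0
        for s in states:
            if s in terminal:
                continue  # J stays 0 in termination states
            best = max(
                reward(s, a) + sum(p * J[s2] for s2, p in transition(s, a).items())
                for a in actions(s)
            )
            delta = max(delta, abs(best - J[s]))
            J[s] = best
        if delta < tol:
            break
    return J


# Toy problem: the state is the remaining energy (0..3); state 0 terminates.
# "tx" spends one unit and delivers a packet with probability 0.8.
P_SUCCESS = 0.8
states = [0, 1, 2, 3]
J = value_iteration(
    states,
    actions=lambda s: ["idle", "tx"],
    transition=lambda s, a: {s - 1: 1.0} if a == "tx" else {s: 1.0},
    reward=lambda s, a: P_SUCCESS if a == "tx" else 0.0,
    terminal={0},
)
print(J)
```

In this toy case the optimal value equals the number of remaining energy units times the delivery probability, since idling yields no reward and transmitting is the only way to accumulate delivered packets.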

4.2. Energy Harvesting Networks

The link set dynamics in the EH network are the same as in the EC network. In this section, we give a more general expression for the queue system, relaxing the low-traffic assumption in the previous sections. EH networks may provide better support for high traffic load than EC networks [43].

In a uniform form, the evolutions of q and e are governed by the following balance equations:

\begin{matrix} ψ^{(i)} (t + 1) = ψ^{(i)} (t) + R^{(i)} M^{(i)} (t) Γ^{(i)} (t) + σ^{(i)} (t), \end{matrix}

(20)

where $i \in \{q, e\}$, $ψ^{(q)} = q$, and $ψ^{(e)} = e$; $R^{(i)}$ and $M^{(i)}$ will be defined below.

4.2.1. Queue Length Dynamics

In (20), $M^{(q)}$ is a $|E| \times |E|$ diagonal matrix. The diagonal element $M_{l}^{(q)} = 1$ if the transmission of link l is completed and successful (a finished transmission, while always consuming energy, is not necessarily successfully decoded by the receiver, because of collision due to hidden terminals; in case of failure, the packet remains in the queue of the transmitter), and $M_{l}^{(q)} = 0$ otherwise. Note that it is assumed that $\sum_{l \in E} M_{l}^{(q)} \leq 1$; that is, at most one link can finish transmission. $Γ^{(q)}$ is the transmission schedule vector, which satisfies $Γ_{l}^{(q)} = 1$, $\forall l \in L \cup a$, and $Γ_{l}^{(q)} = 0$ otherwise. $R^{(q)}$ is the $N \times |E|$ routing matrix, whose element in the ith row and lth column is

\begin{matrix} r_{i l}^{(q)} = \{\begin{cases} 1 & if d (l) = i and  Node i is  not  Sink, \\ - 1 & if s (l) = i, \\ 0 & otherwise . \end{cases} \end{matrix}

(21)

$σ^{(q)}$ is an $N \times 1$ vector representing the number of self-generated packets, with $σ_{i}^{(q)} = f_{i}$.
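As an illustration of the balance equation (20) for the queue ($i = q$), the following minimal Python sketch builds the routing matrix of (21) and applies one update. The three-node chain (Node 2 to Node 1 to Sink), the list `links`, and the helper names `routing_matrix` and `queue_update` are assumptions made for illustration only. The Sink is taken to be node 0, consistent with the condition $d (l) = 0$ in (15), and queue capacity limits are ignored.

```python
# Hypothetical topology: node 2 -> node 1 -> Sink (node 0).
# Each link is (source s(l), destination d(l)).
links = [(2, 1), (1, 0)]
N = 2  # number of non-Sink nodes, indexed 1..N


def routing_matrix(links, n_nodes):
    """Build R^(q) per (21): +1 at the receiver (unless it is the Sink), -1 at the sender."""
    R = [[0] * len(links) for _ in range(n_nodes)]
    for l, (src, dst) in enumerate(links):
        if dst != 0:              # the Sink does not queue packets
            R[dst - 1][l] = 1
        R[src - 1][l] = -1
    return R


def queue_update(q, R, M_diag, gamma, sigma):
    """One step of (20): q(t+1) = q(t) + R M Gamma + sigma."""
    finished = [m * g for m, g in zip(M_diag, gamma)]   # M * Gamma
    return [
        q_i + sum(r * f for r, f in zip(row, finished)) + s_i
        for q_i, row, s_i in zip(q, R, sigma)
    ]


R = routing_matrix(links, N)
# Link 0 (node 2 -> node 1) is scheduled and succeeds; no new packets are generated.
q_next = queue_update(q=[0, 1], R=R, M_diag=[1, 0], gamma=[1, 0], sigma=[0, 0])
print(q_next)
```

In this sketch the packet leaves the queue of Node 2 and joins the queue of Node 1. When the scheduled link terminates at the Sink, only the sender's queue changes, because the corresponding column of the routing matrix has no positive entry.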

Let $I_{i}$ represent a vector with its ith element being $1$ and other elements being $0$ , $1 \leq i \leq N$ , and let $I_{0}$ be the zero vector. Denote $V^{(q)} (t)$ as the nodes whose queue is not full at time t and define the difference

\begin{matrix} D^{(q)} = \{\begin{cases} q^{'} - q & if D^{(L)} = ⌀, \\ q^{'} - q + I_{s (D^{(L)})} - I_{d (D^{(L)})} & if D^{(L)} \in Φ^{so} \{L \cup a\}, \\ q^{'} - q + I_{s (D^{(L)})} & if D^{(L)} \in Φ^{vo} \{L \cup a\} . \end{cases} \end{matrix}

(22)

Then considering exogenous packet arrivals, the queue status is updated as follows:

\begin{matrix} p \{q^{'} ∣ L^{'}, L, q, a\} = \{\begin{cases} \prod_{i \in V} (α_{i} 1 \{D_{i}^{(q)} = 1\} + (1 - α_{i}) 1 \{D_{i}^{(q)} = 0, i \in V^{(q)}\}) & if D^{(L)} \in \{Φ^{so} \{L \cup a\}, Φ^{vo} \{L \cup a\}, ⌀\}, \\ 0 & otherwise, \end{cases} \end{matrix}

(23)

where $1 \{\cdot\}$ is the indicator function.

4.2.2. Energy Evolutions

Using analysis similar to that for the queue in Section 4.2.1, denote $V^{(e)} (t)$ as the nodes whose battery is not full at time t and define

\begin{matrix} D^{(e)} = \{\begin{cases} e^{'} - e & if D^{(L)} = ⌀, \\ e^{'} - e + I_{s (D^{(L)})} \cdot e_{t x} + I_{d (D^{(L)})} \cdot e_{r x} & if D^{(L)} \in E^{s o}, \\ e^{'} - e + I_{s (D^{(L)})} e_{init}^{C T} + \sum_{k \in C (D^{(L)})} I_{k} e_{co}^{C T} & if D^{(L)} \in E^{v o} . \end{cases} \end{matrix}

(24)

Then, considering the influence of energy harvesting, we get

\begin{matrix} p \{e^{'} ∣ L^{'}, L, e, a\} = \{\begin{cases} \prod_{i \in V} (h_{i} 1 \{D_{i}^{(e)} = 1\} + (1 - h_{i}) 1 \{D_{i}^{(e)} = 0, i \in V^{(e)}\}) & if D^{(L)} \in \{E^{so}, E^{vo}, ⌀\}, \\ 0 & otherwise . \end{cases} \end{matrix}

(25)

4.2.3. System Dynamics

Similar to the EC network, the transition matrix of the system states $s ≜ \{L, e, q\}$ for the EH network can be obtained by the following theorem.

Theorem 2.

The transition kernel of the EH system, $p \{s^{'} ∣ s, a\}$ , is equal to the product of (5), (23), and (25).

4.3. Performance Criterion and Reward Function

Let $v^{π} (s)$ denote the expected total reward, given an initial state s and a policy π (a series of decisions),

\begin{matrix} v^{π} (s) = \lim_{T \to \infty} E_{s}^{π} [\sum_{t = 1}^{T} r (s (t), a (t))], \end{matrix}

(26)

where T is the decision-making horizon length.

The definition of the reward function, $r (s, a)$ , depends on application requirements and can be subjective to network designers. We define two reward functions as follows:

\begin{matrix} r^{(1)} (s, a) = \sum_{\begin{smallmatrix} \underset{d (l) = 0}{l \in Φ (L \cup a)} \end{smallmatrix}} P \{z^{l} ∣ L, a\} 1 \{Φ (L \cup a) \neq ⌀\}, \end{matrix}

(27)

\begin{matrix} r^{(2)} (s, a) = ω_{1} r^{(1)} (s, a) - ω_{2} 1 \{s \in S_{D}\} . \end{matrix}

(28)

In (27), a unit of reward is obtained if a packet is successfully received by the Sink. Equation (28) is a weighted sum of the rewards from delivered packets and the penalty from a QoS degradation (e.g., $S_{D}$ are states where any node's energy drops to zero).

Since the state space and the action space are finite, with a finite reward function, (26) can be expressed as

\begin{matrix} v_{λ}^{π} (s) = E_{s}^{π} \{\sum_{t = 1}^{\infty} λ^{t - 1} r (s (t), a (t))\}, \end{matrix}

(29)

where it is assumed that T is geometrically distributed with parameter λ, $0 \leq λ < 1$. Thus, the expected value of $T$ is $1 / (1 - λ)$. The parameter λ is the discount factor, which measures the present value of one unit of reward received one period in the future [16].
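The link between a geometrically distributed horizon and the discounted sum in (29) can be checked in a few lines. The sketch below assumes that $T$ is independent of the state-action process, that $P \{T = n\} = (1 - λ) λ^{n - 1}$ for $n = 1, 2, \dots$, and that the rewards are bounded so that the order of summation can be exchanged; it restates the standard argument in [16] rather than adding a new result.

```latex
\begin{aligned}
E_{s}^{π} \Big[ \sum_{t=1}^{T} r (s (t), a (t)) \Big]
  &= E_{s}^{π} \Big[ \sum_{n=1}^{\infty} (1 - λ) λ^{n-1} \sum_{t=1}^{n} r (s (t), a (t)) \Big] \\
  &= E_{s}^{π} \Big[ \sum_{t=1}^{\infty} r (s (t), a (t)) \sum_{n=t}^{\infty} (1 - λ) λ^{n-1} \Big] \\
  &= E_{s}^{π} \Big[ \sum_{t=1}^{\infty} λ^{t-1} r (s (t), a (t)) \Big] = v_{λ}^{π} (s), \\
E [T] &= \sum_{n=1}^{\infty} n (1 - λ) λ^{n-1} = \frac{1}{1 - λ} .
\end{aligned}
```

For example, $λ = 0.9999$ corresponds to an expected horizon of $10^{4}$ time slots.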

4.4. Optimality Equations

The optimality equation for the expected total discounted reward criterion given an initial state, also known as Bellman's equation, is given as follows:

\begin{matrix} v_{λ}^{*} (s) = \max_{a \in A (s)} \{r (s, a) + \sum_{s^{'}} λ p \{s^{'} ∣ s, a\} v_{λ}^{*} (s^{'})\} . \end{matrix}

(30)
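To make the fixed-point structure of (30) concrete, the following minimal Python sketch applies value iteration, the successive approximation method mentioned in Section 5.3, to a toy two-state MDP. It is not the dual LP method used in this paper, and the states, actions, rewards, and the function name `value_iteration` are hypothetical and unrelated to the network model.

```python
def value_iteration(states, actions, reward, trans, lam, tol=1e-10, max_iter=100_000):
    """Iterate v <- max_a { r(s,a) + lam * sum_s' p(s'|s,a) v(s') } until convergence."""
    v = {s: 0.0 for s in states}
    for _ in range(max_iter):
        v_new = {
            s: max(
                reward[s, a] + lam * sum(p * v[s2] for s2, p in trans[s, a].items())
                for a in actions[s]
            )
            for s in states
        }
        if max(abs(v_new[s] - v[s]) for s in states) < tol:
            return v_new
        v = v_new
    return v


# Toy two-state example (hypothetical numbers, not the network model).
states = [0, 1]
actions = {0: ["stay"], 1: ["stay", "go"]}
reward = {(0, "stay"): 1.0, (1, "stay"): 0.5, (1, "go"): 0.0}
trans = {(0, "stay"): {0: 1.0}, (1, "stay"): {1: 1.0}, (1, "go"): {0: 1.0}}

v_star = value_iteration(states, actions, reward, trans, lam=0.9)
print(v_star)
```

In this toy case the fixed point can be verified by hand: state 0 collects one unit of reward per slot, so its value is $1 / (1 - 0.9) = 10$, and from state 1 it is better to move to state 0 (value $0.9 \times 10 = 9$) than to stay and collect $0.5$ per slot (value $5$).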

5. Computational Methods

In this section, we present the offline computational algorithms for the EC and EH networks. The methods seek to find the global optimum and do not require any information exchange among the nodes.

5.1. Energy Constrained Networks

Considering that the total energies are nonincreasing, we group the states according to the sum energy (in decreasing order) and solve the stages backwards. The transmission sets are computed using the Bron-Kerbosch algorithm [42], which is widely used and considered as one of the fastest. In each stage, the computation of the energy and queue state spaces is similar to the Bin-Ball problem (the Bin-Ball problem refers to enumerating the ways of allocating $n_{1}$ balls into $n_{2}$ bins. In our problem, the “bins” are the nodes and the “balls” are the total energies or queued packets). For instance, the number of ways to allocate $n_{1}$ units of energies into $n_{2}$ nodes with minimum energy requirement of $θ = e_{th} + 1$ (in $S ∖ S_{t}$ ) is given by

\begin{matrix} \sum_{l = 0}^{n_{2}} {(- 1)}^{l} (\begin{pmatrix} n_{2} \\ l \end{pmatrix}) (\begin{pmatrix} n_{2} + n_{1} - n_{2} θ - l (e^{\max} + 1 - θ) - 1 \\ n_{2} - 1 \end{pmatrix}) . \end{matrix}

(31)
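The count in (31) can be cross-checked numerically. The sketch below evaluates the inclusion-exclusion sum and compares it with a brute-force enumeration for small parameters. The function names are hypothetical, and binomial coefficients whose upper index is smaller than the lower index are taken to be zero, which is the usual convention assumed here.

```python
from itertools import product
from math import comb


def count_allocations(n1, n2, theta, e_max):
    """Evaluate (31): ways to split n1 energy units over n2 nodes, theta <= e_i <= e_max."""
    total = 0
    for l in range(n2 + 1):
        top = n2 + n1 - n2 * theta - l * (e_max + 1 - theta) - 1
        if top < n2 - 1:          # binomial coefficient is zero in this case
            continue
        total += (-1) ** l * comb(n2, l) * comb(top, n2 - 1)
    return total


def count_by_enumeration(n1, n2, theta, e_max):
    """Brute-force check: enumerate every energy vector and count those summing to n1."""
    levels = range(theta, e_max + 1)
    return sum(1 for e in product(levels, repeat=n2) if sum(e) == n1)


print(count_allocations(6, 3, 1, 3), count_by_enumeration(6, 3, 1, 3))
```

For $n_{1} = 6$, $n_{2} = 3$, $θ = 1$, and $e^{\max} = 3$, both functions return 7: the six orderings of $(1, 2, 3)$ plus $(2, 2, 2)$.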

For an integer m (stage), the set of states $S_{m}$ is defined as

\begin{matrix} S_{m} = \{(L, e, q) ∣ \sum_{i \in V} e_{i} = m\}, m = N θ, \dots, N e^{\max} . \end{matrix}

(32)

For each $S_{m}$ and $s \in S_{m}$ , we have

\begin{matrix} J^{*} (s) = \max_{a \in A (s)} \{g (s, a) + \sum_{s^{'}} P \{s^{'} ∣ s, a\} J^{*} (s^{'})\} = \max_{a \in A (s)} \{g (s, a) + \sum_{s^{'} \in S_{N θ} \cup \dots \cup S_{m - 1}} P \{s^{'} ∣ s, a\} J^{*} (s^{'}) + \sum_{s^{'} \in S_{m}} P \{s^{'} ∣ s, a\} J^{*} (s^{'})\}, s \in S_{m} . \end{matrix}

(33)

Therefore, we formulate a mixed integer linear programming (MILP) problem (Subproblem-1):

\begin{matrix} \text{Minimize} & \sum_{s \in S_{m}} J (s) \\ \text{s.t.} & J (s) - \sum_{s^{'} \in S_{m}} P (s^{'} ∣ s, a) J (s^{'}) \geq b (s, a), \\ & J (s) \in Z^{+}, s \in S_{m}, \end{matrix}

(34)

where $b (s, a)$ are obtained from previously computed stages:

\begin{matrix} b (s, a) = g (s, a) + \sum_{s^{'} \in S_{N θ} \cup \dots \cup S_{m - 1}} P \{s^{'} ∣ s, a\} J^{*} (s^{'}) . \end{matrix}

(35)

The computational method is summarized in Algorithm 1.

Algorithm 1: SSP-MILP algorithm.

input:

$N θ$ = Minimum Sum energy in states $S ∖ S_{t}$ ;

$N e^{\max}$ = Maximum Sum energy in states $S ∖ S_{t}$ ;

$J (s) = 0$ , $\forall s \in S_{t}$ ;

output: $J (s)$ , $\forall s \in S ∖ S_{t}$ ;

begin

Find transmission link set using Bron-Kerbosch alg.;

$m \leftarrow N θ$ ;

while $m \leq N e^{\max}$ do

Solve Bin-ball problem to obtain $S_{m}$ .

Solve MILP for Subproblem-1 to obtain $J (s)$ , $s \in S_{m}$ .

$m \leftarrow m + 1$ .

end
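The stage-by-stage structure of Algorithm 1 can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the SSP-MILP implementation: the MILP of Subproblem-1 is replaced by a plain in-stage fixed-point iteration on (33), the Bron-Kerbosch and Bin-Ball steps are omitted, and the single-node battery model with two hypothetical actions (`tx_a`, `tx_b`) is a toy example.

```python
def solve_by_stages(stages, actions, g, P, J_terminal, tol=1e-12):
    """Solve J(s) = max_a { g(s,a) + sum_s' P(s'|s,a) J(s') } stage by stage.

    stages: list of lists of states, ordered by increasing sum energy.
    Transitions may only lead to the same stage, an earlier stage, or a
    termination state, so earlier stages act as known constants b(s, a).
    """
    J = dict(J_terminal)                      # J = 0 on termination states
    for stage in stages:
        for s in stage:
            J[s] = 0.0
        while True:                           # in-stage fixed-point iteration
            delta = 0.0
            for s in stage:
                best = max(
                    g[s, a] + sum(p * J[s2] for s2, p in P[s, a].items())
                    for a in actions[s]
                )
                delta = max(delta, abs(best - J[s]))
                J[s] = best
            if delta < tol:
                break
    return J


# Toy model: one node with battery level e; e = 0 is the termination state.
# tx_a finishes with probability 0.5 and is then decoded with probability 0.8;
# tx_b finishes with probability 0.9 and is then decoded with probability 0.6.
# A finished transmission always consumes one energy unit.
E_MAX = 3
stages = [[e] for e in range(1, E_MAX + 1)]
actions = {e: ["tx_a", "tx_b"] for e in range(1, E_MAX + 1)}
g, P = {}, {}
for e in range(1, E_MAX + 1):
    g[e, "tx_a"], P[e, "tx_a"] = 0.5 * 0.8, {e - 1: 0.5, e: 0.5}
    g[e, "tx_b"], P[e, "tx_b"] = 0.9 * 0.6, {e - 1: 0.9, e: 0.1}

J = solve_by_stages(stages, actions, g, P, J_terminal={0: 0.0})
print({e: round(J[e], 6) for e in sorted(J)})
```

In this toy model each energy unit is worth $0.8$ expected delivered packets under `tx_a` and $0.6$ under `tx_b`, so the optimal value is $J (e) = 0.8 e$. The point of the sketch is only that, once the lower stages are solved, each stage involves the states of that stage alone.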

5.2. Energy Harvesting Networks

The following theorem provides the basis for a linear programming (LP) approach to solve the MDP problem. The proof can be found in [16].

Theorem 3.

Suppose there exists v, for which $v \geq T (v)$; then $v \geq v_{λ}^{*}$. $T$ is the nonlinear operator defined as $T (v) \equiv \sup_{\text{all } a} \{r_{a} + λ P_{a} v\}$.

From the observation in Theorem 3, the primal LP is constructed as follows.

Primal LP. Consider the following:

\begin{matrix} \text{Minimize} & \sum_{s \in S} η (s) v (s) \\ \text{s.t.} & v (s) - \sum_{s^{'} \in S} λ P (s^{'} ∣ s, a) v (s^{'}) \geq r (s, a) \end{matrix}

(36)

for all $a \in A (s)$ and all $s \in S$, where $η (s)$ are chosen to be positive scalars that satisfy $\sum_{s \in S} η (s) = 1$.

However, it is more informative to solve this model using its dual LP [16], which has fewer rows in the constraint matrix.

Dual LP. Consider the following:

\begin{matrix} \text{Maximize} & \sum_{s \in S} \sum_{a \in A (s)} r (s, a) x (s, a) \\ \text{s.t.} & \sum_{a \in A (s)} x (s, a) - \sum_{s^{'} \in S} \sum_{a \in A (s^{'})} λ P (s ∣ s^{'}, a) x (s^{'}, a) = η (s) \end{matrix}

(37)

and $x (s, a) \geq 0$ for $a \in A (s)$ and $s \in S$.

Note that the primal LP has $\sum_{s \in S} |A (s)|$ rows and $|S|$ columns, while the dual LP has $|S|$ rows and $\sum_{s \in S} |A (s)|$ columns. In [44], it is shown that the value function for each state in the primal LP is equal to the dual price corresponding to the constraint associated with that state in the dual LP. The dual price (also known as the shadow price) of a constraint is the instantaneous change in the objective function obtained by relaxing the constraint by one unit. Thus, the value function associated with a given initial state of the primal LP is obtained by first solving the dual problem and then finding the dual price of the corresponding constraint.
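The relation between the dual variables in (37) and the value function can be illustrated for a single fixed stationary policy, in which case $x (s, a)$ reduces to the discounted occupation measure of that policy. The Python sketch below does not solve either LP. It only checks, for a hypothetical two-state chain, that this $x$ satisfies the equality constraint of (37) and that the dual objective equals the η-weighted value $\sum_{s} η (s) v (s)$. The transition matrix, rewards, and function names are assumptions for illustration.

```python
def discounted_value(P, r, lam, iters=2000):
    """Policy evaluation: v = r + lam * P v, by successive approximation."""
    n = len(r)
    v = [0.0] * n
    for _ in range(iters):
        v = [r[s] + lam * sum(P[s][s2] * v[s2] for s2 in range(n)) for s in range(n)]
    return v


def occupation_measure(P, eta, lam, iters=2000):
    """Fixed-policy dual variable: x(s) = eta(s) + lam * sum_s' P(s | s') x(s')."""
    n = len(eta)
    x = list(eta)
    for _ in range(iters):
        x = [eta[s] + lam * sum(P[s2][s] * x[s2] for s2 in range(n)) for s in range(n)]
    return x


# Hypothetical two-state chain under a fixed policy; P[s][s2] = P(s2 | s).
P = [[0.5, 0.5], [0.2, 0.8]]
r = [1.0, 0.0]
eta = [0.5, 0.5]
lam = 0.9

v = discounted_value(P, r, lam)
x = occupation_measure(P, eta, lam)

primal_objective = sum(e * v_s for e, v_s in zip(eta, v))   # sum_s eta(s) v(s)
dual_objective = sum(r_s * x_s for r_s, x_s in zip(r, x))   # sum_s r(s) x(s)
print(primal_objective, dual_objective)
```

The two printed numbers agree up to numerical tolerance, and the entries of `x` sum to $1 / (1 - λ)$, which is the total discounted time spent in the system.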

5.3. Computation Complexity of Large-Scale MDP

It has been shown that MDPs are solvable in polynomial time by successive approximation methods such as value iteration, although no strongly polynomial time algorithm is known [45]. For large-scale MDPs, however, the cost of the iterations can be prohibitive. In contrast, large-scale linear programs can be solved efficiently with state-of-the-art LP solvers, in polynomial time depending only on the size of A (the constraint matrix in $A x \leq b$ ) [46]. In addition, fast ϵ-approximation algorithms are useful for solving LPs with exponentially many constraints, trading solution accuracy for time [47].

In this paper, we do not target finding the optimal policy. However, we note that standard policy iteration algorithms and their various modifications can be applied offline to obtain the optimal stationary policies, at the cost of iterations over the whole state space. In addition, policies obtained from this network model would require a network centralizer to conduct the execution, which is less appealing than a distributed approximate optimal policy and which is out of the scope of this paper.

6. Numerical Evaluations

We present some numerical results on the optimal lifetime for the EC network and the expected total discounted reward for the EH network, under the 2-hop funnel topology (Nodes A, B, C, and D and the Sink), as in Figure 1. Besides the computational limitation inherent to MDP, the motivation for choosing a small network is as follows. It is observed from our previous work [10] that under duty-cycling, a large network is typically reduced into isolated small networks in the time domain, to reduce interference and collisions. This observation renders the possibility to analyze a small tree topology towards the whole network.

6.1. Energy Constrained Networks

In this section, we evaluate the optimal lifetime of the EC network and the impacts of certain parameters. Note that the funnel topology (with the existence of the bottleneck Node C) captures the essence of the energy hole problem.

Figure 3 depicts the optimal lifetime for non-CT and CT networks. We observe the lifetimes are linearly increasing with the battery capacity. We also observe that the performance of CT network is significantly higher than that of the non-CT network. For example, with battery capacity of $10$ units, the lifetime improvement factor of CT network is $1.89$ with $1$ cooperator. We assume that either one or two cooperators are required to reach the Sink. This is motivated by the fact that theory [32] predicts that one cooperator can double the range for certain path loss exponents, but our testbed experiment shows that at least two cooperators are needed to double the range [8]. Considering our VMISO link model in (1), we can show that two helpers ( $|C (s (l))| = 2$ ) are required for double range instead of one helper, if the path loss exponent increases by a factor $\log (G_{3}) / \log (G_{2})$ . In practice, the number of cooperators required to conduct CT range extension is regulated by the channel status and different CT physical layer techniques; an efficient protocol should select just enough cooperators to avoid energy overuse. The lifetime with larger battery capacities can be obtained from linear regression. From the slope of the extrapolated lines (not shown), the lifetime growth rate with respect to battery capacity is $1.0$ for non-CT, $1.6$ for CT with 2 cooperators, and $2.1$ for CT with $1$ cooperator. Therefore, we conclude that CT gives less of an advantage in terms of lifetime, if more cooperators are required for the same factor of range extension.

Figure 3

Optimal lifetime for non-CT network and CT network, as battery capacity varies. $N = 4$ (number of nodes). $N_{H} = 1$ or 2 (required number of cooperators). $e_{th} = 1$ , $e_{t x} = e_{r x} = 1$ , $e_{i n i t}^{C T} = 1$ , $e_{c o}^{C T} = 2$ , $q_{\max} = 1$ .

Figure 4 shows the computed lifetime for different packet arrival rates (PARs). The PAR is normalized by the packet length; that is, it is the number of new exogenous packets per node during a packet duration. The PARs where the curves start to become flat represent the PAR thresholds, beyond which the numerical results are accurate. Figure 4 demonstrates that our model is accurate for a very large range of arrival rates (<0.4). For example, if a node generates a packet every 100 seconds and the packet transmission time is 100 ms, then the PAR is 0.001, well below $0.4$. In other words, with a 100 ms packet transmission time, the queue model assumption is valid as long as the application generates fewer than $4$ packets per node per second.
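The arithmetic behind these two numbers is spelled out below, assuming the 100 ms packet transmission time and the 100 s generation interval used in the example.

```latex
\begin{aligned}
\text{PAR} &= \frac{\text{packet duration}}{\text{packet interarrival time}} = \frac{0.1\ \text{s}}{100\ \text{s}} = 0.001, \\
\text{maximum rate} &= \frac{\text{PAR threshold}}{\text{packet duration}} = \frac{0.4}{0.1\ \text{s}} = 4\ \text{packets per node per second} .
\end{aligned}
```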

Figure 4

Lifetime for non-CT network and CT network, as the normalized packet arrival rate (PAR) varies. $N = 4$ , $N_{H} = 2$ . $e_{th} = 1$ , $e_{t x} = e_{r x} = 1$ , $e_{init}^{CT} = 1$ , $e_{co}^{CT} = 2$ , $q^{\max} = 1$ .

6.2. Energy Harvesting Networks

In this section, we evaluate the expected total discounted reward and the impacts from certain parameters. The reward function in (27) is used for plotting Figures 5, 6, and 7, while (28) is used for Figure 8.

Figure 5

The effect of discount factor λ. $α = 0.1$ , $γ = 0.01$ , $e_{\max} = 4$ , $q_{\max} = 1$ .

Figure 6

The effect of energy harvesting rate γ. $α = 0.1$ , $λ = 0.9999$ or $0.99999$ , $e_{\max} = 4$ , $q_{\max} = 1$ .

Figure 7

The effect of packet arrival rate α. $λ = 0.9999$ , $γ = 0.02$ , $e_{\max} = 4$ .

Figure 8

The effect of weight factor $ω_{1}$ . $α = 0.1$ , $λ = 0.99999$ , $γ = 0.02$ , $e_{\max} = 4$ .

Figure 5 shows the expected total discounted reward of non-CT and CT networks, versus different discount factors λ. The decision-making horizon is indicated by $1 / (1 - λ)$ time units. The system parameters are $α = 0.1$ , $e_{\max} = 4$ , $q_{\max} = 1$ , $e_{t x} = e_{r x} = e_{i n i t}^{C T} = 1$ , and $e_{c o}^{C T} = 2$ , where $e_{t x}$ and $e_{r x}$ represent transmitting/receiving energy, $e_{i n i t}^{C T}$ represents the energy consumption of initiating CT, and $e_{c o}^{C T}$ is the energy consumption of a cooperator. We observe that the rewards increase as λ increases due to an increase in the number of decision epochs, which allows prolonged operation of the network in the optimal states prescribed by the optimal policy. While CT network always outperforms non-CT network ( $55\%$ to $61\%$ improvement), the curve of the CT network also grows with a steeper rate.

Figure 6 depicts the expected total discounted reward of non-CT and CT networks versus energy harvesting rate γ, with $λ = 0.9999$ or $λ = 0.99999$ . Other system parameters are the same as used in Figure 5. We observe that, for both non-CT and CT network, the rewards increase linearly when γ is below $0.01$ . Then the curves grow with a slower rate, before the performances of non-CT and CT networks start to converge when the value of γ reaches a threshold value $γ_{t h}$ . For $λ = 0.9999$ , $γ_{th} = 0.04$ , and for $λ = 0.99999$ , $γ_{t h} = 0.07$ . This threshold is related to both energy harvesting rate and battery capacity. High harvesting rate, with which packet transmission opportunities are less constrained by energy level, diminishes the benefits of CT. However, in practice, when harvesting rate is not large enough, the CT network still provides a significant gain over the non-CT network, as shown in the figure.

Figure 7 shows the expected total discounted reward of non-CT and CT networks versus exogenous packet arrival rate α, with $q_{\max} = 1$ or $q_{\max} = 2$ . The system parameters are $λ = 0.9999$ and $γ = 0.02$ . It can be seen that when the offered traffic load (α) increases, while not saturating the networks, the rewards show a linear increase. When α is large enough, the rewards are bottlenecked by the queue capacity. We also observe that the reward of the non-CT network shows a slight instability between $α = 0.01$ and $0.02$ , where the rewards slightly decrease, for both $q_{\max} = 1$ and $q_{\max} = 2$ . We do not have an explanation for this observation. However, the CT network does not show this behavior.

Figure 8 gives an example that the calculated benchmark of performance is highly affected by the choice of reward function. The network designer can choose a reward function according to a particular QoS requirement. The x-axis is the preference weighting factor $ω_{1}$ in the reward function in (28). Note that $ω_{2} = 1 - ω_{1}$ represents the significance of an occurrence of QoS degradation. When a certain application deems a state with a node having zero energy as a severe situation, both non-CT and CT networks will be conservative in initiating a packet transmission, limiting the packets delivered in a given time horizon. On the other hand, the value functions increase as the significance relaxes. Again, CT network still outperforms non-CT network in both magnitude and rate in all cases.

In Figure 9, we evaluate the effect of the CT overhead on the expected total discounted reward of the CT network, with different preference weighting factors $ω_{1} = 0.1,0.5,0.9$ . The expected total discounted rewards of the non-CT network, which are constant, are also plotted for reference. The CT overhead is modeled as an extended packet transmission time, which is a result of the extra preamble symbols in front of the packet payload to achieve diversity and synchronization for the purpose of range extension [12, 48]. Specifically, the horizontal axis in Figure 9 is the CT overhead, quantified in terms of the fraction of the total packet length, from $0$ to $10\%$ . For $ω_{1} = 0.9$ , at the overhead rate of $0.1$ ( $10\%$ ), the CT network outperforms the non-CT network by $38.2\%$ . For the CT network, comparing with the overhead rate of $0$ , only $2.3\%$ degradation is observed at the overhead rate of $0.1$ . We note that, in a more practical setting with control packets accounted for, we expect to see a bigger degradation value. However, fortunately it is shown that, even with control packets accounted for, CT still has a significant advantage over non-CT in both small networks [9] and large-scale multihop networks [13].

Figure 9

The effect of CT overhead rate. Weight factors $ω_{1} = 0.1,0.5,0.9$ , $α = 0.1$ , $λ = 0.99999$ , $γ = 0.02$ , $e_{\max} = 4$ .

7. Conclusions

Energy harvesting (EH) and energy constrained (EC) multihop wireless sensor networks are modeled under a unified Markovian decision process framework. The model extends the literature by encompassing, from the network interaction point of view, energy harvesting, routing, MAC, and cooperative transmission. The performance of EC and EH networks is characterized by finite-horizon and infinite-horizon problems, respectively. The EC model captures the traffic imbalance due to the “energy hole,” which is a known bottleneck problem limiting network lifetime. In the EH network, the expected total discounted reward model allows one to formulate different performance metrics of interest. The model of EC networks is solved by mixed integer linear programming, capitalizing on the stochastic shortest path nature of the problem, while the model of EH networks is solved by a dual linear programming approach. Numerical results evaluate the sensitivity of several network parameters on the optimal performance of non-CT and CT networks and show the nontrivial impacts of cooperative transmission.

Footnotes

Disclosure

Mary Ann Weitnauer was formerly Mary Ann Ingram.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The authors gratefully acknowledge support for this research from the US National Science Foundation (NSF) under Grant CNS-1017984.

References

Sharma

Mukherji

Joseph

Gupta

Optimal energy management policies for energy harvesting sensor nodes

IEEE Transactions on Wireless Communications 2010 9 4 1326 1336

10.1109/TWC.2010.04.080749

2-s2.0-77951293654

Wang

Tajer

Wang

Communication of energy harvesting tags

IEEE Transactions on Communications 2012 60 4 1159 1166

10.1109/tcomm.2012.022912.110298

2-s2.0-84862820668

Chen

Zhao

Krishnamurthy

Djonin

Transmission scheduling for optimizing sensor network lifetime: a stochastic shortest path approach

IEEE Transactions on Signal Processing 2007 55 5 2294 2309

10.1109/tsp.2007.893213

MR2473844

2-s2.0-34247874478

Akyildiz

I. F.

Sankarasubramaniam

Cayirci

Wireless sensor networks: a survey

Computer Networks 2002 38 4 393 422

10.1016/s1389-1286(01)00302-4

2-s2.0-0037086890

Mohapatra

An analytical model for the energy hole problem in many-to-one sensor networks

Proceedings of the IEEE 62nd Vehicular Technology Conference (VTC-Fall ′05)

September 2005

Dallas, Tex, USA

2721 2725

10.1109/VETECF.2005.1559043

Laneman

J. N.

Tse

D. N.

Wornell

G. W.

Cooperative diversity in wireless networks: efficient protocols and outage behavior

IEEE Transactions on Information Theory 2004 50 12 3062 3080

10.1109/tit.2004.838089

MR2103484

2-s2.0-5044252003

Lin

Jung

Chang

Y. J.

Jung

J. W.

Weitnauer

M. A.

On cooperative transmission range extension in multi-hop wireless ad-hoc and sensor networks: a review

Ad Hoc Networks 2015 29 117 134

10.1016/j.adhoc.2015.01.018

2-s2.0-84923085332

Jung

Chang

Y. J.

Ingram

M. A.

Experimental range extension of concurrent cooperative transmission in indoor environments at 2.4GHz

Proceedings of the IEEE Military Communications Conference (MILCOM ′10)

November 2010

San Jose, Calif, USA

148 153

10.1109/milcom.2010.5680162

2-s2.0-79951625314

Guntupalli

Lin

Weitnauer

M. A.

F. Y.

ACT-MAC: an asynchronous cooperative transmission MAC protocol for WSNs

Proceedings of the IEEE International Conference on Communications Workshops (ICC ′14)

June 2014

Sydney, Australia

IEEE

848 853

10.1109/iccw.2014.6881306

2-s2.0-84906748876

10.

Lin

Ingram

M. A.

SCT-MAC: a scheduling duty cycle MAC protocol for cooperative wireless sensor network

Proceedings of the IEEE International Conference on Communications (ICC ′12)

June 2012

Ottawa, Canada

345 349

10.1109/icc.2012.6364580

2-s2.0-84871966253

11.

Jung

J. W.

Ingram

M. A.

Using range extension cooperative transmission in energy harvesting wireless sensor networks

Journal of Communications and Networks 2012 14 2 169 178

10.1109/jcn.2012.6253065

12.

Lin

Weitnauer

M. A.

Diversity in synchronization for scheduled OFDM time-division cooperative transmission

Proceedings of the IEEE Military Communications Conference (MILCOM ′15)

October 2015

Tampa, Fla, USA

IEEE

570 575

10.1109/milcom.2015.7357504

13.

Lin

Ingram

M. A.

OSC-MAC: duty cycle scheduling and cooperation in multi-hop wireless sensor networks

Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC ′13)

April 2013

Shanghai, China

866 871

10.1109/wcnc.2013.6554677

2-s2.0-84881598866

14.

Lin

Weitnauer

M. A.

A Markovian approach to modeling the optimal lifetime of multi-hop wireless sensor networks

Proceedings of the IEEE Military Communications Conference (MILCOM ′13)

November 2013

San Diego, Calif, USA

IEEE

1702 1707

10.1109/milcom.2013.288

2-s2.0-84897692440

15.

Jacoub

J. K.

Liscano

Bradbury

J. S.

A survey of modeling techniques for wireless sensor networks

Proceedings of the 5th International Conference on Sensor Technologies and Applications (SENSORCOMM ′11)

2011

103 109

16.

Puterman

M. L.

Markov Decision Processes: Discrete Stochastic Dynamic Programming 1994

New York, NY, USA

John Wiley & Sons

17.

Zhang

Wang

Zhao

Chen

Zhang

Modeling and evaluation of wireless sensor network protocols by stochastic timed automata

Electronic Notes in Theoretical Computer Science 2013 296 261 277 Proceedings the Sixth International Workshop on the Practical Application of Stochastic Modelling (PASM) and the Eleventh International Workshop on Parallel and Distributed Methods in Verification (PDMC)

10.1016/j.entcs.2013.09.001

18.

Shareef

Zhu

Effective stochastic modeling of energy-constrained wireless sensor networks

Journal of Computer Networks and Communications 2012 2012 20

870281

10.1155/2012/870281

2-s2.0-84871399252

19.

Ozel

Tutuncuoglu

Yang

Ulukus

Yener

Transmission with energy harvesting nodes in fading wireless channels: optimal policies

IEEE Journal on Selected Areas in Communications 2011 29 8 1732 1743

10.1109/jsac.2011.110921

2-s2.0-80052040201

20.

Iannello

Simeone

Spagnolini

Optimality of myopic scheduling and whittle indexability for energy harvesting sensors

Proceedings of the 46th Annual Conference on Information Sciences and Systems (CISS ′12)

March 2012

Princeton, NJ, USA

IEEE

1 6

10.1109/ciss.2012.6310816

2-s2.0-84868595047

21.

Jung

J. W.

Weitnauer

M. A.

On using cooperative routing for lifetime optimization of multi-hop wireless sensor networks: analysis and guidelines

IEEE Transactions on Communications 2013 61 8 3413 3423

10.1109/tcomm.2013.052013.120707

2-s2.0-84883287626

22.

Lee

J.-H.

Moon

Modeling and optimization of energy efficient routing in wireless sensor networks

Applied Mathematical Modelling 2014 38 7-8 2280 2289

10.1016/j.apm.2013.10.044

MR3176250

2-s2.0-84895924067

23.

Huang

Neely

M. J.

Utility optimal scheduling in energy harvesting networks

Proceedings of the 12th ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc′11)

May 2011

ACM

10.1145/2107502.2107531

2-s2.0-84857499425

24.

Bhardwaj

Chandrakasan

Bounding the lifetime of sensor networks via optimal role assignments

Proceedings of the 21st IEEE Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ′02)

2002

New York, NY, USA

1587 1596

10.1109/INFCOM.2002.1019410

25.

Zhang

Hou

On deriving the upper bound of α-lifetime for large sensor networks

Proceedings of the 5th ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc ′04)

May 2004

Tokyo, Japan

ACM

121 132

10.1145/989459.989475

26.

Huang

W.-J.

Peter Hong

Y.-W.

Jay Kuo

C.-C.

Lifetime maximization for amplify-and-forward cooperative networks

IEEE Transactions on Wireless Communications 2008 7 5 1800 1805

10.1109/twc.2008.061075

2-s2.0-47149116514

27.

Ikki

S. S.

Ahmed

M. H.

Performance analysis of adaptive decode-and-forward cooperative diversity networks with best-relay selection

IEEE Transactions on Communications 2010 58 1 68 72

10.1109/tcomm.2010.01.080080

2-s2.0-76649093617

28.

Chang

J.-H.

Tassiulas

Energy conserving routing in wireless ad-hoc networks

Proceedings of the IEEE 19th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ′00)

March 2000

Tel Aviv, Israel

22 31

10.1109/INFCOM.2000.832170

29.

Jung

J. W.

Ingram

M. A.

On the optimal lifetime of cooperative routing for multi-hop wireless sensor networks

Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC ′13)

April 2013

Shanghai, China

IEEE

1315 1320

10.1109/wcnc.2013.6554754

2-s2.0-84881581086

30.

Tutuncuoglu

Yener

Optimum transmission policies for battery limited energy harvesting nodes

IEEE Transactions on Wireless Communications 2012 11 3 1180 1189

10.1109/twc.2012.012412.110805

2-s2.0-84858332615

31.

Stuber

G. L.

Principles of Mobile Communication 1996 1st

Norwell, Mass, USA

Kluwer Academic

32.

Jakllari

Krishnamurthy

S. V.

Faloutsos

Krishnamurthy

P. V.

Ercetin

A cross-layer framework for exploiting virtual MISO links in mobile ad hoc networks

IEEE Transactions on Mobile Computing 2007 6 6 579 594

10.1109/TMC.2007.1068

2-s2.0-34247608371

33.

Nardelli

Knightly

E. W.

Closed-form throughput expressions for CSMA networks with collisions and hidden terminals

Proceedings of the IEEE INFOCOM

March 2012

Orlando, Fla, USA

2309 2317

10.1109/infcom.2012.6195618

2-s2.0-84861598653

34.

Liu

Lin

Tao

Korakis

Erkip

Panwar

The hidden cost of hidden terminals

Proceedings of the IEEE International Conference on Communications (ICC ′10)

May 2010

Cape Town, South Africa

IEEE

1 6

10.1109/icc.2010.5502214

2-s2.0-77955408386

35.

Boorstyn

R. R.

Kershenbaum

Maglaris

Sahin

Throughput analysis in multihop CSMA packet radio networks, communications

IEEE Transactions on Communications 1987 35 3 267 274

10.1109/tcom.1987.1096769

2-s2.0-0023310717

36.

Bui

Zhu

Botta

Pescapé

A markovian approach to multipath data transfer in overlay networks

IEEE Transactions on Parallel and Distributed Systems 2010 21 10 1398 1411

10.1109/tpds.2010.13

2-s2.0-77956177444

37.

Wang

Hempstead

Yang

A realistic power consumption model for wireless sensor network devices

Proceedings of the 3rd Annual IEEE Communications Society on Sensor and Ad Hoc Communications and Networks (SECON ′06)

September 2006

Reston, Va, USA

IEEE

286 295

10.1109/sahcn.2006.288433

2-s2.0-43849110586

38.

Cui

Goldsmith

A. J.

Bahai

Energy-efficiency of MIMO and cooperative MIMO techniques in sensor networks

IEEE Journal on Selected Areas in Communications 2004 22 6 1089 1098

10.1109/JSAC.2004.830916

2-s2.0-4344710765

39.

Lee

Z. A.

Han

Tan

H.-P.

Empirical modeling of a solar-powered energy harvesting wireless sensor node for time-slotted operation

Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC ′11)

March 2011

Cancun, Mexico

IEEE

179 184

10.1109/wcnc.2011.5779157

2-s2.0-79959311532

40.

Kailas

Brunelli

Ingram

M. A.

A simple energy model for the harvesting and leakage in a supercapacitor

Proceedings of the IEEE International Conference on Communications (ICC ′12)

June 2012

Ottawa, Canada

IEEE

6278 6282

10.1109/icc.2012.6364809

2-s2.0-84871982289

41. Mao, Cheung M. H., Wong V. W. S. An optimal energy allocation algorithm for energy harvesting wireless sensor networks. Proceedings of the IEEE International Conference on Communications (ICC '12), Ottawa, Canada, June 2012. pp. 265–270. doi:10.1109/icc.2012.6364174. Scopus: 2-s2.0-84871961307.

42. Bron, Kerbosch. Algorithm 457: finding all cliques of an undirected graph. Communications of the ACM, vol. 16, no. 9, pp. 575–577, 1973. doi:10.1145/362342.362367.

43. Z. A., Tan H.-P. Probabilistic polling for multi-hop energy harvesting wireless sensor networks. Proceedings of the IEEE International Conference on Communications (ICC '12), Ottawa, Canada, June 2012. IEEE, pp. 271–275. doi:10.1109/icc.2012.6363641. Scopus: 2-s2.0-84871955682.

44. Stidham Jr. Optimal control of Markov chains. Computational Probability, Springer, pp. 325–363, 2000. doi:10.1007/978-1-4757-4828-4_9.

45. Littman M. L., Dean T. L., Kaelbling L. P. On the complexity of solving Markov decision problems. Proceedings of the 11th International Conference on Uncertainty in Artificial Intelligence, Montreal, Canada, August 1995. pp. 394–402.

46. Tardos. A strongly polynomial algorithm to solve combinatorial linear programs. Operations Research, vol. 34, no. 2, pp. 250–256, 1986. doi:10.1287/opre.34.2.250. MR861043. Scopus: 2-s2.0-0022672734.

47. Leighton, Makedon, Plotkin, Stein, Tardos, Tragoudas. Fast approximation algorithms for multicommodity flow problems. Journal of Computer and System Sciences, vol. 50, no. 2, pp. 228–243, 1995. doi:10.1006/jcss.1995.1020. MR1330255. Scopus: 2-s2.0-0029289525.

48. Lin, Weitnauer M. A. SINR analysis and energy allocation of preamble and training for time division CT with range extension. Proceedings of the IEEE Military Communications Conference (MILCOM '15), Tampa, Fla, USA, October 2015. pp. 1697–1702. doi:10.1109/milcom.2015.7357689.

Modeling of Multihop Wireless Sensor Networks with MAC, Queuing, and Cooperation
