Abstract
The scale and functional complexity of future-generation wireless sensor networks will call for a non-homogeneous architecture, in which different sensors play different logical roles or functions, or have different physical capabilities in terms of energy, computing power, or network bandwidth. When sensors of the same group need to communicate with each other, their communications often have to pass through other sensors, thus forming an overlay on top of the wireless sensor network. The topology of the overlay is critical. It must have a low diameter to reduce the communication latency between those sensors. It also needs to avoid using other sensors for relaying the communications as much as possible, so as to preserve the energy of other sensors. In this paper, we propose a distributed overlay formation protocol taking account of the above factors. Through simulation, we compare our protocol with two overlay formation protocols, one that generates a fully connected topology and the other a minimum spanning tree. The results show that our protocol can achieve better performance both in message latency and energy consumption.
Introduction
As techniques and applications of wireless sensor networks mature, we expect their scale and functional complexity to increase dramatically. Progressively, they will move away from homogeneous architectures and become more and more heterogeneous [1–6]. Heterogeneity may come in different flavors. For example, different sensors in the network may have different physical capabilities, such as energy, computing power, network interface, sensing modules, and communication bandwidth. Even if sensors are homogenous physically, different sensors may play different roles and assume different responsibilities, e.g., cluster heads in a hierarchical sensor network. Sensors in a heterogeneous wireless sensor network can thus be classified into groups according to their characteristics and attributes.
Sensors of the same group may need to communicate with each other. If these sensors are scattered in different locations of the sensor network and they use the same network interface as others, then the communications between these sensors have to pass through other sensors. Logically, these sensors are interconnected by an overlay network, which is laid on top of the underlying wireless sensor network. Without loss of generality, we will call the sensors under consideration as powerful sensors and those not in the group as normal sensors.
The topology of the overlay network is critical. The overlay must have a low diameter to reduce the communication latency between the powerful sensors. It also needs to avoid using normal sensors for relaying messages as much as possible, so as to preserve the energy of normal sensors. At one extreme is a fully connected overlay, in which every powerful sensor has an overlay link to every other powerful sensor. However, since an overlay link may lay over a number of normal sensors, the more overlay links there are, the more normal sensors will be used to relay messages and consume their energy. Another extreme in the spectrum is a ring or linear list. Although a ring can minimize the number of overlay links, the communication path between any two powerful sensors may be too long to support upper-layer applications efficiently. Apparently, we need to find a middle ground to balance between application efficiency and sensor power consumption.
In this study, we first introduce three basic overlay formation mechanisms based on a centralized approach. The results then lead to our distributed overlay formation protocol. The distributed protocol aims to achieve two goals. The first is to provide efficient communications between the powerful sensors in the overlay. The second goal is to reduce the power consumption of the normal sensors for relaying messages for the powerful sensors. The protocol consists of two phases. In the first phase, powerful sensors discover some number of other powerful sensors as their neighbors. This limits the degree of a powerful sensor in the overlay. In the second phase, powerful sensors try to eliminate redundant messages and use the shortest paths for message forwarding by building a spanning tree out of each powerful sensor. This procedure is called active edge pruning (AEP). Details will be given later.
The rest of this paper is organized as follows. Section II describes our problem. In Section III, we discuss and compare three basic overlay formation mechanisms. Then we propose our distributed overlay formation protocol in Section IV. Performance evaluation of the discussed algorithms are presented in Section V. Related studies are given in Section VI. We summarize the paper in Section VII.
Problem Statement
Consider a sensor network G(V, E), where V represents the set of sensors ν i for 1 ν i |V|. A link ν i → νj ∊ E if ν j is within the radio range of ν i and ν i can send a message to ν j directly. Without loss of generality, we assume that the sensor network consists of two types of nodes: powerful and normal sensors, denoted VPN and VSN respectively. They are placed randomly in the field, and the number of powerful sensors is known. The communication between the powerful sensors requires the normal sensors in between to relay. The network formed by VPN and the communication paths between them form an overlay.
An overlay over the powerful sensors may take various topologies. Different applications
may prefer different overlays. In this study, we consider an overlay that interconnects all
powerful nodes in VPN and the diameter of the overlay, measured
in number of hops through the normal sensors, is the minimum. That is, the problem is to
where pathlength(ν i , ν j ) is the number of hops to send a message from νvi to ν j . Such an overlay is useful for applications such as multicast and resource discovery in sensor networks, in which a powerful sensor needs to send a message to all other powerful sensors.
In addition to forming desirable topologies, the overlay formation protocol also needs to
prolong the lifetime of the sensor network G by balancing the energy
consumption of the normal sensor nodes in VSN that help relay
messages among powerful nodes. This can be expressed as follows.
where lifetime(ν j ) is the time interval starting from the network initialization until νvj runs out of its energy.
In summary, the goal of our study is to design an overlay formation protocol that allows a powerful node to disseminate data to other powerful nodes efficiently. Meanwhile, the protocol should lengthen the life time of the sensor network as much as possible.
To simplify the discussion, we assume that the sensor nodes in the network have an identical radio communication range, which is modeled as a disk. The communication between two neighboring sensors is assumed to be symmetric. We also assume that wireless ad hoc end-to-end routing protocols such as AODV [7] and TORA [8] are available for routing messages between sensors. We will use the hop count to estimate the delay of sending a message from ν i to ν j .
Each node ν i ∊ V has an initial energy budget denoted as EB(i). Normal sensors have a finite energy budget EB. Each node ν i ∊ V takes Esnd and Ercv energy units to send and receive a message, respectively. A node ν i can send (or receive) a message if EB(i)-Esnd > 0 (or EB(i)-Ercv > 0). A node ν i is treated as a dead node if EB(i) = 0. After ν i has sent (or received) a message, then EB(i) becomes EB(i) = EB(i)-Esnd (or EB(i) = EB(i)-Ercv).
In this section, we present three centralized algorithms for forming an overlay network
comprised of powerful nodes. They represent different extremes in the spectrum of overlay
topologies and serve as a basis for comparison with our proposed distributed algorithm, to
be presented in the next section. The metrics for comparison are listed below. Total number of messages (Tm): This is
the number of messages required to disseminate a data item from a powerful node
ν
i
∊ VPN to all nodes in
VPN — {ν
i
}. Average message delay (Dm): This metric
measures the average delay required to send a data from ν
i
∊ VPN to each node V
j
∊
VPN — {ν
i
}. Link cost (Lc): The link cost is the
total number of wireless links required to maintain the connectivity of all nodes in
VPN. Average node degree (Nd): This metric
indicates the total inbound and outbound degrees of the node.
Fully Connected Overlay (FC)
Giver a sensor network G(V,E), where V = VPN ∪ VSN, the fully connected network (FC) is to let each Vi ∊ VPN construct a shortest path towards each vj ∊ V — {ν i }. A fully connected overlay represents one extreme in overlay topologies that emphasizes on the performance of disseminating messages to all the powerful sensors.
Consider a square region in which sensor nodes are randomly deployed. Suppose the network diameter is 2 × k and there are n powerful sensors. Clearly, in FC, the average number of hops of a routing path between two sensors is Nd = k. Tm is k × (n − 1) since each node in VPN has n − 1 paths to other nodes in VPN. The Lc in FC is k × n × (n − 1). Finally, Nd is n × (n − 1). Apparently, Lc and Nd in FC will be increased dramatically if |VPN| increases. However, FC has the minimum Dm.
Ring-Based Overlay (Ring )
The ring-based network (R
There are different ways to building a R
where V, u ∊ VPN and message delay (V, u) denotes the message delay between V and u.
To solve the problem, we adopt the approximation algorithm proposed by [9]. The idea is to first build a
minimum spanning tree comprised of nodes in VPN, where the
initial root node is any node in VPN. We include the nodes and
links in R
It is proved that the ring network constructed using the above algorithm has the property that the sum of delays of sending a messages among nodes in VPN is smaller than two times of the smallest delay.
It is possible to use Prime's or Krussal's algorithm to implement a minimum spanning tree
(MST) on top of nodes in VPN. If so,
Dm is 2 × d × k, where d
is the depth of the tree. Tm and
Lc are both equal to k ×
(n − 1).
Comparisons
Table 1 summarizes the
performance metrics of the three centralized algorithms discussed above. MST is a better
choice in terms of Lc and Nd, but
its Dm is larger than FC's. R
We want to note that in this section FC, R
Proposed Protocol
In this section, our proposed overlay formation protocol is introduced. The algorithm can be divided into two parts. The first part is the basic overlay construction algorithm. In this algorithm, each powerful node tries to find a number of neighboring powerful nodes to form an overlay. We will use the term neighbor threshold to denote the number of neighbors in the following discussion.
Unfortunately, this basic construction procedure may cause redundant messages in the overlay. Thus, in the second part, we introduce a procedure called active edge pruning (AEP). This procedure builds a spanning tree over the overlay for each powerful node to eliminate redundant messages and use the shortest paths for message forwarding on the overlay.
Network Construction Algorithm
The idea of the basic overlay construction algorithm is to let each powerful node find other powerful nodes in its vicinity using expanded ring search [10]. That is, each powerful sensor A first searches other powerful sensors within its α-hop horizon (or the search range). If there exist some powerful sensors, A then records the hop distance and identifier (id in short) of those sensors. Otherwise, A expands its search to the (α + δ × k)-hop horizon, where k = 1, 2, 3, … Note that k can be chosen properly according to the network diameter.
Algorithm 1 shows the overlay initialization operation of our algorithm, where nn is the minimum number of neighbors that each overlay node will have. This value will affect the shape of the resultant overlay and can be used to control the diameter of the formed overlay. nbr[nn] i is the buffer to store neighbor information, including node id and corresponding hop count.
Summary of FC, Ring and MST, where k is the average hop
count between any two nodes and d is the depth of the tree
Summary of FC, R
Lines [1–7] are the main procedure for constructing an overlay. Its task is to find a set of neighbors for each overlay node. First, line [2] uses a expended ring search (ERS) [10] to find a set of initial neighbors. In the algorithm, TRIES is used to limit search time of EPS. In line [1], a node will stop searching when it has found enough neighbors or it cannot find enough nodes after tried TRIES times. This is to prevent a far away node from flooding to the whole network.
Unfortunately, the algorithm in Algorithm 1 cannot guarantee connectivity of the overlay, because each node will stop finding neighbors if it has found enough neighbors. It is possible to form multiple disjoint overlays. With the assumption that the size of VPN is known, each node can check the connectivity of the overlay when it builds the spanning tree. If the overlay is disconnected, a special flooding is conducted for each source. Details will be given later.

Algorithm of overlay initialization, where nn denotes the minimum number of neighbors that a node would have, nbr[nn] i represents the set of neighbors of node i (nbr[nn] i includes the id of each neighbor and the distance in hopcount from i to the neighbor), TRIES is the number of search times, and Dji denotes the distance from j to i.
After the powerful nodes have connected to some nearby neighbors, we then build a shortest-path tree for each node. The reason is to eliminate redundant messages due to broadcasting messages in the overlay. We build a spanning tree, like a minimum cost spanning tree in Section III. This spanning tree is well suited for sensor networks because the low mobility of sensors reduces the tree maintaining cost. However, considering the routing efficiency for each node, building multiple spanning trees rooted at each node is preferable.
Thus we propose a mechanism, active edge pruning (AEP), to construct spanning trees. The idea is to choose the shorter links to the tree root and prune unnecessary links. Figure 1 shows a spanning tree rooted at node A. The solid lines are tree links and dotted lines are pruned edges. The detailed algorithm is given in Algorithm 2.

A spanning tree rooted at node A.
In Fig. 1, when node

Algorithm of active edge pruning, where C M DBT denotes the command—BuildTree—that is originated from source S, DISTSJ represents the distance from source S to itself through neighbor J in hops, and DISTScur denotes the currently shortest hopcount from source S to itself in hops.
This tree building process can also help to repair the tree on node failure. Each tree root will collect information of all tree members in the tree building process. When the root could not receive responses from a certain node, it will try to repair the tree by finding a direct route to that node. This could guarantee the integrity of each tree and connect each overlay partition.
When the system runs for some time, the relaying nodes on the overlay links will run out of their energy, thereby breaking the links. There are at least two strategies to maintain an overlay. First, when an overlay node finds that its relaying neighbor fails, the overlay node had to find another link to keep the connectivity of the overlay. Second, the overlay could notify the nodes to change their links. The former can be done by simply selecting new neighbors from the candidate set collected from overlay initialization as mentioned previously. The second approach can solve the hot spot problem effectively. A hot spot is a node where a considerable amount of traffic passes through in a short period. This will cause the links nearby the hot spot to exhaust energy quickly.
Figure 2 shows two examples such
that A may become a hot spot. In when is the opposite of (i), in which node A will find large amount of
traffic passing from nodes

Two examples show that
Consider the problem in (i) first. When
In this section, we will evaluate the performance of the discussed algorithms using simulations.
Experimental Settings
The evaluation is based on the ns2 simulator with the CMU wireless extension [11]. All the sensors in simulations are stationary, which are deployed uniformly at random in a square field ranging from 600m × 600m to 1000m × 1000m, subject to the condition that the network is connected. In the simulations, there are |V| = |VSN ∪ VPN| ≍ 130 nodes deployed in the field. We assume that each sensor has a radio range of 100m.
Each node in VSN has an initial energy budget 10 joule, and the nodes in VPN have an infinite energy budget. We adopt the energy consumption model in [12]. A sensor node takes 81 mW to send a message and 30 mW to receive a message. The size of each simulated message is 128 bytes. We measure the remaining energy budget EB(i) of each node ν i ∊ VSN. Since we are interested in the performance of the overlay using different network formation algorithms, each node in VPN will generate a message every t seconds, where t follows the exponential distribution. The default routing protocol is AODV [7]. Table 2 summarizes the parameters used in the simulations.
The performance metrics include the average message delay (Dm) of transmitting a message between any two nodes in VPN, and the average number of messages (Tm) of disseminating a data item issued by V ∊ VPN to nodes VPN — {V}. We measured the remaining energy of sensor nodes in VSN, and calculated that the CDF CDF allows us to evaluate whether the energy that remained in the different nodes was balanced or not.
Experimental settings
Experimental settings
We first compare the three centralized algorithms discussed in Section III. The first experiment is to evaluate
the effects of the size of the network. In this experiment, we let
|VPN| = 10 and
Figure 3(a) depicts the
simulation result of the average message delay versus the size of the network. Clearly, FC
outperforms R

Effects of network size, where (a) the average message delay and (b) the average number of messages.
Figure 3(b) shows the result for
the average number of messages required to disseminate a data item to each node in
VPN. We can see that MST takes a smaller number of
messages to disseminate compared to FC and R
Next, we consider the effects of |VPN| and

Effects of numbers of powerful nodes for

Effects of numbers of powerful nodes for
From the above discussion, we conclude that FC outperforms R
We compare FC, MST and our proposed algorithm. The number of overlay nodes is 15, and the performance for the network size from 600m × 600m to 1000m × 1000m is investigated. In this experiment, we let nn = 2, α = 3, and δ = 2.
Figure 6(a) shows the measured message latency. Our proposed protocol is better than MST and worse than FC. In Fig. 6 (b), the message cost in our protocol is obviously less than FC's and is slightly more than MST's.

Comparison of FC, MST, and our proposed protocol, where (a) message latency versus network size, (b) message cost versus network size, and (c) CDF of remaining energy.
The results conclude that our protocol can have a smaller message delay while keeping the message cost low. Figure 6(c) depicts the CDF for remaining energy. FC uses more energy than our protocol and MST. Our protocol is slightly better than MST.
After comparing with other protocols, we examine our proposed protocol under different settings. We first vary the number of neighbors (i.e., the neighbor threshold) that a node can have. Figure 7(a) shows that when the neighbor threshold increases, the resultant overlay could have a shorter message latency. In Fig. 7 (b), the message cost increases along with the increase of neighbor threshold. The remaining energy of nodes in VSN increases when neighbor threshold is small, however (see Fig. 7 (c)). This is because the resultant overlay will be degenerated to FC if nn is larger. If nn = |VPN − 1|, nodes in the overlay will be fully connected.

Performance of our porposed algorithm, where (a) message latency versus overlay size, (b) messages cost versus overlay size, and (c) CDF of remaining energy.
Our simulation study also varies α and δ. The simulations results are depicted in Fig. 8 (a)–Fig. 8(c). We can see in Fig. 8 (a) that a shorter search range can shorten the search delay. However, smaller α leads to the partition of the overlay network. We solve this partition problem by having each tree root build a direct route to the nodes that it did not receive a response from earlier.

(a) Message latency versus neighbor threshold, (b) Messages cost versus neighbor threshold, (c) CDF of remaining energy in VSN versus neighbor threshold.
From the experimental results presented in this section, we conclude that our proposal
has a shorter message latency than MST's (and R
Various approaches have been proposed to address the power and latency problem in wireless
sensor networks. Generally, they can be classified into the following categories. Clustering: Building a layered structure by clustering sensors can
both conserve energy and improve system throughput. In LEACH [13], homogeneous sensors are organized into
clusters. Heinzelman et al. [5] considers a heterogeneous network composed
of powerful and normal sensors. They investigate the optimal number of clusters and
suitable energy allocation for each type of sensors in a heterogeneous environment to
prolong the network life. Centralized Approach: Youssef et al. [14] focus on energy conserving
routing and try to maintain good routing delay in a cluster. They assume that sensors
can tune the radio amplifier to change their maximum transmission range. A constrained
shortest path routing protocol is proposed to adjust the radio transmission power to
obtain diverse in-cluster topologies. The path to the cluster-head is then set with
the minimum transmission energy. Lifetime optimization for in-home heterogeneous
sensor networks is discussed in [1]. Its resource oriented protocol (ROP) divides the
sensors into several energy levels. The protocol then constructs a hierarchical
structure and calculates the energy-optimal topology in a centralized manner. Dominating Set: Guo et al. [15] build a backbone-like structure using the
dominating set to eliminate redundant messages during broadcast and to avoid message
collision. The node receiving a broadcast packet will wait a period before it declares
itself as a forwarding node. This latency is determined by the remaining power of the
node.
The system model used in [5], [13] is different from ours. In LEACH [13], it assumes a homogeneous network consisting of a single base station aggregating data from a large sensing field and sensors having the capability to control radio electronics. The network model in our study is similar to [5], but [5] assumes that those overlay sensors would take turns to become cluster-heads and cover the whole area. Besides, [5] centers on finding the optimal number of clusters. ROP [1] targets at home-scale networks and periodically builds energy efficient topology for routing. Unfortunately, it is a centralized method and is not suitable for large outdoor applications.
Both dominating set [6], [15] and forwarding set selection [16] can be used to reduce redundant packets caused by the broadcast. However, our work is to provide an energy-efficient communication substrate where many applications can be implemented. Our focus is on shortening the end-to-end delay among the overlay sensors under the energy constraint of the normal sensors. Our model is established on a heterogeneous environment. These features are different from previous works.
Summary
In this paper, we study different overlay formation protocols through mathematical analysis and simulations. A fully connected overlay can be treated as a tree whose height is one. On the other hand, the height of a ring overlay is equal to n/2, where n is the number of overlay nodes. We then propose a novel distributed overlay formation protocol for shortening routing delay while conserving energy. Our protocol builds a spanning tree for each overlay node. These spanning trees not only are helpful for removing routing loops, but also can provide short routing paths.
Simulation results show that our protocol can achieve shorter message latency while retaining low energy deviation. We also evaluate our protocol with various parameters and the results show that by choosing a suitable neighbor threshold and search range, it is possible to obtain better performance and avoid the risk of network partition.
There are several possible improvements to our protocol. For example, we can build a layered overlay. That is, we cluster all nearby overlay nodes first and then use our protocol to build a spanning tree in each layer. Constructing such a hierarchical structure could not only save the energy consumption but make our protocol more scalable. Another possible improvement is the routing protocol of the underlay network. In this paper we only focus on solving problems on overlay, but the underlay routing deserves further study.
Footnotes
Acknowledgment
This work was supported in part by the National Science Council, R.O.C., under Grant NSC 93-2752-E-007-004-PAE, by the MOEA Research Project under Grant No. 94-EC-17-A-04-S1-044, and by the CCL of ITRI.
