Abstract
We study multicasting over wireless lossy links. Instead of downloading all the data from the source node, we allow the destination nodes themselves to locally exchange the packets, as local communication within a cluster achieves higher packet reception probability with less transmission cost. However, when shall we stop the transmission from the source node? If the source stops too early, the destination nodes locally cannot reconstruct all the original packets, while if the source stops too late, the benefit of cooperative data exchange cannot be fully exploited. In this paper, we propose a network coded hybrid source and cooperative exchange scheme to determine when to stop the source sending and start the exchange process, so as to minimize the total transmission cost. For the case when the clusters are predefined, we derive the expected total transmission cost with our hybrid scheme. Our theoretical results show that under a special condition, the source node should keep sending the packets until all the destinations get the complete information. For the case when the clusters are not predefined, we propose a cluster division algorithm such that the destination nodes within each cluster can conduct data exchange locally with energy efficiency. Finally, simulation results demonstrate the effectiveness of the proposed scheme.
1. Introduction
Over the past decades, wireless sensor networks have attracted a great deal of research attentions [1–3]. Once deployed, sensors are expected to operate for a long period of time, and it is impractical to reach these sensors physically. However, it is quite often necessary to update the software running on those sensors or add new functionalities to the sensors [4, 5], which needs to reliably multicast large data objects with energy efficiency [5, 6]. Particularly, in wireless lossy sensor networks, multicasting packets from a single source node is still a challenge problem due to the heterogeneous lossy links to the destination nodes. To satisfy all the destination nodes, the source node needs to keep sending until the destination node with the worst packet reception link successfully receives all the packets, which is inefficient for the source node.
Recently, cooperative data exchange [7–11] has become a promising approach to achieve the efficient data communications. Instead of downloading all the packets from the source node (e.g., the server) [12, 13], cooperative data exchange allows the destination nodes to cooperatively exchange their received packets among themselves, once the destination nodes collectively hold all the packets. Compared with pure source-dominated multicast, cooperative data exchange has two main benefits: (a) short-range communication among destination nodes is often more reliable and consumes less transmission cost, (b) the bandwidth saved at the source node can serve more other nodes in the system.
However, most of the existing works on cooperative data exchange problem never consider when to stop downloading the data from the source node. On the one hand, if the source node stops too early, the destination nodes themselves may be unable to collectively reconstruct the complete packets. On the other hand, if the source node stops too late, the benefit of cooperative data exchange cannot be fully utilized. In addition, although the average transmission cost with cooperative data exchange is lower than with source transmission, the sum of the transmission costs within multiple clusters may exceed the cost with pure source sending. To sum up, for wireless multicasting, it is important for us to consider when to stop the source sending and start the cooperative data exchange, so as to reduce the total transmission cost or energy consumption [1–3, 14, 15].
So far, only the work in [16] studies the hybrid source transmission and cooperative data exchange for wireless multicasting, and it stops the source sending exactly once the destination nodes within each cluster collectively hold the complete information. Although the scheme in [16] performs well in reducing the transmission delay, it is still possible that the cooperative data exchange by multiple clusters may consume more transmission cost/energy than source-dominated scheme. As discussed above, the cooperative data exchange needs to be conducted separately in each cluster, and thus the sum of their transmission costs may exceed the transmission cost with pure source transmission. In addition, the predefined clusters in [16] might not perform well, as the destination nodes in some clusters may collect the complete information more quickly than the destinations in other clusters, which may stop the source transmission too late for the other clusters.
Recent work also shows that network coding [17–19] can improve the network throughput and reliability, especially for wireless lossy networks. Instead of sending/forwarding the original packets directly, network coding allows the source/transmitting node to linearly combine multiple packets together. With this approach, each transmitted packet has almost the same contribution in reconstructing the original packets, and hence improves wireless reliability [20]. Considering the benefits of network coding, we assume that the packets sent from the source node or the packets exchanged among the destination nodes are all linear encoded at the source/transmitting nodes before sending.
In this paper, given the heterogeneous link loss probability and the transmission cost of the source transmission and data exchange among the destinations, we aim to design a hybrid source and cooperative data exchange scheme to minimize the total transmission cost consumed during multicasting. For the case when the clusters are predefined, we determine when the source should stop sending the packets to the destination nodes and the cooperative data exchange should start. For the case when the clusters are not predefined, we consider how to divide the destinations into the clusters. The main contribution of the paper can be concluded as follows.
We theoretically derive the expected total transmission cost required with traditional source-dominated scheme and our hybrid transmission scheme. Our analysis shows that under a special condition, the source node should keep sending the packets until all the destinations get the complete information, so as to reduce the total transmission cost. We also propose an efficient algorithm to determine how to group the destination nodes into the clusters, so as to make sure that the destination nodes within each cluster can collectively recover all the original packets with energy efficiency. We compare the performance of the proposed scheme with some existing schemes. Simulation results show that the proposed scheme can significantly reduce the total transmission cost.
The rest of the papers are organized as follows. In Section 2, we introduce the background and some related works. The system model and the problem description are presented in Section 3. The expected total transmission costs with source-dominated and our scheme are discussed in Section 4. In Section 5, we consider how to group the destinations into clusters after receiving the sufficient number of packets from the source. The simulation results are presented in Section 6. We conclude the paper in Section 7.
2. Background and Related Work
In this section, we provide a brief introduction to the existing wireless network coded multicast scheme and summarize some related works.
2.1. Wireless Network Coding
Network coding was originally proposed in information theory [17] and recently has become a promising approach to improve the network performance in throughput [21, 22], reliability [19, 23], security [18, 24], and so forth. Instead of forwarding the original packets directly, network coding allows the source node/intermediate node to combine multiple packets together before sending it out. The work in [25] ensures that all the encoded packets generated by the same peer are linearly independent with a high probability, if we use linear network coding based on a sufficient large field size.
Network coding in wireless lossy networks was also considered in the literature. The work in [21] proposed a first wireless network coding architecture, named COPE. By exploiting the broadcast nature of wireless medium, each node stores the overheard packets for a while. When a node transmits a packet, it uses its knowledge of what its neighbors have overheard to perform opportunistic coding [21, 22]. After receiving an encoded packet, multiple neighbors can decode their wanted packets with their overheard packets. In other words, the sender or transmitting node can deliver “multiple packets” to different neighbors in a single transmission, which thus improves the throughput. The work in [19, 23] theoretically shows that network coding significantly reduces the expected number of retransmissions in lossy networks compared to traditional ARQ scheme. The work in [5] considers the impact of both wireless unreliable communication and sleep scheduling of sensor nodes and proposes a deterministic code design at the source node so as to accomplish the data dissemination process at the earliest time.
2.2. Reliable Multicast in Lossy Wireless Networks
Traditionally, wireless single-hop multicast mainly focuses on source-dominated transmission, where the source node keeps sending the packets until all the destination nodes in the system obtain the complete information. However, the performance of such a source-dominated transmission degrades significantly when the packet reception probabilities of the destination nodes are heterogeneous. In this case, the source node cannot stop sending the packets until the destination node with the worst link state successfully gets the packets, even if all the other destination nodes received the packets in a much earlier time.
Recently, cooperative data exchange [7] among the users (e.g., mobile users) becomes one of the most promising approaches in designing efficient data transmissions. In cooperative data exchange, each client initially holds a subset of the packets and wants all the packets that others have. In the literature, it is assumed that there is a common communication channel among the users to receive/send the packets from/to all other users. Cooperative data exchange is to make sure that each client finally can get the complete packets by exchanging packets among themselves through the common channel. Compared with source-dominated transmissions, cooperative data exchange appears ti have two benefits: (1) short-range communications among the users are often more reliable and faster, (2) the bandwidth saved at the server can serve more other clients in the system.
Most recent works on cooperative data exchange mainly focus on how to minimize the total number of packets to be exchanged/transmitted among the users [7–9], or the total transmission cost consumed at the users [10, 11], such that all the users in the system can finally recover/receive the complete packets. Current works also show that network coding can reduce the number of transmissions or the total transmission cost required for cooperative data exchange process [7–11].
However, in the literature, most of the works either focus on pure source-dominated multicast or focus on cooperative data exchange by assuming that each user initially holds a subset of packets. The only work that considers the hybrid architecture of source transmission and data exchange transmission is in [16]. Specifically, the source node is set to stop sending the packets once the destinations in each cluster can collectively reconstruct the original packets, followed by the cooperative data exchange within each cluster. The numerical results show that with this hybrid transmission, the transmission delay of the multicast can be reduced compared with pure source-dominated multicast. However, the source node may stop too early, as the sum of the energy consumed by the cooperative data exchange within multiple clusters may consume more energy, which is inefficient for wireless sensor networks.
3. System Model and Problem Description
In this section, we first introduce the system model of our problem. Then, we discuss the problem to be solved in two different cases: with predefined clusters and without predefined clusters.
3.1. System Model
In this paper, we consider a multicast application, where a source node s needs to send n packets to m destination nodes in
Suppose that the destination nodes are formed into multiple clusters in
Our multicasting process consists of two transmission stages. In the first stage, the source node sends the packets to all the destination nodes in D. In the second stage, the destination nodes within the same cluster perform the cooperative data exchange among themselves. These two stages need to make sure that each destination finally can reconstruct the original n packets. Due to unreliable wireless communications, suppose that the packet loss probability from the source node to every destination node is
In this paper, we aim to minimize the total transmission cost consumed during transmissions, while we ensure that all the destination nodes in D can successfully reconstruct the original n packets. Let
An example of the system model is shown in Figure 1. The source node s needs to send three packets to the destinations in

An example of system model after sending three packets in
3.2. Problem Description with Predefined Clusters
In this section, we consider the case when the destination nodes in each cluster are predetermined.
Due to unreliable wireless communications, some packets may be lost at some destination nodes. So, one challenge problem is to determine when the source node s stops sending the packets such that the destination nodes within each cluster can exchange the packets among themselves, and the total transmission cost defined in (1) is minimum.
If the source node s stops too early, the destination nodes in some clusters may be unable to collectively recover all the original packets. If the source node s stops too late, the low-cost transmission among the destination nodes themselves cannot be fully utilized, which may incur high transmission cost because of high packet loss probability and high transmission cost of the source transmission. The sum of the transmission costs by the data exchange by all the clusters may exceed the transmission cost by the pure source transmission. In this case, the source node should keep sending until all the destination nodes obtain the complete information.
As shown in Figure 1, if the source node s stops after sending
We refer to the above problem of determining when to stop the source transmission, as minimizing the multicast transmission cost problem with predefined clusters.
3.3. Problem Description with Nonpredefined Clusters
In the above section, we assume that the destination nodes in each cluster are given. However, the predefined clusters may not perform well, as the destination nodes in some clusters may collectively hold the complete information much later than the destination nodes in other clusters, which thus delays the starting time of the cooperative data exchange in other clusters.
Still take Figure 1 as an example, if we group the destinations in
Thus, after the source node sends sufficient number of packets, we need to consider how to group the destination nodes into multiple clusters such that the destination nodes within each cluster collectively can reconstruct the original packets with energy efficiency. In this case, the number of clusters in 𝒞 and the set of the destinations in each cluster
4. Minimum Multicast Transmission Cost with Predefined Clusters
In this section, we consider the case when the clusters are predefined. We first analyze the expected total transmission costs with source-dominated scheme and our hybrid transmission scheme. We then formulate the problem of minimizing the total transmission cost with our hybrid transmission scheme as an integer programming.
To ease the understanding, the main notations used in the paper are given in Table 1.
Main notations and their descriptions.
4.1. Transmission Cost with Source-Dominated Transmission
With source-dominated scheme, the source node s keeps sending the encoded packets that are generated based on n original packets, until all the destination nodes in D successfully decode the original n packets. With linear network coding, after receiving any n encoded packets from s, the destination node can decode the original n packets.
Suppose that
4.2. Transmission Cost with Hybrid Transmission Scheme
With hybrid transmission scheme, there are two stages. In the first stage, the source node s sends
We now consider the first stage. For the success of the second stage, the number of packets sent by the source s should make sure that the destination nodes within each cluster can collectively reconstruct all the original packets. The probability that at least one of the destinations in cluster
We then consider the second stage. After the source node sends
Thus, the total transmission cost with hybrid transmission scheme can be formulated as follows:
If we assume that the packet loss probability within each cluster is the same, that is,
Theorem 1.
If
Proof.
Let
Based on the above equation (12), we then compare the expected total transmission cost required by source-dominated scheme with that by the hybrid transmission scheme for any
From the above equation, we can see that the expected total transmission cost with hybrid scheme is not lower than source-dominated transmission scheme when
4.3. Hybrid Transmission Scheme with Minimum Total Transmission Cost
As discussed above, in some cases, the sum of the transmission costs within all the clusters may be more than transmission cost of the source transmission. Under such circumstances, the source node should keep sending the packets until all the destination nodes get the complete information, which is a special case of our hybrid scheme, that is,
To determine
With the above integer formulation, we can obtain the best time to stop the source transmission and start the cooperative data exchange process.
5. Cluster Determination in Cooperative Data Exchange Stage
As discussed in Section 3.3, after the source transmissions in the first stage, the destination nodes in some clusters may be unable to reconstruct the original packets if the clusters are predefined. In this section, we consider after the source node sends
Assume that after the source node sends
5.1. Number of Transmissions Required for Data Exchange Process within a Local Cluster
We now discuss the number of transmissions required for data exchange within a specific cluster
Without loss of generality, we assume that after the source s sends
Before describing the number of transmissions for data exchange process within cluster
Theorem 2 (see [26]).
Provided that the encoding field size is large enough, the minimum number of packets to be exchanged is given by
Note that in our problem, the packets sent by the source node s are linear independent with each other. Thus, for a given cluster
Given the packets received by the destination nodes in
5.2. Optimal Cluster Division with Minimum Transmission Cost in the Data Exchange Process
We now consider how to divide all the destination nodes into multiple clusters, so as to minimize the total transmission cost in the second stage.
For m destination nodes in D, the maximum number of clusters that can be formed is m. The problem of minimizing the total transmission cost of the data exchange within the clusters by cluster division can be formulated as follows:
Although the above formulation can derive the optimal cluster division to minimize the total transmission cost in the data exchange process, the complexity of calculating
5.3. Algorithm Design
In this section, given the packet reception state of each destination node, we consider to group all the destination nodes into multiple clusters.
Without loss of generality, we suppose that after
We then introduce how to select the destinations in U into the kth cluster The destination Deleting The sum of the transmission costs of the new cluster
We then describe the process of adding new destinations to We start from the destination node If we can find a feasible destination If they still cannot recover all the original packets, we continue the above two steps. The algorithm continues until we cannot find a feasible node
The detailed process of the above algorithm is shown in Algorithm 1. With the above algorithm, we can make sure that the destination nodes within each cluster
//after sending The source s should send more packets before the second stage; not_finish=1; find a destination (1), (2), and (3) and incurs the least cost to not_finish=0; add delete recover the original packets
5.4. Illustration Example
We take Figure 2 as an example to illustrate how Algorithm 1 works, where five destination nodes in

An example of five destination nodes.
According to the Algorithm 1, initially we have
We first construct the cluster
We then construct the cluster
6. Simulation Results
In the simulation, we study a connected network graph, where nodes are randomly deployed in a two-dimensional (2D) space. We use
To demonstrate the performance of our proposed scheme, we also conduct two baseline algorithms: source-dominated scheme, and the hybrid scheme by [16]. As introduced before, in source-dominated scheme, the source node keeps sending until all the destination nodes get the complete packets. The difference between the hybrid scheme proposed by [16] and ours is that the scheme in [16] stops the source sending exactly once the destination nodes in each cluster can collectively reconstruct the full packets.
We define cost gain, denoted as δ, as the ratio of the transmission cost difference with source-dominated scheme and the hybrid scheme (our hybrid scheme or the scheme in [16]) to the total transmission cost with source-dominated scheme, for example, the cost gain by our hybrid scheme can be written as
6.1. The Impact of the Transmission Cost
In this section, we conduct the simulation to investigate the impact of the transmission cost on the cost gain. In this setting, we set
As shown in Figure 3, the cost gain with our hybrid scheme is much better than the scheme proposed by [16]. We can observe that in some cases, for example,

The impact of transmission cost of each cluster on the transmission cost gain ratio.
From Figure 3, we can also observe that with the increase of the transmission cost within a cluster, the cost gain decreases. This is reasonable, as when the transmission cost within a cluster increases, the sum of the total transmission costs of multiple clusters increases quickly and correspondingly reduces the gain.
6.2. The Impact of Packet Loss Probability
We now investigate the impact of packet loss probability on the performance of the total transmission cost. We fix
As shown in Figure 4, the cost gain with our hybrid scheme is better than with the scheme in [16], and our hybrid scheme also always consumes less transmission cost compared with source-dominated transmission scheme. From the figure, we observe that with the increase of the packet loss probability from the source to the destination nodes, the transmission cost gains with both hybrid schemes increase. This is because, when the links from the source to the destinations nodes are too bad, the source node should stop sending the packet as early as possible.

The impact of packet loss probability on the transmission cost gain ratio.
In addition, when the packet loss probability
6.3. The Impact of the Number of Clusters
We now conduct the simulation to investigate the impact of the number of the clusters on the total transmission cost of the proposed schemes. In this setting, we fix
As shown in Figure 5, when the number of clusters is few, the total transmission costs with both hybrid schemes are less than with the source-dominated scheme. This is because, when the number of clusters is few, the sum of their transmission costs does not exceed the source transmission cost. However, when the number of clusters increases, for example, more than

The impact of the number of clusters on the transmission cost gain ratio.
6.4. The Impact of the Number of Receiver Nodes within Clusters
Finally, we investigate the impact of the number of destination nodes within each cluster on the performance of our transmission scheme.
As shown in Figure 6, we fix

The impact of the number of destinations within the cluster on cost gain by our scheme.
From Figure 6, we also observe that the cost gain with the setting
By comparing Figures 6(a) and 6(b), we can also obtain that when the number of clusters in the network increases, the cost gain of our hybrid scheme decreases. As each cluster consumes its independent transmission cost in the cooperative data exchange process, the sum of the transmission costs of the data exchange by all the clusters increases, which thus decreases the cost gain.
7. Conclusion
In this paper, we proposed a hybrid source and cooperative data exchange transmission scheme for reliable multicasting over wireless lossy links. Our hybrid scheme determines when to stop the source sending and start the cooperative data exchange, so as to minimize the total transmission cost. We theoretically derive the total transmission cost required with traditional source-dominated scheme and our hybrid scheme. We give a condition under which the source node should not stop sending the packets until all the destination nodes successfully get the complete information, so as to reduce the total transmission cost. If the clusters are predefined, we propose an efficient algorithm to divide the destination nodes into multiple clusters, such that the destination nodes within each cluster can conduct the cooperative data exchange separately with energy efficiency. Finally, simulation results demonstrate the effectiveness of the proposed scheme in reducing the total transmission cost.
Footnotes
Acknowledgments
This work was supported by the Fundamental Research Funds for the Central Universities (no. 2012HGBZ0640). It was also supported in part by National Natural Science Foundation of China under Grants no. 61202378, 61070169, Natural Science Foundation of Jiangsu Province under Grant no. BK2011376, Specialized Research Foundation for the Doctoral Program of Higher Education of China no. 20103201110018, and Application Foundation Research of Suzhou of China no. SYG201118.
