Abstract
Most routing algorithms in delay tolerant networks (DTN) need nodes serving as relays for the source to carry and forward message. Due to the impact of selfishness, nodes have no incentive to stay in the network after getting message (e.g., free riders). To make them be cooperative at specific time, the source has to pay certain reward to them. In addition, the reward may be varying with time. On the other hand, the source can obtain certain benefit if the destination gets message timely. This paper considers the optimal incentive policy to get the best trade-off between the benefit and expenditure of the source for the first time. To cope with this problem, it first proposes a theoretical framework, which can be used to evaluate the trade-off under different incentive policies. Then, based on the framework, it explores the optimal control problem through Pontryagin's maximum principle and proves that the optimal policy conforms to the threshold form in certain cases. Simulations based on both synthetic and real motion traces show the accuracy of the framework. Through extensive numerical results, this paper demonstrates that the optimal policy obtained above is the best.
1. Introduction
With the increasing of mobile operating systems, such as iPhone OS, Android, and Windows Phone OS, mobile phones have already evolved from simple voice communication means into powerful devices able to provide a variety of services to users. For example, they can transfer the videos or music in a peer-to-peer (P2P) way through the short-range communication technologies (Bluetooth, WiFi, etc.) in them. Due to the limited communication range or mobility nature, it is hard to maintain the continuous end-to-end path between them as for classic Internet P2P applications [1]. Obviously, such network belongs to delay tolerant networks (DTN), in which end-to-end path between communication nodes may not exist [2]. In particular, the concept of DTN is proposed to support many new emerging networking applications, and other classic examples include deep-space exploration [3], military networks [4], and vehicular networks [5].
Because the communication opportunity is uncertain, message propagation in DTN often needs the help from others. In particular, nodes in DTN adopt a store-carry-forward communication strategy. This strategy exploits the opportunistic connectivity and node mobility to relay and carry messages. In other words, when the next hop is not available for the current node, it will store the message in its buffer, carry the message along the movement, and forward it until a new communication opportunity arises. Obviously, this method needs nodes working in a cooperative way. That is, nodes should stay in the network and forward message to others after getting message. However, nodes have no incentive to stay in the network after getting message due to the selfish nature [6, 7]. In particular, nodes may leave the network at once when it receives message, and these nodes can be seen as free riders [6, 7]. For example, users may make their communication devices (e.g., Bluetooth) off to save their limited energy. If all nodes in the network become free riders, other nodes can get message only from the source, and this is similar to the direct transmission protocol [8], which is very inefficient. In fact, the source can obtain certain benefit if other nodes get message timely. For example, if the source is the advertiser, it is good for it to make other nodes (consumers) get the message (advertisement) timely. In addition, such benefit may be varying with time. For example, the sooner the nodes get message, the more the benefit will be. Therefore, the source has the incentive to push the message to other nodes timely. To achieve this goal, it has to pay certain reward to the relay nodes to make them be cooperative. In addition, such reward may be varying with time too. For example, the longer the time nodes stay in the network, the more the energy may be used, so nodes may ask for more rewards. In fact, nodes (e.g., phone, PDA) are often devices that can be manipulated by humans [9, 10], and the buffer space or the forwarding ability of nodes can be seen as goods. Therefore, the event that the source requests help from other nodes can be seen as the event that the source buys certain goods from humans. The things that are used to buy goods by the source may be virtual currency [11] or discount of service [12] and so forth. Therefore, the message propagation process can be seen as the commodities trading process, and humans want to maximize its reward in this process. Therefore, these humans may adjust the price of their goods according to the market state. For example, if the remaining lifetime of message is shorter, they may think that the source is eager to transmit the message as soon as possible, so they think that their goods (e.g., the forwarding service) are important for the source. In this case, they may increase the price. On the other hand, if the remaining lifetime of the message is long, they may think that the source may be not eager to transmit the message quickly and is not willing to pay too many fees. In this case, they may help the source with only a little reward. Therefore, the price of the goods (e.g., the forwarding service) may be varying with time. In this environment, whether to make nodes be cooperative at specific time is an important problem for the source. For example, suppose that node j is willing to help the source by charging m
nuggets (price of the goods) at time
The main contribution of this paper can be summarized as follows.
We consider the optimal incentive policy to get the best trade-off when the reward is varying with time for the first time. We propose a unifying framework through a continuous time Markov process, which can be used to evaluate the trade-off between the benefit and expenditure of the source under different incentive policies. Then based on the framework, we formulate an optimization problem. Through Pontryagin's maximum principle, we explore the optimal control problem and prove that the optimal policy conforms to the threshold form in some cases. By comparing the simulation results with the theoretical results, we show that our theoretical framework is very accurate. In addition, we compare the performance of the optimal policy with other policies through extensive numerical results and find that the optimal policy obtained by our model is the best.
2. Related Works
In fact, our work is similar to the optimal controlling problem for epidemic routing algorithm (ER), such as the works in [13, 14]. These works mainly study how to maximize the average delivery ratio when the energy is limited, and the energy consumption for forwarding the message once is not related to time. Therefore, these methods cannot be used to solve the optimization problem in our paper, in which the reward that the relay nodes require is related to time. On the other hand, the work in [15] studies similar problem as that in this paper, but it tries to get the optimal forwarding policy when the total fees are limited. However, we study the trade-off between the income and expenditure, so they are different.
There are many other selfish behaviors, such as individual selfishness and social selfishness [16, 17]. At present, some works study the impact of these selfish behaviors. For example, Li et al. study the impact of social selfishness on the epidemic routing protocol [17]. Then, they explore the impact of both individual selfishness and social selfishness on multicasting in DTN [18]. However, these selfish behaviors depend on the social relations between nodes. For example, the social selfishness denotes the selfish behavior between friends. Therefore, the distribution of friends may have certain impact. Existing studies have shown that the number of friends of nodes may be different. In particular, the distribution of friends may conform to a power law distribution [19]. Therefore, if we consider those selfish behaviors, we have to classify the nodes according to their number of friends, and this will be a controlling problem with multiple parameters. Such interesting problem is an extension of our work, and we will study it deeply in the future.
3. Network Model
Suppose that there are one source S, N relay nodes, and a destination node D. At time 0, only the source has message, and it wants to make the destination obtain the message before the maximal lifetime T. To achieve this goal, the source needs the help from others. However, it has to pay certain reward every time it makes a relay node be cooperative, so it may not do this all the time. Note that only the relay nodes that have message can forward it to others, so the source only makes these nodes be cooperative. In other words, the source is not willing to pay reward to nodes that do not have message. In this paper, we assume that the source makes a relay node (e.g., node m) that has message cooperative with probability
Nodes in the network can communicate with each other only when they come into the transmission range of each other, which means a communication contact, so the mobility rule of nodes is critical. In this paper, we assume that the occurrence of contacts between two nodes follows a Poisson distribution. This assumption has been used in wireless communications for many years. At present, some works show that this assumption is only an approximation to the message propagation process. For example, the work in [20] reveals that nodes encounter with each other according to the power law distribution. However, it also finds that if you consider long traces, the tail of the distribution is exponential. Furthermore, a more recent work in [21] studies the vehicles’ dataset in large-scale urban environment and finds that the intermeeting time can be modeled by a three-segmented distribution. Though the first and second parts of the contact intervals do not obey the exponential distribution, it also recognizes that the tail obeys the exponential distribution. In addition, the work in [22] shows that individual intermeeting time can be shaped to be exponential by choosing an approximate domain size with respect to the given time scale. Moreover, there are also some works, which describe the intermeeting time of human or vehicles by exponential distribution and validate their model experimentally on real motion traces [23, 24]. For this reason, the exponential model is still widely used in many existing works, such as [25–27]. In this paper, we also use such model and assume that the intermeeting time between two nodes follows an exponential distribution with parameter λ. Simulations based on both synthetic and real motion traces show that our theoretical framework based on such assumption is very accurate.
Besides the intermeeting time, many other factors can have certain impact on the routing performance, such as the contact duration, bandwidth, and message size. If the bandwidth is big enough, the message may be transmitted successfully in one contact. However, if the bandwidth is too small, it may be hard to transmit the message to one contact, even though the contact duration is long. At present, some works find that the distribution of the contact duration may conform to the Pareto distribution [28, 29]. However, the Pareto distribution is hard to be used to analyze the routing performance theoretically. Therefore, most of the previous works which explore the routing performance based on the theoretical method ignore the impact of the contact duration and assume that a contact is long enough to transmit the message, such as [13–17]. In this paper, we use the same assumption. Note that the assumption is rational when the message is small or the bandwidth is very big.
The commonly used variables of this paper can be seen in Table 1.
The list of commonly used variables.
4. Problem Formulation
4.1. Theoretical Framework
Let
Symbol
Combining with (1) and (2), we can get
Further, we can obtain
Then we have
One main metric of routing algorithm in DTN is the delivery ratio, which denotes the probability that the destination D obtains message within given time. Let
Similar to the relay nodes, D may get message from the source or the cooperative relay nodes. Therefore, we have
Further, we can obtain
Let
Time interval
Because nodes that have message do not receive the same message any more, if the event
Combining with (9) and (10), we can obtain
Based on (11), we have
T is the maximal lifetime of the message, and our object is to maximize the value of
4.2. Optimal Control
Obviously, the above question is an optimal control problem, and
Let
Note that, at time t, α and β are simple expressions of
Then, we can get the costate or adjoint functions
The transversality conditions are shown as follows [30]:
Then according to Pontryagin's maximum principle in ([30, P. 109, Theorem 3.14]), there exist continuous or piece-wise continuously differentiable state and costate functions, which satisfy
This equation between the optimal control parameter μ and the Hamiltonian
H allows us to express μ as a function of the state
Below, we will prove that when the function of
If above conditions can be satisfied, the optimal policy conforms to the threshold form and has at most one jump. In particular, we have the following theorem.
Theorem 1.
If
Proof.
First, note that the functions
When
Then, we can get
Combining with (16), we have
In addition, from (16), we can obtain
Suppose that
Combining with (22), we can obtain
Because
Further,
That is, if
Then we assume that
In summary, if
5. Model Validation and Performance Analysis
5.1. Model Validation
In this section, we will check the accuracy of our framework by comparing the theoretical results obtained by our model with the simulation results. We run several simulations using the opportunistic network environment (ONE) [31] based on three different scenarios. In the first one, we use the famous random waypoint (RWP) mobility model [32], which is commonly used in many mobile wireless networks. There are totally 500 nodes, and all of these nodes move according to the RWP model within a 10000 m × 10000 m terrain with a scale speed chosen from a uniform distribution from 4 m/s to 10 m/s. The communication range is 5 m. Moreover, the source and destination nodes are randomly selected among these nodes. In the second scenario, we use a real motion trace from about 2100 operational taxis for about one month in Shanghai city collected by GPS [33]. The location information of the taxis is recorded at every 40 seconds with the area of 102 km2. We randomly pick 500 nodes from this trace. In addition, the source and destination nodes are randomly selected among these nodes too. The third scenario is based on the dataset collected in the Infocom 2005 conference [34]. In particular, this dataset includes 41 attendees, who connect with each other by Bluetooth. Among those attendees, we randomly select two nodes as the source and destination, respectively.
The functions of
Based on these settings, we can get Figures 1, 2, and 3, respectively.

Simulation and numerical results comparison of total fees with RWP mobility model.

Simulation and numerical results comparison of total fees with Shanghai city motion trace.

Simulation and numerical results comparison of total fees with Infocom 05 motion trace.
From the results, we can see that the average deviation between the theoretical and simulation results is very small. For example, the average deviation is about 4.22% for the RWP mobility model and 5.01% for the Shanghai city motion trace. For the Infocom 2005 dataset, the average deviation is about 7.12%. Though the deviation is bigger than that in RWP and Shanghai city motion traces, it also can be seen very accurately. This demonstrates the accuracy of our theoretical framework. For this reason, we can use the numerical results obtained by our theoretical framework to evaluate the performance of different policies.
In addition, the results above also show that the performance is different when the source adopts different policies. In particular, the results in Figures 1 and 2 show that it is not good for the source to request help all the time. For example, when the value of T is bigger than 4000 s in Figure 2, the total income of the source may be negative if it requests help all the time. This shows that the policy of the source can have important impact on its total income, and this means that our optimal control policy is necessary. Later, we will show that the optimal policy obtained by (19) is the best through extensive numerical results.
5.2. Performance Analysis with Numerical Results
In this section, we use the best fitting for the Shanghai city motion trace in the above simulation to describe the exponential distribution of the intermeeting time between nodes.
First, we evaluate the performance of the optimal policy obtained by (19). For comparison, we consider three other cases: case 1:

Performance comparison with different policies.
The result in Figure 4 shows that the optimal policy is the best one. Under the optimal policy, the source can always get the maximal total reward. This means that our optimal control policy is correct.
Now, we further compare the performance of different policies when the number of relay nodes is different. In this case, we assume that the maximal message lifetime T equals 10000 s, and let the number of relay nodes increase from 50 to 1000. Other settings remain unchanged. Numerical result is shown in Figure 5, and it demonstrates that the optimal policy obtained by (19) is the best too.

Performance comparison with different policies when the number of relay nodes is different.
In addition, total reward under the optimal policy is increasing with the number of nodes. In fact, when there are more nodes, the source can request help from more nodes at early time. Because the reward that the relay nodes request is increasing with time (e.g.,

The value of the threshold under the optimal policy when the number of relay nodes is different.
On the other hand, the result in Figure 6 also shows that the optimal policy really conforms to the threshold form. We can see the result in Figure 7 more clearly, when there are 500 and 1000 relay nodes, respectively. That is, the source asks for help from others with probability 1 before the threshold h, and then it stops doing this.

The optimal policy when the number of relay nodes is different.
In the above simulation and numerical results, we define

Performance analysis when
Note that the optimal policy is obtained by (19), but the threshold policy conforms to Theorem 1. In fact, there is a threshold policy, which is corresponding to a specific value of h. Therefore, there are many threshold policies, which can be denoted by threshold (h). The threshold policy in Figure 8 can maximize the total income of the source under all of the threshold policies.
The result in Figure 8 shows that the threshold policy is worse than the optimal policy obtained by (19). This means that the optimal policy does not conform to the threshold form in this case. Therefore, the form of the function
6. Conclusions
To increase the efficiency, most routing algorithms in DTN need nodes to work in a cooperative way. In particular, nodes should stay in the network to forward the message further after getting message. However, due to the impact of selfishness, nodes have no incentive to stay in the network after getting message. To make these nodes be cooperative, the source has to pay certain reward (e.g.,
Note that once we know the functions
Footnotes
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
