Optimal Control of Epidemic Routing in Delay Tolerant Networks with Selfish Behaviors

Abstract

Most routing algorithms in delay tolerant networks (DTN) need nodes to serve as relays that carry and forward messages for the source. Because nodes are selfish, they have no incentive to stay in the network after getting the message (i.e., they act as free riders). To make them cooperative at a specific time, the source has to pay them a certain reward, and this reward may vary with time. On the other hand, the source obtains a certain benefit if the destination gets the message in a timely manner. This paper is the first to consider the optimal incentive policy that achieves the best trade-off between the benefit and the expenditure of the source. It first proposes a theoretical framework that can be used to evaluate this trade-off under different incentive policies. Then, based on the framework, it studies the optimal control problem through Pontryagin's maximum principle and proves that the optimal policy has a threshold form in certain cases. Simulations based on both synthetic and real motion traces show the accuracy of the framework. Extensive numerical results show that the optimal policy obtained in this way outperforms the other policies considered.

1. Introduction

With the spread of mobile operating systems such as iPhone OS, Android, and Windows Phone OS, mobile phones have evolved from simple voice communication devices into powerful devices that provide a variety of services to users. For example, they can transfer videos or music in a peer-to-peer (P2P) way through their built-in short-range communication technologies (Bluetooth, WiFi, etc.). Because of the limited communication range and the mobility of nodes, it is hard to maintain a continuous end-to-end path between them, as classic Internet P2P applications do [1]. Such a network belongs to the class of delay tolerant networks (DTN), in which an end-to-end path between communicating nodes may not exist [2]. The concept of DTN was proposed to support many emerging networking applications; other classic examples include deep-space exploration [3], military networks [4], and vehicular networks [5].

Because communication opportunities are uncertain, message propagation in DTN often needs help from other nodes. In particular, nodes in DTN adopt a store-carry-forward strategy, which exploits opportunistic connectivity and node mobility to relay messages. In other words, when no next hop is available, the current node stores the message in its buffer, carries it along as it moves, and forwards it when a new communication opportunity arises. This method requires nodes to cooperate: after getting the message, a node should stay in the network and forward the message to others. However, selfish nodes have no incentive to do so [6, 7]. A node may leave the network as soon as it receives the message, and such nodes can be seen as free riders [6, 7]. For example, users may switch off their communication devices (e.g., Bluetooth) to save their limited energy. If all nodes in the network become free riders, other nodes can get the message only from the source, which is similar to the direct transmission protocol [8] and is very inefficient. In fact, the source can obtain a certain benefit if other nodes get the message in a timely manner. For example, if the source is an advertiser, it wants other nodes (consumers) to get the message (advertisement) promptly. Such benefit may vary with time; for example, the sooner the nodes get the message, the greater the benefit. Therefore, the source has an incentive to push the message to other nodes quickly. To achieve this goal, it has to pay a certain reward to the relay nodes to make them cooperative. This reward may vary with time too. For example, the longer nodes stay in the network, the more energy they use, so they may ask for more reward.

In fact, nodes (e.g., phones, PDAs) are often devices operated by humans [9, 10], and the buffer space or the forwarding ability of nodes can be seen as goods. Therefore, when the source requests help from other nodes, it can be seen as buying goods from humans. The source may pay with virtual currency [11], a service discount [12], and so forth. The message propagation process can thus be seen as a commodity trading process in which humans want to maximize their reward, so they may adjust the price of their goods according to the market state. For example, if the remaining lifetime of the message is short, they may think that the source is eager to transmit the message as soon as possible, so their goods (e.g., the forwarding service) are important to the source. In this case, they may increase the price. On the other hand, if the remaining lifetime of the message is long, they may think that the source is not eager to transmit the message quickly and is not willing to pay a high fee. In this case, they may help the source for only a small reward. Therefore, the price of the goods (e.g., the forwarding service) may vary with time. In this environment, whether to make nodes cooperative at a specific time is an important problem for the source. For example, suppose that node j is willing to help the source for m nuggets (the price of the goods) at time $t_{1}$ but requires only n nuggets at time $t_{2}$. If $m > n$ and $t_{1} < t_{2}$, the source pays fewer nuggets when it requests help from node j at time $t_{2}$, but waiting may decrease the message propagation speed, which is not good for the source. On the other hand, if the source uses fewer nuggets to make node j cooperative, more nuggets remain, so the source may be able to make more nodes cooperative, which is good for the source. Therefore, the optimal policy of the source depends on time. In this case, maximizing the total income of the source is not a simple problem, and solving it is our main contribution.

The main contributions of this paper can be summarized as follows.

(i) We are the first to consider the optimal incentive policy that achieves the best trade-off when the reward varies with time.

(ii) We propose a unifying framework based on a continuous time Markov process, which can be used to evaluate the trade-off between the benefit and expenditure of the source under different incentive policies.

(iii) Based on the framework, we formulate an optimization problem. Through Pontryagin's maximum principle, we explore the optimal control problem and prove that the optimal policy conforms to the threshold form in some cases. By comparing the simulation results with the theoretical results, we show that our theoretical framework is accurate. In addition, we compare the performance of the optimal policy with other policies through extensive numerical results and find that the optimal policy obtained by our model performs best.

2. Related Works

Our work is related to the optimal control problem for the epidemic routing (ER) algorithm, such as the works in [13, 14]. These works mainly study how to maximize the average delivery ratio when energy is limited, and in them the energy consumed by forwarding the message once does not depend on time. Therefore, these methods cannot solve the optimization problem in our paper, in which the reward that the relay nodes require depends on time. The work in [15] studies a problem similar to ours, but it seeks the optimal forwarding policy when the total fees are limited, whereas we study the trade-off between income and expenditure.

There are many other selfish behaviors, such as individual selfishness and social selfishness [16, 17], and some works study their impact. For example, Li et al. study the impact of social selfishness on the epidemic routing protocol [17]. They then explore the impact of both individual selfishness and social selfishness on multicasting in DTN [18]. However, these selfish behaviors depend on the social relations between nodes. For example, social selfishness describes selfish behavior that depends on friendship between nodes, so the distribution of friends has a certain impact. Existing studies have shown that different nodes may have different numbers of friends; in particular, the distribution of friends may conform to a power law distribution [19]. Therefore, if we consider those selfish behaviors, we have to classify the nodes according to their number of friends, which leads to a control problem with multiple parameters. This interesting problem is an extension of our work, and we will study it in the future.

3. Network Model

Suppose that there are one source S, N relay nodes, and one destination node D. At time 0, only the source has the message, and it wants the destination to obtain the message before the maximal lifetime T. To achieve this goal, the source needs help from other nodes. However, it has to pay a certain reward every time it makes a relay node cooperative, so it may not do this all the time. Note that only the relay nodes that have the message can forward it to others, so the source makes only these nodes cooperative. In other words, the source is not willing to pay a reward to nodes that do not have the message. In this paper, we assume that the source makes a relay node (e.g., node m) that has the message cooperative with probability $μ (t)$ at time t, and then m gets the required reward denoted by $α (t)$ . As discussed in the Introduction, the function $α (t)$ may vary with time. On the other hand, the source obtains a certain benefit denoted by $β (t)$ if the destination gets the message at time t. In addition, we assume that all of the relay nodes are willing to receive the message. This assumption is reasonable: if they get the message, they may get a reward from the source, so they have an incentive to receive it.

Nodes in the network can communicate with each other only when they come into each other's transmission range, which constitutes a communication contact, so the mobility rule of nodes is critical. In this paper, we assume that contacts between two nodes occur according to a Poisson process. This assumption has been used in wireless communications for many years. Some works show that this assumption is only an approximation of the real contact process. For example, the work in [20] reveals that nodes encounter each other according to a power law distribution. However, it also finds that for long traces the tail of the distribution is exponential. Furthermore, a more recent work in [21] studies a vehicle dataset in a large-scale urban environment and finds that the intermeeting time can be modeled by a three-segment distribution. Though the first and second parts of the contact intervals do not obey the exponential distribution, it also recognizes that the tail does. In addition, the work in [22] shows that individual intermeeting time can be shaped to be exponential by choosing an appropriate domain size with respect to the given time scale. Moreover, some works describe the intermeeting time of humans or vehicles by an exponential distribution and validate their model experimentally on real motion traces [23, 24]. For this reason, the exponential model is still widely used in many existing works, such as [25–27]. In this paper, we also use this model and assume that the intermeeting time between two nodes follows an exponential distribution with parameter λ. Simulations based on both synthetic and real motion traces show that our theoretical framework based on this assumption is accurate.

Besides the intermeeting time, many other factors can affect the routing performance, such as the contact duration, bandwidth, and message size. If the bandwidth is big enough, the message may be transmitted successfully in one contact. However, if the bandwidth is too small, it may be hard to transmit the message in one contact, even if the contact duration is long. Some works find that the distribution of the contact duration may conform to the Pareto distribution [28, 29]. However, the Pareto distribution is hard to use when analyzing the routing performance theoretically. Therefore, most previous works that explore the routing performance theoretically ignore the impact of the contact duration and assume that a contact is long enough to transmit the message, such as [13–17]. In this paper, we use the same assumption. Note that the assumption is reasonable when the message is small or the bandwidth is very big.

The commonly used variables of this paper can be seen in Table 1.

Table 1

The list of commonly used variables.

N	Number of relay nodes
S	The source node
D	The destination
λ	Exponential parameter of the intermeeting time (the biggest value)
T	The maximal lifetime of the message
$μ (t)$	The probability that the source makes a relay node cooperative at time t
$α (t)$	The required reward at time t
$β (t)$	The benefit the source can get if the destination gets the message at time t
$X (t)$	The number of relay nodes carrying message at time t
$F (t)$	The delivery ratio at time t
C	The maximal energy

4. Problem Formulation

4.1. Theoretical Framework

Let $X (t)$ denote the number of relay nodes that have message at time t. Because only the source has message at time 0, we have $X (0) = 0$ . Given a small time interval $Δ t$ , we have

\begin{matrix} X (t + Δ t) = X (t) + \sum_{m \in Y (t)} φ_{m} (t, t + Δ t) . \end{matrix}

(1)

Symbol $Y (t)$ denotes the set of relay nodes that do not have the message at time t, so the cardinality of this set is $N - X (t)$ . $φ_{m} (t, t + Δ t)$ indicates whether the relay node m gets the message in time interval $[t, t + Δ t]$ . If $φ_{m} (t, t + Δ t) = 1$ , node m successfully obtains the message; if $φ_{m} (t, t + Δ t) = 0$ , it does not. Note that a relay node can get the message only from the source or from other cooperative relay nodes. In addition, the intermeeting time of two nodes is exponentially distributed with parameter λ. Therefore, node m encounters a specific node (e.g., n) in time interval $[t, t + Δ t]$ with probability $1 - e^{- λ Δ t}$ . If node n is the source, node m gets the message immediately. However, if node n is a relay node, m can get the message from n only when n is cooperative. A relay node is cooperative at time t with probability $μ (t)$ , so the total probability that node m gets the message in interval $[t, t + Δ t]$ is

\begin{matrix} p (φ_{m} (t, t + Δ t) = 1) = 1 - e^{- λ Δ t} {(1 - μ (t) (1 - e^{- λ Δ t}))}^{X (t)} . \end{matrix}

(2)

Combining with (1) and (2), we can get

\begin{array}{l} E (X (t + Δ t)) = E (X (t)) \\ + (N - E (X (t))) E (φ_{m} (t, t + Δ t)) . \end{array}

(3)

Further, we can obtain

\begin{matrix} \lim_{Δ t \to 0} \frac{E (φ_{m} (t, t + Δ t))}{Δ t} = λ (1 + μ (t) X (t)) . \end{matrix}

(4)
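
The limit in (4) can be checked numerically. The following is a minimal sketch: the function names and the parameter values are illustrative assumptions and are not taken from the paper. It evaluates the probability in (2) for shrinking $Δ t$ and compares the ratio with $λ (1 + μ (t) X (t))$.

```python
import math


def relay_gets_message_prob(lam, mu, x, dt):
    """Equation (2): probability that a message-free relay gets the message in [t, t + dt]."""
    meet = -math.expm1(-lam * dt)  # 1 - exp(-lam * dt): meeting one specific node
    no_source = 1.0 - meet  # the source is not met in the interval
    no_relay = (1.0 - mu * meet) ** x  # none of the x carriers delivers the message
    return 1.0 - no_source * no_relay


def limiting_rate(lam, mu, x):
    """Right-hand side of equation (4)."""
    return lam * (1.0 + mu * x)


if __name__ == "__main__":
    lam, mu, x = 1e-4, 0.6, 40  # illustrative values only
    for dt in (100.0, 10.0, 1.0, 0.1):
        ratio = relay_gets_message_prob(lam, mu, x, dt) / dt
        print(dt, ratio, limiting_rate(lam, mu, x))
```

As $Δ t$ shrinks, the printed ratio should approach the limiting rate, which is the step that turns (3) into the differential equation that follows.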

Then we have

\begin{array}{l} E (\dot{X} (t)) = \lim_{Δ t \to 0} \frac{E (X (t + Δ t)) - E (X (t))}{Δ t} \\ = (N - E (X (t))) \lim_{Δ t \to 0} \frac{E (φ_{m} (t, t + Δ t))}{Δ t} \\ = λ (N - E (X (t))) (1 + μ (t) X (t)) . \end{array}

(5)

One main metric of a routing algorithm in DTN is the delivery ratio, which denotes the probability that the destination D obtains the message within a given time. Let $F (t)$ denote the delivery ratio when the given time is t. Before deriving its value, we introduce another symbol $H (t) = 1 - F (t)$ , which denotes the probability that D does not obtain the message before time t. Moreover, let $H (t, t + Δ t)$ denote the probability that D does not get the message in time interval $[t, t + Δ t]$ . Therefore, we have

\begin{matrix} H (t + Δ t) = H (t) H (t, t + Δ t) . \end{matrix}

(6)

Similar to the relay nodes, D may get message from the source or the cooperative relay nodes. Therefore, we have

\begin{matrix} H (t, t + Δ t) = e^{- λ Δ t} {(1 - μ (t) (1 - e^{- λ Δ t}))}^{X (t)} . \end{matrix}

(7)

Further, we can obtain

\begin{matrix} E (\dot{H} (t)) = - λ E (H (t)) (1 + μ (t) X (t)) \\ E (\dot{F} (t)) = λ (1 - E (F (t))) (1 + μ (t) X (t)) . \end{matrix}

(8)
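
Equations (5) and (8) form a small system of ordinary differential equations that can be integrated numerically. The sketch below uses forward Euler, which is an assumption made for brevity (any standard ODE solver would do), and takes $F (0) = 0$ because only the source has the message at time 0. The function name and the parameter values are illustrative.

```python
import math


def integrate_state(n_relays, lam, mu, horizon, steps=20000):
    """Forward-Euler integration of (5) and (8); returns (E[X(T)], E[F(T)]).

    mu is a callable t -> cooperation probability in [0, 1].
    """
    dt = horizon / steps
    x, f = 0.0, 0.0  # X(0) = 0 and F(0) = 0
    for k in range(steps):
        rate = lam * (1.0 + mu(k * dt) * x)  # common factor lam * (1 + mu X)
        x, f = x + dt * (n_relays - x) * rate, f + dt * (1.0 - f) * rate
    return x, f


if __name__ == "__main__":
    n, lam, horizon = 100, 1e-4, 10000.0  # illustrative values only
    print("mu = 0:", integrate_state(n, lam, lambda t: 0.0, horizon))
    print("mu = 1:", integrate_state(n, lam, lambda t: 1.0, horizon))
    # For mu = 0 the system is linear and has the closed form below.
    print("closed form, mu = 0:", n * (1 - math.exp(-lam * horizon)), 1 - math.exp(-lam * horizon))
```

For $μ (t) = 0$ the equations reduce to $E (X (t)) = N (1 - e^{- λ t})$ and $E (F (t)) = 1 - e^{- λ t}$ , which gives a convenient sanity check for the integrator.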

Let $U (t)$ denote the total income of the source up to time t, which equals the benefit minus the expenditure. Therefore, we have

\begin{array}{l} U (t + Δ t) = U (t) - μ (t + Δ t) α (t) X (t + Δ t) Δ t \\ + β (t) ρ_{D} (t, t + Δ t) . \end{array}

(9)

Time interval $Δ t$ is very small, so we can assume that the behavior of the source remains unchanged within it. That is, the source makes a relay node that has the message cooperative with the same probability $μ (t + Δ t)$ throughout the interval. In addition, the number of relay nodes that have the message can be denoted by $X (t + Δ t)$ . Because the source has to pay a reward when it makes a relay node cooperative and the reward is $α (t)$ at time t, the total reward the source has to pay is $μ (t + Δ t) X (t + Δ t) α (t) Δ t$ . In addition, if the destination gets the message, the source obtains a certain benefit. Symbol $ρ_{D} (t, t + Δ t)$ indicates whether D gets the message in interval $[t, t + Δ t]$ , so we can obtain (9).

Because nodes that have the message do not receive the same message again, the event $ρ_{D} (t, t + Δ t) = 1$ can happen only if the destination D did not have the message before time t. In other words, we have

\begin{array}{l} p (ρ_{D} (t, t + Δ t) = 1) = H (t) (1 - H (t, t + Δ t)) \\ = F (t + Δ t) - F (t) . \end{array}

(10)

Combining with (9) and (10), we can obtain

\begin{matrix} E (\dot{U} (t)) = β (t) E (\dot{F} (t)) - μ (t) α (t) E (X (t)) . \end{matrix}

(11)

Based on (11), we have

\begin{matrix} E (U (T)) = \int_{0}^{T} (β (t) E (\dot{F} (t)) - μ (t) α (t) E (X (t))) d t . \end{matrix}

(12)
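
The integral in (12) can be evaluated numerically for any given policy. The following sketch accumulates the integrand of (12) while integrating (5) and (8) with forward Euler; the solver choice, the function name, and all parameter values are illustrative assumptions rather than the authors' implementation.

```python
import math


def expected_income(n_relays, lam, mu, alpha, beta, horizon, steps=20000):
    """Approximate E[U(T)] from (12) for a policy mu(t), reward alpha(t), and benefit beta(t)."""
    dt = horizon / steps
    x = f = u = 0.0  # X(0) = 0, F(0) = 0, U(0) = 0
    for k in range(steps):
        t = k * dt
        m = mu(t)
        rate = lam * (1.0 + m * x)
        f_dot = (1.0 - f) * rate  # equation (8)
        u += dt * (beta(t) * f_dot - m * alpha(t) * x)  # integrand of (12)
        x += dt * (n_relays - x) * rate  # equation (5)
        f += dt * f_dot
    return u


if __name__ == "__main__":
    n, lam, horizon = 100, 1e-4, 10000.0  # illustrative values only
    alpha = lambda t: 0.001  # hypothetical constant reward
    beta = lambda t: 100.0  # hypothetical constant benefit
    for name, mu in (("mu = 0", lambda t: 0.0), ("mu = 1", lambda t: 1.0)):
        print(name, expected_income(n, lam, mu, alpha, beta, horizon))
```

With a constant benefit b and $μ (t) = 0$ , no reward is paid and the income reduces to $b (1 - e^{- λ T})$ , which can be used to check the routine.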

T is the maximal lifetime of the message, and our objective is to maximize the value of $E (U (T))$ , which is a functional of $μ (t)$ . That is, our objective is to solve the following problem:

\begin{array}{l} \max_{μ (t)} \; E (U (T)) \\ \text{subject to } μ (t) \in [0,1], \; t \in [0, T] . \end{array}

(13)

4.2. Optimal Control

The above problem is an optimal control problem in which $μ (t)$ is the control variable. We use Pontryagin's maximum principle in ([30, P. 109, Theorem 3.14]) to solve it. According to the principle, we should first form the Hamiltonian function.

Let $((X, F), μ)$ be an optimal solution. In particular, at time t, X denotes the value of $E (X (t))$ and F denotes the value of $E (F (t))$ . Similarly, μ denotes the value of $μ (t)$ . According to [30], the Hamiltonian is the integrand of the objective function plus the derivatives of the state functions weighted by the corresponding costates $λ_{F}$ and $λ_{X}$ . We therefore obtain the Hamiltonian H (not to be confused with $H (t) = 1 - F (t)$ in Section 4.1):

\begin{array}{rcl} H & = & β \dot{F} - μ α X + λ_{F} \dot{F} + λ_{X} \dot{X} \\ & = & (β + λ_{F}) \dot{F} + λ_{X} \dot{X} - μ α X \\ & = & λ (β + λ_{F}) (1 - F) (1 + μ X) \\ & & + λ λ_{X} (1 + μ X) (N - X) - μ α X . \end{array}

(14)

Note that, at time t, α and β are simple expressions of $α (t)$ and $β (t)$ , respectively. Based on (14), we have

\begin{array}{l} g = \frac{\partial H}{\partial μ} = λ (β + λ_{F}) (1 - F) X \\ + λ λ_{X} X (N - X) - α X . \end{array}

(15)

Then, we can get the costate or adjoint functions $λ_{X}$ and $λ_{F}$ based on the Hamiltonian H [30]:

\begin{array}{l} \dot{λ_{F}} = - \frac{\partial H}{\partial F} = λ (β + λ_{F}) (1 + μ X) \\ \dot{λ_{X}} = - \frac{\partial H}{\partial X} = - λ (β + λ_{F}) (1 - F) μ \\ + λ λ_{X} (1 + μ X) - λ λ_{X} μ (N - X) + μ α . \end{array}

(16)
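
The costate equations in (16) are the negated partial derivatives of the Hamiltonian in (14), and this can be verified mechanically. The sketch below compares the closed-form right-hand sides of (16) with central finite differences of (14) at an arbitrary test point. The function names and the test point are hypothetical and serve only as a consistency check.

```python
import math


def hamiltonian(x, f, mu, lam_x, lam_f, lam, n, alpha, beta):
    """Equation (14) at one time instant (lam is the contact rate, lam_x/lam_f the costates)."""
    rate = lam * (1.0 + mu * x)
    return (beta + lam_f) * (1.0 - f) * rate + lam_x * (n - x) * rate - mu * alpha * x


def costate_rhs(x, f, mu, lam_x, lam_f, lam, n, alpha, beta):
    """Right-hand sides of (16): returns (d lam_x / dt, d lam_f / dt)."""
    lam_f_dot = lam * (beta + lam_f) * (1.0 + mu * x)
    lam_x_dot = (-lam * (beta + lam_f) * (1.0 - f) * mu
                 + lam * lam_x * (1.0 + mu * x)
                 - lam * lam_x * mu * (n - x)
                 + mu * alpha)
    return lam_x_dot, lam_f_dot


def finite_difference_costates(x, f, mu, lam_x, lam_f, lam, n, alpha, beta, eps=1e-4):
    """-dH/dX and -dH/dF by central differences."""
    rest = (mu, lam_x, lam_f, lam, n, alpha, beta)
    dh_dx = (hamiltonian(x + eps, f, *rest) - hamiltonian(x - eps, f, *rest)) / (2 * eps)
    dh_df = (hamiltonian(x, f + eps, *rest) - hamiltonian(x, f - eps, *rest)) / (2 * eps)
    return -dh_dx, -dh_df


if __name__ == "__main__":
    point = (30.0, 0.4, 0.7, 2.0, -1.5, 0.01, 100, 0.5, 10.0)  # arbitrary test point
    print(costate_rhs(*point))
    print(finite_difference_costates(*point))
```

The two printed pairs should agree up to rounding error, since H is quadratic in X and linear in F.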

The transversality conditions are shown as follows [30]:

\begin{matrix} λ_{F} (T) = λ_{X} (T) = 0 . \end{matrix}

(17)

Then according to Pontryagin's maximum principle in ([30, P. 109, Theorem 3.14]), there exist continuous or piece-wise continuously differentiable state and costate functions, which satisfy

\begin{matrix} μ \in \arg \max_{0 \leq μ^{*} \leq 1} H (λ_{F}, λ_{X}, (F, X), μ^{*}) . \end{matrix}

(18)

This equation between the optimal control parameter μ and the Hamiltonian H allows us to express μ as a function of the state $(X, F)$ and costate ( $λ_{X}, λ_{F}$ ), resulting in a system of differential equations involving only the state and costate functions and not the control function. In fact, this equation means that maximizing the value of $E (U (T))$ equals maximizing the corresponding Hamiltonian H. In particular, at given time t, the state $(X, F)$ and costate $(λ_{X}, λ_{F})$ can be seen as constants, and $μ (t)$ can maximize H at this time. Therefore, according to (15), we can obtain the optimal policy as follows:

\begin{matrix} μ = \begin{cases} 1, & g > 0 \\ 0, & g < 0 . \end{cases} \end{matrix}

(19)
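
One way to see why (19) takes only the values 0 and 1 is to regroup (14) by powers of μ. The following is a direct rearrangement of (14) using the definition of g in (15) and introduces no new symbols:

\begin{array}{rcl} H & = & λ (β + λ_{F}) (1 - F) + λ λ_{X} (N - X) + μ g . \end{array}

H is therefore affine in μ with slope g, so its maximum over $[0,1]$ is attained at $μ = 1$ when $g > 0$ and at $μ = 0$ when $g < 0$ . When $g = 0$ , H does not depend on μ, which is why (19) leaves that case unspecified.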

Below, we prove that when the functions $α (t)$ and $β (t)$ satisfy certain conditions, the optimal policy has a simple structure. The conditions are as follows: $α (t)$ is increasing with time t, and $β (t)$ is a non-increasing function; $α (t)$ and $β (t)$ are continuous and differentiable; both are nonnegative. The maximal lifetime $(T)$ of the message is fixed, so the bigger t is, the shorter the remaining lifetime ( $T - t$ ) is. In this case, the relay nodes may think that the source is eager to transmit the message to D quickly, so they may ask for more reward. That is to say, the bigger t is, the bigger $α (t)$ may be. Therefore, the condition that $α (t)$ is increasing is reasonable in some environments. On the other hand, it is better if the destination gets the message earlier, so the assumption that $β (t)$ is a non-increasing function is reasonable in certain applications too.

If the above conditions are satisfied, the optimal policy conforms to the threshold form and has at most one jump. In particular, we have the following theorem.

Theorem 1.

If $α (t)$ and $β (t)$ satisfy the above conditions, the optimal policy satisfies $μ (t) = 1$ , $t < h$ , and $μ (t) = 0$ , $t > h$ , $0 \leq h \leq T$ .

Proof.

First, note that the functions $α (t)$ and $β (t)$ are nonnegative. In addition, we simply use $X (t)$ , $Y (t)$ , and $F (t)$ to denote $E (X (t))$ , $E (Y (t))$ , and $E (F (t))$ in the proving process, respectively.

When $X = 0$ , none of the relay nodes has message, so the value of μ cannot have any impact. Therefore, we only consider the case that $X > 0$ . Based on (15), we define

\begin{matrix} f = \frac{g}{X} = λ (β + λ_{F}) (1 - F) + λ λ_{X} (N - X) - α . \end{matrix}

(20)

Then, we can get

\begin{array}{rcl} \dot{f} & = & λ (\dot{β} + \dot{λ_{F}}) (1 - F) - λ (β + λ_{F}) \dot{F} \\ & & + λ \dot{λ_{X}} (N - X) - λ λ_{X} \dot{X} - \dot{α} \\ & = & λ (\dot{β} + \dot{λ_{F}}) (1 - F) - λ^{2} (β + λ_{F}) (1 - F) (1 + μ X) \\ & & + λ \dot{λ_{X}} (N - X) - λ^{2} λ_{X} (1 + μ X) (N - X) - \dot{α} . \end{array}

(21)

Combining with (16), we have

\begin{array}{rcl} \dot{f} & = & λ \dot{β} (1 - F) + λ \dot{λ_{X}} (N - X) \\ & & - λ^{2} λ_{X} (1 + μ X) (N - X) - \dot{α} \\ & = & λ \dot{β} (1 - F) + λ (N - X) (\dot{λ_{X}} - λ λ_{X} (1 + μ X)) - \dot{α} . \end{array}

(22)

In addition, from (16), we can obtain

\begin{array}{rcl} \dot{λ_{X}} & = & - \frac{\partial H}{\partial X} = - λ (β + λ_{F}) (1 - F) μ \\ & & + λ λ_{X} (1 + μ X) - λ λ_{X} μ (N - X) + μ α \\ & = & - μ (λ (β + λ_{F}) (1 - F) + λ λ_{X} (N - X) - α) \\ & & + λ λ_{X} (1 + μ X) \\ & = & - μ f + λ λ_{X} (1 + μ X) . \end{array}

(23)

Suppose that $f (s) = 0$ ; we have

\begin{array}{l} \dot{λ_{X}} (s) = - μ (s) f (s) + λ λ_{X} (s) (1 + μ (s) X (s)) \\ = λ λ_{X} (s) (1 + μ (s) X (s)) . \end{array}

(24)

Combining with (22), we can obtain

\begin{array}{rcl} \dot{f} (s) & = & λ \dot{β} (s) (1 - F (s)) - \dot{α} (s) \\ & & + λ (N - X (s)) (\dot{λ_{X}} (s) - λ λ_{X} (s) (1 + μ (s) X (s))) \\ & = & λ \dot{β} (s) (1 - F (s)) - \dot{α} (s) . \end{array}

(25)

Because $α (t)$ is increasing with time t and $β (t)$ is a non-increasing function, we have

\begin{matrix} \dot{β} (s) \leq 0, \dot{α} (s) > 0 . \end{matrix}

(26)

Further, $1 - F (s) \geq 0$ , so we have

\begin{matrix} \dot{f} (s) = λ \dot{β} (s) (1 - F (s)) - \dot{α} (s) < 0 . \end{matrix}

(27)

That is, if $f (s) = 0$ , the function will decrease at time s.

Then we assume that $f (s) < 0$ . Based on (19), we have $μ (s) = 0$ . Combining with (22) and (23), we also can obtain (25). Further, we can get (27) and know that $f (t)$ will decrease at time s.

In summary, if $f (s) \leq 0$ , $f (t)$ will decrease at time s. Therefore, if $f (s) \leq 0$ , we have $f (t) < 0$ , $t > s$ . Further, according to (19), the optimal policy satisfies $μ (t) = 1$ , $t < h$ , and $μ (t) = 0$ , $t > h$ , $0 \leq h \leq T$ . That is, once $μ (t) \neq 1$ , it will be 0 later and then remain unchanged all the time, so the optimal policy conforms to the threshold form and has at most one jump. This proves that Theorem 1 is correct.

5. Model Validation and Performance Analysis

5.1. Model Validation

In this section, we check the accuracy of our framework by comparing the theoretical results obtained by our model with simulation results. We run several simulations using the opportunistic network environment (ONE) [31] based on three different scenarios. In the first one, we use the random waypoint (RWP) mobility model [32], which is commonly used for mobile wireless networks. There are 500 nodes in total, and all of them move according to the RWP model within a 10000 m × 10000 m terrain with a scalar speed chosen uniformly from 4 m/s to 10 m/s. The communication range is 5 m. The source and destination nodes are randomly selected among these nodes. In the second scenario, we use a real motion trace collected by GPS from about 2100 operational taxis over about one month in Shanghai city [33]. The location of each taxi is recorded every 40 seconds within an area of 102 km². We randomly pick 500 nodes from this trace, and the source and destination nodes are again randomly selected among them. The third scenario is based on the dataset collected at the Infocom 2005 conference [34]. This dataset includes 41 attendees, whose devices connect with each other by Bluetooth. Among those attendees, we randomly select two nodes as the source and destination, respectively.

The functions $α (t)$ and $β (t)$ may take any form. For simplicity, we define $α (t) = (1 - e^{- t / 10000}) / 1000$ and $β (t) = 1000 e^{- t / 10000}$ . The value of $μ (t)$ may likewise be any value between 0 and 1 at time t. Because our main goal here is to check the accuracy of our theoretical framework, we consider only two special cases: case 1: $μ (t) = 1$ , $t \geq 0$ ; case 2: $μ (t) = 0$ , $t \geq 0$ . The first case means that the source makes nodes cooperative all the time, so the message is propagated according to the epidemic routing (ER) algorithm. In the second case, the source does not ask for help from others at all, so the message is propagated according to the direct transmission (DT) algorithm. At the start of each simulation, one message is generated with maximal lifetime T, and each simulation is repeated 20 times. In addition, we let the maximal message lifetime T increase from 0 to 50000 s.

Based on these settings, we can get Figures 1, 2, and 3, respectively.

Figure 1

Simulation and numerical results comparison of total fees with RWP mobility model.

Figure 2

Simulation and numerical results comparison of total fees with Shanghai city motion trace.

Figure 3

Simulation and numerical results comparison of total fees with Infocom 05 motion trace.

From the results, we can see that the average deviation between the theoretical and simulation results is small. For example, the average deviation is about 4.22% for the RWP mobility model and 5.01% for the Shanghai city motion trace. For the Infocom 2005 dataset, the average deviation is about 7.12%. Though this deviation is bigger than that for the RWP and Shanghai city motion traces, the framework can still be regarded as accurate. For this reason, we can use the numerical results obtained by our theoretical framework to evaluate the performance of different policies.
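
The comparison between theory and simulation can be imitated at a very small scale without a mobility simulator. The sketch below is not the ONE-based setup used above. It is a simplified stand-in that samples the continuous time Markov chain underlying (5) directly (a Gillespie-style simulation with a constant μ), where the number of carriers jumps from X to X + 1 at rate $λ (N - X) (1 + μ X)$ . The function names, the seed, and the parameter values are illustrative assumptions.

```python
import math
import random


def sample_relay_count(n_relays, lam, mu, horizon, rng):
    """Draw one sample of X(horizon) from the pure-birth chain with constant mu."""
    t, x = 0.0, 0
    while x < n_relays:
        rate = lam * (n_relays - x) * (1.0 + mu * x)  # total rate of the next infection
        t += rng.expovariate(rate)  # exponential waiting time
        if t > horizon:
            break
        x += 1
    return x


def mean_relay_count(n_relays, lam, mu, horizon, runs=2000, seed=1):
    """Average X(horizon) over independent runs."""
    rng = random.Random(seed)
    total = sum(sample_relay_count(n_relays, lam, mu, horizon, rng) for _ in range(runs))
    return total / runs


if __name__ == "__main__":
    n, lam, horizon = 50, 1e-4, 10000.0  # illustrative values only
    print("simulated mean, mu = 0:", mean_relay_count(n, lam, 0.0, horizon))
    print("theory,         mu = 0:", n * (1.0 - math.exp(-lam * horizon)))
    print("simulated mean, mu = 1:", mean_relay_count(n, lam, 1.0, horizon))
```

For $μ = 0$ the chain is linear and its mean equals $N (1 - e^{- λ T})$ exactly. For $μ > 0$ , (5) is a mean-field approximation, so a small gap between the simulated mean and the ODE solution is to be expected.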

In addition, the results above also show that the performance is different when the source adopts different policies. In particular, the results in Figures 1 and 2 show that it is not good for the source to request help all the time. For example, when the value of T is bigger than 4000 s in Figure 2, the total income of the source may be negative if it requests help all the time. This shows that the policy of the source can have important impact on its total income, and this means that our optimal control policy is necessary. Later, we will show that the optimal policy obtained by (19) is the best through extensive numerical results.

5.2. Performance Analysis with Numerical Results

In this section, we use the best fitting for the Shanghai city motion trace in the above simulation to describe the exponential distribution of the intermeeting time between nodes.

First, we evaluate the performance of the optimal policy obtained by (19). For comparison, we consider three other cases: case 1: $μ (t) = 1$ , $t \geq 0$ ; case 2: $μ (t) = 0$ , $t \geq 0$ ; case 3: random. The random policy means that the value of $μ (t)$ is randomly selected from the interval $[0,1]$ at time t. Other settings are the same as those in simulation, and then we can obtain Figure 4.

Figure 4

Performance comparison with different policies.

The result in Figure 4 shows that the optimal policy performs best among the compared policies. Under the optimal policy, the source always gets the maximal total income. This agrees with our optimal control analysis.

Now, we further compare the performance of different policies when the number of relay nodes is different. In this case, we assume that the maximal message lifetime T equals 10000 s, and let the number of relay nodes increase from 50 to 1000. Other settings remain unchanged. Numerical result is shown in Figure 5, and it demonstrates that the optimal policy obtained by (19) is the best too.

Figure 5

Performance comparison with different policies when the number of relay nodes is different.

In addition, the total income under the optimal policy increases with the number of nodes. When there are more nodes, the source can request help from more nodes early on. Because the reward that the relay nodes request increases with time (e.g., $α (t) = (1 - e^{- t / 10000}) / 1000$ is an increasing function), this behavior decreases the cost of the source, and the source stops requesting help earlier. As shown in Theorem 1, the source stops making relay nodes cooperative at a certain time h. In particular, the optimal policy satisfies $μ (t) = 1$ , $t < h$ , and $μ (t) = 0$ , $t > h$ , $0 \leq h \leq T$ . When the number of nodes is bigger, the value of h is smaller, so the source stops requesting help earlier and pays less reward. The result in Figure 6 shows that h indeed decreases with the number of nodes. When the number of relay nodes is smaller, the source has to ask for help for a longer time. For example, when there are 50 relay nodes, the source requests help nearly all the time.

Figure 6

The value of the threshold under the optimal policy when the number of relay nodes is different.
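
When the conditions of Theorem 1 hold, finding the optimal policy reduces to a one-dimensional search over the threshold h. The sketch below performs a plain grid search using $α (t)$ and $β (t)$ from Section 5.1 and a forward-Euler evaluation of (12). The contact rate `lam` is a hypothetical value, since the fitted value for the Shanghai trace is not stated here, and the grid search itself is an assumption, not necessarily how the thresholds in Figure 6 were computed.

```python
import math


def alpha(t):
    """Reward demanded at time t (Section 5.1 setting)."""
    return (1.0 - math.exp(-t / 10000.0)) / 1000.0


def beta(t):
    """Benefit if the destination gets the message at time t (Section 5.1 setting)."""
    return 1000.0 * math.exp(-t / 10000.0)


def threshold_income(h, n_relays, lam, horizon, steps=4000):
    """E[U(T)] from (12) when mu(t) = 1 for t < h and mu(t) = 0 otherwise."""
    dt = horizon / steps
    x = f = u = 0.0
    for k in range(steps):
        t = k * dt
        mu = 1.0 if t < h else 0.0
        rate = lam * (1.0 + mu * x)
        f_dot = (1.0 - f) * rate
        u += dt * (beta(t) * f_dot - mu * alpha(t) * x)
        x += dt * (n_relays - x) * rate
        f += dt * f_dot
    return u


def best_threshold(n_relays, lam, horizon, grid=50):
    """Grid search for the threshold h in [0, T] with the largest income."""
    candidates = [horizon * i / grid for i in range(grid + 1)]
    return max(candidates, key=lambda h: threshold_income(h, n_relays, lam, horizon))


if __name__ == "__main__":
    lam, horizon = 1e-4, 10000.0  # lam is a hypothetical contact rate
    for n in (50, 500):
        h = best_threshold(n, lam, horizon)
        print(n, h, threshold_income(h, n, lam, horizon))
```

The grid resolution limits the accuracy of the reported h. A finer grid or a scalar optimizer could be substituted once the contact rate has been fitted to a trace.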

On the other hand, the result in Figure 6 also shows that the optimal policy indeed conforms to the threshold form. Figure 7 shows this more clearly for the cases of 500 and 1000 relay nodes. That is, the source asks for help from others with probability 1 before the threshold h and then stops doing so.

Figure 7

The optimal policy when the number of relay nodes is different.

In the above simulation and numerical results, we define $α (t) = (1 - e^{- t / 10000}) / 1000$ , which is an increasing function, and the optimal policy conforms to the threshold form in this case. However, $α (t)$ may take any form. In the rest of this section, we examine whether the threshold policy is still the best when $α (t)$ has a different form. In particular, we define $α (t) = 50$ for $t \leq 5000$ s and $α (t) = 0$ for $t > 5000$ s. It is easy to see that this $α (t)$ is not an increasing function. Other settings are the same as those in the simulation. Based on these settings, we obtain Figure 8.
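The difference between the two reward functions can be checked numerically. The sketch below is illustrative only: the helper `is_nondecreasing` and the function names are hypothetical, and the check samples the functions on a finite grid over $[0, T]$ rather than proving monotonicity.

```python
import math


def alpha_increasing(t: float) -> float:
    """Increasing reward used in the earlier experiments."""
    return (1.0 - math.exp(-t / 10000.0)) / 1000.0


def alpha_step(t: float) -> float:
    """Step-shaped reward: 50 up to 5000 s, 0 afterwards."""
    return 50.0 if t <= 5000.0 else 0.0


def is_nondecreasing(f, T: float, steps: int = 1000) -> bool:
    """Check f(t_i) <= f(t_{i+1}) on an evenly spaced grid over [0, T]."""
    values = [f(T * i / steps) for i in range(steps + 1)]
    return all(a <= b for a, b in zip(values, values[1:]))


T = 10000.0
increasing_ok = is_nondecreasing(alpha_increasing, T)  # holds on the grid
step_ok = is_nondecreasing(alpha_step, T)              # fails at the drop after 5000 s
```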

Figure 8

Performance analysis when $α (t)$ takes a different form.

Note that the optimal policy is obtained by (19), whereas the threshold policy follows the form given in Theorem 1. Each value of h corresponds to one threshold policy, so there are many threshold policies, which we denote by threshold (h). The threshold policy shown in Figure 8 is the one that maximizes the total income of the source among all threshold policies.
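Selecting the best member of the threshold (h) family amounts to a one-dimensional search over h. The sketch below assumes a grid search, which the paper does not necessarily use, and takes the income evaluation as a caller-supplied function. The names `best_threshold` and `toy_income` are hypothetical, and `toy_income` is a placeholder objective only; it is not the paper's framework for total income.

```python
from typing import Callable, Tuple


def best_threshold(
    income: Callable[[float], float], T: float, steps: int = 100
) -> Tuple[float, float]:
    """Grid-search h in [0, T] for the threshold policy with the highest income.

    `income(h)` must return the total income of the source under threshold(h);
    how it is computed is left to the caller.
    """
    candidates = [T * i / steps for i in range(steps + 1)]
    best_h = max(candidates, key=income)
    return best_h, income(best_h)


def toy_income(h: float) -> float:
    """Placeholder objective peaking at h = 3000 s; NOT the paper's model."""
    return -((h - 3000.0) ** 2)


h_star, value = best_threshold(toy_income, T=10000.0, steps=100)
```

With a real income evaluation in place of `toy_income`, the returned `h_star` identifies the single threshold policy that is then compared against the policy obtained by (19).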

The result in Figure 8 shows that even the best threshold policy is worse than the optimal policy obtained by (19). This means that the optimal policy does not conform to the threshold form in this case. Therefore, the form of the function $α (t)$ affects the form of the optimal policy.

6. Conclusions

To increase efficiency, most routing algorithms in DTN need nodes to work cooperatively. In particular, nodes should stay in the network after getting the message so that they can forward it further. However, due to selfishness, nodes have no incentive to stay in the network after getting the message. To make these nodes cooperative, the source has to pay them a certain reward (e.g., $α (t)$ ), and this reward may vary with time. On the other hand, if the destination gets the message in time, the source can obtain a certain reward (e.g., $β (t)$ ) too. For example, the sooner the destination obtains the message, the more reward the source may get. In this paper, we propose a unifying framework to evaluate the total income that the source gets under different policies. Based on this framework, we study the optimal control problem through Pontryagin's maximum principle. In addition, we prove that the optimal policy conforms to the threshold form when $α (t)$ and $β (t)$ satisfy certain conditions. Simulations based on both synthetic and real motion traces show the accuracy of our theoretical framework. Numerical results show that the optimal policy obtained by (19) performs best among the policies compared.

Note that once we know the functions $α (t)$ and $β (t)$ , we can build the theoretical model that evaluates the routing performance under different policies, and we can obtain the optimal policy from (19). In this case, nodes just need to follow the optimal policy. However, $α (t)$ and $β (t)$ are system specific, so their form may not be known in advance. In that case, a learning process is needed to estimate $α (t)$ and $β (t)$ rapidly. Exploring such a learning process for these applications will be our future work.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.
