A Data Item Selection Mechanism for Mobile Opportunistic Networks

Abstract

The lack of a guaranteed end-to-end path makes it difficult to directly adapt traditional routing algorithms for the Internet and mobile ad hoc networks (MANETs) to mobile opportunistic networks (MONs). In this paper, we try to improve the routing performance with an efficient data item selection mechanism that takes the bandwidth and connection duration time into consideration. For the purpose of evaluation, a specific data item selection mechanism for a probability-based routing protocol is devised, which is formally defined as an optimal decision-making problem and solved by dynamic programming. The simulation results show that our data item selection mechanism greatly reduces the number of aborted transmissions, thus enhancing the routing performance in terms of delivery probability, average latency, and overhead ratio.

1. Introduction

Recent years have seen the development of mobile devices such as smartphones, laptops, and tablet PCs, which makes it easier for people to contact each other and share data cheaply. Many researchers use the term “mobile opportunistic networks (MONs)” to describe this special kind of delay-tolerant network (DTN), in which mobile users move around and communicate with each other via the short-range wireless communication devices they carry. Since MONs experience intermittent connectivity caused by the mobility of users, routing is a major and challenging problem [1]. In traditional data networks such as the Internet, the network model usually rests on some assumptions, for example, the existence of at least one end-to-end path between each source-destination pair. Any link connecting two nodes is assumed to be bidirectional, supporting symmetric data rates with low error probability and latency. In addition, the power of each node is considered to be sufficient and thus irrelevant to the node throughput. Messages are buffered in intermediate nodes (e.g., routers) and then either forwarded to the next-hop relay or received by the destination. In this case, each message is not expected to occupy the buffer of a node for a long period of time. However, all the above assumptions usually fail in the context of DTNs. Some applications in DTNs stress the delivery ratio while tolerating a long but acceptable end-to-end latency, which is known as the “delay-tolerant” property. To further popularize these kinds of applications, we have to reconsider the widely used network architecture so as to relax the assumption made by traditional TCP/IP that end-to-end connectivity always exists [2].

There are many research results on the routing problem in DTNs, which greatly improve the performance in network scenarios like MONs. Most of them focus on strategies for choosing relay node(s) during the routing process [3–5], while few studies [6–8] consider the effect of data item selection on the routing performance. However, the combination of long-term storage and the message replication performed by many DTN routing protocols imposes a high bandwidth and storage overhead on wireless nodes [9]. Moreover, the data units disseminated in this context, called bundles, are self-contained, and the application-level data units can often be large [10]. As a result, the nodes' buffers in this context usually operate at full load. Similarly, the available bandwidth for a given connection is likely to be insufficient to forward all the buffered messages. Consequently, regardless of the specific routing algorithm used, it is important to have efficient scheduling policies to decide which message(s) should be exchanged with an encountered node when bandwidth and connection duration are limited.

In this paper, we try to improve the classical probability routing protocol PRoPHET [11] from this point. By defining the concept of “Transmission Profit,” we introduce the “Optimal Throughput-Aware Probabilistic Routing,” which is consequently modeled as an optimal decision-making problem and is solved by the dynamic programming technique. Based on this model, a data item selection mechanism for PRoPHET is devised in this paper. The data item selection mechanism in this paper applies to network scenarios with the following two characteristics.

(1) The average throughput of the connection between each pair of nodes is far smaller than the size of messages in their buffer.

(2) The energy consumption for each transmission may not be ignored, which highlights the importance of every successful relay operation.

For example, the transmission of a message fails when the available throughput of the connection is smaller than the size of the message. One solution to this problem is to send a message whose size does not exceed the connection throughput, thus avoiding the waste of the transmission opportunity. Moreover, even if the throughput of the connection is sufficient for sending any one of the messages, the profit gained from the transmission (e.g., in delivery probability or end-to-end latency) differs depending on which message is selected. From this perspective, an efficient data item selection mechanism is needed in challenged network scenarios.

The rest of this paper is organized as follows. In Section 2 we introduce the system model and the routing model. In Section 3 we formally define the key problem. The improved protocol with the data item selection is given in Section 4. In Section 5 we analyze the simulation result. Section 6 reports on previous similar work. We conclude the paper and discuss future work in Section 7.

2. Preliminary

The mathematical notations are listed in Table 1. In our model, we use a discrete timeline divided into small time slots, each of unit length. We denote the set of all nodes as $N = \{n_{i} \mid 1 \leq i \leq | N |\}$ . Each generated message $m_{k}$ has a time-to-live value $T T L (k)$ , which is preassigned by the corresponding application when the message is generated. Messages are dropped when their TTLs are exhausted. The size of each message $m_{k}$ is denoted by $S (k)$ . We assume that each node $n_{i}$ knows its own bandwidth $B (i)$ . This assumption is acceptable because the network interface of each device is usually fixed. Even if the bandwidth of a node is not stable, its average value can easily be measured with a sliding window. For any node $n_{j}$ encountered by $n_{i}$ , we let $n_{i}$ record two values $t_{s_{(i, j)}}$ and $t_{e_{(i, j)}}$ , the start and end times of the most recent contact, respectively. Then $t_{e_{(i, j)}} - t_{s_{(i, j)}}$ is the duration of this contact, and we simply write $t_{s}$ and $t_{e}$ when there is no ambiguity.

Table 1

Mathematical notations.

Notation	Meaning
$N$	The set of all the nodes in the network
$n_{i}$	The node with the identification number i
$m_{i}$	The message with the identification number i
$T T L (i)$	The time-to-live value of message $m_{i}$
$S (i)$	The size of message $m_{i}$
$ζ (i)$	The transmission profit value of message $m_{i}$
$B_{(a, b)}$	The bandwidth of the connection between $n_{a}$ and $n_{b}$
$t_{s_{(a, b)}} / t_{e_{(a, b)}}$	The start/end time of the most recently happening connection between the $n_{a}$ and $n_{b}$
$τ_{a, b}$	The estimated value of the connection duration time between $n_{a}$ and $n_{b}$
$P_{(a, b)}$	$n_{a}$ 's estimated meeting probability of $n_{a}$ and $n_{b}$
$E T H_{(a, b)}$	The estimated throughput of the connection between $n_{a}$ and $n_{b}$
$P^{\{a, b\}}$	The codelivery probability of a certain message for $n_{a}$ and $n_{b}$

PRoPHET routing algorithm records history of encounters and transitivity, and the utility metric is based on an encounter probability with the transitivity. PRoPHET estimates a probabilistic metric called delivery predictability, $P_{(a, b)}$ , at every node $n_{a}$ , for each known destination $n_{b}$ . This indicates how likely it is that this node will be able to deliver a message to that destination. The calculation process is listed as follows:

\begin{matrix} P_{(a, b)} = P_{(a, b)_{old}} + (1 - P_{(a, b)_{old}}) \times P_{init}, \end{matrix}

(1)

\begin{matrix} P_{(a, b)} = P_{(a, b)_{old}} \times γ^{k}, \end{matrix}

(2)

\begin{matrix} P_{(a, c)} = P_{(a, c)_{old}} + (1 - P_{(a, c)_{old}}) \times P_{(a, b)} \times P_{(b, c)} \times β, \end{matrix}

(3)

where $P_{(a, b)}$ denotes the delivery predictability of reaching $n_{b}$ from $n_{a}$ and $P_{init}$ , β, and γ are initialization constants chosen from the range $[0,1]$ . Each node maintains a $1 \times | N |$ vector, with $| N |$ representing the number of nodes, where each element i records the delivery predictability between $n_{a}$ and $n_{i}$ .
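The three PRoPHET update rules (1)–(3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the dictionary-based predictability vector, and the string node identifiers are assumptions made here, and the constants are the values listed in Table 5.

```python
from typing import Dict

# Constants as listed in Table 5 (P_init, beta, gamma).
P_INIT, BETA, GAMMA = 0.75, 0.25, 0.98


def on_encounter(p_a: Dict[str, float], b: str, p_init: float = P_INIT) -> None:
    """Equation (1): raise P(a, b) when n_a meets n_b."""
    old = p_a.get(b, 0.0)
    p_a[b] = old + (1.0 - old) * p_init


def age(p_a: Dict[str, float], k: int, gamma: float = GAMMA) -> None:
    """Equation (2): decay every entry after k elapsed time units."""
    factor = gamma ** k
    for node in p_a:
        p_a[node] *= factor


def apply_transitivity(p_a: Dict[str, float], a: str, b: str,
                       p_b: Dict[str, float], beta: float = BETA) -> None:
    """Equation (3): update P(a, c) through the encountered node n_b."""
    p_ab = p_a.get(b, 0.0)
    for c, p_bc in p_b.items():
        if c in (a, b):  # skip the two nodes taking part in the encounter
            continue
        old = p_a.get(c, 0.0)
        p_a[c] = old + (1.0 - old) * p_ab * p_bc * beta
```

Here `p_a` plays the role of the $1 \times | N |$ vector kept by $n_{a}$ ; entries that are missing from the dictionary are treated as 0.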

3. Problem Formalization

In this section, we give the details of our proposed data item selection scheme.

3.1. Objective

The objective is to maximize the delivery probability of each message to the destination. In MONs, each node routes messages in a “store-carry-forward” way. We could choose to always forward a message to the node with the higher meeting probability to its destination. However, this simple strategy does not take into consideration the throughput, which is closely related to the bandwidth and the connection duration time. Since the bandwidth and the connection duration time between each pair of nodes are limited, the order in which queued messages are forwarded has a great effect on the routing performance. Assume that $n_{a}$ has three messages $m_{i}$ , $m_{j}$ , and $m_{k}$ for transmission to $n_{b}$ , of which the sizes are 150 k, 200 k, and 100 k. However, the current connection can carry at most 120 k in total. Thus neither $m_{i}$ nor $m_{j}$ can be successfully relayed from $n_{a}$ to $n_{b}$ due to the highly constrained throughput of the connection. Nevertheless, it is not rational to simply let node $n_{a}$ forward the message with the smallest size to $n_{b}$ . Without an explicit optimization objective, such a rule might take away the relay opportunity of messages that are slightly larger but whose delivery probability would improve greatly after being replicated to $n_{b}$ . Consequently, it is necessary to choose an explicit optimization objective for our selection. In this paper, we primarily focus on how to efficiently enhance the delivery performance. In the next part we analyze how to maximize the profit on delivery probability.

If the message is forwarded to $n_{b}$ , then the delivery would fail only if both $n_{a}$ and $n_{b}$ fail to deliver the message, and the delivery probability for this case can be computed by the following equation:

\begin{matrix} P^{\{a, b\}} = 1 - (1 - P_{(a, d_{i})}) (1 - P_{(b, d_{i})}) . \end{matrix}

(4)

Thus we have the following definition.

Definition 1 (transmission profit).

The transmission profit $ζ (i)$ is the magnitude of improvement on delivery ratio for message $m_{i}$ , where

\begin{matrix} ζ (i) = P^{\{a, b\}} - P^{\{a\}} = 1 - (1 - P_{(a, d_{i})}) (1 - P_{(b, d_{i})}) - P_{(a, d_{i})} . \end{matrix}

(5)
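As a short check that is not part of the original derivation, expanding the product in (5) gives a more compact form of the transmission profit:

```latex
\begin{matrix} ζ (i) = 1 - (1 - P_{(a, d_{i})} - P_{(b, d_{i})} + P_{(a, d_{i})} P_{(b, d_{i})}) - P_{(a, d_{i})} = P_{(b, d_{i})} (1 - P_{(a, d_{i})}) . \end{matrix}
```

In words, the profit is the probability that $n_{b}$ delivers the message in exactly those cases where $n_{a}$ alone would not.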

The value of $ζ (i)$ reflects the improvement in delivery probability for message $m_{i}$ . On this basis, we define our “Optimal Throughput-Aware Probabilistic Routing” as follows.

Definition 2 (optimal throughput-aware probabilistic routing).

The optimal probabilistic routing always tries to maximally improve the delivery probability for each message to the destination, by taking the estimated bandwidth of current connection into consideration; that is, each node $n_{i}$ forwards several selected messages to a node $n_{j}$ with corresponding top improved ζ values by making the most use of the estimated available throughput of current connection.

For example, as shown in Table 2, there are five messages in the buffer of $n_{a}$ . For simplification, we number the five messages as $m_{1}$ to $m_{5}$ . And the destination node of $m_{i}$ is represented as $d_{i}$ . For any message $m_{i} (1 \leq i \leq 5)$ , we have $P_{(b, d_{i})} > P_{(a, d_{i})}$ , which indicates that, when $n_{a}$ meets $n_{b}$ , all these five messages should be forwarded from $n_{a}$ to $n_{b}$ . By using (5) we can get the ζ value for each message. In the next part of this section, we firstly show the method to estimate the available connection throughput, and then we give the formal expression for our problem.

Table 2

Five messages in $n_{a}$ 's buffer.

Data item	$m_{1}$	$m_{2}$	$m_{3}$	$m_{4}$	$m_{5}$
$P_{(a, d_{i})}$	0	0.2	0.1	0.12	0.2
$P_{(b, d_{i})}$	0.1	0.75	0.2	0.25	0.35
Value of ζ	0.1	0.6	0.18	0.22	0.28
Message size	1 K	2 K	5 K	6 K	7 K
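The ζ row of Table 2 can be reproduced directly from (5). The following minimal Python sketch does so; the function name `transmission_profit` is chosen here for illustration only.

```python
def transmission_profit(p_a: float, p_b: float) -> float:
    """Equation (5): gain in delivery probability when n_a replicates to n_b."""
    return 1.0 - (1.0 - p_a) * (1.0 - p_b) - p_a


# Delivery predictabilities of m_1 .. m_5 taken from Table 2.
p_a = [0.0, 0.2, 0.1, 0.12, 0.2]
p_b = [0.1, 0.75, 0.2, 0.25, 0.35]

zeta = [round(transmission_profit(a, b), 2) for a, b in zip(p_a, p_b)]
print(zeta)  # [0.1, 0.6, 0.18, 0.22, 0.28]
```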

3.2. Estimating the Available Throughput

Consider the following definition.

Definition 3 (throughput of connection).

Given a connection duration time t between two nodes and the bandwidth B (KB/unit) of the connection between them, the throughput $T H$ of the connection is

\begin{matrix} T H = B \cdot t . \end{matrix}

(6)

Since we assume that the bandwidth B of the connection is known in advance, the remaining task in estimating the throughput is to obtain the connection duration time t. We can use the following equation to estimate the duration of the next upcoming connection between $n_{a}$ and $n_{b}$ , where $α \in [0,1]$ is the scaling constant and $t_{e} - t_{s}$ is the duration of the most recent connection, the impact of which is controlled by the parameter α. When α is set to be relatively large, the impact of the second term of the following equation is enhanced, and vice versa. Consider

\begin{matrix} τ_{(a, b)_{new}} = (1 - α) τ_{(a, b)_{old}} + α (t_{e} - t_{s}) . \end{matrix}

(7)

When $n_{a}$ has not met $n_{b}$ for a while, we use the following equation to update $τ_{(a, b)}$ , where $γ \in [0,1)$ is the same aging constant in (2) and k is the number of time units that have elapsed since the last time the metric was aged:

\begin{matrix} τ_{(a, b)_{new}} = τ_{(a, b)_{old}} γ^{k} . \end{matrix}

(8)

In each time unit, the node checks the status of the connection between $n_{a}$ and $n_{b}$ . The updating process for τ is shown in Algorithm 1. Then according to Definition 3, the throughput of the connection between $n_{a}$ and $n_{b}$ can be estimated by the following equation:

\begin{matrix} E T H_{(a, b)} = B_{(a, b)} \cdot τ_{(a, b)} . \end{matrix}

(9)

Algorithm 1: Updating the τ value.

For the current time unit

(1) if connection is up then

(2) $t_{s} \leftarrow \mathit{current\_time}$

(3) $τ_{{(a, b)}_{new}} = τ_{{(a, b)}_{old}} γ^{k}$

(4) $k \leftarrow k + 1$

(5) else if connection is down then

(6) $t_{e} \leftarrow \mathit{current\_time}$

(7) $τ_{{(a, b)}_{new}} = (1 - α) τ_{{(a, b)}_{old}} + α (t_{e} - t_{s})$

(8) $k \leftarrow 1$

(9) else

(10) $τ_{{(a, b)}_{new}} = τ_{{(a, b)}_{old}} γ^{k}$

(11) $k \leftarrow k + 1$

(12) end if
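A literal Python transcription of Algorithm 1 may help in following the update of τ. It is a sketch under stated assumptions: the class name `ContactEstimator` and the event labels are hypothetical, and the branches “connection is up” and “connection is down” are read here as the time units in which the connection starts and ends, respectively.

```python
from dataclasses import dataclass


@dataclass
class ContactEstimator:
    """Per-neighbor state used to estimate the contact duration tau."""
    alpha: float = 0.5   # scaling constant of (7)
    gamma: float = 0.98  # aging constant of (8)
    tau: float = 0.0
    k: int = 1
    t_s: int = 0
    t_e: int = 0

    def step(self, now: int, event: str) -> float:
        """Run Algorithm 1 for one time unit; event is 'up', 'down', or 'none'."""
        if event == "up":            # lines (1)-(4): contact starts
            self.t_s = now
            self.tau *= self.gamma ** self.k
            self.k += 1
        elif event == "down":        # lines (5)-(8): contact ends, apply (7)
            self.t_e = now
            self.tau = (1 - self.alpha) * self.tau + self.alpha * (self.t_e - self.t_s)
            self.k = 1
        else:                        # lines (9)-(11): no change, age by (8)
            self.tau *= self.gamma ** self.k
            self.k += 1
        return self.tau


def estimated_throughput(bandwidth: float, tau: float) -> float:
    """Equation (9): ETH = B * tau."""
    return bandwidth * tau
```

For instance, a contact that starts at time 1 and ends at time 4 with α = 0.5 moves τ from 0 to 1.5, and one further idle time unit ages it to 1.5 × 0.98 = 1.47.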

An example is shown in Figure 1. The connection between $n_{a}$ and $n_{b}$ has shown up 3 times in the time interval $[0,11]$ . The variable k is reset to 1 in the black squares and increases by 1 in the white squares. At the starting time 0, we set $τ = 0$ and $k = 1$ . In the red squares, the variable τ is updated by (8), while in the green squares τ is updated by (7). The whole computing process conforms to Algorithm 1. Finally we obtain the estimated value $τ = 1.62$ . Assuming the bandwidth of the connection is 6.8 KB/unit (this numerical value makes the discussion below easier), the throughput of the connection can be estimated by (9) as

\begin{matrix} E T H_{(a, b)} = B_{(a, b)} \cdot τ_{(a, b)} = 6.8 \times 1.62 (KB) \approx 11 KB . \end{matrix}

(10)

Figure 1

An example of the bandwidth estimation process.

3.3. Formalization

We first show that the routing problem can be viewed as a 0-1 knapsack problem, and then we give the formalization of our routing problem.

Theorem 4.

The optimal bandwidth-aware probabilistic routing problem can be formalized to be an optimal decision-making problem and, furthermore, can be viewed as a 0-1 knapsack problem.

Proof.

If viewing the estimated throughput $E T H_{(a, b)}$ as the maximum weight that knapsack can carry, each message as an item in the knapsack problem, the improvement value $ζ (i)$ for each message as the value of each item, and the size of each message $S (i)$ as the weight of $i th$ item, then the routing problem is equivalent to the corresponding 0-1 knapsack problem, where decisions are made on each item to achieve the maximal profit.

Theorem 4 shows that the routing problem is an optimal decision-making problem equivalent to the 0-1 knapsack problem, which is NP-hard (its decision version is NP-complete). The problem is formally defined as follows.

Definition 5 (formalization of routing problem).

Assume that the optimal bandwidth-aware probabilistic routing problem is an optimal decision-making problem; that is,

\begin{array}{l} \max \sum_{i = 1}^{n} ζ (i) x_{i} \\ \text{s.t. } \sum_{i = 1}^{n} S (i) x_{i} \leq E T H_{(a, b)} , \\ x_{i} \in \{0, 1\}, \quad i = 1, \dots, n . \end{array}

(11)

4. Improved Routing Protocol with Data Item Selection

In this section, we give the detailed information about the improved routing protocol. The key problem of routing is formalized in Section 3. We first apply dynamic programming to the data item selection problem. Then we illustrate how to maintain the needed information during the entire routing process. Finally we show the entire procedure of our improved routing protocol.

4.1. Solving the Optimal Decision Problem

The scheme to solve the optimal decision-making problem is stated in Algorithm 2. The table $V$ is filled in Lines 1–12, where $V [i, j]$ is the maximum total profit achievable with the first i messages and capacity j. The solution is then constructed in Lines 13–20 by tracing back from $V [n, E T H]$ .

Algorithm 2: Get the solution by dynamic programming.

Input: $MessageList = [m_{1}, m_{2}, \dots, m_{n}]$ ,

$ζ = [ζ (1), ζ (2), \dots, ζ (n)]$ , $S = [S (1), S (2), \dots, S (n)]$ , $E T H$

Output: $ForwardList$

(1) for $j \leftarrow 0$ to $E T H$ do

(2) $V [0, j] \leftarrow 0$

(3) end for

(4) for $i \leftarrow 1$ to n do

(5) for $j \leftarrow 0$ to $E T H$ do

(6) if $S (i) \leq j$ then

(7) $V [i, j] \leftarrow \max \{V [i - 1, j], V [i - 1, j - S (i)] + ζ (i)\}$

(8) else

(9) $V [i, j] \leftarrow V [i - 1, j]$

(10) end if

(11) end for

(12) end for

(13) $c \leftarrow E T H$

(14) for $i \leftarrow n$ down to 1 do

(15) if $V [i, c] \neq V [i - 1, c]$ then

(16) $ForwardList . add (m_{i})$

(17) $c \leftarrow c - S (i)$

(18) end if

(19) end for

(20) return $ForwardList$

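Returning to the running example of this paper, the ζ values and the sizes of the messages are shown in Table 2, and the estimated throughput of the connection between $n_{a}$ and $n_{b}$ , that is, $E T H_{(a, b)} \approx 11$ KB, is calculated in Section 3.2. Based on these values, the calculation process of the example is shown in Table 3. At capacity 11 the optimum is 0.98, which is reached by forwarding $m_{1}$ , $m_{2}$ , and $m_{5}$ (total size 10 K).

Table 3

Calculation process using dynamic programming. The entry in column i and row j is the maximum total ζ achievable with messages $m_{1}, \dots, m_{i}$ and a capacity of j K.

Capacity j \ i	1	2	3	4	5
$S (i)$	1 K	2 K	5 K	6 K	7 K
$ζ (i)$	0.10	0.60	0.18	0.22	0.28
0	0	0	0	0	0
1	0.10	0.10	0.10	0.10	0.10
2	0.10	0.60	0.60	0.60	0.60
3	0.10	0.70	0.70	0.70	0.70
4	0.10	0.70	0.70	0.70	0.70
5	0.10	0.70	0.70	0.70	0.70
6	0.10	0.70	0.70	0.70	0.70
7	0.10	0.70	0.78	0.78	0.78
8	0.10	0.70	0.88	0.88	0.88
9	0.10	0.70	0.88	0.92	0.92
10	0.10	0.70	0.88	0.92	0.98
11	0.10	0.70	0.88	0.92	0.98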
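The selection step can also be checked with a small, self-contained Python sketch of the standard 0-1 knapsack dynamic program. It is an illustration rather than the authors' implementation: the function name `select_messages` is hypothetical, message sizes are assumed to be integers in KB, and the inputs are the unscaled ζ values and sizes of Table 2 with a capacity of 11 KB.

```python
from typing import List, Tuple


def select_messages(sizes: List[int], profits: List[float],
                    capacity: int) -> Tuple[float, List[int]]:
    """Return (best total profit, 1-based indices of the selected messages)."""
    n = len(sizes)
    # V[i][j]: best profit using the first i messages with capacity j.
    V = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        size, profit = sizes[i - 1], profits[i - 1]
        for j in range(capacity + 1):
            skip = V[i - 1][j]
            take = V[i - 1][j - size] + profit if size <= j else skip
            V[i][j] = take if take > skip else skip

    # Trace back from V[n][capacity]: message i was taken iff the value changed.
    chosen, c = [], capacity
    for i in range(n, 0, -1):
        if V[i][c] != V[i - 1][c]:
            chosen.append(i)
            c -= sizes[i - 1]
    chosen.reverse()
    return V[n][capacity], chosen


sizes = [1, 2, 5, 6, 7]                    # KB, from Table 2
profits = [0.10, 0.60, 0.18, 0.22, 0.28]   # zeta values, from Table 2
best, chosen = select_messages(sizes, profits, 11)
print(chosen, round(best, 2))  # [1, 2, 5] 0.98
```

The table has $O (n \cdot E T H)$ entries, so the running time depends on the granularity chosen for the capacity; using whole kilobytes, as here, is one possible choice.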

4.2. Protocol Description

Now we focus on the routing protocol. As in PRoPHET, we first need to maintain a table recording the meeting probabilities. In addition, since the estimation of connection throughput is based on the connection duration time, each node also records the variables $t_{s}$ and $t_{e}$ and the corresponding estimate τ for the most recent connection. Finally, for each connection we need to record k, the number of time units that have elapsed since the last time the metric was aged. We denote the routing information table of $n_{a}$ by $T A B L E [a]$ , which is shown in Table 4; the space complexity of $T A B L E [a]$ is $O (| N |)$ .

Table 4

The routing information table.

1	2	3	⋯	$| N |$
$P_{(a, n_{1})}$	$P_{(a, n_{2})}$	$P_{(a, n_{3})}$	⋯	$P_{(a, n_{| N |})}$
$k_{1}$	$k_{2}$	$k_{3}$	⋯	$k_{| N |}$
$τ_{(a, n_{1})}$	$τ_{(a, n_{2})}$	$τ_{(a, n_{3})}$	⋯	$τ_{(a, n_{| N |})}$
$t_{s_{(a, n_{1})}}$	$t_{s_{(a, n_{2})}}$	$t_{s_{(a, n_{3})}}$	⋯	$t_{s_{(a, n_{| N |})}}$
$t_{e_{(a, n_{1})}}$	$t_{e_{(a, n_{2})}}$	$t_{e_{(a, n_{3})}}$	⋯	$t_{e_{(a, n_{| N |})}}$

There are two parts of our entire protocol. The information exchange protocol is shown in Algorithm 3 and the data transmission protocol is shown in Algorithm 4.

Algorithm 3: Information exchange protocol.

Triggering condition:

In each time unit

$n_{a}$ Executes:

(1) for each column record i of $n_{a}$ in Table 4 do

(2) update $P (a, b_{i})$ by (1)–(3).

(3) call Algorithm 1 to update $τ_{(a, b_{i})}$ .

(4) end for

(5) broadcast the request for $T A B L E$ to $n_{a} . n e i g h b o r s$

(6) if received request for $T A B L E$ from any node $n_{b}$ then

(7) forward $T A B L E [a]$ to $n_{b}$

(8) end if

(9) if received $T A B L E [b]$ from any node $n_{b}$ then

(10) call Algorithm 4

(11) end if

Algorithm 4: Data transmission protocol.

Triggering condition:

$n_{a}$ and $n_{b}$ are in contact

$n_{a}$ Executes:

(1) for $\forall m_{i}$ in the buffer of $n_{a}$ do

(2) $d_{i} \leftarrow$ the destination node of $m_{i}$ .

(3) if $T A B L E [b] . P_{(b, d_{i})} > T A B L E [a] . P_{(a, d_{i})}$ then

(4) $MessageList (n_{a}) . add (m_{i})$

(5) end if

(6) end for

(7) estimate $E T H_{(a, b)}$ by (9).

(8) for $\forall m_{i} \in MessageList (n_{a})$ do

(9) calculate $ζ (i)$ by (5)

(10) end for

(11) get $ForwardList (n_{a})$ by Algorithm 2

(12) $Sort (ForwardList, ascend, TTL)$

(13) $ForwardList (n_{a}) . transmitTo (n_{b})$

In Algorithm 3, the primary task is to update the information needed for timely routing and then to exchange it with neighbor nodes. The updating process is stated in Lines 1–4, where the equations of PRoPHET and our updating algorithm are used. In Line 5 the request for $T A B L E$ is broadcast to all the neighbor nodes of the current node $n_{a}$ . As shown in Lines 6–8, if the current node $n_{a}$ receives the request from any other node, $T A B L E [a]$ is transmitted to that node. In Lines 9–11, when $n_{a}$ receives $T A B L E [b]$ from any node $n_{b}$ , the data transmission protocol is called; in other words, receiving $T A B L E [b]$ is the triggering condition of the data transmission protocol.

In Algorithm 4, the current node $n_{a}$ scans its buffer and adds to the message list every message for which $P_{(b, d_{i})} > P_{(a, d_{i})}$ holds. In our scheme, the forwarding strategy is the same as in PRoPHET: a message is transferred from $n_{a}$ to $n_{b}$ only if the delivery predictability of $n_{b}$ for the destination node is higher than that of $n_{a}$ . However, the throughput of each connection is also taken into consideration, as shown in the remaining part of the algorithm. We first estimate the throughput of the connection between $n_{a}$ and $n_{b}$ in Line 7. Then the transmission profit value $ζ (i)$ is calculated for each message $m_{i}$ by (5). Finally we employ Algorithm 2 to obtain the $ForwardList$ , all elements of which are taken from the $MessageList$ . We sort the messages in $ForwardList$ in ascending order of $T T L$ to give expiring messages a higher priority for transmission, so as to lower the number of dropped messages. Then in Line 13 all the messages in $ForwardList$ are transmitted by $n_{a}$ to $n_{b}$ in that order.
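The flow of the data transmission protocol can be outlined in Python as follows. This is a sketch under assumptions: the `Message` record, the dictionary form of the routing tables, and the function names are hypothetical, and the selection step is passed in as a callable so that any implementation of Algorithm 2 can be plugged in. The `first_fit` selector below is only a placeholder for demonstration and is not Algorithm 2.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Message:
    msg_id: int
    dest: str
    size: int   # KB
    ttl: int    # remaining time-to-live


Selector = Callable[[List[Message], List[float], int], List[Message]]


def first_fit(candidates: List[Message], profits: List[float],
              capacity: int) -> List[Message]:
    """Placeholder selector (NOT Algorithm 2): keep messages while they fit."""
    chosen, used = [], 0
    for m in candidates:
        if used + m.size <= capacity:
            chosen.append(m)
            used += m.size
    return chosen


def plan_transmission(buffer: List[Message], table_a: Dict[str, float],
                      table_b: Dict[str, float], eth: int,
                      select: Selector) -> List[Message]:
    """Filter candidates, compute zeta by (5), select, and order by TTL."""
    candidates = [m for m in buffer
                  if table_b.get(m.dest, 0.0) > table_a.get(m.dest, 0.0)]
    profits = []
    for m in candidates:
        p_a, p_b = table_a.get(m.dest, 0.0), table_b.get(m.dest, 0.0)
        profits.append(1.0 - (1.0 - p_a) * (1.0 - p_b) - p_a)
    chosen = select(candidates, profits, eth)
    return sorted(chosen, key=lambda m: m.ttl)  # expiring messages first
```

Separating the selector from the rest of the procedure keeps the relay-choice rule of PRoPHET and the data item selection independent, which mirrors the structure of Algorithm 4.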

5. Simulation

The results are evaluated by the ONE simulator [12]. We firstly adopt the real experimental trace of the Cambridge-iMote dataset, since it is one of the most extensive and widely exploited data traces. This trace includes Bluetooth sightings by groups of users carrying small devices (iMotes) for two months in various locations that we expected many people to visit. Mobile users in this experiment mainly consisted of students from Cambridge University who were asked to carry these iMotes with them at all times for the duration of the experiment. Then we perform the evaluation based on the Helsinki City Scenario, a widely used synthetic mobility model. The simulation is grouped into the following categories: (1) varying buffer size in Cambridge-iMote real trace; (2) varying message time-to-live in Cambridge-iMote real trace; (3) varying buffer size in Helsinki City Model; (4) varying message time-to-live in Helsinki City Model. We compare five different routing protocols based on delivery ratio, overhead ratio, and average latency.
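For reference, the three metrics can be computed from raw counters as sketched below. The definitions are stated here as assumptions, following the way they are commonly reported with the ONE simulator: the delivery ratio is delivered over created messages, the overhead ratio is the number of relay operations that did not result in a delivery per delivered message, and the average latency is the mean delay of delivered messages. The function names are chosen for illustration.

```python
from typing import Sequence


def delivery_ratio(created: int, delivered: int) -> float:
    """Fraction of created messages that reached their destination."""
    return delivered / created if created else 0.0


def overhead_ratio(relayed: int, delivered: int) -> float:
    """Extra relay operations per delivered message."""
    if delivered == 0:
        raise ValueError("overhead ratio is undefined when nothing is delivered")
    return (relayed - delivered) / delivered


def average_latency(latencies: Sequence[float]) -> float:
    """Mean creation-to-delivery delay over delivered messages."""
    if not latencies:
        raise ValueError("no delivered messages")
    return sum(latencies) / len(latencies)
```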

5.1. Simulation in Cambridge-iMote Real Trace

We conduct the simulations by generating 3,300 messages for randomly selected source nodes and by executing the above-mentioned algorithms to forward these messages to their destinations, while recording the delivery ratio, average latency, and average hop count. In simulations on evaluating all these metrics, we set the simulation time as 25%, 50%, 75%, and 100% of the entire time of the dataset. Figure 2 shows the simulation results on varying buffer size, with the message TTL constant at 300 minutes. In Figure 3, we show the simulation results on varying the message TTL, with the node buffer size constant at 50 M. The other settings of the simulation are listed in Table 5.

Table 5

Simulation settings of Cambridge-iMote trace.

Parameter name	Range
Number of nodes	36
Entire simulation time (days)	11.5
Message size (KB)	500–1024
Bandwidth (KBps)	250
$P_{init}$	0.75
α	0.5
β	0.25
γ	0.98

Figure 2

[Cambridge-iMote] Buffer size versus overhead with different percentage of simulation time.

Figure 3

[Cambridge-iMote] Message time-to-live versus overhead with different percentage of simulation time.

Regarding all the figures in Figure 2, the results show that our improved throughput-aware routing significantly outperforms PRoPHET and Epidemic routing in delivery ratio and average latency and reduces the overhead to an extent. Though the delivery performance between MaxProp and our scheme varies a little, our proposed routing outperforms MaxProp in either of the remaining two criteria. More specifically, from Figures 2(a), 2(d), 2(g), and 2(j), we can see that the effect on the improvement of delivery performance increases with the simulation time prolonged. As shown in the second column of Figure 2, the improved throughput-aware routing has a comparably lower latency than PRoPHET. By referring to the third column of Figure 2, our proposed routing has greater improvement on overhead ratio with the simulation time set longer. From all the subfigures, we can see that the throughput-aware routing has the overall best performance among all the five protocols.

Regarding Figure 3, the results show that our proposed routing significantly outperforms the other two protocols in delivery and overhead, with only a slight improvement in average latency. As the simulation time increases, the improvement brought by our proposed scheme grows on all three metrics, although the gain in average latency remains small. From all the subfigures in Figure 3, our proposed routing performs better than Epidemic, PRoPHET, and EncounterBased routing. Compared with MaxProp, our proposed routing performs much better when the whole simulation time is short, which indicates that the proposed routing reaches its best status more quickly in real network scenarios.

5.2. Simulation in Helsinki City Scenario

In the Helsinki City Scenario [12], the nodes are assumed to be users with mobile phones or similar devices, using a Bluetooth interface with 250 KBps bandwidth and 10 m transmission range. In this case, the initial free buffer size of each node is set to be small, ranging from 5 M to 55 M. There are six trams following predefined routes, and there is an extra high-speed interface with 10 MBps bandwidth and 1000 m transmission range for communication between trams. Two-thirds of the remaining nodes are pedestrians and one-third are cars. The speed of cars is set to 10–50 km/h and the speed of trams to 25–36 km/h, with pause times of 10–120 s and 10–30 s, respectively. Both pedestrians and cars randomly choose their destinations on the map and move along the shortest path. The parameter settings are listed in Table 6.

Table 6

Simulation settings of Helsinki City Scenario.

Parameter name	Range (default value)
Pedestrians/cars	30–90
World size (m × m)	$4500 \times 3000$
Initial tickets number	6
Message TTL (min)	180–425 (300)
Simulation time (hours)	12
Message size (KB)	500–1024
Pedestrian buffer (MB)	5–55 (25)
Tram buffer (MB)	500
Bluetooth range (m)	10
High-speed range (m)	1000
Bluetooth (KBps)	250
High-speed (MBps)	10
Pedestrian speed (m/s)	0.5–1.5
Message interval (s)	35–40

Regarding all the figures in Figure 4, the results show that our improved throughput-aware routing significantly outperforms EncounterBased, PRoPHET, and Epidemic routing in delivery ratio. With the number of active pedestrians and cars increasing, MaxProp routing gradually outperforms the others in all the three criteria. But when there are not enough active pedestrians and cars in the network, the performance of MaxProp routing is almost the same as PRoPHET. In this case, our proposed scheme significantly outperforms the others in all the three criteria. More specifically, from Figures 4(a), 4(d), and 4(g), we can see that the effect on the improvement of delivery performance increases with the number of pedestrians/cars. As shown in the second column of Figure 4, the improved throughput-aware routing has almost the same latency as PRoPHET. By referring to the third column of Figure 4, our proposed routing has greater improvement on overhead ratio. From all the subfigures, we can see that the throughput-aware routing has the overall best performance when the number of nodes in the network is set to be relatively small.

Figure 4

[Helsinki City Scenario] Buffer size versus overhead with different number of nodes.

Figure 5 shows a result similar to that in Figure 4: when the number of nodes is relatively small, our proposed routing outperforms the other four protocols in delivery and overhead and slightly improves the average latency. From all the subfigures in Figure 3, our proposed routing performs better than the original PRoPHET, which shows that the same relay node selection strategy can perform very differently under different data item selection mechanisms. In all, when nodes are relatively abundant, that is, when the density of nodes in the network is high, MaxProp is preferable. On the other hand, our proposed scheme improves on PRoPHET routing, which makes it suitable for networks with a relatively low density of nodes.

Figure 5

[Helsinki City Scenario] Message time-to-live versus overhead with different number of nodes.

6. Related Works

In [8], Zhu et al. proposed a routing algorithm that takes full advantage of predicted probabilistic vehicular trajectories, from which the packet delivery probability is theoretically derived. Their paper demonstrates that predicted trajectories do help data delivery in vehicular networks. One of the most classical probabilistic routing schemes is the probabilistic routing protocol using history of encounters and transitivity (PRoPHET) [11]. In PRoPHET, the utility metric is based on an encounter probability with the transitivity property. For example, given that $n_{a}$ is likely to encounter $n_{b}$ and, similarly, that $n_{b}$ is likely to encounter $n_{c}$ , then $n_{c}$ may be a good candidate node for $n_{a}$ even if a direct encounter between them is unlikely. Therefore, messages carried by $n_{a}$ would also be replicated to $n_{c}$ , in addition to $n_{b}$ , alleviating the buffer space exhaustion at $n_{b}$ . In particular, an aging factor is also taken into account for outdated information.

Reference [13] presents two multicopy forwarding protocols, called optimal opportunistic forwarding (OOF) and OOF-, which maximize the expected delivery rate and minimize the expected delay, respectively, while requiring that the number of forwarding operations per message does not exceed a certain threshold. Reference [14] applies evolutionary games to noncooperative forwarding control in MDTNs, focusing on mechanisms that govern the participation of relays in the delivery of messages in DTNs. Reference [15] provides a reliable data delivery scheme for mobile sensor networks with an enhanced delaying technique. Nodes estimate their connectivity and the expected interencounter time with sink nodes. Connectivity is estimated from the ratio of past and present connections. When the connectivity is unreliable, nodes delay the transmission for the remaining interencounter duration or per-hop lifetime. Reference [16] theoretically proves that considering both contact frequency and contact duration leads to higher throughput than considering only contact frequency. To fully exploit a social network for high throughput and low routing delay, the authors propose a social network oriented routing protocol for DTNs, in which a duration utility-based metric is used to evaluate the most suitable relay node for each message.

In [4], the authors find that it is wise to wait until much better opportunities arise in order to minimize the communication cost without degrading the delivery ratio and latency. Consequently, a universal scheme, named E-Scheme, is proposed to improve routing on the delivery probability metric. In [3], the authors propose a distributed optimal community-aware opportunistic routing (CAOR) algorithm, in which a reverse Dijkstra algorithm is devised to compute the minimum expected delivery delays of nodes, thus achieving the optimal opportunistic routing performance. By proposing a home-aware community model, which turns an MON into a network that includes only community homes, the computational cost and the maintenance cost of contact information are greatly reduced.

7. Conclusion

In this paper, we improve the routing performance in MONs by resorting to an efficient data item selection mechanism. Our motivation is that, because the bandwidth and contact duration of each connection are highly constrained, a routing protocol performs very differently under different data item selection strategies. By defining the concept of “Transmission Profit,” we introduce the concept of “Optimal Throughput-Aware Probabilistic Routing,” which is then modeled as a dynamic programming problem. On this basis, a data item selection algorithm for PRoPHET is devised. The simulation results show that our data item selection mechanism greatly reduces the number of aborted transmissions, thus enhancing the routing performance in terms of delivery probability, average latency, and overhead ratio. Besides, the proposed scheme can be applied to improve other metrics by redefining the “Transmission Profit.” Our future work will focus on evaluating the improvement that the data item selection mechanism brings to various routing protocols.
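As a rough illustration of how a data item selection step can be cast as a dynamic program, the sketch below treats one contact as a 0/1 knapsack: the capacity is bandwidth times contact duration, and each buffered item has a size and a profit. The knapsack form, the function name `select_items`, and the sample values are assumptions for illustration; the definition of “Transmission Profit” and the recurrence used in this paper may differ.

```python
from typing import List, Tuple


def select_items(sizes: List[int], profits: List[float],
                 bandwidth: float, duration: float) -> Tuple[float, List[int]]:
    """Pick the items with maximum total profit that fit into one contact.

    sizes are positive integers in the same unit as bandwidth * duration.
    Returns (best total profit, indices of the chosen items).
    """
    capacity = int(bandwidth * duration)  # data volume the contact can carry
    n = len(sizes)
    # best[i][c]: best profit using the first i items within capacity c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        size, profit = sizes[i - 1], profits[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]            # skip item i-1
            if size <= c:
                candidate = best[i - 1][c - size] + profit
                if candidate > best[i][c]:
                    best[i][c] = candidate         # take item i-1
    # Walk back through the table to recover the chosen items.
    chosen: List[int] = []
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= sizes[i - 1]
    chosen.reverse()
    return best[n][capacity], chosen


if __name__ == "__main__":
    # Hypothetical buffer: three items with sizes and profits.
    total, picked = select_items([3, 4, 2], [0.6, 0.9, 0.3],
                                 bandwidth=1.0, duration=6.0)
    print(total, picked)
```

Because only items that fit entirely within the contact are selected, a transmission chosen this way is not started unless it can finish, which is the intuition behind reducing aborted transmissions.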

Footnotes

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was supported in part by Foundation Research Project of Qingdao Science and Technology Plan under Grant no. 12-1-4-2-(14)-jch and Natural Science Foundation of Shandong Province under Grant no. ZR2013FQ022.

References

1. Wei, Liang. A survey of social-aware routing protocols in delay tolerant networks: applications, taxonomy and design-related issues. IEEE Communications Surveys & Tutorials, 2013, pp. 1-23.

2. Ott. 404 not found? A quest for DTN applications. Proceedings of the 3rd ACM International Workshop on Mobile Opportunistic Networks (MobiOpp '12), March 2012, New York, NY, USA, ACM Press, pp. 3-4. doi:10.1145/2159576.2159579 (Scopus: 2-s2.0-84860636067).

3. Xiao, I. J., Huang. Community-aware opportunistic routing in mobile social networks. IEEE Transactions on Computers, 2013, pp. 1-13.

4. Lin, Jiang. Universal scheme improving probabilistic routing in delay-tolerant networks. Computer Communications, 2013, vol. 36, pp. 849-860. doi:10.1016/j.comcom.2012.12.011.

5. Xiao, Huang. Homing spread: community home-based multi-copy routing in mobile social networks. Proceedings of the IEEE International Conference on Computer Communications (INFOCOM '13), April 2013, pp. 2319-2327.

6. Burgess, Gallagher, Jensen, Levine B. N. MaxProp: routing for vehicle-based disruption-tolerant networks. Proceedings of the 25th IEEE International Conference on Computer Communications (INFOCOM '06), April 2006, pp. 1-11. doi:10.1109/INFOCOM.2006.228 (Scopus: 2-s2.0-39049118503).

7. Zhang, Zhao. Social network analysis on data diffusion in delay tolerant networks. Proceedings of the 10th ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc '09), May 2009, New York, NY, USA, ACM Press, pp. 345-346. doi:10.1145/1530748.1530806 (Scopus: 2-s2.0-70450202272).

8. Zhu. Trajectory improves data delivery in urban vehicular networks. IEEE Transactions on Parallel and Distributed Systems, 2014, vol. 25, pp. 1089-1100.

9. Spyropoulos, Psounis, Raghavendra C. S. Efficient routing in intermittently connected mobile networks: the multiple-copy case. IEEE/ACM Transactions on Networking, 2008, vol. 16, no. 1, pp. 77-90. doi:10.1109/TNET.2007.897964 (Scopus: 2-s2.0-40749113163).

10. Krifa, Barakat, Spyropoulos. Message drop and scheduling in DTNs: theory and practice. IEEE Transactions on Mobile Computing, 2012, vol. 11, pp. 1470-1483.

11. Lindgren, Doria, Schelén. Probabilistic routing in intermittently connected networks. ACM SIGMOBILE Mobile Computing and Communications Review, 2003, vol. 7, no. 3, pp. 19-34.

12. Keränen, Ott, Kärkkäinen. The ONE simulator for DTN protocol evaluation. Proceedings of the Second International ICST Conference on Simulation Tools and Techniques (ICST '09), 2009.

13. Liu. On multicopy opportunistic forwarding protocols in nondeterministic delay tolerant networks. IEEE Transactions on Parallel and Distributed Systems, 2012, vol. 23, no. 6, pp. 1121-1128. doi:10.1109/TPDS.2011.280 (Scopus: 2-s2.0-84860532411).

14. El-Azouzi, de Pellegrini, Sidi H. B., Kamble. Evolutionary forwarding games in delay tolerant networks: equilibria, mechanism design and stochastic approximation. Computer Networks, 2013, vol. 57, pp. 1003-1018. doi:10.1016/j.comnet.2012.11.014.

15. Cha, Talipov, Cha. Data delivery scheme for intermittently connected mobile sensor networks. Computer Communications, 2013, vol. 36, pp. 504-519.

16. Shen. SEDUM: exploiting social networks in utility-based distributed routing for DTNs. IEEE Transactions on Computers, 2013, vol. 62, no. 1, pp. 83-97. doi:10.1109/TC.2011.232 (MR3002901).