Abstract
Machine-to-machine (M2M) networks could be connected by a wide range of wireless technologies (e.g., Bluetooth, WiFi, RFID). Due to some factors (e.g., mobility of machines, limited communication range), it is hard to maintain the connectivity of the network, that is, the network is disconnected and it is a specific application of delay tolerant networks (DTN). Communication in such network often needs nodes working in a cooperative way. However, due to selfishness, nodes have no incentive to help others. Therefore, if the source requests help from other nodes, it needs to pay certain fees to them. The main goal of this paper is to explore efficient ways for the source to maximize the average delivery ratio when the total fees are limited. First, we mathematically characterize the average delivery ratio under different policies. Then we get the optimal policy through Pontryagin's maximum principle, and prove that the optimal policy conforms to the
1. Introduction
At present, Machine-to-machine (M2M) networks have emerged as an innovative topic and are undergoing rapid development [1]. In particular, M2M networks could be connected by a wide range of wireless technologies (e.g., Bluetooth, WiFi). Due to the limited communication range, mobility of machines or other factors, it is hard to maintain the connectivity of the M2M networks. On the other hand, the concept of delay tolerant networks (DTN) has been proposed to support many new networking applications, where the end-to-end connectivity cannot be assumed [2]. Therefore, M2M networks can be seen as a specific application of DTN. In order to provide communication services in such disconnected networks, nodes in DTN communicate through a
By abuse of language, we will use node and machine alternately in this paper. It is easily to see that the
Note that our work is similar to the packet purse model (PPM) of credit-based incentive policy, in which nodes can get certain
On the other hand, our work is similar to the optimal control problem of ER algorithm in DTN, which is a popular topic recently, and there are some good works in this field, such as [7, 8]. However, they do not consider the selfishness nature of nodes. For the selfishness behavior, there are some works, which evaluate its impact on the routing performance [3, 4]. However, none of these works considers the optimal control problem. In addition, the selfishness behavior in those works is not varying with time and is denoted by a simple way. In particular, they use the probabilistic way to denote the selfishness behavior. For example, node
The main contribution of this paper is to study the optimal routing problem in such complex application. In particular, we propose a unifying framework through a continuous time Markov process, which can be used to evaluate the total fees that the source pays. Then based on the framework, we formulate an optimization problem. Through Pontryagin's maximum principle, we explore the stochastic control problem, and we prove that the optimal policy conforms to the
The rest of this paper is organized as follows. In Section 2, we briefly present some related works, and we describe the network model in Section 3. In Section 4, we first present the theoretical framework, and then formulate the optimal control problem. In Section 5, we introduce the simulation and numerical results. Finally, we conclude our main work in Section 6.
2. Related Work
In the past few years, many routing protocols have been proposed in DTN, but most of them need certain prior knowledge of the network, or they can obtain such knowledge through some online learning processes, such as the works in [9–11]. In some applications, there may be no enough time to learn, so these algorithms have certain shortage. On the other hand, ER algorithm does not need any prior knowledge about the network and can be used in many environments. For this reason, this algorithm is still a very popular topic. However, it works in a flooding way, so it wastes much energy and suffers from poor scalability in large networks. At present, many policies have been proposed to reduce its overhead. Among them, there are probability forwarding policy [12], hop-based forwarding policy [13], and so forth. These algorithms try to achieve big delivery ratio and relatively low transmission cost. Generally speaking, big delivery ratio is obtained at the expense of more cost. Therefore, how to accurately evaluate both strengths and limitations of these algorithms is very important. Some works use the simulation manner [14], but recently, theoretical manner is more popular. The performance of ER algorithm based on the sparsely exponential graph is studied in [15] and then the problem is explored again with heterogeneous nodes in [16]. The authors in [17] get the generic theoretical upper bounds for the information propagation speed in large-scale mobile and intermittently connected network, and then the work in [18] explores the information propagation speed in bidirectional vehicular delay tolerant network. The work in [19] studies the performance of two-hop relay routing under limited packet lifetime. Performance of the ER routing when the contention exists is explored in [20].
Most routing algorithms in DTN need nodes working in a cooperative way, so the selfishness behavior can have important impact on the performance. Panagakis et al. evaluate the impact of selfishness through simulation in [21]. There are also some works, which study the impact of nodes’ selfishness behavior by theoretical method, such as [3, 4]. The work in [22] is the first to propose the
Because ER algorithm has certain shortage, there are some works, which begin to consider the optimal control problem. The optimal control problem of two-hop routing algorithm based on discrete time model is studied in [7], and this work proves that the optimal forwarding policy conforms to the
3. Network Model
In this paper, we assume that there are one source node
To make the destination get message quickly,

The snapshot of the network at time
Nodes in the network can communicate with each other only when they come into the transmission range of each other, which means a communication contact, so the mobility rule of nodes is critical. In this paper, we assume that the occurrence of contacts between two nodes follows a Poisson distribution. This assumption has been used in wireless communications many years. At present, some works show that this assumption is only an approximation to the message propagation process, and they reveal that nodes encounter with each other according to the power law distribution [30]. However, they also find that if you consider long traces, the tail of the distribution is exponential. In addition, the work in [31] shows that individual inter-meeting time can be shaped to be exponential by choosing an approximate domain size with respect to given time scale. Moreover, there are also some works, which describe the intermeeting time of human or vehicles by Poisson process and validate their model experimentally on real motion traces [32, 33]. According to these descriptions, the exponential intermeeting time is rational in some applications, and we assume that the intermeeting time between two nodes follows an exponential distribution with parameter λ. Simulations based on both synthetic and real motion traces show that our theoretical framework based on such assumption is very accurate. The list of commonly used variables can be seen as in Table 1.
The list of commonly used variables.
4. Optimization Formulation
4.1. Theoretical Framework
Let
Symbol
Therefore, we can get the expectation of
Note that
There are
Further, we can obtain
Let
Let
Then, we have
4.2. Optimal Control
Our object is to solve the following optimization problem:
Symbol
Note that at time
The
Then according to Pontryagin's maximum principle ([34, Page 109, Theorem 3.14]), there exist continuous or piece-wise continuously differentiable state and
This equation between the optimal control parameter
The total number of ordinary nodes is
Below, we will prove that when the function of PR(
Theorem 1.
If the
Proof.
This theorem means that the source requests help with probability 1 before time
Therefore, we have
If
Then based on (14) and
In fact, we have
From (22), we can see that
Now, we assume that
Combined with (19), we have,
Note that we have proved that
If
Because
In summary, for time
In fact, Theorem 1 means that if PR(
5. Simulation and Numerical Results
5.1. Model Validation
In this section, we will check the accuracy of our framework by comparing the theoretical results obtained by our model with the simulation results. We run several simulations using the opportunistic network environment (ONE) [35] based on both synthetic mobility model and realworld-based scenarios. For the synthetic mobility trace, we use the famous random waypoint (RWP) mobility model [36], which is commonly used in many mobile wireless networks. There are totally 500 nodes, and all nodes move according to the RWP model within a 10000 m × 1000 m terrain with a scale speed chosen from a uniform distribution from 4 m/s to 10 m/s. The communication range is 10 m. Moreover, the source and destination nodes are randomly selected among these nodes. For the realworld-based scenarios, we use a real motion traces from about 2100 operational taxis for about one month in Shanghai city collected by global position system (GPS) [37]. The location information of the taxis is recorded at every 40 seconds with the area of 102 km2. We randomly pick 500 nodes from this trace, and the source and destination nodes are randomly selected among these nodes, too.
The first metric is the total fees that the source pays to the ordinary nodes under different forwarding policies. The source may forward message with any probability, that is, the value of

Simulation and numerical results comparison of total fees with RWP mobility model.

Simulation and numerical results comparison of total fees with Shanghai city mobility model.
From these results, we can see that the average deviation between the theoretical and the simulation results is very small. For example, the average deviation is about 2.92% for the RWP mobility model, and 4.01% for the Shanghai city motion trace. In addition, Figure 3 shows that the source has to pay more fees if it adopts the policy of Case 2. In fact, when it adopts the policy of Case 1, nodes can get message timely, when the fees that the ordinary nodes required is less. However, if it adopts the policy of Case 2, many nodes can get message when the time
Then based on the same settings, we explore the average delivery ratio in Figures 4 and 5, respectively. These results also show that the average deviation between the theoretical and the simulation results is very small.

Simulation and numerical results comparison of average delivery ratio with RWP mobility model.

Performance comparison of different policies with Shanghai city mobility model.
To further check the accuracy of the model, we want to explore the performance when the number of nodes is different. In particular, we assume that there are 300 and 600 nodes, respectively. For simplicity, we only consider Case 1, that is, the value of

Simulation and numerical results comparison when the number of nodes is different.

Performance comparison when the number of nodes is different.
All of the above results demonstrate the accuracy of our theoretical framework. For this reason, we can use the numerical results obtained by our theoretical framework to evaluate the performance of different policies.
5.2. Performance Analysis with Numerical Results
In this section, all of the numerical results are obtained by our theoretical framework based on the best fitting for the Shanghai city motion trace.
First, we evaluate the performance of the optimal policy, which is the

Performance comparison of different policies when the maximal message lifetime is different.
Now, we let the maximal message lifetime

Performance comparison of different policies when the total fees are different.
The number of ordinary nodes may have certain impact on the routing performance. At present, we only consider the optimal policy obtained from (17). We let the value of

Average delivery ratio of the optimal policy when the value of
The result in Figure 10 shows that if the number of ordinary nodes is bigger, the performance is better. As shown above, the message transmission process can be seen as the commodities trading process. The event that the source requests help from other nodes can be seen as that the source buys certain goods from them. The price of the goods is increasing with time. Therefore, it is good for the source to buy more goods early. However, when the number of ordinary nodes is smaller, the goods are limited, so the source cannot buy many goods early. Therefore, if the number of ordinary nodes is bigger, the performance is better.
On the other hand, as shown in Theorem 1, the source will stop requesting help at certain time. In particular, the optimal policy satisfies:

Optimal policy when the value of
In the above simulation and numerical results, we define: PR(

Optimal policy with different function.
6. Conclusions
Due to the mobility of nodes and many other factors, it is hard to maintain the connectivity in M2M networks. Therefore, the
