Abstract
This article exhibits a reward-based incentive mechanism for file caching in delay-tolerant networks. In delay-tolerant networks, nodes use relay’s store-carry-forward paradigm to reach the final destination. Thereby, relay nodes may store data in their buffer and carry it till an appropriate contact opportunity with destination arises. However, the relays are not always willing to assist data forwarding due to a limited energy or a low storage capacity. Our proposal suggests a reward mechanism to uphold and to sustain cooperation among relay nodes. We model this distributed network interaction as a non-cooperative game. Namely, the source node offers to the relay nodes a positive reward if they accept to cache and to forward a given file successfully to a target destination, whereas the relay nodes may either accept or reject the source deal, depending on the reward attractiveness and on their battery status (their actual energy level). Next, full characterizations of both pure and mixed Nash equilibria are provided. Then, we propose three fully distributed algorithms to ensure convergence to the Nash equilibria (for both pure equilibrium and mixed equilibrium). Finally, we validate our proposal through extensive numerical examples and many learning simulations and draw some conclusions and insightful remarks.
Keywords
Introduction
Communication networks, whether they are wired or wireless, require that the end-to-end link always exists between a sender and a recipient. However, some emerging networks such as delay-tolerant networks (DTNs) 1 introduce new concepts like intermittent connectivity and large delay-tolerant services. Yet, DTNs are complex and distributed systems where a permanent link between pair nodes source/destination may only exist for brief and unpredictable periods of time due to the lack of network infrastructures. Nevertheless, the large delay is tolerated in such a network and the store-carry-forward paradigm of relays is designed to make a full end-to-end transmission possible. The DTN relays connect ubiquitously over a direct link instead of passing through a base station in order to provide an enhanced coverage, high data rate, low latency, and low cost. Consequently, DTN paradigms may improve the resource utilization and may reduce the load of base stations. Furthermore, higher energy efficiency and enhanced network availability/robustness can be achieved. Such a network provides modular solutions to assist data transmission within intermittent forwarding where steady connectivity is hard or even impossible to establish. From a practical perspective, the DTN concept has been extensively used in several fields. For instance, it has been deployed to strengthen military communications, 2 interplanetary networks, 3 underwater networks, 4 Vehicular Ad hoc NETworks (VANETs), 5 and many other applications.
In a DTN context, when the source–relay encounter takes place (i.e. they are into communication range of each other), the source transmits its content files to the relay. This latter stores/caches them, it waits arbitrarily until it experiences a link opportunity and forwards them to the target destination. Therefore, the relay nodes must be able to buffer files for long periods of time. Indeed, these relays can be wireless devices with limited battery lifetime and storage capability. Hence, they may not be always available to assist the file transmission, which makes connectivity intermittent and arbitrary. Thus, designing an efficient and effective scheme is crucial. In this article, we build a rewarding mechanism encouraging the relay nodes to participate in the forwarding game. We believe such a scheme makes sense in DTN and can improve the overall network performance metrics.
Literature review
During the last few years, there has been a tremendous growing interest in DTN. Most of related current research focuses on improving the DTNs performance (e.g. routing/forwarding, scheduling policy, buffer management, and mobility models). More precisely, these works mostly focus on reducing (1) the messages loss probability, (2) the end-to-end delay, (3) the expected number of transmissions, and (4) the energy consumption while maximizing the delivery rate of transmitted messages. However, most of these provided results are based on some strong assumptions and/or are sometimes difficult to apply to a realistic network.
In the meantime, numerous research works have been devoted to stimulate DTN relays to participate in the relaying of data. This research direction has gained considerable interest and has become a hot topic. Indeed, several incentive mechanisms have been designed to sustain cooperation among selfish relay nodes. Jiang and Bai 6 exhibited an interesting survey on incentive mechanisms for DTNs. Four schemes have been compared as follows: (1) virtual currency-based incentive mechanism, (2) credit-based incentive mechanism, (3) game-theory-based incentive mechanism, and (4) combined incentive mechanism. In the work by Hulke and Attar, 7 the authors discussed different game theoretical-inspired incentive mechanisms and analyzed them while pointing out their advantages and drawbacks. Chahin et al. 8 propose a fixed rewarding mechanism. To make tradeoff between incentives for cooperation and energy consumption, a Minority Game is constructed. Next, a learning algorithm is suggested to drive the system to a stable operating point (Nash equilibrium). In the work by El-Azouzi et al., 9 the authors use evolutionary game theory and a fixed rewarding mechanism to model the behavior of the relay nodes in a dense DTN environment. In the work by Brun et al., 10 the authors consider a simple reward scheme and provide some interesting insights on how the source node could optimize the reward value based on received relays information. This incentive mechanism depends on the time at which the source meeting occurs and on the information hidden/disclosed by the source (number of message copies/previous meetings with opponent, etc.), and both static and dynamic policies have been investigated. Lu et al. 11 present how energy harvesting can be exploited to improve the performance of opportunistic forwarding in mobile DTNs. In the work by Niyato et al., 12 the authors discuss how the relay nodes behave and form coalitions to help each other to forward a given file while dealing with energy and buffer space constraints. Neglia and Zhang 13 propose an activation forwarding control at relay nodes, but their optimization supposes that relays are always available to cooperate with the source node. However, the relays cannot always be active due to limited energy or buffer space.
Another exciting research direction focuses on buffer management and content popularity. Yet, several studies mainly spotlight the buffer management, for example, Krifa et al. 14 propose an optimal buffer management policy under unlimited bandwidth, which is far from reality. Wang et al., 15 consider that the popularity of the content files is a crucial parameter to distribute them on relay’s buffer. Next, they propose an algorithm that discovers a near-optimal allocation of files in the network. This work uses a minor unrealistic assumption that the files have the same size, and a major unrealistic assumption of a dense network is guaranteeing file transmission to the destination. In the work by Koulali et al., 16 the authors propose a strategic beaconing scheme for DTN and suggest an application for unmanned vehicle networks (UAVs). The main idea is to strategically switch to sleep mode when a contact opportunity may only occur with very low probability. This mechanism has been shown to meet a delivery-energy tradeoff. In the work by Le et al., 17 the authors propose a cooperative caching scheme based on the social relationship among nodes to improve the DTNs performance metrics. More precisely, this may help reduce the delay and the cost of message replicas. Yin and Cao 18 design a cooperative caching scheme for mobile ad hoc networks (MANETs), aiming to reduce the delay of future requests by caching the data at certain relay nodes. Numerous other works deal with the reduction of the data redundancy among neighboring nodes to create content diversity with a limited cache space.19,20 However, these caching schemes are hard to implement due to the intrinsic challenging characteristics of DTNs.
Our contribution
In this work, we introduce a mechanism for file caching in DTNs through sustaining cooperation among relay nodes by offering them a reward. We evaluate the tradeoff between the reward value and the energy consumed to find out a rest/stable point for the distributed network. The mechanism consists of asking the relays to cache a file according to an incentive rewarding. More precisely, the source node is offering a reward to the first relay node for accepting to cache the file and succeeding to forward it to the final destination. However, the relay may either accept or decline the source offer depending on the reward attractiveness and on its battery status (energy cost). For computation tractability and without loss of generality, we consider the two-hop routing 21 to route the files to their destinations. Moreover, the choice of the two-hop routing is also motivated by an efficient resources management, that is, bandwidth, buffer space, and relaying energy. The source node attempts to transmit the file to any relay that it encounters, whereas the relay nodes are allowed to transmit the file only to the final destination.
Due to limited storage and battery constraints, relay nodes may behave in a selfish way, that is, they may not be willing to cooperate all the time. Hence, the interaction among the source node and the relay nodes is naturally modeled as a non-cooperative game. The dynamic interaction among relays and the source node (all assumed to be rational) pushes the system to converge to an equilibrium point.
22
Now, each player seeks independently to maximize its own utility function which is also depending on the other players strategy vector. It follows that Nash equilibrium (Nash equilibrium is a strategy profile where no player/agent has incentive to unilaterally deviate) as a concept solution is the natural solution concept for such an adversarial situation. We define the payoff of each player a function of the contact probability (with the source node), the delivery probability (to the destination node), the file lifetime, and the energy consumed. On one hand, the source sets a reward value and invites relay nodes to accept its deal. On the other hand, each relay has two actions (pure strategies), either to accept (strategy “a”) or to reject (strategy
A preliminary version of this work 23 presents the caching problem for two relay nodes. This scheme is quite unrealistic but still provides some useful insights toward the derivation of the general model. The work presented by Ezzahidi et al. 24 provides a discussion on the reward mechanism for the n-person game but without detailed theoretic analysis and does not consider some important parameters. The major contributions of this article are fourfold and can be summarized as follows:
Our scheme is realistic. Indeed, decision-making includes the contact probability, the file lifetime, the energy consumption, and the relay willingness to cooperate;
We designed numerous fully distributed algorithms to reach the game equilibria: our algorithms cover both pure Nash equilibrium and mixed Nash equilibrium, for both discrete action sets (for relay nodes) and continuous action sets (for source node);
Our proposed learning algorithms require only local information (no external information is required), and thus, our scheme has good scalability features. In other words, our framework is suitable for both sparse and dense/ultra-dense networks. It could also capture device-to-device-like communications.
We showed that the Nash equilibrium of this game has some good fairness features. Indeed, we showed that the Price of Anarchy (PoA) is bounded and could be efficiently controlled and make it as closer to 1 as one wish by fine-tuning the file lifetime or alternatively adapting the mobility parameters (speed, directions, stop points, etc.).
The rest of this article is organized as follows. In section “System architecture and model formulation,” we describe the problem, system architecture, and the problem formulation. We analyze the game equilibria for two-person case and then for the n-person case in sections “Two-person caching game” and “The n-person caching game,” respectively. Next, we describe three learning schemes in section “Learning algorithms and insights for real-world implementation.” Extensive numerical and simulation results are presented in section “Numerical investigations.” Many insightful concluding remarks and perspectives are drawn in section “Conclusion.”
System architecture and model formulation
We consider a DTN network with a single source, a single destination, and n relays which are involved in the transmission of files. The files are generated at the source and the relay nodes should forward them to their final destination. Each file has a utility time h (finite horizon) during which the destination is interested in such a content. We assume that the relays have sufficient buffer capacity to store the content file, and they are only authorized to forward it to the destination (essence idea of the two-hop routing). Moreover, we suppose that the contact time is quite sufficient to succeed the file-carrying transmission when the link between nodes is up and the inter-contact times between any pair of nodes are independent and identically distributed
Throughout this article we assume, without loss of generality and clearness, that every node has the same contact probability with the source node. Besides, as mentioned above, at each encounter with the source, the relay chooses either to accept or to reject a given file. Hence, the source shall set an attractive payment strategy to encourage the relays to cooperate. It advertises a positive reward (The earned reward would be a certain credit, an amount of bit-coins, or some reputation bonus that the source uses to send its own files over the DTN network)

Interaction between source and relay nodes.
Source problem
The behavior of the source can be captured from a business perspective, that is, when it generates a file, it associates with it a value
Relay problem
When a file is generated by a source, a control admission takes place during the utility time h. Yet, the relay has two possible strategies, either to accept (action “a”) or to reject (action “r”) this file. From strategic reasoning, the relay chooses to accept or not depending on the value
Obviously, each transmission/reception attempt incurs an energy cost. We assume that each transmission consumes an amount of energy denoted by
It has been shown, in the works by Ezzahidi et al.
24
and Altman,
26
that a single relay node delivers a file to destination within the file lifetime h with probability
is the probability that a relay is not succeeding in file relaying to destination. Moreover, we assume that the relays have the same success probability.
Thus, equation (2) becomes
Now, we deal with the utility functions of both the source node and the relay node under the interaction details and the reward described above. Besides, it is rational that any action played by relay depends on the actions tacked by
Let us denote
We denote by
Relay utility
The utility function of the relay is defined as the difference between the reward earned from the source node and the energy consumed during the cache-and-forward transaction. We denote by
Source utility
We define the utility function of the source node as the difference between the maximum value that it is willing to pay and the cost function (expected energy consumed). It is expressed by
and can be rewritten as
Now, we turn to study the node’s behavior and characterize the equilibria structure of the caching game. Here, the source and the relay nodes are rational, that is, they know all system parameters mentioned above and each one seeks to maximize its own objective function. More precisely, we will exhibit some nice properties of the Nash equilibria in terms of existence and uniqueness under both pure strategies and mixed strategies. Under these assumptions, the induced caching game is clearly a one-shot (static game) non-cooperative game with complete information and the corresponding results. This formalism is suitable for our problem since it allows to clearly model the node’s behavior at low complexity and acceptable computational tractability.
Existence of a Nash equilibrium for the game is given by the well-known Debreu-Fan-Glicksberg theorem. Yet, under continuity of each utility function
Hereafter, we derive some closed form for the system equilibria. We first consider the simple case of two-person game. Next, we generalize our results to the n-person game.
Two-person caching game
In this section, we present our results and characterize the Nash equilibria under pure strategies and mixed strategies.
Pure Nash equilibrium
When nodes are only using pure strategies, we have the following result.
where W (Lambert-W function) is the reciprocal function of
which means that
In this case, the payoff of the source is written as
The source payoff is a linear function with a negative slope, and it reaches the maximum at
After some algebra, we obtain
which completes the proof.
Now, the payoff of the source node is
Again, the payoff function of the source node is a linear function but with a positive slope this time. Then, it reaches the maximum at
After some algebra, we obtain
which completes the proof.
Pure Nash equilibrium is a natural solution concept. However, such a solution may not exist all the time, see existence conditions in Lemma 1 and Lemma 2. Moreover, a pure equilibrium could fail to achieve a certain lucidity between relay nodes and good fairness properties among the game players. In fact, the pure Nash equilibrium
Mixed Nash equilibrium
When mixed strategy is permitted, the relay node chooses its pure strategies according to some probability distribution. Now, it plays strategy
Consider that the source node sets a file value
then
which allows to compute the equilibrium content value to be set by the source node
We turn now to derive the equilibrium forwarding strategy for relay nodes. The source problem can be resolved by maximizing its own objective function. At Nash equilibrium, the derivative with respect to
The expected payoff of the source node writes
then
Thus, the mixed Nash equilibrium is given by
PoA
In this subsection, we aim to quantify the performance degradation caused by the selfish behavior of the non-cooperative relay nodes. The loss of efficiency due to decentralizing decision-making is often captured using the concept of PoA introduced by Koutsoupias and Papadimitriou. 27 It is defined as the worst-achievable ratio between the aggregate utilities of a Nash equilibrium (decentralized setting) and of the optimal solution (centralized setting). When the value of the PoA is close to 1, it signifies that the difference between a Nash equilibrium and the optimal solution is not significant and near-optimal performance can be met without an expensive centralized control. However, unfortunately, the PoA can be arbitrarily large which means that the decentralized scheme may lead to a huge counter-performance and high loss of efficiency. Since the Nash equilibrium is unique for our caching game, the PoA can be written as follows
Let us compute
After some algebra, we obtain
Thus
and
Let us compute
Substituting equations (25) and (26) into equation (24) yields
with
We finally obtain
The last result shows that the PoA of the caching game is bounded for content files with large lifetime. Yet, the PoA can be efficiently controlled by fine-tuning the source intrinsic parameters such as
The n-person caching game
This section extends our results of the general case of n relays, a single source, and a single destination. We analyze the interaction/competition among relay nodes and derive some sufficient conditions for existence of the Nash equilibria.
A strategy profile of the n-person caching game is a strategy vector
The source node plays strategy
Pure Nash equilibrium
Since
which is a linear function with a negative slope. Then, it reaches its maximum at
Substituting
which completes the first part of the lemma. In the case where all relays
That is
Since
which is again a linear function with positive slope. Thus, it reaches its maximum at
Substituting
The proof of the second part of the lemma is complete.
The next lemma provides sufficient conditions for existence of a pure Nash equilibrium where some relay nodes accept the source deal, whereas the remaining relay nodes decline the offer.
We have the following for cooperating relay nodes
Similarly, we have the following for defecting relay nodes
From equations (33) and (34), we strategy profile (
That is
The source utility writes
which is a linear function; hence, it can reach its maximum:
1. For
Substituting
which completes the proof of the first condition.
2. For
Substituting
Thus
which completes the proof of the second condition.
Mixed Nash equilibrium
Now, as mixed strategies are allowed, the source/relay problems are as follows:
Source problem
The source node seeks to set the best reward value
Relay problem
Given a reward
When a mixed Nash equilibrium is achieved, each relay node becomes indifferent about which strategy to choose. Namely
Then
where
with
Now, the expected utility of the source node becomes
At Nash equilibrium, the first-order derivative of the source utility vanishes. Namely
Notice that the left side of previous equation (42) is continuous, strictly decreasing monotone function on
Summary
The caching game has a unique mixed Nash equilibrium fully characterized by
PoA
The caching game has a unique Nash equilibrium, and thus, the PoA is written as
where
The pair
Learning algorithms and insights for real-world implementation
We exhibit here some discussions and algorithms on the implementation of our incentive mechanism in real-world networks. The proposed reward mechanism can be efficiently and simply implemented through distributed learning algorithm. Obviously, due to the lack of the acknowledgement (ACK) about the state of the transmission (success or failure), which may incur into large delays, we propose distributed learning algorithms for the relays and the source to allow them to learn their individual best decisions only based on information available locally and independently over time, that is, they use the local observations to estimate their payoffs. But, once a successful delivery occurs, each node receives its real reward allowing to correct the learning trajectory.
Learning scheme for the source node
The source chooses a real value
Learning schemes for the relay node
The relay’s problem consists of choosing their best response strategy (to face the source policy, but also to face the other relays policies) that allows them to maximize their payoffs over time. At any time slot, every relay node selects a pure strategy according to a probability distribution. In order to discover pure Nash equilibrium, we consider the well-known Linear Reward-Inaction (LRI) algorithm. The LRI scheme only requires a slight amount of local information. For instance, it only requires the previous value of the distribution probability, the actual realization of selected pure strategy, and the reward received at previous step. Since LRI can only discover pure equilibria, we suggest a modified version called Linear Reward
Source node: learning continuous action
In previous sections “Two-person caching game” and “The n-person caching game,” we showed that from a designer perspective, the source objective function
where the estimated gradient is given by
If the step sizes
where
Relay node: learning pure Nash equilibrium
We use the LRI algorithm for the relays to learn their respective best response strategy (pure strategy only). Every relay updates a probability distribution independently over time with a view to converge to the best strategy progressively. Initially, the relay chooses strategy based on the initial probability distribution; after each time instant, the algorithm increases the probability to pick strategies with high observed utility and decreases the probability to choose the other strategies. Let
where
Relay node: learning mixed Nash equilibrium
Ultimately, the derived mixed Nash equilibrium exhibits some nice features. In particular, it has better energy efficiency and better fairness properties since each relay node may participate in the caching game. To guarantee convergence to such an equilibrium, we propose the
where
Numerical investigations
In this section, we present some numerical results and simulation runs to illustrate the behavior of our algorithms. Unless contrary indication, we use the following setting:
Figure 2(a) and (b) depicts the probability of acceptance

Equilibrium file value
We plot in Figure 3(a) and (b) the equilibrium payoff functions of the source and the relay (respectively) over file lifetime h. For low values of h, the source utility increases until a given threshold

Equilibrium Payoff for source node: (a) and equilibrium Payoff for relay node and (b) as functions of the file lifetimes h under different contact rates λ.
Figure 4(a) and (b) shows the impact of the network density (the number of relay nodes) on the equilibrium probability of acceptance and file value. We notice that

Equilibrium file value
Next, we depict in Figure 5(a) and (b) the impact of the network density on the reward value and the probability of acceptance for different contact rates

Equilibrium caching probability
Now, we focus on the loss of efficiency experienced at Nash equilibrium. We depict in Figure 6 the PoA while changing the file lifetime h for different values of contact rate

Price of Anarchy (PoA) as a function of the file lifetime h for different contact rates
The remaining is devoted to illustrate the behavior of our learning algorithms discussed above in section “Learning algorithms and insights for real-world implementation.” We consider

Seeking the pure Nash equilibrium (

Seeking the pure Nash equilibrium (

Seeking the unique mixed Nash equilibrium of the caching game: (a) depicts equilibrium file caching probability
Conclusion
In this work, we propose a reward-based incentive mechanism for file caching for DTN environment under constraints of energy consumption, file lifetime, and rate of contact. The source–relay interaction is captured using a non-cooperative game. Next, we give some necessary conditions for the existence of pure Nash equilibria. Then, we explicitly derive the unique mixed equilibrium of this caching game which seems to have some very nice features. For instance, it exhibits better fairness properties and improved energy efficiency in comparison with pure equilibria. Furthermore, we propose three fully distributed algorithms to discover the pure equilibrium and the mixed equilibrium as well. The first scheme allows the source node to learn its equilibrium strategy (continuous action). The second scheme is used by relay nodes to learn their respective pure strategies (discrete strategy). Whereas the third scheme is designed to explore the relay nodes’ mixed strategies at equilibrium.
As a part of our future works, we aim to implement such a mechanism in a real-world network using DTN framework. We are also interested in building a more energy-efficient mechanism by allowing relay nodes to switch to sleep mode to save energy while allowing a delivery probability higher than some given threshold.
Footnotes
Acknowledgements
The authors thank the anonymous referees and the editor for their constructive comments and suggestions to improve the presentation of this article.
Academic Editor: Sergio Toral
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work has been conducted within the framework of the Mobicity Project funded by the Moroccan Ministry of Higher Education and Scientific Research and the National Centre for Scientific and Technical Research.
