An advertisement dissemination strategy with maximal influence for Internet-of-Vehicles

Abstract

Recent advances in computation and communication technologies have led to the emergence of the Internet-of-Vehicles, where vehicles are connected to each other through sensors so that they can exchange information to improve driving safety, efficiency, and comfort. The Internet-of-Vehicles has attracted attention from both academia and industry, as it promises huge commercial and research value. This article studies an advertisement dissemination problem in the Internet-of-Vehicles with the aim of maximizing the profit (the number of vehicles that receive advertisements) given a limited budget. In this problem, advertisements are first sent to a selected set of seed vehicles and then forwarded to neighboring vehicles. To find the influential set of vehicles, we examine the probability of interaction between vehicles by exploiting their mobility. In particular, we present the computation of node marginal gain in four cases by examining vehicle connectivity and topology. Simulation results demonstrate that the proposed algorithm outperforms existing methods for influence maximization in both running time and delivery ratio under different traffic scenarios.

Keywords

Internet-of-Vehicles, advertisements dissemination, influence maximization, greedy algorithm, node connectivity

Introduction

Recent years have witnessed explosive growth in the Internet-of-Things (IoT), where “Things” such as physical devices and home appliances are embedded with computer-based systems for information collection and exchange. Various applications of IoT require mobility support, geo-distribution, and location awareness.¹ Cloud computing is commonly used to process the massive data generated by these applications. However, the cloud computing paradigm introduces latency in computation and communication between cloud servers and edge nodes.²

Edge computing, a novel distributed computing paradigm, is an extension of cloud computing.^3–5 It can address the requirement of ultra-low latency for communications. Take an exciting application of IoT, the Internet-of-Vehicles (IoV), as an example. It provides a wide range of services to improve transportation safety, efficiency, and comfort.⁶ Existing works on mobile-edge computing for IoV focus on downloading and offloading,^7,8 efficient resource allocation,^9,10 data integrity analysis,^11,12 secure data sharing,^13,14 and privacy-preserving analysis.¹⁵

IoV is composed of cloud servers and a vehicular ad hoc network (VANET), where the VANET consists of vehicles and road side units (RSUs).¹⁶ Vehicles, treated as mobile-edge computing (MEC) nodes, are equipped with onboard units (OBUs) and one or more application units, and follow the dedicated short range communication (DSRC) protocol¹⁷ to communicate with each other and with additional facilities such as RSUs. RSUs offer vehicles access to the Internet and, owing to their large computing capacity, enhance the performance of information exchange.

The ease of data interaction in IoV has opened a myriad of possibilities for intelligent value-added inter-vehicle and intra-vehicle services. Messages such as commercials for nearby gas stations, hotels, and restaurants can be disseminated to a large number of road users at a small cost through vehicle collaboration and participation. This is called cooperative communications¹⁸ or relay networks.¹⁹ Yet the high mobility of vehicles is a double-edged sword. While rapidly moving vehicles can send information to more road users as they encounter each other, the highly dynamic and intermittent connections between vehicles introduce new challenges. Value-added services typically aim to maximize the impact (the number of receivers) among users with a limited budget. Previous studies proposed several solutions to this problem over static wireless networks and (less dynamic) mobile networks,^20–22 but little has been contributed to IoV.

In this article, we consider a profit-maximization advertisement (ad) dissemination problem in IoV, where merchants send ads through RSUs to vehicles. Given a limited budget, ads are first sent to a selected set of vehicles (seeds) and then forwarded to other vehicles as the seed vehicles move, with the aim of maximizing the total profit (the number of receivers). To find the influential set of vehicles, we examine the probability of interaction between vehicles by exploiting their mobility; that is, we consider the physical distance, the traveling history, and the current route to find users with a large overall effect on the network. In light of security concerns, the information shared is carefully processed to protect user privacy. In addition, the contents of ads are often time sensitive: ads in VANETs are associated with a timer $Δ T$ and are discarded when $Δ T$ expires. Therefore, the algorithms used in this article are computationally light, and the major computation is done at RSUs, enabling data to be analyzed in nearly real time.

The problem considered in this article is similar to the influence maximization (IM) problem in social networks,²³ which seeks a subset of influential nodes that yields the maximum benefit of propagation. We adopt the idea of choosing the seed nodes with the largest marginal gain.²⁴ However, due to the differences in the information diffusion model (detailed in section “Problem definition”), existing methods to calculate node marginal gain cannot be used to solve our problem directly. The contributions of this article are summarized below:

We formulate the ads dissemination problem as a profit maximization problem, and prove it to be NP-hard.

We employ a greedy algorithm to solve the profit maximization ads dissemination problem. In particular, we present the computation of node marginal gain in four cases by examining their connectivity and neighboring nodes.

We exploit the mobility of vehicles when selecting seed nodes. Information to be shared is carefully processed to protect user privacy.

The performance of the proposed algorithms is evaluated through extensive simulations. Simulation results suggest that our seed selection algorithms surpass state-of-the-art methods for IM in two respects, running time and delivery ratio; that is, they deliver ads to more vehicles with a shorter running time under various scenarios.

The rest of the article is organized as follows. The related work is reviewed in section “Related work.” We present the network model and problem definitions in section “Problem definition.” In section “Solutions,” we present an independent cascade (IC)-based model to cope with the ads dissemination problem. Experimental results and performance evaluation are discussed in section “Performance evaluation.” Finally, concluding remarks are given in section “Conclusion and future work.”

Related work

Our work is related to studies that select influential nodes using knowledge of social networks. In this section, we introduce these studies and compare them with our work. Lu et al.²⁵ applied the concept of social degree to the selection of RSUs, placing RSUs at intersection vertices with a high social degree. Kim et al.²⁶ abstracted a metropolitan area map into a rectangular graph composed of small square blocks. The authors proposed a polynomial-time algorithm on this graph to maximize influence under a budget constraint. These two studies performed well in RSU selection, but the nodes are immobile, so the methods cannot be applied to our problem. Chuang and Lin²⁷ and Mezghani et al.²⁸ focused on the offloading problem in VANETs, which is similar to our seed selection problem. Chuang and Lin²⁷ proposed a community-based algorithm that selects initial sources to improve the offloading efficiency. The algorithm divides the nodes into communities to separate the impact range of each initial source. Mezghani et al.²⁸ designed an interest-aware selection scheme called SIEVE. The SIEVE algorithm takes the nodes’ interests into consideration and combines them with the near-future encounter probability estimated by the vehicular controller; the seeds are then determined by a content utility computed from these two indexes. Both algorithms can improve the content utility when offloading, but they cannot be applied to our problem because the offloading process relies on users’ voluntary participation, which is not practical in advertising.

Notably, our seed selection problem is similar to the IM problem, which has drawn continuous attention in social networks. Tang et al.²⁰ proposed a two-phase influence maximization (TIM) algorithm under the IC model and showed through theoretical analysis that it achieves a $(1 - 1 / e - ε)$ approximation. Zhang et al.²⁹ studied the Partial Positive Influence Seeding (PPIS) problem, a variant of the IM problem. They proved that the objective function of PPIS is nonsubmodular and presented an algorithm with an $O ((\log n)^{2} H (⌊ pn ⌋))$ approximation. Nguyen et al.²² and Li et al.³⁰ studied the cost-aware targeted viral marketing (CTVM) problem, which is IM with a different selection cost and a different influence benefit for each node. The authors proposed the Billion-scale Cost-aware Targeted (BCT) algorithm to solve the CTVM problem in Nguyen et al.²² and improved on it with the TIPTOP algorithm in Li et al.³⁰ These works on social networks achieve good approximation guarantees and short running times, but they cannot be applied to our problem directly, for the reasons detailed in section “Problem definition.”

Problem definition

Consider an IoV, displayed in Figure 1, that consists of N vehicles and M RSUs. Each vehicle is equipped with an OBU for communications with other vehicles and RSUs.^8,16 Let the communication ranges of a vehicle and an RSU be $R A_{o}$ and $R A_{s}$ , respectively. Assume there exists a controller alongside each RSU in charge of tracking vehicle locations, communications between vehicles within $R A_{s}$ , and communications between vehicles and the RSU. The information captured by the controllers can be represented by a weighted undirected graph $G = (V, E)$ , where V is the set of vehicle nodes (hereafter referred to simply as nodes) and E is the set of edges indicating the interactions between nodes and those between nodes and the RSU.

Figure 1.

An IoV network model.

Under the above IoV network model, we consider an ads dissemination scenario based on a modified IC model, where some nodes are initially activated as seeds and these activated nodes then activate their neighbors individually. All vehicles upload their information to RSUs, where it is merged into a social graph cloud. According to the information stored in the social graph cloud, we select a small set of seed vehicles (initial active nodes). Merchants first send ads to nearby RSUs. Any seed vehicle that enters the range $R A_{s}$ of an RSU receives the ads forwarded by that RSU. Finally, seed vehicles deliver ads to other vehicles within the range $R A_{o}$ . Non-seed vehicles do not forward the ads they receive.

This one-hop ads dissemination scenario is motivated by considerations of analysis complexity and budget. Since no vehicles are willing to forward ads without being paid, merchants have to pay seed vehicles for helping disseminate ads. The budget of merchants is limited, which constrains the number of seed vehicles that can be selected. However, a larger number of seed vehicles means that ads can be disseminated more widely, and thus the merchants can earn a higher profit. Motivated by this tradeoff between cost and profit, we formulate a profit maximization ads dissemination problem under a budget constraint.

To be specific, let $ngb (v)$ be the set of neighboring nodes to which v is directly connected, and let S be the set of seed vehicles. During dissemination, ads are first transmitted from RSUs to seed vehicles $u (u \in S)$ . Then each u disseminates ads to its neighboring nodes $v (v \in ngb (u))$ with a probability $w (u, v)$ . Specifically, $w (u, v) \in [0, 1]$ captures the communication probability and is the weight of the edge $e (u, v) \in E$ between u and v. It can be calculated as

\begin{matrix} w (u, v) = \frac{1}{1 - \exp (- \frac{d (u, v)}{2 R A_{o}})} \end{matrix}

(1)

where $d (u, v)$ represents the physical distance between u and v, and $2 R A_{o}$ represents the maximum distance between two vehicles at which they still have a chance to communicate. If $v \in S$ , or $v \notin S$ but v receives ads from some node in S, we say v is influenced by S. This influence is indicated by a binary variable $i_{(v)}^{(S)}$ as follows

i_{(v)}^{(S)} = \begin{cases} 1, & v \in S \\ 1, & v \notin S \text{ but } v \in ngb (u) \text{ if } u \in S \\ 0, & \text{otherwise} \end{cases}

(2)

For a time period $Δ T$ , the number of nodes influenced by S is denoted as $I (S)$ and is called the influence spread of the seed set S. All of these influenced nodes receive ads within $Δ T$ . In this article, $Δ T$ equals the expiration period of an ad; it affects $I (S)$ but not the size of S.
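The one-hop model behind $I (S)$ can be illustrated with a small Monte Carlo sketch. It is only a sketch under stated assumptions: the function names `sample_spread` and `estimate_spread`, the toy edge weights in `WEIGHTS`, and the representation of $w (u, v)$ as a dictionary keyed by unordered node pairs are all hypothetical and not taken from the article. Each run draws one realisation of $i_{(v)}^{(S)}$ for every node and counts the influenced nodes; averaging over runs estimates $E [I (S)]$.

```python
import random


def sample_spread(weights, seeds, rng):
    """Draw one realisation of I(S) under the one-hop model.

    weights maps frozenset({u, v}) -> w(u, v); seeds is the seed set S.
    """
    influenced = set(seeds)  # i_(v)^(S) = 1 for every v in S
    for edge, w in weights.items():
        u, v = tuple(edge)
        # Only seed endpoints forward ads; a non-seed receiver never relays.
        for sender, receiver in ((u, v), (v, u)):
            if sender in seeds and receiver not in seeds:
                if rng.random() < w:
                    influenced.add(receiver)
    return len(influenced)


def estimate_spread(weights, seeds, runs=20000, seed=0):
    """Average many realisations to estimate E[I(S)]."""
    rng = random.Random(seed)
    total = sum(sample_spread(weights, seeds, rng) for _ in range(runs))
    return total / runs


# Hypothetical toy network: a path 1 - 2 - 3 - 4 with assumed weights.
WEIGHTS = {
    frozenset({1, 2}): 0.5,
    frozenset({2, 3}): 0.4,
    frozenset({3, 4}): 0.9,
}

if __name__ == "__main__":
    print(estimate_spread(WEIGHTS, {2}))     # close to 1 + 0.5 + 0.4
    print(estimate_spread(WEIGHTS, {2, 3}))  # close to 2 + 0.5 + 0.9
```

With seed set {2}, node 2 counts once and its two neighbors are reached with probabilities 0.5 and 0.4, so the estimate should be close to 1.9. Node 4 is never reached, because node 3 is not a seed and therefore does not relay.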

Let $b (v), v \in V$ be the profit that merchants will make if v receives the ads. Let $c (v)$ be the cost of selecting v as a seed vehicle. (The cost includes incentives for seed vehicles and the cost of OBUs.) Without loss of generality, we assume that both $b (v)$ and $c (v)$ are constants. As a result, under the set of the selected seed vehicles S, the total profit the merchants will make can be denoted as $B (S)$ and computed by the following equation

B (S) = \sum_{v \in V} i_{(v)}^{(S)} b (v)

(3)

Given a limited budget $B$ , the objective of our ads dissemination problem is to maximize the total profit $B (S)$ . Thus, the problem can be written as

\begin{matrix} \max B (S) \\ \text{s.t. } \sum_{v \in S} c (v) \leq B \text{ at the end of } Δ T, \\ i_{(v)}^{(S)} \in \{0, 1\} \end{matrix}

(4)

Theorem 1

The problem in equation (4) is NP-hard.

Proof

Let k be the maximum number of seed vehicles that budget $B$ allows, that is, $k = ⌊ B / c (v) ⌋$ , where $⌊ \cdot ⌋$ is the floor function. Then in equation (4), we wish to find a k-element set S for which $B (S)$ is maximized. This is an NP-hard problem if $B (S)$ is a submodular function.³¹ We prove that $B (S)$ is a submodular function in Lemma 1, which completes the proof of Theorem 1.

Lemma 1

$B (S)$ is a submodular function.

Proof

See Proof of Lemma 1 in Appendix 1.
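As a numerical sanity check on a toy instance (not a substitute for the proof in Appendix 1), the diminishing-returns property can be tested by brute force. The sketch below assumes a constant $b (v)$ , so that $B (S)$ is proportional to the influence spread, and works with the expected one-hop spread. The names `expected_spread` and `is_submodular` and the toy weights are hypothetical.

```python
from itertools import combinations


def expected_spread(weights, nodes, seeds):
    """E[I(S)] for the one-hop model.

    Each seed counts 1; each non-seed u counts
    1 - prod over seeds s of (1 - w(s, u)), with w = 0 for missing edges.
    """
    total = float(len(seeds))
    for u in nodes:
        if u in seeds:
            continue
        miss = 1.0
        for s in seeds:
            miss *= 1.0 - weights.get(frozenset({s, u}), 0.0)
        total += 1.0 - miss
    return total


def is_submodular(weights, nodes, tol=1e-12):
    """Check f(A + v) - f(A) >= f(B + v) - f(B) for all A subset of B, v not in B."""
    nodes = list(nodes)
    subsets = [frozenset(c) for r in range(len(nodes) + 1)
               for c in combinations(nodes, r)]
    f = {s: expected_spread(weights, nodes, s) for s in subsets}
    for b in subsets:
        for a in subsets:
            if not a <= b:
                continue
            for v in nodes:
                if v in b:
                    continue
                if f[a | {v}] - f[a] < f[b | {v}] - f[b] - tol:
                    return False
    return True


# Hypothetical toy network with assumed weights.
NODES = [1, 2, 3, 4, 5]
WEIGHTS = {
    frozenset({1, 2}): 0.5,
    frozenset({2, 3}): 0.4,
    frozenset({3, 4}): 0.9,
    frozenset({1, 3}): 0.7,
    frozenset({4, 5}): 0.3,
}

if __name__ == "__main__":
    print(is_submodular(WEIGHTS, NODES))
```

The check enumerates all subsets, so it is only practical for a handful of nodes; its purpose is to make the inequality behind Lemma 1 concrete.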

Note that equation (4) can be seen as a variation of the IM problem²⁹ in social networks. However, solutions for the IM problem cannot be used to solve our problem directly due to the following reasons:

Different from social networks with the relatively stable topology, the topology of VANETs changes constantly as vehicles move at high speed and vehicles get disconnected frequently.

We assume that vehicles (except seed vehicles) that receive ads will not forward ads to other vehicles, that is, an active node cannot activate another node unless it is a seed node. In the IM problem, by contrast, once a node becomes active, it is able to affect other nodes.

The influence between nodes in the IM problem is based on social relationship, for example, contact frequency, and the number of common friends, while the influence between vehicles in VANETs is impacted mostly by locations and velocities of vehicles.

In the next section, we propose an algorithm to solve equation (4), which is similar to solutions of the IM problem based on the classic IC model.²² Because of the above differences between our problem and the IM problem, we make several necessary modifications. For the sake of clarity, notations used in this article and their descriptions are summarized in Table 1.

Table 1.

Table of notations.

Variable	Description
N	The number of nodes in G
M	The number of RSUs
$e (u, v)$	The link between node u and node v
$w (u, v)$	The influence weight of each link $e (u, v)$
$b (v)$	The profit of influencing v
$c (v)$	The cost of selecting v
$ngb (v)$	The set of nodes that directly connect to v
k	The maximum number of seed vehicles ( $⌊ B / c (v) ⌋$ )
$I (S)$	The influence spread of the seed set S
$B (S)$	The total profit of the seed set S
$Sv$	The intersection of $ngb (S)$ and $ngb (v)$
$Δ T$	The time period during which advertisements are valid

Solutions

In this section, we present solutions to the problem defined in equation (4). As described in the above section, the goal is to find a k-element seed set S to maximize the profit $B (S)$ .

To select k seed nodes, we can employ a classic greedy algorithm that starts with an empty set S then iterates k times and adds the node with the largest marginal gain $MG (v) = B (S + v) - B (S)$ each time. As we assume $b (v)$ is a constant, calculating $B (S + v) - B (S)$ would be the same as calculating $I (S + v) - I (S)$ . Calculating $I (S)$ is known to be hard as discussed in previous studies.³² In this article, we follow the mainstream work in the literature^33,34 to use the estimation of $E [I (S)]$ and $E [I (S + v)]$ for $MG (v)$ . Details are presented in Algorithm 1.

Algorithm 1.

IC-based seed selection algorithm.

Input: G, B, b (v), c (v)

Output: seed set S.

1: $S \leftarrow \emptyset$

2: while $| S | < ⌊ B / c (v) ⌋$ do

3: $S \leftarrow S + \arg\max_{v \in V \setminus S} (E [I (S + v)] - E [I (S)])$ ;

4: end while

IC: independent cascade.
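The greedy loop of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the marginal gain is computed here from the exact expected one-hop spread rather than from the four-case closed forms, and the names `expected_spread`, `greedy_seeds`, `NGB`, and `W`, as well as the toy path network, are hypothetical.

```python
import math


def expected_spread(ngb, w, seeds):
    """E[I(S)]: seeds count 1, a non-seed u counts 1 - prod(1 - w(s, u))."""
    total = float(len(seeds))
    for u in ngb:
        if u in seeds:
            continue
        miss = 1.0
        for s in seeds & ngb[u]:  # only seeds adjacent to u can reach it
            miss *= 1.0 - w[frozenset({s, u})]
        total += 1.0 - miss
    return total


def greedy_seeds(ngb, w, budget, cost):
    """Select up to floor(budget / cost) seeds by largest marginal gain."""
    k = math.floor(budget / cost)
    seeds = set()
    while len(seeds) < k:
        candidates = [v for v in ngb if v not in seeds]
        if not candidates:
            break
        base = expected_spread(ngb, w, seeds)
        best = max(
            candidates,
            key=lambda v: expected_spread(ngb, w, seeds | {v}) - base,
        )
        seeds.add(best)
    return seeds


# Hypothetical toy network: a path 1 - 2 - 3 - 4 - 5 with assumed weights.
NGB = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
W = {
    frozenset({1, 2}): 0.5,
    frozenset({2, 3}): 0.4,
    frozenset({3, 4}): 0.9,
    frozenset({4, 5}): 0.8,
}

if __name__ == "__main__":
    print(greedy_seeds(NGB, W, budget=2.5, cost=1.0))
```

On this toy path, node 4 is chosen first because it has the largest single-node expected spread (2.7). Node 2 is chosen second: its gain of 1.54 exceeds that of node 3 (0.5), since node 3 is already likely to be reached by node 4.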

To calculate $E [I (S + v)] - E [I (S)]$ , we first give how to compute $I (S + v) - I (S)$ in Theorem 2 by considering the following four cases:

Case 1. v and S have no common neighbors, and v is not adjacent to S.

Case 2. v and S have no common neighbors, but v is adjacent to S.

Case 3. v and S have neighbors in common, but v is not adjacent to S.

Case 4. v and S have neighbors in common, and v is adjacent to S (there exists at least one node in S that is adjacent to v).

Detailed expressions are given in equation (5), and the proof of Theorem 2 follows.

Theorem 2

The marginal gain $I (S + v) - I (S)$ can be expressed as

I (S + v) - I (S) = \begin{cases} I (v), & v \in \text{Case 1} \\ I (v) - i_{(v)}^{(S)} - \sum_{s \in S} i_{(s)}^{(v)}, & v \in \text{Case 2} \\ I (v) - \sum_{u \in ngb (S) \cap ngb (v)} \min \{i_{(u)}^{(v)}, i_{(u)}^{(S)}\}, & v \in \text{Case 3} \\ I (v) - i_{(v)}^{(S)} - \sum_{s \in S} i_{(s)}^{(v)} - \sum_{u \in ngb (S) \cap ngb (v)} \min \{i_{(u)}^{(v)}, i_{(u)}^{(S)}\}, & v \in \text{Case 4} \end{cases}

(5)

Proof

Case 1: In our ads dissemination model, seed nodes activate their neighboring nodes (forward ads to neighboring nodes) independently of each other. Therefore, $I (S + v)$ can be written as the sum of $I (S)$ and $I (v)$ , if v and S have no common neighbors, and v is not adjacent to S.

Case 2: If v is adjacent to S, then v, as a seed node, would try to activate its neighbors in S, which are also seed nodes. These nodes would similarly try to activate v. As a result, when adding $I (v)$ to $I (S)$ to get $I (S + v)$ , we need to subtract the part in which they try to influence each other, as they are already counted as seed nodes in $I (S)$ and $I (v)$ .

Case 3: In this case, v and S would both try to activate their common neighbors. When adding $I (v)$ to $I (S)$ , we count both $i_{(u)}^{(v)}$ and $i_{(u)}^{(S)}$ for each $u \in ngb (S) \cap ngb (v)$ . To obtain $I (S + v)$ , we need to subtract the minimum of the two.

Case 4: This case is a combination of Case 2 and Case 3. Therefore, when adding $I (v)$ to $I (S)$ , we need to subtract both the part in which S and v influence each other and the duplicate influence on common neighbors.

Consider an example IoV G displayed in Figure 2. Let $S = \{v_{9}\}$ , then the influence spread of the current seed set S is

\begin{aligned} I (S) & = \sum_{v \in V} i_{(v)}^{(S)} \\ & = i_{(3)}^{(9)} + i_{(4)}^{(9)} + i_{(8)}^{(9)} + i_{(10)}^{(9)} + 1 \end{aligned}

Figure 2.

An example of G.

Given $S = {v_{9}}$ and Figure 2, we can find that nodes $v_{3}$ and $v_{4}$ fall into Case 4, nodes $v_{8}$ and $v_{10}$ fall into Case 2, nodes $v_{2}$ , $v_{5}$ and $v_{7}$ fall into Case 3, nodes $v_{1}$ and $v_{6}$ are in Case 1. For example, we let v be $v_{3}$ , then

\begin{aligned} I (S + v) & = I (9, 3) \\ & = i_{(8)}^{(9)} + i_{(10)}^{(9)} + i_{(2)}^{(3)} + \max \{i_{(4)}^{(3)}, i_{(4)}^{(9)}\} + 2 \\ & = I (9) + I (3) - i_{(3)}^{(9)} - i_{(9)}^{(3)} - \min \{i_{(4)}^{(9)}, i_{(4)}^{(3)}\} \end{aligned}

The result is the same as Case 4 in equation (5).

We can also let v be $v_{2}$ and obtain

\begin{aligned} I (S + v) & = I (9, 2) \\ & = i_{(1)}^{(2)} + i_{(7)}^{(2)} + i_{(4)}^{(9)} + i_{(10)}^{(9)} + \max \{i_{(3)}^{(9)}, i_{(3)}^{(2)}\} + \max \{i_{(8)}^{(9)}, i_{(8)}^{(2)}\} + 2 \\ & = I (9) + I (2) - \min \{i_{(3)}^{(9)}, i_{(3)}^{(2)}\} - \min \{i_{(8)}^{(9)}, i_{(8)}^{(2)}\} \end{aligned}

which is consistent with Case 3 in equation (5).

If v is $v_{8}$ , we easily obtain $I (9, 8) = I (9) + I (8) - i_{(9)}^{(8)} - i_{(8)}^{(9)}$ , which corresponds to Case 2.

Finally, it is obvious that $I (9, 1) = I (9) + I (1)$ if v is $v_{1}$ , as in Case 1. The above examples confirm that Theorem 2 is correct.

This completes the proof of Theorem 2.
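The identity in equation (5) can also be checked numerically on random realisations of the indicators. The sketch below is illustrative only. The edge list reproduces the adjacencies that the worked examples state for Figure 2 (the neighbors of $v_{9}$ , $v_{3}$ , and $v_{2}$ ), while the edges 5-4, 7-10, and 1-6 are assumptions added so that every node falls into the case the text assigns to it. All function names are hypothetical, and $I (v)$ includes the count of v itself, as in the worked example for $I (9)$ .

```python
import random

# Edges stated in the worked examples, plus three assumed edges (last row).
EDGES = [(9, 3), (9, 4), (9, 8), (9, 10), (3, 2), (3, 4),
         (2, 1), (2, 7), (2, 8),
         (5, 4), (7, 10), (1, 6)]

NGB = {}
for a, b in EDGES:
    NGB.setdefault(a, set()).add(b)
    NGB.setdefault(b, set()).add(a)


def draw_indicators(rng, p=0.5):
    """One realisation of i_(u)^(v) for every ordered adjacent pair (v -> u)."""
    return {(v, u): int(rng.random() < p) for v in NGB for u in NGB[v]}


def i_set(ind, seeds, u):
    """i_(u)^(S) as in equation (2)."""
    if u in seeds:
        return 1
    return max((ind[(s, u)] for s in seeds if u in NGB[s]), default=0)


def spread(ind, seeds):
    """I(S): sum of i_(u)^(S) over all nodes."""
    return sum(i_set(ind, seeds, u) for u in NGB)


def theorem2_rhs(ind, seeds, v):
    """Right-hand side of equation (5); Cases 1-3 arise as empty sums."""
    common = {u for s in seeds for u in NGB[s]} & NGB[v]
    common -= seeds  # seeds adjacent to v are handled by the mutual term
    mutual = i_set(ind, seeds, v) + sum(ind[(v, s)] for s in seeds if s in NGB[v])
    overlap = sum(min(ind[(v, u)], i_set(ind, seeds, u)) for u in common)
    return spread(ind, {v}) - mutual - overlap


def check(seeds, trials=500, seed=1):
    """Compare both sides of equation (5) for every candidate v."""
    rng = random.Random(seed)
    for _ in range(trials):
        ind = draw_indicators(rng)
        for v in NGB:
            if v in seeds:
                continue
            lhs = spread(ind, seeds | {v}) - spread(ind, seeds)
            if lhs != theorem2_rhs(ind, seeds, v):
                return False
    return True


if __name__ == "__main__":
    print(check({9}))
```

Because Cases 1 to 3 are special cases in which one or both correction terms vanish, a single expression covers all four cases in the check.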

According to Theorem 2 and the communication probability between nodes $w (u, v)$ , we have the computations of $E [I (S + v) - I (S)]$ for the four cases in the following theorems.

Theorem 3

\begin{matrix} E [I (S + v) - I (S)] = \sum_{v' \in ngb (v)} w (v', v) \\ \text{s.t. } v \text{ in Case 1} \end{matrix}

(6)

Proof

According to Theorem 2, if v and S have no common neighbors, and v is not adjacent to S, $E [I (S + v) - I (S)]$ is equal to $E (I (v))$ , where

I (v) = \sum_{u \in V} i_{(u)}^{(v)} = \sum_{v' \in ngb (v)} i_{(v')}^{(v)}

(7)

We have

E [I (v)] = E [\sum_{u \in V} i_{(u)}^{(v)}] = \sum_{v' \in ngb (v)} E [i_{(v')}^{(v)}]

(8)

Note that whether node u is influenced by node v follows a Bernoulli distribution, where $P (i_{(u)}^{(v)} = 1) = w (u, v)$ and $P (i_{(u)}^{(v)} = 0) = 1 - w (u, v)$ . As a result, we obtain $E [I (S + v) - I (S)] = \sum_{v' \in ngb (v)} w (v', v)$ .

This completes the proof of Theorem 3.

Theorem 4

\begin{matrix} E [I (S + v) - I (S)] = \sum_{v' \in ngb (v)} w (v', v) - 2 \sum_{s \in S} w (s, v) \\ \text{s.t. } v \text{ in Case 2} \end{matrix}

(9)

Proof

When v and S have no common neighbors, and v is adjacent to S, following Theorem 2, we have

\begin{matrix} E [I (S + v) - I (S)] \\ = E [I (v)] - E [i_{(v)}^{(S)}] - E [\sum_{u \in S} i_{(u)}^{(v)}] \\ = \sum_{v' \in ngb (v)} w (v', v) - \sum_{s \in S} w (s, v) - \sum_{s \in S} w (s, v) \\ = \sum_{v' \in ngb (v)} w (v', v) - 2 \sum_{s \in S} w (s, v) \end{matrix}

(10)

This completes the proof of Theorem 4.

Let $Sv = ngb (S) \cap ngb (v)$ . Then we have

Theorem 5

\begin{matrix} E [I (S + v) - I (S)] = \sum_{y \in ngb (v) \setminus Sv} w (y, v) \\ + \sum_{u \in Sv} w (u, v) \underset{s \in S}{Π} (1 - w (s, u)) \\ \text{s.t. } v \text{ in Case 3} \end{matrix}

(11)

Proof

In Case 3, if v and S have neighbors in common, but v is not adjacent to S, we have

\begin{matrix} E [I (S + v) - I (S)] \\ = E [I (v) - \sum_{u \in {ngb (S) \cap ngb (v)}} min {i_{(u)}^{(v)}, i_{(u)}^{(S)}}] \\ = E [I (v)] - E [\sum_{u \in Sv} min {i_{(u)}^{(v)}, i_{(u)}^{(S)}}] \end{matrix}

(12)

We know that $E [I (v)] = \sum_{v' \in ngb (v)} w (v', v)$ , and $E [\sum_{u \in Sv} min {i_{(u)}^{(v)}, i_{(u)}^{(S)}}]$ can be expressed as follows

\begin{matrix} E [\sum_{u \in Sv} min {i_{(u)}^{(v)}, i_{(u)}^{(S)}}] \\ = \sum_{u \in Sv} E [min {i_{(u)}^{(v)}, i_{(u)}^{(S)}}] \\ = \sum_{u \in Sv} {0 * P (min {i_{(u)}^{(v)}, i_{(u)}^{(S)}} = 0) \\ + 1 * P (min {i_{(u)}^{(v)}, i_{(u)}^{(S)}} = 1)} \\ = \sum_{u \in Sv} P (min {i_{(u)}^{(v)}, i_{(u)}^{(S)}} = 1) \end{matrix}

(13)

Note that S and v would influence nodes in $Sv$ independently of each other. We obtain

\begin{matrix} P (min {i_{(u)}^{(v)}, i_{(u)}^{(S)}} = 1) \\ = P (i_{(u)}^{(v)} = 1, i_{(u)}^{(S)} = 1) \\ = P (i_{(u)}^{(v)} = 1) P (i_{(u)}^{(S)} = 1) \\ = w (u, v) P (i_{(u)}^{(S)} = 1) \end{matrix}

(14)

where

\begin{matrix} P (i_{(u)}^{(S)} = 1) & = P (max {i_{(u)}^{(s)}} = 1, s \in S) \\ = 1 - \underset{s \in S}{Π} P (i_{(u)}^{(s)} = 0) \\ = 1 - \underset{s \in S}{Π} (1 - w (s, u)) \end{matrix}

(15)

By substituting equations (14) and (15) into equation (13), we have

\begin{matrix} E [\sum_{u \in Sv} min {i_{(u)}^{(v)}, i_{(u)}^{(S)}}] \\ = \sum_{u \in Sv} w (u, v) [1 - \underset{s \in S}{Π} (1 - w (s, u))] \end{matrix}

(16)

Therefore, we can get

\begin{matrix} E [I (S + v) - I (S)] \\ = \sum_{v' \in ngb (v)} w (v', v) - \sum_{u \in Sv} w (u, v) + \sum_{u \in Sv} w (u, v) \underset{s \in S}{Π} (1 - w (s, u)) \\ = \sum_{y \in ngb (v) \setminus Sv} w (y, v) + \sum_{u \in Sv} w (u, v) \underset{s \in S}{Π} (1 - w (s, u)) \end{matrix}

(17)

This completes the proof of Theorem 5.

Theorem 6

\begin{matrix} E [I (S + v) - I (S)] = \sum_{y \in ngb (v) \setminus Sv} w (y, v) \\ + \sum_{u \in Sv} w (u, v) \underset{s \in S}{Π} (1 - w (s, u)) - 2 \sum_{s \in S} w (s, v) \\ \text{s.t. } v \text{ in Case 4} \end{matrix}

(18)

Proof

It is easy to obtain that when v and S have neighbors in common, and v is adjacent to S, $E [I (S + v) - I (S)]$ is a combination of equations (9) and (11)

\begin{matrix} E [I (S + v) - I (S)] = \sum_{y \in ngb (v) \setminus Sv} w (y, v) \\ + \sum_{u \in Sv} w (u, v) \underset{s \in S}{Π} (1 - w (s, u)) - 2 \sum_{s \in S} w (s, v) \end{matrix}

(19)

This completes the proof of Theorem 6.
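The closed forms of Theorems 3 to 6 can be collected into one routine, since Cases 1 to 3 are obtained from the Case 4 expression when $Sv$ is empty, when v has no neighbor in S, or both. The sketch below transcribes those expressions, using the product over $(1 - w (s, u))$ as derived in equations (16) and (17). It is a hedged illustration: the name `expected_gain`, the dictionary-based graph representation, and the toy weights are hypothetical, and treating $w (s, v)$ as zero for non-adjacent pairs is an assumption.

```python
def expected_gain(ngb, w, seeds, v):
    """E[I(S + v) - I(S)] following equations (6), (9), (11) and (18).

    ngb maps node -> set of neighbours; w maps frozenset({a, b}) -> weight.
    Non-adjacent pairs are treated as having weight 0 (an assumption).
    """
    def weight(a, b):
        return w[frozenset({a, b})]

    ngb_s = set().union(*(ngb[s] for s in seeds))  # ngb(S)
    common = (ngb_s & ngb[v]) - seeds              # Sv, excluding seeds
    adjacent_seeds = seeds & ngb[v]                # seeds adjacent to v

    # Neighbours of v outside Sv contribute their full weight.
    gain = sum(weight(y, v) for y in ngb[v] - common)
    # Common neighbours contribute only if no seed in S reaches them.
    for u in common:
        miss = 1.0
        for s in seeds & ngb[u]:
            miss *= 1.0 - weight(s, u)
        gain += weight(u, v) * miss
    # Mutual influence between v and adjacent seeds is subtracted twice.
    gain -= 2.0 * sum(weight(s, v) for s in adjacent_seeds)
    return gain


# Hypothetical toy network: a path 1 - 2 - 3 - 4 with assumed weights.
NGB = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
W = {
    frozenset({1, 2}): 0.5,
    frozenset({2, 3}): 0.4,
    frozenset({3, 4}): 0.9,
}

if __name__ == "__main__":
    print(expected_gain(NGB, W, set(), 2))  # Case 1
    print(expected_gain(NGB, W, {2}, 3))    # Case 2
    print(expected_gain(NGB, W, {2}, 4))    # Case 3
```

For example, with S = {2} and v = 4, the only neighbor of v is the common neighbor 3, so the gain is $w (3, 4) (1 - w (2, 3)) = 0.9 \times 0.6 = 0.54$ .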

In summary, the computation complexity of the proposed IC-based seed selection algorithm (ISA) in Algorithm 1 can be calculated as $O (k \cdot (N \cdot n))$ , where $k = ⌊ B / c (v) ⌋$ is the maximum number of seeds needed to be selected and also the number of iterations, N is the total number of vehicles in the network and n is the average number of the neighboring nodes for each vehicle.

Given the above equations, it is easy to select the node that has the largest marginal gain in each iteration of Algorithm 1. However, since vehicles travel at high speed, the IoV G changes significantly over time, resulting in a dynamic topology and intermittent connections, and we may lose useful traveling information if we only observe the current network G. Therefore, we examine the influence of vehicles by exploiting their mobility and incorporate mobility-related parameters into the computation of the influence spread for each node v. In this article, the mobility-based influence spread is defined as

I_{m} (v) = (p_{v} + d_{v}) I (v) + h_{v}

(20)

where $p_{v}$ is the number of popular tourist spots/business areas that vehicle v visited in the past few weeks, $d_{v}$ is the current distance from vehicle v to its destination, and $h_{v}$ is the number of ads that vehicle v has forwarded in the past. $I (v)$ , the influence spread of vehicle v at the current moment, is used as an average value for the influence spread at the places that vehicle v is going to visit. Note that the influence spread varies from place to place; for example, the influence spread at a tourist spot may be larger than the current one. This definition is an estimate based on the vehicle’s past behavior, not an exact calculation of the influence spread of vehicle v. It captures the intuition that the more tourist spots a vehicle visits, the longer the distance it is going to travel, and the more ads it has forwarded, the more vehicles it is likely to meet and the more ads it is likely to forward. Such a vehicle therefore has a higher mobility-based influence spread and a higher probability of being selected into the seed set. The computation of $I (S + v) - I (S)$ in Theorem 2 can be modified to $I_{m} (S + v) - I_{m} (S)$ by replacing $I (v)$ with $I_{m} (v)$ .
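Equation (20) is a simple weighted combination, and a short sketch may help to see how it reorders candidates. The vehicle records below are invented for illustration, the unit of $d_{v}$ is assumed to be kilometers (the article does not fix a unit), and the names `mobility_spread` and `rank_by_mobility` are hypothetical.

```python
def mobility_spread(p_v, d_v, h_v, spread_v):
    """I_m(v) = (p_v + d_v) * I(v) + h_v, as in equation (20)."""
    return (p_v + d_v) * spread_v + h_v


def rank_by_mobility(vehicles):
    """Sort vehicle ids by decreasing mobility-based influence spread."""
    return sorted(
        vehicles,
        key=lambda name: mobility_spread(**vehicles[name]),
        reverse=True,
    )


# Invented example records; d_v is assumed to be in kilometers.
VEHICLES = {
    "A": {"p_v": 3, "d_v": 12.0, "h_v": 5, "spread_v": 2.0},
    "B": {"p_v": 1, "d_v": 40.0, "h_v": 0, "spread_v": 1.5},
    "C": {"p_v": 6, "d_v": 2.0, "h_v": 10, "spread_v": 3.0},
}

if __name__ == "__main__":
    for name in rank_by_mobility(VEHICLES):
        print(name, mobility_spread(**VEHICLES[name]))
```

In this example vehicle C has the largest current spread $I (v)$ , yet vehicle B ranks first because of its long remaining trip. Since $p_{v}$ and $d_{v}$ are added before multiplying, the unit chosen for $d_{v}$ strongly affects the ranking.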

Note that this mobility information can be easily collected from vehicles, because providing it may increase a vehicle’s probability of being selected as a seed vehicle, and a selected seed vehicle gets paid. In light of security concerns, participants can choose to report categorical data rather than details such as street name and building/apartment number, in order to protect user privacy. For example, when reporting areas visited, instead of “89 East 42nd Street at Park Avenue, New York, NY 10017,” they can report “Manhattan, New York” or “Midtown, New York.” When reporting the current destination, instead of specifying its detailed address, they can specify the distance as “within 0–10 miles” or “more than 100 miles.”
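The categorical reporting of distances can be sketched as a simple bucketing step applied on the vehicle before anything is sent to the RSU. Only the two end categories come from the text; the intermediate "10-100 miles" bucket and the name `coarsen_distance` are assumptions made for illustration.

```python
def coarsen_distance(miles):
    """Map an exact distance to a coarse category before reporting it."""
    if miles < 0:
        raise ValueError("distance must be non-negative")
    if miles <= 10:
        return "within 0-10 miles"
    if miles <= 100:
        return "10-100 miles"  # assumed intermediate bucket
    return "more than 100 miles"


if __name__ == "__main__":
    for exact in (3.2, 42.0, 250.0):
        print(exact, "->", coarsen_distance(exact))
```

A coarse category would then need to be mapped back to a representative number (for instance a bucket midpoint) before it is used as $d_{v}$ ; how to do so is left open here.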

Thus, the modified mobility-based seed vehicle selection algorithm can be described as Algorithm 2. Its computation complexity is the same as that of Algorithm 1, namely $O (k \cdot (N \cdot n))$ .

Algorithm 2.

Mobility-based seed selection algorithm.

Input: G, B, b (v), c (v)

Output: seed set S.

1: $S \leftarrow \emptyset$

2: while $| S | < ⌊ B / c (v) ⌋$ do

3: $S \leftarrow S + \arg\max_{v \in V \setminus S} (E [I_{m} (S + v)] - E [I_{m} (S)])$ ;

4: end while

Performance evaluation

In this section, we evaluate the performance of the proposed algorithm through extensive simulations.

Experimental settings

In this article, simulations are conducted on a Windows machine with an Intel Core $3.3$ GHz CPU and 4 GB memory. Algorithms are implemented in Python.

Traffic scenario settings

We consider an IoV which consists of N vehicles, where N varies from 20 to 300. The vehicles travel at speeds ranging from 40 to 100 km/h on a real-world map of Helsinki (4500 m × 3400 m, as in Figure 3). Each vehicle has a randomly chosen destination and calculates its route with Dijkstra’s shortest path algorithm. When arriving at the destination, a vehicle pauses for a short random time and chooses another destination randomly on the map. This repeats until the end of the simulation. The communication range of vehicles is set to 200 m, and the ads expiration period $Δ T$ is set to 1 week unless otherwise specified.
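The route computation mentioned above can be sketched with a standard heap-based implementation of Dijkstra's algorithm. The road graph below is a made-up four-junction example with segment lengths in meters, not the Helsinki map, and the name `shortest_route` is hypothetical.

```python
import heapq


def shortest_route(graph, source, target):
    """Dijkstra's algorithm; graph maps node -> {neighbour: length_m}.

    Returns (route, length), or (None, inf) if target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        if node == target:
            break
        for nxt, length in graph[node].items():
            nd = d + length
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        return None, float("inf")
    route = [target]
    while route[-1] != source:
        route.append(prev[route[-1]])
    return route[::-1], dist[target]


# Made-up road graph; edge values are segment lengths in meters.
ROADS = {
    "A": {"B": 300, "C": 900},
    "B": {"A": 300, "C": 400, "D": 1000},
    "C": {"A": 900, "B": 400, "D": 200},
    "D": {"B": 1000, "C": 200},
}

if __name__ == "__main__":
    print(shortest_route(ROADS, "A", "D"))
```

In a simulation loop, a vehicle that reaches its destination would pause, draw a new random destination, and call the routine again.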

Figure 3.

Helsinki region considered for simulation.

Algorithms compared

We implement two state-of-the-art algorithms for performance comparison: a Monte Carlo simulation method,³⁴ which is a variant of the classical greedy algorithm of Kempe et al.,²⁴ and an algorithm based on reverse influence sampling by Borgs et al.³³ The two compared methods are denoted as CELF++ and RIS, respectively. For CELF++, the number of Monte Carlo steps is set to 10,000, following the standard practice in the literature, while for RIS, the number of hyperedges is set to 5000. For the other parameters, we take the recommended values in the corresponding papers if available.

Experimental results

Comparison of running time

Figure 4 reports the running time versus the number of nodes. The number of seeds is set to a medium size, namely a quarter of the number of nodes. The results show that the proposed ISA achieves the lowest running time. When N equals 300, the proposed algorithm is almost four times as fast as CELF++ and twice as fast as RIS.

Figure 4.

The running time on the IC-based model in different sizes of VANETs.

Comparison of delivery ratio

Figure 5 plots the delivery ratio of the algorithms when the network size varies within $[20, 300]$ and the number of seed vehicles ranges from 1/8 to 1/2 N, where the delivery ratio is defined as the percentage of nodes receiving ads out of the total nodes on the map. Overall, the delivery ratio increases when the seed set size increases. This is intuitive, as more seed nodes lead to more receivers. We can also observe that the proposed algorithm achieves a higher delivery ratio under different network sizes and seed set sizes. When the number of seed vehicles is $⌊ 1 / 8 N ⌋$ , the delivery ratio of the proposed algorithm is on average about 5% higher than that of the other two methods.
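The delivery ratio defined above reduces to a one-line computation. In the sketch below, the name `delivery_ratio` is hypothetical, and counting each receiving node once regardless of how many copies of the ad it receives is an assumption.

```python
def delivery_ratio(receivers, total_nodes):
    """Percentage of distinct nodes that received the ad."""
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    return 100.0 * len(set(receivers)) / total_nodes


if __name__ == "__main__":
    # Node 8 received the ad twice but is counted once.
    print(delivery_ratio([3, 4, 8, 8, 9], 20))
```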

Figure 5.

The delivery ratio of three algorithms when N varies: (a) N = 20, (b) N = 100, and (c) N = 300.

Figure 6 displays the delivery ratio when $\Delta T$ changes from 1 day to 1 month, where N is set to 20, 100, and 300, and k is set from 1/10 to 1/2 of N, respectively. We can observe that the delivery ratio grows significantly as $\Delta T$ increases. A longer ad lifetime leaves more time for seed vehicles to forward ads to other vehicles, resulting in a larger delivery ratio. The proposed algorithm achieves the highest delivery ratio in all scenarios.
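The effect of the expiry period can be illustrated with a small sketch that replays a contact trace and stops forwarding once the ad has expired. It is a deliberate simplification and not the evaluation code: it assumes the ad is issued at time 0, that every contact within range forwards the ad with certainty (rather than with the interaction probability of the IC model), and the trace format and names are hypothetical.

```python
# Hypothetical contact trace: (time in hours, vehicle u, vehicle v).
CONTACTS = [(1.0, "s", "a"), (5.0, "a", "b"), (9.0, "b", "c")]

def spread_over_contacts(contacts, seeds, expiry):
    """Return the set of vehicles holding the ad when it expires.

    Contacts are processed in time order; a contact after `expiry`
    can no longer forward the ad.
    """
    holders = set(seeds)
    for t, u, v in sorted(contacts):
        if t > expiry:
            break
        if u in holders or v in holders:
            holders.update((u, v))
    return holders
```

With this trace, an expiry of 6 hours reaches three vehicles while an expiry of 10 hours reaches all four, which mirrors the trend reported for larger $\Delta T$.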

Figure 6.

The delivery ratio of three algorithms when ads have different expiry periods: (a) N = 20, (b) N = 100, and (c) N = 300.

Finally, Figure 7 illustrates the impact of considering mobility when selecting seed vehicles, with N varying from 20 to 300 and k ranging from 1/8 to 1/2 of N. Taking vehicle mobility into account improves the delivery ratio under the various network and traffic settings.

Figure 7.

The delivery ratio with and without considering vehicle mobility: (a) N = 20, (b) N = 100, and (c) N = 300.

Conclusion and future work

In this article, we investigated the problem of maximizing the benefit of ads dissemination in an IoV under a budget constraint. The problem is transformed into a seed selection problem on the IC model. Assuming constant individual benefit and cost, we put forward an IC-based seed selection algorithm so that the selected vehicles maximize the influence spread. By taking the mobility of vehicles into consideration, we proposed a mobility-based seed selection algorithm that fits the dynamic topology. Through extensive performance evaluation, we have demonstrated that the proposed algorithm achieves a higher delivery ratio and a shorter running time than the compared methods. This advantage holds across the tested network sizes, seed proportions, and ad expiry periods. We also showed that considering vehicle mobility improves the delivery ratio.

In this article, we assume that all vehicles are trusted and legitimate nodes. However, a malicious node may pretend to be a normal node in the IoV in order to carry out various attacks.³⁵ These attacks include wiretapping,³⁶ intercepting,³⁷ or even tampering with data containing personal and vehicle information,³⁸ while a user browses ads. Therefore, in future work we will study a security-mechanism-enabled ads dissemination scheme for the mobile IoT system.

Acknowledgements

The authors thank all reviewers who have helped in improving the quality of this paper.

Handling Editor: Carlos Calafate

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (Grant Nos 61571010 and 61871023), and the Fundamental Research Funds for the Central Universities (Grant No. 2019JBZ001).

ORCID iD

Qinghe Gao

References

1. Zhou, et al. Sample selected extreme learning machine based intrusion detection in fog computing and MEC. Wirel Commun Mob Com 2018; 2018: 7472095.

2. Wang, Yin, Quan, et al. Enabling collaborative edge computing for software defined vehicular networks. IEEE Netw 2018; 32(5): 112–117.

3. Lin, You, et al. A novel utility based resource management scheme in vehicular social edge computing. IEEE Access 2018; 6: 66673–66684.

4. Dong, Ota, et al. Fog-computing-enabled cognitive network function virtualization for an information-centric future internet. IEEE Commun Mag 2019; 57(7): 48–54.

5. Lin, et al. Making knowledge tradable in edge-AI enabled IoT: a consortium blockchain-based efficient and incentive approach. IEEE T Ind Inform. Epub ahead of print 16 May 2019. DOI: 10.1109/TII.2019.2917307.

6. Yang, Wang, et al. An overview of Internet of vehicles. China Commun 2014; 11(10): 1–15.

7. Zhang, Mao, Leng, et al. Mobile-edge computing for vehicular networks: a promising network paradigm with predictive off-loading. IEEE Veh Technol Mag 2017; 12(2): 36–44. DOI: 10.1109/MVT.2017.2668838.

8. Jing, Zeng, Qian, et al. An optimal multiple stopping rule-based cooperative downloading scheme in vehicular cyber-physical systems. Int J Distrib Sens N 2017; 13(3): 699840.

9. Huo, Dong, Qian, et al. Coalition game-based secure and effective clustering communication in vehicular cyber-physical system (VCPS). Sensors 2017; 17(3): 1–23.

10. Lin, Zhou, et al. Fair resource allocation in an intrusion-detection system for edge computing: ensuring the security of internet of things devices. IEEE Consum Elec Mag 2018; 7(6): 45–50.

11. Huo, Yong. Re-ADP: real-time data aggregation with adaptive ω-event differential privacy for fog computing. Wirel Commun Mob Com 2018; 2018: 6285719.

12. Mao, Zhang, et al. A position-aware Merkle tree for dynamic cloud data integrity verification. Soft Comput 2017; 21(8): 2151–2164.

13. Huo, et al. LoDPD: a location difference-based proximity detection protocol for fog computing. IEEE Internet Things 2017; 4(5): 1117–1124.

14. Kang, Huang, et al. Blockchain for secure and efficient data sharing in vehicular edge computing and networks. IEEE Internet Things 2019; 6(3): 4660–4670.

15. Mao, Tian, Jiang, et al. Understanding structure-based social network de-anonymization techniques via empirical analysis. EURASIP J Wirel Comm 2018; 2018(1): 279.

16. Luo, Yuan, Zhou, et al. Cooperative vehicular content distribution in edge computing assisted 5G-VANET. China Commun 2018; 15(7): 1–17.

17. Kenney. Dedicated short-range communications (DSRC) standards in the United States. P IEEE 2011; 99(7): 1162–1182.

18. Huo, Fan, et al. A novel secure relay selection strategy for energy-harvesting-enabled Internet of things. EURASIP J Wirel Comm 2018; 2018: 264.

19. Huo, Liu, et al. A coalition formation game based relay selection scheme for cooperative cognitive radio networks. Wirel Netw 2017; 23(8): 2533–2544.

20. Tang, Xiao, Shi. Influence maximization: near-optimal time complexity meets practical efficiency. In: Proceedings of the 2014 ACM SIGMOD international conference on management of data, Snowbird, UT, 22–27 June 2014, pp.75–86. New York: ACM.

21. Zhou, Gao, et al. Social big-data-based content dissemination in internet of vehicles. IEEE T Ind Inform 2018; 14(2): 768–777.

22. Nguyen, Dinh, Thai. Cost-aware targeted viral marketing in billion-scale networks. In: Proceedings of the IEEE INFOCOM 2016 - the 35th annual IEEE international conference on computer communications, San Francisco, CA, 10–14 April 2016, pp.1–9. New York: IEEE.

23. Chen, Wang, Yang. Efficient influence maximization in social networks. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, Paris, 28 June–1 July 2009, pp.199–208. New York: ACM.

24. Kempe, Kleinberg, Tardos. Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, Washington, DC, 24–27 August 2003, pp.137–146. New York: ACM.

25. Lin, Shen. SPRING: a social-based privacy-preserving packet forwarding protocol for vehicular delay tolerant networks. In: 2010 proceedings IEEE INFOCOM, San Diego, CA, 14–19 March 2010, pp.1–9. New York: IEEE.

26. Kim, Velasco, Wang, et al. A new comprehensive RSU installation strategy for cost-efficient VANET deployment. IEEE T Veh Technol 2017; 66(5): 4200–4211.

27. Chuang, Lin. Cellular traffic offloading through community-based opportunistic dissemination. In: Proceedings of the 2012 IEEE wireless communications and networking conference (WCNC), Shanghai, China, 1–4 April 2012, pp.3188–3193. New York: IEEE.

28. Mezghani, Dhaou, Nogueira, et al. Offloading cellular networks through V2V communications - how to select the seed-vehicles? In: Proceedings of the 2016 IEEE international conference on communications (ICC), Kuala Lumpur, Malaysia, 22–27 May 2016, pp.1–6. New York: IEEE.

29. Zhang, Shi, Willson, et al. Viral marketing with positive influence. In: Proceedings of the IEEE INFOCOM 2017 - IEEE conference on computer communications, Atlanta, GA, 1–4 May 2017, pp.1–8. New York: IEEE.

30. Smith, Dinh, et al. Why approximate when you can get the exact? Optimal targeted viral marketing at scale. In: Proceedings of the IEEE INFOCOM 2017 - IEEE conference on computer communications, Atlanta, GA, 1–4 May 2017, pp.1–9. New York: IEEE.

31. Nemhauser, Wolsey, Fisher. An analysis of approximations for maximizing submodular set functions-I. Mathematical Programming 1978; 14(1): 265–294.

32. Chen, Wang. Scalable influence maximization for prevalent viral marketing in large-scale social networks. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, Washington, DC, 25–28 July 2010, pp.1029–1038. New York: ACM.

33. Borgs, Brautbar, Chayes, et al. Maximizing social influence in nearly optimal time. In: Proceedings of the twenty-fifth annual ACM-SIAM symposium on discrete algorithms, Portland, OR, 5–7 July 2014, pp.946–957. New York: ACM.

34. Goyal, Lakshmanan. CELF++: optimizing the greedy algorithm for influence maximization in social networks. In: Proceedings of the 20th international conference companion on world wide web, Hyderabad, India, 28 March–1 April 2011, pp.47–48. New York: IEEE.

35. Mao, Bian, Bai, et al. Detecting malicious behaviors in JavaScript applications. IEEE Access 2018; 6: 12284–12294.

36. Huo, Tian, et al. Jamming strategies for physical layer security. IEEE Wirel Commun 2018; 25(1): 148–153.

37. Asgari, Haines, Rysavy. Identification of threats and security risk assessments for recursive internet architecture. IEEE Syst J 2018; 12(3): 2437–2448.

38. Jia, Chen, Dong, et al. Man-in-the-browser-cache: persisting HTTPS attacks via browser cache poisoning. Comput Secur 2015; 55: 62–80.