A distributed algorithm for maximizing utility of data collection in a crowd sensing system

Abstract

Mobile crowd sensing harnesses the data sensing capability of individual smartphones, underpinning a variety of valuable knowledge discovery, environment monitoring, and decision-making applications. A central issue for a mobile crowd sensing system is to maximize the utility of sensing data collection at a given cost of resource consumption at each smartphone. This is particularly challenging. On the one hand, the utility of sensing data from a smartphone usually depends on its context, which is random and varies over time. On the other hand, because of the marginal effect, the sensing decision of a smartphone also depends on the decisions of other smartphones. Little work has explored the utility maximization problem of sensing data collection. This article proposes a distributed algorithm for maximizing the utility of sensing data collection when the smartphone cost is constrained. The design of the algorithm is inspired by the stochastic network optimization technique and distributed correlated scheduling. It does not require any prior knowledge of future smartphone contexts, and hence sensing decisions can be made by individual smartphones. Rigorous theoretical analysis shows that the proposed algorithm can achieve a time average utility that is within O(1/V) of the theoretical optimum.

Keywords

Mobile crowd sensing, utility maximization, smartphone, online algorithm, distributed algorithm, cost constraint

Introduction

Over the past decades, mobile phones have become an indispensable part of daily life for almost everyone. Most smartphones embed a rich set of built-in sensors, such as accelerometer, gyroscope, microphone, global positioning system (GPS), and camera.¹ As a consequence, it has become far easier to collect sensing information about one’s surroundings and to share it. As a new compelling paradigm for large-scale sensing data collection and sharing, mobile crowd sensing² harnesses the data collection capability of individual smartphones, underpinning a variety of valuable knowledge discovery, environment monitoring, and decision-making applications. A number of exciting applications based on mobile crowd sensing have been explored, for example, noise mapping^3,4 and personal environmental impact analysis.⁵

There are two types of mobile crowd sensing, depending on the way of node participation, that is, participatory sensing and opportunistic sensing.^1,6 Participatory sensing requires participants to actively engage in sensing activities by manually determining how, when, what, and where to sense. In opportunistic sensing, however, sensing activities are typically automated, without requiring user intervention to actively and consciously perform sensing tasks. In practice, opportunistic sensing applications may run in the background and the phone users may not be aware of active execution of sensing applications. In other words, opportunistic applications are usually transparent to phone users. The benefit of opportunistic sensing is that it significantly lowers the burden of phone users, allowing higher participation, which is crucial for wide adoption of mobile crowd sensing.

This article concentrates on mobile crowd sensing based on opportunistic sensing. A mobile crowd sensing system (Figure 1) consists of a central data collection server and a number of smartphones. Each smartphone opportunistically collects sensing data in its vicinity and reports the sensing data to the central collection server, which then applies data analytics algorithms for monitoring or decision-making purposes. The objectives of such a mobile crowd sensing system include larger sensing data volume, higher data quality, and lower cost incurred at smartphones for sensing data collection. We do not consider strategic behaviors of smartphone users and assume that smartphones cooperate in sensing data collection. Such mobile crowd sensing systems are practical in the real world, for example, when smartphone users are volunteers or members of the same organization. Mobile crowd sensing systems with strategic smartphones are beyond the scope of this work.

Figure 1.

An illustration of mobile crowd sensing systems. Smartphones perform sensing tasks and then report sensing data to the data collection server via cellular networks.

A central issue for a mobile crowd sensing system is to gather high-quality sensing data with low resource consumption at smartphones. We observe that the utility of sensing data collected by a smartphone may depend on the phone context under which it collects the data.¹ The phone context typically varies over time and can be random in nature. In a large noise detection and monitoring application, for example, the utility of acoustic sensing data is larger when the smartphone is out of the pocket. As another example, in a time-sensitive road traffic monitoring application, the utility of a sensed road traffic condition is smaller when the smartphone has a poor network connection, because reporting incurs a long delay. Meanwhile, performing sensing and reporting sensing data to the system costs a smartphone non-negligible resources (e.g. energy, central processing unit (CPU), and bandwidth). A smartphone is powered by a battery, and its computing power is typically limited. As a result, it is important for smartphones to sense at appropriate times, so that better data are collected at lower cost. More importantly, a mobile crowd sensing system can gather sensing data from many smartphones. Sensing data from different smartphones are partly redundant, which leads to the marginal effect.⁷ Therefore, a smartphone’s decision on data sensing and reporting should also take the decisions of other smartphones into account.

There are several major challenges for a mobile crowd sensing system to maximize the utility of sensing data collection at a given cost of resource consumption at each smartphone. First, the context of each smartphone is random and varies over time, so future contexts are difficult, if not impossible, to predict. Second, a mobile crowd sensing system may have a large number of smartphones. A centralized solution that makes the sensing decision for each individual smartphone may introduce prohibitive computation and communication cost. Moreover, it would introduce a single point of failure. Finally, because of the marginal effect, the sensing decision of a smartphone also depends on the decisions of other smartphones.

Mobile crowd sensing has received increasingly extensive research attention. Unfortunately, little work has been done on maximizing the utility of sensing data collection from smartphones when the cost of smartphones is constrained. In particular, little work has noticed the dependence of sensing data utility on the actual context of a smartphone, which is random and changes over time. In addition, most of the existing work ignores the marginal effect of sensing data. As a consequence, most existing mobile crowd sensing systems and applications^3,8 have smartphones collect sensing data blindly, either periodically or randomly.

In this article, we propose a distributed algorithm for maximizing the utility of sensing data collection in a mobile crowd sensing system. To tackle the aforementioned challenges, we take advantage of the stochastic network optimization technique developed in Neely⁹ and the idea of distributed correlated scheduling¹⁰ to design a distributed online scheduling algorithm. It does not require any prior knowledge of smartphone contexts in the future, and hence sensing decisions can be made by individual smartphones. The algorithm first transforms the satisfaction of cost constraints to the stability of virtual queues. By defining a quadratic Lyapunov function, the algorithm continuously minimizes a drift-minus-utility expression to make sensing decisions.

Our major contributions are summarized as follows:

It is the first attempt, to the best of our knowledge, to explore the crucial problem of utility maximization of sensing data collection in a mobile crowd sensing system when the cost of smartphones is constrained.

We formulate the cost-constrained utility maximization problem as an online optimization problem in which the sensing action of individual smartphones is the online decision. We propose a distributed algorithm for solving the online optimization problem which allows each smartphone to make its own sensing decisions.

We perform rigorous theoretical analysis to show that our algorithm can achieve a time average utility that is within $O (1 / V)$ of the optimum with tradeoffs on the time required to converge to the cost constraints, for any $V > 0$ and can adapt to the mobility of mobile smartphones very well.

The remainder of this article is organized as follows. In the next section, related work is discussed. We formulate the system model and define the problem formally in section “Problem definition.” In section “Online scheduling algorithm,” we present the details of our distributed optimal online scheduling algorithm. We evaluate the performance of the algorithm based on simulations in section “Performance evaluation.” Finally, a brief conclusion of this work is given in section “Conclusion.”

Related work

Owing to the rapid growth in smartphone usage, mobile crowd sensing has become increasingly popular in recent years and has attracted extensive research attention from both academia and industry. The research trend started with the notion of participatory sensing, which requires user interaction to sense particular events. Research then evolved into opportunistic sensing, which broadens the vision of mobile crowd sensing by allowing the cooperation of multiple smartphones without requiring explicit interaction with users.

A great number of opportunistic sensing applications have been designed and implemented. For example, GeoServ⁸ is a scalable sensor networking platform where millions of users can participate in urban sensing and share location-aware information using always-on cellular data connections. Nericell¹¹ is a system that performs rich sensing by piggybacking on mobile phones that users carry with them in normal course. The system could be used to annotate traditional traffic maps with information such as the bumpiness of roads, and the noisiness and level of chaos in traffic. The recent work¹² presents sensor mobile enablement (SME), which is a lightweight standard for efficiently identifying, coding, and decoding heterogeneous sensing information on mobile devices. More examples can be found in a recent survey paper.¹³ But most of these applications do not consider the problem of limited mobile phone resources or information saturation.

Little existing work has studied efficient scheduling that achieves optimal utility under the marginal effect and the limited resources of smartphones. Moreover, most of the related work requires sufficient statistical knowledge and operates offline or relies on prediction. For example, in Sheng et al.,¹⁴ the authors study an energy efficiency problem in mobile crowd sensing and propose prediction-based algorithms to minimize the energy consumption at smartphones. In Zhu et al.,¹⁵ the authors develop a smartphone-based vehicular crowd sensing system that makes efficient use of limited 3G budgets to improve system performance. They propose a heuristic algorithm that uses statistical data to estimate whether a WiFi encounter is approaching so as to make decisions. The feasibility of these approaches depends heavily on the accuracy of predicting future patterns, and they cannot guarantee optimal performance. In comparison, our online scheduling algorithm does not require any prior knowledge of future patterns and can achieve a time average utility that is arbitrarily close to the optimum, in a distributed manner.

Problem definition

First, we summarize the key notations in Table 1.

Table 1.

Key notations.

N	The number of smartphones in the target region
$s_{i} (t)$	Phone context of the ith smartphone in time slot t
$c_{i}$	Time average cost constraint on the ith smartphone
$a_{i} (t)$	Sensing decision for the ith smartphone in time slot t
$p_{i} (t)$	Cost of the ith smartphone in time slot t
$u (t)$	Utility produced for the target region in time slot t
$R_{i}$	The trust of the ith smartphone

We consider a typical opportunistic sensing scenario in which each smartphone automatically performs sensing tasks (e.g. measuring noise level or radio signal strength) and reports sensing data to a remote server without user involvement.^16–20 In large-scale sensing applications, smartphones are usually organized into target regions according to their geographic locations,^3,5,8,21 for efficient data management. A target region is the area around a sensing target. For example, in the noise mapping application Ear-Phone,³ the physical area is divided into small regions of size 100 m × 100 m. The sensing target is the noise of each region, and smartphones in the same region sense the noise of that region together. As another example, in road traffic monitoring applications, the sensing targets are the traffic conditions of each road. The target region is then the road, and all smartphones on the road sense the traffic condition of that road. Sensing data from different smartphones in the same target region are partly redundant, since they all collect sensing data for the same sensing target. Since the scheduling problems in different regions are similar, we focus on the scheduling algorithm in one target region, which can be easily extended to the others.

Consider that the mobile crowd sensing system operates over discrete time with unit time slots $t \in \{0, 1, 2, \dots\}$. There are N smartphones in the target region. Let $s_{i}(t) \in S$ denote the phone context of the ith smartphone in time slot t, where S is the set of possible phone contexts. Suppose that $s_{i}(t)$ is independent and identically distributed over time slots. (The contexts of different smartphones may be correlated within a time slot. The assumption is realistic if the size of the time slot is appropriate. We will show that our algorithm does not require any knowledge of the probabilities and can adapt if they change.) As explained in section “Introduction,” the phone context is random and can affect the utility of the sensing data, since the context may not meet the sensing application’s request. A large value of $s_{i}(t)$ indicates that the phone context in time slot t is close to the application’s request, whereas a small value indicates that it is far from the request. Take the noise map application as an example: the application only wants to take a sound sample when the phone is out of the pocket. Then, $S = \{0, 1\}$ and $s_{i}(t)$ is a binary value: $s_{i}(t) = 1$ means the phone is out of the pocket and $s_{i}(t) = 0$ means the phone is in the pocket.

Suppose that the phone context can be detected automatically by sensors (e.g. accelerometer and gyroscope).¹ In every slot, each smartphone detects the current phone context automatically and chooses whether or not to perform a sensing task and report to the remote server. We use a binary variable $a_{i}(t) \in \{0, 1\}$ to represent the sensing decision for the ith smartphone in time slot t, so that $a_{i}(t) = 1$ if the ith smartphone performs a sensing task in slot t, and $a_{i}(t) = 0$ otherwise. Define the vectors $s(t) = (s_{1}(t), s_{2}(t), \dots, s_{N}(t))$ and $a(t) = (a_{1}(t), a_{2}(t), \dots, a_{N}(t))$. Then, the utility produced by smartphones in the target region in slot t is denoted by $u(t)$

$$u(t) = \hat{u}(s(t), a(t)) = \min\left[\sum_{i=1}^{N} s_{i}(t)\, a_{i}(t)\, R_{i},\; U^{*}\right] \tag{1}$$

where $U^{*}$ is a constant and $R_{i}$ represents how much the system trusts the ith smartphone according to its hardware level. Such a utility function is a special case of the marginal effect and models the realistic scenario of information saturation: once the utility cap $U^{*}$ in equation (1) is reached by one or more smartphones in slot t, there is no advantage in having other smartphones perform sensing tasks and report for the target region in that slot. Suppose that each sensing task and report incurs one unit of cost (e.g. power and data traffic consumption) at a smartphone. Let $p_{i}(t)$ be the cost of the ith smartphone in slot t, being 1 if it performs a sensing task and reports, and 0 otherwise. Then, the cost for smartphone $i \in \{1, 2, \dots, N\}$ in slot t is

$$p_{i}(t) = a_{i}(t) \tag{2}$$
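As a minimal sketch of equations (1) and (2), the saturating utility and the unit cost can be written as two small Python functions. The function names and the sample context vector `s` are illustrative assumptions; the trust weights are borrowed from the evaluation section.

```python
from typing import Sequence


def utility(s: Sequence[float], a: Sequence[int],
            R: Sequence[float], u_star: float) -> float:
    """Saturating utility of equation (1): min(sum_i s_i * a_i * R_i, U*)."""
    raw = sum(si * ai * ri for si, ai, ri in zip(s, a, R))
    return min(raw, u_star)


def cost(a_i: int) -> int:
    """Unit cost of equation (2): p_i(t) = a_i(t)."""
    return a_i


# Illustrative values: trust weights from the evaluation section,
# and a hypothetical context vector s(t).
R = [0.4, 0.3, 0.2, 0.1, 0.1]
s = [2, 1, 0, 2, 1]

only_first = utility(s, [1, 0, 0, 0, 0], R, u_star=1.0)  # 2 * 0.4, below the cap
everyone = utility(s, [1, 1, 1, 1, 1], R, u_star=1.0)    # raw sum 1.4, capped at U*
print(only_first, everyone)
```

In the second call all five smartphones pay one unit of cost, yet the utility is capped at $U^{*}$, which is the redundancy that the scheduling algorithm tries to avoid.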

Each smartphone can choose not to sense and report in order to save the cost. The time average expected utility and cost are denoted by $\bar{u}$ and ${\bar{p}}_{i}$ , respectively

$$\begin{aligned} \bar{u} &= \lim_{t \to \infty} \frac{1}{t} \sum_{\tau = 0}^{t - 1} E[u(\tau)] \\ \bar{p}_{i} &= \lim_{t \to \infty} \frac{1}{t} \sum_{\tau = 0}^{t - 1} E[p_{i}(\tau)] \end{aligned}$$

We then define the cost-constrained utility maximization problem as follows

$$\text{Maximize:} \quad \bar{u} \tag{3}$$

$$\text{s.t.:} \quad \bar{p}_{i} \leq c_{i}, \quad \forall i \in \{1, 2, \dots, N\} \tag{4}$$

where $c_{i}$ are a given set of real numbers which specify constraints on time average cost of smartphones.

We see that it is quite challenging to achieve the maximal time average utility under the time average cost constraint at each smartphone: the phone context is random and time-varying, which makes it infeasible to precisely calculate an optimal solution offline, and the current decision is coupled with future decisions through the constraint. Moreover, the problem is significantly more challenging to solve in a distributed manner, because no smartphone knows the phone contexts of the others in the target region. Thus, a distributed scheduling algorithm may have redundant smartphones sense and send reports, which incurs cost without increasing utility. In the next section, we provide a distributed online algorithm that enables each smartphone to make the optimal sensing decision.

Online scheduling algorithm

The problem defined by (3) and (4) is a standard stochastic network optimization problem, which can be solved by the Lyapunov optimization technique⁹ in a centralized manner. Such a centralized method requires the remote server to act as the coordinator and make sensing decisions for all smartphones in each region based on full knowledge of phone contexts, in every time slot. This method does not scale as the number of regions grows, since the server needs to make sensing decisions for every region. Therefore, in this section, we propose a distributed approach that enables sensing decisions to be made by each smartphone, based on the idea of distributed correlated scheduling.¹⁰

Distributed optimal scheduling algorithm

In each time slot, each smartphone detects its phone context automatically and decides whether or not to perform a sensing task and report. Let $\hat{a}_{i}(s_{i}) \in \{0, 1\}$, $s_{i} \in S$, denote the pure strategy of the ith smartphone. Define a vector-valued function $\hat{a}(s) = (\hat{a}_{1}(s_{1}), \hat{a}_{2}(s_{2}), \dots, \hat{a}_{N}(s_{N}))$ specifying a distributed decision rule where each smartphone i chooses its sensing decision $a_{i}$ as a deterministic function of $s_{i}$, that is, $a_{i} = \hat{a}_{i}(s_{i})$. The total number of pure strategy functions $\hat{a}(s)$ is $\prod_{i=1}^{N} 2^{|S|}$. However, the set of pure strategy functions can be pruned to a smaller set. Intuitively, most of the strategies are inefficient since they may choose $a_{i} = 1$ when $s_{i}$ is small but $a_{i} = 0$ when $s_{i}$ is large. Therefore, the strategy function of each smartphone i can be restricted to the following threshold form

$$\hat{a}_{i}(s_{i}) = \begin{cases} 0, & \text{if } s_{i} \leq s_{i}^{*} \\ 1, & \text{if } s_{i} > s_{i}^{*} \end{cases} \tag{5}$$

for some thresholds $s_{i}^{*} \in S$. Since there are $|S|$ such threshold functions for each smartphone i, the number of pure strategy functions $\hat{a}(s)$ is reduced to $M = \prod_{i=1}^{N} |S| = |S|^{N}$. It is shown in Neely¹⁰ that considering only this smaller set of strategy functions incurs no loss of optimality. Enumerate these functions as $\hat{a}^{[m]}(s)$ for $m \in \{1, 2, \dots, M\}$. The idea of distributed correlated scheduling is that in each time slot, smartphones in the target region choose a strategy function from the set $\{\hat{a}^{[1]}(s), \hat{a}^{[2]}(s), \dots, \hat{a}^{[M]}(s)\}$ in a distributed but correlated manner.
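To make the pruned strategy set concrete, the following sketch enumerates all threshold strategies as tuples of per-smartphone thresholds and applies one of them to a context vector. The names `strategies` and `apply_strategy`, and the small value of `N`, are illustrative assumptions rather than part of the original algorithm description.

```python
from itertools import product

S = (0, 1, 2)  # context set used in the evaluation section
N = 3          # hypothetical small N to keep the enumeration short

# Each pure strategy is a tuple (s_1^*, ..., s_N^*) of thresholds, one per phone.
strategies = list(product(S, repeat=N))
print(len(strategies), len(S) ** N)  # both equal M = |S|^N


def apply_strategy(thresholds, s):
    """Threshold rule of equation (5): a_i = 1 if s_i > s_i^*, else 0."""
    return tuple(1 if si > th else 0 for si, th in zip(s, thresholds))


# Phone 1 senses (1 > 0); phones 2 and 3 stay idle (1 <= 1 and 2 <= 2).
print(apply_strategy((0, 1, 2), (1, 1, 2)))
```

Note that a threshold equal to the largest context value means the smartphone never senses, so the "all idle" rule is always part of the enumerated set.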

Suppose that all smartphones receive a feedback message specifying the values of $s_{1}(t), s_{2}(t), \dots, s_{N}(t)$ and $p_{1}(t), p_{2}(t), \dots, p_{N}(t)$ before the end of time slot $t + D$, where D is the system delay, which is at least one time slot. This assumption is realistic for a distributed implementation, and any mechanism for delivering the feedback message can be used, for example, piggybacking. Then, a virtual queue $Q_{i}(t)$ is defined and updated by

$$Q_{i}(t + 1) = \max\left[Q_{i}(t) + p_{i}(t - D) - c_{i},\; 0\right] \tag{6}$$

for each slot $t \in \{0, 1, 2, \dots\}$ and $i \in \{1, 2, \dots, N\}$, where $Q_{i}(0) = 0$ and $p_{i}(-1) = p_{i}(-2) = \dots = p_{i}(-D) = 0$. Each smartphone can update the above virtual queues based on the information available at the end of each time slot t. Therefore, all smartphones in the target region know the value of $Q_{i}(t)$ at the beginning of each time slot t. It can be proved according to Neely⁹ that stabilizing all virtual queues guarantees that the time average cost constraints (4) are satisfied. Define the virtual queue vector $Q(t) = (Q_{1}(t), Q_{2}(t), \dots, Q_{N}(t))$.
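The delayed virtual queue update of equation (6) can be sketched as follows. The helper names `update_queues` and `run_queues` and the layout of `p_history` are assumptions made for illustration; the zero cost for slots before 0 follows the initial conditions stated above.

```python
def update_queues(Q, p_delayed, c):
    """One step of equation (6): Q_i(t+1) = max[Q_i(t) + p_i(t-D) - c_i, 0]."""
    return [max(q + p - ci, 0.0) for q, p, ci in zip(Q, p_delayed, c)]


def run_queues(p_history, c, D):
    """Apply equation (6) for t = 0, ..., T-1, where p_history[t][i] = p_i(t).

    Costs of slots before 0 are taken to be zero, as in the text.
    Returns the queue vector Q(T).
    """
    N = len(c)
    Q = [0.0] * N  # Q_i(0) = 0
    for t in range(len(p_history)):
        p_delayed = p_history[t - D] if t - D >= 0 else [0] * N
        Q = update_queues(Q, p_delayed, c)
    return Q


# One phone that senses in every slot with c = 1/4 and D = 1:
# the queue stays at 0 after the first update, then grows by 0.75 per slot.
print(run_queues([[1], [1], [1]], c=[0.25], D=1))
```

A queue that keeps growing signals that the smartphone is spending more than its budget $c_{i}$, which is why the scheduling rule penalizes sensing by smartphones with large $Q_{i}(t)$.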

First, we define the Lyapunov function as follows

$$L(t) \triangleq \frac{1}{2} \sum_{i=1}^{N} Q_{i}(t)^{2} \tag{7}$$

Then, we define the Lyapunov drift as $\Delta(t) \triangleq L(t + 1) - L(t)$. Based on the techniques in Neely,^9,10 the algorithm chooses a strategy function in each time slot to greedily minimize an upper bound of the drift-minus-utility expression $E\{\Delta(t + D) - V u(t) \mid Q(t)\}$. The control parameter $V \geq 0$ is an importance weight on how much we emphasize utility maximization compared to satisfying the cost constraints at smartphones. The term $\Delta(t + D)$ differs from the standard Lyapunov optimization technique⁹ and is used because the virtual queues are updated with the delayed feedback message in equation (6). Intuitively, the algorithm keeps $\Delta(t + D)$ small to maintain queue stability, while the weighted utility term steers decisions toward a large utility. We have the following lemma regarding the drift-minus-utility expression.

Lemma 1

In each time slot t, we have

$$\begin{aligned} E\{\Delta(t + D) - V u(t) \mid Q(t)\} \leq{} & B(1 + 2D) - \sum_{i=1}^{N} c_{i} Q_{i}(t) \\ & + E\left\{\sum_{i=1}^{N} Q_{i}(t)\, p_{i}(t) - V u(t) \,\middle|\, Q(t)\right\} \end{aligned} \tag{8}$$

where $B = (1 / 2) \sum_{i = 1}^{N} c_{i}^{2}$ is a finite constant.

Proof

First, squaring both sides of equation (6) evaluated at slot $t + D$, and using the fact that $\max[x, 0]^{2} \leq x^{2}$, we have

$$\begin{aligned} Q_{i}(t + D + 1)^{2} - Q_{i}(t + D)^{2} \leq{} & (p_{i}(t) - c_{i})^{2} \\ & + 2 Q_{i}(t + D)(p_{i}(t) - c_{i}) \end{aligned}$$

Summing over $i \in \{1, 2, \dots, N\}$ and dividing by 2 yields

$$\begin{aligned} \Delta(t + D) \leq{} & \frac{1}{2} \sum_{i=1}^{N} (p_{i}(t) - c_{i})^{2} + \sum_{i=1}^{N} Q_{i}(t + D)(p_{i}(t) - c_{i}) \\ ={} & \frac{1}{2} \sum_{i=1}^{N} (p_{i}(t) - c_{i})^{2} + \sum_{i=1}^{N} Q_{i}(t)(p_{i}(t) - c_{i}) \\ & + \sum_{i=1}^{N} (Q_{i}(t + D) - Q_{i}(t))(p_{i}(t) - c_{i}) \end{aligned} \tag{9}$$

Moreover, by defining $B = (1 / 2) \sum_{i = 1}^{N} c_{i}^{2}$ we have

$$\begin{aligned} \frac{1}{2} \sum_{i=1}^{N} E\{(p_{i}(t) - c_{i})^{2} \mid Q(t)\} &\leq B \\ \sum_{i=1}^{N} E\{(Q_{i}(t + D) - Q_{i}(t))(p_{i}(t) - c_{i}) \mid Q(t)\} &\leq 2BD \end{aligned}$$

Taking conditional expectations on both sides of inequality (9), applying the above two inequalities, and subtracting $V E\{u(t) \mid Q(t)\}$ from both sides, we see that Lemma 1 holds. Owing to the page limit, we omit the proofs of the above two inequalities.
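The first step of the proof, namely the per-queue bound obtained by squaring equation (6), can be sanity-checked numerically. The sketch below is only an empirical check on random inputs, not a proof; the sampling ranges, the tolerance, and the function names are assumptions.

```python
import random


def queue_step(q, p, c):
    """Single-queue form of equation (6): max[q + p - c, 0]."""
    return max(q + p - c, 0.0)


def check_drift_bound(trials=10000, seed=0, tol=1e-9):
    """Check q_next^2 - q^2 <= (p - c)^2 + 2 q (p - c) on random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        q = rng.uniform(0.0, 50.0)  # virtual queues are non-negative
        p = rng.choice((0, 1))      # unit cost, equation (2)
        c = rng.uniform(0.0, 1.0)   # hypothetical cost budget
        lhs = queue_step(q, p, c) ** 2 - q ** 2
        rhs = (p - c) ** 2 + 2.0 * q * (p - c)
        if lhs > rhs + tol:         # tolerance absorbs floating-point rounding
            return False
    return True


print(check_drift_bound())
```

The bound holds with equality whenever the maximum in equation (6) is not active, and is strict only when the queue is clipped at zero.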

The drift-minus-utility algorithm chooses a pure strategy function $\hat{a}^{[m]}(s)$ from the set $\{\hat{a}^{[1]}(s), \hat{a}^{[2]}(s), \dots, \hat{a}^{[M]}(s)\}$ to greedily minimize an upper bound of the expression $E\{\Delta(t + D) - V u(t) \mid Q(t)\}$, that is, to minimize the right-hand side of (8), in each time slot t. Since no smartphone knows the phone contexts of the others in slot t (i.e. $s(t)$), the value of the right-hand side of (8) under a candidate strategy $\hat{a}^{[m]}(s)$ cannot be calculated. However, the delayed information $s(t - D)$ is available at the end of time slot t. Based on the idea in Neely et al.,²² the expectations of $p_{i}(t)$ and $u(t)$ under strategy $\hat{a}^{[m]}(s)$ can be approximated as follows

$$\begin{aligned} \tilde{p}_{i}^{[m]}(t) &= \frac{1}{W} \sum_{w=1}^{W} \hat{a}_{i}^{[m]}(s_{i}(t - D - w)) \\ \tilde{u}^{[m]}(t) &= \frac{1}{W} \sum_{w=1}^{W} \hat{u}\big(s(t - D - w),\, \hat{a}^{[m]}(s(t - D - w))\big) \end{aligned}$$

where W is a positive integer that represents the moving average window size.
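The two moving-average estimates can be sketched as a single function that replays a candidate threshold strategy over the W delayed context samples. The function name `window_estimates` and the convention of passing the samples $s(t - D - w)$, $w = 1, \dots, W$, as a list are assumptions made for illustration.

```python
def window_estimates(thresholds, samples, R, u_star):
    """Estimate (p_tilde, u_tilde) for one threshold strategy.

    thresholds: tuple (s_1^*, ..., s_N^*) identifying the candidate strategy.
    samples:    list of delayed context vectors s(t-D-w), w = 1..W.
    """
    W = len(samples)
    N = len(thresholds)
    p_tilde = [0.0] * N
    u_tilde = 0.0
    for s in samples:
        # Decisions the strategy would have made on this past context vector.
        a = [1 if si > th else 0 for si, th in zip(s, thresholds)]
        for i in range(N):
            p_tilde[i] += a[i] / W
        # Saturating utility of equation (1) on the replayed decisions.
        raw = sum(si * ai * ri for si, ai, ri in zip(s, a, R))
        u_tilde += min(raw, u_star) / W
    return p_tilde, u_tilde


# Two phones, W = 2: phone 1 would sense in both samples, phone 2 in one.
p_est, u_est = window_estimates((0, 1), [(2, 0), (1, 2)], R=(0.5, 0.5), u_star=1.0)
print(p_est, u_est)
```

Because every smartphone receives the same delayed feedback, each of them can evaluate these estimates locally and obtain the same values.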

Then, we can derive the distributed scheduling algorithm for each smartphone $i \in \{1, 2, \dots, N\}$, as detailed in Algorithm 1.

Algorithm 1: Distributed Optimal Scheduling Algorithm
Initialization: Set the parameters V and W. Initialize the virtual queue vector $Q(0) = 0$.
In each time slot t:
1: Smartphone i detects its phone context $s_{i}(t)$ and observes the queue vector $Q(t)$;
2: Smartphone i chooses the pure strategy function $\hat{a}^{[m]}(s)$ from the set $\{\hat{a}^{[1]}(s), \hat{a}^{[2]}(s), \dots, \hat{a}^{[M]}(s)\}$ that minimizes the following expression
$$\sum_{i=1}^{N} Q_{i}(t)\, \tilde{p}_{i}^{[m]}(t) - V \tilde{u}^{[m]}(t) \tag{10}$$
3: Smartphone i applies the sensing decision $a_{i}(t) = \hat{a}_{i}^{[m]}(s_{i}(t))$;
4: Receive the delayed feedback specifying the values of $s_{1}(t - D), s_{2}(t - D), \dots, s_{N}(t - D)$ and $p_{1}(t - D), p_{2}(t - D), \dots, p_{N}(t - D)$ and update all virtual queues by equation (6).
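The steps of Algorithm 1 can be sketched as a small single-process simulation. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `simulate`, the default parameter values, the i.i.d. uniform contexts, the "stay idle until delayed feedback exists" start-up rule, and the first-minimizer tie-breaking are all choices made here. Since all smartphones share the same $Q(t)$ and the same delayed samples, one loop stands in for the identical computation that each smartphone would perform.

```python
import random
from itertools import product


def simulate(R, c, S=(0, 1, 2), u_star=1.0, V=10.0, W=10, D=2, T=300, seed=0):
    """Run a sketch of Algorithm 1 for T slots; return (avg utility, avg costs)."""
    rng = random.Random(seed)
    N = len(R)
    strategies = list(product(S, repeat=N))  # one threshold s_i^* per phone

    def act(th, s):   # threshold rule of equation (5)
        return [1 if si > thi else 0 for si, thi in zip(s, th)]

    def util(s, a):   # saturating utility of equation (1)
        return min(sum(si * ai * ri for si, ai, ri in zip(s, a, R)), u_star)

    Q = [0.0] * N
    s_hist, p_hist, total_u = [], [], 0.0
    for t in range(T):
        s = [rng.choice(S) for _ in range(N)]  # assumed i.i.d. uniform contexts
        # Delayed samples s(t-D-w), w = 1..W, limited to slots that exist.
        window = [s_hist[t - D - w] for w in range(1, W + 1) if t - D - w >= 0]
        best = tuple(max(S) for _ in range(N))  # assumption: idle until feedback exists
        best_val = float("inf")
        for th in (strategies if window else []):
            acts = [act(th, sw) for sw in window]
            p_tilde = [sum(a[i] for a in acts) / len(window) for i in range(N)]
            u_tilde = sum(util(sw, a) for sw, a in zip(window, acts)) / len(window)
            val = sum(q * p for q, p in zip(Q, p_tilde)) - V * u_tilde  # expression (10)
            if val < best_val:  # first minimizer wins, so all phones agree on m
                best, best_val = th, val
        a = act(best, s)  # phone i only needs its own s_i(t) for this step
        total_u += util(s, a)
        s_hist.append(s)
        p_hist.append(a)
        # Equation (6): update with the delayed cost p(t-D), zero before slot 0.
        p_del = p_hist[t - D] if t - D >= 0 else [0] * N
        Q = [max(q + p - ci, 0.0) for q, p, ci in zip(Q, p_del, c)]
    avg_cost = [sum(p[i] for p in p_hist) / T for i in range(N)]
    return total_u / T, avg_cost


avg_u, avg_cost = simulate(R=[0.5, 0.3, 0.2], c=[0.25, 0.25, 0.25])
print(avg_u, avg_cost)
```

The exhaustive search over all $|S|^{N}$ threshold tuples is kept for clarity; it is only practical for a small number of smartphones per target region, as in the evaluation setup.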

Performance analysis

We analyze the performance of the distributed optimal scheduling algorithm by the following theorem 1.

Theorem 1

For arbitrary phone contexts $s_{1} (t), s_{2} (t), \dots, s_{N} (t)$ , under Algorithm 1 with $V \geq 0$ and $W > 0$ , we have the following:

The gap between the time average utility achieved by Algorithm 1 and the optimal time average utility $u^{OPT}$ that can be achieved by any distributed algorithm is within $O(1/V)$, up to an $O(1/\sqrt{W})$ approximation term

$$\frac{1}{t} \sum_{\tau=0}^{t-1} E[u(\tau)] \geq u^{OPT} - O(1/\sqrt{W}) - \left(\frac{B(1 + 2D)}{V} + \frac{E[L(D)]}{Vt}\right) \tag{11}$$

The time average cost of each smartphone $i \in \{1, 2, \dots, N\}$ satisfies

$$\frac{1}{t} \sum_{\tau=0}^{t-1} E[p_{i}(\tau)] \leq c_{i} + O\left(\sqrt{\frac{V}{t}}\right) \tag{12}$$

Proof

Owing to the page limit, the proof is omitted here.

Theorem 1 shows that, for a fixed window size W, our distributed scheduling algorithm can achieve a time average utility that is within $O(1/V)$ of the optimal value. Larger values of V push the time average utility closer to the optimum, but the tradeoff is that more time is needed for the time average cost of each smartphone to get close to the required cost constraint. Moreover, a larger value of W can increase the time average utility, at the price of longer computation time and more storage on smartphones.

Performance evaluation

In this section, we conduct simulations to evaluate our distributed online scheduling algorithm for mobile crowd sensing. Consider a target region with $N = 5$ smartphones. The random phone context $s_{i}(t)$ of each smartphone $i \in \{1, 2, 3, 4, 5\}$ takes values from the set $S = \{0, 1, 2\}$. That is, each smartphone has three possible contexts: $s_{i}(t) = 2$ means that smartphone i meets the sensing application’s request very well in time slot t, while $s_{i}(t) = 0$ means that it does not meet the request. Assume that $s_{i}(t)$, $i \in \{1, 2, 3, 4, 5\}$, are uniformly distributed over S. The trust values $R_{1}, \dots, R_{5}$ of the smartphones are set to [0.4, 0.3, 0.2, 0.1, 0.1], respectively. The time average cost constraints $c_{i}$ are set to 1/4 for all smartphones. We use $U^{*} = 1$ for the utility function (1). The default system delay for the feedback messages is $D = 10$, and the default value of W is 50. Each simulation is run for 1000 time slots.

First, we verify the utility optimality achieved by our algorithm. Figure 2 shows how the parameter V affects the time average utility for different values of W. We see that the utility improves significantly and converges quickly toward the optimum as the value of V increases. The impact of W is less pronounced: the utility improves only slightly when W increases from 10 to 50, and the improvement becomes negligible when W is increased further. Figure 3 shows similar results with a much larger system delay $D = 100$. Compared to Figure 2, we see that the time average utility may decrease if the system suffers a large delay in delivering feedback messages. Figure 4 illustrates the impact of $U^{*}$ in the utility function (1) on the time average utility: more utility can be achieved when information saturation is less severe. The curves for $U^{*} = 1.5$ and $U^{*} = 2$ look identical because the cost constraints at smartphones prevent them from performing more sensing tasks.

Figure 2.

Time average utility versus V.

Figure 3.

Time average utility versus V ( $D = 100$ ).

Figure 4.

Time average utility versus V ( $W = 50$ ).

Second, we verify whether the cost constraints at smartphones are satisfied. In Figure 5, the curves plot the time average cost up to time slot t of each smartphone (1–5). We can see that the time average cost of each smartphone satisfies the constraint $\bar{p}_{i} \leq c_{i} = 1/4$. The time average cost of smartphones 1 and 2 is larger than that of the others because they have higher trust, so the algorithm tends to schedule them to perform sensing tasks when their phone contexts meet the application’s request. Figure 6 demonstrates how the parameter V affects the time required to converge to the desired constraints. The curves plot the maximal time average cost among smartphones. This verifies that larger values of V push the time average utility closer to the optimum, with a tradeoff in the amount of time required for the time average cost to converge to the required constraint. Figure 7 shows that the constraints are still satisfied when we reduce the constraints of smartphones 1 and 3 to 1/5 and increase the constraints of smartphones 2 and 4 to 1/3.

Figure 5.

Time average cost up to t ( ${\bar{p}}_{i} (t)$ ) versus t ( $V = 10, W = 50$ ).

Figure 6.

$\max[\bar{p}_{1}(t), \bar{p}_{2}(t), \bar{p}_{3}(t), \bar{p}_{4}(t), \bar{p}_{5}(t)]$ versus t ( $W = 50$ ).

Figure 7.

Time average cost up to t ( ${\bar{p}}_{i} (t)$ ) versus t with modified constraints ( $V = 10, W = 50$ ).
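The quantity plotted in Figures 5 to 7, the time average cost up to slot t, is a running mean of the per-slot costs. A minimal sketch, assuming the costs of one smartphone are available as a list (the function name `running_average` is hypothetical):

```python
from itertools import accumulate


def running_average(p):
    """Given p[tau] = p_i(tau) for tau = 0..T-1, return the list of
    time averages (1/t) * sum_{tau < t} p_i(tau) for t = 1..T."""
    return [total / t for t, total in enumerate(accumulate(p), start=1)]


# A phone that senses in slots 0 and 3 only.
print(running_average([1, 0, 0, 1]))
```

Comparing the last entries of this list with $c_{i}$ is how constraint satisfaction can be read off such a curve.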

Adaption to changes

Next, we demonstrate that our algorithm can adapt to changes robustly. The simulation time is increased to 1500 slots which is divided into three phases. Each phase is of 500 time slots. Note that the phone context processes $s_{i} (t)$ are uniformly distributed over {0, 1, 2} for all smartphones in the above simulations. We keep that probability distribution in phase 1 and phase 3, but abruptly change the probabilities for smartphones 1–4 in phase 2, according to the following table.

Smartphone	$\Pr[s_{i} (t) = 0]$	$\Pr[s_{i} (t) = 1]$	$\Pr[s_{i} (t) = 2]$
$i = 1, 3$	0.8	0.1	0.1
$i = 2, 4$	0.5	0.5	0
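The phase-dependent context distributions can be sketched as follows. This is an illustrative sketch rather than the authors' simulator: the names `context_distribution` and `sample_context` are hypothetical, and time slots are assumed to be indexed from 0, so that phase 2 covers slots 500 to 999. Smartphone 5 keeps the uniform distribution throughout, since the table changes only smartphones 1–4.

```python
import random

PHASE_LENGTH = 500
UNIFORM = (1 / 3, 1 / 3, 1 / 3)

# Phase 2 probabilities from the table above, keyed by smartphone index.
PHASE2 = {
    1: (0.8, 0.1, 0.1),
    3: (0.8, 0.1, 0.1),
    2: (0.5, 0.5, 0.0),
    4: (0.5, 0.5, 0.0),
}


def context_distribution(i, t):
    """Return (Pr[s=0], Pr[s=1], Pr[s=2]) for smartphone i in slot t."""
    phase = t // PHASE_LENGTH + 1  # assumption: slots are 0-indexed
    if phase == 2:
        return PHASE2.get(i, UNIFORM)
    return UNIFORM


def sample_context(i, t, rng):
    """Draw the context s_i(t) from the distribution that holds in slot t."""
    return rng.choices((0, 1, 2), weights=context_distribution(i, t))[0]


rng = random.Random(0)
print([sample_context(2, t, rng) for t in range(500, 510)])
```

Passing an explicit `random.Random` instance keeps each run reproducible, which is convenient when results are later averaged over many independent runs.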

Figures 8 and 9 show the average utility and the average cost of smartphone 1 over the 1500 time slots. The value at each time slot t is obtained by averaging the utility or cost in that slot over 300 independent simulation runs. The system adapts quickly to the changes in the probability distribution of the smartphone contexts, settling at the new optimal average utility. The cost constraint remains satisfied, with only a small, short-lived disturbance.

Figure 8.

Average utility versus t.

Figure 9.

Average cost of smartphone 1 versus t.
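The per-slot values in Figures 8 and 9 are averages over independent runs rather than averages over time. A minimal sketch of that averaging step is given below. The callback `run_once` and the placeholder `toy_run` are hypothetical stand-ins for a single simulation run and do not implement the proposed algorithm.

```python
import random


def ensemble_average(run_once, num_runs, num_slots, seed=0):
    """Average a per-slot quantity over independent simulation runs.

    run_once(rng, num_slots) is a hypothetical callback that returns one
    value (utility or cost) per time slot for a single run.
    """
    totals = [0.0] * num_slots
    for r in range(num_runs):
        rng = random.Random(seed + r)  # separate generator for each run
        trace = run_once(rng, num_slots)
        for t, value in enumerate(trace):
            totals[t] += value
    return [total / num_runs for total in totals]


def toy_run(rng, num_slots):
    """Placeholder run (assumption): sense with probability 1/4 per slot."""
    return [1 if rng.random() < 0.25 else 0 for _ in range(num_slots)]


averaged = ensemble_average(toy_run, num_runs=300, num_slots=1500)
print(len(averaged))
```

Averaging across runs in each slot, instead of across slots within one run, is what lets the plotted curves react visibly at the phase boundaries.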

We also demonstrate adaptation to smartphone mobility. Mobile smartphones may leave or enter a target region over time. We simulate mobility by making smartphones 1 and 2 leave the target region in phase 2 and having another smartphone with a trust of 0.4 join in phase 3. Figures 10 and 11 show the results. The algorithm adapts quickly to the changes caused by mobility: the average utility moves rapidly to the new optimal value when a change occurs, and the cost constraint of smartphone 3 remains satisfied despite small disturbances.

Figure 10.

Average utility versus t after simulating mobility.

Figure 11.

Average cost of smartphone 3 versus t.

Conclusion

This article has presented a distributed algorithm for maximizing the utility of sensing data collection in a mobile crowd sensing system. The algorithm leverages both the stochastic network optimization technique and distributed correlated scheduling. It does not require any prior knowledge of future smartphone contexts and allows individual smartphones to make their own sensing decisions. Rigorous theoretical analysis shows that the proposed algorithm achieves a time average utility that is within $O (1 / V)$ of the optimum. Extensive simulations show that the proposed algorithm achieves high time average utility of the collected sensing data while satisfying the cost constraint of each smartphone.

Footnotes

Academic Editor: Joel JPC Rodrigues

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research was supported by the Wenzhou Science and Technology Bureau Program (application nos. 2016G0001 and 2016G0024). It was also supported by the Education Department of Zhejiang Province 2016 educational technology research project (General Project No. JB084).
