Sage Journals: Discover world-class research

Abstract

Encouraging a certain number of users to participate in a sensing task continuously for collecting high-quality sensing data under a certain budget is a new challenge in the mobile crowdsensing. The users’ historical reputation reflects their past performance in completing sensing tasks, and users with high historical reputation have outstanding performance in historical tasks. Therefore, this study proposes a reputation constraint incentive mechanism algorithm based on the Stackelberg game to solve the abovementioned problem. First, the user’s historical reputation is applied to select some trusted users for collecting high-quality sensing data. Then, the two-stage Stackelberg game is used to analyze the user’s resource contribution level in the sensing task and the optimal incentive mechanism of the server platform. The existence and uniqueness of Stackelberg equilibrium are verified by determining the user’s optimal response strategy. Finally, two conversion methods of the user’s total payoff are proposed to ensure flexible application of the user’s payoff in the mobile crowdsensing network. Simulation experiments show that the historical reputation of selected trusted users is higher than that of randomly selected users, and the server platform and users have good utility.

Keywords

Stackelberg game, mobile crowdsensing, historical reputation, optimal response strategy, incentive mechanism

Introduction

A mobile crowdsensing (MCS) network usually includes a cloud-based server platform (SP) and numerous users with smart sensor devices.¹ When the SP publishes a set of sensing tasks, several users are selected to perform them. MCS has recently been used in various fields, such as traffic information,² noise pollution,³ WiFi coverage information,⁴ and water pollution.⁵ However, the selected users need to spend their time and limited resources (such as energy, internal storage, and CPU computing power) when they perform sensing tasks. Thus, voluntary participation of users in sensing tasks may be unsustainable. To encourage users with smart devices to participate actively, the SP provides rewards that compensate for the cost of performing the sensing tasks.

Designing a justifiable incentive mechanism is challenging in the MCS network. The amount of information in the collected sensing data will be insufficient when the reward given by the SP is too small. However, the utility of the SP will decrease and its cost will increase when the SP gives more rewards to users. Therefore, the core problem of the MCS network is designing a rational and valuable incentive mechanism.⁶ The existing incentive mechanisms can be roughly divided into platform-centric and user-centered incentive mechanisms.⁷,⁸ The platform-centric incentive mechanism is designed to improve the information increment of the SP and reduce the reward to users.⁹ The user-centered incentive mechanism mainly increases the motivation of users to participate in sensing tasks and encourages users to collect sensing data on their own initiative.¹⁰ Therefore, this study designs an incentive mechanism that considers both the SP and the users. In this mechanism, the SP receives high-quality sensing data while the users acquire considerable payoff.

Users can submit random sensing data to obtain more rewards at the lowest cost in the MCS network system.¹¹ Moreover, dishonest users may deliberately send false sensing data to mislead the network system, which makes the result of the sensing task inaccurate. Therefore, the reputation of the users is a crucial parameter in the MCS system.¹² The users need to spend time and the limited resources of their sensor devices to complete the sensing task. Thus, no rational user will actively upload sensing data if the reward given by the SP is less than the cost that the user incurs in collecting the sensing data.

The reputation constraint incentive mechanism algorithm (RCIMA) is proposed. This study aims to design an incentive mechanism for maximizing the utility of the users and SP, and the users with high reputation will be encouraged to participate and collect high-quality sensing data in the sensing task. The primary contributions of this study are summarized as follows:

A resource contribution game algorithm (RCGA) based on Stackelberg game theory is proposed. The SP and users choose their optimal strategies to maximize their utility, and the existence of the Nash equilibrium point is proven in the Stackelberg game.

A reputation update method for the users is proposed. After the users upload the sensing data, the expectation–maximization (EM) algorithm is applied to evaluate the quality of sensing data collected by the users, and the SP updates the historical reputation of the users participating in the sensing task.

Two methods of reward conversion are proposed so that users can choose how to apply their reward. The first method uses the user’s total payoff as the total reward when he needs to publish tasks in the MCS. The second method converts the user’s total payoff into real currency. Thus, the payoff of users circulates more flexibly in the MCS.

The rest of the article is organized as follows. Section “Related works” presents various incentive mechanisms proposed in recent years. The MCS system model is introduced in section “System model.” Section “Details of RCIMA” describes RCIMA, which has four parts: selecting trusted users, RCGA, updating the reputation of each user, and incentive allocation. Section “Simulation results and analysis” presents the performance evaluation. The conclusion is presented in section “Conclusion.”

Related works

In recent years, incentive mechanisms have become a research hotspot in the field of MCS.⁷ Several researchers have applied distinct game models to design incentive mechanisms in the MCS system.¹³,¹⁴ The auction model is a universal mathematical method for designing incentive mechanisms. A good auction model needs to be individually rational, incentive-compatible, and budget-feasible.¹⁵ A long-term dynamic quality incentive mechanism is proposed to capture the dynamic nature of users’ data quality in Wang et al.¹⁶ The incentive mechanisms based on the auction model are studied considering privacy protection and social cost minimization in Lin et al.¹⁷ The SP selects users using a predefined scoring function, and the computational efficiency, individual rationality, truthfulness, and differential privacy of the algorithm can be guaranteed. The incentive mechanism based on Sybil-proof auction is studied to prevent Sybil attacks in Lin et al.¹⁸ A reverse auction-based incentive mechanism (RAIN) is proposed in Ji et al.,¹⁹ which considers participants’ potential contributions when recruiting new workers. An online auction algorithm that combines multi-attribute auction and reverse auction to dynamically select users is studied in Wang et al.¹⁵

Different game models have distinct goals for designing the incentive mechanism in addition to the auction model. Some scholars design incentive mechanisms based on Stackelberg game in MCS. The incentive mechanism considering the social network effect based on Stackelberg game theory is applied to analyze the relationship between users and service providers in Nie et al.¹ Stackelberg game theory is applied to design the incentive mechanism with user resource requirements as parameters, and the dynamic incentive mechanism based on the deep reinforcement learning method is studied without learning the user’s private information in Zhan et al.²⁰ A delay-sensitive MCS network technology is designed based on the Stackelberg game in Cheung et al.²¹ A three-stage Stackelberg game is proposed in the continuous time-varying scene of the MCS incentive mechanism in Li et al.²²

Most of the traditional incentive mechanisms only consider the utility of the SP and users. However, other factors also affect the sensing task results, such as the interests and historical reputation of the users. The reliability of the collected sensing data in the MCS system is also a concern.²³ According to reports, users can submit random sensing data to obtain more payoffs at the minimum cost when performing the sensing task.¹¹ Moreover, users with low reputation may upload false sensing data to affect the result of the sensing task.¹² Therefore, the SP should select trusted users to collect sensing data. The MCS system considers the contribution quality and reputation level of the user in the social network to obtain the reputation level of each user in Amintoosi and Kanhere.²⁴ The authors use the Gompertz function to evaluate the contributions of participating devices, and the reputation system calculates the new reputation based on the location and time of the users in Huang et al.²⁵ However, the incentive mechanism, which is the core of the MCS network system, is ignored in Amintoosi and Kanhere²⁴ and Huang et al.²⁵ The historical reputation of a user reflects its previous behavior,²⁶ and it is used as a parameter for selecting the users to minimize the threat from dishonest users. Therefore, the historical reputation of users is incorporated into the algorithm of our incentive mechanism.

In addition, scholars have also proposed some multi-attribute incentive mechanisms. A hybrid incentive mechanism based on blockchain technology is proposed, and this mechanism integrates data quality, reputation, and money factors to encourage users to collect sensing data while preventing malicious behavior in Wei et al.²⁷ However, the application problem of the reward obtained by the users when they perform the sensing task is always ignored in the MCS system. The users obtain the reward accordingly after performing a sensing task. The reward application of the users can enhance the flexibility of the MCS system.

On the basis of the abovementioned analysis, this study designs an RCIMA incentive mechanism based on the Stackelberg game in the MCS network. The SP selects trusted users to ensure the quality of the collected sensing data. Then, the Stackelberg game is employed to analyze the balance problem of the SP and the users. The EM algorithm is also utilized to evaluate the quality of the collected sensing data by users, and the SP updates the reputation of each user. Finally, two conversion methods of users’ total reward are proposed.

System model

The MCS network is mainly composed of the task publisher (TP), the SP, and the users. As shown in Figure 1, the execution process of the sensing task is as follows. First, the TP publishes the sensing task information and total reward R to the SP. The SP broadcasts the sensing task information to users equipped with the smart sensor device. The users interested in the task sign up for the sensing task, and the users’ set is U = {u₁, u₂, …, u_n}. Then, the SP selects some trusted users to participate in the task, and the selected users choose the optimal strategy to perform the sensing task. After the users complete the sensing task, they upload the sensing data to the SP. Finally, the SP updates the reputation of each user; the users are allocated reward in the sensing task. Besides, each user chooses a conversion method of reward to deal with the obtained reward.

Figure 1.

System model of crowdsensing.

The detailed process is presented as follows:

The TP publishes a sensing task and the total reward R to the SP;

If the users with a mobile smart device sensor are interested in the sensing task, then they will sign up to participate in the sensing task. The users’ set is U = {u₁, u₂, …, u_n};

SP uses users’ historical reputation to select the trusted users W = {w₁, w₂, …, w_m} (m ≤ n);

The SP and the users choose their optimal strategies by RCGA. A user performs the sensing task and submits data to the SP when the user has selected the optimal strategy and the resulting utility is greater than zero;

SP evaluates the quality of the sensing data, and the SP updates the reputation of users;

The users receive the reward allocated by SP, and users select a method to convert virtual currency.

The relationship between the SP and the users is constructed as a Stackelberg game model. The selected users’ set is W = {w₁, w₂, …, w_m}, and each user w_i ∈ W selects its resource contribution level X_i, where X_i ≥ 0. The user w_i chooses an optimal strategy X_i^* according to the total reward R provided by the SP in the sensing task. The resource contribution level strategy set of users is X = (X₁, …, X_m), and X_{−i} = (X₁, …, X_{i−1}, X_{i+1}, …, X_m) represents the strategies of all users excluding w_i. The resource contribution level X_i of user w_i is defined as follows.

Definition 1

The resource contribution level X_i is determined by the resource contribution coefficient β_i of the user w_i and the energy consumption ratio E_i’. That is

X_{i} = β_{i} E_{i}^{'}

(1)

Definition 2

The energy consumption ratio E_i’ is the ratio of the energy consumed by the user w_i in transmitting the sensing data to the SP to the remaining energy

E_{i}^{'} = \frac{E_{i}}{(E_{0} - E_{i})}

(2)

where $E_{i}^{'} \in (0, 1)$ . E_i is the consumed energy of user w_i in performing the sensing task. E₀ is the initial energy before performing the sensing task, and $E_{0} - E_{i}$ is the remaining energy after completing the sensing task. The energy consumption of each user mainly comes from the energy consumption of sending and receiving data in performing the sensing task. Thus, the other energy consumed by users is ignored.²⁸ Equation (3) represents the energy consumption of transmitting and receiving sensing data

E (k, d) = \begin{cases} k * E_{elect} + k * ε_{fs} * d^{2}, & d \leq d_{0} \\ k * E_{elect} + k * ε_{amp} * d^{4}, & d > d_{0} \end{cases}

(3)

where k*E_elect represents the energy consumed when sending and receiving k bits of sensing data. d₀ is the distance threshold equal to 87 m. ε_fs and ε_amp represent the amplifier power consumption of the free-space and multipath attenuation models, respectively. The free-space model is employed when the distance between the user and the SP is not greater than d₀, and the transmission power attenuates with d². Otherwise, the multipath attenuation model is used, and the transmission power attenuates with d⁴.
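Equations (1) to (3) can be evaluated numerically, as in the minimal sketch below. Only the threshold d₀ = 87 m comes from the text; the values of `E_ELECT`, `EPS_FS`, and `EPS_AMP` are illustrative assumptions commonly used with this kind of radio model, and the function names are hypothetical.

```python
# Illustrative radio parameters (assumptions; the text only fixes d0 = 87 m).
E_ELECT = 50e-9        # J/bit, electronics energy per bit (assumed)
EPS_FS = 10e-12        # J/bit/m^2, free-space amplifier coefficient (assumed)
EPS_AMP = 0.0013e-12   # J/bit/m^4, multipath amplifier coefficient (assumed)
D0 = 87.0              # m, distance threshold stated in the text


def transmission_energy(k_bits: int, d: float) -> float:
    """Equation (3): energy for k bits over a distance of d metres."""
    if d <= D0:
        return k_bits * E_ELECT + k_bits * EPS_FS * d ** 2
    return k_bits * E_ELECT + k_bits * EPS_AMP * d ** 4


def energy_ratio(e_used: float, e_initial: float) -> float:
    """Equation (2): consumed energy divided by the remaining energy."""
    remaining = e_initial - e_used
    if remaining <= 0:
        raise ValueError("consumed energy must be below the initial energy")
    return e_used / remaining


def contribution_level(beta: float, e_used: float, e_initial: float) -> float:
    """Equation (1): X_i = beta_i * E_i'."""
    return beta * energy_ratio(e_used, e_initial)
```

Under these assumed constants, sending 1000 bits over 50 m uses the free-space branch, while 100 m uses the multipath branch.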

The utility function of the user w_i is defined as

u_{i} = \frac{X_{i}}{\sum_{j = 1}^{m} X_{j}} * R - α_{i} X_{i}

(4)

The utility of user w_i is composed of two parts. The first part is the user’s payoff, which is determined by the resource contribution level X_i. The second part is the user’s cost function, which is the cost spent by the user in performing the sensing task, and α_i is the unit cost of the user w_i.

The resource contribution level of the users in performing the sensing task is converted into the SP’s payoff function φ(To). The SP’s utility is this payoff minus the total reward R, that is

u_{0}^{'} = φ (To) - R

(5)

φ (To) = λ \ln (1 + \sum_{i = 1}^{m} \ln (1 + X_{i}))

(6)

The function φ(·) is utilized to convert the user’s resource contribution level into the SP’s payoff, which reflects the law of diminishing payoff. The payoff of the SP increases with the resource contribution level of the user. However, the marginal payoff decreases. λ is a system parameter, which represents the equivalent monetary value of contributed resource by users.
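The two utility functions in equations (4) to (6) can be written down directly. The sketch below is only an illustration; the function names and the convention of returning zero utility when nobody contributes are assumptions, not part of the source.

```python
import math
from typing import Sequence


def user_utility(i: int, x: Sequence[float], reward: float, alpha_i: float) -> float:
    """Equation (4): share of the reward minus the cost alpha_i * X_i."""
    total = sum(x)
    if total == 0:
        return 0.0  # assumed convention: no contribution, no reward shared
    return x[i] / total * reward - alpha_i * x[i]


def platform_utility(x: Sequence[float], reward: float, lam: float) -> float:
    """Equations (5)-(6): lam * ln(1 + sum_i ln(1 + X_i)) - R."""
    inner = sum(math.log(1.0 + xi) for xi in x)
    return lam * math.log(1.0 + inner) - reward
```

Because of the nested logarithms, doubling every contribution level raises the SP's payoff by less than the first unit did, which is the diminishing marginal payoff described above.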

The game theory model is employed to construct the relationship between the SP and the users as a non-cooperative game.²⁹ The strategy of the user w_i is to determine the resource contribution level X_i, and the strategy of the SP is to determine the total reward R to maximize their utility. The Stackelberg game can solve the benefit conflict between the SP and the users and find their optimal strategy.²⁹,³⁰ Therefore, a two-stage Stackelberg game is applied to solve the incentive allocation problem of the relationship between the SP and the users.

Definition 3

Two-stage Stackelberg game

The first stage of leader game (SP). The SP determines the total reward R to obtain more utility, that is

R^{*} = \underset{R}{\arg \max} u_{0}^{'}

(7)

The second stage of follower game (users). Each user chooses his strategy according to the total reward R by the SP and the resource contribution level of other users, and the purpose is to ensure that his utility reaches the maximum, that is

X_{i}^{*} = \underset{X_{i}}{\arg \max} u_{i}

(8)

The second stage is regarded as a non-cooperative game and is called RCGA. This study analyzes the Nash equilibrium of the Stackelberg game, as discussed in section “Analysis of RCGA.”

Details of RCIMA

Publishing the sensing task and selecting users to collect sensing data

The sensing task is published by the TP, and the TP uploads task information (such as name, function, number of users m, and total reward R) to the SP. The SP broadcasts the sensing information to the users, and the users U = {u₁, u₂, …, u_n} with smart devices sign up for the sensing task. The SP selects the m users with the highest historical reputation among the n registered users to ensure the quality of the collected data. A user is removed when the user’s initial energy is insufficient to complete the sensing task. Finally, the selected users’ set is W = {w₁, w₂, …, w_m} (m ≤ n). Furthermore, the selected users choose the optimal response strategy based on RCGA to decide whether to continue to participate in the sensing task and to collect sensing data.
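The selection step described above could be sketched as follows. The source does not say whether the energy check happens before or after ranking, so filtering first is an assumption here, and `select_trusted`, `required_energy`, and the dictionary layout are hypothetical names used only for illustration.

```python
from typing import Dict, List


def select_trusted(reputation: Dict[str, float],
                   energy: Dict[str, float],
                   required_energy: float,
                   m: int) -> List[str]:
    """Pick up to m users with the highest historical reputation.

    Users whose initial energy is below required_energy are removed first
    (assumed ordering of the two checks).
    """
    eligible = [u for u in reputation if energy[u] >= required_energy]
    eligible.sort(key=lambda u: reputation[u], reverse=True)
    return eligible[:m]
```

With this ordering, a highly reputed user who lacks the energy to finish the task never occupies one of the m slots.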

Analysis of RCGA

The relationship between the SP and the users is modeled as the Stackelberg game. The SP is the leader, and its strategy is to announce the total reward R of the sensing task. The users are the followers, and their strategy is to choose the resource contribution level. Each user looks for his optimal response strategy given the SP’s strategy, and the SP further adjusts its strategy to maximize its utility. Each user is rational in performing the sensing task. Thus, the user uploads the sensing data when the utility is greater than zero. If the utility obtained by the users is less than zero, then no users will participate in the sensing task.

Follower game

Once the users participate in the sensing task, the total reward R given by the SP will be allocated to users according to the weight of each user’s resource contribution level. The utility of user w_i by equation (4) is

u_{i} = (\frac{X_{i}}{X_{i} + \sum_{j = 1, j \neq i}^{m} X_{j}}) * R - α_{i} X_{i}

(9)

When all users choose the optimal strategy, a steady state will be achieved in the RCGA. As a result, all participants cannot change the strategy to obtain more utility, which is the Nash equilibrium in non-cooperative games.³¹ The following defines the Nash equilibrium and optimal response strategy in RCGA.

Definition 4

Optimal response strategy

Given X_{−i}, a strategy X_i ≥ 0 is the optimal response strategy if it maximizes u_i (X_i, X_{−i}); it is denoted by X_i^*.

Definition 5

Nash equilibrium

X^* = (X_i^*, X_{−i}^*) is the Nash equilibrium in the RCGA when each user w_i satisfies u_i (X_i^*, X_{−i}^*) ≥ u_i (X_i, X_{−i}^*) for any X_i ≥ 0.

Theorem 1

A unique Nash equilibrium point exists in the follower game when the SP provides the total reward R to users in the RCGA.

Proof

To study the optimal strategy to maximize the utility of the user w_i, the first and second derivatives of the utility function u_i about its resource contribution level strategy X_i are calculated by equation (9)

\frac{\partial u_{i}}{\partial X_{i}} = \frac{R}{\sum_{i = 1}^{m} X_{i}} - \frac{X_{i} R}{{(\sum_{i = 1}^{m} X_{i})}^{2}} - α_{i}

(10)

\frac{\partial^{2} u_{i}}{\partial X_{i}^{2}} = - \frac{2 R (\sum_{j = 1}^{m} X_{j} - X_{i})}{{(\sum_{j = 1}^{m} X_{j})}^{3}} < 0

(11)

The utility function is strictly concave with respect to the strategy of the user w_i because the second derivative is negative. Given the total reward R > 0 provided by the SP and the other users’ strategies X_{−i}, if an optimal strategy X_i^* exists, then the optimal response strategy of user w_i is unique.

The first derivative is set to zero using equation (10)

\frac{R}{\sum_{i = 1}^{m} X_{i}} - \frac{X_{i} R}{{(\sum_{i = 1}^{m} X_{i})}^{2}} - α_{i} = 0

(12)

Once the user w_i uploads the sensing data, user w_i is the winner and X_i > 0; otherwise, X_i = 0. The selected users are defined as W. The set of winners is defined as $\bar{W}$ and $\bar{W} = \{ j \in W \mid X_{j} > 0 \}$ . m₀ is the number of winners $m_{0} = | \bar{W} |$ . Considering that $\sum_{j \in W} X_{j} = \sum_{k \in \bar{W}} X_{k}$ , we have

\frac{R}{\sum_{j \in \bar{W}} X_{j}} - \frac{X_{k} R}{{(\sum_{j \in \bar{W}} X_{j})}^{2}} - α_{k} = 0, \quad k \in \bar{W}

(13)

By summing all the elements of $\bar{W}$ in equation (13), we obtain

\frac{m_{0} R}{\sum_{k \in \bar{W}} X_{k}} - \frac{R}{\sum_{k \in \bar{W}} X_{k}} - \sum_{k \in \bar{W}} α_{k} = 0

(14)

By solving $\sum_{k \in \bar{W}} X_{k}$ , we have

\sum_{k \in \bar{W}} X_{k} = \frac{(m_{0} - 1) R}{\sum_{k \in \bar{W}} α_{k}}

(15)

Solving equation (13) for X_k, with the contributions of the other winners held fixed, yields

X_{k}^{*} = \sqrt{\frac{R \sum_{j \in \bar{W}, j \neq k} X_{j}}{α_{k}}} - \sum_{j \in \bar{W}, j \neq k} X_{j}

(16)

The strategy X_i^* is the optimal strategy for the user w_i when X_i^* is positive in equation (16). If X_i^* is not positive, then the user w_i does not participate in the sensing task. Therefore, the optimal strategy for user w_i is

X_{i}^{*} = \begin{cases} \sqrt{\frac{R \sum_{j \in \bar{W}, j \neq i} X_{j}}{α_{i}}} - \sum_{j \in \bar{W}, j \neq i} X_{j}, & \text{if } i \in \bar{W} \\ 0, & \text{otherwise} \end{cases}

(17)
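As a numerical check of the follower game, the sketch below computes the best response that maximizes equation (9) for a fixed sum of the other users' contribution levels, together with the closed-form levels obtained by combining equations (13) and (15). It assumes every listed user is a winner; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def best_response(others_sum: float, reward: float, alpha_i: float) -> float:
    """Maximizer of equation (9) when the other users contribute others_sum."""
    if others_sum <= 0:
        raise ValueError("needs a positive contribution from the other users")
    if reward <= alpha_i * others_sum:
        return 0.0  # participating would not pay off
    return math.sqrt(reward * others_sum / alpha_i) - others_sum


def equilibrium_levels(alphas: Sequence[float], reward: float) -> List[float]:
    """Closed form from equations (13) and (15), assuming all users are winners."""
    m0 = len(alphas)
    total_cost = sum(alphas)
    total_x = (m0 - 1) * reward / total_cost  # equation (15)
    return [total_x * (1 - (m0 - 1) * a / total_cost) for a in alphas]
```

For unit costs (1, 2, 2) and R = 10, the closed form gives levels (2.4, 0.8, 0.8), and each level is also the best response to the sum of the other two, which is what a Nash equilibrium requires.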

Theorem 2

Given the total reward R > 0, if the optimal strategy set of all users X^∗= (X₁^∗, X₂^∗, …, X_m^∗) is the unique Nash equilibrium of RCGA, then the following conditions are met.

(a) $| \bar{W} | \geq 2$ ;

(b) $X_{i}^{*} = \begin{cases} \sqrt{\frac{R \sum_{j \in \bar{W}, j \neq i} X_{j}}{α_{i}}} - \sum_{j \in \bar{W}, j \neq i} X_{j}, & \text{if } i \in \bar{W} \\ 0, & \text{otherwise} \end{cases}$ ;

(c) if $α_{k} \leq \max_{j \in \bar{W}} α_{j}$ , then $k \in \bar{W}$ ;

(d) Sort the users’ costs in a non-decreasing sequence such that α₁ ≤ α₂ ≤···≤ α_m, and let h be the largest integer such that $α_{h} < \frac{R}{\sum_{j \in W} X_{j}}$ ; then $\bar{W}$ = {1,2,···, h}.

Condition (a) is proven as follows. If $| \bar{W} |$ = 0, then no user is participating in the sensing task. Thus, any user can change the strategy from X_i = 0 to X_i > 0 to obtain utility, which contradicts the definition of Nash equilibrium. Therefore, $| \bar{W} |$ ≠ 0. When $| \bar{W} |$ = 1, the utility of the only participating user w_i is $u_{i} = R - α_{i} * X_{i}$ , which decreases as X_i grows. User w_i can therefore unilaterally lower X_i to a smaller positive value to obtain more utility, which again contradicts the definition of Nash equilibrium. Therefore, $| \bar{W} |$ ≥ 2.

We also prove Condition (b). The user w_i participates in the sensing task when $i \in \bar{W}$ . Under the condition that other users’ strategies are constant, user w_i’s optimal strategy is $X_{i}^{*} = \sqrt{(R \sum_{j \neq i} X_{j}) / α_{i}} - \sum_{j \neq i} X_{j}$ , and user w_i obtains the greatest utility. The user w_i does not participate in the sensing task when $i \notin \bar{W}$ , and the strategy is X_i = 0.

Condition (c) is proven as follows. When $i \in \bar{W}$ , X_i > 0. From Condition (b) and equation (13), if X_i > 0, then $R - α_{i} \sum_{j \in \bar{W}} X_{j} > 0$ . Thus

α_{i} < \frac{R}{\sum_{j \in \bar{W}} X_{j}}, i \in \bar{W}

(18)

In addition, the following conclusions are drawn

max_{i \in \bar{W}} α_{i} < \frac{R}{\sum_{i} X_{i}}

(19)

We suppose $α_{k} \leq max_{j \in \bar{W}} {α_{j}}$ , and $k \notin \bar{W}$ . The strategy of user w_k from Condition (b) is X_k = 0, and it is substituted into equation (10)

\frac{R}{\sum_{j \in W} X_{j}} - α_{k} > max_{i \in \bar{W}} α_{i} - α_{k} \geq 0

(20)

Therefore, the user w_k can improve the utility by unilaterally changing the strategy to X_k > 0, which contradicts the Nash equilibrium.

Condition (d) is proven as follows. The costs α₁, α₂, ···, α_m of the m users are sorted in a non-decreasing sequence. From Conditions (a) and (c), an integer k in [2, m] exists such that $\bar{W}$ = {1, 2, …, k}. Moreover, k ≤ h given that h is the maximum positive integer that satisfies $α_{h} < \frac{R}{\sum_{j \in W} X_{j}}$ . If k < h, then $α_{k + 1} < \frac{R}{\sum_{j \in W} X_{j}}$ and $k + 1 \notin \bar{W}$ . Similarly, substituting the strategy X_{k+1} = 0 of user k + 1 into equation (10) yields

\frac{R}{\sum_{j \in W} X_{j}} - α_{k + 1} > 0

(21)

Therefore, user k + 1 can obtain more utility by increasing the resource contribution level of the sensing task, and this condition contradicts the Nash equilibrium.

In Algorithm 1, the SP first initializes the set of users $\bar{W}$ who are willing to upload sensing data, the resource contribution level set {X_i}, and all users’ payoff set {p_i}. Then, the unit costs α_i of all users in W are sorted in a non-decreasing sequence, and the first two users are added to the set $\bar{W}$ . Next, other users (with utility greater than zero) are added to set $\bar{W}$ by the SP. Finally, the resource contribution level of each user is calculated, and the reward is allocated to each user by the SP.

Algorithm 1: RCGA

1 Input: R, W, {α_i};
2 Output: {X_i}, {p_i};
3 Set $\bar{W} \leftarrow \emptyset$ , {X_i} ← {0}, {p_i} ← {0};
4 Sort users according to their unit costs, α₁ ≤ α₂ ≤ … ≤ α_m;
5 $\bar{W}$ ← {1, 2}, j ← 3;
6 while j ≤ m and $α_{j} < \frac{α_{j} + \sum_{k \in \bar{W}} α_{k}}{| \bar{W} |}$ do
7     $\bar{W}$ ← $\bar{W}$ ∪ {j}, j ← j + 1;
8 end
9 foreach i ∈ W do
10     if i ∈ $\bar{W}$ then X_i ← X_i^*;
11     else X_i ← 0;
12 end
13 $X_{0} \leftarrow \sum_{i \in \bar{W}} X_{i}$ ;
14 foreach i ∈ W do
15     if X_i > 0 then $p_{i} \leftarrow X_{i} \frac{R}{X_{0}}$ ;
16     else p_i ← 0;
17 end

RCGA: resource contribution game algorithm.

The time complexity of Algorithm 1 is O(n log n): sorting all users takes O(n log n), while the while loop (lines 6–8) and the for loops (lines 9–12 and 14–17) each take O(n).
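A possible Python rendering of Algorithm 1 is sketched below. The listing does not spell out the value assigned to X_i for winners, so this sketch assumes the closed form obtained from equations (13) and (15); the function name `rcga` and the dictionary-based interface are hypothetical.

```python
from typing import Dict, Tuple


def rcga(reward: float, alphas: Dict[str, float]) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (contribution levels X_i, payoffs p_i) keyed by user id."""
    users = sorted(alphas, key=alphas.get)      # line 4: non-decreasing unit cost
    levels = {u: 0.0 for u in users}            # line 3: initialise X_i and p_i
    payoffs = {u: 0.0 for u in users}
    if len(users) < 2:
        return levels, payoffs                  # an equilibrium needs two winners
    winners = users[:2]                         # line 5
    cost_sum = sum(alphas[u] for u in winners)
    for u in users[2:]:                         # lines 6-8: admit further winners
        if alphas[u] < (alphas[u] + cost_sum) / len(winners):
            winners.append(u)
            cost_sum += alphas[u]
        else:
            break
    m0 = len(winners)
    total_x = (m0 - 1) * reward / cost_sum      # equation (15)
    for u in winners:                           # lines 9-12 (closed form assumed)
        levels[u] = total_x * (1 - (m0 - 1) * alphas[u] / cost_sum)
    x0 = sum(levels.values())                   # line 13
    for u in winners:                           # lines 14-17: proportional reward
        if levels[u] > 0:
            payoffs[u] = levels[u] * reward / x0
    return levels, payoffs
```

For example, with R = 10 and unit costs a = 1, b = 2, c = 2, d = 10, the loop admits c but stops at d, so d contributes nothing and the full reward is split among a, b, and c.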

Leader game

The SP and the users are participants in the RCGA, and the SP is the leader and the users are the followers. All users have a unique Nash equilibrium point when the SP provides the users with the total reward R. Therefore, the SP can determine the value of R to maximize its utility.

Theorem 3

An optimal strategy R^∗ exists in the RCGA and constitutes a unique Stackelberg equilibrium point (R^∗, X^∗), where X^∗ is the optimal strategy set for all users. The utility of the SP is the maximum when the total reward is R^∗.

Proof

By substituting equation (15) into equation (13), we obtain

X_{i} = \frac{(m_{0} - 1) R}{\sum_{j = 1}^{m_{0}} α_{j}} \cdot \left( 1 - \frac{(m_{0} - 1) α_{i}}{\sum_{j = 1}^{m_{0}} α_{j}} \right)

(22)

Substituting equation (22) into the utility function of the SP yields

u_{0}^{'} = λ \ln \left\{ 1 + \sum_{i = 1}^{m_{0}} \ln \left[ 1 + \frac{(m_{0} - 1) R}{\sum_{j = 1}^{m_{0}} α_{j}} \cdot \left( 1 - \frac{(m_{0} - 1) α_{i}}{\sum_{j = 1}^{m_{0}} α_{j}} \right) \right] \right\} - R

(23)

The second derivative of the SP utility function is determined

\frac{\partial^{2} u_{0}^{'}}{\partial R^{2}} = - λ \frac{K \sum_{i = 1}^{m_{0}} \frac{F_{i}^{2}}{{(1 + F_{i} R)}^{2}} + {\left[ \sum_{i = 1}^{m_{0}} \frac{F_{i}}{1 + F_{i} R} \right]}^{2}}{K^{2}} < 0

(24)

where

K = 1 + \sum_{i = 1}^{m_{0}} \ln (1 + F_{i} R)

(25)

F_{i} = \frac{m_{0} - 1}{\sum_{j = 1}^{m_{0}} α_{j}} \cdot \left( 1 - \frac{(m_{0} - 1) α_{i}}{\sum_{j = 1}^{m_{0}} α_{j}} \right)

(26)

Therefore, the utility function of the SP in the RCGA is strictly concave, as shown by equation (24), and there is only one Stackelberg equilibrium in the RCGA. A unique R^∗ exists such that the SP’s utility function u₀’(R, X^∗) reaches its maximum at (R^∗, X^∗). The unique R^∗ is calculated by utilizing the Newton method.³²
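The leader stage can be illustrated numerically. Because the SP's utility is strictly concave in R, its first derivative is decreasing, so the sketch below locates the root of that derivative by bisection; this is a simpler stand-in chosen for robustness, not the Newton method cited in the text. The function names are hypothetical, and the input is assumed to contain only the winners' unit costs.

```python
import math
from typing import List, Sequence


def winner_slopes(alphas: Sequence[float]) -> List[float]:
    """F_i from equation (26) for the winners' unit costs."""
    m0 = len(alphas)
    s = sum(alphas)
    return [(m0 - 1) / s * (1 - (m0 - 1) * a / s) for a in alphas]


def marginal_utility(reward: float, slopes: Sequence[float], lam: float) -> float:
    """First derivative of the SP utility in equation (23) with respect to R."""
    k = 1.0 + sum(math.log(1.0 + f * reward) for f in slopes)
    dk = sum(f / (1.0 + f * reward) for f in slopes)
    return lam * dk / k - 1.0


def optimal_reward(alphas: Sequence[float], lam: float, tol: float = 1e-9) -> float:
    """Bisection on the decreasing derivative (stand-in for Newton's method)."""
    slopes = winner_slopes(alphas)
    if marginal_utility(0.0, slopes, lam) <= 0:
        return 0.0  # utility already decreases at R = 0
    lo, hi = 0.0, 1.0
    while marginal_utility(hi, slopes, lam) > 0:  # grow the bracket
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if marginal_utility(mid, slopes, lam) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When λ is too small for any reward to pay off, the derivative is already non-positive at R = 0 and the sketch returns zero.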

Evaluating reputation

After the users upload the sensing data, the SP evaluates the reputation of the selected users. First, the EM algorithm³³ is employed to evaluate the quality of the sensing data uploaded by the users. Then, the SP evaluates the selected users’ reputation based on the sensing data quality result. Finally, the user’s historical reputation is updated after the user’s reputation is evaluated.

Quality evaluation

The quality of the sensing data submitted by the users reflects the quality of the sensing task they completed. Here, urban noise sensing is taken as an example, following Peng et al.,³⁴ and a quality evaluation matrix $e^{w_{i}}$ is estimated for each user w_i, which is an $m \times m$ matrix of elements $e_{rs}^{w_{i}} \in [0, 1]$ , r = 1,2,…, m, s = 1,2,…, m. The quality of the sensing data is obtained from the quality evaluation matrix by the function q_i = g(e^{w_i}). The readings of sensing data are divided into m discrete intervals and expressed as a set D = {d₁, d₂, …, d_m}, which represents the quality level of collected sensing data. Given a set of collected sensing data S, a set P of missing true indicators, the probability matrix E, and the probability density function f, the likelihood function of E is

L (E; P, S) = f (P, S | E)

(27)

To find the maximum likelihood estimate of E , the following two steps are iteratively run by the EM algorithm until convergence (with the assumption that ${\hat{E}}^{t}$ is the current value of the probability matrix E after t iterations).

E-step: According to the conditional distribution of P given observation S under the current estimated value of E, the expected value of the likelihood function is calculated as follows

Q (E | {\hat{E}}^{t}) = E_{P | S, {\hat{E}}^{t}} [L (E; P, S)]

(28)

M-step: The estimation $\hat{E}$ that maximizes the expectation function is determined

{\hat{E}}^{t + 1} = \underset{E}{\arg \max} Q (E | {\hat{E}}^{t})

(29)

E-step and M-step are iterated until the estimated value reaches convergence. The converged evaluation of the user’s effort matrix indicates the quality of the sensing data, and the noise interval distribution implies the noise pollution level.

The specific steps are given as follows:

Step 1: For each task t, the indicator function $I (d_{t}^{k} = d_{j})$ equals 1 when the sensing data d_t^k of a user falls into the interval d_j, and the probability distribution p^t of the real noise interval is initialized as

p_{j}^{t} = p (d_{t}^{0} = d_{j}) = \frac{\sum_{w_{i} \in W_{l}} I (d_{t}^{k} = d_{j})}{| W_{l} |}

(30)

Step 2: The likelihood function of the sensing probability matrix is estimated, and ${\hat{e}}_{rs}^{w_{i}}$ represents the value after t iterations

{\hat{e}}_{rs}^{w_{i}} = \frac{\sum_{t \in T^{w_{i}}} p_{r}^{w_{i}} I (d_{t}^{w_{i}} = d_{s})}{\sum_{t \in T^{w_{i}}} p_{r}^{w_{i}}}, s = 1, 2, \dots, m

(31)

The real noise interval distribution is estimated as

{\hat{π}}_{r} = \frac{\sum p_{r}^{t}}{| T |}, r = 1, 2, \dots, m

(32)

Step 3: The real noise interval is estimated. Given the sensing data S, the quality evaluation matrix E , and the noise interval distribution Π, the true noise interval P is estimated using Bayesian inference. The real noise interval distribution is calculated using the following formula

P_{r}^{t} = \frac{π_{r} \prod_{w_{k} \in W_{t}} \prod_{s} {(e_{rs}^{k})}^{I (d_{t}^{k} = d_{s})}}{\sum_{q} π_{q} \prod_{w_{k} \in W_{t}} \prod_{s} {(e_{qs}^{k})}^{I (d_{t}^{k} = d_{s})}}, r = 1, 2, \dots, m

(33)

Step 4: Convergence. Steps 2–3 are iterated until the two estimates converge, that is, $| {\hat{E}}^{t + 1} - {\hat{E}}^{t} | < ε$ , $| {\hat{P}}^{t + 1} - {\hat{P}}^{t} | < η$ , $ε > 0$ , $η > 0$ . With the estimate of the quality evaluation matrix $e^{w_{i}}$ , we can obtain the quality of user w_i’s sensing data through the mapping function $g (e^{w_{i}})$ . Therefore, the quality of the sensing data collected by user w_i is

q_{w_{i}} = g (e^{w_{i}}) = \sum_{r} \frac{e_{rr}^{w_{i}}}{m}

(34)
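Steps 1–4 can be sketched in Python as follows. This is a minimal illustration in the style of the Dawid and Skene estimator listed in the references, not the implementation used in the experiments. The function name `em_quality`, the array layout (`labels[t, k]` holds the interval index reported by user k for task t, with -1 for no report), the smoothing constant `smooth`, and the single tolerance `tol` for both convergence tests are assumptions. The sketch weights the counts in equation (31) by the task posterior p_r^t and applies the indicator in equation (33) as an exponent, which is the standard form of this estimator.

```python
import numpy as np


def em_quality(labels, m, tol=1e-6, max_iter=100, smooth=1e-9):
    """Estimate per-user data quality from interval labels (hypothetical helper).

    labels[t, k] is the interval index (0..m-1) that user k reported for
    task t, or -1 if user k did not report on task t. Every task is assumed
    to have at least one report.
    """
    labels = np.asarray(labels)
    n_tasks, n_users = labels.shape
    seen = labels >= 0

    # one_hot[t, k, s] = I(d_t^k = d_s); all-zero rows mark missing reports.
    one_hot = np.zeros((n_tasks, n_users, m))
    t_idx, k_idx = np.nonzero(seen)
    one_hot[t_idx, k_idx, labels[t_idx, k_idx]] = 1.0

    # Step 1, eq. (30): initialise p[t, j] with the share of reports in interval j.
    p = one_hot.sum(axis=1) / seen.sum(axis=1, keepdims=True)
    e = np.full((n_users, m, m), 1.0 / m)

    for _ in range(max_iter):
        # Step 2, eq. (31): e[k, r, s] from posterior-weighted report counts.
        # The smoothing term (an assumption) avoids log(0) and division by zero.
        num = np.einsum("tr,tks->krs", p, one_hot)
        den = num.sum(axis=2, keepdims=True)
        e_new = (num + smooth) / (den + m * smooth)
        # Step 2, eq. (32): interval prior as the mean posterior over tasks.
        pi = p.mean(axis=0)

        # Step 3, eq. (33): Bayesian posterior, computed in log space.
        log_post = np.log(pi + smooth) + np.einsum(
            "tks,krs->tr", one_hot, np.log(e_new)
        )
        log_post -= log_post.max(axis=1, keepdims=True)
        p_new = np.exp(log_post)
        p_new /= p_new.sum(axis=1, keepdims=True)

        # Step 4: stop when both estimates change by less than tol.
        done = max(np.abs(e_new - e).max(), np.abs(p_new - p).max()) < tol
        e, p = e_new, p_new
        if done:
            break

    # Eq. (34): quality is the mean of the diagonal of each user's matrix.
    quality = np.trace(e, axis1=1, axis2=2) / m
    return quality, e, p


# Toy data: users 0-2 report the true interval, user 3 always reports interval 0.
truth = np.array([0, 1, 2, 0, 1, 2])
reports = np.column_stack([truth, truth, truth, np.zeros(6, dtype=int)])
quality, e_hat, p_hat = em_quality(reports, m=3)
```

On this toy input the three consistent users receive a quality close to 1, while the user who always reports the same interval receives a quality close to 1/m, because only one diagonal entry of that user's matrix is large.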

Updating reputation

Through the abovementioned quality evaluation process, the quality of the collected sensing data of user w_i is q_{w_i}. Then, the reputation value of user w_i will be normalized and converted to [0,5]. Thus, the reputation value of the user w_i is

Rep_{i} = \frac{5 q_{w_{i}}}{q_{max}}

(35)

where q_max is the highest data quality value in the sensing task. The user’s historical reputation is updated as

Rep_{i}^{'} = \frac{o \cdot Rep_{i0} + Rep_{i}}{o + 1}

(36)

where o is the number of historical tasks in which user w_i participates, and Rep_i0 is the historical reputation value of user w_i.
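Equations (35) and (36) amount to a rescaling followed by a running mean. The short sketch below illustrates both; the function names and the numbers in the example are hypothetical and are not taken from the experiments.

```python
def normalized_reputation(q, q_max):
    """Eq. (35): scale a quality value to the reputation range [0, 5]."""
    if q_max <= 0:
        raise ValueError("q_max must be positive")
    return 5.0 * q / q_max


def updated_history(rep_hist, rep_new, o):
    """Eq. (36): running mean over o past tasks plus the current task."""
    return (o * rep_hist + rep_new) / (o + 1)


# Illustrative numbers only: quality 0.6 against a best quality of 0.8,
# for a user with historical reputation 4.0 over 3 earlier tasks.
rep_now = normalized_reputation(0.6, 0.8)
rep_hist_new = updated_history(4.0, rep_now, o=3)
```

With these inputs the current reputation is 3.75 and the updated historical reputation is (3 × 4.0 + 3.75) / 4 = 3.9375, so a single weaker task lowers a long history only slightly.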

Reward distribution

The reward provided by the SP to user w_i is expressed as Re_i, and the user w_i chooses the optimal resource contribution level X_i through the RCGA. Thus, the final reward is

Re_{i} = \begin{cases} \frac{X_{i}}{\sum_{i = 1}^{m} X_{i}} * R, & X_{i} \neq 0 \\ 0, & X_{i} = 0 \end{cases}

(37)
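The proportional split in equation (37) can be illustrated with a few lines of Python. The function name `distribute_reward` and the sample contribution levels are assumptions made for this sketch; in the mechanism itself the levels X_i come from the RCGA.

```python
def distribute_reward(contributions, total_reward):
    """Eq. (37): split total_reward in proportion to each contribution level."""
    total = sum(contributions)
    if total == 0:
        # Nobody contributed, so nobody is paid.
        return [0.0 for _ in contributions]
    # A user with X_i = 0 receives 0 automatically from the same formula.
    return [total_reward * x / total for x in contributions]


# Hypothetical contribution levels for four users and a total reward R = 100.
rewards = distribute_reward([2.0, 0.0, 3.0, 5.0], 100.0)
```

Here the rewards are 20, 0, 30, and 50, which sum to R whenever at least one user contributes.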

The total payoff ReT_i of the user w_i is the reward for performing one task or several tasks in the MCS network system. When the total payoff ReT_i is greater than a certain threshold V_min, the virtual currency (ReT_i) can be converted and used in one of the following two ways.

In the first method, the user w_i uses the virtual currency of the total payoff ReT_i as the reward for publishing a sensing task. The total reward R differs from one sensing task to another. When the total payoff of user w_i is not less than the reward R required for publishing a sensing task, the user can successfully publish the task. When the total payoff is insufficient, the user converts real currency (do) into virtual currency (V_p) by equation (38) to cover the reward. Then, user w_i publishes the sensing task that he needs

V_{p} = c * do

(38)

In the second method, user w_i converts the virtual currency of the total payoff ReT_i directly into real currency. The SP performs this conversion when the total payoff of user w_i is greater than the threshold V_min, according to

do = \frac{1}{c^{'}} * V_{p}

(39)

where V_p > V_min, and V_p and do are the virtual currency and real currency, respectively. c and c’ are system parameters determined by the MCS network system, and c’ is slightly larger than c. Because c’ > c, cashing out a virtual amount returns less real currency than buying the same virtual amount would cost. Therefore, a user w_i who needs to publish a sensing task will be more willing to spend the virtual currency on the total reward of the sensing task than to convert it into real currency.
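The effect of choosing c’ slightly larger than c can be seen in a small round-trip calculation based on equations (38) and (39). The function names and the parameter values c = 1.0, c’ = 1.1, and V_min = 5.0 are hypothetical; the article does not state the values used by the system.

```python
def real_to_virtual(do, c):
    """Eq. (38): buy virtual currency V_p with real currency do."""
    return c * do


def virtual_to_real(v_p, c_prime, v_min):
    """Eq. (39): cash out V_p, allowed only above the threshold V_min."""
    if v_p <= v_min:
        raise ValueError("virtual balance must exceed V_min to be converted")
    return v_p / c_prime


# Hypothetical system parameters, with c_prime slightly larger than c.
c, c_prime, v_min = 1.0, 1.1, 5.0

bought = real_to_virtual(10.0, c)                  # virtual currency bought
cashed = virtual_to_real(bought, c_prime, v_min)   # real currency returned
```

Under these assumed values, the real currency returned by cashing out is smaller than the real currency that was spent to buy the same virtual amount, which is the asymmetry that makes spending virtual currency on task rewards more attractive.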

Simulation results and analysis

Simulation experiments are conducted with MATLAB R2016a, and the following network topology is established to evaluate the performance of RCIMA. One TP, one SP, and 1000 users are randomly distributed in a target area of 1 × 1 km², and the TP can successfully publish tasks to the SP. The parameters and experimental values in this study are shown in Table 1.

Table 1.

Simulation parameter value.

Parameter	Value
Target area	1000 × 1000 m²
n	1000
λ	500
Rep_i0	[0,5]
d₀	87
k	400
E_elec	50 nJ/bit
ε_fs	10 pJ/bit/m²
ε_amp	0.0013 pJ/bit/m⁴

Average payoff of users

Figure 2 shows the relationship between the average payoff obtained by the users and the total reward R given by the SP in RCGA. A user’s payoff is the reward that the SP gives the user for completing the sensing task. It depends on the total reward R of the SP and on the users’ optimal resource contribution levels. As shown in Figure 2, the average payoff is higher when fewer users are selected, and it increases with R when the number of selected users is fixed. Therefore, for a fixed total reward, users benefit from a smaller number of selected users, and for a fixed number of selected users, they benefit from a larger total reward.

Figure 2.

Changes in the average payoff of users with R.
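Both trends follow directly from equation (37) if the sum in that equation is read as running over the N selected users and at least one of them contributes. Under that assumption (N and the bar notation are introduced here for illustration), the rewards add up to R, so the average payoff is

```latex
\overline{Re} = \frac{1}{N} \sum_{i = 1}^{N} Re_{i}
             = \frac{1}{N} \sum_{i = 1}^{N} \frac{X_{i}}{\sum_{j = 1}^{N} X_{j}} R
             = \frac{R}{N}
```

which grows linearly with R for fixed N and shrinks as N grows for fixed R. This is a consistency check on the reward rule only; it does not reproduce the simulated values plotted in the figure.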

Average utility of users

Figure 3 shows the relationship between the average utility of users and the total reward R given by the SP in the algorithm. The utility of a user is the payoff minus the cost incurred when the user completes the sensing task. In Figure 3, the average utility of users increases linearly with the total reward R. For a fixed R, the average utility of the users decreases as the number of selected users increases. As the number of users goes up, the weight of each user’s optimal strategy goes down. Thus, the payoff of each user is smaller, and the user’s utility is reduced correspondingly.

Figure 3.

Changes in the average utility of users with R.

Utility of the SP

Figure 4 shows the relationship between the utility of the SP and the total reward R paid by the SP to users. The utility of the SP depends on the resource contribution level X_i of the users and on the total reward R. The experimental result shows that the utility of the SP decreases as the total reward R increases. The resource contribution level of the users is fixed when the number of selected users is constant. Thus, the utility of the SP decreases when the total reward R increases. For a given total reward R, the SP obtains more utility when more users are selected.

Figure 4.

Relationship between the utility of the SP and R.

Resource contribution coefficient β_i of the user

Figure 5 shows the relationship between the resource contribution coefficient β_i of the user w_i and the total reward R of the SP. The resource contribution coefficient β_i of the user w_i is related to the resource contribution level X_i and the user’s energy ratio E_i’. The experimental results show that the smaller the number of users, the larger β_i is, and that β_i fluctuates. The reason is that, when R is fixed, each user receives more reward if fewer users are selected. Thus, the weight of the user’s optimal strategy increases. β_i does not change significantly as R increases because the optimal strategy of the user is different when R is distinct.

Figure 5.

Relationship between the resource contribution coefficient β_i and R.

Reputation evaluation

Figure 6 analyzes the relationship between the user w_i’s reputation and the quality evaluation matrix e_ii of user w_i in collecting sensing data. The quality evaluation matrix e_ii of the user in submitting the sensing data approximately follows the same normal distribution,³⁴ where μ = 0.75 and σ = 0.125. Figure 6 shows a linear relationship between the quality evaluation matrix and reputation. When the quality evaluation matrix e_ii of user w_i is smaller, the user’s reputation in the sensing task is correspondingly lower. Conversely, when the quality evaluation matrix of user w_i is larger, the user’s reputation is higher.

Figure 6.

Relationship between the quality evaluation matrix of sensing data and the user’s reputation.

Analyzing the reputation value of selected users

Table 2 compares the average historical reputation values of randomly selected users and selected trusted users. First, the trusted users chosen by the SP have a higher average historical reputation value than randomly selected users. Second, under random selection, the average historical reputation value changes little with the number of selected users. However, under trusted selection, the average historical reputation value is higher when fewer users are selected.

Table 2.

Average historical reputation values of selected users.

Number of selected users (N)	Randomly selected users	Selected trusted users
100	2.4696	4.7481
200	2.5028	4.5143
300	2.5478	4.2737
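The pattern in Table 2 can be illustrated with a small simulation. The sketch below assumes that historical reputations are drawn uniformly from [0, 5] for n = 1000 users and that trusted selection keeps the N users with the highest historical reputation; both are assumptions made for illustration, and the function name `compare_selection` is hypothetical. It is not the selection procedure or the code used in the experiments.

```python
import numpy as np


def compare_selection(n_users=1000, sizes=(100, 200, 300), seed=0):
    """Return (N, mean reputation of random pick, mean reputation of top-N pick)."""
    rng = np.random.default_rng(seed)
    # Assumption: historical reputations are uniform on [0, 5].
    rep = rng.uniform(0.0, 5.0, size=n_users)
    ranked = np.sort(rep)[::-1]  # highest reputation first

    rows = []
    for n_sel in sizes:
        random_pick = rng.choice(rep, size=n_sel, replace=False)
        trusted_pick = ranked[:n_sel]  # assumed rule: keep the top N
        rows.append((n_sel, float(random_pick.mean()), float(trusted_pick.mean())))
    return rows


rows = compare_selection()
for n_sel, random_mean, trusted_mean in rows:
    print(f"N={n_sel}: random={random_mean:.4f}, trusted={trusted_mean:.4f}")
```

Under the uniform assumption, the expected mean of the top N values is 5(2n − N + 1) / (2(n + 1)), which gives about 4.75, 4.50, and 4.25 for N = 100, 200, and 300. These are close to the trusted-user column of Table 2, and the random column stays near the population mean of 2.5, which fits the observation that it changes little with N.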

Conclusion

In this study, an incentive mechanism (RCIMA) based on the Stackelberg game is proposed for MCS that considers the benefit of both the SP and the users. The overall mechanism includes choosing trusted users, the RCGA, and the reputation update and reward distribution method. The credibility of the collected sensing data improves clearly because the users are selected by reputation. Compared with randomly selected users, the users selected by the proposed model have a higher average historical reputation value. The utility of the SP and the users is good under the RCGA. Meanwhile, two conversion methods between virtual currency and real currency are used to ensure flexible application of the users’ total payoff in the MCS system. However, this article does not consider how users select tasks when multiple tasks are released. Therefore, the incentive mechanism considering multiple TPs will be investigated in future work. Moreover, the submission of sensing data by users will be studied to prevent the leakage of private information.

Footnotes

Handling Editor: Dr Yanjiao Chen

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded in part by the National Key Research and Development Program, grant number 2017YFB1401800 and Jilin Province Education Department projects, grant number JJKH20200802KJ and JJKH20200791KJ.

ORCID iD

Xiaoxiao Yang

References

1. Nie, Luo, Xiong, et al. A Stackelberg game approach toward socially-aware incentive mechanisms for mobile crowdsensing. IEEE Trans Wirel Commun 2018; 18(1): 724–738.

2. Mohan, Padmanabhan, Ramjee. Nericell: rich monitoring of road and traffic conditions using mobile smartphones. In: ACM conference on embedded network sensor systems. ACM, 2008.

3. NoiseTube, http://www.noisetube.net/

4. Sensorly, http://www.sensorly.com/

5. Kim, Robson, Zimmerman, et al. Creek watch: pairing usefulness and usability for successful citizen science. In: Proceedings of the SIGCHI conference on human factors in computing systems, 2011, pp.2125–2134. New York: ACM, https://www.researchgate.net/publication/221516109_Creek_Watch_Pairing_Usefulness_and_Usability_for_Successful_Citizen_Science

6. Jaimes, Vergara-Laurens, Raij. A survey of incentive techniques for mobile crowd sensing. IEEE Internet Things J 2015; 2(5): 370–380.

7. Wang, Gao, Liu, et al. Credible and energy-aware participant selection with limited task budget for mobile crowd sensing. Ad Hoc Netw 2016; 43: 56–70.

8. Yang, Xue, Fang, et al. Crowdsourcing to smartphones: incentive mechanism design for mobile phone sensing. In: Proceedings of the international conference on mobile computing & networking, 2012. New York: ACM, http://optimization.asu.edu/papers/XUE-CNF-2012-MOBICOM-MPSensing.pdf

9. Gao, Chen, Liu KJR. On cost-effective incentive mechanisms in microtask crowdsourcing. IEEE Trans Comput Intell AI Games 2015; 7(1): 3–15.

10. Sun. Heterogeneous-belief based incentive schemes for crowd sensing in mobile social networks. J Netw Comput Appl 2014; 42: 189–196.

11. Downs, Holbrook, Sheng, et al. Are your participants gaming the system? Screening mechanical turk workers. In: Proceedings of the SIGCHI conference on human factors in computing systems, 2010, pp.2399–2402. New York: ACM, https://www.researchgate.net/publication/221518453_Are_your_participants_gaming_the_system_Screening_Mechanical_Turk_Workers

12. Zhou, Cai, et al. FIDC: a framework for improving data credibility in mobile crowdsensing. Comput Netw 2017; 120: 157–169.

13. Zhang, Yang, Sun, et al. Incentives for mobile crowd sensing: a survey. IEEE Commun Surv Tut 2016; 18(1): 54–67.

14. Zhang, Liang, Luo, et al. Privacy-preserving incentive mechanisms for mobile crowdsensing. IEEE Pervas Comput 2018; 17(3): 47–57.

15. Wang, Gao, et al. A worker-selection incentive mechanism for optimizing platform-centric mobile crowdsourcing systems. Comput Netw 2020; 171: 107144.

16. Wang, Guo, Cao, et al. MeLoDy: a long-term dynamic quality-aware incentive mechanism for crowdsourcing. IEEE Trans Parall Distrib Syst 2018; 29(4): 901–914.

17. Lin, Yang, et al. Frameworks for privacy-preserving mobile crowdsensing incentive mechanisms. IEEE Trans Mob Comput 2017; 17: 1851–1864.

18. Lin, Yang, et al. Sybil-proof incentive mechanisms for crowdsensing. In: Proceedings of the IEEE INFOCOM 2017—IEEE conference on computer communications, 2017. New York: IEEE, https://ecs.syr.edu/faculty/tang/Pub/Tang-Infocom17-2.pdf

19. Yao, Zhang, et al. A reverse auction-based incentive mechanism for mobile crowdsensing. IEEE Internet Things J 2020; 7(9): 8238–8248.

20. Zhan, Xia, Zhang, et al. An incentive mechanism design for mobile crowdsensing with demand uncertainties. Inform Sci 2020; 528: 1–16.

21. Cheung, Hou, Huang. Delay-sensitive mobile crowdsensing: algorithm design and economics. IEEE Trans Mob Comput 2018; 17(12): 2761–2774.

22. Yang, et al. Three-stage Stackelberg long-term incentive mechanism and monetization for mobile crowdsensing: an online learning approach. IEEE Trans Netw Sci Eng. Epub ahead of print 5 February 2021. DOI: 10.1109/TNSE.2021.3057394.

23. Reddy, Samanta, Burke, et al. MobiSense — mobile network services for coordinated participatory sensing. In: Proceedings of the 2009 international symposium on autonomous decentralized systems, 2009, https://www.researchgate.net/publication/224579489_MobiSense_-_mobile_network_services_for_coordinated_participatory_sensing

24. Amintoosi, Kanhere. A reputation framework for social participatory sensing systems. Mob Netw Appl 2014; 19(1): 88–100.

25. Huang, Kanhere. On the need for a reputation system in mobile phone based sensing. Ad Hoc Netw 2014; 12: 130–149.

26. Yang, Zhang, Roe. Using reputation management in participatory sensing for data classification. Proced Comput Sci 2011; 5(1): 190–197.

27. Wei, Long. A blockchain-based hybrid incentive model for crowdsensing. Electronics 2020; 9(2): 215.

28. Zhang, Lei, Feng. Energy-efficient collaborative transmission algorithm based on potential game theory for beamforming. Int J Distrib Sens Netw 2019; 15(9): 9877630.

29. Myerson. Game theory. Cambridge, MA: Harvard University Press, 2013.

30. Shi, Zhao, Zheng, et al. Incentive design for cache-enabled D2D underlaid cellular networks using Stackelberg game. IEEE Trans Veh Technol 2019; 68(1): 765–779.

31. Maskin. Nash equilibrium and welfare optimality. Rev Econ Stud 1999; 66(1): 23–38.

32. Boyd, Vandenberghe, Faybusovich. Convex optimization. IEEE Trans Autom Control 2006; 51(11): 1859.

33. Dawid, Skene. Maximum likelihood estimation of observer error-rates using the EM algorithm. Appl Stat 1979; 28: 20–28.

34. Peng, Chen. Pay as how well you do: a quality based incentive mechanism for crowdsensing. In: Proceedings of the 16th ACM international symposium on mobile ad hoc networking and computing, 2015, pp.177–186, https://www.cs.sjtu.edu.cn/~fwu/res/Paper/PWC15MobiHoc.pdf

Incentive mechanism based on Stackelberg game under reputation constraint for mobile crowdsensing
