Energy Efficiency Oriented Access Point Selection for Cognitive Sensors in Internet of Things

Abstract

This paper studies the distributed energy efficient access point (AP) selection for cognitive sensors in the Internet of Things (IoT). The energy consumption is critical for the wireless sensor network (WSN), and central control would cause extremely high complexity due to the dense and dynamic deployment of sensors in the IoT. The desired approach is the one with lower computation complexity and much more flexibility, and the global optimization is also expected. We solve the multisensors AP selection problem by using the game theory and distributed learning algorithm. First, we formulate an energy oriented AP selection problem and propose a game model which is proved to be an exact potential game. Second, we design a distributed learning algorithm to obtain the globally optimal solution to the problem in a distributed manner. Finally, simulation results verify the theoretic analysis and show that the proposed approach could achieve much higher energy efficiency.

1. Introduction

The emerging Internet of Things (IoT) has recently attracted much attention, with applications in climate monitoring, transport safety, home automation, health care, and so on. The IoT has been seen as an important technological revolution that brings us into a new era of ubiquitous connectivity, computing, and communication [1]. The IoT is a global network which allows communication among all kinds of objects by providing a unique digital identity to each object [2]. Sensing the environment and connecting with other things are expected to be basic capabilities of users in the IoT. The wireless sensor network (WSN), which senses the environment and communicates with other things, is therefore essential to the IoT.

In the WSN, information exchange is a key issue. The sensors should share their sensing information with other things in the IoT through the Internet. Though wired connectivity could provide high speed communication, its limitations on distance and location constrain the development of the IoT. With the rapid development of mobile Internet and wireless communication, information exchange over wireless links can overcome these limitations, which significantly promotes the development of the IoT.

Because sensors in the IoT communicate wirelessly, energy is a critical resource [3]. It is important for a sensor to consume as little energy as possible when communicating with its Internet access point (AP). Given the communication model and the data rates, the energy consumption is mainly determined by the allocated bandwidth and the channel quality between the sensor and its AP. Due to the possible mobility of sensors and the changing wireless channel quality between the sensors and the APs, allocating sensors to APs in a predefined and fixed way is inefficient and fragile. In addition, considering the huge number of sensors in the IoT, allocating a fixed spectrum resource to each sensor is infeasible because the wireless spectrum is limited.

There has been some research on the WSN in the IoT. Authors in [4–6] focus on sensing the surroundings in the IoT based on sensor network technologies. Authors in [7, 8] studied the communication protocols. The energy efficiency problem in the WSN has also been studied. Tracking performance and energy consumption optimization were studied in [9]. A tradeoff between bandwidth and energy consumption in the IoT was studied in [3]. To the best of our knowledge, few studies have paid attention to AP selection for the sensors.

Recently, the cognitive Internet of Things (CIoT) has been proposed, which aims to empower the current IoT with a “brain” for high-level intelligence [1]. The cognitive sensors could obtain the environment spectrum information through spectrum sensing approaches [10–12]. Based on the intelligence of sensors in the CIoT, the sensors could smartly select the proper APs according to the environment and other sensors' action, to obtain better energy efficiency.

In this paper, we focus on the energy efficient AP selection for cognitive sensors in the WSN, which is an important and practical problem in the IoT. Due to the extremely high complexity of central control for dense and dynamic deployment of sensors [13], we focus on a distributed selection scheme, which adapts better to the changing environment. We solve the multisensor AP selection problem by using game theory [14] and a distributed learning algorithm. First, we formulate an energy oriented AP selection problem and propose a game model which is proved to be an exact potential game. Second, we design a distributed learning algorithm to obtain the globally optimal solution to the problem in a distributed manner. Finally, simulation results verify the theoretical analysis and show that the proposed approach could achieve much higher energy efficiency.

The rest of this paper is organized as follows. In Section 2, we review related works. In Section 3, we present the system model and problem formulation. In Section 4, we formulate the proposed game model and investigate its properties. In Section 5, we propose the distributed learning algorithm. In Section 6, simulation results and discussion are presented. Finally, we provide conclusions in Section 7.

2. Related Works

There has been some research on the technologies in the IoT. Authors in [4–6] focus on sensing the surroundings in the IoT based on sensor network technologies. Communication protocols in the IoT have been studied in [7, 8].

The energy consumption optimization problem in the WSN, which we focus on, has also been studied. In [3], the authors proposed a tradeoff between bandwidth and energy consumption in the IoT. In [9], the authors studied tracking performance and energy consumption optimization. To the best of our knowledge, few studies have paid attention to AP selection for the sensors.

In terms of the optimization approach, there are two main types: the centralized optimization approach and the distributed optimization approach. In [13], the authors pointed out that centralized optimization would bring extremely high complexity due to the dense and dynamic deployment of sensors. Therefore, the distributed optimization approach is adopted in this paper.

For the distributed optimization approach, we focus on game theory [14], which is a powerful tool to study the interactions among multiple players and has been used for distributed optimization in many studies of distributed networks [15, 16]. Because of its favorable properties for distributed global optimization, the potential game has also been applied in several studies. In [17], the authors proposed a multicell coordination approach to mitigate the mutual interference among base stations in frequency slotted cellular networks. In [18], the authors investigated the problem of joint base station selection and resource allocation in an orthogonal frequency division multiple access heterogeneous cellular network, analyzed this problem by using potential game theoretic approaches, and proposed two different variants of Max-logit learning algorithms which achieved outstanding performance.

For the WSN in the IoT, there are only a few game based studies to the best of our knowledge. In [3], a service providing model was built by using a differential game model. The game solution was obtained under the conditions of grand coalition, feedback Nash equilibrium, and intermediate coalitions, and an allocation policy was obtained by Shapley theory. Authors in [19] proposed an energy aware trust derivation scheme using a game theoretic approach, which managed overhead while maintaining adequate security of WSNs. Nevertheless, for the distributed global energy efficient AP selection optimization in the IoT which we focus on, related studies are very limited.

3. System Model and Problem Formulation

We consider a WSN consisting of M sensors and N APs. Sensors exchange information with other things in the IoT by communicating with APs, which connect them to the Internet. A large number of sensors and APs together constitute the connected and cooperative IoT. Because wireless links allow flexible placement and deployment of sensors, we assume that sensors communicate with APs wirelessly.

Importantly, the APs are considered to be heterogeneous; that is, their spectrum bandwidth resources differ. The heterogeneous wireless network is a promising paradigm in 4G and foreseeable 5G wireless communications. For example, in Figure 1, there are two types of APs with different spectrum bandwidth resources. The spectrum bandwidth resource of different APs might be 20 MHz or 3 MHz, which could represent common 3G, 4G, and 5G cellular network cells, Wi-Fi access points, and so on. Sensors are widely deployed and can freely select APs according to the environment, and the bandwidth resource of each AP would be equally allocated to the sensors connected with this AP.

Figure 1

System model.

As an example of energy efficient AP selection, in Figure 1, sensor A could select AP1, AP2, or AP3. The selection decision would be determined by the energy efficiency. The distance for sensor A to AP2 is shortest, but the number of sensors which have selected AP2 is much more than that of AP1. If sensor A selects AP2, it will share the bandwidth resource with the other 4 sensors and the bandwidth obtained would be less than that obtained from AP1. In all, the AP selection would be determined by the distribution of sensors, the channel qualities, the distribution of APs, the bandwidth resources of APs, and the selection of other sensors.

Denote $S_{AP} = \{1, 2, \dots, N\}$ as the set of APs, $B_{AP} = \{B_{1}, B_{2}, \dots, B_{N}\}$ as the bandwidths of APs, and $S_{SE} = \{1, 2, \dots, M\}$ as the set of sensors.

Because the quality of a wireless channel depends on location and channel frequency, choosing a different frequency channel leads to different energy consumption. If sensor m chooses ${AP}_{n}$, the bandwidth allocated to sensor m would be

\begin{matrix} B_{m, n} = \frac{B_{n}}{|Ω_{n}|}, \end{matrix}

(1)

where $|Ω_{n}|$ is the number of sensors connected with ${AP}_{n}$. According to the Shannon equation, the capacity obtained by sensor m would be given by

\begin{matrix} R_{m, n} = B_{m, n} \log (1 + \frac{P_{m, n} {δ_{m, n}}^{- γ_{m, n}} ρ_{m, n}}{B_{m, n} N_{0}}), \end{matrix}

(2)

where $N_{0}$ is the noise power spectrum density, $P_{m, n}$ is the transmitting power of sensor m when connected with ${AP}_{n}$, $δ_{m, n}$ is the distance between sensor m and ${AP}_{n}$, $γ_{m, n}$ is the path loss exponent between sensor m and ${AP}_{n}$, and $ρ_{m, n}$ is the instantaneous random component of the path loss [20]. Note that (1) could be replaced by any other allocation function, for example, $B_{m, n} = f (B_{n})$. The following analysis, including the game model design, the utility function, and the proved properties, would not change. We adopt (1) because bandwidth allocation within an AP is not the focus of this paper; any allocation scheme could be used.

Then, solving (2) for the power, the transmission power of sensor m connected with ${AP}_{n}$ would be given by

\begin{matrix} P_{m, n} = \frac{B_{m, n} N_{0} (\exp (R_{m, n} / B_{m, n}) - 1)}{{δ_{m, n}}^{- γ_{m, n}} ρ_{m, n}} = \frac{B_{n} N_{0} (\exp (|Ω_{n}| R_{m, n} / B_{n}) - 1)}{|Ω_{n}| {δ_{m, n}}^{- γ_{m, n}} ρ_{m, n}} . \end{matrix}

(3)
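As a minimal sketch of how (1), (2), and (3) fit together, the following Python functions compute the equal bandwidth share, the Shannon capacity, and the power required for a target rate. The function names and the numeric values in the example are illustrative assumptions, not parameters taken from this paper. The natural logarithm is assumed in (2) so that it is the exact inverse of the exponential in (3).

```python
import math

def allocated_bandwidth(ap_bandwidth_hz, num_connected):
    """Equation (1): the AP bandwidth is split equally among its sensors."""
    return ap_bandwidth_hz / num_connected

def link_gain(distance_m, path_loss_exp, rho=1.0):
    """Channel gain delta^(-gamma) * rho used in equations (2) and (3)."""
    return distance_m ** (-path_loss_exp) * rho

def capacity(power_w, bandwidth_hz, gain, n0):
    """Equation (2) with a natural logarithm (assumed), so the rate is in nats/s."""
    return bandwidth_hz * math.log(1.0 + power_w * gain / (bandwidth_hz * n0))

def required_power(rate, ap_bandwidth_hz, num_connected, gain, n0):
    """Equation (3): power needed to sustain `rate` on the shared band."""
    b = allocated_bandwidth(ap_bandwidth_hz, num_connected)
    return b * n0 * math.expm1(rate / b) / gain

# Illustrative numbers only (hypothetical, not the paper's simulation setup).
gain = link_gain(distance_m=200.0, path_loss_exp=2.0)
for k in (1, 5, 10):
    p = required_power(rate=1e6, ap_bandwidth_hz=20e6,
                       num_connected=k, gain=gain, n0=1e-13)
    print(f"{k:2d} sensors on the AP -> {p * 1e3:.3f} mW")
```

The loop illustrates the coupling discussed in this section: for a fixed rate and link, the required power grows as more sensors share the same AP.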

For each sensor, given the environment parameters such as location, channel quality, and the resources of nearby APs, it needs to decide which AP to access. An AP selection with lower power consumption is preferred. However, the interaction among multiple sensors is extremely complex. The power consumption of each sensor would be affected significantly by other sensors' actions. APs with more resources are more likely to be selected by other sensors, which would reduce the bandwidth allocated to each connected sensor.

In the perspective of the whole WSN, the optimization objective is twofold. First, the state of network is expected to be stable; that is, the selections of sensors are expected to be stable. Second, the sum of the power consumption of the network is expected to be low; that is, the optimization objective is given by

\begin{matrix} \text{Problem}: \min P_{net} = \sum_{m = 1}^{M} P_{m} ⟺ \max (- P_{net}) = - \sum_{m = 1}^{M} P_{m} . \end{matrix}

(4)

Remark 1.

The proposed AP selection problem is a typical combinatorial optimization problem. Though an exhaustive search carried out by a central manager could achieve the globally optimal solution, the computation complexity is extremely high. In addition, the flexibility of the system is not good enough. For example, even in a relatively small scenario, where $N = 6$ and $M = 40$, the number of possible strategy profiles of the sensors is $6^{40} \approx 1.3367 \times 10^{31}$. The desired approach is the one with lower computation complexity and much more flexibility, and the global optimization is also expected. In the next section, we propose a game theory [14] based distributed solution to this problem.
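The size of the search space in Remark 1 can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the evaluation speed used to translate the count into a search time is a hypothetical figure chosen for illustration.

```python
def num_strategy_profiles(num_aps, num_sensors):
    """Each of the M sensors independently picks one of N APs: N**M profiles."""
    return num_aps ** num_sensors

count = num_strategy_profiles(6, 40)
print(f"{count:.4e}")

# Hypothetical evaluation speed, used only to give a sense of scale.
evaluations_per_second = 1e9
years = count / evaluations_per_second / (3600 * 24 * 365)
print(f"about {years:.1e} years of exhaustive search")
```

The count assumes that every sensor can reach every AP; with partial connectivity the number of profiles is the product of the sizes of the individual strategy sets.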

4. The Energy Efficiency Oriented Graphic Game

In this section, we propose an energy efficient oriented graphic (EEOG) game to solve the distributed AP selection optimization problem. Every sensor is regarded as a player in the game, and we aim to obtain the expected properties for the global WSN through the design of the utility function.

Definition 2.

One defines the energy efficient oriented graphic (EEOG) game as

\begin{matrix} G = \{S_{SE}, S_{A P}, ℵ, A\}, \end{matrix}

(5)

where $S_{AP}$ is the set of APs, $S_{SE}$ is the set of sensors, and ℵ is the adjacency matrix of the WSN, in which $ζ_{m, n} \in ℵ$ is the connectivity relationship between ${AP}_{n}$ and player m. If player m could connect to ${AP}_{n}$, $ζ_{m, n} = 1$; otherwise, $ζ_{m, n} = 0$. $A = A_{1} \otimes A_{2} \otimes \dots \otimes A_{M}$ is the set of strategy profiles of all the players, where ⊗ is the Cartesian product and $A_{m}$ is the available strategy set of player $m \in S_{SE}$.

Define the action of player m as $a_{m} \in A_{m}$; $u_{m}$ is the utility function of player m, and $u_{m} (a_{m}, a_{- m})$ denotes player m's utility when action $a_{m}$ is adopted by m and $a_{- m}$ is the action profile of the other players.

Define $C_{m} \subseteq S_{SE}$ as the set of sensors which are related to the action of player m:

\begin{matrix} C_{m} = \{i \in S_{SE} : \text{if } ζ_{m, j} = 1, \ ζ_{i, j} = 1, \forall j \in S_{AP}\} . \end{matrix}

(6)

To achieve the expected global network performance, based on the regional interaction and the cooperation feature in the IoT [1], and motivated by local collaboration in biological systems [21, 22] and the collaboration design in networks [23–25], we define the utility function of player m as follows:

\begin{matrix} u_{m} (a_{m}, a_{- m}) = - P_{m} - \sum_{i \in C_{m}} P_{i}, \end{matrix}

(7)

where $P_{i}$ is the power consumption of player i and $C_{m}$ is the related player set defined in (6).

Definition 3.

Denote an action profile of players as $a = \{a_{1}, a_{2}, \dots, a_{M}\}$. An action profile $a^{*} = \{a_{1}^{*}, a_{2}^{*}, \dots, a_{M}^{*}\}$ is a pure strategy Nash equilibrium (NE) if and only if no player could improve its utility by deviating unilaterally; that is,

\begin{matrix} u_{m} (a_{m}^{*}, a_{- m}^{*}) \geq u_{m} (a_{m}, a_{- m}^{*}), \forall m \in S_{SE}, \forall a_{m} \in A_{m}, a_{m} \neq a_{m}^{*} . \end{matrix}

(8)

Theorem 4.

The globally optimal solution to the proposed optimization problem (4) is a pure strategy NE of the proposed EEOG game.

Proof.

The following proof is based on the potential game theory [23]. We define the network utility as the potential function of the EEOG game:

\begin{matrix} R (a_{m}, a_{- m}) = - P_{n e t} = - \sum_{m = 1}^{M} P_{m} = - P_{m} - \sum_{i \in C_{m}} P_{i} - \sum_{i \in \{S_{SE} ∖ C_{m}\}} P_{i}, \end{matrix}

(9)

and when player m unilaterally changes its action from $a_{m}$ to $a_{m}^{'}$ while the other players hold their strategies, the value of the potential function changes by

\begin{matrix} Δ R = R (a_{m}, a_{- m}) - R (a_{m}^{'}, a_{- m}) = - P_{n e t} + P_{n e t}^{'} = - P_{m} - \sum_{i \in C_{m}} P_{i} - \sum_{i \in \{S_{SE} ∖ C_{m}\}} P_{i} - (- P_{m}^{'} - \sum_{i \in C_{m}} P_{i}^{'} - \sum_{i \in \{S_{SE} ∖ C_{m}\}} P_{i}^{'}) = P_{m}^{'} - P_{m} + \sum_{i \in C_{m}} P_{i}^{'} - \sum_{i \in C_{m}} P_{i} + \sum_{i \in \{S_{SE} ∖ C_{m}\}} P_{i}^{'} - \sum_{i \in \{S_{SE} ∖ C_{m}\}} P_{i} . \end{matrix}

(10)

Note that the mutual influence would not occur when the players are not related players. Then for $i \in \{S_{SE} ∖ C_{m}\}$ , which is not the related player of m, there will be no influence from the m's action change; that is,

\begin{matrix} \sum_{i \in \{S_{SE} ∖ C_{m}\}} P_{i}^{'} - \sum_{i \in \{S_{SE} ∖ C_{m}\}} P_{i} = 0 . \end{matrix}

(11)

Then we have

\begin{matrix} Δ R = R (a_{m}, a_{- m}) - R (a_{m}^{'}, a_{- m}) = P_{m}^{'} - P_{m} + \sum_{i \in C_{m}} P_{i}^{'} - \sum_{i \in C_{m}} P_{i} = u_{m} (a_{m}, a_{- m}) - u_{m} (a_{m}^{'}, a_{- m}) = Δ u . \end{matrix}

(12)

Based on the theory in [23], the proposed EEOG game is an exact potential game, since the change in the potential function equals the change in individual utility function. In addition, the game has at least one pure strategy NE point. Furthermore, in exact potential game, any global or local maxima of the potential function should be a pure strategy NE.

According to the design of the potential function in (9), the proposed potential function equals the network utility; then the globally optimal solution to proposed problem (4) is a pure strategy NE of the proposed EEOG game. Hence, the theorem is proved.
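The exact potential property in (12) can be checked numerically on a toy instance. The sketch below enumerates every strategy profile and every unilateral deviation and compares the change in the potential (9) with the change in the utility (7). All numbers are hypothetical. The related set is read here as the sensors that share at least one reachable AP with player m; this is an assumption about (6), chosen because it guarantees that sensors outside the set are unaffected by m's move, as required by (11).

```python
import itertools
import math

BANDWIDTH = [6e6, 10e6, 20e6]            # Hz, one entry per AP (hypothetical)
REACH = [{0, 1}, {0, 1}, {1, 2}, {2}]    # REACH[m] = {n : zeta_{m,n} = 1}
DIST = {(0, 0): 100.0, (0, 1): 300.0, (1, 0): 250.0, (1, 1): 150.0,
        (2, 1): 200.0, (2, 2): 120.0, (3, 2): 180.0}   # metres (hypothetical)
RATE, N0, GAMMA = 1e6, 1e-13, 2.0        # rho_{m,n} is taken as 1

def powers(profile):
    """Per-sensor power from equation (3) for a full AP assignment."""
    load = [profile.count(n) for n in range(len(BANDWIDTH))]
    result = []
    for m, n in enumerate(profile):
        b = BANDWIDTH[n] / load[n]
        result.append(b * N0 * math.expm1(RATE / b) * DIST[(m, n)] ** GAMMA)
    return result

def related(m):
    """Assumed reading of (6): sensors sharing a reachable AP with m."""
    return [i for i in range(len(REACH)) if i != m and REACH[i] & REACH[m]]

def utility(m, profile):
    """Equation (7): own power plus the power of related sensors, negated."""
    p = powers(profile)
    return -p[m] - sum(p[i] for i in related(m))

def potential(profile):
    """Equation (9): the negative total network power."""
    return -sum(powers(profile))

def profiles():
    return itertools.product(*(sorted(r) for r in REACH))

def deviations(profile):
    """All profiles reachable by one player changing its AP."""
    for m, reach in enumerate(REACH):
        for alt in sorted(reach):
            if alt != profile[m]:
                yield m, profile[:m] + (alt,) + profile[m + 1:]

def max_potential_gap():
    """Largest |Delta R - Delta u| over all unilateral deviations."""
    return max(abs((potential(a) - potential(b)) - (utility(m, a) - utility(m, b)))
               for a in profiles() for m, b in deviations(a))

def is_nash(profile):
    return all(utility(m, profile) >= utility(m, b) - 1e-15
               for m, b in deviations(profile))

best = max(profiles(), key=potential)
print(max_potential_gap(), best, is_nash(best))
```

On this instance the gap is zero up to floating point rounding, and the profile that maximizes the potential passes the NE test, in line with the proof above.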

5. Energy Efficient Uncorrelated Concurrent Learning Algorithm

In this section, we propose an energy efficient uncorrelated concurrent learning (EEUCL) algorithm to achieve the global optimal solution of the energy efficiency oriented AP selection problem.

Due to the large scale of the WSN in the IoT and the local interactive characteristic of wireless communication, we design an uncorrelated concurrent updating mechanism to speed up the learning. In addition, compared with some existing learning algorithms such as best response dynamics [26], the proposed algorithm introduces a probabilistic decision mechanism inspired by the Boltzmann exploration strategy [27] to avoid converging to suboptimal NE points. Furthermore, we prove that the proposed algorithm converges to a unique stationary distribution of players' strategy profile, and the global optimum would be achieved with an arbitrarily high probability. The EEUCL algorithm is shown below.

The proposed EEUCL Algorithm

Initialization. Set the iteration $i = 0$ ; each sensor randomly chooses an AP from its available action set $A_{m}$ .

Loop

Step 1 (power consumption computation).

All the sensors calculate the bandwidth allocated by the selected APs and compute the power consumption according to the bandwidth and the channel condition based on (1), (2), and (3).

Step 2 (updating players selection).

For every group of players in the same coverage scale of one AP, set a random counting timer, and the players whose timer counts to zero first would be the selected updating players. The selected updating players would be uncorrelated players, and their actions would not influence others. The selected updating players could change their AP selection strategies in the next iteration, while others would hold their strategies.

Step 3 (strategy updating).

Each selected updating player, for example, player m, randomly chooses one action $a_{m}^{\land} = {A P}_{m}^{\land}$ from its available strategy space $A_{m}$ , calculates the expected utilities $u_{m}^{\land}$ based on (7), and then randomly chooses an action in the next iteration, according to the mixed strategy, where the probability is given by

\begin{matrix} P r (a_{m} (i + 1) = a_{m}) = \frac{\exp \{ε u_{m}\}}{\exp \{ε u_{m}\} + \exp \{ε u_{m}^{\land}\}} \\ P r (a_{m} (i + 1) = a_{m}^{\land}) = \frac{\exp \{ε u_{m}^{\land}\}}{\exp \{ε u_{m}\} + \exp \{ε u_{m}^{\land}\}}, \end{matrix}

(13)

where $u_{m}$ is the reward in the current iteration and ε is the learning parameter, $ε > 0$.

End Loop.

In Step 3, we use the probabilistic decision mechanism to escape from suboptimal states. The learning parameter ε should be chosen according to the environment and the expected performance. If fast convergence is the main goal, ε should be set larger, which means that players almost always choose the action with the higher utility. If network performance is the main goal, ε should be set smaller, which means that players have more chance for exploration and can escape from suboptimal NE points.
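The loop above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' simulation code: one updating player is drawn per iteration instead of one per AP coverage group, every sensor can reach every AP (so the utility (7) reduces to the negative network power), $ρ_{m, n} = 1$, and the distances are random placeholders. The names `network_power`, `switch_probability`, and `eeucl` are hypothetical.

```python
import math
import random

def network_power(profile, bandwidth, dist, rate=1e6, n0=1e-13, gamma=2.0):
    """Total transmit power from equation (3), taking rho = 1 on every link."""
    load = [profile.count(n) for n in range(len(bandwidth))]
    total = 0.0
    for m, n in enumerate(profile):
        b = bandwidth[n] / load[n]
        total += b * n0 * math.expm1(rate / b) * dist[m][n] ** gamma
    return total

def switch_probability(u_current, u_candidate, eps):
    """Second line of equation (13), written as a numerically stable logistic."""
    z = eps * (u_candidate - u_current)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def eeucl(bandwidth, dist, eps, iterations, rng):
    num_sensors, num_aps = len(dist), len(bandwidth)
    # Initialization: every sensor picks a random AP.
    profile = [rng.randrange(num_aps) for _ in range(num_sensors)]
    history = [network_power(profile, bandwidth, dist)]
    for _ in range(iterations):
        m = rng.randrange(num_sensors)           # Step 2, simplified to one updater
        candidate = list(profile)
        candidate[m] = rng.randrange(num_aps)    # Step 3: random trial AP
        # Step 1 and utility (7); with full connectivity u_m = -P_net.
        u_cur = -network_power(profile, bandwidth, dist)
        u_new = -network_power(candidate, bandwidth, dist)
        if rng.random() < switch_probability(u_cur, u_new, eps):
            profile = candidate
        history.append(network_power(profile, bandwidth, dist))
    return profile, history

rng = random.Random(0)
bandwidth = [6e6, 10e6, 20e6, 25e6, 32e6]        # Hz
dist = [[rng.uniform(50.0, 700.0) for _ in bandwidth] for _ in range(20)]
final_profile, history = eeucl(bandwidth, dist, eps=1e4, iterations=2000, rng=rng)
print(f"start {history[0] * 1e3:.2f} mW, end {history[-1] * 1e3:.2f} mW")
```

Because the utilities here are powers in watts, utility differences are small, so ε has to be large (thousands or more) before the rule in (13) departs noticeably from a coin flip; a suitable scale depends on the units chosen for the utility.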

Theorem 5.

With the increase of the learning parameter ε, the proposed EEUCL algorithm will converge to the global optimum with an arbitrarily high probability.

Proof.

Denote $a_{m} = {AP}_{m} (i)$ as player m's action in the ith iteration, that is, choosing ${AP}_{m} (i)$, and denote the network state as $Ψ (i) = ({AP}_{1} (i), {AP}_{2} (i), \dots, {AP}_{m} (i), \dots, {AP}_{M} (i))$. According to the updating procedure in the proposed algorithm, $Ψ (i)$ is a discrete time Markov process with a unique stationary distribution [28], because $Ψ (i + 1)$ is determined only by $Ψ (i)$. For a strategy profile $\bar{a} = \{\bar{a_{1}}, \bar{a_{2}}, \dots, \bar{a_{M}}\}$, the stationary probability is given by

\begin{matrix} π (\bar{a}) = \frac{\exp \{ε R (\bar{a})\}}{\sum_{a \in A} \exp \{ε R (a)\}}, \end{matrix}

(14)

where $R (a)$ is the potential function and $A = A_{1} \otimes A_{2} \otimes \dots \otimes A_{M}$ is the set of strategy profiles of all the players, where ⊗ is the Cartesian product.

Denote $Ψ (i) = a_{1}$ and $Ψ (i + 1) = a_{2}$, and denote the transition probability from state $a_{1}$ to $a_{2}$ as $P_{a_{1}, a_{2}}$ and the transition probability from state $a_{2}$ to $a_{1}$ as $P_{a_{2}, a_{1}}$. Consider one updating player changing its AP selection, for example, player m changing its action from $a_{m} (i) = {AP}_{m} (i)$ to $a_{m}^{\land} (i + 1) = {AP}_{m}^{\land} (i + 1)$. Then only one element differs between $a_{1}$ and $a_{2}$: the state changes from $Ψ (i) = ({AP}_{1} (i), {AP}_{2} (i), \dots, {AP}_{m} (i), \dots, {AP}_{M} (i))$ to $Ψ (i + 1) = ({AP}_{1} (i + 1), {AP}_{2} (i + 1), \dots, {AP}_{m}^{\land} (i + 1), \dots, {AP}_{M} (i + 1))$.

The probability that player m is chosen to update its action is $1 / M$. Then we have

\begin{matrix} π (a_{1}) P_{a_{1}, a_{2}} = [\frac{\exp \{ε R (a_{1})\}}{\sum_{a \in A} \exp \{ε R (a)\}}] \times [(\frac{1}{M}) (\frac{\exp \{ε u_{m}^{\land}\}}{\exp \{ε u_{m}\} + \exp \{ε u_{m}^{\land}\}})] = \frac{\exp \{ε (R (a_{1}) + u_{m}^{\land} (a_{m}^{\land}, a_{- m}))\}}{M \times \sum_{a \in A} \exp \{ε R (a)\} \times (\exp \{ε u_{m}\} + \exp \{ε u_{m}^{\land}\})} . \end{matrix}

(15)

Similarly, we have

\begin{matrix} π (a_{2}) P_{a_{2}, a_{1}} = [\frac{\exp \{ε R (a_{2})\}}{\sum_{a \in A} \exp \{ε R (a)\}}] \times [(\frac{1}{M}) (\frac{\exp \{ε u_{m}\}}{\exp \{ε u_{m}\} + \exp \{ε u_{m}^{\land}\}})] = \frac{\exp \{ε (R (a_{2}) + u_{m} (a_{m}, a_{- m}))\}}{M \times \sum_{a \in A} \exp \{ε R (a)\} \times (\exp \{ε u_{m}\} + \exp \{ε u_{m}^{\land}\})} . \end{matrix}

(16)

Because the game has been proved to be an exact potential game, we have

\begin{matrix} R (a_{1}) - R (a_{2}) = u_{m} (a_{m}, a_{- m}) - u_{m}^{\land} (a_{m}^{\land}, a_{- m}) . \end{matrix}

(17)

Then we have

\begin{matrix} R (a_{1}) + u_{m}^{\land} (a_{m}^{\land}, a_{- m}) = R (a_{2}) + u_{m} (a_{m}, a_{- m}) . \end{matrix}

(18)

Furthermore,

\begin{matrix} \exp \{ε (R (a_{1}) + u_{m}^{\land} (a_{m}^{\land}, a_{- m}))\} = \exp \{ε (R (a_{2}) + u_{m} (a_{m}, a_{- m}))\}, \end{matrix}

(19)

and thus

\begin{matrix} π (a_{1}) P_{a_{1}, a_{2}} = π (a_{2}) P_{a_{2}, a_{1}} . \end{matrix}

(20)

In all, we have

\begin{matrix} \sum_{a_{1} \in A} π (a_{1}) P_{a_{1}, a_{2}} = \sum_{a_{1} \in A} π (a_{2}) P_{a_{2}, a_{1}} = π (a_{2}) \sum_{a_{1} \in A} P_{a_{2}, a_{1}} = π (a_{2}) . \end{matrix}

(21)

Equation (21) shows that the balance condition of Markov process is satisfied, and the proposed algorithm has the unique stationary distribution as (14) according to the discrete time Markov process theory [28].
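The balance condition in (20) can be verified numerically on a small example. The sketch below builds the stationary distribution (14) for a toy potential and checks that $π (a_{1}) P_{a_{1}, a_{2}} = π (a_{2}) P_{a_{2}, a_{1}}$ for every pair of profiles that differ in one player's action. The potential values are hypothetical, and the uniform choice of the trial action is an assumption; that factor is the same in both directions and cancels in (20).

```python
import itertools
import math

def gibbs_distribution(potential, eps):
    """Equation (14): pi(a) proportional to exp(eps * R(a))."""
    weights = {a: math.exp(eps * r) for a, r in potential.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

def transition_probability(a1, a2, potential, eps, num_actions):
    """P_{a1,a2} when a1 and a2 differ in exactly one player's action."""
    changed = [m for m in range(len(a1)) if a1[m] != a2[m]]
    if len(changed) != 1:
        raise ValueError("profiles must differ in exactly one player's action")
    pick_player = 1.0 / len(a1)        # the 1/M factor in (15)
    pick_trial = 1.0 / num_actions     # uniform trial action (assumed)
    # Exact potential game: u_hat - u equals R(a2) - R(a1), as in (17).
    accept = 1.0 / (1.0 + math.exp(-eps * (potential[a2] - potential[a1])))
    return pick_player * pick_trial * accept

def max_balance_violation(potential, eps, num_actions):
    """Largest |pi(a1) P(a1,a2) - pi(a2) P(a2,a1)| over neighbouring profiles."""
    pi = gibbs_distribution(potential, eps)
    worst = 0.0
    for a1, a2 in itertools.permutations(potential, 2):
        if sum(x != y for x, y in zip(a1, a2)) == 1:
            lhs = pi[a1] * transition_probability(a1, a2, potential, eps, num_actions)
            rhs = pi[a2] * transition_probability(a2, a1, potential, eps, num_actions)
            worst = max(worst, abs(lhs - rhs))
    return worst

# Hypothetical potential values for 2 players with 3 actions each.
toy_potential = {a: -(1.0 + 0.7 * a[0] + 0.4 * a[1] + 0.5 * a[0] * a[1])
                 for a in itertools.product(range(3), repeat=2)}
print(max_balance_violation(toy_potential, eps=2.0, num_actions=3))
```

The printed violation should be zero up to floating point rounding, which is the numerical counterpart of (20).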

Furthermore, according to Theorem 4, the global optimal solution to network utility is exactly the best pure strategy NE of the game. Suppose that $a^{o p t}$ is the globally optimal action profile of players; according to the design of the potential function, we have

\begin{matrix} a^{opt} = \underset{a \in A}{\arg \min} P_{net} = \underset{a \in A}{\arg \max} R (a) . \end{matrix}

(22)

According to (14) and the balance condition (21), the algorithm converges to the unique stationary distribution $π (\bar{a}) = \exp \{ε R (\bar{a})\} / \sum_{a \in A} \exp \{ε R (a)\}$. When $ε \to \infty$, $\exp \{ε R (a^{opt})\} ≫ \exp \{ε R (a)\}$, $\forall a \in \{A ∖ a^{opt}\}$.

Then the probability of globally optimal solution $a^{o p t}$ will be

\begin{matrix} \underset{ε \to \infty}{\lim} π (a^{opt}) = \underset{ε \to \infty}{\lim} \frac{\exp \{ε R (a^{opt})\}}{\sum_{a \in A} \exp \{ε R (a)\}} = 1 . \end{matrix}

(23)

This result means that as the learning parameter increases the proposed learning algorithm converges to the global optimal solution to problem (4) with an arbitrarily high probability. Thus, the proof is completed. In other words, the proposed algorithm could achieve global optimum as the central exhaustive searching approach, but with much lower complexity in a distributed manner.
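The limit in (23) can be illustrated with a short calculation. The sketch below evaluates the stationary probability of the best profile from (14) for a handful of hypothetical potential values and increasing ε; it assumes the optimum is unique.

```python
import math

def optimum_probability(potentials, eps):
    """pi(a_opt) from (14); the maximum is subtracted to avoid overflow."""
    best = max(potentials)
    return 1.0 / sum(math.exp(eps * (r - best)) for r in potentials)

# Hypothetical potential values R(a) = -P_net for four strategy profiles.
toy_potentials = [-2.0, -2.5, -3.0, -5.0]
for eps in (0.1, 1.0, 10.0, 100.0):
    print(f"eps = {eps:6.1f}  pi(a_opt) = {optimum_probability(toy_potentials, eps):.4f}")
```

For small ε the distribution is close to uniform over the profiles, and as ε grows the probability mass concentrates on the profile with the largest potential, that is, the lowest network power.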

6. Numeric Results and Discussion

In this section, we evaluate the performance of the proposed EEUCL algorithm by Matlab simulations. Without loss of generality, the following parameters are set: the scale of the network is $1000\,\text{m} \times 1000\,\text{m}$. The path loss exponent is 2, and the noise power is $N_{0} = - 130$ dB. The instantaneous random components are assumed to be unit-constant and the channels are assumed to undergo Rayleigh fading with unit mean. The transmission data rate is 1 Mbit/s. The number of APs is 5, and the bandwidth of APs is set as $[6, 10, 20, 25, 32]$ MHz. The bandwidth values are set to simulate the cells in LTE or WLAN [29]. The 20 sensors are randomly deployed, so the transmission distance from any sensor to each AP is random.

Figure 2 shows that the proposed EEUCL algorithm converges to the stable state in about 310 learning iterations. The results are obtained by simulating 1000 independent experiments and then taking the average value. The power consumption of total network is 53.2 mW when the APs are randomly selected at the beginning. Then the power consumption reduces to almost 23.4 mW after the proposed algorithm converges. The best response (BR) algorithm [26] is used to compare with the proposed algorithm. Figure 2 shows that the proposed EEUCL algorithm obtains better performance compared with the BR algorithm, at some cost of converging speed. The reason is that the BR learning algorithm may converge to some local optimum, while the proposed algorithm could escape from local optimum to achieve global optimum based on the design of probabilistic decision in Step 3 of the proposed algorithm (13).

Figure 2

The proposed EEUCL algorithm versus BR learning algorithm, $M = 20$ .

To further verify the proposed algorithm, we observe one sensor's power consumption during the learning procedure, for example, sensor A in Figure 1. Figure 3 shows that the power consumption fluctuates during learning, because other sensors' strategies change and influence it. Importantly, the observed sensor's power consumption converges to a stable value as the algorithm converges.

Figure 3

The power consumption of one sensor in the procedure of learning.

Figure 4 shows the convergence of the proposed algorithm from the AP aspect. We observe the number of sensors which select the 32 MHz AP in the procedure of learning. The strategy changes of sensors would cause the fluctuation. Again, the number of sensors converges to a stable value.

Figure 4

The number of sensors which select the 32 MHz AP.

7. Conclusion

In this paper, we studied the energy efficient access point (AP) selection for cognitive sensors in the Internet of Things (IoT). We proposed energy efficient oriented graphic (EEOG) game model and proved that the proposed game was an exact potential game. Then, to achieve the global optimization to the proposed energy efficient AP selection problem in a distributed manner, we designed energy efficient uncorrelated concurrent learning algorithm to obtain the global optimization. We proved that the proposed learning algorithm could achieve the optimal solution with an arbitrarily high probability. Simulation results verified the theoretic analysis and the performance of proposed algorithm.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research is supported by Natural Science Foundation of China (no. 71571162), The National Key Technology R and D Program of China (Grant 2014BAH24F06), Natural Science Foundation of Zhejiang Province (Grants LY14F020002 and LY15G010001), and scientific innovation of university.

References

1. Ding, Feng, Wang, and Long, "Cognitive internet of things: a new paradigm beyond connection," IEEE Internet of Things Journal, vol. 1, no. 2, pp. 129–143, 2014. doi:10.1109/jiot.2014.2311513.

2. Agrawal and M. L. Das, "Internet of things—a paradigm shift of future Internet applications," in Proceedings of the Nirma University International Conference on Engineering (NUiCONE ′11), pp. 1–7, IEEE, Ahmedabad, India, December 2011. doi:10.1109/NUiConE.2011.6153246.

3. Lin, Liu, Zhou, Chen, and Huang, "Cooperative differential game for model energy-bandwidth efficiency tradeoff in the Internet of Things," China Communications, vol. 11, no. 1, pp. 92–102, 2014. doi:10.1109/CC.2014.6821311.

4. Tozlu, Senel, Mao, and Keshavarzian, "Wi-Fi enabled sensors for internet of things: a practical approach," IEEE Communications Magazine, vol. 50, no. 6, pp. 134–143, 2012. doi:10.1109/mcom.2012.6211498. Scopus: 2-s2.0-84862276949.

5. Sánchez López, D. C. Ranasinghe, Harrison, and McFarlane, "Adding sense to the internet of things: an architecture framework for smart object systems," Personal and Ubiquitous Computing, vol. 16, no. 3, pp. 291–308, 2012. doi:10.1007/s00779-011-0399-8. Scopus: 2-s2.0-84860881384.

6. Roman, Alcaraz, Lopez, and Sklavos, "Key management systems for sensor networks in the context of the internet of things," Computers and Electrical Engineering, vol. 37, no. 2, pp. 147–159, 2011. doi:10.1016/j.compeleceng.2011.01.009. Scopus: 2-s2.0-79955712919.

7. Chen, Wang, Liu, and Wei, "A context-aware routing protocol on internet of things based on sea computing model," Journal of Computers, vol. 7, no. 1, pp. 96–105, 2012. doi:10.4304/jcp.7.1.96-105. Scopus: 2-s2.0-84863064953.

8. Danieletto, Bui, and Zorzi, "Improving internet of things communications through compression and classification," in Proceedings of the IEEE International Conference on Pervasive Computing and Communications Workshops, pp. 284–289, IEEE, Lugano, Switzerland, March 2012. doi:10.1109/percomw.2012.6197496. Scopus: 2-s2.0-84861536026.

9. Shan and Chunxiao, "Joint optimization on energy and delay for target tracking in internet of things," China Communications, vol. 8, no. 1, pp. 20–27, 2011. Scopus: 2-s2.0-80051973076.

10. Yao and Yin, "Sequential channel sensing in cognitive small cell based on user traffic," IEEE Communications Letters, vol. 19, no. 4, pp. 637–640, 2015. doi:10.1109/lcomm.2015.2401564.

11. Yao, "A hybrid combination scheme for cooperative spectrum sensing in cognitive radio networks," Mathematical Problems in Engineering, vol. 2014, Article ID 106106, 7 pages, 2014. doi:10.1155/2014/106106. Scopus: 2-s2.0-84896920209.

12. Yao and Zhou, "Traffic based optimization of spectrum sensing in cognitive radio networks," Mathematical Problems in Engineering, vol. 2014, Article ID 349350, 7 pages, 2014. doi:10.1155/2014/349350. Scopus: 2-s2.0-84911889384.

13. Duan, Gao, Yang, C. H. Foh, and Chen, "An energy-aware trust derivation scheme with game theoretic approach in wireless sensor networks for IoT applications," IEEE Internet of Things Journal, vol. 1, no. 1, pp. 58–69, 2014. doi:10.1109/jiot.2014.2314132.

14. Fudenberg and D. K. Levine, The Theory of Learning in Games, MIT Press, Cambridge, Mass, USA, 1998.

15. J. E. Suris, L. A. Dasilva, Han, A. B. Mackenzie, and R. S. Komali, "Asymptotic optimality for distributed spectrum sharing using bargaining solutions," IEEE Transactions on Wireless Communications, vol. 8, no. 10, pp. 5225–5237, 2009. doi:10.1109/twc.2009.081340. Scopus: 2-s2.0-70350586993.

16. Zhang, Shi, H.-H. Chen, Guizani, and Qiu, "A cooperation strategy based on Nash bargaining solution in cooperative relay networks," IEEE Transactions on Vehicular Technology, vol. 57, no. 4, pp. 2570–2577, 2008. doi:10.1109/tvt.2007.912960. Scopus: 2-s2.0-48749105529.

17. Zheng, Cai, Liu, Duan, and Shen, "Optimal power allocation and user scheduling in multicell networks: base station cooperation using a game-theoretic approach," IEEE Transactions on Wireless Communications, vol. 13, no. 12, pp. 6928–6942, 2014. doi:10.1109/twc.2014.2334673. Scopus: 2-s2.0-84919472833.

18. Dai, Huang, and Yang, "Game theoretic max-logit learning approaches for joint base station selection and resource allocation in heterogeneous networks," IEEE Journal on Selected Areas in Communications, vol. 33, no. 6, pp. 1068–1081, 2015. doi:10.1109/jsac.2015.2416988.

19. Duan, Gao, Yang, C. H. Foh, and Chen, "An energy-aware trust derivation scheme with game theoretic approach in wireless sensor networks for IoT applications," IEEE Internet of Things Journal, vol. 1, no. 1, pp. 58–69, 2014. doi:10.1109/jiot.2014.2314132.

20. Stuber, Principles of Mobile Communications, Kluwer Academic Publishers, Norwell, Mass, USA, 2nd edition, 2001.

21. M. A. Nowak, "Five rules for the evolution of cooperation," Science, vol. 314, no. 5805, pp. 1560–1563, 2006. doi:10.1126/science.1133755. Scopus: 2-s2.0-33845415805.

22. F. C. Santos, M. D. Santos, and J. M. Pacheco, "Social diversity promotes the emergence of cooperation in public goods games," Nature, vol. 454, no. 7201, pp. 213–216, 2008. doi:10.1038/nature06940. Scopus: 2-s2.0-47049124531.

23. J. R. Marden, Arslan, and J. S. Shamma, "Cooperative control and potential games," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 6, pp. 1393–1407, 2009. doi:10.1109/tsmcb.2009.2017273. Scopus: 2-s2.0-70349623176.

24. Maskery, Krishnamurthy, and Zhao, "Decentralized dynamic spectrum access for cognitive radios: cooperative design of a non-cooperative game," IEEE Transactions on Communications, vol. 57, no. 2, pp. 459–569, 2009. doi:10.1109/tcomm.2009.02.070158. Scopus: 2-s2.0-61649121728.

25. Altman, Jiménez, Vicuna, and Márquez, "Coordination games over collision channels," in Proceedings of the 6th International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (Wiopt ′08), pp. 523–527, Berlin, Germany, April 2008. doi:10.1109/wiopt.2008.4586122. Scopus: 2-s2.0-52249114927.

26. Zhong and Tianfield, "Game-theoretic opportunistic spectrum sharing strategy selection for cognitive MIMO multiple access channels," IEEE Transactions on Signal Processing, vol. 59, no. 6, pp. 2745–2759, 2011. doi:10.1109/tsp.2011.2121063. Scopus: 2-s2.0-79957523610.

27. P. J. M. van Laarhoven and E. H. L. Aarts, Simulated Annealing: Theory and Applications, North Holland, Reidel, Amsterdam, The Netherlands, 1987. doi:10.1007/978-94-015-7744-1.

28. H. P. Young, Individual Strategy and Social Structure, Princeton University Press, Princeton, NJ, USA, 1998.

29. Raychaudhuri and N. B. Mandayam, "Frontiers of wireless and mobile communications," Proceedings of the IEEE, vol. 100, no. 4, pp. 824–840, 2012. doi:10.1109/jproc.2011.2182095. Scopus: 2-s2.0-84858996596.