Abstract
This paper studies distributed, energy-efficient access point (AP) selection for cognitive sensors in the Internet of Things (IoT). Energy consumption is critical for the wireless sensor network (WSN), and centralized control incurs extremely high complexity because sensors in the IoT are densely and dynamically deployed. The desired approach therefore has low computational complexity and high flexibility while still reaching the global optimum. We solve the multisensor AP selection problem using game theory and a distributed learning algorithm. First, we formulate an energy-oriented AP selection problem and propose a game model that is proved to be an exact potential game. Second, we design a distributed learning algorithm that obtains the globally optimal solution to the problem in a distributed manner. Finally, simulation results verify the theoretical analysis and show that the proposed approach achieves much higher energy efficiency.
1. Introduction
Much attention has recently been paid to the emerging Internet of Things (IoT), which has applications in climate monitoring, transport safety, home automation, health care, and so on. The IoT is seen as an important technological revolution that brings us into a new era of ubiquitous connectivity, computing, and communication [1]. The IoT is a global network that allows anything in the world to communicate by providing a unique digital identity to each and every object [2]. Sensing the environment and connecting with other things are expected to be basic capabilities of users in the IoT. The wireless sensor network (WSN), which senses the environment and communicates with other things, is therefore significant to the IoT.
Information exchange is one of the key functions of the WSN. Sensors should share their sensing information with other things in the IoT through the Internet. Although wired connectivity can provide high-speed communication, its limits on distance and location constrain the development of the IoT. With the rapid development of the mobile Internet and wireless communication, information exchange over wireless links can overcome the limitations of wired communication, which significantly promotes the development of the IoT.
When sensors in the IoT operate wirelessly, energy is a critical resource [3]. It is very important for a sensor to consume as little energy as possible when communicating with its Internet access point (AP). Given the communication model and the data rates, the energy consumption is mainly determined by the allocated bandwidth and the channel quality between the sensor and its AP. Because sensors may move and the wireless channel quality between sensors and APs changes, allocating sensors to APs in a predefined and fixed way is inefficient and fragile. In addition, given the huge number of sensors in the IoT, allocating a fixed spectrum resource to each sensor is impossible because wireless spectrum is limited.
There is some research on the WSN in the IoT. The authors of [4–6] focus on sensing the surroundings in the IoT based on sensor network technologies. The authors of [7, 8] studied communication protocols. There is also some research on the energy efficiency problem in the WSN. Tracking effects and energy consumption optimization were studied in [9]. A tradeoff between bandwidth and energy consumption in the IoT was studied in [3]. To the best of our knowledge, few studies have paid attention to AP selection for the sensors.
Recently, the cognitive Internet of Things (CIoT) has been proposed, which aims to empower the current IoT with a “brain” for high-level intelligence [1]. Cognitive sensors can obtain spectrum information about the environment through spectrum sensing approaches [10–12]. Based on this intelligence, sensors in the CIoT can select proper APs according to the environment and the actions of other sensors, and thereby obtain better energy efficiency.
In this paper, we focus on energy-efficient AP selection for cognitive sensors in the WSN, which is an important and practical problem in the IoT. Because centralized control is extremely complex for densely and dynamically deployed sensors [13], we focus on a distributed selection scheme, which adapts better to a changing environment. We solve the multisensor AP selection problem using game theory [14] and a distributed learning algorithm. First, we formulate an energy-oriented AP selection problem and propose a game model that is proved to be an exact potential game. Second, we design a distributed learning algorithm that obtains the globally optimal solution to the problem in a distributed manner. Finally, simulation results verify the theoretical analysis and show that the proposed approach achieves much higher energy efficiency.
The rest of this paper is organized as follows. Section 2 reviews related work. Section 3 presents the system model and problem formulation. Section 4 formulates the proposed game model and investigates its properties. Section 5 proposes the distributed learning algorithm. Section 6 presents simulation results and discussion. Finally, Section 7 concludes the paper.
2. Related Works
There is some research on IoT technologies. The authors of [4–6] focus on sensing the surroundings in the IoT based on sensor network technologies. Identification in the IoT has been studied in [7, 8].
There is also some research on the energy consumption optimization problem in the WSN, which is our focus. In [3], the authors proposed a tradeoff between bandwidth and energy consumption in the IoT. In [9], the authors studied tracking effects and energy consumption optimization. To the best of our knowledge, few studies have paid attention to AP selection for the sensors.
In terms of the optimization approach, there are two main types: centralized and distributed. In [13], the authors pointed out that centralized optimization incurs extremely high complexity due to the dense and dynamic deployment of sensors. This paper therefore adopts the distributed optimization approach.
Within the distributed optimization approach, we focus on game theory [14], a powerful tool for studying the interactions among multiple players that has been used for distributed optimization in many studies of distributed networks [15, 16]. Because of its good properties for distributed global optimization, the potential game has also been applied in some studies. In [17], the authors proposed a multicell coordination approach to mitigate the mutual interference among base stations in frequency slotted cellular networks. In [18], the authors investigated the problem of joint base station selection and resource allocation in an orthogonal frequency division multiple access heterogeneous cellular network, analyzed this problem using potential game theoretic approaches, and proposed two variants of Max-logit learning algorithms that achieved outstanding performance.
For the WSN in the IoT, to the best of our knowledge there are only a few game-based studies. In [3], a service provision model was built using a differential game model. The game solution was obtained under the conditions of grand coalition, feedback Nash equilibrium, and intermediate coalitions, and an allocation policy was obtained by Shapley theory. The authors of [19] proposed an energy-aware trust derivation scheme using a game theoretic approach, which managed overhead while maintaining adequate security of WSNs. Nevertheless, studies related to the distributed, globally optimal, energy-efficient AP selection in the IoT on which we focus are very limited.
3. System Model and Problem Formulation
We consider a WSN consisting of M sensors and N APs. Sensors exchange information with other things in the IoT by communicating with APs, which connect them to the Internet. A large number of sensors and APs together constitute the interconnected and cooperative IoT. Because wireless links impose fewer constraints on location and allow flexible deployment of sensors, we assume that sensors and APs communicate wirelessly.
Importantly, we consider heterogeneous APs, whose spectrum bandwidth resources differ. The heterogeneous wireless network is a promising paradigm in 4G and foreseeable 5G wireless communications. For example, in Figure 1, there are two types of APs with different spectrum bandwidth resources. The spectrum bandwidth of different APs might be 20 MHz or 3 MHz, corresponding to common 3G, 4G, and 5G cellular network cells, Wi-Fi access points, and so on. Sensors are widely deployed and can freely select APs according to the environment, and the bandwidth of each AP is allocated equally among the sensors connected to it.

Figure 1: System model.
As an example of energy-efficient AP selection, in Figure 1, sensor A could select AP1, AP2, or AP3. The selection is determined by the energy efficiency. Sensor A is closest to AP2, but many more sensors have selected AP2 than AP1. If sensor A selects AP2, it will share the bandwidth resource with the other 4 sensors, and the bandwidth it obtains will be less than that obtained from AP1. In all, the AP selection is determined by the distribution of sensors, the channel qualities, the distribution of APs, the bandwidth resources of APs, and the selections of other sensors.
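The tradeoff in this example can be illustrated numerically. The following is a minimal sketch, assuming a Shannon-capacity link model in which a sensor must reach a fixed data rate over its equal share of the AP bandwidth. The function name `required_power_mw` and all numerical values (rate, channel gains, noise density, sensor counts) are hypothetical and are not taken from the paper's equations.

```python
def required_power_mw(rate_bps, ap_bandwidth_hz, sensors_on_ap,
                      channel_gain, noise_psd_mw_per_hz):
    """Transmit power (mW) needed to reach rate_bps on an equal bandwidth share.

    Assumed model (not the paper's own equations):
        rate = b * log2(1 + p * g / (N0 * b)),  with  b = B / n
    solved for the power p.
    """
    share_hz = ap_bandwidth_hz / sensors_on_ap
    snr_needed = 2.0 ** (rate_bps / share_hz) - 1.0
    return snr_needed * noise_psd_mw_per_hz * share_hz / channel_gain


# Hypothetical numbers: a close but crowded AP versus a farther, lightly loaded one.
NOISE = 4e-18  # mW/Hz, roughly thermal noise at room temperature
close_crowded = required_power_mw(2e6, 3e6, 5, channel_gain=1e-9,
                                  noise_psd_mw_per_hz=NOISE)
far_light = required_power_mw(2e6, 3e6, 2, channel_gain=5e-10,
                              noise_psd_mw_per_hz=NOISE)
print("close, crowded AP:", close_crowded, "mW")
print("far, lightly loaded AP:", far_light, "mW")
```

Under this assumed model the required power grows quickly as the per-sensor bandwidth share shrinks, so a nearer AP is not always the more energy-efficient choice.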
Denote
Because the qualities of wireless channels are related to the location and channel frequency, choosing different frequency channel means different energy consumption. If sensor m chooses
Then the transmission power of sensor m connected with
For each sensor, given environment parameters such as location, channel quality, and the resources of nearby APs, it needs to decide which AP to access. The AP with the lower power consumption is preferred. However, the interaction among multiple sensors is extremely complex. The power consumption of each sensor is significantly affected by the actions of other sensors. APs with more resources are much more likely to be selected by other sensors, which reduces the bandwidth allocated to each of them.
In the perspective of the whole WSN, the optimization objective is twofold. First, the state of network is expected to be stable; that is, the selections of sensors are expected to be stable. Second, the sum of the power consumption of the network is expected to be low; that is, the optimization objective is given by
Remark 1.
The proposed AP selection problem is a typical combinatorial optimization problem. Although an exhaustive search carried out by a central manager could reach the globally optimal solution, its computational complexity is extremely high. In addition, the flexibility of such a system is limited. For example, even in a relatively small scenario, where
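To give a sense of how quickly the centralized search space grows, the following minimal sketch counts candidate selection profiles and runs a brute-force search on a tiny instance. It assumes, as an upper bound, that every sensor can reach every AP, in which case there are N^M profiles for M sensors and N APs. The function names and the toy cost function are hypothetical.

```python
from itertools import product


def search_space_size(num_sensors, num_aps):
    """Number of selection profiles if every sensor may pick any AP."""
    return num_aps ** num_sensors


def exhaustive_search(cost, num_sensors, num_aps):
    """Return the profile (one AP index per sensor) with the smallest cost."""
    return min(product(range(num_aps), repeat=num_sensors), key=cost)


# The count grows exponentially in the number of sensors.
for sensors in (5, 10, 20):
    print(sensors, "sensors, 4 APs:", search_space_size(sensors, 4), "profiles")
```

Brute force is only practical for very small instances, which is the motivation for the distributed approach developed in the following sections.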
4. The Energy Efficiency Oriented Graphic Game
In this section, we propose an energy efficient oriented graphic (EEOG) game to solve the distributed AP selection optimization problem. Every sensor is regarded as a player in the game, and we aim to obtain the desired properties for the whole WSN through the design of the utility function.
Definition 2.
One defines the energy efficient oriented graphic (EEOG) game as
Define the action of player m as
Define
To achieve the desired global network performance, based on the regional interaction and the cooperation feature in the IoT [1], and motivated by local collaboration in biological systems [21, 22] and the collaboration design in networks [23–25], we define the utility function of each player, for example, player m's utility function, as follows:
Definition 3.
Denote an action profile of players as
Theorem 4.
The globally optimal solution to the proposed optimization problem (4) is a pure strategy NE of the proposed EEOG game; that is, the best pure strategy NE of the game is the global optimum.
Proof.
The following proof is based on the potential game theory [23]. We define the network utility as the potential function of the EEOG game:
Note that the mutual influence would not occur when the players are not related players. Then for
Then we have
Based on the theory in [23], the proposed EEOG game is an exact potential game, since the change in the potential function caused by any unilateral deviation equals the change in the deviating player's utility function. In addition, the game has at least one pure strategy NE point. Furthermore, in an exact potential game, any global or local maximum of the potential function is a pure strategy NE.
According to the design of the potential function in (9), the potential function equals the network utility; hence the globally optimal solution to the proposed problem (4) is a pure strategy NE of the proposed EEOG game. Hence, the theorem is proved.
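The exact potential property used in this proof can be checked numerically on a small instance. The sketch below is illustrative only. It assumes that two players are related when they can reach a common AP, that a player's utility is the negative total power of itself and its related players, and that the potential is the negative total network power. The toy power model (a link weight multiplied by the AP load) and all values in `REACHABLE` and `WEIGHT` are hypothetical stand-ins for the paper's equations.

```python
import itertools

# Hypothetical instance: APs each sensor can reach, and per-link cost weights.
REACHABLE = {0: (0, 1), 1: (0, 1), 2: (1, 2), 3: (2, 3), 4: (3,)}
WEIGHT = {0: {0: 1.0, 1: 2.0}, 1: {0: 1.5, 1: 1.0}, 2: {1: 0.5, 2: 2.0},
          3: {2: 1.0, 3: 3.0}, 4: {3: 1.0}}
# Assumption: players are related when their reachable AP sets overlap.
RELATED = {m: [j for j in REACHABLE
               if j != m and set(REACHABLE[j]) & set(REACHABLE[m])]
           for m in REACHABLE}


def powers(profile):
    """Stand-in power model: link weight times number of sensors on the AP."""
    load = {}
    for ap in profile:
        load[ap] = load.get(ap, 0) + 1
    return [WEIGHT[m][ap] * load[ap] for m, ap in enumerate(profile)]


def utility(m, profile):
    """Negative power of player m plus that of its related players."""
    p = powers(profile)
    return -(p[m] + sum(p[j] for j in RELATED[m]))


def potential(profile):
    """Negative total network power."""
    return -sum(powers(profile))


def all_profiles():
    return itertools.product(*(REACHABLE[m] for m in sorted(REACHABLE)))


def deviations(m, profile):
    """All profiles reachable by a unilateral deviation of player m."""
    return [profile[:m] + (alt,) + profile[m + 1:] for alt in REACHABLE[m]]


def is_exact_potential(tol=1e-9):
    """Check that utility change equals potential change for every deviation."""
    return all(abs((utility(m, new) - utility(m, old))
                   - (potential(new) - potential(old))) <= tol
               for old in all_profiles()
               for m in REACHABLE
               for new in deviations(m, old))


def is_pure_ne(profile, tol=1e-9):
    """No player can raise its own utility by deviating alone."""
    return all(utility(m, new) <= utility(m, profile) + tol
               for m in REACHABLE for new in deviations(m, profile))


best = max(all_profiles(), key=potential)
print("exact potential:", is_exact_potential())
print("potential maximizer is a pure NE:", is_pure_ne(best))
```

The check works because a deviation by one player changes the load only on APs that the player can reach, so every sensor whose power changes is, under the assumption above, one of its related players.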
5. Energy Efficient Uncorrelated Concurrent Learning Algorithm
In this section, we propose an energy efficient uncorrelated concurrent learning (EEUCL) algorithm to achieve the global optimal solution of the energy efficiency oriented AP selection problem.
Due to the large scale of the WSN in the IoT and the local interactive characteristic of wireless communication, we design an uncorrelated concurrent updating mechanism to speed up the learning. In addition, compared with some existing learning algorithms such as best response dynamics [26], the proposed algorithm introduces a probabilistic decision mechanism inspired by the Boltzmann exploration strategy [27] to avoid converging to suboptimal NE points. Furthermore, we prove that the proposed algorithm converges to a unique stationary distribution of the players' strategy profile, and that the global optimum is achieved with an arbitrarily high probability. The EEUCL algorithm is shown below.
The proposed EEUCL Algorithm
Initialization. Set the iteration
Loop
Step 1 (power consumption computation).
All the sensors calculate the bandwidth allocated by the selected APs and compute the power consumption according to the bandwidth and the channel condition based on (1), (2), and (3).
Step 2 (updating players selection).
For every group of players in the coverage area of the same AP, set a random countdown timer for each player; the players whose timers reach zero first are selected as the updating players. The selected updating players are uncorrelated players, whose actions do not influence one another. The selected updating players may change their AP selection strategies in the next iteration, while the others hold their strategies.
Step 3 (strategy updating).
Each selected updating player, for example, player m, randomly chooses one action
End Loop.
In Step 3, we use the probabilistic decision mechanism to escape from suboptimal states. The learning parameter ε should be designed according to the environment and the desired performance. If fast convergence is the main goal, ε should be set larger, which means that players choose the best response action. If network performance is the main goal, ε should be set smaller, which means that players have more chance for exploration and can escape from suboptimal NE points.
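The three steps of the loop can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The power model, the instance in `REACHABLE` and `WEIGHT`, and the definition of related players are hypothetical. The Step 3 rule is assumed to be the common binary Boltzmann rule, in which a randomly drawn candidate action replaces the current one with probability exp(ε·u_new) / (exp(ε·u_old) + exp(ε·u_new)). The random timers of Step 2 are replaced by a greedy random choice of mutually unrelated players.

```python
import math
import random

# Hypothetical instance: APs each sensor can reach, and per-link cost weights.
REACHABLE = {0: (0, 1), 1: (0, 1), 2: (1, 2), 3: (2, 3), 4: (3,)}
WEIGHT = {0: {0: 1.0, 1: 2.0}, 1: {0: 1.5, 1: 1.0}, 2: {1: 0.5, 2: 2.0},
          3: {2: 1.0, 3: 3.0}, 4: {3: 1.0}}
# Assumption: players are related when their reachable AP sets overlap.
RELATED = {m: [j for j in REACHABLE
               if j != m and set(REACHABLE[j]) & set(REACHABLE[m])]
           for m in REACHABLE}


def powers(profile):
    """Step 1 stand-in: link weight times number of sensors sharing the AP."""
    load = {}
    for ap in profile:
        load[ap] = load.get(ap, 0) + 1
    return [WEIGHT[m][ap] * load[ap] for m, ap in enumerate(profile)]


def utility(m, profile):
    """Negative power of player m plus that of its related players."""
    p = powers(profile)
    return -(p[m] + sum(p[j] for j in RELATED[m]))


def pick_uncorrelated(rng):
    """Step 2 stand-in: greedy random set in which no two players are related."""
    order = list(REACHABLE)
    rng.shuffle(order)
    chosen = []
    for m in order:
        if all(j not in RELATED[m] for j in chosen):
            chosen.append(m)
    return chosen


def switch_probability(u_old, u_new, eps):
    """Assumed Boltzmann rule: exp(eps*u_new) / (exp(eps*u_old) + exp(eps*u_new))."""
    x = eps * (u_old - u_new)
    if x > 700:  # avoid overflow in math.exp; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def run_learning(eps, iterations, seed=0):
    """Run the loop and return the final profile and total network power."""
    rng = random.Random(seed)
    profile = [rng.choice(REACHABLE[m]) for m in sorted(REACHABLE)]
    for _ in range(iterations):
        snapshot = tuple(profile)
        for m in pick_uncorrelated(rng):
            candidate = rng.choice(REACHABLE[m])  # Step 3: random trial action
            trial = snapshot[:m] + (candidate,) + snapshot[m + 1:]
            p_switch = switch_probability(utility(m, snapshot),
                                          utility(m, trial), eps)
            if rng.random() < p_switch:
                profile[m] = candidate
    return tuple(profile), sum(powers(profile))


final_profile, total_power = run_learning(eps=5.0, iterations=200)
print("final profile:", final_profile, "total power:", total_power)
```

With a large ε the switch probability approaches a hard comparison of the two utilities, which mirrors the best response behavior described above, while a small ε makes switches to worse actions more frequent.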
Theorem 5.
With the increase of the learning parameter ε, the proposed EEUCL algorithm will converge to the global optimum with an arbitrarily high probability.
Proof.
Denote
Denote
The probability for player m chosen to update its action is
Similarly, we have
Because the game has been proved to be an exact potential game, we have
Then we have
Furthermore,
In all, we have
Equation (21) shows that the balance condition of Markov process is satisfied, and the proposed algorithm has the unique stationary distribution as (14) according to the discrete time Markov process theory [28].
Furthermore, according to Theorem 4, the global optimal solution to network utility is exactly the best pure strategy NE of the game. Suppose that
According to Theorem 4, the algorithm converges to a unique stationary distribution
Then the probability of globally optimal solution
This result means that, as the learning parameter increases, the proposed learning algorithm converges to the globally optimal solution to problem (4) with an arbitrarily high probability. Thus, the proof is completed. In other words, the proposed algorithm can achieve the same global optimum as a centralized exhaustive search, but with much lower complexity and in a distributed manner.
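The concentration effect behind this result can be illustrated with a short numerical sketch. It assumes that the stationary distribution has the usual Boltzmann (Gibbs) form, in which the probability of a strategy profile is proportional to exp(ε·Φ), where Φ is the potential of that profile. The potential values below are hypothetical.

```python
import math


def boltzmann(potentials, eps):
    """Probabilities proportional to exp(eps * potential), computed stably."""
    peak = max(potentials)
    weights = [math.exp(eps * (phi - peak)) for phi in potentials]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical potentials of three strategy profiles; the first is the optimum.
potentials = [-3.0, -3.5, -5.0]
for eps in (0.0, 1.0, 10.0, 50.0):
    print("eps =", eps, "P(optimum) =", boltzmann(potentials, eps)[0])
```

At ε = 0 all profiles are equally likely, and as ε grows the probability mass moves toward the profile with the largest potential.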
6. Numeric Results and Discussion
In this section, we evaluate the performance of the proposed EEUCL algorithm by Matlab simulations. Without loss of generality, the following parameters are set: the scale of the network is
Figure 2 shows that the proposed EEUCL algorithm converges to a stable state in about 310 learning iterations. The results are obtained by running 1000 independent experiments and taking the average. The total power consumption of the network is 53.2 mW when the APs are randomly selected at the beginning, and it falls to about 23.4 mW after the proposed algorithm converges. The best response (BR) algorithm [26] is used as a baseline for comparison. Figure 2 shows that the proposed EEUCL algorithm achieves better performance than the BR algorithm, at some cost in convergence speed. The reason is that the BR learning algorithm may converge to a local optimum, while the proposed algorithm can escape from local optima to reach the global optimum thanks to the probabilistic decision in Step 3 of the proposed algorithm (13).

Figure 2: The proposed EEUCL algorithm versus BR learning algorithm,
To further verify the proposed algorithm, we observe the power consumption of one sensor during learning, for example, sensor A in Figure 1. Figure 3 shows that the power consumption fluctuates during learning, because the strategies of other sensors change and influence it. Importantly, the observed sensor's power consumption converges to a stable value as the algorithm converges.

Figure 3: The power consumption of one sensor in the procedure of learning.
Figure 4 shows the convergence of the proposed algorithm from the AP perspective. We observe the number of sensors that select the 32 MHz AP during learning. The strategy changes of the sensors cause the fluctuation. Again, the number of sensors converges to a stable value.

Figure 4: The number of sensors which select the 32 MHz AP.
7. Conclusion
In this paper, we studied energy-efficient access point (AP) selection for cognitive sensors in the Internet of Things (IoT). We proposed the energy efficient oriented graphic (EEOG) game model and proved that it is an exact potential game. Then, to reach the global optimum of the proposed energy-efficient AP selection problem in a distributed manner, we designed the energy efficient uncorrelated concurrent learning (EEUCL) algorithm. We proved that the proposed learning algorithm achieves the optimal solution with an arbitrarily high probability. Simulation results verified the theoretical analysis and the performance of the proposed algorithm.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This research is supported by the Natural Science Foundation of China (no. 71571162), the National Key Technology R&D Program of China (Grant 2014BAH24F06), the Natural Science Foundation of Zhejiang Province (Grants LY14F020002 and LY15G010001), and scientific innovation of university.
