Abstract
As the use and scale of wireless sensor networks has grown, the lack of transmission spectrum has become a key factor affecting wireless sensor network quality of service. An effective solution is to collect unused spectrum from primary users (i.e. licensed users) and use that spectrum for wireless sensor networks. In this article, for a cognitive radio-based wireless sensor network, a spectrum sharing strategy between primary users and clusters is proposed. First, as a secondary user (i.e. an unlicensed user), the base station plays a Bertrand game with the monopolists (the primary users) to gain spectrum access on behalf of the clusters. Then, the base station allocates the spectrum band to the cluster heads using the ant colony optimization–based multiple knapsack problem. After obtaining the spectrum, each cluster head programs its nodes’ spectrum according to a timing sequence. Simulation results show that the Bertrand game model is a good choice for determining spectrum pricing and that the ant colony optimization–based multiple knapsack problem algorithm is an efficient way to allocate spectrum dynamically.
Keywords
Introduction
As wireless communication technology develops, wireless sensor network (WSN) applications are becoming increasingly widespread and application environments have become increasingly complicated; consequently, the existing spectrum sharing scheme fails to meet WSN spectrum demand in many cases. Using cognitive radio technology,1–4 taking advantage of the idle spectrum owned by primary users under the constraint that such use not affects primary users’ normal communications, secondary users can improve spectral efficiency. However, in WSNs, the spectrum sharing problem that determines how to allocate different spectra to many different nodes is a multiple-seller/multiple-buyer spectrum sharing model.
Assuming that all WSNs can be clustered, meeting WSN spectrum demand is equivalent to solving the spectrum demand in a clustered WSN. In this article, we take a centralized WSN as an example, where the base station acts as a middleman between the primary users and the clusters. In distributed WSNs, we can create a center that knows all the WSN information to act as the middleman. Representing all cluster heads, the base station plays the game with the primary users. Simultaneously, the primary users, who possess different channel qualities, compete on spectrum price. Although the Cournot Game, 5 Bertrand Game,6,7 Stackelberg Game, 8 Evolutionary Game, 9 and Auction 10 are available, in this study, the Bertrand Game was chosen because it is popularly applied to model price competition behavior in the oligopoly market and obtain the equilibrium price of each primary user. Thus, the base station buys bandwidth from each primary user depending on the price and the channel quality. Using the multiple knapsack problem (MKP), the base station allocates the purchased spectrum (with different channel qualities) to the clusters based on the size and urgency of each cluster’s requirements. Finally, the cluster head assigns bandwidth to nodes while considering the timing sequence. In this manner, we can achieve a spectrum pricing and dynamic allocation model. The major contributions of this article are as follows:
Most existing studies focus on multiple sellers selling spectrum to one buyer or one seller selling spectrum to multiple buyers.11,12 This article solves the problem of multiple sellers selling spectrum to multiple buyers in WSN using game theory and the MKP algorithm.
It improves the practicality of the spectrum sharing model by considering the importance and urgency of a node’s information and imposing a maximum waiting period for a node.
The remainder of this article is organized as follows. Section “Related work” reviews related works and analyzes the existing problems. The system model and some assumptions made in this article, including the spectrum pricing and dynamic allocation model, are described in section “System model and assumptions.” In section “Spectrum pricing game,” based on the game, the existence of a Nash equilibrium (NE) is proved and the NE is found. After purchasing spectrum, to address the algorithm that describes spectrum allocation, section “Spectrum allocation model” discusses its implementation process. Section “Performance evaluation” presents a quantitative performance analysis, and section “Conclusion and future work” summarizes the work and proposes future work.
Related work
Cognitive radio technology is an effective method for solving spectrum management problems in WSNs. Xing et al. 13 and Martin et al. 14 provided an overview of different spectrum access/sharing models, including the open sharing model, hierarchical access model and the dynamic exclusive use model. These studies also addressed detecting the unused spectrum of primary users.
Game theory, which can maximize the benefits accrued by both primary and secondary users, has been proved to be feasible for a cognitive radio network (CRN). 15 Based on whether the game participants choose to cooperate, games can be broadly divided into two types: cooperative and non-cooperative games. Non-cooperative games do not achieve overall optimal results for their participants; its players determine their strategies based on local rather than global information.16,17 Therefore, this study proposes using a cooperative game to achieve spectrum management, because cooperative games outperform non-cooperative games in terms of fairness and efficiency.18–20 Cooperative and non-cooperative games were compared in detail in Niyato and Hossain21,22 and Raoof and Al-Raweshidy 23 in terms of their application environments, network structure, advantages and disadvantages, and so forth.
Most of the studies mentioned above solved spectrum sharing problems in which multiple sellers sell spectrum to one buyer or one seller sells spectrum to multiple buyers. Niyato et al. 9 modeled the dynamics of competitive spectrum sharing/pricing among multiple primary users and multiple secondary users in a CRN and proposed a dynamic evolutionary game to analyze the evolutionary and dynamic behavior of secondary users, while a non-cooperative game was formulated to model the competition among primary users in terms of the size of the offered bandwidth and the spectrum price. However, it is this model used the same parameter settings for primary users, which is unrealistic, and secondary users’ spectrum requirements were all considered to be the same. A feedback learning scheme based on the multi-leader multi-follower Stackelberg game model in CRN 24 outperformed many existing schemes in terms of spectrum efficiency, system fairness, throughput, network revenue, and so on. This scheme applied to widely diverse network environment, but obtained only an individual optimal solution. Considering the heterogeneity in terms of primary users’ channel bandwidths and secondary users’ demands, a new combinatorial spectrum auction framework in CRN was proposed for the scenarios in which each primary user had multiple channels to sell and each secondary user demanded multiple channels. 10 In the proposed auction framework, the winner determination problem was formulated as multiple multidimensional knapsack problem (MMKP) and was solved by a polynomial-time approximation algorithm. Although the proposed auction algorithm can be applied to WSNs, whose total spectrum demand and number of secondary users are larger than the total bandwidth and the number of primary users, it lacks fairness for secondary users who are only weakly competitive and faces the dilemma that the bids of some secondary users always succeed, but the most of the other secondary users will be unable to communicate during any period.
System model and assumptions
We consider a system with M primary users, a base station and N clusters (Figure 1). The M primary users own different spectra (denoted by

System model for spectrum sharing.
Pricing model (Bertrand game)
The M primary users, without cooperating with each other, compete to rent their own spectrum to the base station by controlling their own band prices to maximize their own interests. In economics, this is called an oligopoly market. Each monopolist’s decision will change dynamically in response to the strategies of other monopolists. As a type of oligopoly market game, the Bertrand game is used to analyze the price competition behavior and equilibrium pricing strategy of each primary user. The game’s participants are the M primary users. The strategy of each player is to maximize the unit spectrum price (which is non-negative). The final benefit to the primary user is the accrued income from selling spectrum minus any loss caused by a decrease in the QoS. The spectrum demands of secondary users depend on the available channel quality of the
Spectrum allocation model (ant colony optimization–based multiple knapsack problem)
After the base station purchases spectrum, it must allocate the spectrum, which consists of different qualities and quantities, to the N cluster heads to meet spectrum demand. This is a MKP. N items are selectively placed in the M knapsacks in a manner that maximizes the value of the items in the knapsacks. Obviously, each knapsack contains a subset of the N items. This allocation is also an non-deterministic polynomial-time (NP)-hard problem. If the traversal algorithm is used to attempt all possible cases, the calculation complexity becomes an exponential function of M and N; thus, when M or N are relatively large, finding the optimal solution takes too long. The ACO algorithm, as a population-based stochastic meta-heuristic, is amenable to parallel implementation. Therefore, the ACO algorithm can more quickly find an approximately optimal solution of the MKP.25,26
Assumptions
The spectrum allocated to the cluster head is used only for communication between the cluster head and its nodes; inter-cluster communication uses a fixed spectrum. Therefore, the spectrum demand of each cluster is related only to the urgency coefficient and the number of nodes in the cluster and does not involve inter-cluster relationships. We assume that there are no spatial differences between the clusters.
The sensor nodes have the same hardware and adaptive modulation. Therefore, it can be assumed that each node has the same signal-to-noise ratio (SNR) and the same channel quality for the same primary user.
The required channel quality of each node is 9 dB by default. The channel quality of the primary users ranges from 9 to 22 dB.
This article considers only the situation in which the purchased bandwidth is less than the node spectrum demand (otherwise, spectrum allocation would not be required because the requirements of all the nodes could be satisfied.).
Spectrum pricing game
In this section, to find the solution of the spectrum pricing game, that is, the NE, we quantify the utility of the primary user and the base station according to the utility function used in Singh and Vives. 27
Utility of secondary user
In Singh and Vives, 27 the utility function of the secondary user, which consists of three parts (the revenue gained from using purchased spectrum to communicate, the cost caused by secondary users’ switch among the frequency spectra, and the payment for radio resource usage), is related to the owned bandwidth, channel quality, spectrum price, and spectrum substitutability
where
where
In equation (1), the utility function of the base station is not related to the spectrum demand of the WSN. For different networks, we would certainly hope that the purchased bandwidth changes based on network demands. Therefore, the utility function should consider the negative utility caused by the difference between spectrum supply and demand as follows
where
where
The purchased spectrum size at which the base station obtains the maximal profit can be calculated using
From equation (6), the conclusion can be drawn that
Profit function of primary users
The profit acquired by primary user
where
If
Bertrand game mode
The Bertrand game is used to model the process of spectrum pricing; an analysis is given below:
Players: M primary users;
Strategy: variable
Payoff:
In Namvar and Afghah,
17
the existence of a single game solution (i.e. the NE) has been proved; therefore, there is no need to restate it here. To find NE, the best response (BR) of primary user
where
For all
However, the
Similarly, using
Spectrum allocation model
M spectra with varied channel qualities and different widths are assigned to N clusters while ensuring that the cluster with the highest urgency is preferentially satisfied. First, the differences in channel quality are presented in the form of bandwidth; the equivalent bandwidth of each spectrum is calculated as follows
where
where
The MKP is formulated as follows
Then,
However, MKP is a typical NP-hard problem. When either the number of knapsacks or the number of items is relatively large, it is difficult to find the optimal MKP solution. Nevertheless, using an optimization algorithm, an approximately optimal solution is easy to obtain. In this article, we adopt the ACO algorithm, which has been shown to be efficient. Ants consider each cluster as an intersection that have (M + 1) directions; therefore, there are N intersections. Whenever ants pass through an intersection, they choose a path from among the M primary users and user 0 (where path 0 does not choose any primary user) based on a probability that is in a direct ratio to
When all ants have completed their journeys, the
where

ACO-MKP algorithm model.
Here,
Here,
After the spectrum has been allocated through the ACO-MKP algorithm, the clusters can use the spectrum for their own nodes based on the timing sequence. For the other nodes waiting to transmit, their probabilities of obtaining bandwidth during the next period are increased by controlling the
Performance evaluation
Parameter setting
We consider a network model with three primary users providing service for a WSN containing 15 clusters as secondary users (Figure 1). The total bandwidth available to each primary user is 30 MHz (i.e.
Numerical results
BR, NE, and the optimization process
For the Bertrand game with three primary users, the proof of its principle is similar to that of the game with two primary users. As shown in Figure 3, the BR of each primary user corresponding to other primary users is a complex three-dimensional (3D) surface rather than the simple two-dimensional (2D) curve resulting from the game model with two primary users. The three curved surfaces intersect at point (6.7, 5.7, 3.5); this is the NE and proves that the NE is unique.

Best response function and NE.
At the NE, the spectrum prices demanded by the primary users are P = {6.7, 5.7, 3.5} and the sold bandwidth amounts are b = {7.0, 5.9, 3.6}. Furthermore, the utility value of the WSN is U(b) = 57.0, and the utility values of the primary users are π(P) = {70.3, 58.7, 40.9}. Using global optimization, the spectrum price at NE changes to P* = {9.2, 7.8, 5.2}, while the sold bandwidth becomes b* = {5.4, 5.1, 3.6}. The utility values of the WSN and the primary users are U*(b) = 23.2 and π*(P) = {76.2, 66.3, 47.3}, respectively. Through cooperation, the primary users’ utility values have been significantly increased at the expense of the network’s utility value. In economics, three primary users form a market monopoly, controlling the market supply of spectrum and the spectrum price. Therefore, the network should pay more for the same bandwidth, leading the utility value U(b) to decrease from 57.0 to 23.2.
Total profit
As shown in Figure 4, When P2 and P3 are kept constant in P and P* while P1 gradually increases, the total profits corresponding to P and P* both increase as P1 increases; however, after the total profit reaches a certain peak value, the total profit will be reduced for a reduced spectrum demand leading the secondary users to choose a less-expensive primary user. In addition, when P1 reaches the NE, the total profit is the maximal value. The fact that the total profit obtained by P* is higher than that of P proves that optimizing the game does improve the total profit of the primary users.

Total profit when
ACO-MKP algorithm and genetic algorithm–based MKP algorithm during one period
When the purchased bandwidth of WSN is b = {5.4, 5.1, 3.6}, the ACO-MKP algorithm uses ant colony optimization (ACO) to heuristically find the optimal allocation scheme. By influencing the pheromone, the ACO-MKP algorithm can converge to an optimal scheme after repeated iterations. As a comparison, the genetic algorithm–based MKP (GA-MKP) in Unal 28 is taken into consideration.
Figure 5 shows that after 23 iterations, the ACO-MKP algorithm whose time and space complexity are O(N) and O(150 × N) has converged to the clustering scheme (list = [1, 1, 1, 3, 3, 2, 1, 0, 0, 2, 0, 2, 0, 1, 2]), whose value is 26. In other words, b1 supplies spectrum to clusters 1, 2, 3, 7 and 14 and b2 meets the spectrum demand of clusters 6, 10, 12, and 15, b3 is allocated to clusters 4 and 5, and the remaining clusters (8, 9, 11, and 13) must wait to communicate until additional spectrum is purchased and allocated. The remaining bandwidth is b_remain = {0.1, 0.2, 0}. However, to converge to an approximate iteration, the time and space complexity of GA-MKP algorithm are O(2 N) and O(500 × 2 N) (the length of each chromosome is 2N). It obtains list = [2, 1, 0, 3, 3, 1, 1, 2, 0, 2, 1, 0, 0, 1, 2], whose value is only 24.6, and the remaining bandwidth is b_remain = {0.4984, 0.1793, 0.0209}. Compared to the GA-MKP algorithm, the ACO-MKP algorithm uses less time and increases the value to 1.4, which is equivalent to providing bandwidth for an additional 14 nodes. The ACO-MKP algorithm, which outperforms GA-MKP, can improve the utilization rate of the purchased bandwidth.

ACO-MKP and GA-MKP algorithm during one period.
ACO-MKP algorithm and GA-MKP algorithm over 30 periods
We executed both algorithms for 30 periods under the premise of b = {5.4, 5.1, 3.6}; Figure 6 describes each cluster’s degree of spectrum demand satisfaction. When all clusters are sorted by vd in descending order, the satisfaction degree of the two algorithms shows a decreasing trend. This trend occurs because the clusters with higher vd values generally have greater urgency or are more valuable; clusters with lower vd values often do not have sufficient urgency to require immediate communications. Therefore, two algorithms, which both allocate spectrum dynamically, cause low-urgency clusters to fail to communicate over more periods.

ACO-MKP and GA-MKP algorithm over 30 periods.
For those clusters with high vd, the satisfaction degrees of the ACO-MKP algorithm are obviously higher than the GA-MKP algorithm, while for those cluster with low vd, the contrary happens. This is caused by the characteristic of each algorithm. The ACO finds the best solution by affect the pheromone, which not only iteratively screens out those optimal solutions, but focuses on those clusters with high vd. The GA iteratively optimizes the population and selects those better chromosomes to obtain the optimal solution, which cannot take the difference of each gene into consideration. It is worth mentioning that because the urgency coefficients of clusters 4 and 12 are 1, the satisfaction degrees of ACO-MKP algorithms is 1. However, the urgency coefficient cluster 9 and 11 is the lowest; therefore, the satisfaction degrees of the two algorithms are both relatively low.
Through the ACO-MKP algorithm, the σ values of the majority clusters are 1 or 2 (GA-MKP are 3), and when the urgency coefficient is 1, the σ of that cluster is 0 (that is, that cluster is guaranteed to be allowed to communicate in all periods). The maximal σ of all clusters is 3, which fulfills the application requirement. About GA-MKP algorithm, Figure 6 shows that it cannot guarantee the transmission to the cluster, whose urgency coefficient is 1. And its maximal σ of all clusters is higher (5 or higher, that depends). Therefore, ACO-MKP is a better way to solve the spectrum allocation in the WSN.
The remaining b
Figure 7 shows that the remaining b1, b2, and b3 in the ACO-MKP algorithm are generally lower than 0.2, which can be ignored, while in the GA-MKP algorithm, the remaining b3 is generally higher than 0.4, and in some cases, the maximum value reaches 1.3. It is inefficient when the remaining b of the GA-MKP algorithm is high enough to meet substantial node demand. Finally, the ACO-MKP algorithm allows the utilization rate of b to reach 0.9576, while for the GA-MKP algorithm, it reaches only 0.9012. This result also indicates that the ACO-MKP algorithm performs better in many aspects than does the GA-MKP algorithm. The proposed ACO-MKP algorithm can outperform GA-MKP algorithm in terms of the spectrum utilization rate.

The remaining b of the ACO-MKP and GA-MKP algorithm.
Conclusion and future work
Employing game theory and cognitive radio technology in a spectrum sharing model for WSNs finds a solution that simultaneously achieves the maximum utility for both primary and secondary users. Figure 3 confirms that this unique point of equilibrium indeed exists. Through global optimization, Figure 4 shows that the optimized NE is selected even as the spectrum price rises and the total utility of primary users increases, while Figures 6 and 7 show that—compared to the GA-MKP algorithm—the ACO-MKP algorithm increases the bandwidth utilization rate by 5.9%. Overall, this article achieves the multi-seller and multi-buyer spectrum sharing model in WSN by combining the Bertrand game with the ACO-MKP algorithm.
Even though this study realized the spectrum sharing model in WSN, the Bertrand game was simulated in a simplistic fashion. In addition, approaching the problem from the standpoint that the difference in channel quality is equivalent to bandwidth is controversial. Moreover, in this article, the urgency coefficient refers specifically to a static urgency coefficient, while in real-world situations, a dynamic urgency coefficient would be more applicable and will be considered in future work.
Footnotes
Handling Editor: Iyad Dayoub
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the project of National Natural Science Foundation of China (61571068) and innovative research projects of colleges and universities in Chongqing (12A19369).
