
Abstract

This paper presents a selective spectrum sensing and access strategy for a cognitive radio sensor network (CRSN) that maximizes the throughput of the secondary user (SU) system. An SU senses multiple channels simultaneously via wideband spectrum sensing. To maximize the throughput and reduce the sensing energy consumption, not all of the channels are sensed. The SU selects some channels for spectrum sensing and accesses these channels based on the sensing results. The unselected channels are accessed directly with low transmission power. A selection making algorithm based on partially observable Markov decision process (POMDP) theory is proposed, by which the SU determines which channels are selected for sensing, how long the sensing time is, and the transmission powers of the channels. An optimal policy and a myopic policy are proposed to solve the resulting POMDP problem. Moreover, an optimization problem is formulated to solve the synchronism problem among the selected channels. Numerical results show that the proposed selective spectrum sensing and access strategy effectively improves the system performance.

1. Introduction

Wireless sensor networks (WSNs) play a critical role in many research areas, such as machine-to-machine (M2M) networks, emergency communication, and smart homes [1–4]. WSNs have two defining characteristics: limited energy and scarce spectrum. The nodes of WSNs are generally powered by batteries, so energy efficiency is one of the design factors. In this paper, we focus on the spectrum scarcity problem of WSNs. With the dramatic increase in the number of wireless devices, the scarcity of available spectrum becomes more serious. Cognitive radio (CR), which has been introduced as a way to improve the efficiency of spectrum utilization, has become a research focus in recent years [1–27]. Combining the advantages of CR and WSNs, the cognitive radio sensor network (CRSN) has been studied [1–4, 6–8, 15]. In [1, 2], CR was used in smart grid communication networks, and a spectrum access strategy was proposed to find more available spectrum resources for data collection and transmission. In [3], CRSN was studied in order to improve the energy efficiency of vehicular ad hoc networks. In the CRSN, the wireless nodes are defined as secondary (unlicensed) users (SUs), while the owners of the spectrum are defined as primary (licensed) users (PUs). The key challenge is for the SUs to find available channels and adjust their transmission parameters (transmission power, transmission time, carrier frequency, etc.) to access these channels, while avoiding harmful interference to PUs.

Normally, the available spectrum opportunities are found in the time [4, 7, 8, 10–12, 15], space [14], and frequency [18] domains. In [10–12], SUs perform spectrum sensing to find access opportunities in the time domain: the SUs access the licensed channel when it is not used by PUs. In [4], channel sensing and switching were used for an SU to access a set of PU channels. In [7, 8], two spectrum sensing methods were studied in CRSN. In [15], both single-user and collaborative spectrum sensing schemes were proposed for cognitive sensor networks, using quantised sensing observations. In [13], the SUs accessed the licensed channel directly at any time, and the transmission power was limited to avoid unacceptable interference to the PU system; this exploits the spectrum access opportunity in the space domain. In order to find available spectrum in different channels, parallel sensing [19], cooperative sensing [17], and wideband spectrum sensing (WSS) were studied. In [16, 17, 20], cooperative sensing has been studied. The SUs sense other channels if the current channel is occupied, or detect a set of channels simultaneously [21–25], to find available spectrum in a wideband spectrum.

Unlike existing spectrum sensing and access research, which utilizes only the time domain or only the space domain, this paper proposes a spectrum sensing and access strategy for CRSN in which an SU system uses the available spectrum in both the time and space domains over multiple channels. WSS is used by the SU to identify the presence of PU signals. In the normal spectrum sensing scheme, an SU senses the channels one by one. In the WSS scheme, by contrast, multiple channels are detected simultaneously by an SU, and the sensing time durations of the channels are the same. After spectrum sensing, the SU accesses all of the channels with the mixed access strategy (MAS) [26, 27]. Under MAS, the SU accesses the channels with different powers based on the sensing results. When a channel is sensed as idle, the SU accesses it with a higher transmission power, so the available spectrum in the time domain is used. Otherwise, the transmission power is kept low enough to avoid unacceptable interference to PUs, so the available spectrum in the space domain is used. Thus, compared with other spectrum access strategies, the SU under MAS obtains greater throughput.

However, selecting all of the channels for WSS is not a suitable choice. There are three reasons why the sensing channels must be selected. First, sensing all of the channels requires a great deal of energy [9], which is not suitable for WSNs. Second, if the PU signals in some channels are weak, much more sensing time is needed to guarantee the protection of PUs; because WSS uses the same sensing time on every channel, the sensing overhead of the whole system is prolonged. Finally, when the idle probabilities of some channels are low, the SU obtains a larger average throughput by accessing these channels directly than by accessing them after sensing. Thus, in this paper, the SU selects some channels for spectrum sensing. For a channel that has been selected, the SU accesses it and adjusts its transmission power based on the sensing results. Otherwise, the SU accesses the channel directly via the underlay access strategy. Therefore, in our proposed system model, a tradeoff exists between achieving larger throughput and selecting appropriate sensing channels.

In the dynamic spectrum environment, the SU cannot obtain accurate states of the PU channels, because spectrum sensing is imperfect and not all of the channels are selected for sensing. In this paper, we propose a selection making algorithm using partially observable Markov decision process (POMDP) theory. Under the selection making algorithm, the SU determines which channels are selected for spectrum sensing, how long the sensing time is, and the transmission powers of the accessed channels. The objective of the selection making algorithm is to maximize the throughput of the SU system, while avoiding unacceptable interference to PUs. An optimal policy and a myopic policy are derived to solve the formulated POMDP problem. Moreover, we present an optimization algorithm to solve the synchronism problem among the selected channels. Extensive numerical examples are presented to demonstrate the merit of the proposed algorithms.

The contributions of this paper can be described as follows.

(i) A new selective sensing and access strategy of CRSN based on POMDP theory is proposed, in which, at the beginning of each slot, an SU selects some channels for wideband spectrum sensing and accesses all of the channels via the mixed access strategy.

(ii) An optimal policy and a myopic policy are proposed to solve the proposed POMDP problem.

(iii) We consider an optimization problem, in which the decision probability thresholds of the selected channels are jointly optimized, in order to ensure the synchronism among the selected channels.

The rest of this paper is organized as follows. In Section 2, we review the related work. System model is proposed and analyzed in Section 3. In Section 4, we discuss a selection making algorithm via the POMDP framework; an optimal policy and a myopic policy are proposed to solve the proposed POMDP problem. The advantage of the proposed algorithms is illustrated by numerical results in Section 5, and conclusions are drawn in Section 6.

2. Related Work

Wideband spectrum sensing has been discussed in [21–25]. In [21], the detection thresholds of the energy detectors in the channels were jointly optimized. In order to maximize the throughput of the SU, the sensing time and detection thresholds were jointly optimized in [22]. In [23, 24], both the sensing time and the transmission powers of the channels were jointly optimized. However, in these works, the SU selects all of the channels for spectrum sensing and accesses a channel only when the sensing result is idle. The selection of sensing channels has not been considered. Different from the previous works, in this paper, not all of the channels are selected for sensing. After spectrum sensing, the SU accesses the channels via the mixed access strategy. Whether the sensing result is idle or occupied, the SU accesses the channel, with a transmission power that depends on the result, and greater throughput is obtained. Moreover, whereas the previous works considered WSS in a single slot, we consider the selection of sensing channels and spectrum access over multiple slots, which makes the problem more complicated.

According to the time-varying character of the dynamic spectrum environment, POMDP is used to formulate the selective spectrum sensing and access problem. In [28, 29], two optimal opportunistic spectrum access MAC protocols were proposed. In [30], a well-known separation principle was proposed to transfer the solution of the POMDP problem from optimal policy to myopic policy. However, both are based on the condition that the SU can sense and access one channel in a slot. In our proposed scheme, the SU selects several channels for spectrum sensing and accesses all of the channels after sensing. The calculation of sensing time becomes complicated. In [31], an adaptive sensing scheduling scheme was proposed, based on POMDP theory. The study [32] is probably the most relevant paper, in which an optimal sensing channels' selection policy is proposed. However, it accesses the channel only when the sensing result is idle. In our work, the SU accesses all of the channels with different transmission powers. Although the previous works take the same mathematical method (POMDP), the problem in this paper becomes quite complicated, and some efficient methods are proposed to solve the problem.

3. System Model

In this section, we present the system model of this paper and the structure of wideband spectrum sensing. Then, a selective sensing and access strategy in multiple channels is proposed.

3.1. System Model

An SU system shares a licensed wideband spectrum assigned to PUs, which can be divided into N nonoverlapping narrowband channels. The channels operate in a time-slotted manner. The traffic of the PU system is modeled as a two-state ON-OFF process. Figure 1 shows the structure of the SU system. We assume the SU knows the slot duration and can keep synchronization with PUs [10, 11]. Denote $α_{n}$ as the probability that channel n transits from the ON to the OFF state and $β_{n}$ as the probability that channel n transits from the OFF to the ON state. We assume $α_{n}$ and $β_{n}$ are obtained from previous long-term measurements. $P_{n} (H_{0}) = α_{n} / (α_{n} + β_{n})$ and $P_{n} (H_{1}) = β_{n} / (α_{n} + β_{n})$, $n \in \{1, 2, \dots, N\}$, where $P_{n} (H_{0})$ and $P_{n} (H_{1})$ are the average idle and occupied probabilities. Let a $1 \times N$ vector $S (m)$ denote the state vector of the PU channels in slot m, $S (m) = (s_{1} (m), s_{2} (m), \dots, s_{N} (m))$, where m is the index of the slot, $m \in \{1, 2, \dots, M\}$, $s_{n} (m) = 1$ indicates that channel n is occupied in slot m, and $s_{n} (m) = 0$ indicates that the channel is idle. The state space of $S (m)$ is

$$w (m) = \{(w_{1}, w_{2}, \dots, w_{N}) \mid w_{n} \in \{0, 1\}\} . \tag{1}$$

Let $P_{w_{n} w_{n}^{'}}$ denote the state transition probability from state $w_{n}$ to state $w_{n}^{'}$ of channel n, $P_{w_{n} w_{n}^{'}} = \Pr (s_{n} (m + 1) = w_{n}^{'} \mid s_{n} (m) = w_{n})$, where $w_{n}, w_{n}^{'} \in \{0, 1\}$. For channel n, the state transition probabilities are $P_{01} = β_{n}$, $P_{00} = 1 - P_{01}$, $P_{10} = α_{n}$, and $P_{11} = 1 - P_{10}$.
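As an illustration of the ON-OFF model, the following minimal sketch computes the average idle and occupied probabilities from $α_{n}$ and $β_{n}$ and checks that repeated one-slot transitions converge to them. The function names and the numerical values of `alpha` and `beta` are hypothetical and chosen only for illustration.

```python
def stationary_probabilities(alpha, beta):
    """Return (P(H0), P(H1)) for the two-state ON-OFF chain.

    alpha: probability of ON -> OFF (occupied -> idle), i.e. P_10
    beta:  probability of OFF -> ON (idle -> occupied), i.e. P_01
    """
    total = alpha + beta
    return alpha / total, beta / total


def step(idle_prob, alpha, beta):
    """Propagate the idle probability of one channel by one slot."""
    p00 = 1.0 - beta   # idle stays idle
    p10 = alpha        # occupied becomes idle
    return idle_prob * p00 + (1.0 - idle_prob) * p10


# Hypothetical transition probabilities, chosen only for illustration.
alpha, beta = 0.2, 0.3
p_idle = 1.0                      # start from a channel known to be idle
for _ in range(200):
    p_idle = step(p_idle, alpha, beta)

p_h0, p_h1 = stationary_probabilities(alpha, beta)
print(round(p_h0, 6), round(p_idle, 6))
```

The iterated idle probability and the closed-form $P_{n} (H_{0})$ agree, which is why the stationary vector is a natural starting point when nothing has yet been observed on a channel.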

Figure 1: Structure of SU system.

At the beginning of each slot, the SU selects some channels for WSS and accesses them with MAS. On the other channels, the SU waits for the same duration, in order to stay synchronized with the selected channels, and then accesses them directly. The SU cannot transmit in one channel while performing spectrum sensing in another one. Denote $τ_{n} (m)$ as the sensing time duration of channel n in slot m. Under MAS, the SU first detects the channel states. When the sensing result is idle, the SU accesses the channel with power $P_{1 n}$. When the sensing result is occupied, the transmission power is $P_{2 n}$. Normally, we assume $P_{1 n} > P_{2 n}$. The values of $P_{1 n}$ and $P_{2 n}$ are obtained by power allocation optimization algorithms [27], which are not our focus in this paper. If the SU accesses the channel directly, the transmission power is $P_{2 n}$. After a successful transmission, the SU receiver sends an ACK to the SU transmitter. If the transmission is not successful, the SU receiver sends a NAK. The time taken by the acknowledgments is ignored in our proposed structure. The structure of the SU in one slot is shown in Figure 2.

Figure 2: Structure of SU in one slot.

3.2. Wideband Spectrum Sensing

In the WSS scheme, we assume a wideband spectrum is occupied by some PUs. The PUs operate in different spectrum bands, and the idle probabilities of the channels are not the same. WSS is used to identify the presence of PUs in specific channels. In WSS, the wideband is divided into N nonoverlapping channels. The SU receives data through a wideband RF antenna. Then, the received data is passed through a high-speed A/D converter [21–25]. The structure of WSS is shown in Figure 3. Because all channels are sampled together, the sensing time durations of the channels are the same.

Figure 3: Structure of wideband spectrum sensing.

3.3. Selective Sensing and Access Strategy

At the beginning of each slot, the SU determines (1) which channels are selected for spectrum sensing, (2) how long the sensing time is, and (3) which channels are accessed with power $P_{1 n}$ and which with power $P_{2 n}$. The design objective is to maximize the throughput of the SU system during a desired period of M slots, while keeping the interference to PUs under the predetermined thresholds. Let $Ω$ denote the set of the selected channels. For a channel n in slot m with $n \in Ω$, the SU selects it for spectrum sensing and accesses it based on the sensing results. Comparing the sensing result with the real state of the channel, four cases are considered. When the sensing result is idle and the channel is not occupied by PUs, the transmission with power $P_{1 n} (m)$ is successful; the transmission rate is $R_{00 n} (m)$. However, if the channel is actually occupied and the SU has not detected this, the transmission of the SU cannot be successful, because of the interference from PUs. When the sensing result is occupied, the transmission power is $P_{2 n} (m)$. The power $P_{2 n} (m)$ is limited to avoid unacceptable interference to PUs. In this case, whether the real state of the channel is idle or occupied, the transmission of the SU is successful. Therefore, the achievable rate of channel n in slot m is given by

$$\begin{aligned} r_{n} (m) ={}& P_{n} (H_{0}) (1 - P_{f, n} (τ_{n} (m))) R_{00 n} (m) \\ &+ P_{n} (H_{0}) P_{f, n} (τ_{n} (m)) R_{01 n} (m) \\ &+ P_{n} (H_{1}) P_{d, n} (τ_{n} (m)) R_{11 n} (m), \end{aligned} \tag{2}$$

where $R_{00 n} (m) = B \log_{2} (1 + (P_{1 n} (m) h_{s s n} / N_{0}))$, $R_{01 n} (m) = B \log_{2} (1 + (P_{2 n} (m) h_{s s n} / N_{0}))$, and $R_{11 n} (m) = B \log_{2} (1 + (P_{2 n} (m) h_{s s n} / (P_{p} h_{p s n} + N_{0})))$. $h_{s s n}$ and $h_{p s n}$ are the channel gains, B is the bandwidth of the channel, and $P_{p}$ is the power of PUs. For a channel that has not been selected, the SU treats it as occupied and accesses it directly with transmission power $P_{2 n} (m)$. The transmission rate is $r_{n}^{'} (m) = P_{n} (H_{0}) B \log_{2} (1 + (P_{2 n} (m) h_{s s n} / N_{0})) + P_{n} (H_{1}) B \log_{2} (1 + (P_{2 n} (m) h_{s s n} / (P_{p} h_{p s n} + N_{0})))$. To maximize the throughput of the SU system, the optimization problem is formulated as

$$\max_{Ω, τ (m)} \sum_{m = 1}^{M} \frac{T - τ_{n} (m)}{T} \left( \sum_{n \in Ω} r_{n} (m) + \sum_{n \notin Ω} r_{n}^{'} (m) \right) \tag{3}$$

subject to

$$P_{d, n} (τ_{n} (m)) \geq P_{d, t h}, \tag{4}$$

$$τ_{1} (m) = τ_{2} (m) = \dots = τ_{N} (m) = τ (m), \tag{5}$$

$$n \in \{1, 2, \dots, N\}, \quad m \in \{1, 2, \dots, M\} . \tag{6}$$

Constraint (4) is a sensing time constraint, in order to protect PUs. Constraint (5) guarantees the WSS is used for spectrum sensing. It is not easy to solve this problem directly. We present a selection making algorithm based on POMDP theory, to find the optimal and suboptimal solutions.
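To make the structure of (2) and (3) concrete, the following minimal sketch evaluates the objective for a single slot, given a set of selected channels. The false alarm and detection probabilities are passed in as plain numbers rather than computed from the sensing time, and all function names, the dictionary layout, and the values of `p_f`, `p_d`, and `tau` are hypothetical. The rates reuse the values quoted later in Section 5.

```python
import math


def rate(bandwidth, tx_power, gain_ss, noise, interference=0.0):
    """Shannon rate B * log2(1 + P * h_ss / (interference + N0))."""
    return bandwidth * math.log2(1.0 + tx_power * gain_ss / (interference + noise))


def sensed_rate(p_h0, p_f, p_d, r00, r01, r11):
    """Achievable rate of a sensed channel, following (2)."""
    p_h1 = 1.0 - p_h0
    return p_h0 * (1.0 - p_f) * r00 + p_h0 * p_f * r01 + p_h1 * p_d * r11


def direct_rate(p_h0, r01, r11):
    """Rate of an unselected channel accessed directly with power P_2n."""
    return p_h0 * r01 + (1.0 - p_h0) * r11


def slot_throughput(selected, channels, tau, slot_len):
    """One slot of objective (3): time-scaled sum over all channels."""
    total = 0.0
    for n, ch in enumerate(channels):
        if n in selected:
            total += sensed_rate(ch["p_h0"], ch["p_f"], ch["p_d"],
                                 ch["r00"], ch["r01"], ch["r11"])
        else:
            total += direct_rate(ch["p_h0"], ch["r01"], ch["r11"])
    return (slot_len - tau) / slot_len * total


# Two hypothetical channels; rates in Mbps, times in ms.
channels = [
    {"p_h0": 0.3, "p_f": 0.1, "p_d": 0.9, "r00": 0.06, "r01": 0.02, "r11": 0.02},
    {"p_h0": 0.8, "p_f": 0.1, "p_d": 0.9, "r00": 0.06, "r01": 0.02, "r11": 0.02},
]
print(slot_throughput({1}, channels, tau=1.0, slot_len=10.0))    # sense channel 1 only
print(slot_throughput(set(), channels, tau=0.0, slot_len=10.0))  # sense nothing
```

Comparing the two printed values shows the tradeoff described above: sensing costs a fraction of the slot on every channel, but it allows the higher power $P_{1 n}$ on channels that are likely to be idle.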

4. Selection Making Algorithm

In the proposed strategy, the SU cannot obtain the accurate state of each channel, because not all of the channels are selected for sensing and because sensing errors occur. At the beginning of each slot, decisions are made based on the previous actions and observations. This setting matches well with the POMDP framework [28–35]. Therefore, the selection making problem under the proposed strategy is formulated as a POMDP, which determines an optimal policy for the selection of sensing channels, the length of the sensing time, and the access decisions. Next, we describe the POMDP framework. An optimal policy and a myopic policy are then proposed to solve the POMDP problem.

4.1. POMDP Framework

4.1.1. Actions

At the beginning of slot m, the actions of SU have three stages: determine which channels to sense, the size of sensing time, and the transmission power. Let $A (m)$ denote the SU action in slot m

$$A (m) = [A_{1} (m), τ (m), A_{2} (m)], \tag{7}$$

where $A_{1} (m)$ denotes which channels are selected for spectrum sensing, $A_{1} (m) = [a_{11} (m), a_{12} (m), \dots, a_{1 N} (m)]$, $a_{1 n} (m) \in \{0, 1\}$. When channel n is selected for sensing, $a_{1 n} (m) = 1$. Otherwise, $a_{1 n} (m) = 0$. $τ (m)$ denotes the sensing time duration, $τ (m) = [τ_{1} (m), τ_{2} (m), \dots, τ_{N} (m)]$, and $τ_{1} (m) = τ_{2} (m) = \dots = τ_{N} (m)$. The WSS is used to detect the presence of PU signals in the channels. The sensing time durations of the selected channels are the same. If none of the channels are selected, the sensing time durations are zero. $A_{2} (m)$ denotes which channels are accessed with power $P_{1 n} (m)$ and which with power $P_{2 n} (m)$, $A_{2} (m) = [a_{21} (m), a_{22} (m), \dots, a_{2 N} (m)]$, $a_{2 n} (m) \in \{0, 1\}$. $a_{2 n} (m) = 1$ indicates that channel n is accessed with power $P_{1 n} (m)$, and $a_{2 n} (m) = 0$ indicates that it is accessed with power $P_{2 n} (m)$. If channel n has not been selected for sensing, the transmission power is $P_{2 n} (m)$, and $a_{2 n} (m) = 0$.

4.1.2. Observations

Let $θ (m)$ denote the channel observation vector in slot m, $θ (m) = [θ_{1} (m), θ_{2} (m), \dots, θ_{N} (m)]$. In the proposed scheme, the observation of channel n in slot m has four possible values, $θ_{n} (m) \in \{0, 1, 2, 3\}$.

(i) If channel n has been selected for sensing and the sensing result is idle, the SU accesses it with power $P_{1 n} (m)$. After transmission, the SU transmitter receives ACK1, which indicates that the sensing result is correct. We denote this as observation 0, $θ_{n} (m) = 0$.

(ii) However, if the SU transmitter receives a NAK, the sensing result is wrong; the transmission of the SU causes unacceptable interference to the PU. We denote this as observation 1, $θ_{n} (m) = 1$.

(iii) If the sensing result of channel n is occupied, the SU accesses it with power $P_{2 n} (m)$. The SU transmitter receives ACK2 after transmission. We denote this as observation 2, $θ_{n} (m) = 2$.

(iv) If the SU accesses channel n directly with power $P_{2 n} (m)$, it receives ACK2 after transmission. We denote this as observation 3, $θ_{n} (m) = 3$.

The difference between observations 2 and 3 is that the SU has not sensed the channel in observation 3. Although the announcing signals are the same, the transmitter of SU can distinguish different observations.

4.1.3. Belief Vector

In the POMDP formulation, belief vector is used to infer the channel states at the beginning of each slot. It is a conditional probability for the past history, including the past decisions and observations. At the end of each slot, the belief vector is updated based on different actions and the corresponding observations, in order to obtain accurate information of the dynamic environment. The belief vector of channel n in slot m is denoted as $b_{n} (m)$ , $b_{n} (m) = [b_{n, 0} (m), b_{n, 1} (m)]$ , $b_{n, 0} (m) + b_{n, 1} (m) = 1$ . $b_{n, 0} (m)$ is a conditional idle probability. Consider the following:

$$b_{n, 0} (m + 1) = \Pr (s_{n} (m + 1) = 0 \mid b_{n, 0} (m), A_{n} (m), θ_{n} (m)), \tag{8}$$

where $A_{n} (m) = [A_{1 n} (m), τ_{n} (m), A_{2 n} (m)]$ is the action of the SU in channel n. Let $b_{n} (0)$ denote the initial belief vector; it is equal to the stationary probability vector, $b_{n, 0} (0) = α_{n} / (α_{n} + β_{n})$, $b_{n, 1} (0) = β_{n} / (α_{n} + β_{n})$. At the end of slot m, the SU transmitter receives one of four different observations; the belief vectors $b_{n, 0} (m)$ and $b_{n, 1} (m)$ are updated differently in each case. $θ_{n} (m) = 0$ means the real state of channel n in slot m is idle. The belief vector in slot $m + 1$ is $b_{n, 0} (m + 1) = P_{00}$. $θ_{n} (m) = 1$ means the real state is occupied, so $b_{n, 0} (m + 1) = P_{10}$. When $θ_{n} (m) = 2$, the SU senses channel n and the sensing result is occupied, but the SU cannot obtain the real state of that channel. From the Bayes rule, the belief vector update formula is given by the following equation:

$$\begin{aligned} b_{n, 0} (m + 1) &= \frac{\sum_{w_{n} = 0}^{1} b_{n, w_{n}} (m) P_{w_{n} 0} \Pr (θ_{n} (m) \mid A_{n} (m), s_{n} (m + 1) = 0)}{\sum_{w_{n}^{'} = 0}^{1} \sum_{w_{n} = 0}^{1} b_{n, w_{n}} (m) P_{w_{n} w_{n}^{'}} \Pr (θ_{n} (m) \mid A_{n} (m), s_{n} (m + 1) = w_{n}^{'})} \\ &= \frac{(b_{n, 0} (m) P_{00} + b_{n, 1} (m) P_{10}) P_{f, n}}{(b_{n, 0} (m) P_{00} + b_{n, 1} (m) P_{10}) P_{f, n} + (b_{n, 0} (m) P_{01} + b_{n, 1} (m) P_{11}) P_{d, n}} . \end{aligned} \tag{9}$$

Finally, when $θ_{n} (m) = 3$ , the belief vector is updated as

$$b_{n, 0} (m + 1) = b_{n, 0} (m) P_{00} + b_{n, 1} (m) P_{10} . \tag{10}$$
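The four update rules can be collected into one function. The following is a minimal sketch that transcribes the cases above, including (9) and (10), for a single channel; the function name and argument layout are hypothetical, and the numerical values in the usage line are chosen only for illustration.

```python
def update_idle_belief(b0, theta, alpha, beta, p_f=None, p_d=None):
    """Return b_{n,0}(m+1) from b_{n,0}(m) and the observation theta.

    alpha = P_10 (occupied -> idle), beta = P_01 (idle -> occupied).
    p_f and p_d are only needed when theta == 2.
    """
    p01, p10 = beta, alpha
    p00, p11 = 1.0 - p01, 1.0 - p10
    b1 = 1.0 - b0
    if theta == 0:      # ACK after high-power access: channel was idle
        return p00
    if theta == 1:      # NAK after high-power access: channel was occupied
        return p10
    if theta == 2:      # sensed as occupied: Bayes update, equation (9)
        idle_term = (b0 * p00 + b1 * p10) * p_f
        busy_term = (b0 * p01 + b1 * p11) * p_d
        return idle_term / (idle_term + busy_term)
    if theta == 3:      # not sensed: Markov prediction only, equation (10)
        return b0 * p00 + b1 * p10
    raise ValueError("theta must be 0, 1, 2 or 3")


# Hypothetical values: a "sensed occupied" result lowers the idle belief.
print(update_idle_belief(0.4, 2, alpha=0.2, beta=0.3, p_f=0.1, p_d=0.9))
```

Observations 0 and 1 reset the belief to a row of the transition matrix because the acknowledgment reveals the true state, whereas observations 2 and 3 leave residual uncertainty.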

4.1.4. Reward Function

Denote $R_{n} (m)$ as the immediate reward of channel n in slot m; the immediate reward of the SU system is $R (m) = \sum_{n = 1}^{N} R_{n} (m)$ . It is associated with actions and observations. Denote ${\tilde{R}}_{n} (m)$ as the expected reward of channel n. When the channel n has not been selected for spectrum sensing, we have

$${\tilde{R}}_{n} (m) = R_{n} (m) = r_{n}^{'} (m) . \tag{11}$$

Otherwise, the expected reward of the SU system $\tilde{R} (m)$ is calculated as

$$\begin{aligned} \tilde{R} (m) &= \sum_{n = 1}^{N} {\tilde{R}}_{n} (m) \\ &= \sum_{n = 1}^{N} \sum_{z = 0}^{2} \Pr (θ_{n} (m) = z \mid b_{n} (m - 1), A_{n} (m), s_{n} (m)) R_{n} (m), \end{aligned} \tag{12}$$

where $z \in \{0, 1, 2\}$ denotes the observation value space. $\Pr (θ_{n} (m) = z \mid b_{n} (m - 1), A_{n} (m), s_{n} (m))$ is the probability of the observation; it is associated with the belief vector in the last slot, the actions, and the real state in the current slot. The expression of $\tilde{R} (m)$ is shown as follows:

$$\tilde{R} (m) = \begin{cases} \dfrac{T - τ_{n} (m)}{T} R_{00 n} (b_{n, 0} (m - 1) P_{00} + b_{n, 1} (m - 1) P_{10}) (1 - P_{f, n}), & \text{if } θ_{n} (m) = 0, \\ 0, & \text{if } θ_{n} (m) = 1, \\ \dfrac{T - τ_{n} (m)}{T} R_{01 n} (b_{n, 0} (m - 1) P_{00} + b_{n, 1} (m - 1) P_{10}) P_{f, n} \\ \quad + \dfrac{T - τ_{n} (m)}{T} R_{11 n} (b_{n, 0} (m - 1) P_{01} + b_{n, 1} (m - 1) P_{11}) P_{d, n}, & \text{if } θ_{n} (m) = 2 . \end{cases} \tag{13}$$
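As a minimal sketch, the expected reward of one sensed channel can be evaluated by reading each case of (13) as the probability-weighted contribution of the corresponding observation and summing the three cases. That reading, the function name, and the numerical values are assumptions of this example.

```python
def expected_sensed_reward(b0_prev, alpha, beta, p_f, p_d, tau, slot_len,
                           r00, r01, r11):
    """Sum the three cases of (13) for one sensed channel.

    b0_prev is b_{n,0}(m-1); alpha = P_10 and beta = P_01.
    """
    p01, p10 = beta, alpha
    p00, p11 = 1.0 - p01, 1.0 - p10
    b1_prev = 1.0 - b0_prev
    idle = b0_prev * p00 + b1_prev * p10    # predicted idle probability in slot m
    busy = b0_prev * p01 + b1_prev * p11    # predicted occupied probability
    scale = (slot_len - tau) / slot_len     # fraction of the slot left for data
    case0 = scale * r00 * idle * (1.0 - p_f)                  # theta = 0
    case1 = 0.0                                               # theta = 1
    case2 = scale * (r01 * idle * p_f + r11 * busy * p_d)     # theta = 2
    return case0 + case1 + case2


# Hypothetical values; rates in Mbps and times in ms.
print(expected_sensed_reward(0.4, 0.2, 0.3, 0.1, 0.9, 1.0, 10.0, 0.06, 0.02, 0.02))
```

With these numbers the sensed channel is worth more than the 0.02 Mbps obtained by direct access at power $P_{2 n}$, so sensing pays for its time overhead.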

Based on the above discussion and analysis, the procedure in the POMDP framework is shown as Figure 4.

Figure 4: The procedure in the POMDP framework.

4.2. Solution to POMDP

In the proposed scheme, the design objective is to develop an optimal selective sensing and access policy in each slot, in order to maximize the expected total reward obtained in the finite M slots. The complete problem formulation based on POMDP is given by

$$\max_{A (m)} \sum_{n = 1}^{N} \sum_{m = 1}^{M} R_{n} (m) \tag{14}$$

subject to

$$P_{d, n} (τ_{n} (m)) \geq P_{d, t h}, \quad n \in \{1, 2, \dots, N\}, \quad m \in \{1, 2, \dots, M\} . \tag{15}$$

This is a constrained POMDP problem, which requires an intractable randomized policy to achieve optimality. However, the objective function can be separated from the constraint if the SU trusts the current sensing result and accesses the channel based on the sensing result in the current slot [28, 30]. The sensing time is obtained from $P_{d, n} (τ_{n} (m)) = P_{d, t h}$, $n \in \{1, 2, \dots, N\}$. The problem then becomes an unconstrained POMDP problem.
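The sensing time implied by $P_{d, n} (τ_{n} (m)) = P_{d, t h}$ depends on the detector model, which is not stated in this section. As a minimal sketch, assume an energy detector with the commonly used Gaussian approximation $P_{d} (τ) = Q ((ɛ - γ - 1) \sqrt{τ f_{s} / (2 γ + 1)})$, where the threshold $ɛ$ is normalized by the noise power and $γ$ is the linear SNR. This model, the function names, and the dB-to-linear conversion are assumptions of the example; the parameter values are those quoted in Section 5.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 1.0 - _STD_NORMAL.cdf(x)


def q_inv(p):
    """Inverse of Q: the x for which Q(x) == p."""
    return _STD_NORMAL.inv_cdf(1.0 - p)


def detection_prob(tau, snr, eps, fs):
    """Assumed energy-detector P_d for sensing time tau (seconds)."""
    return q_func((eps - snr - 1.0) * math.sqrt(tau * fs / (2.0 * snr + 1.0)))


def sensing_time(p_d_target, snr, eps, fs):
    """Solve detection_prob(tau) == p_d_target for tau under the assumed model."""
    margin = eps - snr - 1.0
    z = q_inv(p_d_target)
    if margin == 0.0 or z / margin < 0.0:
        raise ValueError("target detection probability is unreachable")
    return (2.0 * snr + 1.0) / fs * (z / margin) ** 2


snr = 10 ** (2.1 / 10)   # 2.1 dB converted to a linear ratio (assumed convention)
tau = sensing_time(0.9, snr, eps=1.5, fs=2e6)
print(tau, detection_prob(tau, snr, eps=1.5, fs=2e6))
```

Under this assumed model a channel with lower SNR needs a longer sensing time to reach the same detection probability, which is the effect that makes the per-channel sensing times differ.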

After obtaining the sensing time, the problem becomes simpler. Two questions remain: which channels are selected for spectrum sensing, and which transmission powers are used for transmission. The narrowband channels are independent of each other, so the action of the SU system can be decomposed into a combination of per-channel actions. We can then calculate the optimal actions of each channel independently.

4.2.1. Optimal Policy

In order to calculate the optimal policy of channel n effectively, a value function $J_{m} (b_{n} (m - 1))$ is proposed, which denotes the maximum expected remaining reward accumulated from slot m to the frame horizon slot M, when the current belief vector is $b_{n} (m - 1)$ . Using Bellman equation, we have

$$\begin{aligned} J_{m} (b_{n} (m - 1)) = \max_{A_{n} (m)} \Big( & {\tilde{R}}_{n} (m) + \sum_{z = 0}^{3} \Pr (θ_{n} (m) = z \mid b_{n} (m - 1), A_{n} (m), s_{n} (m)) \\ & \times J_{m + 1} (Γ (b_{n} (m - 1), A_{n} (m), θ_{n} (m) = z)) \Big), \end{aligned} \tag{16}$$

where

$$b_{n} (m) = Γ (b_{n} (m - 1), A_{n} (m), θ_{n} (m) = z) . \tag{17}$$

It represents the updated knowledge of channel state based on the actions and observations of SU in slot m. ${\tilde{R}}_{n} (m)$ is given by (13). The value function contains two parts: the immediate expected reward obtained in slot m and the maximum expected future reward. The optimal policy is obtained via a fast point-based solution method [35].
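For a short horizon, the recursion in (16) can be evaluated directly for one channel. The following minimal sketch uses plain memoized recursion over the reachable beliefs, not the point-based solver of [35]. It assumes a fixed sensing time, fixed $P_{f, n}$ and $P_{d, n}$, two per-channel actions ("sense" or "direct"), and a direct-access reward computed from the predicted idle probability instead of the stationary $P_{n} (H_{0})$; these simplifications and all names are assumptions of the example.

```python
from functools import lru_cache


def solve_channel(horizon, alpha, beta, p_f, p_d, tau, slot_len, r00, r01, r11):
    """Return value(m, b0) -> (expected remaining reward, best action)."""
    p01, p10 = beta, alpha
    p00 = 1.0 - p01
    scale = (slot_len - tau) / slot_len

    @lru_cache(maxsize=None)
    def value(m, b0):
        if m > horizon:
            return 0.0, None
        idle = b0 * p00 + (1.0 - b0) * p10   # predicted idle probability
        busy = 1.0 - idle
        # Action "direct": no sensing, power P_2n, observation 3, update (10).
        direct = idle * r01 + busy * r11 + value(m + 1, round(idle, 12))[0]
        # Action "sense": observations 0, 1, 2 and their probabilities.
        pr0 = idle * (1.0 - p_f)             # sensed idle, ACK
        pr1 = busy * (1.0 - p_d)             # sensed idle, NAK
        pr2 = idle * p_f + busy * p_d        # sensed occupied
        reward = scale * (r00 * pr0 + r01 * idle * p_f + r11 * busy * p_d)
        b_after2 = idle * p_f / pr2 if pr2 > 0.0 else idle   # update (9)
        sense = (reward
                 + pr0 * value(m + 1, round(p00, 12))[0]
                 + pr1 * value(m + 1, round(p10, 12))[0]
                 + pr2 * value(m + 1, round(b_after2, 12))[0])
        return (sense, "sense") if sense > direct else (direct, "direct")

    return value


# Hypothetical parameters; rates in Mbps, times in ms.
value = solve_channel(horizon=3, alpha=0.2, beta=0.3, p_f=0.1, p_d=0.9,
                      tau=1.0, slot_len=10.0, r00=0.06, r01=0.02, r11=0.02)
print(value(1, 0.4))
```

Beliefs are rounded before caching so that numerically identical states are reused. The number of reachable beliefs still grows with the horizon, which is one reason approximate solvers are preferred for long horizons.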

4.2.2. Myopic Policy

The solution of optimal policy leads to great computational complexity, especially when the number of channels is large. In order to address this problem, a myopic policy is proposed, in which the SU maximizes the immediate expected reward in the current slot m. The myopic policy solution is given by

$$J_{m} (b_{n} (m - 1)) = \max_{A_{n} (m)} {\tilde{R}}_{n} (m) . \tag{18}$$

Generally, the myopic policy balances the computational complexity and the optimality of solution. Dynamic programming can be used to find the solution.
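As an illustration of (18), the following minimal sketch builds the selection vector $A_{1} (m)$ for one slot by comparing, channel by channel, the immediate expected reward of sensing with that of direct access. It takes the predicted idle probabilities as input and assumes common values of $P_{f, n}$, $P_{d, n}$, and the sensing time for all channels; these simplifications, the function name, and the numbers are hypothetical.

```python
def myopic_selection(idle_probs, p_f, p_d, tau, slot_len, r00, r01, r11):
    """Return A_1(m): 1 where sensing beats direct access in the current slot."""
    scale = (slot_len - tau) / slot_len
    selection = []
    for idle in idle_probs:
        busy = 1.0 - idle
        # Immediate expected reward when the channel is sensed first.
        sense = scale * (r00 * idle * (1.0 - p_f)
                         + r01 * idle * p_f
                         + r11 * busy * p_d)
        # Immediate expected reward of direct access with power P_2n.
        direct = r01 * idle + r11 * busy
        selection.append(1 if sense > direct else 0)
    return selection


# Hypothetical predicted idle probabilities for three channels.
print(myopic_selection([0.05, 0.3, 0.8], p_f=0.1, p_d=0.9, tau=1.0,
                       slot_len=10.0, r00=0.06, r01=0.02, r11=0.02))
```

A channel that is almost always occupied is left unsensed, because the time spent sensing it is rarely repaid by a high-power transmission.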

4.2.3. Synchronism among Channels

In the proposed solutions, we calculate an optimal policy and a myopic policy for each channel independently, instead of calculating them for all of the channels at the same time. Each channel operates at the point $P_{d, n} (τ_{n} (m)) = P_{d, t h}$, from which its sensing time $τ_{n} (m)$ is obtained. The resulting sensing times may differ from channel to channel. However, in this paper, the SU senses the selected channels with WSS, so the sensing time durations of the selected channels should be the same. Therefore, a question arises: how can the synchronism among the selected channels be ensured?

Denote $P_{d, t h}^{n}$ as the detection probability threshold of the selected channel n. In order to maximize the current reward of the slot and ensure the synchronism of the selected channels, we adjust the $P_{d, t h}^{n}$ according to the difference between the selected channels. The sensing time of the selected channel n is obtained from

$$P_{d, n} (τ_{n} (m)) = P_{d, t h}^{n} . \tag{19}$$

Then, $P_{d, t h}^{n}$ is formulated as a function of the sensing time, $P_{d, t h}^{n} (τ (m))$. The optimization problem is formulated as

$$\max_{τ (m)} \sum_{n \in Ω} {\tilde{R}}_{n} (m) \tag{20}$$

subject to

$$P_{d, t h}^{n} (τ (m)) \geq P_{d, t h}, \tag{21}$$

$$τ_{n} (m) = τ (m), \quad n \in Ω, \quad m \in \{1, 2, \dots, M\} . \tag{22}$$

The detection probability threshold of each channel is adjusted based on the sensing time duration. $P_{d, t h}^{n} (τ (m))$ should be no smaller than the predefined threshold $P_{d, t h}$, in order to ensure the protection of PUs. Constraint (21) is converted to an interval of the sensing time $τ (m)$, based on the convexity of the function $P_{d, t h}^{n} (τ (m))$. A one-dimensional exhaustive search is used to solve this problem. The sensing time in slot m is thus optimized to balance maximizing the immediate expected reward against keeping synchronism among the selected channels.
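The one-dimensional search can be sketched as a grid scan over a common sensing time. The example below assumes a fixed-threshold energy detector with the commonly used Gaussian approximations for $P_{f}$ and $P_{d}$ (threshold normalized by the noise power, linear SNR), and it takes the predicted idle probability of each selected channel as given. The detector model, the grid resolution, and all names are assumptions of this sketch, not details taken from the paper.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 1.0 - _STD_NORMAL.cdf(x)


def detector_probs(tau, snr, eps, fs):
    """Assumed energy-detector model: return (P_f, P_d) for sensing time tau."""
    samples = tau * fs
    p_f = q_func((eps - 1.0) * math.sqrt(samples))
    p_d = q_func((eps - snr - 1.0) * math.sqrt(samples / (2.0 * snr + 1.0)))
    return p_f, p_d


def selected_reward(tau, channels, slot_len, eps, fs, p_d_th, r00, r01, r11):
    """Objective (20) at a common tau, or None if (21) fails on any channel."""
    total = 0.0
    for idle, snr in channels:
        p_f, p_d = detector_probs(tau, snr, eps, fs)
        if p_d < p_d_th:
            return None
        total += (r00 * idle * (1.0 - p_f) + r01 * idle * p_f
                  + r11 * (1.0 - idle) * p_d)
    return (slot_len - tau) / slot_len * total


def best_common_tau(channels, slot_len, steps=1000, **params):
    """Scan a uniform grid of sensing times and keep the best feasible one."""
    best = (None, float("-inf"))
    for k in range(1, steps):
        tau = slot_len * k / steps
        reward = selected_reward(tau, channels, slot_len, **params)
        if reward is not None and reward > best[1]:
            best = (tau, reward)
    return best


# (predicted idle probability, linear SNR) of two hypothetical selected channels.
channels = [(0.3, 10 ** 0.21), (0.5, 10 ** 0.23)]
params = dict(eps=1.5, fs=2e6, p_d_th=0.9, r00=0.06, r01=0.02, r11=0.02)
print(best_common_tau(channels, slot_len=0.01, **params))
```

Any sensing time at which one selected channel misses the detection threshold is discarded, so the weakest selected channel sets the lower end of the feasible interval.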

5. Numerical Results

In this section, the proposed optimal selective sensing and access policy is compared with the myopic policy and a random policy under different simulation conditions. In the random policy, the SU selects the channels for spectrum sensing randomly, and the access actions are also selected randomly. The sensing time durations of the three policies are the same and are obtained from the proposed optimization problem (20). The slot size of PUs is fixed and equal to the sensing period of the SU system, $T = 10$ ms. The channel power gains are ergodic and stationary. For the sake of simplicity, we assume that adaptive modulation is not used [33, 34]. When the transmission power is $P_{1 n}$, the transmission rate is $R_{00 n} (m) = 0.06$ Mbps. When the power is $P_{2 n}$, the transmission rates are $R_{01 n} (m) = R_{11 n} (m) = 0.02$ Mbps. The bandwidth of each narrowband channel is $B = 1$ MHz and the sampling frequency is $f_{s} = 2$ MHz. The decision threshold is $ɛ_{n} = 1.5$, the predefined detection probability is $P_{d, t h} = 0.9$, and the number of slots is $M = 30$.

Figure 5 shows the SU's aggregate throughput for different total numbers of channels N. We consider two cases, $N = 6$ and $N = 2$, where N denotes the number of subchannels. The idle probability $P_{n} (H_{0})$ and the signal-to-noise ratio (SNR) $γ_{n}$ are shown in Table 1. The SU performs better under the optimal policy than under the other policies. The reason is that the SNR $γ_{n}$ affects the sensing time of each channel. When the number of channels is large, the channel with the smallest SNR strongly affects the common sensing time duration. Under the optimal policy, the SU obtains greater throughput gains by selecting only some of the channels for spectrum sensing, rather than all of them. Compared with the myopic policy, the selection is more accurate. The SU under the random policy may select channels with lower SNR, which lengthens the sensing time and reduces the throughput of the SU system. When the number of channels is large, the SU under the optimal policy obtains greater throughput than under the other policies.

Table 1

Symbol	$N = 6$	$N = 2$	Definition
$P_{n} (H_{0})$	$0.3 + 0.1 (n - 1)$	$0.3 + 0.4 (n - 1)$	Idle probability
$γ_{n}$	$2.1 + 0.1 (n - 1)$	$2.1 + 0.2 (n - 1)$	SNR (dB)

Figure 5

SU's throughput performance comparison for different total numbers of channels N.

In Figure 6, we study the SU's aggregate throughput for different spacings between the idle probabilities of two adjacent PU channels. The idle probability of channel n is $P_{n} (H_{0}) = 0.3 + (n - 1) (0.1 + 0.05 (λ - 1))$ . For $λ = 1, 2, 3$ , the idle probabilities of the channels are $\{0.3, 0.4, 0.5, 0.6\}$ , $\{0.3, 0.45, 0.6, 0.75\}$ , and $\{0.3, 0.5, 0.7, 0.9\}$ , respectively. The SU under the optimal policy obtains greater throughput than under the other policies in all three cases. This is because the idle probabilities of the channels affect the update of the belief vector. Under the optimal policy, the SU selects suitable sensing channels and access actions.
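The idle-probability sets for each value of λ follow directly from the formula, as this short sketch shows. The function name and the default of four channels are assumptions; the channel count is inferred from the four values listed per case.

```python
def idle_probabilities(lam, n_channels=4):
    """Idle probability of channels 1..n_channels for a given spacing parameter."""
    step = 0.1 + 0.05 * (lam - 1)  # spacing between adjacent channels
    # Round to two decimals to hide floating-point noise in the printed values.
    return [round(0.3 + (n - 1) * step, 2) for n in range(1, n_channels + 1)]


for lam in (1, 2, 3):
    print(lam, idle_probabilities(lam))
```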

Figure 6

SU's throughput performance comparison under different idle probability spacings λ between two adjacent channels.

Figure 7 compares the aggregate throughput under the optimal sensing time and under a fixed sensing time, both under the myopic policy. In the fixed sensing time case, the SU senses the channels in each slot with $τ_{n} (m) = 2$ ms, $n \in \{1, 2, 3, 4\}$ , $m \in \{1, 2, \dots, 30\}$ . The SU with the optimal sensing time obtains greater throughput. The reason is that the sensing time in each slot is optimized to balance the tradeoff between maximizing the throughput and sensing efficiency.
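As a rough worked example of why the sensing time matters, assume that the remainder of each slot after sensing is used entirely for transmission. This is a simplification for illustration, not the full reward function. With $T = 10$ ms, a fixed sensing time of 2 ms, and an idle channel accessed with power $P_{1 n}$ , the achievable rate is

$$\frac{T - τ_{n} (m)}{T} R_{00 n} (m) = \frac{10 - 2}{10} \times 0.06 \text{ Mbps} = 0.048 \text{ Mbps}$$

A shorter sensing time raises this fraction but leaves fewer samples for detection, which is the tradeoff that the optimized sensing time balances.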

Figure 7

SU's throughput performance comparison under different sensing time durations.

Figure 8 compares the aggregate throughput under different access strategies, all under the optimal policy. The number of channels is four, and the idle probability of channel n is $P_{n} (H_{0}) = 0.3 + 0.2 (n - 1)$ ; the SNR is $γ_{n} = 2.1 + 0.2 (n - 1)$ , $n \in \{1, 2, 3, 4\}$ . The SU with the mixed access strategy obtains larger throughput than with the other strategies. In the underlay access strategy, the SU accesses the channels directly without sensing, so the sensing time is saved; however, the transmission power is fixed at $P_{2 n}$ . In the overlay access strategy, the SU accesses a channel only when the sensing result is idle. In the mixed access strategy, the SU can access a channel regardless of whether the sensing result is idle or occupied. Thus, the SU can obtain greater throughput.
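The comparison can be illustrated with a simplified per-channel sketch. It is not the POMDP reward used in the paper. It assumes perfect sensing and ignores interference constraints. It also assumes a 2 ms sensing time (borrowed from the fixed sensing time case above) and that the rest of the slot is spent transmitting. The function and constant names are hypothetical.

```python
T_MS = 10.0      # slot length from the simulation setup
RATE_P1 = 0.06   # Mbps when transmitting with power P_1n
RATE_P2 = 0.02   # Mbps when transmitting with power P_2n


def expected_rate(strategy, p_idle, sensing_ms=2.0):
    """Expected per-channel rate (Mbps) under perfect sensing (an assumption)."""
    data_fraction = (T_MS - sensing_ms) / T_MS
    if strategy == "underlay":
        # No sensing: the whole slot is used, always at the low power.
        return RATE_P2
    if strategy == "overlay":
        # Transmit at high power only when the channel is sensed idle.
        return data_fraction * p_idle * RATE_P1
    if strategy == "mixed":
        # High power when sensed idle, low power when sensed occupied.
        return data_fraction * (p_idle * RATE_P1 + (1 - p_idle) * RATE_P2)
    raise ValueError(f"unknown strategy: {strategy}")


idle_probs = [0.3 + 0.2 * (n - 1) for n in range(1, 5)]
for name in ("underlay", "overlay", "mixed"):
    total = sum(expected_rate(name, p) for p in idle_probs)
    print(f"{name:9s} {total:.4f} Mbps")
```

In this sketch the mixed strategy is never worse than overlay, because it adds the low-power transmissions on occupied channels. It also beats underlay for the idle probabilities considered here.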

Figure 8

SU's throughput performance comparison under different access strategies.

6. Conclusion

In this paper, we propose a selective spectrum sensing and access strategy for cognitive radio sensor networks. In order to maximize the aggregate throughput of the SU system and reduce the spectrum sensing energy consumption, the SU selects some channels for spectrum sensing, accesses these channels based on the sensing results, and accesses the other channels directly. To cope with the dynamic spectrum environment, a selection making algorithm based on POMDP theory is proposed. An optimal policy and a myopic policy are proposed to solve the POMDP problem. Theoretical analysis and numerical results show that the proposed selection making algorithm better balances maximizing the throughput of the SU system against avoiding unacceptable interference to PUs.

Footnotes

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The work in this paper is partly supported by programs of Natural Science Foundation of China under Grant nos. 60903170, U0835003, and U1035001.

References

1. Zhang, Gjessing, Yuen, Xie, Guizani. Cognitive radio based hierarchical communications infrastructure for smart grid. IEEE Network. 2011;25(5):6–14. doi:10.1109/MNET.2011.6033030. Scopus 2-s2.0-80053502277.

2. Zhang, Zhou, Yang. Hybrid spectrum access in cognitive radio based smart grid communications networks. IEEE Systems Journal. 2014;8(2):577–587. doi:10.1109/JSYST.2013.2260931. Scopus 2-s2.0-84880100311.

3. Yang, Zhang, Xie. Energy-efficient hybrid spectrum access scheme in cognitive vehicular ad hoc networks. IEEE Communications Letters. 2013;17(2):329–332. doi:10.1109/LCOMM.2012.122012.122341. Scopus 2-s2.0-84874975746.

4. Liang, Feng, Zhao, Shen. Delay performance analysis for supporting real-time traffic in a cognitive radio sensor network. IEEE Transactions on Wireless Communications. 2011;10(1):325–335. doi:10.1109/TWC.2010.111910.100804. Scopus 2-s2.0-78651514591.

5. Haykin. Cognitive radio: brain-empowered wireless communications. IEEE Journal on Selected Areas in Communications. 2005;23(2):201–220. doi:10.1109/JSAC.2004.839380. Scopus 2-s2.0-13844296408.

6. Akan O. B., Karli O. B., Ergul. Cognitive radio sensor networks. IEEE Network. 2009;23(4):34–40. doi:10.1109/MNET.2009.5191144. Scopus 2-s2.0-68949152592.

7. Zahmati A. S., Fernando, Grami. Application-specific spectrum sensing method for cognitive sensor networks. IET Wireless Sensor Systems. 2013;3(3):193–204. doi:10.1049/iet-wss.2013.0006. Scopus 2-s2.0-84882998422.

8. Tseng L.-C., Chien F.-T., Marzouki, Chang R. Y., Chung W.-H., Huang. Self-organized cognitive sensor networks: distributed channel assignment for pervasive sensing. International Journal of Distributed Sensor Networks. 2014;2014: Article ID 183090, 10 pages. doi:10.1155/2014/183090.

9. Romero, Blesa, Araujo, Nieto-Taladriz. A game theory based strategy for reducing energy consumption in cognitive WSN. International Journal of Distributed Sensor Networks. 2014;2014: Article ID 965495, 9 pages. doi:10.1155/2014/965495.

10. Liang Y.-C., Zeng, Peh, Hoang A. T. Sensing-throughput tradeoff for cognitive radio networks. IEEE Transactions on Wireless Communications. 2008;7(4):1326–1337. doi:10.1109/TWC.2008.060869. Scopus 2-s2.0-46149091216.

11. Zhang, Xie, Song, Guizani. Secondary users cooperation in cognitive radio networks: balancing sensing accuracy and efficiency. IEEE Wireless Communications. 2012;19(2):30–37. doi:10.1109/MWC.2012.6189410. Scopus 2-s2.0-84860561074.

12. Lee W.-Y., Akyildiz I. F. Optimal spectrum sensing framework for cognitive radio networks. IEEE Transactions on Wireless Communications. 2008;7(10):3845–3857. doi:10.1109/T-WC.2008.070391. Scopus 2-s2.0-55149098083.

13. Bansal, Hossain Md. J., Bhargava V. K. Optimal and suboptimal power allocation schemes for OFDM-based cognitive radio systems. IEEE Transactions on Wireless Communications. 2008;7(11):4710–4718. doi:10.1109/T-WC.2008.07091. Scopus 2-s2.0-57149131975.

14. Kang, Liang Y. C., Garg H. K., Zhang. Optimal power allocation for fading channels in cognitive radio networks: ergodic capacity and outage capacity. IEEE Transactions on Wireless Communications. 2009;8(2):940–950. doi:10.1109/TWC.2009.071448. Scopus 2-s2.0-61349094185.

15. Chen, Tse C. K., Zhao. Optimal quantisation bit budget for a spectrum sensing scheme in bandwidth-constrained cognitive sensor networks. IET Wireless Sensor Systems. 2011;1(3):144–150. doi:10.1049/iet-wss.2011.0055. Scopus 2-s2.0-84555218173.

16. Zhang, Huang, Xie. Cross-layer optimized call admission control in cognitive radio networks. Mobile Networks and Applications. 2010;15(5):610–626. doi:10.1007/s11036-009-0194-1. Scopus 2-s2.0-78649506763.

17. Liu, Xie, Zhang, Leung V. C. M. Energy-efficient spectrum discovery for cognitive radio green networks. Mobile Networks and Applications. 2012;17(1):64–74. doi:10.1007/s11036-011-0307-5. Scopus 2-s2.0-84861825408.

18. Wellens, Riihijärvi, Mähönen. Empirical time and frequency domain models of spectrum use. Physical Communication. 2009;2(1):10–32.

19. Xie, Liu, Zhang. A parallel cooperative spectrum sensing in cognitive radio networks. IEEE Transactions on Vehicular Technology. 2010;59(8):4079–4092. doi:10.1109/TVT.2010.2056943. Scopus 2-s2.0-77958084080.

20. Liu, Xie S. L., Zhang. An efficient MAC protocol with selective grouping and cooperative sensing in cognitive radio networks. IEEE Transactions on Vehicular Technology. 2013;62(8):3928–3941.

21. Quan, Cui, Sayed A. H., Poor H. V. Optimal multiband joint detection for spectrum sensing in cognitive radio networks. IEEE Transactions on Signal Processing. 2009;57(3):1128–1140. doi:10.1109/TSP.2008.2008540. MR3027792. Scopus 2-s2.0-61549113405.

22. Pei, Liang, Teh K. C., K. H. How much time is needed for wideband spectrum sensing? IEEE Transactions on Wireless Communications. 2009;8(11):5466–5471. doi:10.1109/TWC.2009.090350. Scopus 2-s2.0-70749089661.

23. Paysarvi-Hoseini, Beaulieu N. C. Optimal wideband spectrum sensing framework for cognitive radio systems. IEEE Transactions on Signal Processing. 2011;59(3):1170–1182. doi:10.1109/TSP.2010.2096220. MR2767684. Scopus 2-s2.0-79951611470.

24. Paysarvi-Hoseini, Beaulieu N. C. On the benefits of multichannel/wideband spectrum sensing with non-uniform channel sensing durations for cognitive radio networks. IEEE Transactions on Communications. 2012;60(9):2434–2443. doi:10.1109/TCOMM.2012.062512.100568. Scopus 2-s2.0-84866729028.

25. Sun, Nallanathan, Wang, Chen. Wideband spectrum sensing for cognitive radio networks: a survey. IEEE Wireless Communications. 2013;20(2):74–81. doi:10.1109/MWC.2013.6507397. Scopus 2-s2.0-84877771071.

26. Kang, Liang Y.-C., Garg H. K., Zhang. Sensing-based spectrum sharing in cognitive radio networks. IEEE Transactions on Vehicular Technology. 2009;58(8):4649–4654. doi:10.1109/TVT.2009.2018258. Scopus 2-s2.0-70350220361.

27. Khoshkholgh M. G., Navaie, Yanikomeroglu. Access strategies for spectrum sharing in fading environment: overlay, underlay, and mixed. IEEE Transactions on Mobile Computing. 2010;9(12):1780–1793. doi:10.1109/TMC.2010.57. Scopus 2-s2.0-78049500193.

28. Zhao, Tong, Swami, Chen. Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: a POMDP framework. IEEE Journal on Selected Areas in Communications. 2007;25(3):589–599. doi:10.1109/JSAC.2007.070409. Scopus 2-s2.0-34247229185.

29. Hoang A. T., Liang Y.-C., Wong D. T. C., Zeng, Zhang. Opportunistic spectrum access for energy-constrained cognitive radios. IEEE Transactions on Wireless Communications. 2009;8(3):1206–1211. doi:10.1109/TWC.2009.080763. Scopus 2-s2.0-62949104513.

30. Chen, Zhao, Swami. Joint design and separation principle for opportunistic spectrum access in the presence of sensing errors. IEEE Transactions on Information Theory. 2008;54(5):2053–2071. doi:10.1109/TIT.2008.920248. MR2450849. Scopus 2-s2.0-43749094330.

31. Jeon W. S., Jeong D. G. Adaptive sensing scheduling for cognitive radio systems. Computer Networks. 2012;56(14):3318–3332. doi:10.1016/j.comnet.2012.06.002. Scopus 2-s2.0-84865776906.

32. Gong S. M., Wang, Liu, Yuan. Maximize secondary user throughput via optimal sensing in multi-channel cognitive radio networks. Proceedings of the 53rd IEEE Global Communications Conference (GLOBECOM '10); December 2010; Miami, Fla, USA. pp. 1–5. doi:10.1109/GLOCOM.2010.5683327. Scopus 2-s2.0-79551654090.

33. Jiang, Lai, Fan, Poor H. V. Optimal selection of channel sensing order in cognitive radio. IEEE Transactions on Wireless Communications. 2009;8(1):297–307. doi:10.1109/T-WC.2009.071363. Scopus 2-s2.0-61349151191.

34. Fan, Jiang. Channel sensing-order setting in cognitive radio networks: a two-user case. IEEE Transactions on Vehicular Technology. 2009;58(9):4997–5008. doi:10.1109/TVT.2009.2027712. Scopus 2-s2.0-70450201763.

35. Smith, Simmons. Point-based POMDP algorithms: improved analysis and implementation. http://uai.sis.pitt.edu/papers/05/p542-smith.pdf

Selective Sensing and Access Strategy to Maximize Throughput in Cognitive Radio Sensor Network
