Abstract
Cognitive radio networks are software-controlled radio networks that can allocate and reallocate spectrum according to demand. Although they promise highly efficient use of the spectrum, they also introduce new opportunities for misuse and attack. Among these, selfish attacks are the most challenging: a secondary user, or an unauthorized user without licensed spectrum, pretends to be a primary user by altering its signal characteristics. This paper proposes a novel method that uses machine learning techniques to efficiently detect and prevent primary user emulation attacks in cognitive radio. The method remains effective under dynamic changes such as varying bandwidth and changing attack signatures by performing learning and classification at the edge nodes, followed by the core nodes, using a deep learning convolution network. The proposed method is compared with two other state-of-the-art machine learning-based attack detection protocols and is found to significantly reduce false alarms in the secondary network while improving the overall detection accuracy at the primary network.
Introduction
Cognitive radio (CR) enables effective sharing of the spectrum between licensed and unlicensed users. Sharing of the radio band can be based on the probability of a user occupying a channel, so channel occupancy prediction provides a better sharing mechanism. 1 In conventional CR, the spectrum used by a secondary user is revoked by the primary user whenever the latter demands it. Shams Shafigh et al. 2 proposed a novel CR paradigm in which secondary users can use or share the spectrum through cost or price negotiation with the primary user, so that the primary user is not the only dominant user of the spectrum. CRs adjust to the environment and alter the way they communicate so that communication is secure and optimal. 3 Such spectrum sharing has evolved into a technique called opportunistic spectrum access (OSA), which offers a possible solution to spectrum scarcity and further improves the efficiency of spectrum utilization. 4 The defining feature of CR is its capability to sense, measure and be aware of the parameters that characterize the radio channel. Primary users have higher priority than secondary users, so secondary users must be able to sense reliably when the spectrum is free before using the unused part of it. The task of spectrum sensing is to provide information about spectrum usage and the presence of primary users. 5 The spectrum sensing period is the prime influencing factor in CR. It gives the system information about the end-to-end latency and the degradation in end-user quality of service (QoS) 4 experienced by primary users when accessing the band. Spectrum sensing poses many challenges because of uncertainties in the channel and noise.
In such a scenario, CR must be capable of identifying a faded or shadowed primary signal. 6 The main aim of spectrum sensing is to determine whether the spectrum is idle, so that it can be made available to an unlicensed user. 7 Sensing performance is affected by shadowing, multipath fading and other uncertainties in the system. To overcome these issues, Akyildiz et al. 8 proposed a cooperative spectrum sensing algorithm, which improves detection performance by exploiting spatial diversity, but at the cost of system overhead. In cooperative spectrum sensing, secondary users report their local energy statistics to the secondary base station, which is called the fusion centre. The cost of local processing is the overhead of collecting the data/statistics and calculating the energy at each user. 9 Spectrum sensing methods include cyclostationary detection (CS-D), classic likelihood ratio test detection (LRT-D), matched filtering detection (MF-D) and energy-based detection (EB-D). Based on their implementation requirements, these methods fall into three categories: (1) methods that require no information about source or noise power, (2) methods that require only the noise power and (3) methods that require both source and noise power. 10
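As an illustration of energy-based detection (EB-D), which falls in the category that needs only the noise power, the following minimal sketch compares the average sample energy with a threshold. The function name `energy_detect`, the `threshold_factor` parameter and the synthetic signals are assumptions made for this example; they are not taken from the cited studies.

```python
import numpy as np


def energy_detect(samples, noise_power, threshold_factor=2.0):
    """Energy-based detection: compare average sample energy to a threshold.

    threshold_factor is a hypothetical tuning parameter; the band is declared
    occupied when the measured energy exceeds threshold_factor * noise_power.
    """
    samples = np.asarray(samples)
    energy = float(np.mean(np.abs(samples) ** 2))
    occupied = energy > threshold_factor * noise_power
    return occupied, energy


# Synthetic illustration: unit-power Gaussian noise, with and without a tone.
rng = np.random.default_rng(0)
n = 4096
noise = rng.normal(0.0, 1.0, n)
tone = 2.0 * np.sin(2 * np.pi * 0.05 * np.arange(n))  # average power close to 2

idle_decision, idle_energy = energy_detect(noise, noise_power=1.0)
busy_decision, busy_energy = energy_detect(noise + tone, noise_power=1.0)
print(idle_decision, round(idle_energy, 2))
print(busy_decision, round(busy_energy, 2))
```

In practice the threshold would be derived from a target false alarm probability rather than fixed by hand; the fixed factor here only keeps the sketch short.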
Attacks in CR involve an unknown user hijacking the spectrum, so preventive measures are necessary. Shams Shafigh et al. 2 and Wang et al. 11 proposed preventive measures for cross-layer attacks and defence mechanisms in CR. At the physical (PHY) layer, there are two types of attack: the primary user emulation (PUE) attack and the false sensing data reporting attack. The attacker’s strength increases when there are unusual mechanisms in each layer. 11 In a PUE attack, a hostile user transmits an interfering signal that makes it appear to be the primary user; the secondary user then has to vacate the spectrum, yielding it to the attacker posing as a primary user. 12 Such an attack can be detected by marking the starting point of the two signal sequences that interfere at the receiver. From the results obtained at multiple receivers and the transmitter’s reference positions, the position of the transmitting user can be determined. Comparing this with the known position of the primary user reveals the PUE attack. 13 Another type of attack in CR is the denial-of-service (DoS) attack, in which the attacker attempts to reduce the bandwidth available to a user, thereby preventing the authorized user from accessing the network. DoS is considered the attack with the highest risk and can only be prevented by a better cross-layer design. 14 Hard decision fusion rules 15 have been proposed for preventing PUE attacks: the fusion centre identifies the actual primary user from the fused result of these rules, thus preventing the attack. The key idea is to find collisions with the primary user that would not occur if all secondary users obeyed the fusion centre rules. Duan et al. 16 proposed a novel algorithm for preventing PUE attacks that does not need the fusion centre decision to identify and exclude attackers.
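The exact fusion rules of the cited work are not reproduced in this article. As a rough sketch of what hard decision fusion at a fusion centre can look like, the example below implements the three textbook rules (OR, AND and majority) over the binary decisions reported by secondary users. The function name `fuse_hard_decisions` and its `rule` argument are assumptions made for illustration.

```python
def fuse_hard_decisions(decisions, rule="majority"):
    """Combine binary 'primary user present' reports at the fusion centre.

    decisions: iterable of truthy/falsy local decisions, one per secondary user.
    rule: "or", "and" or "majority" (names chosen for this sketch).
    """
    votes = [bool(d) for d in decisions]
    if not votes:
        raise ValueError("at least one local decision is required")
    positive = sum(votes)
    if rule == "or":
        return positive >= 1            # any single report is enough
    if rule == "and":
        return positive == len(votes)   # all users must agree
    if rule == "majority":
        return positive > len(votes) / 2
    raise ValueError(f"unknown fusion rule: {rule}")


reports = [True, False, False]
for rule in ("or", "and", "majority"):
    print(rule, fuse_hard_decisions(reports, rule))
```

The OR rule favours protection of the primary user at the cost of more false alarms, whereas the AND rule does the opposite; the majority rule sits between the two.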
To allocate spectrum to unlicensed users efficiently, an objective function combined with a modified self-organizing map (SOM) method 17 has been tested, indicating that a neural network approach can overcome the limitations of conventional optimization. Selecting between two nodes when allocating spectrum is challenging because bandwidth varies dynamically. For forwarding over links to neighbour nodes and selecting the hop with the next highest bandwidth, a machine learning (ML)-based feed forward neural network (FFN) algorithm has been proposed. 18 Li et al. 19 proposed modulation detection with relatively few training samples, using a fourth-order cyclic cumulant vector with a support vector machine supervised learning model (SVM-S-ML).
Detecting and selecting idle channels during spectrum sensing is a challenge in cognitive radio networks (CRNs). This problem has been addressed 20 by a channel control protocol that uses an SVM learning technique. Radio frequency (RF) signals can also be classified using sparse coding, a blind, unsupervised signal classification method 21 that needs no prior protocol-specific knowledge and has been evaluated in terms of false alarm rate (under 20 dB). However, attacks on CRNs range from simple attacks to complex distributed ones. The former are easy to detect at an early stage, whereas detecting the latter (intrusion detection) can be addressed by 22 reinforcement learning (RL) with a hierarchical strategy. A dual classification approach that combines supervised machine learning (S-ML) and unsupervised machine learning (US-ML) with distinct training sets was first proposed by Srinivasan and Shivakumar 23 with an initial set of experiments and validated results, which motivated the present study to enhance those results further.
This section has described the primary considerations of CRNs and the challenges they face from attacks that exploit the allocation of free spectrum channel bandwidth to CR users. The remainder of the article is organized as follows. Section ‘Problem statement and contributions’ sets the paradigm for current state-of-the-art problems. Section ‘Research methodology’ introduces previous and existing methods that have been tested, along with their impact and limitations for the proposed system models. Section ‘Proposed machine learning model’ elaborates the quantitative experimental research conducted on the proposed model for detection and defence using advanced artificial intelligence (AI)-based techniques with dual classification at the core and the edge. In section ‘Experiment and simulation results’, experimental results are evaluated with key performance index (KPI) to validate the hypothesis. The system is assumed to be more efficient even when the attack signature changes rapidly in accordance with state of network (
Problem statement and contributions
In general, a machine learning technique depends on training and testing phases, and its accuracy depends on the training data set. It is very difficult to gather a training data set with enough linearly separable, distinct anomaly classes. Therefore, deep learning and RL are preferred for real-time machine learning and classification problems. However, classifying network states as valid or malicious, in either the spectrum or the channel state, also requires some preliminary knowledge, which is often based on parametric thresholds.
A recursive attack may shift the state distribution in an uncontrolled way, causing the classifier to fail because its reference model is no longer suitable. Moreover, once such attacks are detected, the CRN often takes preventive measures such as changing the central frequency, modulation or spectrum usage. Therefore, a robust and scalable CR defence system against any form of attack (particularly selfish attacks) is expected to combine an edge-based observation, detection and classification engine with one in the cloud. Even in the absence of a recursive attack, a change in the attack signature often leads to misdetection. In the context of the classifier’s learning, these changes might be classified as anomalies. Efficient CR attack prevention should not depend only on a machine learning-based algorithm but should also be coupled with a big data analytics framework that imports instantaneous vectors and their corresponding deviations into the decision system. The current state of the art in CR attack detection depends on prior knowledge of a ‘Non-Attack State’ and measures the deviation of the current state from this normal state. Because attack signatures change frequently, recursive attacks alter the ‘normal state’ itself, and such models cannot adapt in time.
To overcome these challenges, a semi-supervised distributed machine learning technique (SS-ML) is proposed for the AI that runs on the cloud. The architecture allows the edge engine running in the cognitive radio base station (CR-BS) to perform data clustering and session classification based on the SOM. These labelled data are then sent to the cloud core, where a supervised learning technique (S-ML) first classifies the data based on past learning. If the training error rate is extremely high, the training vectors are left unadjusted. This improves the overall performance significantly.
Research methodology
The overall CRN architecture is shown in Figure 1. The problem is to detect and isolate S3, which is a selfish attacker. Because bandwidth usage is low in the cell covered by cognitive base station 2 (B2), node S3 pretends to be a primary user. In this scenario, node S2’s link is dropped and the session and spectrum are allocated to S3, denying service to S2 and reducing the opportunity for any other secondary user to apply for and be assigned the residual spectrum in the cell covered by B2.

Figure 1. Cognitive radio protocol architecture.
The CRN architecture in Figure 1 is also instructive because it shows that, at any given instant, not every cell will be attacked and not every cell will face similar attacks. In fact, the probability of any cell being under attack at a given instant is far lower than the probability of it being in a normal state. If this belief can be propagated over the entire network and network state normalcy can be derived from the aggregated belief, then the system can detect abnormalities more efficiently. Figure 1 also makes clear that B2 cannot know that S3 is an attacker without sufficient knowledge of the normal state, which is available as distributed information across the network. By introducing a cloud-based centralized decision-making system that depends primarily on the information retrieved by all the CR-BSs, the AI running in the cloud can be given a better belief system.
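One simple way to picture this aggregated belief, under the assumption that most cells are normal at any instant, is to take the per-parameter median over all cells as the network-wide baseline and to flag cells that deviate strongly from it. The sketch below is an illustration only and is not the method of this article: the function `flag_abnormal_cells`, the deviation factor `k` and the two example features per cell are all hypothetical.

```python
import numpy as np


def flag_abnormal_cells(cell_states, k=3.0):
    """Flag cells whose state deviates strongly from the network-wide median.

    cell_states: array of shape (n_cells, n_features), one row per CR-BS.
    k: hypothetical threshold in units of median absolute deviation (MAD).
    """
    x = np.asarray(cell_states, dtype=float)
    baseline = np.median(x, axis=0)                   # robust "normal" state
    spread = np.median(np.abs(x - baseline), axis=0)  # MAD per feature
    spread = np.where(spread > 0, spread, 1.0)        # avoid division by zero
    scores = np.max(np.abs(x - baseline) / spread, axis=1)
    return scores > k, baseline


# Six cells, two made-up features (e.g. channel occupancy, interference level).
states = [
    [0.30, 10.0],
    [0.32, 11.0],
    [0.28, 9.0],
    [0.31, 10.5],
    [0.29, 9.5],
    [0.90, 25.0],  # cell that behaves differently from the rest
]
abnormal, baseline = flag_abnormal_cells(states)
print(abnormal.tolist())
```

The median is used instead of the mean so that a small number of attacked cells cannot drag the baseline towards the attack state.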
Most existing studies model the CRN’s normal state (non-attack state) as a noise-free transmission state. Although earlier work has introduced partially blind and blind spectrum sensing mechanisms, 21 spectrum usage is not the only criterion that can indicate an attack. For instance, an attacker needs to try several permutations and combinations of modulation scheme changes before succeeding in emulation. Such changes cannot be observed from spectrum usage alone. Different volumes of network traffic lead to different states of interference, noise, signal degradation and so on. Hence, it is essential to define a set of model parameters that accurately represents a network state, which can then be classified as normal or abnormal. The methodology is explained in two parts: first the network state and its parameter model, and then the machine learning engine that runs both at the edge and in the core. The model training parameters are calculated to identify the network state.
A system model needs to be represented mathematically to identify the network state. Hence, the network state is defined as a set of distinct parameters that can be used to distinguish the attack state from the non-attack state.
Probability of assignment of channel using channel assignment matrix
The channel assignment matrix 17 can be mathematically defined as equation (1), where ‘
where SC is the total assignment factor represented in equation (2)
subject to conditions
Amount of interference can be measured by assigning a weight to each assignment. The proximity factor
The cost function can be generated by
The aforementioned matrix is used by the SC (CR-BSs B1 and B2 in Figure 1). The aforementioned cost function derived from interference observation is used by the base stations to determine the channel usage by the primary user and viability to assign the channel to competing secondary users. By using the statistics from this matrix, average interference with respect to number of competing secondary users for channel ‘
Average usage of spectrum estimation using link bandwidth estimation
Suppose node
at time
The algorithm illustrated in the study by Yang et al. 18 indicates that node
Representation of modulation signals using cyclic cumulants
Equations (6) and (7) 19 represent the modulation signal through cyclic cumulants, whereas equation (8) represents the complete modulated signal through the fourth-order cyclic cumulant. This is used to recognize whether the received modulated signal is genuine. The fourth-order cyclic cumulant vector (CRM) in the frequency domain 19 is chosen because it is considered the most robust model in a spectrum-varying CRN. According to the findings in the study by Weifang, 14 recognition accuracy increases as the signal power in the channel gets stronger: accuracy is almost 100% when the SNR is 10 dB, and the fourth-order cumulant vector still achieves about 90% accuracy when the SNR is 4 dB, which shows that the fourth-order cyclic cumulant vector has good discrimination capability in the non-noisy channel. Hence, our model adopts the fourth-order CRM because of its robustness under non-ideal network conditions, including low spectrum availability, low SNRs, fast spectrum switching and multipath fading.
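To give a feel for why fourth-order cumulants discriminate between modulation types, the sketch below estimates the ordinary (zero cycle frequency) cumulants C40 and C42 of a zero-mean complex sequence, normalized by the squared signal power. This is a simplified stand-in and not the cyclic cumulant formulation of equations (6)–(8); the function name `fourth_order_cumulants` and the noise-free test constellations are assumptions made for illustration.

```python
import numpy as np


def fourth_order_cumulants(x):
    """Return power-normalized (C40, C42) estimates for a complex sequence.

    Uses the standard relations for zero-mean signals:
      C40 = M40 - 3*C20^2
      C42 = M42 - |C20|^2 - 2*C21^2
    """
    x = np.asarray(x, dtype=complex)
    x = x - x.mean()                      # enforce zero mean
    c20 = np.mean(x ** 2)
    c21 = np.mean(np.abs(x) ** 2)         # signal power
    c40 = np.mean(x ** 4) - 3 * c20 ** 2
    c42 = np.mean(np.abs(x) ** 4) - np.abs(c20) ** 2 - 2 * c21 ** 2
    return c40 / c21 ** 2, c42 / c21 ** 2


# Noise-free, balanced symbol streams for two modulation types.
bpsk = np.tile([1.0, -1.0], 64)
qpsk = np.tile(np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2), 32)

print("BPSK:", fourth_order_cumulants(bpsk))  # theoretical values: (-2, -2)
print("QPSK:", fourth_order_cumulants(qpsk))  # theoretical values: (-1, -1)
```

Because the two modulation types yield clearly different cumulant values, a vector of such features can be passed to a classifier; with noisy samples the estimates scatter around these theoretical values.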
Assuming signal sequence is
where Fourier coefficient
For signal sequence
where
Calculation of energy sensing model using dynamic spectrum sensing
For the energy sensing model based on dynamic spectrum sensing (DSS) and noise spectral density (sdf), 20 equations (9)–(12) are used for energy sensing
The average number of SUs successfully transmitting within the first (
Average number of secondary users left for transmitting their sensing result is given by equation (13) and transmission probability success is given by equations (14) and (15)
The network state (communication set)
where STAT represents a set of functions comprising minimum, maximum, average, standard deviation and variance. Therefore, STAT(A) defines network state set of minimum function set, STAT(B) defines network state set of maximum function set, STAT(C4) defines network state of average function set and STAT(P) defines network state set of variance set.
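A minimal sketch of how a STAT-style summary could be turned into a network state vector is given below. It simply applies minimum, maximum, average, standard deviation and variance to each observed parameter and concatenates the results. The parameter names `A`, `B`, `C4` and `P` reuse the symbols above, but the sample values, the function names `stat` and `network_state`, and the flat vector layout are assumptions made for illustration.

```python
import numpy as np

STAT_NAMES = ("min", "max", "mean", "std", "var")


def stat(values):
    """Apply the five STAT functions to one parameter's observations."""
    v = np.asarray(values, dtype=float)
    return np.array([v.min(), v.max(), v.mean(), v.std(), v.var()])


def network_state(observations):
    """Concatenate STAT summaries of all parameters into one feature vector.

    observations: dict mapping a parameter name to a sequence of samples.
    Parameters are taken in sorted name order so the layout is reproducible.
    """
    return np.concatenate([stat(observations[name]) for name in sorted(observations)])


# Made-up observation windows for the four parameter groups.
window = {
    "A": [0.2, 0.4, 0.3, 0.5],     # channel assignment related samples
    "B": [1.0, 2.0, 3.0, 4.0],     # link bandwidth estimates
    "C4": [-1.1, -0.9, -1.0],      # fourth-order cumulant feature
    "P": [0.7, 0.8, 0.9],          # transmission success probability
}
state = network_state(window)
print(state.shape)
```

A fixed-length vector of this kind is convenient because both the edge classifier and the core classifier can consume it without knowing how many raw samples were observed.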
Proposed machine learning model
The machine learning process in the proposed model is distributed in nature. It comprises three important stages of classification and learning for PUE detection and prevention.
Classification at edge with unsupervised (US-ML)
In real-time communication, network state changes unpredictably. In general, it is difficult to create the test rules and cases needed for training as the network state, communication entities and communication metrics change frequently. Therefore, an SOM-based edge classifier is proposed in Figure 2. SOM is essentially a clustering technique, which can organize data into sets and classify any input sets into one of these sets. Once sufficient observable data are collected at any of the edge nodes, the data are trained with SOM where

Figure 2. Self-organizing map structure (SOM model).
However, because sufficient outliers or intrusion cases may not be present at the beginning, the classes and the training cannot be considered to be in a steady state. One rule that remains consistent in this context is that, in a normal network state, all the edge nodes will have similar observations of the mentioned parameters. Therefore, if all the edge-classified states and the corresponding data are gathered centrally and a model is built, the model can be trained to select and tune the parameters and test cases automatically in the core cloud.
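The edge clustering step can be pictured with a plain self-organizing map. The sketch below is a generic SOM with a Gaussian neighbourhood and linearly decaying learning rate, not the modified SOM of the cited work; the grid size, learning rate, neighbourhood width and the two synthetic clusters are all assumptions made for illustration.

```python
import numpy as np


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM and return its unit weights, shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n_units = rows * cols
    # Initialize each unit from a randomly chosen training sample.
    weights = data[rng.integers(0, len(data), n_units)].copy()
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            decay = 1.0 - step / total_steps
            lr = lr0 * decay                  # learning rate shrinks to zero
            sigma = sigma0 * decay + 0.1      # neighbourhood width stays positive
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            influence = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * influence[:, None] * (x - weights)
            step += 1
    return weights


def best_matching_unit(weights, x):
    """Index of the unit closest to x; this index acts as the edge class label."""
    return int(np.argmin(np.linalg.norm(weights - np.asarray(x, dtype=float), axis=1)))


# Two synthetic groups of 2-D network state vectors.
rng = np.random.default_rng(1)
usual = rng.normal([0.0, 0.0], 0.3, size=(20, 2))
unusual = rng.normal([10.0, 10.0], 0.3, size=(20, 2))
som = train_som(np.vstack([usual, unusual]))
print(best_matching_unit(som, [0.0, 0.0]), best_matching_unit(som, [10.0, 10.0]))
```

The unit index returned for each observation is the kind of unsupervised label that an edge node could forward to the core together with the raw state vector.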
Classification at core with supervised learning (S-ML)
Mixed model optimization (MMO), performed as a series of steps, is illustrated in Figure 3. In Step 1, during the model training phase, the unsupervised classes and the parameters are collected from the edge nodes and clustered with a two-class clustering. The iteration starts with one class of an edge node, and the parameters from all the other edge nodes that are close to this class are grouped with it. This process is repeated until a steady state is reached. At the beginning there are no PUE attacks, so the model training essentially yields a single class and validates the model building process. Once the generic normal network state reaches a steady state, the deep learning phase begins, which is represented by the ML model in Figure 3.

Figure 3. Model optimization at the core (mixed model learning and optimization).
The proposed system adopts a reinforced machine learning (RML)-based feedback multilayer neural network with the hybrid learning model. 21,22 In this method, the neural network is trained with a small initial knowledge base. As the system acquires and classifies each network packet as either normal or abnormal, the classification is validated by a human expert, who classifies the packet based on human intelligence. The neural network is retrained periodically with the newly available data. This approach of combining a human security expert’s decision with machine learning was introduced in an earlier study 22 and has also been used successfully by Palantir systems for tracking the co-ordinates of various financial defaulters, among other applications; the approach is elaborated in detail in that study. 22 The traditional data and network intrusion detection model is redesigned to suit CRNs and adopted in the proposed work.
Classification of network state (Ns) at the core
In the second phase of training (both edge and the core), controlled PUE attacks are generated randomly from different edges, and model at the core is retrained as illustrated in Figure 4. By the end of this phase, the core model is expected to essentially have two primary classes: normal and abnormal. Abnormal class can be considered as a superset of all the abnormal categories where each of the abnormal categories might be one specific type of PUE attack.

Figure 4. Machine learning model – training and testing of DLCNN at the core.
In the third phase, the constructed edge and core models are validated by generating periodic PUE attacks from a specific node. The core is expected to classify these data correctly as abnormal, whereas data coming from all other nodes must be classified correctly as normal. If the classification accuracy is below 100%, retraining of the edge and core models is repeated. By the end of the controlled testing, the core and edge models reach a steady state and are capable of handling and classifying PUE attacks. The efficiency of the classifier is validated through suitable test cases, which are presented in detail in section ‘Experiment and simulation results’.
Experiment and simulation results
A CRN with 6 cells and 60 primary users is created and trained. 23 For testing, the number of attackers is given as input, and the simulation places them randomly into the cells. Each attacker chooses its nearest primary user and is assumed to know the spectrum of that primary user. In total, 50% of the users in a cell are primary users and the rest are secondary users. An attacker is selected from the secondary users. The attacked primary user is idle and does not communicate, but the remaining primary users communicate randomly. Secondary users collaboratively sense the spectrum and the network state and request a spectrum grant from the CR controller within the cell, where the edge classifier runs. The CR local controller also receives periodic state observation data from the primary users. When the attacker node requests a spectrum grant by emulating the spectrum of the primary user, the goal is to classify this request as an attack request. The edge node performs the first level of classification and forwards the data of the new request to the core node, which is linked to all edge nodes. The classification at the core node for each request is checked against the actual request. The network state observed at each active primary user is periodically forwarded to the core node for classification. By varying the number of attackers, the error rate, false acceptance and false rejection are calculated. Besides the proposed deep learning convolution network (CDLN), two baselines are considered: a rule-based classifier, which applies fast Fourier transform (FFT)-based classification to the aggregated end signal at the core and uses IF-THEN rules for class prediction, and an FFN at the edge that uses the same network state as the proposed method. Results are presented in Figures 5–8.
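The three evaluation quantities can be computed from the true and predicted labels of the spectrum requests, as in the sketch below. The article does not spell out its definitions, so the ones used here are assumptions: false acceptance is the fraction of attack requests accepted as legitimate, false rejection is the fraction of legitimate requests rejected as attacks, and error rate is the fraction of all requests that are misclassified. The function name `detection_metrics` and the sample labels are hypothetical.

```python
def detection_metrics(actual, predicted):
    """Compute error rate, false acceptance rate and false rejection rate.

    actual, predicted: sequences of booleans where True means 'attack request'.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label sequences must be non-empty and of equal length")
    tp = sum(a and p for a, p in zip(actual, predicted))
    tn = sum((not a) and (not p) for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))  # legit rejected
    fn = sum(a and (not p) for a, p in zip(actual, predicted))  # attack accepted
    return {
        "error_rate": (fp + fn) / len(actual),
        "false_acceptance": fn / (tp + fn) if (tp + fn) else 0.0,
        "false_rejection": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Eight requests: three are attacks, one attack is missed, one legit is rejected.
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, False, False, False, True, False]
print(detection_metrics(actual, predicted))
```

Repeating this computation for each number of attackers gives the kind of curves that are plotted against the attacker count.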

Figure 5. Performance analysis – attacker node versus detection accuracy.

Figure 6. Performance analysis – attacker node versus false alarm.

Figure 7. Performance analysis – SNR versus detection accuracy.

Figure 8. Performance evaluation and impact – detection accuracy with varying bandwidth.
It can be seen from the results illustrated in Figure 8 that the proposed machine learning-based algorithm (CDLN) performs better than the rule-based classifier under any given attack condition. Moreover, owing to its bi-level classification, the proposed algorithm also performs better than the conventional neural network in terms of error rate (
One of the major advantages of the proposed system is that the system is assumed to be more efficient when the attack signature changes frequently. In order to validate this hypothesis, the network state (
The quantitative KPI comparison of the proposed model with the FFN and rule-based models is presented in Table 1 and Figure 9, which clearly shows that the proposed ML-based CDLN model outperforms the existing models under consideration.
Quantitative metrics comparison chart of proposed with FFN and rule-based methods.

KPI comparisons of the proposed models with FFN and rule-based models.
CDLN: deep learning convolution network; FFN: feed forward neural network.
Summary and conclusion
CRNs are gaining significant popularity because of their spectrum management capabilities and their ability to be programmed automatically. An unlicensed secondary user can request unused spectrum from the cognitive radio controller (CRC), which can allocate it temporarily until the primary user needs it again. In a PUE attack, a selfish secondary node attempts to emulate a primary node, which prevents other secondary nodes from seeing the free spectrum and gives the selfish node itself privileged access to the spectrum. If the attacker node has acquired the access token of the primary node, it can bypass the security. The proposed work is aimed at detecting such scenarios.
Although several previous works have focussed on PUE attacks and their detection and prevention, the models are often based on the assumption that the nature of the attack remains consistent. However, the proposed work indicates that the PUE attack signature may often vary significantly from case to case. Hence, more adaptive classification of the observed network state is needed for better attack detection.
This study proposed and validated an algorithm for PUE attack detection in the CRN using a dual classification method, in which primary classification is performed at the edge CR node, followed by final classification at the core using the CDLN. The algorithm is compared with both a popular rule-based method and an FFN-based algorithm, as in Figure 8. Retraining is essential for improving attack detection accuracy when the attack signature changes rapidly, and here the CDLN performs much better than the FFN. On average, machine learning-based algorithms work better than rule-based (or threshold-based) algorithms, as in Figure 8. Simulation results show that the accuracy of the proposed algorithm is at least 25% better than that of the conventional neural network and 40% better than that of the threshold-based algorithm.
The proposed model can be further improved by feedback learning and rules governed by SS-ML at the edge of the network and a decision support system governed by the core.
Footnotes
Handling Editor: Mihael Mohorcic
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
