Abstract
Rogue access point attack is one of the most important security threats for wireless local networks and has attracted great attention from both academia and industry. Utilizing received signal strength information is an effective solution to detect rogue access points. However, the received signal strength information is formed by multi-dimensional received signal strength vectors that are collected by multiple sniffers, and these received signal strength vectors are inevitably lacking in some dimensions due to the limited wireless transmission range and link instability. This will result in high false alarm rate for rogue access point detection. To solve this issue, we propose a received signal strength–based practical rogue access point detection approach, considering missing received signal strength values in received signal strength vectors collected in practical environment. First, we present a preprocessing scheme for received signal strength vectors, eliminating missing values by means of data filling, filtering, and averaging. Then, we perform clustering analysis on the received signal strength vectors, where we design a distance measurement method that dynamically uses partial components in received signal strength vectors to minimize the distance deviation due to missing values. Finally, we conduct the experiments to evaluate the performance of the practical rogue access point detection. The results demonstrate that the practical rogue access point detection can significantly reduce the false alarm rate while ensuring a high detection rate.
Introduction
Nowadays, the IEEE 802.11 wireless local network (WLAN) is becoming an extremely popular wireless technology for various scenarios, such as campuses, homes, enterprise environments, and public spaces.1,2 With the wide deployment of WLANs, the issues of security and privacy have been increasingly emerging. 3 Due to the openness of the wireless transmission medium, a variety of attacks can be launched easily. Among these attacks, rogue access point (AP) attacks have attracted more and more attention, and the rogue AP is defined as an illegal AP that is not deployed by the WLAN administrator. 4
An adversary can set up a rogue AP with the same service set identifier (SSID) of the legitimate APs, and attract users to connect with it, so as to passively obtain user privacy information or actively perform man-in-the-middle (MITM) attack. 5 Moreover, in order to avoid being caught, adversaries usually capture the MAC address information of legitimate APs through passive monitoring and then modify the MAC address of the rogue AP by simply issuing an ifconfig command to masquerade as a legitimate AP. At present, the hardware cost of a rogue AP is low, and its software installation is also very convenient, which brings serious security threats to WLANs, especially public WLANs. The CCTV “3.15” evening party in 2015 reported the whole process of hackers using rogue APs to steal users’ photos, email accounts, passwords, and other private information. In recent years, some cases of personal property loss have occurred frequently, and bank accounts and passwords have been stolen because of misuse of rogue APs in public WLANs. Within the enterprise, the rogue AP attack has become an effective approach for hackers to invade internal networks, which poses a great threat to intranet security. Moreover, the performance of enterprise WLANs is being significantly impacted by the ever-increasing rogue APs due to the carrier sense interference and hidden terminal interference. 6 Because the vast majority of existing user devices lack the recognition and authentication mechanisms for the APs, it is impossible to effectively distinguish legitimate APs from rogue APs. Therefore, it is necessary to study the rogue AP detection technology, which is the premise and basis for locating and troubleshooting rogue APs.
The existing solutions of rogue AP detection mainly include the following three aspects, that is, user-side detection, wired-side detection, and wireless-side detection. Through using user-side detection methods,4,7–9 a user device can determine whether its associated AP is a rogue AP. Although these solutions are lightweight and low cost, the user devices need to be customized or additional software should be installed. Thus, it is hard to be widely applied for user-side detection, while the detection from the perspective of network management has more advantages. Among them, the wired-side detection methods10–12 assume that the adversary will use a specific wired backbone network to pass victims’ data to the Internet and achieve rogue AP detection by monitoring the backbone network and looking for the traffic that appears to come from rogue APs. However, this assumption limits the application scope because the adversary may use the same wired backbone network with legitimate APs, or use a different way, such as 4G long term evolution (LTE) and other mobile communication networks. Besides wired-side detection, another approach is to detect rogue APs in the wireless side,13–22 which directly senses the signals of rogue APs in the air and detects them according to the characteristics of the signals, such as clock skew and received signal strength (RSS). An RSS value is the signal strength of a received frame measured at the sniffer, and the signal strength of the frame measured at multiple sniffers can constitute an RSS vector. Because RSS is correlated to the transmission power, the transmitter–receiver distance, and the environment, the RSS vectors of frames at one location will differ from that at another location. Therefore, the RSS vectors can be used to effectively detect the rogue APs that have different locations than legitimate APs, where a multi-dimensional RSS vector is aggregated from distributed RSS measurements.
However, for a practical environment, we have found that there exist a great many missing values in RSS vectors, due to the limited wireless transmission range and unreliable wireless links. If the sniffer is out of the frame’s transmission range, the corresponding RSS value is missing. If the frame is lost at the sniffer, the corresponding RSS value is also missing. The missing values in RSS vectors will affect the effectiveness of rogue AP detection, resulting in a higher false alarm rate.
To solve this issue, we propose a novel RSS-based approach for practical rogue access point detection (PRAPD), reducing the effect of missing RSS values on detection performance as much as possible. The main contributions of this article are as follows:
We present a preprocessing scheme to eliminate the missing values in the collected RSS vectors by means of data filling, filtering, and averaging.
We use k-medoid algorithm to perform clustering analysis on the RSS vectors, where we design a distance metric method that dynamically uses partial components in RSS vectors to minimize the distance deviation caused by the missing values.
We conduct the experiments in the practical environment and evaluate the performance of the proposed approach PRAPD. The results show that the PRAPD can effectively reduce the false alarm rate while ensuring a high detection rate.
The rest of the article is organized as follows: section “Related work” briefly reviews the existing work on rogue AP detection. Section “Rogue AP attack” describes the attack model we address in this article. In section “Design of PRAPD,” we present the rogue AP detection approach, mainly including data preprocessing and clustering analysis. In section “Evaluation,” the experiments and results are presented and analyzed. Finally, section “Conclusion” concludes this article.
Related work
The rogue AP attack is a serious security threat in WLANs and has attracted significant attention from both industry and academia. 5 At present, the function of rogue AP detection has been integrated in some WLAN network management systems, such as AirWave. 23 In addition, there are some systems that deploy sniffer nodes to achieve rogue AP detection, such as AirMagnet. 24
Rogue AP detection has also caught the attention of researchers for many years. The traditional detection method collects the SSID, MAC address, and other parameters of the legitimate APs in advance to form a white list, then captures 802.11 frames and detects rogue APs by MAC address filtering. Although this method is simple and efficient, an attacker can easily evade detection by changing the MAC address of the rogue AP to that of a legitimate AP. 25 Because the MAC address can be spoofed, researchers have begun to explore other effective features for rogue AP detection. In this article, we classify the existing approaches of rogue AP detection into three main categories: user-side detection, wired-side detection, and wireless-side detection.
User-side detection
Several approaches have been proposed to implement low-cost and lightweight rogue AP detection from the perspective of users.
Han et al. 4 considered a category of rogue APs that are equipped with dual wireless interfaces (one interface is connected to a legitimate AP, while the other pretends to be a legitimate AP to lure users) and designed a timing-based scheme that allows users to avoid connecting to rogue APs. The user-centric scheme employs the round trip time between the user and the DNS server to independently determine whether the associated AP is a rogue AP. To detect the same category of rogue APs, Yang et al. 7 analyzed the interpacket arrival time (IAT), that is, the time interval between two consecutive packets from the same device (the remote server or the associated AP) to the user device. On this basis, the authors proposed detection algorithms that used the IAT as the detection feature, considered the influencing factors of RSS and network saturation, and employed the sequential probability ratio test to achieve rogue AP detection.
Nakhila et al. 8 proposed a comprehensive real-time user-side method to detect both types of rogue APs in parallel by creating two virtual wireless clients (VWCs). For a rogue AP connecting to a legitimate AP, a VWC monitored multiple channels in random order looking for specific data packets sent by a server on the Internet, and the rogue AP would be detected if duplicated data or no data were captured. For a rogue AP connecting to 4G LTE and other mobile communication networks, the second VWC would detect it when the wireless network used two different gateways by switching from one AP to another in the middle of a secure connection. Gonzales et al. 9 presented a context-leashing strategy for rogue AP detection, where users compared the current context with the previously learned context for the AP, and determined whether it was a rogue AP.
These approaches defend against rogue AP attacks from the user’s point of view and are lightweight and low cost. However, users must customize their devices or install additional software, which makes wide deployment difficult.
Wired-side detection
Wired-side detection is a category of rogue AP detection technologies from the perspective of network administrators. Because WLANs generally use a wired network as the backhaul network to connect to the Internet, network traffic can be captured at the gateway or at the mirror port of the switch, and then traffic analysis is performed to detect rogue APs.
Beyah et al. 10 found that a wireless link in a network path of multiple links would cause a more random and temporally different spreading of packets. On this basis, the characteristic of inter-packet spacing was used to detect unwanted wireless traffic on the switch port and decide whether a rogue AP exists. Wei et al. 11 demonstrated that the inter-arrival time of TCP ACK pairs could effectively differentiate wired and wireless connections and designed two online algorithms to detect rogue APs. Both algorithms used the sequential hypothesis test technique and took the inter-ACK times as the input. Burns et al. 12 considered the relay-based rogue AP attack, that is, relaying the traffic through a legitimate AP, and set up a remote server to detect rogue APs by analyzing the user–server and server–user traceroute results. Although a rogue AP could hide the existence of the legitimate AP by tampering with the user–server traceroute results, it could not prevent the server from discovering the rogue AP from the server–user traceroute results.
It can be seen that this category of detection methods utilizes the characteristics of the traffic in the wired network to perform rogue AP detection and has the advantages of easy data collection, high detection efficiency, and independence of the signal range of rogue APs. However, these methods apply only to rogue APs that connect to the Internet through a wired network controlled by the administrator and cannot work if rogue APs use 4G LTE networks to connect to the Internet.
Wireless-side detection
Wireless-side detection is another category of rogue AP detection technologies from the perspective of network administrators. It aims to capture wireless signals and extract some features to achieve rogue AP detection.
Jana and Kasera 13 used the clock skew as AP’s fingerprint to detect rogue APs, and the clock skew was estimated using the time synchronization function (TSF) timestamps in the 802.11 beacon/probe response frames. Lanze et al. 14 considered the effect of temperature on clock skew and proposed a corresponding detection method. Jang et al. 15 developed a rogue AP detection mechanism that used the feature of channel interference. Guo and Chiueh 16 proposed a detection algorithm that leveraged the sequence number field in the link-layer header of IEEE 802.11 frame. In addition to these features, the RSS is the most commonly used location-dependent feature and has also attracted the attention of researchers. 26
Sheng et al. 17 discovered that the RSS values followed a mixture of multiple Gaussian distributions due to antenna diversity and then proposed an approach based on Gaussian mixture models, building RSS profiles for rogue AP detection. Chen et al. 18 used the spatial correlation of RSS and detected rogue APs by performing clustering analysis in RSS. Furthermore, Yang et al. 19 considered that multiple rogue APs were existing in the network and proposed the corresponding solutions. Alotaibi and Elleithy 20 also utilized RSS and proposed an approach based on the random forests ensemble method to detect rogue APs. Zhou et al. 21 proposed a crowdsensing-based approach to detect rogue APs without specialized hardware requirement. The authors designed a grid-based profiling method to build RSS profile with crowdsensing collections and presented a matching algorithm to detect abnormal samples based on the majority voting. Qu et al. 22 considered a rogue AP that was set up in moving vehicles and proposed a detection algorithm based on RSS. The algorithm used RSS to estimate the distance between the rogue AP and the detector and compared it to the distance calculated by the fake GPS location of the rogue AP.
As discussed above, the spatial correlation of RSS can be used to effectively detect a rogue AP that is at a different location from the legitimate AP, and several RSS-based approaches have been proposed. However, few of them consider the missing RSS values, which significantly affect the effectiveness of rogue AP detection. Hence, in this article, we propose a novel RSS-based approach for PRAPD, reducing the effect of missing RSS values on detection performance as much as possible.
Rogue AP attack
To demonstrate the efficacy of the proposed approach for rogue AP detection, we present the description of the rogue AP attack and related assumptions as follows.
The adversary can use an off-the-shelf device to impersonate a legitimate AP, such as an OpenWrt wireless router or a Linux laptop running the hostapd software. In this article, we assume that the rogue AP attack can be implemented by an adversary with the capability to mimic the configurations of the legitimate WLAN. Specifically, the SSID, BSSID (an identifier that uniquely identifies an access point, and corresponds to the MAC address of the access point), and other configurations of a rogue AP should be exactly the same as the corresponding legitimate AP.
We assume that the adversary sets up a rogue AP after finding a legitimate AP as the target, and the legitimate AP and the rogue AP coexist in the WLAN during the process of the attack, as shown in Figure 1. The rogue AP can increase its signal strength to lure users to connect. We also assume that the rogue AP is deployed in a different location from the legitimate AP because an unknown device placed near the legitimate AP will easily attract administrators’ attention. In addition, because this article addresses wireless-side detection, it does not matter how the rogue AP connects to the Internet. The rogue AP can relay through legitimate APs or use 4G LTE networks.

Figure 1. Attack model of rogue APs.
Design of PRAPD
To detect a rogue AP coexisting with the mimicked legitimate AP, we design the approach PRAPD, which senses and identifies the changes in RSS information when a rogue AP is set up. In this section, we give an overview of our approach and present the details of data preprocessing and clustering analysis.
Overview
In PRAPD, an RSS value is the signal strength of an AP’s beacon frame captured by a sniffer. The RSS value is closely related to the AP’s physical location and is determined by the distance to the sniffer. We deploy n sniffers that, respectively, capture APs’ beacon frames and collect the corresponding RSS values, as shown in Figure 2. Since the locations of these sniffers are distinctive in physical space, the RSS values for a beacon frame are also usually different.

Figure 2. RSS collection by n sniffers.
Then, the RSS values are aggregated and processed by a central server. It can be seen from the structure of the beacon frame in Figure 3 that there is a timestamp field. The sending timestamp will be inserted into this field when the frame is ready to send, and we can use the timestamp to identify a beacon frame, and aggregate its RSS values collected by multiple sniffers. For a beacon frame, the aggregated RSS vector is denoted as

Figure 3. 802.11 beacon frame. 27
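The aggregation step described above can be illustrated with a minimal sketch. It assumes each sniffer reports tuples of (sniffer index, BSSID, beacon timestamp, RSS in dBm); the function name `aggregate_rss`, the tuple layout, and the use of `None` for a missing value are illustrative choices, not details given in the article.

```python
from collections import defaultdict

def aggregate_rss(reports, n_sniffers):
    """Group per-sniffer RSS reports into one RSS vector per beacon frame.

    reports: iterable of (sniffer_index, bssid, timestamp, rss_dbm) tuples
             (an assumed report format).
    Returns {bssid: {timestamp: [rss or None, ...]}}, where None marks a
    sniffer that did not report the frame, i.e. a missing value.
    """
    vectors = defaultdict(dict)
    for sniffer, bssid, timestamp, rss in reports:
        # The beacon timestamp identifies the frame across sniffers.
        vec = vectors[bssid].setdefault(timestamp, [None] * n_sniffers)
        vec[sniffer] = rss
    return {bssid: dict(frames) for bssid, frames in vectors.items()}

# Hypothetical reports from three sniffers for two beacon frames.
reports = [
    (0, "aa:bb:cc:dd:ee:ff", 1000, -45),
    (1, "aa:bb:cc:dd:ee:ff", 1000, -60),
    (0, "aa:bb:cc:dd:ee:ff", 2000, -46),
    (2, "aa:bb:cc:dd:ee:ff", 2000, -71),
]
vectors = aggregate_rss(reports, n_sniffers=3)
```

In this example each frame was heard by only two of the three sniffers, so both resulting vectors contain one missing component.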
For each BSSID, the RSS vectors will be close to each other in signal space when no rogue AP exists because the corresponding frames are sent from the same location. However, when there is a rogue AP using this BSSID, there are two APs at different physical locations claiming the BSSID. As a result, the RSS vectors from the legitimate AP will be mixed with those from the rogue AP, and the RSS vectors from the two different physical locations should form two clusters in signal space. Hence, we can conduct clustering analysis on the RSS vectors from each BSSID in order to detect rogue APs.
As shown in Figure 4, our approach consists of two phases. During the offline profiling phase, we collect RSS vectors multiple times for each legitimate AP and obtain the distance between two centroids in signal space after clustering analysis. Then, we use the distribution of the distance information to determine the threshold

Figure 4. Overview of the proposed approach PRAPD.
Data preprocessing
In the practical environment, there are many missing values in RSS vectors, as shown in Table 1. This is because the wireless transmission range is limited and wireless links are unreliable. Specifically, a sniffer cannot receive an AP’s frame if it is out of the transmission range of the AP, and a sniffer within range may still lose a frame due to unreliable wireless links. In this article, we deal with these missing values by data filling, filtering, and averaging.
Table 1. Fragment of RSS vectors from a legitimate AP (dBm).
AP: access point; RSS: received signal strength.
We first fill the RSS vectors with a constant (–100) that is smaller than any of the measured values to eliminate these missing values. 28
The data filling can well address the issue of missing values caused by AP’s limited transmission range, such as the missing values of the component
RSS vectors after data filling (dBm).
RSS: received signal strength.
In order to solve this issue, we perform data filtering and data averaging in sequence:
Data filtering. For each component, we attribute the missing values to frame loss if only a small number of values are missing in that component, that is, if the proportion of vectors with a missing value in the component is less than
Data averaging. After data filtering, we average the RSS values in each component and replace the constant value (–100) with the average.
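The three preprocessing steps can be sketched as follows. Because the filtering threshold and its exact use are not fully recoverable from the text above, this sketch makes an explicit assumption: a component whose proportion of missing values is below a hypothetical parameter `loss_ratio` is treated as affected by frame loss, and its filled values are replaced by the average of the measured values; every other component keeps the fill constant. The function name and the default value of `loss_ratio` are illustrative only.

```python
FILL = -100  # fill constant from the text; smaller than any measured RSS

def preprocess(vectors, loss_ratio=0.2):
    """Fill, filter, and average missing RSS values (assumption-based sketch).

    vectors: list of equal-length lists, with None for a missing value.
    loss_ratio: hypothetical threshold on the per-component proportion of
                missing values below which they are attributed to frame loss.
    """
    # Data filling: replace every missing value with the constant.
    filled = [[FILL if v is None else v for v in vec] for vec in vectors]
    total = len(filled)
    for i in range(len(filled[0])):
        missing = sum(vec[i] == FILL for vec in filled)
        # Data filtering (assumed rule): few missing values => frame loss.
        if 0 < missing / total < loss_ratio:
            # Data averaging: replace the constant with the component mean.
            measured = [vec[i] for vec in filled if vec[i] != FILL]
            mean = round(sum(measured) / len(measured))
            for vec in filled:
                if vec[i] == FILL:
                    vec[i] = mean
    return filled

# Sniffer 0 loses one frame, sniffer 1 is out of range, sniffer 2 hears all.
raw = [[-50, None, -70] for _ in range(9)] + [[None, None, -70]]
cleaned = preprocess(raw)
```

Under these assumptions the single lost frame at sniffer 0 is repaired with the component average, while the out-of-range sniffer 1 keeps the fill constant in every vector.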
Clustering analysis
In this section, we use the k-medoid algorithm
Because we only consider the RSS vectors from one or two APs (offline profiling or online detection) in clustering analysis, there exist a large number of RSS vectors similar to each other. Since the measured RSS value is a discrete integer, we find that multiple RSS vectors will correspond to a point in the signal space. In order to improve the efficiency of clustering analysis, we can scan the RSS vectors, obtain a set of RSS vectors that differ from each other, and count the number of occurrences for each RSS vector. The set of RSS vectors is denoted by
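The scan for distinct RSS vectors and their occurrence counts, described in the paragraph above, might look like the following sketch. The function name `distinct_with_counts` is an illustrative choice.

```python
from collections import Counter

def distinct_with_counts(vectors):
    """Return the distinct RSS vectors and how often each one occurs."""
    counts = Counter(tuple(vec) for vec in vectors)
    points = list(counts)                  # distinct vectors, first-seen order
    weights = [counts[p] for p in points]  # number of occurrences of each
    return points, weights

points, weights = distinct_with_counts([[-50, -70], [-50, -70], [-51, -70]])
```

Clustering can then operate on `points` with `weights` as multiplicities, which avoids recomputing distances for identical vectors.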
It is of great importance for clustering analysis to define the distance between RSS vectors. The traditional Euclidean distance considers all the components of the RSS vectors and can be easily exaggerated due to the missing values. Therefore, we define a new distance function that dynamically considers partial components of the RSS vectors, that is
where B
As described in Algorithm 1, we first initialize two medoids
where
After clustering analysis, we can calculate the distance between the two medoids. In the offline profiling phase, for each BSSID, we perform clustering analysis multiple times on different groups of RSS vectors that are collected from the network scenario with no rogue AP and obtain the distribution of medoid distance. We can use the distribution information to determine the distance threshold
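The clustering step can be illustrated with a sketch under stated assumptions. The exact distance formula and Algorithm 1 are not reproduced in the text above, so two choices here are assumptions rather than the authors' definitions: `partial_distance` takes the Euclidean distance over the `b` components in which the two vectors differ least, which is one plausible reading of "dynamically considers partial components", and `two_medoids` is a generic weighted alternating k-medoid loop with k = 2 whose initialization (heaviest point, then the point farthest from it) is chosen only for illustration.

```python
import math

def partial_distance(x, y, b):
    """Euclidean distance over the b components where x and y differ least
    (an assumed reading of the partial-component distance)."""
    diffs = sorted(abs(p - q) for p, q in zip(x, y))
    return math.sqrt(sum(d * d for d in diffs[:b]))

def two_medoids(points, weights, b, max_iter=100):
    """Weighted k-medoid clustering with k = 2 (generic sketch)."""
    def dist(i, j):
        return partial_distance(points[i], points[j], b)

    idx = range(len(points))
    first = max(idx, key=lambda i: weights[i])      # assumed initialization
    second = max(idx, key=lambda i: dist(first, i))
    medoids = [first, second]
    for _ in range(max_iter):
        # Assign every distinct vector to its nearest medoid.
        clusters = [[], []]
        for i in idx:
            nearest = 0 if dist(i, medoids[0]) <= dist(i, medoids[1]) else 1
            clusters[nearest].append(i)
        # Pick the member minimizing the weighted distance to its cluster.
        updated = []
        for k, members in enumerate(clusters):
            if not members:
                updated.append(medoids[k])
                continue
            def cost(m, members=members):
                return sum(weights[i] * dist(m, i) for i in members)
            updated.append(min(members, key=cost))
        if updated == medoids:
            break
        medoids = updated
    return points[medoids[0]], points[medoids[1]]

# Hypothetical distinct vectors from two locations, with occurrence counts.
points = [(-50, -70, -100), (-51, -70, -100), (-80, -40, -60), (-81, -41, -60)]
weights = [5, 3, 4, 2]
m1, m2 = two_medoids(points, weights, b=2)
medoid_distance = partial_distance(m1, m2, b=2)
```

The resulting `medoid_distance` is the quantity that is compared against the threshold; a small value suggests a single transmitter location, a large value suggests two.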
Evaluation
We conduct experiments to evaluate the performance of our approach PRAPD. In this section, we describe our evaluation methodology and results, including experimental setup, performance metric, impacts of medoid distance threshold, varying number of components used, and varying locations of the rogue AP.
Experimental setup
First, we set up a 15 m × 7 m environment at Computer Building of Jiulonghu Campus of Southeast University, as shown in Figure 5.

Figure 5. Experimental environment.
In our experiments, the implementation of the proposed approach PRAPD is described as follows:
Legitimate AP and rogue AP. Two NETGEAR WNDR3800 routers running OpenWrt are used as the legitimate AP and the rogue AP, respectively. We modify the BSSID of the rogue AP to be the same as that of the legitimate AP.
Sniffers. We deploy six sniffers, that is, Raspberry Pi equipped with wireless adapter Ralink RT5370. We configure the wireless interface to be monitor mode and use the libpcap library to capture 802.11 beacon frames. Sniffer deployment is presented in Figure 5.
Central server. We use an HP PC as a central server that aggregates the collected RSS values and performs clustering analysis to detect rogue APs.
We collect 300 groups of RSS vectors from the scenarios with no rogue AP, and each group contains 1000 RSS vectors. For each rogue AP location, we collect 150 groups of RSS vectors from the scenarios with the rogue AP.
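As an illustration of what a sniffer extracts from a captured beacon, the sketch below reads the BSSID and the timestamp field from a raw 802.11 frame. It relies on the standard layout (a 24-byte management header with the BSSID in the third address field, followed by the 8-byte little-endian timestamp) and assumes that any capture header, such as radiotap, has already been stripped; reading the RSS from that capture header is not shown. The function name `parse_beacon` and the sample frame are illustrative.

```python
import struct

def parse_beacon(frame):
    """Return (bssid, timestamp) for a beacon frame, or None otherwise.

    frame: raw 802.11 frame bytes, capture header already removed
           (an assumption of this sketch).
    """
    # First frame-control byte 0x80: version 0, type management, subtype beacon.
    if len(frame) < 32 or frame[0] != 0x80:
        return None
    bssid = ":".join(f"{b:02x}" for b in frame[16:22])   # address 3
    (timestamp,) = struct.unpack_from("<Q", frame, 24)   # timestamp field
    return bssid, timestamp

# Hand-built sample beacon: header (24 bytes) + timestamp + interval + capability.
sample = (
    bytes([0x80, 0x00]) + b"\x00\x00"        # frame control, duration
    + b"\xff" * 6                             # destination (broadcast)
    + bytes.fromhex("aabbccddeeff") * 2       # source address, BSSID
    + b"\x00\x00"                             # sequence control
    + struct.pack("<Q", 123456)               # timestamp
    + b"\x64\x00\x01\x00"                     # beacon interval, capability
)
parsed = parse_beacon(sample)
not_beacon = parse_beacon(b"\x40" + sample[1:])  # probe request subtype
```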
Performance metric
In this work, we use the following two metrics to evaluate the performance of our proposed approach:
Detection rate. For the RSS samples (we use a group of RSS vectors as a sample) collected from the scenarios with a rogue AP, the detection rate is the proportion of the samples through which our approach detects the existence of the rogue AP correctly.
False alarm rate. For the RSS samples collected from the scenarios with no rogue AP, the false alarm rate is the proportion of the samples through which our approach incorrectly determines the existence of a rogue AP.
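The two metrics can be computed directly from per-sample decisions, as in this small sketch. Each decision is a Boolean that is true when the approach reports a rogue AP for that sample; the function names and the sample decisions are illustrative.

```python
def detection_rate(decisions_with_rogue):
    """Fraction of rogue-AP samples for which a rogue AP was reported."""
    return sum(decisions_with_rogue) / len(decisions_with_rogue)

def false_alarm_rate(decisions_without_rogue):
    """Fraction of rogue-free samples for which a rogue AP was reported."""
    return sum(decisions_without_rogue) / len(decisions_without_rogue)

dr = detection_rate([True, True, False, True])
far = false_alarm_rate([False, False, True, False])
```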
Impacts of medoid distance threshold
Here, we evaluate how threshold estimation affects the performance of the proposed approach. We set the number of components used
We use 150 RSS samples (with no rogue AP) to estimate the medoid distance threshold. According to the distribution of the medoid distance, we obtain a quantile as the threshold that corresponds to the probability

Figure 6. Impacts of medoid distance threshold.
It can be seen from Figure 6 that a trade-off between the detection rate and the false alarm rate can be found by adjusting the medoid distance threshold.
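Choosing the threshold as a quantile of the rogue-free medoid distances, and then applying it, can be sketched as follows. The quantile rule used here (the smallest observed distance that covers at least a fraction `p` of the samples) is one common definition and is an assumption, as are the function names and the sample distances.

```python
import math

def quantile_threshold(distances, p):
    """Smallest observed distance d such that at least a fraction p of the
    rogue-free medoid distances are <= d (assumed quantile definition)."""
    ordered = sorted(distances)
    k = max(1, math.ceil(p * len(ordered)))
    return ordered[k - 1]

def is_rogue_present(medoid_distance, threshold):
    """Report a rogue AP when the two medoids are farther apart than the threshold."""
    return medoid_distance > threshold

# Hypothetical medoid distances from four rogue-free samples.
threshold = quantile_threshold([0.5, 1.0, 1.5, 4.0], p=0.75)
alarm = is_rogue_present(42.4, threshold)
quiet = is_rogue_present(1.0, threshold)
```

A larger `p` raises the threshold, which lowers the false alarm rate at the cost of the detection rate; this is the trade-off the figure illustrates.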
Varying number of components used
Next, we evaluate the performance of the proposed approach when varying the number of components used
Figure 7 shows the results. From this figure, we can see that the false alarm rate is high when all the components are used

Figure 7. Varying number of components used B.
Varying locations of the rogue AP
Finally, we evaluate the performance of the proposed approach for varying locations of the rogue AP. We set the number of components used B to be 5 and set the medoid distance threshold
Figure 8 shows the results. We can see from this figure that our proposed approach is effective when the rogue AP is deployed in any of these three locations. Moreover, the results show that the detection rate decreases slowly as the rogue AP moves closer to the legitimate AP and stays above 0.80 even when the rogue AP is only 4.7 m away from the legitimate AP.

Figure 8. Varying locations of the rogue AP.
Conclusion
We investigated RSS-based rogue AP detection in practical environments, taking into account the missing RSS values in the collected RSS vectors. First, we presented a data preprocessing scheme to eliminate the missing values in the collected RSS vectors by means of data filling, filtering, and averaging. Then, we utilized the k-medoid algorithm to perform clustering analysis on the RSS vectors, where we designed a distance metric that dynamically used partial components in RSS vectors to minimize the distance deviation caused by the missing values. Finally, we conducted experiments in the practical environment and evaluated the performance of the proposed approach PRAPD. The results demonstrated that PRAPD can effectively reduce the false alarm rate while ensuring a high detection rate.
In the future, we plan to extend the proposed approach for large-scale scenarios. Specifically, we will investigate how to properly deploy sniffers and explore how to deal with high-dimensional RSS data.
Footnotes
Handling Editor: Xinwen Fu
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by National Key R&D Program of China (Grant No. 2017YFB1003000), National Natural Science Foundation of China (Grant Nos 61632008, 61572130, 61502100, 61602111, 61532013, and 61320106007), Jiangsu Provincial Natural Science Foundation of China (Grant Nos BK20150637 and BK20150628), Jiangsu Provincial Scientific and Technological Achievements Transfer Fund (Grant No. BA2016052), Jiangsu Provincial Key Laboratory of Network and Information Security (Grant No. BM2003201), and Key Laboratory of Computer Network and Information Integration of Ministry of Education of China (Grant No. 93K-9).
