Abstract
Anomaly detection in wireless sensor networks (WSNs) is critical to ensure the quality of senor data, secure monitoring, and reliable detection of interesting and critical events. The main challenge of anomaly detection algorithm in WSNs is identifying anomalies with high accuracy while consuming minimal resource of the network. In this paper two lightweight anomaly detection algorithms LADS and LADQA are proposed for WSNs. Both algorithms utilize the one-class quarter-sphere support vector machine (QSSVM) and convert the linear optimization problem of QSSVM to a sort problem for the reduced computational complexity. Experimental results show that the proposed algorithms can keep the lower computational complexity without reducing the accuracy for anomaly detection, compared to QSSVM.
1. Introduction
Wireless sensor networks (WSNs) have been widely used in various applications including civil and military domains [1]. However, the harsh deployment environment and the constrained capabilities of sensors (energy, CPU, memory, etc.) make WSNs more vulnerable to different types of misbehaviors or anomalies. In WSNs, an anomaly or outlier (these terms are used interchangeably in this paper) is defined as the measurement that significantly deviates from the normal pattern of the sensed data [2]. Sensor data are affected by these anomalies that always correspond to node software or hardware failure, reading errors, unusual events, and malicious attacks. Therefore, it is critical to efficiently and accurately identify anomalies in the sensor data to ensure data quality, secure monitoring, and reliable detection of interesting and critical events.
The context of sensor networks and nature of sensor data make design of an appropriate anomaly detection technique challenging [3]. The constrained environment of a WSN impacts on anomaly detection algorithms. Node constraints on computational power and memory mean that algorithms for anomaly detection should have low computational complexity and occupy little memory space. Moreover, prelabelled data are expensive or difficult to obtain in WSNs. Anomaly detection for WSNs should be able to operate on unlabelled data. In general, the key challenge of anomaly detection algorithm in WSNs is identifying anomalies with high accuracy while consuming minimal resource of the network.
Recently, support vector machines (SVM) in the form of the one-class quarter-sphere (QSSVM) have been used for anomaly detection in WSNs due to their reduced computational complexity and ability to operate on unlabelled data [4, 5]. QSSVM can convert the quadratic optimization problem of one-class SVM to a linear optimization problem. A family of algorithms based on QSSVM are proposed for anomaly detection in WSNs and have shown the potential for anomaly detection. However, the main disadvantage of those algorithms derived from QSSVM is the high computational cost for the solution of a linear programme.
In this paper, we convert the linear optimization problem of QSSVM to a sort problem and propose two lightweight anomaly detection algorithms for WSNs to identify anomalies. Simulations show that our proposed algorithms are able to reduce the computational complexity without reducing the accuracy for anomaly detection.
Our paper makes the following contributions:
We present a mathematic method to prove that the linear optimization problem of QSSVM can be converted to a sort problem. Based on the presented method, we propose two lightweight anomaly detection algorithms for WSNs, namely, lightweight anomaly detection algorithm using sort (LADS) and lightweight anomaly detection algorithm using quick select (LADQS). It is shown that our algorithms are equivalent to QSSVM but have lower computational complexity. The experimental evaluation of the effectiveness and efficiency of the proposed algorithms on real world WSN dataset is presented.
The remainder of this paper is structured as follows. Related works on QSSVM-based anomaly detection models are presented in Section 2. In Section 3, first the principle of QSSVM is described, and then a method to convert the linear optimization problem of QSSVM to a sort problem is discussed. Furthermore, our proposed lightweight anomaly detection algorithms are also explained. Experimental results and performance evaluation are reported in Section 4. Section 5 concludes the paper and suggests some directions for future research.
2. Related Work
Anomaly detection typically makes use of data mining and machine learning techniques to detect abnormal activities in the systems [2]. Anomaly detection techniques for WSNs can be categorized into statistical-based, nearest neighbor-based, clustering-based, classification-based, and spectral decomposition-based approaches [6]. Classification models are important models of data mining and machine learning community in which a classification model is learned using the known training data and used after that to classify the unseen testing data into different types of classes. SVM-based techniques are one of the popular classification-based approaches and have been widely used to detect anomalies due to the advantages of no requirement of an explicit statistical model and prevention from the curse of dimensionality.
In WSNs, prelabelled data are expensive or difficult to obtain. Several one-class SVM-based anomaly detection techniques have been proposed to process the unlabeled data. Their main idea is to use a nonlinear function to map the data vectors collected from the original input space to a higher dimensional space called feature space. Then a decision boundary of normal data is found, which encompasses the majority of data vectors in the feature space. Those data vectors falling outside the normal boundary are classified as anomalous. To this end, Schölkopf et al. have presented a hyperplane-based one-class SVM by fitting a hyperplane from the origin. Those data vectors near the origin are considered as anomalous [7]. Tax and Duin have proposed a hypersphere-based one-class SVM by fitting a hypersphere with a minimum radius [8]. Those data vectors falling outside the hypersphere are considered as anomalous. However, these one-class SVM-based techniques still require solving a quadratic optimization, and that is extremely costly.
In order to reduce expensive computational complexity of the quadratic optimization, Campbell and Bennett have formulated a linear programming approach for the hyperplane-based SVM proposed in [9], which is based on attracting the hyperplane towards the average of the distribution of mapped data vectors. Laskov et al. have extended work in Tax and Duin by proposing a quarter-sphere one-class SVM, which converts the quadratic optimization problem to a linear optimization problem by fitting a hypersphere centered at the origin and consequently reduces computational complexity of learning the normal boundary of data vectors [5, 8]. Based on QSSVM, several distributed outlier detection techniques for WSNs are proposed by Rajasegarar et al. and Zhang et al. [10, 11]. After that, Rajasegarar et al. have further extended work in [12] by proposing a hyperellipsoidal one-class SVM using a linear optimization. However, the solution of a linear optimization, rather than a quadratic, still requires expensive computational complexity due to the fact that the solution of the linear programme is
In this paper, we propose two lightweight anomaly detection techniques which convert the linear optimization problem of QSSVM to a sort problem and consequently reduce computational complexity of learning the normal boundary of data vectors.
3. Lightweight Anomaly Detection Algorithm for WSNs
In this section, we first introduce the principles of modeling the one-class quarter-sphere support vector machine (QSSVM) proposed in [5]. After that, we use a mathematical method to prove that the linear optimization problem of QSSVM can be converted to a sort problem and further propose two lightweight anomaly detection algorithms for WSNs.
3.1. Principles of the One-Class Quarter-Sphere SVM
In this section we discuss the principles of QSSVM proposed by Laskov et al. in [5]. They have converted the quadratic optimization problem of the one-class SVM to a linear optimization problem by fixing the center of the quarter-sphere at the origin. The geometry of hypersphere SVM is shown in Figure 1.

Geometry of QSSVM formulation in the feature space.
Assume that n data vectors
Obtaining the dual form of the optimization problem allows its formulation in terms of dot products of the data vectors in the training set. Using the kernel trick, the dot products
It can be seen from (2) that optimization problem is stated in terms of a dot product of an image vector with itself; this causes an issue with distance-based kernels, such as the RBF kernel, as the diagonal term
There is no explicit vector in feature space that represents the mean; however the dot product
After solving (2) for
3.2. Lightweight Anomaly Detection Algorithm Using Sort
In the previous section, we introduced the principle of QSSVM which convert the quadratic optimization problem to a linear optimization problem. We are aware that the learning process of QSSVM is to find the minimal radius R of one-class quarter-sphere for the training set, which can be obtained by the distances between border support vectors
In order to find the minimal radius R, the sequence
Actually, there exists an implied constraint that
Now we prove that the minimal radius R can be computed directly from the descending sequence
Theorem 1.
If the values of
Proof.
Since
Firstly, we prove that
Subtract
From the contradictions, we have
In a similar way, we can prove that
As proved above, it is evident that
According to Theorem 1, we can infer Theorems 2 and 3 as follows.
Theorem 2.
If the values of
Proof.
Since one of the constraint conditions is
Recalling Theorem 1, we have
Theorem 3.
Given that the values of if
otherwise,
Proof.
In the case when
Theorem 4.
The minimal radius R of the quarter-sphere in QSSVM can be obtained by
Proof.
As we know, the constrained optimization problem of QSSVM can be formulated as (2). Note that the terms
From Theorem 4, the minimal radius R of the quarter-sphere in QSSVM can be obtained by a descending sorted sequence using (9). So the linear optimization problem of QSSVM is converted to a sort problem of
After the generation of kernel matrix and the transformation of central kernel matrix, the data vectors
Now, our algorithm can be described in Algorithm 1.
Step 1. Calculate Step 2. Sort Step 3. Calculate Step 4. If Else End If
Algorithm 1
3.3. Lightweight Anomaly Detection Algorithm Using Quick Select
As discussed in the previous section, the minimal radius R of the quarter-sphere in QSSVM can be obtained from the descending sequence
function LADQS() let calculate the distance obtain return R = QuickSelect( // Returns the nth largest element of list. function QuickSelect(list, if return list pivotIndex = l + floor(rand() ∗ ( // randomly select a pivotIndex between l and r pivotIndex = QuickPartition(list, l, r, pivotIndex) if n = pivotIndex return list else if n < pivotIndex return QuickSelect(list, l, pivotIndex − 1, n) else return QuickSelect(list, pivotIndex + 1, r, n)
4. Experimental Results and Evaluation
This section specifies the performance evaluation of our two techniques compared to QSSVM. In our experiments, we have used real data gathered at the Grand-Saint-Bernard [15], which is similar to the one used in [16]. For simulation, we use Matlab to implement our algorithms and QSSVM on a single node in a WSN. For fairness, we use the average of the tests operated on 7 different nodes, respectively, as the experimental results.
4.1. Experimental Datasets
The real data are collected from a closed neighborhood from a WSN deployed in Grand-Saint-Bernard as shown in Figure 2. The closed neighborhood consists of node 2 and its 6 spatially neighboring nodes, namely, nodes 3, 4, 8, 12, 20, and 14. In our simulations, we test the real data collected during the period of 6 am–6 pm on September 20, 2007, with two attributes: ambient temperature and relative humidity for each sensor measurement. The data is preprocessed and normalized to the range

The relative positions of the nodes in Grand-Saint-Bernard.
4.2. Experimental Results and Evaluation
We choose radial basis function (RBF) kernel function to generate kernel matrices. RBF kernel function
We have examined the effect of the regularisation parameter v for our two anomaly detection algorithms and QSSVM. v represents the fraction of anomalies in training set and we have varied it in the range from 0.01 to 0.25 in intervals of 0.03. And we also have examined the training time for the three algorithms.
Figures 3 and 4 show that the detection rate and the false alarm rate obtained for our algorithm LADS use RBF kernel function for real data, respectively. As we discussed in the previous section, LADS, LADQS, and QSSVM have the same principle of data classification but have difference in the way of finding the minimal radius R. This means the LADS behaves in a similar manner to LADQS. Therefore, results of LADQS and QSSVM have been omitted.

Detection rate of LADS in terms of parameter v selection.

False alarm rate of LADS in terms of parameter v selection.
Figure 5 shows the training time elapsing for the three algorithms. We can see that the training time of our algorithms LADS and LADQS is significantly less than that of QSSVM. It indicates that our two algorithms have less time and lower computational complexity, compared to QSSVM.

Training time in terms of parameter v selection.
Simulation results show that our two algorithms LADS and LADQS have the lower computational complexity without reducing the accuracy for anomaly detection, compared to QSSVM.
Computational complexity of our techniques is presented in Table 1, where n denotes the number of data in the training sets, d represents the dimensionality of the measurement, v represents the fraction of anomalies in the training set, and
Computational complexity of our techniques.
5. Conclusion
In this paper we propose two lightweight anomaly detection algorithms for WSNs, LADS and LADQS. Both algorithms are based on QSSVM but convert the linear optimization problem of QSSVM to a sort problem. Simulation results show that our algorithms reduce the computational complexity while achieving the same accuracy for anomaly detection. Our future research includes selecting the optimal parameters for v and implementing our algorithms on multiple sensor nodes in real-life.
Footnotes
Conflict of Interests
The authors declare no conflict of interests.
Acknowledgment
The work is supported in part by National Science and Technology Major Project under Grant no. 2011ZX03005-002.
