Abstract
Supervised machine learning has been widely used in context-aware wireless sensor networks (WSNs) to discover context descriptions from sensor data. However, collecting the large amount of labeled training data needed to guarantee good performance is costly and time-consuming. For this reason, semisupervised learning, which performs well despite using only a small amount of labeled data, has recently been developed. In this paper, we extend the standard support vector regression (SVR) to a semisupervised SVR by employing manifold regularization, which we call Laplacian SVR (LapSVR). The LapSVR is compared with the standard SVR and with the semisupervised least square algorithm, another recently developed semisupervised regression algorithm. The algorithms are evaluated for location awareness of multiple mobile robots in a WSN. The experimental results show that the proposed algorithm yields more accurate location estimates than the other algorithms.
1. Introduction
Location awareness is one of the most important issues for context-aware wireless sensor networks (WSNs) because many applications need to know where the data was obtained. For example, in applications such as target tracking [1–5], network routing [6, 7], human-centric services [8–10], and monitoring systems [11, 12], the measurement data is useless without the location of the sensing device.
Machine learning methods have been adopted to learn location context from raw sensor data. The standard SVR [13, 14], manifold learning [15], and neural network learning [16, 17] can be employed to obtain precise estimates of the location. However, all of these methods require a large amount of labeled training data for high accuracy, and generating a large labeled data set demands significant cost, time, and human skill. To make location-aware algorithms viable with negligible effort, this paper adopts semisupervised learning, which uses only a small number of labeled samples.
Based on the semisupervised learning framework suggested in [18], various algorithms such as [19–21] have been developed by defining different loss functions and regularization functions. In this work, we develop Laplacian support vector regression (LapSVR) which extends the standard SVR to the semisupervised learning framework. This approach uses the standard SVR to define the loss function and kernel-based regularization. However, for utilization of unlabeled data, the manifold regularization based on the concept of the graph Laplacian [22] is performed.
We evaluate the LapSVR by estimating the locations of multiple mobile robots in a WSN. The deployed anchor nodes broadcast RF signals whose received signal strength (RSSI) is measured by the mobile robots. The mobile robots independently localize themselves using the trained regression model and the current RSSI measurements. The LapSVR is compared with the standard SVR and the semisupervised Laplacian least square algorithm [23]. The location awareness performance is compared with respect to the number of labeled data, unlabeled data, and anchor nodes. From the experimental results, we confirm that the LapSVR estimates the location most accurately and is least sensitive to variation in the number of anchor nodes.
This paper is organized as follows. In Section 2, we provide details on Laplacian support vector regression algorithm. Section 3 describes the estimation process of mobile robots using the proposed algorithm. Section 4 reports empirical results with the description of the experimental setting. Finally, concluding remarks are given in Section 5.
2. Laplacian Support Vector Regression
This section describes a semisupervised regression approach named Laplacian support vector regression (LapSVR). We use the standard SVR algorithm to define the loss function and kernel-based regularization. Also, the graph Laplacian is employed for manifold regularization. We describe these ingredients of the formulation in the following.
2.1. Semisupervised Learning Framework
We define a set of l labeled samples \(\{(x_i, y_i)\}_{i=1}^{l}\) and a set of u unlabeled samples \(\{x_j\}_{j=l+1}^{l+u}\). Following the framework of [18], the regression function is estimated by solving
\[
f^{*} = \arg\min_{f \in \mathcal{H}_K} \frac{1}{l}\sum_{i=1}^{l} V\bigl(x_i, y_i, f\bigr) + \gamma_A \lVert f \rVert_K^2 + \gamma_I \mathbf{f}^{\top} L \mathbf{f},
\tag{1}
\]
where \(V\) is a loss function, \(\mathcal{H}_K\) is the reproducing kernel Hilbert space induced by the kernel \(K\), \(\gamma_A\) and \(\gamma_I\) are regularization weights, \(\mathbf{f} = [f(x_1), \ldots, f(x_{l+u})]^{\top}\), and \(L\) is the graph Laplacian described in Section 2.3.
2.2. Decision Function
By the representer theorem [25], the solution of (1) can be written as an expansion of the kernel function over the labeled and the unlabeled data:
\[
f(x) = \sum_{i=1}^{l+u} \alpha_i K(x_i, x) + b.
\tag{2}
\]
Also, the regularization term \(\lVert f \rVert_K^2\) can then be expressed in matrix form as \(\boldsymbol{\alpha}^{\top} K \boldsymbol{\alpha}\), where \(K\) is the \((l+u)\times(l+u)\) Gram matrix with entries \(K_{ij} = K(x_i, x_j)\) and \(\boldsymbol{\alpha} = [\alpha_1, \ldots, \alpha_{l+u}]^{\top}\).
2.3. Manifold Regularization
According to the manifold assumption, the data points are samples from a low-dimensional manifold embedded in a high-dimensional space. Smoothness of the function along this manifold is represented by the graph Laplacian and the kernel function:
\[
\mathbf{f}^{\top} L \mathbf{f} = \frac{1}{2}\sum_{i,j=1}^{l+u} W_{ij}\bigl(f(x_i) - f(x_j)\bigr)^{2},
\]
where \(W\) is the weight (adjacency) matrix of the data graph and \(L = D - W\) with the diagonal degree matrix \(D_{ii} = \sum_{j} W_{ij}\).
Minimizing this term forces the estimated function to vary smoothly between neighboring data points, so the unlabeled samples shape the solution even though their target values are unknown.
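As a concrete illustration, the quadratic form \(\mathbf{f}^{\top} L \mathbf{f}\) equals the pairwise smoothness penalty \(\frac{1}{2}\sum_{i,j} W_{ij}(f(x_i) - f(x_j))^2\), which can be checked numerically. The following is a minimal sketch, assuming Gaussian edge weights with an arbitrary width σ (the paper does not specify its weighting scheme):

```python
import numpy as np

def rbf_weights(X, sigma=1.0):
    """Edge weights W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), no self-edges."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W."""
    return np.diag(W.sum(axis=1)) - W

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))     # six sample points
f = rng.normal(size=6)          # function values at the samples
W = rbf_weights(X)
L = graph_laplacian(W)
lhs = f @ L @ f
rhs = 0.5 * sum(W[i, j] * (f[i] - f[j]) ** 2 for i in range(6) for j in range(6))
assert np.isclose(lhs, rhs)     # f^T L f equals the pairwise smoothness penalty
```

Because each row of L sums to zero, constant functions incur no penalty; only variation across strongly connected (nearby) points is punished.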
2.4. Loss Function for Regression
We employ the ε-insensitive loss function
\[
\lvert y - f(x) \rvert_{\varepsilon} = \max\bigl(0, \lvert y - f(x) \rvert - \varepsilon\bigr),
\]
which ignores residuals smaller than ε and penalizes larger residuals linearly.

A behavior of the ε-insensitive loss function with slack variable ξ [24].
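The ε-insensitive loss is straightforward to implement; the sketch below (with an illustrative tube width ε = 0.1, not a value from the paper) shows how residuals inside the tube incur zero cost:

```python
import numpy as np

def eps_insensitive(y, f, eps=0.1):
    """epsilon-insensitive loss: zero inside the eps-tube, linear outside it."""
    return np.maximum(0.0, np.abs(y - f) - eps)

# residuals inside the tube cost nothing; outside, the cost grows linearly
inside = eps_insensitive(1.0, 1.05)    # |residual| = 0.05 < eps, so loss is 0
outside = eps_insensitive(1.0, 1.30)   # loss = 0.30 - 0.10 = 0.20
```

This tolerance for small errors is what yields the sparse set of support vectors in SVR: samples inside the tube contribute nothing to the solution.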
2.5. Complete Formulation
In this subsection, the complete formulation of the LapSVR is given based on the ingredients described in Sections 2.1–2.4. Substituting (2)–(6) into (1) with the slack variables
After the introduction of four sets of l multipliers β,
In order to convert the primal problem to the dual representation, we take derivatives
Using the above equations, we can rewrite the Lagrangian as a function of α, β, and
Taking derivatives with respect to α, we obtain
2.6. Computation of b
Now, the remaining work to establish regression function (2) is computation of the bias term b. As described in [24], b is calculated by using Karush-Kuhn-Tucker (KKT) conditions which state that the product between dual variables and constraints has to be zero at the point of the solution. From (8), we obtain
Equations (14) lead to three conclusions. First, only the samples
3. RSSI-Based Location Awareness
In RSSI-based location awareness, mobile nodes of unknown locations collect the RSSI measurements emitted by n anchor nodes. We assume that the anchor nodes are stationary with known location in
Here, we introduce notations of the labeled and the unlabeled training data sets. Because each robot estimates its own location independent of the others, we will describe the algorithm of one mobile node. In the training phase, we put the mobile node at the specific location
In the test phase, with the trained semisupervised SVR model, the mobile node determines its own location
We summarize the details of the semisupervised SVR for RSSI-based location awareness in Algorithms 1 and 2.
Algorithm 1 (training phase).
Consider the following.
Step 1.
Collect the training data set
Step 2.
Obtain the kernel matrix K in (2).
Step 3.
Build the weight matrix W and the graph Laplacian L = D − W.
Step 4.
Choose the regularization weights
Step 5.
Solve the QP problem twice in order to obtain
Step 6.
Compute
Algorithm 2 (test phase).
Consider the following.
Step 1.
At time t, the mobile node collects the RSSI measurements
Step 2.
Obtain the regression estimation with
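As a sketch of the test phase, the trained model can be applied per output coordinate via the kernel-expansion regression function with bias term b. The coefficient values, the RBF kernel width, and the helper names below are illustrative assumptions, not the paper's trained model:

```python
import numpy as np

def rbf_vec(X_train, x, sigma=1.0):
    """Kernel evaluations K(x_i, x) between training RSSI vectors and a query."""
    return np.exp(-((X_train - x) ** 2).sum(axis=1) / (2 * sigma ** 2))

def predict_location(alpha_x, alpha_y, b_x, b_y, X_train, rssi):
    """Per-coordinate regression estimate: f(x) = sum_i alpha_i K(x_i, x) + b."""
    k = rbf_vec(X_train, rssi)
    return np.array([k @ alpha_x + b_x, k @ alpha_y + b_y])

# toy check: with zero coefficients the estimate falls back to the bias terms
X_train = np.zeros((5, 3))                # 5 training RSSI vectors from 3 anchors
loc = predict_location(np.zeros(5), np.zeros(5), 1.5, 2.0, X_train, np.ones(3))
```

At each time step the mobile node only evaluates kernel functions against the stored training inputs, so the test phase is cheap enough to run onboard.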
4. Experimental Results
In the experimental setup, a total of thirteen anchor nodes using IEEE 802.15.4 and TinyOS are deployed. The nodes use the commercial CC2420 radio chip, which operates in the 2.4 GHz ISM band with a data rate of 250 kbps and provides a received signal strength indicator (RSSI) for each received packet. Two mobile robots, each equipped with a receiver node, estimate their own positions using the RSSI measurements obtained from the anchor nodes distributed in the 3 m × 4 m workspace shown in Figure 2. For accuracy analysis, the true locations of the mobile robots are measured by the Vicon motion capture system [26], composed of eight cameras that track reflective markers attached to the mobile robots.

Experimental setting with two mobile robots and thirteen anchor nodes in the 3 m × 4 m area.
To obtain the labeled training data, we repeatedly collect RSSI data sets with the mobile robot placed at different known locations. The unlabeled data sets, on the other hand, are obtained by letting the mobile robots move automatically while collecting RSSI measurements. Figure 3 illustrates the obtained data set. Note that the true locations of the unlabeled data are not included in the unlabeled data set, although Figure 3 shows them for illustration.

Distribution of the thirty labeled and eighty unlabeled data and thirteen anchor nodes.
4.1. Characteristics of RSSI
Radio signals are easily influenced by environmental noise, and the measured signal strength directly affects location awareness performance. Here, we examine the characteristics of the RSSI measurements.
We record the RSSI measurements of one moving receiver, emitted by the thirteen fixed anchor nodes. The distance between the anchor nodes and the receiver varies from 0.2 m to 3 m. Figure 4 shows a boxplot of the obtained measurements: the central red bar is the median, the edges of the box are the 25th and 75th percentiles, and the upper and lower bars outside the box represent extreme data points. The results show that the RSSI measurements are highly nonlinear and noisy, which supports the need for learning to obtain accurate estimates. Still, we observe that similar RSSI values correspond to similar distances, which suggests that RSSI values meaningfully represent the underlying phenomenon in a kernel space (i.e., the RSSI measurements obtained from our experimental setup satisfy the manifold assumption described in Section 2.3). Hence, the RSSI is suitable for the LapSVR. We note that the LapSVR can be used with other sensors as long as they satisfy the manifold assumption.

Relationship between RSSI measurements and distance between fixed thirteen anchor nodes (transmitter) and one mobile node (receiver).
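For intuition, the nonlinear decay of signal strength with distance is often described by the log-distance path-loss model. The snippet below simulates it with assumed parameters (reference power, path-loss exponent, noise level) that are not values measured in this paper:

```python
import numpy as np

# Log-distance path-loss model (assumed parameters, not measured values):
# RSSI(d) = P0 - 10 * n * log10(d / d0) + Gaussian noise
rng = np.random.default_rng(1)
d = np.linspace(0.2, 3.0, 50)          # distances in meters, as in Figure 4
P0, n_exp, d0 = -45.0, 2.5, 1.0        # reference power (dBm), exponent, ref distance
rssi = P0 - 10 * n_exp * np.log10(d / d0) + rng.normal(0.0, 3.0, d.size)
# the trend decays logarithmically with distance and is corrupted by noise,
# which is why a learned kernel mapping outperforms a simple inverse model
```

Even this idealized model shows why raw RSSI cannot be inverted directly for range: measurement noise of a few dB translates into large distance errors at longer ranges.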
4.2. Model and Parameter Setting
We use the radial basis function (RBF) as the kernel in (2):
\[
K(x_i, x_j) = \exp\!\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right).
\]
On the other hand, the regularization parameters

Training error over the validation set for the LapSVR as a function of regularization parameters
4.3. Performance
The proposed algorithm is compared with two other algorithms. First, the standard SVR is conventionally employed for the location awareness problem, as in [14, 27]. Second, Laplacian least square (LapLS) is a recently developed semisupervised regression algorithm, and its application to location estimation can be found in [23]. LapLS combines the graph Laplacian with a least-squares loss and imposes no constraints on the optimization; therefore, the optimal solution can be obtained in closed form without an iterative numerical process.
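As a sketch of such a closed-form fit, a LapRLS-style estimate of the expansion coefficients can be computed with a single linear solve. The kernel width, regularization weights, and toy data below are assumptions for illustration, not the settings used in [23]:

```python
import numpy as np

def rbf(A, B, sigma=1.0):
    """RBF Gram matrix between row-wise sample sets A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def laplacian_ls(X, y_lab, l, gamma_A=1e-2, gamma_I=1e-2, sigma=1.0):
    """Closed-form Laplacian least squares fit (labeled samples come first in X)."""
    n = X.shape[0]
    K = rbf(X, X, sigma)                            # kernel (Gram) matrix
    W = rbf(X, X, sigma)
    np.fill_diagonal(W, 0.0)                        # graph weights, no self-edges
    L = np.diag(W.sum(axis=1)) - W                  # graph Laplacian L = D - W
    J = np.zeros((n, n))
    J[:l, :l] = np.eye(l)                           # selects the labeled rows
    y = np.zeros(n)
    y[:l] = y_lab
    A = J @ K + gamma_A * l * np.eye(n) + (gamma_I * l / n ** 2) * (L @ K)
    alpha = np.linalg.solve(A, J @ y)               # one linear solve, no iterations
    return alpha, K

# toy 1-D example: 4 labeled points from y = 2x plus 4 unlabeled points
X = np.linspace(0.0, 1.0, 8)[:, None]
alpha, K = laplacian_ls(X, 2.0 * X[:4, 0], l=4)
pred = K @ alpha                                    # predictions at all 8 inputs
```

The absence of the ε-tube constraints is exactly what permits this direct solve; the LapSVR, by contrast, must solve a QP but gains the sparsity and robustness of the ε-insensitive loss.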
Figure 6 illustrates the true and estimated trajectories of the LapSVR, standard SVR, and LapLS algorithms using the obtained data set in Figure 3, where the LapSVR gives the best tracking performance. In the following, we show three kinds of performance criteria to compare the algorithms.

True and estimated trajectories of the mobile robots via Laplacian SVR (LapSVR), standard SVR (SVR), and Laplacian least square (LapLS).
First, Figure 7 shows the average location estimation errors as a function of the number of labeled data, calculated while the two mobile robots move along their own trajectories. We confirm that the proposed algorithm outperforms the other two algorithms.

Performance of the three different algorithms according to the number of the labeled data.
Second, the location estimation error of the LapSVR as a function of the number of unlabeled data is shown in Figure 8. From this result, we confirm that the unlabeled data has a positive effect when the labeled data is insufficient (fewer than 50 labeled data in Figure 8); excessive unlabeled data (more than 50 unlabeled data in Figure 8) does not improve the performance further.

Performance of the LapSVR according to the number of the unlabeled data.
Third, we examine the sensitivity of the location estimation error to the number of anchor nodes. Set 2 consists of all the deployed nodes; set 1 is formed from the nodes numbered 1 to 9 in Figure 2; and set 0 consists of nodes 1, 3, 5, 7, and 9. In this experiment, we use 40 labeled and 200 unlabeled data. As shown in Figure 9, the LapSVR algorithm exhibits the smallest growth in error as the number of anchor nodes decreases, in comparison with the other algorithms.

Sensitivity of the location estimation error to the number of the anchor nodes.
5. Conclusion
This paper proposes a semisupervised regression algorithm named Laplacian SVR, combining the core concepts of the standard SVR with a manifold regularization framework. The proposed algorithm learns the mapping between the observation space and the physical space using a small number of labeled data and a large number of unlabeled data, which can typically be acquired far more easily than labeled data. The algorithm is evaluated for location estimation of mobile robots in a wireless sensor network. In our multirobot system, the input of the learner is the set of RSSI observations and the output is the physical location. The experimental results show that the proposed algorithm gives the most precise estimates and is least sensitive to variation in the number of anchor nodes, in comparison with the standard SVR and the Laplacian least square algorithm.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This research was financially supported by a grant to MEMS Research Center for National Defense funded by Defense Acquisition Program Administration and the National Research Foundation of Korea (NRF) Grant funded by the Ministry of Science, ICT & Future Planning (MSIP) (no. 2009-0083495).
