Robust Indoor Sensor Localization Using Signatures of Received Signal Strength

Abstract

Indoor localization based on the received signal strength (RSS) values of wireless sensors has recently received a lot of attention. However, due to interference from other wireless devices and human activities, the RSS value varies significantly over time. This hinders exact location prediction using RSS values. In this paper, we propose three methods to counter the adverse effect of the RSS value variation on location prediction. First, we propose to use an index location to select the best radio map, among several preconstructed radio maps, for online location prediction. Second, for each observed value of the signal strength of a sensor, we record the distances from the sensor to the nearest and the farthest locations where that signal strength value has been observed. The minimal and maximal (min-max) distances for each signal strength value of a sensor are then used to reduce the search space in online location prediction. Third, a location-dependent received signal strength vector, called the RSS signature, is used to predict the location of a user. We have built a system, called the region-point system, based on the three proposed methods. The experimental results show that the region-point system achieves a lower mean position error than the existing methods, namely, RADAR, TREE, and CaDet. Furthermore, the index location method correctly selects the best radio map for online location prediction, and the min-max distance method improves the prediction accuracy of RADAR by restricting its search space in location prediction.

1. Introduction

Indoor localization is important for many real-life applications. For example, it provides the location context of a context-aware system, which adapts its settings to the location, activity, and physiology of the user and to the environmental context information [1]. Recently, indoor navigation applications, which require an exact indoor location, have become a very popular research area [2]. Due to the increasing need for indoor localization, many indoor localization techniques have been proposed. An indoor localization method can be categorized as a range-based or a range-free method [3]. While point-to-point distance information is required for a range-based method, it is not required for a range-free method. The techniques for estimating the distance between two communication nodes include the time of arrival (TOA) [4], the time difference of arrival (TDOA) [5], and the angle of arrival (AOA) [6]. The TOA technique uses the radio signal propagation time to estimate the distance. The TDOA technique utilizes two signals with different propagation speeds and estimates the distance between the two communication nodes by measuring the difference between the arrival times of the two signals. Unlike TOA and TDOA, the AOA technique measures the angle at which a signal arrives. It can be used to complement TDOA or TOA in location calculation [3]. Indoor localization methods that use range information usually achieve high accuracy in location estimation. For example, the Cricket [7] indoor localization system of MIT reported an error of 1 to 3 centimeters in position estimation. Although accurate in location prediction, the range-based localization techniques require large-scale deployment and costly devices.

The range-free location prediction techniques have received a lot of attention recently. The well-known range-free location prediction methods include RADAR [8] and the probability-based methods [9–12]. RADAR was developed by Microsoft. In RADAR, for a predefined set of training locations, the received signal strength (RSS) values from several IEEE 802.11 access points are recorded in a database, called the radio map. To estimate the position of a user, the RSS values from the access points are collected at the location of the user. Afterwards, RADAR performs pattern matching of the collected RSS values against the RSS values in the radio map to find a fixed number of locations whose RSS values are most similar to those of the user. Finally, the positions with the most similar RSS values are averaged to give the estimated position of the user. The probability-based methods also use the RSS values for location prediction. However, instead of using a fixed number of locations for prediction, the probability-based methods use the Bayes theorem to predict the location of the user by finding the location where the collected RSS values of the user can be observed with the highest probability. In [13], the authors proposed to learn, at time $t_{0}$ , a set of equations to fit the RSS values of a location using the RSS values of a set of reference points. With this method, the RSS value pattern of a specific location at a later time $t_{j}$ can be calculated by using the RSS value patterns of the reference locations at time $t_{j}$ . Therefore, the effort to collect the RSS values at the offline training phase can be significantly reduced. However, in an environment where the RSS values observed at a location vary over time, the regression equations learned at time $t_{0}$ may not properly reflect the relationship between the RSS values of the location and those of the reference points. This may result in poor prediction accuracy.
In [14], the authors proposed a method, called CaDet, which uses multiple decision trees for location prediction. They first divide the training dataset into several clusters and build a decision tree for each cluster. To predict the user's location, the RSS values of the user are compared against the means of the RSS values of each cluster center to find the cluster with the least distance from the RSS values of the user for prediction. Finally, the decision tree of the selected cluster is used to predict the location of the user. Besides using the values of the received signal strength, in [15], the authors proposed to use the link quality indicator (LQI) values for location prediction. They modeled the location prediction problem as a classification problem and used a neural network model to solve the problem. However, their method is more suitable for finding a coarse position for a user, such as in the kitchen or in the living room.

The most difficult problem for the range-free methods in location prediction is that the offline constructed radio map may not be suitable for online location prediction. The variation of the received signal strength values may render the radio map outdated by the time an online location prediction is required. In this paper, we propose three methods to counter the adverse effect of the variation of the received signal strength values on location prediction. First, we propose to construct several radio maps over different nonoverlapping time intervals and use an index location to select the best radio map for online location prediction. Second, for an RSS value of a sensor observed in the location prediction area, we propose to record the minimal and the maximal (min-max) distance from the sensor to the locations where the same RSS value has been observed. The min-max distance information is used to reduce the number of locations that must be searched in online location prediction. Third, we propose to use a location-dependent received signal strength vector, called the RSS location signature, for pattern matching in online location prediction. A system, called the region-point system, has been built to implement the three proposed methods. The experimental results show that the region-point system achieves a lower position prediction error than the existing methods, including RADAR, TREE, and CaDet. Furthermore, the experiment also shows that the index location method correctly selects the best radio map for location prediction, and the min-max distance method significantly reduces the position prediction error of RADAR. The rest of this paper is organized as follows. In Section 2, we describe the phenomenon of the variation of the received signal strength values. In Section 3, we present the details of the region-point localization system. In Section 4, we present the experimental results. In Section 5, we discuss the experimental results. Finally, in Section 6, we conclude this paper.

2. Variation of the Received Signal Strength

The most challenging problem for location prediction using RSS values is that the RSS values of a sensor observed at a fixed location change over time [12–14, 16]. In this paper, we use the MPR2400CA sensor, a ZigBee-based sensor called Mote, to show the phenomenon of RSS value variation over time. The Mote uses the RF frequency band of 2.4–2.4835 GHz for communication. The 2.4 GHz band is very noisy since the wireless local area network (802.11b and 802.11g), the Bluetooth personal area network (802.15.1), and the industrial, scientific, and medical (ISM) devices all use this unlicensed frequency band. The interference from other networks or devices causes the received signal strength value of a sensor at a fixed location to vary significantly over time. Furthermore, the unpredictable movement of people and the opening or closing of doors change the reflection, absorption, diffraction, and scattering of the radio signals, which amplifies the variation of the RSS values in an indoor environment [13].

To show the variation of RSS values over time, we collected 500 RSS values from a fixed location which is 84.85 centimeters away from a ZigBee sensor for a time interval of 4 consecutive hours. Figure 1(a) shows the distribution of the RSS values from 10 a.m. to 2 p.m., while Figure 1(b) shows the distribution of the RSS values from 3 p.m. to 7 p.m. These figures show that both the shapes of the distributions and the averages of the signal strength values differ between the two time intervals. The variation of the RSS values over time implies that the RSS values collected at the offline training phase may not be suitable for online location prediction [13].

Figure 1

Distributions of the signal strength values.

3. The Region-Point Location Prediction System

In this section, we present the implementation of a robust sensor prediction system which considers the variation of the RSS values.

3.1. The Components and Layout of the System

The components and the layout of the system are shown in Figure 2. The system is implemented in a classroom measuring 9.3 m × 13 m. There are three rows of tables with a desktop on each table. There are two doors and one electronic podium in the room. We placed ten Mote sensors, denoted by M in Figure 2, as the reference sensors.

Figure 2

The components and layout of the system.

A sensor, denoted by U, is mounted on a moving cart for testing the location prediction algorithm. To predict the location of a user, the sensor U (which stands for the user) broadcasts a packet to the reference sensors. Upon receiving the packet from U, a reference sensor records the RSS value of the received packet, stores the RSS value in a new packet, and then sends the new packet to the location prediction computer, denoted by C in Figure 2, which predicts the location of U.

3.2. Architecture of the System

Figure 3 shows the architecture of the region-point location prediction system. It contains the offline training phase and the online location prediction phase. The offline training phase contains the following steps.

(1) For different time periods, collect the RSS values of the reference sensors for each training location and store the RSS values in the radio maps.

(2) Create a min-max distance table for each radio map.

(3) Find the index location for radio map selection.

Figure 3

Architecture of the region-point location prediction system.

The online location prediction phase contains the following steps.

(1) Collect a number of RSS values at the index location.

(2) Select the best radio map for online location prediction.

(3) At the location that needs to be localized, collect the RSS values from the reference sensors; find the region for location prediction using the RSS values and the min-max distance table.

(4) Find the position of the predicted location in the selected region using the RSS signature of the collected RSS values.

The details of each step are discussed in the following.

3.3. Radio Map Construction

During the offline phase, we choose a number of different time intervals and construct a radio map for each time interval. A set of training locations denoted by $L = \{l_{1}, l_{2}, \dots, l_{n}\}$ is chosen for collecting the RSS values. Each location $l_{i}$ is associated with a coordinate $(x_{i}, y_{i})$ . Assume that there are k reference sensors, denoted by $R = \{m_{1}, m_{2}, \dots, m_{k}\}$ , where $m_{i}$ denotes sensor i. Then, each RSS sample is stored in a vector $o_{j}$ of k elements, denoted by $o_{j} = ({s s}_{1}, {s s}_{2}, \dots, {s s}_{k})$ , where ${s s}_{i}$ is the RSS value of the packet received from reference sensor i. Table 1 shows an example of the radio map.

Table 1

An example of the radio map of the system.

locx	locy	$s s_{1}$	$s s_{2}$	$s s_{3}$	$s s_{4}$	$s s_{5}$	$s s_{6}$	$s s_{7}$	$s s_{8}$	$s s_{9}$	$s s_{10}$
1	1	$- 9$	$- 33$	$- 30$	$- 43$	$- 40$	$- 32$	$- 32$	$- 30$	$- 33$	$- 31$
1	1	$- 9$	$- 33$	$- 30$	$- 43$	$- 40$	$- 32$	$- 32$	$- 30$	$- 33$	$- 31$
1	1	$- 9$	$- 33$	$- 30$	$- 43$	$- 40$	$- 32$	$- 31$	$- 30$	$- 32$	$- 31$
1	1	$- 9$	$- 33$	$- 29$	$- 42$	$- 40$	$- 32$	$- 31$	$- 29$	$- 33$	$- 31$
1	1	$- 9$	$- 33$	$- 29$	$- 43$	$- 40$	$- 32$	$- 31$	$- 31$	$- 33$	$- 31$
1	1	$- 9$	$- 33$	$- 30$	$- 43$	$- 40$	$- 32$	$- 32$	$- 29$	$- 32$	$- 32$
1	1	$- 9$	$- 33$	$- 30$	$- 43$	$- 41$	$- 33$	$- 31$	$- 30$	$- 33$	$- 31$
1	1	$- 9$	$- 33$	$- 29$	$- 43$	$- 44$	$- 32$	$- 32$	$- 30$	$- 33$	$- 31$
1	1	$- 9$	$- 33$	$- 30$	$- 43$	$- 43$	$- 32$	$- 31$	$- 30$	$- 33$	$- 31$
1	1	$- 9$	$- 33$	$- 30$	$- 43$	$- 42$	$- 32$	$- 32$	$- 31$	$- 33$	$- 31$
1	1	$- 9$	$- 33$	$- 29$	$- 43$	$- 43$	$- 32$	$- 32$	$- 30$	$- 33$	$- 31$
1	1	$- 9$	$- 33$	$- 29$	$- 43$	$- 43$	$- 32$	$- 32$	$- 30$	$- 32$	$- 31$
1	1	$- 9$	$- 32$	$- 29$	$- 43$	$- 43$	$- 33$	$- 31$	$- 30$	$- 32$	$- 31$
1	1	$- 9$	$- 32$	$- 30$	$- 44$	$- 42$	$- 32$	$- 31$	$- 31$	$- 32$	$- 31$
1	1	$- 9$	$- 33$	$- 30$	$- 43$	$- 42$	$- 33$	$- 31$	$- 30$	$- 32$	$- 31$
1	1	$- 9$	$- 33$	$- 29$	$- 43$	$- 42$	$- 33$	$- 32$	$- 31$	$- 33$	$- 31$
1	7	$- 25$	$- 23$	$- 29$	$- 35$	$- 31$	$- 31$	$- 31$	$- 29$	$- 36$	$- 30$
1	7	$- 25$	$- 23$	$- 26$	$- 33$	$- 31$	$- 31$	$- 33$	$- 28$	$- 36$	$- 30$
1	7	$- 27$	$- 22$	$- 29$	$- 33$	$- 31$	$- 31$	$- 34$	$- 28$	$- 36$	$- 29$
⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯
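As an illustration of how the samples in a radio map can be grouped by training location, the following minimal sketch computes the per-location average RSS vector, a quantity used later for radio map selection and by RADAR. The function name `average_vectors` and the `(x, y, rss_vector)` record layout are assumptions made for this example; the three rows are the samples of location $(1, 7)$ shown in Table 1.

```python
from collections import defaultdict

def average_vectors(rows):
    """Group radio-map rows by location and average each sensor's RSS values.

    rows: iterable of (x, y, rss_vector); this layout is an assumption.
    Returns {(x, y): list of per-sensor mean RSS values}.
    """
    groups = defaultdict(list)
    for x, y, rss in rows:
        groups[(x, y)].append(rss)
    # zip(*vecs) iterates over the columns (one column per reference sensor).
    return {loc: [sum(col) / len(col) for col in zip(*vecs)]
            for loc, vecs in groups.items()}

# The three samples of location (1, 7) listed in Table 1.
rows = [
    (1, 7, [-25, -23, -29, -35, -31, -31, -31, -29, -36, -30]),
    (1, 7, [-25, -23, -26, -33, -31, -31, -33, -28, -36, -30]),
    (1, 7, [-27, -22, -29, -33, -31, -31, -34, -28, -36, -29]),
]
avg = average_vectors(rows)
print(avg[(1, 7)])
```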

3.4. The Min-Max Distance Table

Due to the variation of the RSS values, a reference sensor may observe different RSS values from the localization sensor U when U is fixed at a specific location. Similarly, the same RSS value observed by a reference sensor may be from different packets transmitted by U at different locations. For example, the RSS value −29 dBm of reference sensor $m_{8}$ (column $s s_{8}$ in Table 1) is observed when U is located at location $(1, 1)$ and at location $(1, 7)$ . During the offline training phase, for each observed received signal strength value ${s s}_{i}$ of reference sensor $m_{j}$ , we keep track of the minimum and the maximum distances from sensor $m_{j}$ to sensor U. Table 2 shows an example of the min-max distance table.

Table 2

The min-max distance table.

Mote	$s s$	Min dist.	Max dist.
1	$- 35$	5.099019514	23.60084744
1	$- 39$	5.099019514	23.60084744
1	$- 38$	5.099019514	23.60084744
1	$- 40$	5.099019514	23.60084744
1	$- 41$	5.099019514	23.60084744
1	$- 44$	5.099019514	23.60084744
1	$- 43$	5.099019514	23.60084744
1	$- 42$	5.099019514	23.60084744
1	$- 37$	5.099019514	23.60084744
1	$- 36$	5.099019514	23.60084744
1	$- 20$	5.099019514	7.071067812
1	$- 19$	5.099019514	7.071067812
⋯	⋯	⋯	⋯

The min-max distance table is used to reduce the search region of locations during the online location prediction phase.
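A minimal sketch of how such a table could be built from a radio map is given below. It assumes that the coordinates of the reference sensors are known in the same grid units as the training locations; the names `build_min_max_table`, `radio_map`, and `sensor_pos` and the toy data are hypothetical and not taken from the system.

```python
import math

def build_min_max_table(radio_map, sensor_pos):
    """Map (sensor index, RSS value) -> (min distance, max distance).

    radio_map  : iterable of (x, y, rss), where rss[j] is the value of sensor j
    sensor_pos : list of (x, y) sensor coordinates (assumed to be known)
    """
    table = {}
    for x, y, rss in radio_map:
        for j, value in enumerate(rss):
            d = math.dist((x, y), sensor_pos[j])
            lo, hi = table.get((j, value), (d, d))
            table[(j, value)] = (min(lo, d), max(hi, d))
    return table

# Toy data: one sensor at the origin, samples taken at two locations.
radio_map = [(3, 4, [-29]), (6, 8, [-29]), (3, 4, [-30])]
table = build_min_max_table(radio_map, [(0, 0)])
print(table[(0, -29)])  # distances to the nearest and farthest location for -29
```

A dictionary keyed by the (sensor, RSS value) pair keeps the online lookup to a single access per component of the observed vector.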

3.5. The Index Location

As noted in [13], the radio map constructed in the training phase may not be suitable for online location prediction. We propose to use several radio maps for location prediction. Assume that the set of time intervals is $T = \{t_{1}, t_{2}, \dots, t_{x}\}$ . Let $M_{t}$ denote the radio map constructed at time interval t, $t \in T$ .

Let $\bar{o}_{t, i} = (\overline{ss}_{t, i, 1}, \overline{ss}_{t, i, 2}, \dots, \overline{ss}_{t, i, k})$ denote the average RSS vector at location $l_{i}$ in $M_{t}$ , where $\overline{ss}_{t, i, j}$ , $1 \leq j \leq k$ , is the average of the received signal strength values of sensor j at location $l_{i}$ . Then, for each location $l_{i}$ , we calculate $D_{i}$ , the summation of the Manhattan distances between every pair of average RSS vectors at location $l_{i}$ , where each vector belongs to a different radio map. That is,

\begin{matrix} D_{i} = \sum_{t, t' \in T, t \neq t'} \sum_{m = 1}^{k} | \overline{ss}_{t, i, m} - \overline{ss}_{t', i, m} | . \end{matrix}

(1)

The index location $l_{i}$ is the location which maximizes $D_{i}$ . That is, $D_{i} \geq D_{j}$ , $j = 1, \dots, n$ .

During the online localization phase, we collect five received signal strength vectors at the index location, take their average, and then use the average RSS vector to select the best radio map for online location prediction. Assume that the average RSS vector is $\bar{o}_{i} = (\overline{ss}_{i, 1}, \overline{ss}_{i, 2}, \dots, \overline{ss}_{i, k})$ . Then, the radio map $M_{t^{*}}$ is found by using the following equation:

\begin{matrix} t^{*} = \underset{t = 1, \dots, x}{\arg\min} \sum_{m = 1}^{k} | \overline{ss}_{t, i, m} - \overline{ss}_{i, m} | . \end{matrix}

(2)

That is, we choose the radio map which minimizes the Manhattan distance against the online average RSS vector at location $l_{i}$ for online location prediction.
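The two selection rules in (1) and (2) can be sketched as follows. This is only an illustration: the function names, the dictionary layout `avg_maps[t][location]`, and the toy RSS values are assumptions for the example.

```python
def manhattan(u, v):
    """Manhattan (L1) distance between two RSS vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def find_index_location(avg_maps):
    """Return the location maximizing D_i of Eq. (1).

    avg_maps[t][loc] is the average RSS vector of location loc in radio map t.
    """
    maps = list(avg_maps.values())

    def d_value(loc):
        # Sum over all ordered pairs of distinct radio maps, as in Eq. (1).
        return sum(manhattan(a[loc], b[loc])
                   for a in maps for b in maps if a is not b)

    return max(maps[0], key=d_value)

def select_radio_map(avg_maps, index_loc, online_avg):
    """Return the radio map key t* of Eq. (2)."""
    return min(avg_maps,
               key=lambda t: manhattan(avg_maps[t][index_loc], online_avg))

# Hypothetical averages for two locations and two sensors in three radio maps.
avg_maps = {
    "M1": {(1, 1): [-10, -30], (1, 7): [-25, -23]},
    "M2": {(1, 1): [-11, -30], (1, 7): [-30, -20]},
    "M3": {(1, 1): [-10, -31], (1, 7): [-35, -28]},
}
index_loc = find_index_location(avg_maps)
best = select_radio_map(avg_maps, index_loc, [-34, -27])
print(index_loc, best)  # (1, 7) M3
```

Choosing the location with the largest $D_{i}$ makes the radio maps as distinguishable as possible at the place where the online samples are taken.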

3.6. The RSS Location Signature

While the probability-based methods use the original radio map, as shown in Table 1, for location prediction, we propose to use a refined variant of the RSS vectors, called RSS signatures, for location prediction. An RSS signature of a location is a distinctive RSS representative for the location. Let $P (s s_{j, r} = k)$ denote the probability that the RSS value k of sensor r is observed at location $l_{j}$ . Probability $P ({s s}_{j, r} = k)$ is defined in the following equation:

\begin{matrix} P (s s_{j, r} = k) = \frac{\mathrm{fr} (s s_{j, r} = k)}{\sum_{z = 1}^{n} \mathrm{fr} (s s_{z, r} = k)}, \end{matrix}

(3)

where $\mathrm{fr} (s s_{j, r} = k)$ denotes the number of observations (frequency) of RSS value k of sensor r at location $l_{j}$ . Note that, since the RSS value k of sensor r may be observed at different locations, $P (s s_{j, r} = k)$ is the location distribution of the RSS value k of sensor r at location $l_{j}$ . We then define the discernability factor $f (s s_{r} = k)$ of an RSS value k of sensor r by the following equation:

\begin{matrix} f (s s_{r} = k) = 1 - \frac{(- 1)}{\log n} \cdot \sum_{j = 1}^{n} P (s s_{j, r} = k) \cdot \log (P (s s_{j, r} = k)) . \end{matrix}

(4)

The third and fourth terms of (4) together represent the entropy of the location distribution of RSS value k of sensor r over different locations. The second term is used to normalize the entropy value to the interval [0, 1]. The maximal value of the entropy function occurs when value k of sensor r is evenly distributed over the n locations. In this case, value k of sensor r has no discernability to distinguish between different locations. The more skewed the location distribution is, the smaller the normalized entropy is. The normalized entropy value equals zero if the RSS value k of sensor r can only be observed at a single location. Therefore, the discernability factor of an RSS value k of a sensor r is a measure of its ability to distinguish between different locations in the system. Note that n in (4) is the number of locations in the system.
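The behavior of the discernability factor at its two extremes can be checked with a short sketch of (3) and (4). The function name `discernability` and the toy frequency counts are hypothetical; the sketch assumes more than one location and treats locations where the value was never observed as contributing zero to the entropy.

```python
import math

def discernability(freq_by_location, n_locations):
    """Discernability factor of Eq. (4) for one (sensor, RSS value) pair.

    freq_by_location : {location: number of times the value was observed there}
    n_locations      : total number of training locations n (must be > 1)
    """
    total = sum(freq_by_location.values())
    entropy = 0.0
    for count in freq_by_location.values():
        if count > 0:                    # unobserved locations add nothing
            p = count / total            # Eq. (3)
            entropy -= p * math.log(p)
    return 1.0 - entropy / math.log(n_locations)

# Value seen at a single location: fully discernible.
single = discernability({"l1": 16}, n_locations=4)
# Value spread evenly over all four locations: not discernible at all.
even = discernability({"l1": 5, "l2": 5, "l3": 5, "l4": 5}, n_locations=4)
print(single, even)
```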

Having defined the discernability factor, we define the weight of an RSS value k of sensor r at location $l_{i}$ by the following equation:

\begin{matrix} w (s s_{i, r} = k) = \frac{\mathrm{fr} (s s_{i, r} = k)}{m_{i}} \cdot f (s s_{r} = k), \end{matrix}

(5)

where $m_{i}$ is the total number of RSS samples, that is, the number of RSS vectors, collected at location $l_{i}$ . Equation (5) shows that the weight of RSS value k of sensor r at location $l_{i}$ is the product of the discernability factor of RSS value k and the probability of observing k at location $l_{i}$ .

For location $l_{i}$ , we define its location signature at sensor r to be the RSS value received from sensor r whose weight is greater than that of any other RSS value received by U at location $l_{i}$ from sensor r. To obtain the RSS location signature vector for location $l_{i}$ , we find the RSS location signature value of each sensor r, $1 \leq r \leq k$ . Table 3 shows an example of the table of RSS location signatures for the radio map in Table 1. Table 4 shows the weights of the corresponding RSS location signatures in Table 3.

Table 3

The RSS location signature table.

locx	locy	$s s_{1}$	$s s_{2}$	$s s_{3}$	$s s_{4}$	$s s_{5}$	$s s_{6}$	$s s_{7}$	$s s_{8}$	$s s_{9}$	$s s_{10}$
1	1	$- 9$	$- 33$	$- 30$	$- 43$	$- 37$	$- 29$	$- 29$	$- 25$	$- 33$	$- 27$
1	7	$- 25$	$- 21$	$- 34$	$- 35$	$- 33$	$- 27$	$- 33$	$- 37$	$- 43$	$- 29$
1	13	$- 29$	$- 25$	$- 19$	$- 29$	$- 33$	$- 27$	$- 35$	$- 37$	$- 47$	$- 36$
1	19	$- 35$	$- 25$	$- 19$	$- 13$	$- 34$	$- 30$	$- 24$	$- 29$	$- 33$	$- 36$
5	19	$- 35$	$- 19$	$- 19$	$- 17$	$- 23$	$- 25$	$- 29$	$- 25$	$- 39$	$- 27$
5	13	$- 35$	$- 20$	$- 9$	$- 33$	$- 39$	$- 32$	$- 29$	$- 26$	$- 44$	$- 25$
5	7	$- 47$	$- 14$	$- 17$	$- 35$	$- 34$	$- 25$	$- 24$	$- 26$	$- 25$	$- 29$
5	1	$- 21$	$- 17$	$- 23$	$- 42$	$- 39$	$- 23$	$- 35$	$- 23$	$- 26$	$- 13$
10	1	$- 41$	$- 31$	$- 30$	$- 35$	$- 46$	$- 24$	$- 19$	$- 31$	$- 13$	$- 29$
10	7	$- 31$	$- 35$	$- 35$	$- 46$	$- 29$	$- 14$	$- 15$	$- 21$	$- 30$	$- 32$
10	13	$- 49$	$- 27$	$- 22$	$- 33$	$- 25$	$- 13$	$- 19$	$- 21$	$- 36$	$- 30$
10	19	$- 47$	$- 35$	$- 22$	$- 23$	$- 21$	$- 17$	$- 22$	$- 33$	$- 35$	$- 37$
14	19	$- 48$	$- 34$	$- 29$	$- 29$	$- 11$	$- 24$	$- 38$	$- 25$	$- 39$	$- 38$
14	13	$- 43$	$- 27$	$- 33$	$- 50$	$- 22$	$- 21$	$- 25$	$- 24$	$- 29$	$- 31$
14	7	$- 31$	$- 26$	$- 43$	$- 37$	$- 29$	$- 29$	$- 22$	$- 17$	$- 28$	$- 32$
14	1	$- 25$	$- 27$	$- 46$	$- 35$	$- 27$	$- 39$	$- 19$	$- 8$	$- 19$	$- 38$

Table 4

The weight table.

locx	locy	$s s_{1}$	$s s_{2}$	$s s_{3}$	$s s_{4}$	$s s_{5}$	$s s_{6}$	$s s_{7}$	$s s_{8}$	$s s_{9}$	$s s_{10}$
1	1	0.3640	0.3016	0.0736	0.1075	0.0913	0.1034	0.1029	0.0906	0.1324	0.0766
1	7	0.1376	0.1832	0.0733	0.1479	0.0987	0.0535	0.0709	0.0947	0.0822	0.1346
1	13	0.0862	0.0697	0.1518	0.1566	0.0587	0.0946	0.0864	0.0782	0.0708	0.1242
1	19	0.1026	0.0782	0.1773	0.2940	0.0703	0.1022	0.0953	0.1014	0.1317	0.1576
5	19	0.1823	0.2862	0.2723	0.4140	0.1325	0.1147	0.0686	0.3700	0.0813	0.2053
5	13	0.1141	0.1504	0.2700	0.1944	0.0851	0.1423	0.0668	0.0895	0.0941	0.2057
5	7	0.0630	0.5700	0.0965	0.0986	0.1421	0.1159	0.0823	0.1278	0.4695	0.0990
5	1	0.7471	0.1425	0.3014	0.0724	0.0779	0.1030	0.0803	0.2471	0.3600	0.3980
10	1	0.0646	0.0864	0.0895	0.1881	0.0939	0.2180	0.1038	0.0979	0.3600	0.1663
10	7	0.2262	0.0787	0.0903	0.0630	0.1668	0.1865	0.3520	0.3476	0.1817	0.1327
10	13	0.0521	0.1717	0.0967	0.0934	0.0992	0.1764	0.2365	0.5402	0.1206	0.1699
10	19	0.1132	0.1129	0.1063	0.2117	0.3700	0.6320	0.1667	0.1122	0.1170	0.1287
14	19	0.1255	0.1264	0.0743	0.4479	0.3220	0.2139	0.0947	0.2848	0.0813	0.1493
14	13	0.0597	0.1727	0.0806	0.1265	0.1485	0.2720	0.1916	0.1437	0.0728	0.1105
14	7	0.0929	0.0722	0.0825	0.1094	0.3351	0.1344	0.0863	0.2180	0.2163	0.1156
14	1	0.1480	0.2418	0.0610	0.1380	0.0970	0.0755	0.2427	0.2360	0.2991	0.1104
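The construction of one column of Tables 3 and 4 can be sketched as follows for a single reference sensor, following (3) to (5). The function name `signatures_for_sensor`, the input layout, and the toy samples are assumptions made for illustration; ties between equally weighted values are broken arbitrarily here, which the text does not specify.

```python
import math
from collections import Counter

def signatures_for_sensor(samples_by_location):
    """Return {location: (signature value, weight)} for one reference sensor.

    samples_by_location: {location: list of RSS values observed for the sensor}
    """
    n = len(samples_by_location)
    counts = {loc: Counter(v) for loc, v in samples_by_location.items()}
    totals = Counter()                  # denominator of Eq. (3) per RSS value
    for c in counts.values():
        totals.update(c)

    def factor(value):                  # discernability factor, Eq. (4)
        h = 0.0
        for c in counts.values():
            if c[value]:
                p = c[value] / totals[value]
                h -= p * math.log(p)
        return 1.0 - h / math.log(n)

    result = {}
    for loc, c in counts.items():
        m = sum(c.values())             # number of samples at this location
        weights = {v: (cnt / m) * factor(v) for v, cnt in c.items()}  # Eq. (5)
        best = max(weights, key=weights.get)
        result[loc] = (best, weights[best])
    return result

# Toy samples: -30 occurs at both locations, so it carries little weight.
samples = {"A": [-9, -9, -9, -30], "B": [-30, -30, -25, -25]}
sig = signatures_for_sensor(samples)
print(sig)  # {'A': (-9, 0.75), 'B': (-25, 0.5)}
```

In the toy data, the value −30 is the most frequent one at location B, yet −25 becomes the signature because it is observed only there.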

3.7. The Online Location Prediction Phase

During the online localization phase, we first collect several RSS samples at the index location. Then, we compute the average RSS value vector of the collected samples and use it to select the best radio map for online location prediction.

To find the position of the user, we collect an RSS value vector, denoted by $O^{*} = (s s_{1}, s s_{2}, \dots, s s_{k})$ , at the designated location of the user. Then, for each component $s s_{i}$ of vector $O^{*}$ , we refer to the min-max distance table to find the minimum and the maximum distances from sensor i for this signal strength value. Figure 4 shows the minimum and maximum distances from three sensors for an example.

Figure 4

Min-max distance and bounding box.

From the circles with radii of minimum and maximum distances from their corresponding sensors, we can find the intersection points, that is, $P_{1}$ , $P_{2}$ , $P_{3}$ , $P_{4}$ , $P_{5}$ , $P_{6}$ , and $P_{7}$ , as shown in Figure 4. Then, we find the bounding box of the intersection points as the region within which the position (coordinates) of the user is to be found.
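One possible way to compute such a bounding box is sketched below. The text does not spell out which intersection points are retained, so the sketch assumes that only points consistent with the min-max range of every sensor are kept; this filter, the function names, and the toy geometry are assumptions, not the system's stated procedure.

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Intersection points of two circles (empty list if they do not meet)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)   # distance from c1 to the chord
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))    # half the chord length
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ox, oy = h * (y2 - y1) / d, h * (x2 - x1) / d
    return [(mx + ox, my - oy), (mx - ox, my + oy)]

def bounding_box(annuli, eps=1e-9):
    """annuli: list of ((x, y), min_dist, max_dist), one per reference sensor.

    Returns (xmin, ymin, xmax, ymax), or None if no point qualifies.
    """
    circles = [(c, r) for c, lo, hi in annuli for r in (lo, hi)]
    points = []
    for i in range(len(circles)):
        for j in range(i + 1, len(circles)):
            for pt in circle_intersections(*circles[i], *circles[j]):
                # Assumption: keep points lying within every sensor's range.
                if all(lo - eps <= math.dist(pt, c) <= hi + eps
                       for c, lo, hi in annuli):
                    points.append(pt)
    if not points:
        return None
    xs, ys = zip(*points)
    return min(xs), min(ys), max(xs), max(ys)

# Two sensors, each reporting a distance range of 1 to 3 grid units.
box = bounding_box([((0, 0), 1, 3), ((4, 0), 1, 3)])
print(box)
```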

Finally, we find the training locations within the bounding box and use these locations to predict the position of the user. Pattern matching on RSS location signatures is used to find the position of the user. For each location $l_{i}$ in the bounding box, we find the top-p weighted RSS value components of its RSS location signature. Then, we compute the distance between the vector of the top-p RSS value components of location $l_{i}$ and the vector of the corresponding components of $O^{*}$ . Let us denote the top-p weighted RSS value components of the RSS location signature of $l_{i}$ by $V_{i}^{'} = (f_{1}^{'}, f_{2}^{'}, \dots, f_{p}^{'})$ and the corresponding components of $O^{*}$ by $O^{'} = ({s s}_{1}^{'}, {s s}_{2}^{'}, \dots, {s s}_{p}^{'})$ . Then, the distance between $V_{i}^{'}$ and $O^{'}$ is calculated according to the following equation, which gives the squared Euclidean distance; since the square root is monotonic, omitting it does not change which location is nearest:

\begin{matrix} \mathrm{distance} (V_{i}^{'}, O^{'}) = \sum_{z = 1}^{p} {(f_{z}^{'} - s s_{z}^{'})}^{2} . \end{matrix}

(6)

After computing the distances between $O^{*}$ and the RSS location signatures of the training locations in the bounding box, the position of the user is predicted to be the position of the location with the smallest Euclidean distance of its top-p weighted RSS value components against $O^{'}$ .
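The matching step can be sketched as follows. The signature and weight values are the first four columns of the rows for locations $(1, 1)$ and $(1, 7)$ in Tables 3 and 4; the observed vector, the function name `predict_location`, and the choice of p = 2 are hypothetical.

```python
def predict_location(observed, signatures, weights, candidates, p):
    """Return the candidate location whose top-p signature components are
    closest to the corresponding components of the observed vector.

    observed   : online RSS vector O* (one value per reference sensor)
    signatures : {location: RSS location signature vector}
    weights    : {location: weights of the signature components}
    candidates : training locations inside the bounding box
    p          : number of top-weighted components to compare
    """
    def score(loc):
        w = weights[loc]
        top = sorted(range(len(w)), key=lambda r: w[r], reverse=True)[:p]
        # Sum of squared differences over the top-p components, as in Eq. (6).
        return sum((signatures[loc][r] - observed[r]) ** 2 for r in top)

    return min(candidates, key=score)

signatures = {(1, 1): [-9, -33, -30, -43], (1, 7): [-25, -21, -34, -35]}
weights = {(1, 1): [0.3640, 0.3016, 0.0736, 0.1075],
           (1, 7): [0.1376, 0.1832, 0.0733, 0.1479]}
observed = [-10, -32, -29, -40]   # hypothetical online sample
print(predict_location(observed, signatures, weights, list(signatures), p=2))
# (1, 1)
```

Note that the set of compared components differs from location to location, because each location ranks its own signature components by weight.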

4. Experiments

To show the performance of the region-point system, we perform several experiments on location prediction in the classroom. In this section, we present the experiments and the results.

4.1. The Experimental Environment

As shown in Figure 2, we implement the localization system in a classroom. Figure 5 shows the layout of the reference sensors and the locations where the training samples are taken. The floor of the classroom is covered with square tiles measuring 60 centimeters on each side. We set the origin of the coordinate system at the top left corner of Figure 5. Ten reference sensors, denoted by large circles in Figure 5, are evenly distributed in the classroom. The training locations are denoted by small circles. In total, we have 16 training locations. The coordinates of two examples of training locations are $(1, 1)$ and $(1, 7)$ . Note that, since each grid cell in Figure 5 represents one tile on the floor, the physical distance between any two locations in Figure 5 can be calculated by multiplying their Euclidean distance in grid coordinates by 0.6 meters. To build the radio maps, we collect 500 RSS value samples from each of the 16 training locations over a consecutive 4-hour time interval of the day. Three radio maps, denoted by $M_{1}$ , $M_{2}$ , and $M_{3}$ , are constructed for the experiment.
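The conversion from grid coordinates to meters can be written down in a few lines. The helper names are hypothetical, and the definition of the mean position error as the average physical distance between predicted and true positions is an assumption of this sketch. A diagonal step of one tile gives about 0.8485 m, which is consistent with the 84.85 centimeters quoted in Section 2.

```python
import math

TILE_SIZE_M = 0.6  # side length of one floor tile in meters, from the text

def grid_distance_m(a, b):
    """Physical distance in meters between two grid coordinates."""
    return TILE_SIZE_M * math.dist(a, b)

def mean_position_error_m(predicted, actual):
    """Average physical distance between predicted and true positions
    (assumed definition of the mean position error)."""
    errors = [grid_distance_m(p, q) for p, q in zip(predicted, actual)]
    return sum(errors) / len(errors)

print(round(grid_distance_m((0, 0), (1, 1)), 4))  # 0.8485
print(mean_position_error_m([(1, 1), (1, 7)], [(1, 1), (1, 13)]))
```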

Figure 5

The positions of the sensors and the training locations.

For comparison purposes, we implement the RADAR method, a decision tree method called TREE, and the CaDet method. For RADAR, the RSS vectors of different training samples from the same location are averaged. As a result, each location is associated with only one average RSS vector. To predict the coordinates of a test sample, the three locations whose average RSS vectors have the shortest distances from the test sample are retrieved from the radio map, and their coordinates are averaged to give the predicted coordinates of the test sample. To examine the effect of the search space reduction on RADAR, we revised the RADAR method by using the min-max distance table to confine the search region of RADAR. We call the revised RADAR method ReRADAR in the experiment.
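The RADAR baseline described above can be sketched as a nearest-neighbor search in signal space. The sketch assumes the Euclidean distance between RSS vectors, which the text does not state explicitly; the function name `radar_predict` and the toy radio map are hypothetical. Restricting `avg_map` to the locations inside the min-max bounding box would give a ReRADAR-style variant.

```python
import math

def radar_predict(observed, avg_map, k=3):
    """Average the coordinates of the k locations whose average RSS vectors
    are closest to the observed vector.

    avg_map: {(x, y): average RSS vector of that training location}
    """
    nearest = sorted(avg_map,
                     key=lambda loc: math.dist(avg_map[loc], observed))[:k]
    return (sum(x for x, _ in nearest) / len(nearest),
            sum(y for _, y in nearest) / len(nearest))

# Toy radio map with two reference sensors and four training locations.
avg_map = {
    (1, 1): [-9, -33],
    (1, 7): [-25, -23],
    (5, 1): [-21, -17],
    (14, 19): [-48, -34],
}
estimate = radar_predict([-12, -30], avg_map)
print(estimate)
```

Because the estimate is an average of three training locations, it generally does not coincide with any single training location, even when the best match is exact.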

For the TREE method, a decision tree is constructed for every radio map. The decision tree is then used to predict the coordinates of a test sample. Note that we use the CART decision tree model in R [17] to construct the decision trees. For the CaDet method [14], we first use the K-means method in R to divide the training samples into three clusters based on their RSS vectors. A CART decision tree is then built for each cluster. To predict the coordinates of a test sample, we compare the RSS vector of the test sample against the mean of each cluster and use the decision tree of the cluster whose center is closest to the test sample to predict the location of the test sample.
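The tree selection step of CaDet can be illustrated with a small sketch. The experiments use K-means and CART in R; here the cluster centers are given directly and the per-cluster trees are replaced by placeholder functions, so the names `cadet_predict`, `centers`, and `trees` and all values are hypothetical. The Euclidean distance to the cluster centers is also an assumption.

```python
import math

def cadet_predict(observed, centers, trees):
    """Pick the cluster whose center is closest to the observed RSS vector
    and delegate the prediction to that cluster's predictor.

    centers : list of cluster-mean RSS vectors
    trees   : list of callables, one predictor per cluster (placeholders here)
    """
    nearest = min(range(len(centers)),
                  key=lambda c: math.dist(centers[c], observed))
    return trees[nearest](observed)

# Placeholder predictors standing in for the per-cluster decision trees.
centers = [[-10, -30], [-40, -20]]
trees = [lambda v: (1, 1), lambda v: (14, 19)]
print(cadet_predict([-12, -29], centers, trees))  # (1, 1)
```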

4.2. The Experimental Results

Figure 6 shows an execution of the radio map selection algorithm. It shows that the location $(14, 13)$ is chosen as the index location since its RSS values vary the most across the different radio maps, that is, it has the largest $D_{i}$ . Furthermore, based on the index location, radio map $M_{3}$ is selected as the best radio map for the ongoing experiment.

Figure 6

An execution of the radio map selection algorithm.

To conduct the experiment, we consecutively collect 20 RSS value samples at each of the 16 testing locations. In total, there are 320 test samples. Figure 7 shows four executions of the region point with different lengths of the RSS location signature.

Figure 7

Performance of region point with different lengths of the RSS location signature.

Figure 7 shows that the longer the RSS location signature, the higher the prediction accuracy. However, the length effect decreases as the length becomes longer. This is evidenced in Figure 7, where the accumulated errors for region point with 9 and 10 components, respectively, are almost the same.

Figure 8 shows the accumulated errors for RADAR, ReRADAR, TREE, CaDet, and region point based on radio map $M_{1}$ . It shows that the region point method has the smallest accumulated error among the five methods. It also shows that the accumulated error for ReRADAR is much less than that of RADAR. The mean errors for the 320 test samples are 0.681, 1.29, 2.11, 2.87, and 2.99 meters for region point, ReRADAR, RADAR, CaDet, and TREE, respectively. Figure 8 thus shows that the search region restriction using the min-max distance table effectively reduces the prediction error of RADAR.

Figure 8

Accumulated errors for different methods based on radio map $M_{1}$ .

Figure 9 shows the accumulated errors for different methods based on radio map $M_{2}$ . It again shows that the accumulated error for ReRADAR is much less than that of RADAR. The mean errors are 0.958, 0.991, 2.15, 2.29, and 3.18 meters for region point, ReRADAR, RADAR, CaDet, and TREE, respectively.

Figure 9

Accumulated errors for different methods based on radio map $M_{2}$ .

Figure 10 shows the accumulated errors based on radio map $M_{3}$ , which is chosen by the index location. The mean errors are 0.556, 0.822, 1.98, 1.18, and 1.30 meters for region point, ReRADAR, RADAR, CaDet, and TREE, respectively. Note that the errors for all methods based on $M_{3}$ are less than their corresponding errors based on radio maps $M_{1}$ and $M_{2}$ . This shows that the index location method correctly selects the best radio map for online prediction. It is also noted, from Figures 8, 9, and 10, that, although clustering before constructing decision trees helps to improve the prediction accuracy of CaDet, the improvement is not significant.

Figure 10

Accumulated errors for different methods based on radio map $M_{3}$.

5. Discussion

The fact that the TREE and CaDet methods do not perform well in our experimental environment needs to be carefully studied. To do so, we show the decision tree built by CART based on radio map $M_{3}$ in Figure 11. Note that $M_{3}$ contains 8000 samples, with 500 samples for each of the 16 locations. The label at a terminal node denotes the class and the number of samples in the training dataset that are classified as this label. For example, terminal node 1 has label 1_13, which denotes location $(1, 13)$, and 173 samples are classified as location $(1, 13)$. The classification accuracy of this decision tree on the training dataset is 93.3 percent. Table 5 shows the confusion table for predicting the 320 testing samples based on the decision tree of Figure 11. It shows that 234 out of 320 samples are correctly classified, that is, correctly predicted. For comparison, we show the histogram of prediction errors for both TREE and region point in Figure 12. It shows that region point classifies more samples correctly than TREE does, that is, 283 versus 234. Furthermore, region point tends to assign the misclassified samples to nearby locations. These two observations account for the lower mean prediction error of region point compared with TREE and CaDet.

Table 5

Confusion table for testing samples (rows: actual location; columns: predicted location).

	1_1	1_13	1_19	1_7	10_1	10_13	10_19	10_7	14_1	14_13	14_19	5_1	5_13	5_19
1_1	20	0	0	0	0	0	0	0	0	0	0	0	0	0
1_13	0	20	0	0	0	0	0	0	0	0	0	0	0	0
1_19	0	0	20	0	0	0	0	0	0	0	0	0	0	0
1_7	0	0	0	2	0	0	0	0	4	0	0	0	14	0
10_1	0	0	0	0	20	0	0	0	0	0	0	0	0	0
10_13	0	0	0	0	0	20	0	0	0	0	0	0	0	0
10_19	0	0	0	0	0	0	20	0	0	0	0	0	0	0
10_7	0	0	0	0	0	0	0	20	0	0	0	0	0	0
14_1	0	0	0	0	0	0	0	0	0	20	0	0	0	0
14_13	0	0	0	0	0	0	0	0	0	20	0	0	0	0
14_19	0	0	0	0	0	0	0	0	0	0	20	0	0	0
14_7	0	0	0	0	0	0	0	0	0	20	0	0	0	0
5_1	0	0	0	0	0	0	0	0	0	0	0	20	0	0
5_13	0	0	0	0	0	0	0	0	0	0	8	0	12	0
5_19	0	0	0	0	0	0	0	0	0	0	0	0	0	20
5_7	0	0	0	17	0	0	0	0	0	0	0	0	3	0
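The counts quoted in the discussion can be recomputed from Table 5. The sketch below is illustrative only: it transcribes the nonzero cells of the table, assumes that rows are actual locations and columns are predicted locations (each row sums to 20, the number of test samples per location), and uses hypothetical names such as `confusion` and `never_predicted`.

```python
# Nonzero cells of Table 5, stored as actual location -> {predicted: count}.
# Assumption: rows are actual locations and columns are predicted locations.
confusion = {
    "1_1": {"1_1": 20},
    "1_13": {"1_13": 20},
    "1_19": {"1_19": 20},
    "1_7": {"1_7": 2, "14_1": 4, "5_13": 14},
    "10_1": {"10_1": 20},
    "10_13": {"10_13": 20},
    "10_19": {"10_19": 20},
    "10_7": {"10_7": 20},
    "14_1": {"14_13": 20},
    "14_13": {"14_13": 20},
    "14_19": {"14_19": 20},
    "14_7": {"14_13": 20},
    "5_1": {"5_1": 20},
    "5_13": {"14_19": 8, "5_13": 12},
    "5_19": {"5_19": 20},
    "5_7": {"1_7": 17, "5_13": 3},
}

# Total number of test samples and number of correctly classified samples.
total = sum(sum(row.values()) for row in confusion.values())
correct = sum(row.get(actual, 0) for actual, row in confusion.items())
accuracy = correct / total

# Fraction of samples from each location that the tree classifies correctly.
per_location = {
    actual: row.get(actual, 0) / sum(row.values())
    for actual, row in confusion.items()
}

# Locations that never appear as a prediction (missing table columns).
predicted = {label for row in confusion.values() for label in row}
never_predicted = sorted(set(confusion) - predicted)

print(f"{correct} of {total} correct ({accuracy:.1%})")
print("never predicted:", never_predicted)
print("locations with no correct prediction:",
      [loc for loc, frac in per_location.items() if frac == 0.0])
```

Under this reading, the table has only 14 columns because two of the 16 locations are never output by the tree, and the misclassifications are concentrated in a handful of locations rather than spread evenly.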

Figure 11

The CART decision tree for radio map $M_{3}$.

Figure 12

Histograms of prediction errors for TREE and region point.

6. Conclusions

In this paper, we present the implementation of a robust indoor localization system using a wireless sensor network. In this system, we propose three methods to counter the adverse effect of variation in the received signal strength values on location prediction. First, we propose to use an index location to select the best radio map for online location prediction. Second, we propose to use the min-max distance table to confine the search region for online location prediction. Finally, we propose to use the RSS location signature for pattern matching in online location prediction. The experimental results show that the index location method correctly selects the best radio map for online location prediction. They also show that the min-max distance table method effectively reduces the prediction error of RADAR, and that the region point system offers higher prediction accuracy than RADAR, TREE, and CaDet.

Footnotes

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.
