EMP: Exploiting Mobility Patterns for Collaborative Localization in Sparse Mobile Networks

Abstract

Location awareness plays an indispensable role in a wide variety of application domains such as environment monitoring and vehicle tracking. In this paper we focus on the localization of mobile users in sparse mobile networks, which arise in many practical scenarios where users are distributed over a vast area. The unique characteristics of sparse mobile networks present several challenges for accurate localization, such as constant movement and little information from anchors. By applying entropy analysis to five large datasets of real user traces collected at five sites, we make an important observation: user mobility exhibits strong patterns. Motivated by this observation, we propose a localization approach called EMP that exploits the mobility patterns of users for localization in sparse mobile networks. EMP implements a range-free distributed algorithm, with which each user collaboratively estimates its current location by fusing two localization sources, that is, network connectivity with other nodes and mobility patterns. With trace-driven simulations, we demonstrate that EMP significantly improves localization accuracy compared with existing localization approaches.

1. Introduction

Location awareness plays an indispensable role in a wide variety of domains, such as environment monitoring and vehicle tracking. In this paper we focus on the localization of mobile users or devices in sparse mobile networks, which arise in many practical scenarios where the users or devices are distributed over a vast area. The localization approach based on the Global Positioning System (GPS) suffers from several limitations. First, GPS may fail in indoor environments or in urban areas where urban canyons exist. Second, GPS receivers are power hungry and can quickly drain battery-powered devices.

The unique characteristics of sparse mobile networks present several challenges for accurate localization. First, users are constantly changing their locations, so localization should be performed in real time. Second, in a sparse network, anchors that have location information are scarce most of the time, so localization usually has to be performed with little information from anchors.

There has been extensive research on the localization problem [1–6]. Existing approaches can be classified into two categories: range-based and range-free. Examples of range-based techniques include Received Signal Strength (RSS) [7, 8], Angle of Arrival (AOA) [9], Time of Arrival (TOA) [10], and Time Difference of Arrival (TDOA) [11]. The performance of range-based approaches is highly dependent on the accuracy of the ranging technique, which can vary greatly in practical situations. Range-free approaches [4, 6, 12] use mere communication connectivity to compute the locations of nodes. A high node density is an indispensable condition for range-free approaches. Unfortunately, the high density condition does not hold for sparse mobile networks.

By applying entropy analysis to five large datasets of real user traces from two university campuses (NCSU and KAIST), New York City, Disney World (Orlando), and the North Carolina state fair [13, 14], we make an important observation: user mobility exhibits strong patterns. More specifically, the future location of a user is highly dependent on the current location. In addition, a user tends to move around a few preferred locations.

Motivated by this observation, in this paper we propose a localization approach called EMP by exploiting mobility patterns of users for localization in sparse mobile networks. EMP implements a range-free distributed algorithm, with which each user collaboratively estimates its current location by fusing two localization sources, that is, network connectivity with other nodes and mobility patterns. Upon meeting another user, the location of that user is used to improve the location estimation of the user. At the same time, the mobility pattern of the user is exploited for helping refine its location estimation, and users are differentiated according to the degrees of their mobility patterns.

The technical contributions of the paper are listed as follows.

(i)

By analyzing the five real-world user traces with entropy analysis, we reveal that there exist strong patterns for user mobility. As a result, the mobility of a user is characterized by a Markov chain.

(ii)

It is the first attempt, to the best of our knowledge, to estimate user locations by fusing network connectivity and mobility patterns.

(iii)

With trace-driven simulations, we demonstrate that EMP significantly improves localization accuracy compared with existing localization approaches, such as LOCALE [15].

The rest of the paper is organized as follows. In Section 2, we review related work. In Section 3, we give the system model and the problem statement. In Section 4, we introduce our localization algorithm in detail. Section 5 presents evaluation results. The conclusion is given in Section 6.

2. Related Work

Many methods have been proposed for localization. They can be classified into two categories: range-based localization and range-free localization. In this section, we review related work under the two categories.

2.1. Range-Based Localization

For range-based localization algorithms, techniques like triangulation or trilateration [16] are very popular, in which the physical distance between nodes is measured. These algorithms require special hardware for measuring distances [8, 17–20].

Received Signal Strength (RSS) [7, 8] is a distance measurement technology based on the relationship between RSS and distance. There are various models to map an RSS value to a distance, for example, the free space model, the two ray model, and shadow fading. In a practical situation, however, even if two nodes are at the same distance, their received signal strengths can differ because of the fading effect.

Angle of Arrival (AOA) [9] acts as a complementary measurement to estimate the direction of the received signal. Merely using the AOA technique, however, the localization accuracy is hard to guarantee. Moreover, the cost on the angle detection device is high.

Time of Arrival (TOA) [10] is the technique used by the Global Positioning System (GPS). With four or more GPS satellites, the localization accuracy can be guaranteed. Unfortunately, GPS suffers from several limitations. First, it may fail in indoor environments or in urban areas where urban canyons exist. Second, GPS receivers are power hungry and can quickly drain battery-powered devices.

Time Difference of Arrival (TDOA) [11] is a multilateration technique based on the measurement of the difference in distance to two or more stations whose locations and the broadcast time of signal are both known. Like TOA, however, the implementation is costly.

2.2. Range-Free Localization

Besides range-based localization techniques, there are many range-free localization approaches [2–4, 6, 12, 21–23], which rely on connectivity between different nodes.

Early on, the Centroid algorithm [24] estimated a node's location by simply calculating the center of all seed nodes around the node. Multidimensional Scaling (MDS) [6] is another algorithm for range-free localization. In MDS, the relative locations of nodes are calculated from the estimated distances of all pairs. Once the absolute locations of any three noncollinear nodes are known, the locations of the remaining nodes can be calculated. Two extensions of MDS were later introduced [22, 23]. In addition, MCL [1] takes finite movement speeds into account: the set of possible locations of a node is narrowed using the location information provided by the seed nodes within two hops.

In [5], MCL is extended to enhance the localization performance. To guarantee localization accuracy, however, these approaches need a large number of uniformly distributed seeds. That is, the density of nodes should be high enough to guarantee sufficient connectivity.

3. System Model and Problem Statement

3.1. System Model

We consider the localization of a set of mobile users moving within a given region. The set of mobile users is denoted by $M = \{1, 2, \dots, m\}$ . We divide time into time slots, and the whole period is represented by $T = \{τ_{0}, τ_{1}, \dots, τ_{\max}\}$ .

Initially, at $τ_{0}$ a node $i \in M$ is located at

\begin{matrix} E_{0}^{i} = \begin{pmatrix} x_{0} \\ y_{0} \end{pmatrix} . \end{matrix}

(1)

Since any node's velocity is finite

\begin{matrix} v < v_{\max}, \end{matrix}

(2)

where $v_{\max}$ is the maximum velocity that any node can reach. Thus, after a time slot τ, the location of i is bounded by

\begin{matrix} \sqrt{{(E_{1}^{i} - E_{0}^{i})}^{T} (E_{1}^{i} - E_{0}^{i})} < v_{\max} τ, \end{matrix}

(3)

where $E_{1}^{i}$ is the estimated location of node i at $τ_{1}$ .

The whole region is divided into grids of equal size $(v_{\max} τ) \times (v_{\max} τ)$ . We use set G to denote the set of the grids and let g denote a grid within G. As an example shown in Figure 1, the KAIST campus is divided into $40 \times 60$ grids.

Figure 1

The KAIST campus is divided into $40 \times 60$ grids.

The trajectory of a node can hence be represented as a series of grids that it travels:

\begin{matrix} ξ = \langle g_{1}, g_{2}, g_{3}, \dots \rangle . \end{matrix}

(4)
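As an illustration of Eq. (4), the following minimal sketch maps one position sample per time slot to the grid it falls in. The function names, the (column, row) indexing, and the example values are our assumptions, not part of the original design.

```python
import math

def to_grid(x, y, cell):
    """Map a planar position (same unit as `cell`) to a (column, row) grid index."""
    return (math.floor(x / cell), math.floor(y / cell))

def grid_trajectory(positions, v_max, tau):
    """Convert one position sample per time slot into the grid sequence of Eq. (4)."""
    cell = v_max * tau  # side length of a grid, as defined in Section 3.1
    return [to_grid(x, y, cell) for x, y in positions]

# Hypothetical walk: v_max = 2 m/s and tau = 30 s give 60 m grids.
trace = grid_trajectory([(10.0, 5.0), (65.0, 20.0), (100.0, 61.0)], v_max=2.0, tau=30.0)
print(trace)
```

Because a node moves less than $v_{\max} τ$ per slot, consecutive entries of such a sequence are always the same grid or neighboring grids.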

We make three assumptions. First, mobile nodes are equipped with a low accuracy dead-reckoning tracking sensor device. Second, all users have access to their historical traces. The historical trace of a mobile user i is represented by $ξ^{i}$ . Third, all users share the same communication range γ. When the distance between users i and j does not exceed the communication range, that is, $d_{i j} (τ) \leq γ$ , we say that the two users encounter each other at time slot τ.

3.2. Problem Statement

The goal of our algorithm is to get the location of each mobile user at any time within a period of time of interest. The estimated locations of all mobile nodes are denoted by set $\hat{E}$ . The real locations of mobile nodes are denoted by set E . The location of a mobile user i at time slot τ is represented by

\begin{matrix} E_{τ}^{i} = \begin{pmatrix} x \\ y \end{pmatrix} . \end{matrix}

(5)

We define $Δ (\hat{E})$ to represent the localization error between the estimated locations and the real locations of the mobile nodes:

\begin{matrix} Δ (\hat{E}) = {∥ \hat{E} - E ∥}_{F} = {\left( \sum_{i \in M} \sum_{τ \in T} {∥ {\hat{E}}_{τ}^{i} - E_{τ}^{i} ∥}^{2} \right)}^{1 / 2}, \end{matrix}

(6)

where ${∥ \cdot ∥}_{F}$ is the Frobenius norm.

Thus, our objective of localization of the mobile nodes is as follows:

\begin{matrix} \hat{E} = \arg \min_{\hat{E}} Δ (\hat{E}) . \end{matrix}

(7)

4. Design of EMP

4.1. Overview

EMP is a distributed algorithm designed for localization of nodes in highly sparse mobile networks. In EMP, each node estimates its location jointly based on its own tracking sensor devices (3D accelerometer, electronic compass, etc.), its own mobility pattern, and the estimated locations of its encountered neighbors. As shown in Figure 2, EMP can be divided into three building blocks.

Figure 2

The three major building blocks of EMP and their relationship.

In Exploiting Mobility Pattern, we characterize the mobility pattern of a mobile node with a Markov chain, as introduced in Sections 4.2 and 4.3. In Exploiting Connectivity, the location of a node is consolidated by using the estimated location of an encountered node, as described in Section 4.4. In Localization Fusion, the two localization sources, that is, connectivity and mobility pattern, are fused to derive a better location estimation, as introduced in Section 4.5.

4.2. Characterizing Mobility Pattern

We first show that there are strong mobility patterns with user mobility. To this end, we analyze the real-world user traces from two university campuses (NCSU and KAIST), New York City, Disney World (Orlando), and North Carolina state fair [13, 14] through entropy analysis.

We denote the location of a node as a variable $X \in G$ . The probability that node i is within grid $g_{k}$ is estimated as

\begin{matrix} P (X^{i} = g_{k}) = \frac{num (g_{k})}{| ξ^{i} |}, \end{matrix}

(8)

where $num (g_{k})$ denotes the number of times that $g_{k}$ appeared in the historical trace $ξ^{i}$ .

The marginal entropy can be calculated as

\begin{matrix} H (X^{i}) = - \sum_{X^{i} \in G} P (X^{i}) \log P (X^{i}) . \end{matrix}

(9)

Similarly, the joint probability $P (X_{τ + 1}^{i} = g_{j}, X_{τ}^{i} = g_{k})$ is calculated by

\begin{matrix} P (X_{τ + 1}^{i} = g_{j}, X_{τ}^{i} = g_{k}) = \frac{num (g_{j}, g_{k})}{| ξ^{i} |}, \end{matrix}

(10)

where $num (g_{j}, g_{k})$ denotes the number of times that $(g_{j}, g_{k})$ appeared in the historical trace $ξ^{i}$ . The conditional probability $P (X_{τ + 1}^{i} = g_{j} | X_{τ}^{i} = g_{k})$ is calculated by

\begin{matrix} P (X_{τ + 1}^{i} = g_{j} | X_{τ}^{i} = g_{k}) = \frac{P (X_{τ + 1}^{i} = g_{j}, X_{τ}^{i} = g_{k})}{P (X_{τ}^{i} = g_{k})} . \end{matrix}

(11)

The conditional entropy could be calculated as follows:

\begin{array}{l} H (X_{τ + 1}^{i} | X_{τ}^{i}) \\ = - \sum_{X_{τ + 1}^{i} \in G, X_{τ}^{i} \in G}^{} P (X_{τ + 1}^{i}, X_{τ}^{i}) \log P (X_{τ + 1}^{i} | X_{τ}^{i}) . \end{array}

(12)

More generally, the conditional entropy is denoted by $H (X_{τ + 1}^{i} | X_{τ}^{i}, X_{τ - 1}^{i}, \dots)$ .
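The estimates in (8) to (12) can be sketched in a few lines. This is a minimal illustration under our own assumptions: entropies are in bits (base-2 logarithms), the pair frequencies are normalized by the number of consecutive pairs, which is one less than $| ξ^{i} |$ , and the example trace is hypothetical.

```python
import math
from collections import Counter

def marginal_entropy(trace):
    """H(X) in bits, with P(X = g) estimated by visit frequency as in Eq. (8)."""
    n = len(trace)
    return -sum((c / n) * math.log2(c / n) for c in Counter(trace).values())

def conditional_entropy(trace):
    """H(X_{t+1} | X_t) in bits, estimated from consecutive pairs of the trace."""
    pairs = Counter(zip(trace[:-1], trace[1:]))  # counts of (current, next)
    current = Counter(trace[:-1])                # counts of the conditioning grid
    n = len(trace) - 1                           # number of consecutive pairs
    return -sum((c / n) * math.log2(c / current[g_k])
                for (g_k, _), c in pairs.items())

# Hypothetical trace: a user who strictly alternates between two grids.
trace = ["a", "b"] * 8
h_marginal = marginal_entropy(trace)
h_conditional = conditional_entropy(trace)
print(h_marginal, h_conditional)
```

For this trace the marginal entropy is 1 bit, whereas the conditional entropy vanishes because the next grid is fully determined by the current one.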

The CDFs of the entropies of the five real user traces are shown in Figure 3. We observe that the entropies of the user traces are very low. For comparison, the marginal entropy of a node moving randomly within a $100 \times 100$ field is $13.29$ bits. This gap indicates that the mobility of real users has strong spatiotemporal regularity. Thus, we can use the Markov chain model to characterize the mobility patterns of the mobile users.
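As a check on the reference value, if random movement is read as visiting each of the $100 \times 100 = 10^{4}$ grids with equal probability (our assumption, since the exact random model is not specified), then (9) with base-2 logarithms gives

\begin{matrix} H (X) = - \sum_{k = 1}^{10^{4}} \frac{1}{10^{4}} \log_{2} \frac{1}{10^{4}} = \log_{2} 10^{4} = 4 \log_{2} 10 \approx 13.29 \text{ bits} . \end{matrix}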

Figure 3

The CDF of the entropies from five real user traces: NCSU, KAIST, New York City, Orlando Disney World, and North Carolina state fair.

To determine the order of the Markov chain model, we calculate the conditional mutual information as

\begin{array}{l} I (X_{τ + 1}^{i}; X_{τ - 1}^{i} | X_{τ}^{i}) \\ = H (X_{τ + 1}^{i} | X_{τ}^{i}) - H (X_{τ + 1}^{i} | X_{τ}^{i}, X_{τ - 1}^{i}) . \end{array}

(13)

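The resulting $I (X_{τ + 1}^{i}; X_{τ - 1}^{i} | X_{τ}^{i})$ is less than $0.2$ bits; that is, once the current location is known, the previous location adds little information about the next one. This indicates that the first order Markov chain can well model the mobility pattern of a mobile user.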
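The order test of (13) can be sketched as the difference of two empirical conditional entropies. This is a minimal illustration: the function names and both example traces are hypothetical, and evaluating both terms over the same time slots is our choice rather than something the text specifies.

```python
import math
from collections import Counter

def cond_entropy(trace, order, start):
    """H(X_t | previous `order` grids) in bits, estimated over slots t >= start."""
    contexts, joints = Counter(), Counter()
    for t in range(start, len(trace)):
        ctx = tuple(trace[t - order:t])
        contexts[ctx] += 1
        joints[(ctx, trace[t])] += 1
    n = sum(joints.values())
    return -sum((c / n) * math.log2(c / contexts[ctx])
                for (ctx, _), c in joints.items())

def cond_mutual_information(trace):
    """I(X_{t+1}; X_{t-1} | X_t) as the entropy difference of Eq. (13)."""
    # Both terms use the same slots (t >= 2) so the estimates are comparable.
    return cond_entropy(trace, 1, 2) - cond_entropy(trace, 2, 2)

alternating = ["a", "b"] * 20              # next grid depends on the current one only
back_and_forth = ["a", "b", "c", "b"] * 10  # from b, the next grid depends on the previous one
i_alt = cond_mutual_information(alternating)
i_bf = cond_mutual_information(back_and_forth)
print(i_alt, i_bf)
```

The first trace yields zero, so a first order chain suffices for it; the second yields about half a bit, which would call for a second order chain.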

For implementing the first order Markov chain, the state transition matrix, denoted by Q, could be calculated by

\begin{matrix} Q (i, j) = P (X_{τ + 1} = g_{j} | X_{τ} = g_{i}) . \end{matrix}

(14)
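A minimal sketch of estimating Q in (14) from a historical trace by counting transitions is given below. The sparse dictionary representation and the grid labels are our assumptions.

```python
from collections import Counter, defaultdict

def transition_matrix(trace):
    """Estimate Q[g_i][g_j] = P(X_{t+1} = g_j | X_t = g_i) from one historical trace."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(trace[:-1], trace[1:]):
        counts[cur][nxt] += 1
    # Normalize each row so that the outgoing probabilities sum to one.
    return {cur: {nxt: c / sum(row.values()) for nxt, c in row.items()}
            for cur, row in counts.items()}

# Hypothetical trace over three labeled grids.
trace = ["home", "lab", "home", "lab", "cafe", "lab", "home"]
Q = transition_matrix(trace)
print(Q)
```

Grids that never appear as a current location have no row in this representation, so a complete implementation would need a fallback for them.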

4.3. Exploiting Mobility Pattern

This building block estimates the location of a mobile node by exploiting the mobility pattern of a mobile user. After modeling the movement of a mobile node with the first order Markov chain, the estimate of its location can be obtained:

(i)

initially: $π_{0}$

(ii)

after one step: $π_{1} = π_{0} \times Q$

(iii)

after k steps: $π_{k} = π_{0} \times Q^{k}$ .

Note that π is the location estimate of the mobile node within the field. For illustration purposes, we divide a field into $2 \times 2$ grids. Figure 4 shows the location estimation process. The initial state $π_{0} = \langle 1, 0, 0, 0 \rangle$ corresponds to the state in which a node is $100\%$ sure to be located in the northwestern grid of G.

Figure 4

The process of state transition for exploiting mobility patterns.
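The state transition process can be sketched for the $2 \times 2$ example as follows. The transition matrix below is hypothetical and chosen only for illustration; the grid order (northwest, northeast, southwest, southeast) is also our assumption.

```python
def step(pi, Q):
    """One Markov step: returns the row vector pi x Q."""
    n = len(pi)
    return [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]

def propagate(pi0, Q, k):
    """Location belief after k time slots, pi_k = pi_0 x Q^k."""
    pi = list(pi0)
    for _ in range(k):
        pi = step(pi, Q)
    return pi

# Hypothetical transition matrix; grids ordered NW, NE, SW, SE.
Q = [
    [0.6, 0.2, 0.2, 0.0],
    [0.3, 0.5, 0.0, 0.2],
    [0.3, 0.0, 0.5, 0.2],
    [0.0, 0.4, 0.4, 0.2],
]
pi0 = [1.0, 0.0, 0.0, 0.0]  # the node is certainly in the northwestern grid
pi1 = propagate(pi0, Q, 1)
pi2 = propagate(pi0, Q, 2)
print(pi1, pi2)
```

With these values the belief spreads from the northwestern grid to its neighbors after one step and reaches the southeastern grid after two.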

Clearly, the technique for estimating the location of a mobile node performs well if the mobility pattern of the node is strong. In practice, however, the mobility patterns of some nodes may not be strong. Thus, merely exploiting mobility patterns is insufficient for accurate localization.

4.4. Exploiting Connectivity

This building block aims to estimate the location of a mobile node by exploiting connectivity between nodes. The main idea of this building block is inspired by LOCALE [15], which is a distributed technique that uses connectivity for localization of mobile nodes. In this subsection, we first introduce how to represent a location with the location estimate (mean) and the certainty (covariance) and then describe how a node exchanges location information with its encountered neighbors.

4.4.1. Location Representation

In probability theory, the central limit theorem (CLT) states that, given certain conditions, the mean of a sufficiently large number of independent random variables, each with finite mean and variance, is approximately normally distributed. Based on the CLT, we use the location estimate (mean), denoted by E, and the certainty (covariance), denoted by C, to represent the current location of a node.

In the 2-dimensional case, the probability density function of location estimation is

\begin{matrix} P (E) = \frac{1}{2 π \sqrt{| C |}} e^{- (1 / 2) {(E - \hat{E})}^{T} C^{- 1} (E - \hat{E})}, \end{matrix}

(15)

\begin{matrix} E = \begin{pmatrix} x \\ y \end{pmatrix}, \end{matrix}

(16)

\begin{matrix} C = \begin{pmatrix} σ_{x}^{2} & ρ σ_{x} σ_{y} \\ ρ σ_{x} σ_{y} & σ_{y}^{2} \end{pmatrix}, \end{matrix}

(17)

(17)

where E denotes the true location and the parameter C denotes the certainty. We can see that only two parameters, C for certainty and $\hat{E}$ for location estimation, are necessary.

When a node goes through a long period of disconnection, it estimates its location with low accuracy dead-reckoning tracking devices. These devices are influenced by many factors, for example, battery condition, wind, and temperature. The location estimation during this period also follows the Normal distribution. Since the movement covariance matrix is oriented in the moving direction, denoted by θ, ρ in $C_{L r}$ equals zero. The covariance matrix in the local coordinate system, $C_{L r}$ , is represented by

\begin{matrix} C_{L r} = \begin{pmatrix} σ_{x_{r}}^{2} & 0 \\ 0 & σ_{y_{r}}^{2} \end{pmatrix} . \end{matrix}

(18)

Before the old estimation distribution $N (E_{o}, C_{o})$ and the relative measurement distribution $N (E_{r}, C_{r})$ are combined, a transformation is necessary because they are not expressed in the same coordinate system. The rotation matrix is defined as

\begin{matrix} R (θ) = \begin{pmatrix} \cos θ & - \sin θ \\ \sin θ & \cos θ \end{pmatrix} . \end{matrix}

(19)

The covariance matrix in the common coordinate could be calculated by the rotation of the local coordinate:

\begin{matrix} E_{r} = R {(- θ)}^{T} E_{L r}, \\ C_{r} = R {(- θ)}^{T} C_{L r} R (- θ) . \end{matrix}

(20)

We could calculate the new distribution simply by

\begin{matrix} N = N (E_{o} + E_{r}, C_{o} + C_{r}) . \end{matrix}

(21)

Finally, the new location estimation distribution is calculated simply by the linear combination of the old estimation distribution $N (E_{o}, C_{o})$ and the relative measurement distribution $N (E_{r}, C_{r})$ .
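The dead-reckoning update of (18) to (21) can be sketched with plain $2 \times 2$ arithmetic. This is a minimal illustration, assuming that the local x-axis points along the heading θ; the helper names and the numerical values are hypothetical.

```python
import math

def rot(theta):
    """Rotation matrix R(theta) of Eq. (19)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]

def dead_reckoning_update(E_o, C_o, E_Lr, C_Lr, theta):
    """Apply Eqs. (20) and (21): rotate the local step into the common frame, then add."""
    Rt = transpose(rot(-theta))  # R(-theta)^T
    E_r = [Rt[0][0] * E_Lr[0] + Rt[0][1] * E_Lr[1],
           Rt[1][0] * E_Lr[0] + Rt[1][1] * E_Lr[1]]
    C_r = matmul(matmul(Rt, C_Lr), rot(-theta))
    E_new = [E_o[0] + E_r[0], E_o[1] + E_r[1]]
    C_new = [[C_o[i][j] + C_r[i][j] for j in range(2)] for i in range(2)]
    return E_new, C_new

# Hypothetical step: 10 m straight ahead while heading along the common y-axis.
E_new, C_new = dead_reckoning_update(
    E_o=[0.0, 0.0], C_o=[[1.0, 0.0], [0.0, 1.0]],
    E_Lr=[10.0, 0.0], C_Lr=[[4.0, 0.0], [0.0, 1.0]],
    theta=math.pi / 2,
)
print(E_new, C_new)
```

In this example the along-track variance of the step ends up on the y-axis of the common frame, and the covariance only grows, which reflects the accumulation of dead-reckoning error during disconnection.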

4.4.2. Exchanging Location Information with Encountered Nodes

As mentioned before, our algorithm is distributed, and the coordinate systems of the individual mobile nodes differ from each other, as shown in Figure 5. Therefore, a coordinate transformation is necessary before merging the location estimation from the neighbor nodes.

Figure 5

The representation of the host and neighbor nodes with different coordinates.

The operation process of exchanging location information with encountered nodes is shown in Figure 6.

Figure 6

The operation process of exchanging location information with encountered nodes.

In Step 1, we transform the location estimation by rotating the local coordinate to the common coordinate by

\begin{matrix} C_{x \in \{h, n\}} = R {(θ_{o} - θ_{x})}^{T} C_{L x} R (θ_{o} - θ_{x}) . \end{matrix}

(22)

In Step 2, the uncertainty of the host's location estimate contributes a component of the observation uncertainty along the y-axis.

In Step 3, the location uncertainty of the neighbor also contributes to the uncertainty along the y-axis, so we add it to the observation component as well.

In Step 4, the uncertainty of the neighbor along the x-axis is added to the x component of the observation uncertainty. Because the previous operation has already transformed the coordinates into the relative frame, ρ in $C_{L o}$ equals zero, and $C_{L o}$ can be calculated by

\begin{matrix} C_{L o} = \begin{pmatrix} σ_{n}^{2} & 0 \\ 0 & σ_{h}^{2} + 2 σ_{n}^{2} \end{pmatrix} . \end{matrix}

(23)

When the host and the neighbor node are within communication range, the distance between them is a random variable. Here we assume that the neighbor's position follows a uniform distribution in the 2-dimensional field within the communication range. Thus, we take the distance as $d = γ / \sqrt{2}$ . The observed location estimation could be calculated by

\begin{matrix} E_{o} = \begin{pmatrix} x_{n} + d \cos (θ_{o} - θ_{n}) \\ y_{n} + d \sin (θ_{o} - θ_{n}) \end{pmatrix} . \end{matrix}

(24)
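One way to arrive at the value $d = γ / \sqrt{2}$ used in (24), assuming that the neighbor is uniformly distributed over the disc of radius γ centered at the host (our reading of the uniform assumption), is to take the median distance. The distance r then satisfies

\begin{matrix} P (r \leq d) = \frac{π d^{2}}{π γ^{2}} = \frac{d^{2}}{γ^{2}} = \frac{1}{2} \Longrightarrow d = \frac{γ}{\sqrt{2}} . \end{matrix}

Under the same assumption, the root mean square distance takes the same value, since the mean of $r^{2}$ is $γ^{2} / 2$ .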

In Step 5, with the help of the observation from the neighbor node, the observed $C_{o}$ could be calculated by

\begin{matrix} C_{o} = R {(- θ_{o})}^{T} C_{L o} R (- θ_{o}) . \end{matrix}

(25)

In Step 6, the localization accuracy of the node is improved by merging the host node's location information with the transformed location information from the neighbor node. Owing to internal (tracking sensor devices) and external (environmental) factors, the nodes' location certainties (C) differ from each other. Therefore, we combine the estimates using their certainties (C) as weights.

The merged certainty is calculated by

\begin{matrix} C_{m} = C_{h} - K C_{h} . \end{matrix}

(26)

The merged estimation is calculated by

\begin{matrix} {\hat{E}}_{m} = {\hat{E}}_{h} + K ({\hat{E}}_{o} - {\hat{E}}_{h}), \end{matrix}

(27)

where the factor K is defined as

\begin{matrix} K = C_{h} {(C_{h} + C_{o})}^{- 1} . \end{matrix}

(28)

Then we need to rotate the new location to the local coordinate by

\begin{matrix} C_{L new} = R {(- θ_{m})}^{T} C_{m} R (- θ_{m}), \end{matrix}

(29)

where $θ_{m}$ could be calculated by

\begin{matrix} θ_{m} = \frac{1}{2} \tan^{- 1} \left( \frac{2 ν}{κ - μ} \right), \quad C_{m} = \begin{pmatrix} μ & ν \\ ν & κ \end{pmatrix} . \end{matrix}

(30)
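The role of $θ_{m}$ in (29) and (30) can be checked numerically: rotating $C_{m}$ by this angle removes the off-diagonal term. The sketch below is ours; it uses `atan2` instead of the plain arctangent so that the case κ = μ needs no special handling, and the covariance values are hypothetical.

```python
import math

def principal_angle(C_m):
    """theta_m of Eq. (30) for C_m = [[mu, nu], [nu, kappa]]."""
    mu, nu, kappa = C_m[0][0], C_m[0][1], C_m[1][1]
    return 0.5 * math.atan2(2.0 * nu, kappa - mu)

def to_local(C_m, theta_m):
    """C_Lnew = R(-theta_m)^T C_m R(-theta_m), Eq. (29), with R from Eq. (19)."""
    c, s = math.cos(theta_m), math.sin(theta_m)
    R_t = [[c, -s], [s, c]]  # R(-theta_m)^T, which equals R(theta_m)
    R_m = [[c, s], [-s, c]]  # R(-theta_m)
    tmp = [[sum(R_t[i][k] * C_m[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    return [[sum(tmp[i][k] * R_m[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

C_m = [[3.0, 1.0], [1.0, 2.0]]  # hypothetical merged covariance
C_local = to_local(C_m, principal_angle(C_m))
print(C_local)
```

The rotated matrix is diagonal up to rounding and keeps the trace of $C_{m}$ , so ρ is again zero in the local frame.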

We can find that the uncertainty of the merged estimation becomes lower as a result of merging the host node's location information with the transformed location information from the neighbor. The main principle of Step 6 is as follows. When two nodes can communicate with each other, the physical distance between them is smaller than the communication range, which is a limited value. Before the host node encounters the neighbor node, its location uncertainty can be large because of the cumulative errors of the internal tracking sensor. When encountering the neighbor, its uncertainty can be largely reduced by referring to the location of the neighbor and the limited communication range. The proposed operations in Step 6 are based on this intuition.
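The merge of (26) to (28) can be sketched as follows. This is a minimal illustration with hypothetical numbers: the host has drifted (variance 100 per axis), and the observation derived from the neighbor is more certain (variance 25 per axis).

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    """Inverse of a 2 x 2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def merge(E_h, C_h, E_o, C_o):
    """Merge host and observed estimates following Eqs. (26) to (28)."""
    S = [[C_h[i][j] + C_o[i][j] for j in range(2)] for i in range(2)]
    K = matmul(C_h, inv2(S))                      # Eq. (28)
    diff = [E_o[0] - E_h[0], E_o[1] - E_h[1]]
    E_m = [E_h[i] + K[i][0] * diff[0] + K[i][1] * diff[1] for i in range(2)]  # Eq. (27)
    KC = matmul(K, C_h)
    C_m = [[C_h[i][j] - KC[i][j] for j in range(2)] for i in range(2)]        # Eq. (26)
    return E_m, C_m

E_m, C_m = merge(
    E_h=[0.0, 0.0], C_h=[[100.0, 0.0], [0.0, 100.0]],
    E_o=[10.0, 0.0], C_o=[[25.0, 0.0], [0.0, 25.0]],
)
print(E_m, C_m)
```

Here the merged estimate moves most of the way toward the observation, and the merged variance (20 per axis) is smaller than both inputs, in line with the intuition above.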

4.5. Localization Fusion

This building block fuses the estimated location by exploiting connectivity and the estimated location by exploiting mobility patterns. So far, we have derived two different kinds of location estimation distributions which are very different from each other. The location estimate from exploiting mobility pattern is a discrete location distribution, while the location estimate from exploiting connectivity is continuous. We transform the continuous location into a discrete one by a sampling method.

Previously, we used the function $P (E)$ in (15) to describe the location information. We sample this location distribution by

\begin{matrix} π (i, j) = P (E), \quad \text{where } E = \begin{pmatrix} i \cdot v_{\max} τ \\ j \cdot v_{\max} τ \end{pmatrix} . \end{matrix}

(31)

After sampling, $π (i, j)$ is still continuous valued. We therefore introduce the uniform quantization process (UQP). In the UQP we quantize the probability density into $2^{ν}$ levels, where ν is the number of bits used to store the quantized value. Then, the length of each quantization region is

\begin{matrix} δ = \frac{1}{2^{ν}} . \end{matrix}

(32)

The quantized values are the midpoints of the quantization regions.
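Sampling by (31) and quantizing by (32) can be sketched as below. This is a minimal illustration, assuming that the sampled density values lie in [0, 1] (values of 1 or more are clipped into the top region); the function names and the example parameters are hypothetical.

```python
import math

def gaussian_pdf(x, y, mean, C):
    """Bivariate Normal density of Eq. (15) evaluated at (x, y)."""
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    dx, dy = x - mean[0], y - mean[1]
    # Quadratic form (E - mean)^T C^{-1} (E - mean) for a symmetric 2 x 2 matrix.
    q = (C[1][1] * dx * dx - 2.0 * C[0][1] * dx * dy + C[0][0] * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def quantize(p, bits):
    """Uniform quantizer: 2**bits regions of length delta, output is the region midpoint."""
    delta = 1.0 / (2 ** bits)                   # Eq. (32)
    level = min(int(p / delta), 2 ** bits - 1)  # clip into the top region
    return (level + 0.5) * delta

def sample_and_quantize(mean, C, cols, rows, cell, bits):
    """Eq. (31) followed by the quantizer; pi[j][i] is the value at (i * cell, j * cell)."""
    return [[quantize(gaussian_pdf(i * cell, j * cell, mean, C), bits)
             for i in range(cols)] for j in range(rows)]

pi = sample_and_quantize(mean=(1.0, 1.0), C=[[1.0, 0.0], [0.0, 1.0]],
                         cols=3, rows=3, cell=1.0, bits=8)
print(pi[1][1])
```

Each quantized value differs from the sampled density by at most half a quantization region, that is, δ / 2.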

After this process, two quantized discrete location estimation distributions are ready to be fused. To fuse the two kinds of location estimates, we use the median percent area error (MPAE) proposed in [15]. As shown in Figure 7, the MPAE is defined as the area of the smallest circle that includes $50\%$ of the probability; that is, the circle is the $50\%$ certainty contour of the 2-dimensional CDF. From the definition of MPAE, we can see that the higher a node's certainty about its location estimate, the smaller its MPAE. As a rule of thumb, we use the reciprocal as the weight of certainty; the Certainty Weight w is defined as

\begin{matrix} w = \frac{1}{C}, \end{matrix}

(33)

where C is defined as the number of grids within the MPAE circle, which contains $50\%$ certainty as shown in Figure 7.

Figure 7

The distribution of estimated locations, marked with the median percent area error in the red dash circle.

Given the two distributions, $π_{ρ}$ from exploiting mobility patterns and $π_{ϱ}$ from exploiting connectivity, and their weights $w_{ρ}$ and $w_{ϱ}$ , we can fuse them by calculating their weighted average:

\begin{matrix} π_{f} = \frac{w_{ρ} \cdot π_{ρ} + w_{ϱ} \cdot π_{ϱ}}{w_{ρ} + w_{ϱ}} . \end{matrix}

(34)

After the fusion process, the host's $π_{ρ}$ is refined by $π_{f}$ . This process is the key to increasing the accuracy of node localization.
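The weighting of (33) and the fusion of (34) can be sketched as follows. Finding the truly smallest circle is not attempted here: as a simplifying assumption of ours, the circle is centered on the probability-weighted centroid and grown until it holds at least half of the mass. The function names and the two example distributions are hypothetical.

```python
import math

def mpae_cells(pi):
    """Approximate C of Eq. (33): number of grids inside the smallest
    centroid-centered circle holding at least 50% of the probability mass."""
    total = sum(pi.values())
    cx = sum(p * g[0] for g, p in pi.items()) / total
    cy = sum(p * g[1] for g, p in pi.items()) / total
    dist = {g: math.hypot(g[0] - cx, g[1] - cy) for g in pi}
    mass, radius = 0.0, 0.0
    for g in sorted(pi, key=dist.get):
        mass += pi[g] / total
        radius = dist[g]
        if mass >= 0.5:
            break
    # Count every grid on or inside the circle, including equidistant ones.
    return sum(1 for d in dist.values() if d <= radius + 1e-9)

def fuse(pi_mob, pi_con):
    """Weighted average of Eq. (34) with certainty weights w = 1 / C."""
    w_mob, w_con = 1.0 / mpae_cells(pi_mob), 1.0 / mpae_cells(pi_con)
    grids = set(pi_mob) | set(pi_con)
    return {g: (w_mob * pi_mob.get(g, 0.0) + w_con * pi_con.get(g, 0.0))
               / (w_mob + w_con) for g in grids}

# Hypothetical inputs on a 2 x 2 field: a peaked mobility estimate and a flat connectivity estimate.
pi_mob = {(0, 0): 0.7, (1, 0): 0.1, (0, 1): 0.1, (1, 1): 0.1}
pi_con = {(0, 0): 0.25, (1, 0): 0.25, (0, 1): 0.25, (1, 1): 0.25}
pi_f = fuse(pi_mob, pi_con)
print(pi_f)
```

In this example the peaked estimate needs one grid to reach 50% and the flat one needs four, so the fused distribution stays close to the peaked estimate.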

5. Performance Evaluation

5.1. Methodology and Simulation Setup

To evaluate the performance of EMP, we run simulations driven by two real user traces, from the KAIST campus and New York City. The trace datasets were recorded as follows. GPS receivers record the current position of each user every $30$ seconds, expressed as a relative distance to a reference point. One user can produce one or more daily trace files.

We divide each trace into two segments in time: the first, from the morning to the afternoon, covers almost $75\%$ of the period, and the second, from the afternoon to the night, covers the remaining $25\%$ . The first segment acts as the training trace, which is used to generate the mobility patterns of mobile users. The second segment is used to evaluate the localization approaches.

We use the mean absolute error (MAE) as our performance metric, which has been widely used by localization algorithms:

\begin{matrix} \text{MAE} = \frac{Δ (\hat{E})}{| M | \cdot | T |} . \end{matrix}

(35)
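The metric can be computed directly from (6) and (35). The sketch below follows the formulas as written; the data layout (one list of positions per user, all of equal length) and the example values are our assumptions.

```python
import math

def localization_error(est, true):
    """Delta of Eq. (6): Frobenius norm of the stacked position errors."""
    sq = 0.0
    for user in true:
        for (ex, ey), (tx, ty) in zip(est[user], true[user]):
            sq += (ex - tx) ** 2 + (ey - ty) ** 2
    return math.sqrt(sq)

def mae(est, true):
    """MAE exactly as written in Eq. (35): Delta / (|M| * |T|)."""
    n_users = len(true)
    n_slots = len(next(iter(true.values())))
    return localization_error(est, true) / (n_users * n_slots)

# Hypothetical data: two users, two time slots, a single 5 m error.
true = {"u1": [(0.0, 0.0), (0.0, 0.0)], "u2": [(0.0, 0.0), (0.0, 0.0)]}
est = {"u1": [(3.0, 4.0), (0.0, 0.0)], "u2": [(0.0, 0.0), (0.0, 0.0)]}
print(localization_error(est, true), mae(est, true))
```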

We compare EMP with the following schemes.

(i)

LOCALE [15]: it is designed for localization in sparse mobile networks and utilizes the location information from neighbor nodes when they are within the communication range. The key difference between EMP and LOCALE is that LOCALE does not consider the inherent mobility patterns of users at all.

(ii)

Tracking Sensor (TS): it merely utilizes the location information provided by tracking sensor devices.

5.2. Comparison over Time

We first examine a typical run of the localization schemes. Figures 8 and 9 report the comparison of EMP, LOCALE, and TS for the KAIST and the New York City traces.

Figure 8

Mean absolute error (m) versus time (for KAIST).

Figure 9

Mean absolute error (m) versus time (for New York City).

We can see that the MAEs of the three schemes are initially almost the same. This is because the mobile nodes are initialized with an accurate starting location. By $100$ time units, the MAE of TS has increased greatly, while LOCALE and EMP remain almost the same as each other up to $40$ time units. After $40$ time units, however, the localization error of LOCALE is greater than that of EMP. From the results, we can find that EMP achieves more accurate location estimation than LOCALE and TS do.

5.3. Impact of Number of Nodes

We next investigate the impact of the node density on the localization performance. To examine the impact we vary the number of nodes from 10 to 66 for the KAIST trace and from $5$ to $25$ for the New York City trace.

Figures 10 and 11 report the comparison of the three schemes as the number of nodes is varied for the two traces. We can find that the performance of TS is very poor regardless of the number of users. EMP achieves almost 50% smaller localization error than LOCALE. Both EMP and LOCALE perform better as the number of mobile users increases. This is reasonable because both use cooperative localization to estimate user locations. EMP performs even better because it takes the mobility patterns of mobile users into consideration.

Figure 10

Mean absolute error (m) versus number of users (for KAIST).

Figure 11

Mean absolute error (m) versus number of users (for New York City).

5.4. Impact of Communication Range

We then investigate the impact of the communication range on the localization performance of the three schemes. Figures 12 and 13 show the impact of the communication range γ on the location estimation for the traces from the KAIST campus and from New York City, respectively.

Figure 12

Mean absolute error (m) versus communication range (for KAIST).

Figure 13

Mean absolute error (m) versus communication range (for New York City).

We can see that TS is not sensitive to the variation of the communication range and produces poor localization performance. The performance of LOCALE improves as the communication range increases. Nevertheless, EMP always outperforms LOCALE and TS on both traces.

5.5. Impact of Mobility Pattern

To study the impact of the mobility pattern of users on the localization performance, we artificially generate user traces based on the first order Markov chain. The conditional entropy of the generated traces varies from 0 to 2 bits and is calculated as

\begin{matrix} H = \frac{\sum_{i \in M}^{} H (E_{τ + 1}^{i} | E_{τ}^{i})}{| M |} . \end{matrix}

(36)
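Generating a trace from a first order Markov chain can be sketched as below. The text does not describe the generator, so everything here is our illustration: each grid has four possible successors, which makes the per-step entropy range from 0 bits (deterministic row) to 2 bits (uniform row).

```python
import math
import random

def row_entropy(row):
    """Entropy in bits of one row of the transition matrix (next-grid uncertainty)."""
    return -sum(p * math.log2(p) for p in row if p > 0)

def generate_trace(Q, start, length, seed=0):
    """Sample a grid-index trace of the given length from a first order Markov chain."""
    rng = random.Random(seed)
    trace = [start]
    for _ in range(length - 1):
        row = Q[trace[-1]]
        trace.append(rng.choices(range(len(row)), weights=row)[0])
    return trace

# Two hypothetical extremes over four grids.
deterministic = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
uniform = [[0.25] * 4 for _ in range(4)]
cycle = generate_trace(deterministic, start=0, length=6)
noisy = generate_trace(uniform, start=0, length=6)
print(cycle, row_entropy(deterministic[0]), row_entropy(uniform[0]))
```

Rows between these two extremes give intermediate entropies, which is one way to sweep the strength of the mobility pattern.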

Figure 14 compares EMP and LOCALE. We can find that, for small conditional entropies, EMP significantly outperforms LOCALE. As the conditional entropy becomes larger (i.e., the mobility pattern is not strong), the performance gap between the two schemes becomes smaller. This clearly indicates that EMP can effectively make use of mobility patterns of mobile nodes.

Figure 14

Mean absolute error versus conditional entropy.

6. Conclusion

Location information is valuable for many location-dependent application scenarios in mobile networks. The existing range-based localization is costly because of the hardware for range measurements, and the existing range-free localization requires a high node density. Unfortunately, the high density condition does not hold for sparse mobile networks. By analyzing five large datasets of real users traces with entropy analysis from two university campuses (NCSU and KAIST), New York City, Disney World (Orlando), and North Carolina state fair [13, 14], we have made an important observation that there are strong patterns with user mobility. With this observation, we have presented a localization approach called EMP by exploiting mobility patterns of users for localization of nodes in sparse mobile networks. EMP implements a ranging-free distributed algorithm, with which each user collaboratively estimates its current location by fusing two localization sources, that is, network connectivity and mobility patterns. Upon meeting another user, the location of that user is used to improve the location estimation of the user. At the same time, the mobility pattern of the user is exploited for helping refine its location estimation, and users are differentiated according to the degrees of their mobility patterns. Trace driven simulations show that EMP achieves significantly better localization performance than other existing approaches.

In our future work, we would like to incrementally implement our localization approach in mobile networks where users exhibit strong mobility patterns, for example, on university campuses. It should be noted, however, that such an implementation would involve a large number of users, and it takes time to accumulate that many users.

Footnotes

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research is supported in part by the 863 Program (2011AA010500 and 2013AA01A601) and the Program for Changjiang Scholars and Innovative Research Team in Universities of China (IRT1158, PCSIRT).

References

1. Baggio and Langendoen, "Monte-Carlo localization for mobile wireless sensor networks," in Mobile Ad-Hoc and Sensor Networks, pp. 317-328, Springer, 2006.

2. Doherty and El Ghaoui, "Convex position estimation in wireless sensor networks," in Proceedings of the 20th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM '01), pp. 1655-1663, April 2001. Scopus: 2-s2.0-0035013232.

3. Huang, B. M. Blum, J. A. Stankovic, and Abdelzaher, "Range-free localization schemes for large scale sensor networks," in Proceedings of the 9th Annual International Conference on Mobile Computing and Networking (MobiCom '03), 2003.

4. Jin and Xia, "Scalable and fully distributed localization with mere connectivity," in Proceedings of the IEEE International Conference on Computer Communications (INFOCOM '11), pp. 3164-3172, April 2011. Scopus: 2-s2.0-79960858105. doi:10.1109/INFCOM.2011.5935163.

5. Sheu and Lin, "Distributed localization scheme for mobile sensor networks," IEEE Transactions on Mobile Computing, vol. 9, no. 4, pp. 516-526, 2010. Scopus: 2-s2.0-77649270849. doi:10.1109/TMC.2009.149.

6. Shang, Ruml, Zhang, and M. P. J. Fromherz, "Localization from mere connectivity," in Proceedings of the 4th ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc '03), pp. 201-212, June 2003. Scopus: 2-s2.0-0242612017.

7. Chipara, Hackmann, W. D. Smart, and Roman, "Practical modeling and prediction of radio coverage of indoor sensor networks," in Proceedings of the 9th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN '10), pp. 339-349, April 2010. Scopus: 2-s2.0-77954486430. doi:10.1145/1791212.1791252.

8. Chintalapudi, A. P. Iyer, and V. N. Padmanabhan, "Indoor localization without the pain," in Proceedings of the 16th Annual Conference on Mobile Computing and Networking (MobiCom '10), pp. 173-184, September 2010. Scopus: 2-s2.0-78649291578. doi:10.1145/1859995.1860016.

9. Niculescu and Nath, "Ad hoc positioning system (APS)," in Proceedings of the IEEE Conference on Computer Communications (INFOCOM '03), pp. 2926-2931, November 2001. Scopus: 2-s2.0-0035684931.

10. B. H. Wellenhoff, Lichtenegger, and Collins, Global Positioning System: Theory and Practice, 4th edition, Springer, 1997.

11. Savvides, C.-C. Han, and M. B. Srivastava, "Dynamic fine-grained localization in ad-hoc networks of sensors," in Proceedings of the 7th Annual International Conference on Mobile Computing and Networking (MobiCom '01), pp. 166-179, July 2001. Scopus: 2-s2.0-0034775930.

12. Kunz, "Localization applying an efficient neural network mapping," in Proceedings of the 1st International Conference on Autonomic Computing and Communication Systems, 2007.

13. Lee, Hong, S. J. Kim, Rhee, and Chong, "SLAW: a mobility model for human walks," in Proceedings of the 28th Conference on Computer Communications (INFOCOM '09), pp. 855-863, Rio de Janeiro, Brazil, April 2009. Scopus: 2-s2.0-70349659583. doi:10.1109/INFCOM.2009.5061995.

14. Rhee, Shin, Hong, Lee, and Chong, "On the Levy-walk nature of human mobility," in Proceedings of the 27th IEEE Communications Society Conference on Computer Communications (INFOCOM '08), pp. 1597-1605, Phoenix, Ariz, USA, April 2008. Scopus: 2-s2.0-51349144836. doi:10.1109/INFOCOM.2007.145.

15. Zhang and Martonosi, "LOCALE: collaborative localization estimation for sparse mobile sensor networks," in Proceedings of the International Conference on Information Processing in Sensor Networks (IPSN '08), pp. 195-206, April 2008. Scopus: 2-s2.0-51449099283. doi:10.1109/IPSN.2008.63.

16. Bachrach, Taylor, and Stojmenovic, "Localization in sensor networks," in Handbook of Sensor Networks: Algorithms and Architectures, Wiley Series on Parallel and Distributed Computing, Wiley, 2005.

17. Nasipuri, "A directionality based location discovery scheme for wireless sensor networks," in Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications (WSNA '02), pp. 105-111, September 2002. Scopus: 2-s2.0-0036986579.

18. N. B. Priyantha, A. K. L. Miu, Balakrishnan, and Teller, "The cricket compass for context-aware mobile applications," in Proceedings of the 7th ACM Annual International Conference on Mobile Computing and Networking (MobiCom '01), pp. 1-14, July 2001. Scopus: 2-s2.0-0034781204.

19. S. Y. Seidel and T. S. Rappaport, "914 MHz path loss prediction models for indoor wireless communications in multifloored buildings," IEEE Transactions on Antennas and Propagation, vol. 40, no. 2, pp. 207-217, 1992. Scopus: 2-s2.0-0026819732. doi:10.1109/8.127405.

20. Whitehouse and Culler, "Calibration as parameter estimation in sensor networks," in Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications (WSNA '02), pp. 59-67, September 2002. Scopus: 2-s2.0-0036983255.

21. Galstyan, Krishnamachari, Lerman, and Pattem, "Distributed online localization in sensor networks using a moving target," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (IPSN '04), pp. 61-70, April 2004. Scopus: 2-s2.0-3042819146.

22. Shang and Ruml, "Improved MDS-based localization," in Proceedings of the IEEE Conference on Computer Communications (INFOCOM '04), pp. 2640-2651, March 2004. Scopus: 2-s2.0-8344247710. doi:10.1109/INFCOM.2004.1354683.

23. Vivekanandan and V. W. S. Wong, "Ordinal MDS-based localisation for wireless sensor networks," International Journal of Sensor Networks, vol. 1, no. 3, pp. 169-178, 2006.

24. Bulusu, Heidemann, and Estrin, "GPS-less low-cost outdoor localization for very small devices," IEEE Personal Communications, vol. 7, no. 5, pp. 28-34, 2000. Scopus: 2-s2.0-0034291601. doi:10.1109/98.878533.