A Location Prediction Algorithm with Daily Routines in Location-Based Participatory Sensing Systems

Abstract

Mobile node location prediction is critical to efficient data acquisition and message forwarding in participatory sensing systems. This paper proposes a social-relationship-based mobile node location prediction algorithm using daily routines (SMLPR). The SMLPR algorithm models application scenarios based on geographic locations and extracts the social relationships of mobile nodes from their mobility. To account for the dynamics of users' behavior that result from their daily routines, SMLPR first predicts a node's mobility with a hidden Markov model over different daily time periods and then amends the prediction using the location information of other nodes that have a strong relationship with the node. Finally, the UCSD WTD dataset is used for simulations. Simulation results show that SMLPR achieves higher prediction accuracy than proposals based on the Markov model.

1. Introduction

Participatory sensing is a recently emerged sensing technology which emphasizes that people participate in the sensing process. Participatory sensing enables individuals and communities to gather, analyze, and share local knowledge and subsequently to make intelligent decisions and offer social services. The earliest research on participatory sensing was carried out by Srivastava et al., who proposed the concept of urban sensing in a 2006 technical report [1] that discusses the system architecture and technical methods of urban sensing. The MSG (Mobile Sensing Group) laboratory at Dartmouth College also conducts research in this area, including BikeNet [2], SoundSense [3], CenceMe [4, 5], MetroSense [6], and Bubble-Sensing [7].

Participatory sensing systems rely on mobile phone users to sense and transmit data for diverse purposes in the process of monitoring or solving a particular problem. Because of their large number of users, participatory sensing systems have the potential to acquire large amounts of data from various places and to address large-scale location-based problems [8–10]. A typical location-based participatory sensing system collects and records air quality measurements to monitor the pollution of a particular location. Moreover, smartphones with Internet connectivity can also contribute to the growth of participatory sensing systems. In [11], a solution based on web services is proposed to permit the interaction between a mobile application and an IPv6-compliant WSN scenario.

In participatory sensing systems, mobile devices are usually weakly connected. Because connections are uncertain, nodes sometimes depend on encounter opportunities to accomplish data communication and transmission. If the location of mobile nodes can be predicted several time slots ahead, the service quality and efficiency of the system will be remarkably improved.

In order to solve location prediction problems, human mobility has been analyzed at different geographic scales [12–14]. In [12] the limits of predictability provided by humans' mobility patterns are examined. Data collected from 50,000 anonymous mobile phone users over a period of three months are used to study human mobility patterns, and different entropy measures are adopted to estimate the potential predictability in human dynamics. Based on this analysis, a 93% potential predictability in user mobility across the whole user base is reported. In [13] the location data of 100,000 mobile phone users, collected by tracking each person's position for six months, are investigated. This research shows that the individual mobility patterns of the test persons can be described by a single spatial probability distribution after some corrective preprocessing. Thus the results of this study suggest that humans generally follow simple reproducible patterns.

Based on the human mobility analysis, different techniques based on the Markov model have been applied to location prediction problems of human individuals. In [15], a set of discrete locations is defined using the WIFI cells on a university campus. Two different kinds of location predictors, the kth order Markov predictor and an LZ-based predictor, are tested in predicting the next location. The test results show that the second order Markov predictor with a certain fallback feature performs best and provides a median accuracy of 72%. Literature [16] presents a similar extended Markov predictor, which takes the arrival time and residence time into account. Specifically, delay embedding is used to extract location sequences of a certain length from time series. These sequences are then used directly to predict a user's location, and the prediction result is obtained by comparing the last observed locations to all embedded location sequences.

Meanwhile, hidden Markov models (HMMs) are also considered to predict human mobility. Literature [17] presents a hybrid method on the basis of hidden Markov models. The proposed approach clusters location histories according to their characteristics and then trains an HMM for each cluster. With HMMs, location characteristics are treated as unobservable parameters, and the effects of each individual's previous actions are also accounted for in the prediction process. Finally, a prediction accuracy of 13.85% can be achieved when considering regions of roughly 1280 square meters.

Besides the Markov model and HMM, other location prediction algorithms have been used to analyze location information in the literature [18–26], such as the artificial neural network-based algorithm [18], Bayesian network-based methods [19, 20], mobile-sink-based methods [21, 22], the regression-based method [23], and mobile anchor assisted localization algorithms [24, 25]. These methods predict the future positions of nodes from different perspectives, but they focus on the behavior of each single node.

In fact, however, the behavior of a node is determined not only by its location state in the previous period but also by the social relationships of mobile users.

In this paper, a social-relationship-based mobile node location prediction algorithm using daily routines (SMLPR) is proposed. In this approach, the social relationships of nodes are used to optimize the location prediction algorithm so that it can better adapt to participatory sensing applications and improve the prediction accuracy. The remainder of this paper is organized as follows: the network model is illustrated in Section 2. Section 3 specifies the mobile node location prediction algorithm. Section 4 evaluates performance through extensive simulations. Section 5 concludes the paper.

2. Network Model

Human movements often exhibit a high degree of repetition, including regular visits to certain places and regular contacts during daily activities. In this paper, a hybrid urban network model is proposed for participatory sensing systems, as illustrated in Figure 1. The map M is partitioned into small regions. In other words, M is represented as a finite set of regions $\{a_{1}, \dots, a_{n}\}$ such that $\bigcup_{i = 1}^{n} a_{i} = M$ with $a_{i} \cap a_{j} = \emptyset$ ($i \neq j$). The possibility of a user moving from one region to another is represented by a directed graph. In these regions, the sensed data needs to be aggregated to reduce network overhead and to enhance its usefulness among consumers.

Figure 1

Participatory sensing scenario.

As mentioned above, the network is a mixture of an opportunistic network and a centralized infrastructure, as shown in Figure 1. The centralized infrastructure consists of a number of wireless access points (APs) and a backbone connecting the APs. The purpose of this model is to collect data from a peer-to-peer network (scenario 2 in Figure 1) or from WIFI APs in the vicinity of the consumers (scenarios 1 and 3 in Figure 1), rather than from any 3G/4G server. Mobile nodes carrying smart devices can access the network only when they walk into the transmission range of an AP, and data transmission between peers can occur only when they fall into each other's transmission range, as in normal opportunistic networks.

Definition 1 (location).

Inside a geographical region, a mobile device closest to a fixed location is selected to perform data collection and the collected data is sent to a set of bounding APs where it is stored and pulled by the consumers. The fixed location is termed as the aggregation location.

Definition 2 (location granularity).

An aggregation location includes a set of APs in adjacent position. The granularity of the location represents the location's transmission range which is decided by the number and the scope of the APs in the same geographical region.

In this study, the historical trajectories of users connecting with APs are recorded and divided by time granularity. The trajectory of a moving user is defined as a sequence of points $\{(id^{j}, l_{1}, vt_{1}), (id^{j}, l_{2}, vt_{2}), \dots, (id^{j}, l_{m}, vt_{m})\}$, where $id^{j}$ is the user's identifier and location $l_{i}$ is represented by the set of adjacent APs to which the user is connected at time slot $vt_{i}$, $1 \leq i \leq m$. In this paper, a mechanism to construct aggregation locations based on the information of APs is proposed, which divides APs into locations at different granularities.

The urban scenario is transformed into a graph $G = \{E, V, W\}$, where each AP is represented by a vertex $v \in V$ and the relation between two APs is represented by an edge $e \in E$, where $E \subset V \times V$. The relationship matrix of APs is $W = [w_{ij}]$, $i \in N$, $j \in N$, where $w_{ij}$ represents the relationship between AP i and AP j, which can be calculated by

\begin{matrix} w_{ij} = \frac{2 n_{ij}}{n_{i} + n_{j}} . \end{matrix}

(1)

In formula (1), $n_{ij}$ is the number of times that AP i and AP j appear together on users' devices in the same period, and $n_{i}$ is the total number of times that AP i appears (likewise $n_{j}$ for AP j).
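The relationship weight of formula (1) can be sketched in a few lines of Python. This is only an illustration under assumptions: the input format (one set of AP identifiers per connection record) and the function name `ap_relationship_weights` are hypothetical and do not come from the WTD dataset or the original implementation.

```python
from collections import Counter
from itertools import combinations


def ap_relationship_weights(scans):
    """Compute the pairwise AP weights of formula (1).

    scans: iterable of sets of AP ids seen together in one record
           (hypothetical input format).
    Returns {(ap_i, ap_j): weight} with ap_i < ap_j.
    """
    single = Counter()  # n_i: how often each AP appears
    pair = Counter()    # n_ij: how often two APs appear together
    for aps in scans:
        aps = sorted(set(aps))
        single.update(aps)
        pair.update(combinations(aps, 2))
    return {
        (i, j): 2 * n_ij / (single[i] + single[j])
        for (i, j), n_ij in pair.items()
    }


scans = [{"ap1", "ap2"}, {"ap1", "ap2"}, {"ap1", "ap3"}, {"ap3"}]
weights = ap_relationship_weights(scans)
```

Pairs of APs that never appear together are simply absent from the result, which corresponds to a weight of zero.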

Using Kruskal's greedy algorithm [27], the maximum spanning tree T of the graph G is easily obtained. After a weight λ is chosen as the location granularity, every edge in T whose weight is less than λ is cut, which splits the tree into separate connected components. Each connected component is regarded as a location. Figure 2 shows the process of constructing the aggregation locations; a construction with granularity 0.6 is shown in Figure 2(c).

Figure 2

Process of location construction.
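The location construction described above (maximum spanning tree, then cutting edges lighter than λ) might be sketched as follows. The function name `build_locations`, the dictionary-based edge format, and the union-find helper are assumptions made for illustration; the paper only names Kruskal's algorithm and the cutting rule.

```python
def build_locations(aps, weights, granularity):
    """Group APs into locations.

    aps: iterable of AP ids
    weights: {(ap_i, ap_j): relationship weight}
    granularity: edges of the tree lighter than this value are cut
    """
    aps = list(aps)
    parent = {ap: ap for ap in aps}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal on descending weights yields a maximum spanning tree (or forest).
    tree = []
    for (i, j), w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            tree.append((i, j, w))

    # Keep only tree edges at or above the granularity and collect components.
    parent = {ap: ap for ap in aps}
    for i, j, w in tree:
        if w >= granularity:
            parent[find(i)] = find(j)

    groups = {}
    for ap in aps:
        groups.setdefault(find(ap), set()).add(ap)
    return sorted(groups.values(), key=sorted)


edges = {("a", "b"): 0.9, ("b", "c"): 0.7, ("a", "c"): 0.3, ("c", "d"): 0.5}
locations = build_locations("abcd", edges, granularity=0.6)
```

In this toy example the tree keeps the edges a-b, b-c, and c-d; cutting at 0.6 removes c-d, so APs a, b, and c form one location and d forms another.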

In most WLAN datasets, each connection is recorded in the format (node, contact time, APs, signal strength), and one or more APs to which a user connects may appear in one item at the same time, which can make the user's location ambiguous. Therefore, after the set of locations in the mobility scenario is obtained, it is also necessary to estimate which location the user belongs to in a given period. The signal strength between a user's mobile device and a WIFI AP in the WLAN dataset can help to solve this problem. A weight between the user and a location can be calculated by

\begin{matrix} \mathrm{weight}_{i} = \frac{\sum_{j = 1}^{n} \mathrm{strength}_{j}}{n} . \end{matrix}

(2)

In formula (2), n represents the number of APs in location i, and $\mathrm{strength}_{j}$ represents the signal strength between the user A and AP j ($\mathrm{AP}_{j} \in \mathrm{location}_{i}$). The user A is considered to be at location i in the time period if i meets the condition

\begin{matrix} i = \arg\max_{i} \{\mathrm{weight}_{i}\} . \end{matrix}

(3)
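Formulas (2) and (3) amount to picking the location with the highest mean signal strength. The sketch below assumes that strengths are non-negative quality values and that an AP not heard in the period contributes 0; both points, as well as the function and location names, are assumptions rather than details given in the paper.

```python
def assign_location(locations, strengths):
    """Return the location id with the highest mean signal strength.

    locations: {location_id: set of AP ids}
    strengths: {ap_id: signal strength seen by the user in this period}
    """
    def weight(aps):
        # Formula (2); unheard APs count as 0 (assumption).
        return sum(strengths.get(ap, 0.0) for ap in aps) / len(aps)

    # Formula (3): argmax over locations.
    return max(locations, key=lambda loc: weight(locations[loc]))


locations = {"library": {"ap1", "ap2"}, "cafe": {"ap3"}}
where = assign_location(locations, {"ap1": 40, "ap2": 20, "ap3": 25})
```

Here the library averages 30 and the cafe 25, so the user is placed at the library.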

3. Algorithm Design

This paper proposes a simple method for predicting the future locations of mobile nodes on the basis of their previous movements between locations. The proposed approach considers different daily time periods, reflecting the fact that users behave differently and visit different places over the course of their daily routines. Therefore the hidden Markov model is introduced to capture the dynamics of users' behavior resulting from these daily routines.

Moreover, users experience a combination of periodic movement that is geographically limited and seemingly random jumps correlated with their social networks. Social relationships can explain about 10% to 30% of all human movement, while periodic behavior explains 50% to 70% [28]. On this theoretical basis, the social relationships between nodes are also exploited in this paper to optimize and amend the location prediction result.

3.1. Hidden Markov Prediction Model

In an urban scenario, users adopt different behaviors during different periods of the day, and their daily routines may influence their trajectories. Thus the different daily time periods (referred to as daily samples) should be considered in order to guarantee a more realistic representation. With filtering and a hidden Markov model, this can be done in a simple way.

Hidden Markov model (HMM) is a well-known approach for the analysis of sequential data, in which the sequences are assumed to be generated by a Markov process with hidden states.

Figure 3 shows the general architecture of an instantiated HMM. Each shape in the diagram represents a random variable that can adopt any number of values. The random variable $x (t)$ is the location state at time t. The random variable $e (t)$ is the daily sample state (in this paper, the observed state is called evidence) at time t. The arrows in the diagram denote conditional dependencies. From the diagram, it is clear that the conditional probability distribution of the location variable $x (t)$ at time t, given by the values of the location variable x at all times, depends only on the value of the location variable $x (t - 1)$ , and thus the value at time $t - 2$ and the values before it have no influence. This is called the Markov property. Similarly, the value of the evidence $e (t)$ only depends on the value of the location variable $x (t)$ , at time t.

Figure 3

Example of hidden Markov model.

Given the result of filtering up to time t, one can easily compute the result for $t + 1$ from the new evidence $e_{t + 1}$. The calculation can be viewed as being composed of two parts: first, the current state distribution is projected forward from t to $t + 1$; second, it is updated using the new evidence $e_{t + 1}$. This two-part process is expressed by

\begin{array}{l} P (X_{t + 1} | e_{1 : t + 1}) \\ = α P (e_{t + 1} | X_{t + 1}) \sum_{X_{t}} P (X_{t + 1} | X_{t}) P (X_{t} | e_{1 : t}), \end{array}

(4)

where α is a normalizing constant used to make probabilities sum up to 1. Within the summation, $P (X_{t + 1} | X_{t})$ is the common transition model and $P (X_{t} | e_{1 : t})$ is the current state distribution. $P (e_{t + 1} | X_{t + 1})$ is used to update the projected distribution, and it is obtainable directly from the statistical data.
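A single filtering step of formula (4) can be written out as follows. This is a minimal sketch with plain Python lists; the function name `filter_step` and the toy numbers are illustrative assumptions.

```python
def filter_step(belief, transition, likelihood):
    """One step of formula (4).

    belief[i]        = P(X_t = i | e_1:t)
    transition[i][j] = P(X_{t+1} = j | X_t = i)
    likelihood[j]    = P(e_{t+1} | X_{t+1} = j)
    """
    m = len(belief)
    # Part 1: project the current distribution forward one step.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(m)) for j in range(m)
    ]
    # Part 2: weight by the evidence and normalize (the constant alpha).
    unnormalized = [likelihood[j] * predicted[j] for j in range(m)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("evidence has zero probability under the model")
    return [u / total for u in unnormalized]


posterior = filter_step(
    belief=[0.5, 0.5],
    transition=[[0.7, 0.3], [0.3, 0.7]],
    likelihood=[0.9, 0.2],
)
```

With these numbers the projected distribution stays at (0.5, 0.5), and the evidence shifts it to roughly (0.82, 0.18).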

In a participatory sensing system, for each application scenario, the hidden Markov model can be used to predict the future location state of each mobile node. The prediction process includes the following steps.

3.1.1. Preparatory Stage

At the beginning, the system needs to collect enough information of user movement trajectories to construct the Markov chain, and therefore a “warm-up” stage is assumed in the prediction system. During preparatory stage, the system only collects historical data and it cannot provide any predicted information. The warm-up stage can last for one day or one week depending on the amount of information collected.

3.1.2. Determination of State Set

The location elements in the collected data are extracted and denoted as set L. As set L contains a number of location elements, the locations with higher visiting frequency are chosen as the state set of the system, denoted as set E. If there are m locations in the current scene, the state space can be denoted as $E = \{X_{1}, X_{2}, \dots, X_{m}\}$, and location i is the ith state $X_{i}$ of the Markov process.

3.1.3. Discretization of Data Set

Statistics related to state set E are gathered for all users. The data set of each user is then discretized into fixed time periods, so the set after discretization is denoted as follows:

\begin{matrix} \{(t_{k}, X_{i})\}, \quad k = 1, 2, 3, \dots, \quad i \in \{1, 2, 3, \dots, m\} . \end{matrix}

(5)

3.1.4. Calculation of 1-Order Transition Probability Matrix

Let $n_{i j}$ be the number of times that node A departs from location i for location j; then the probability of node A departing from location i for location j is denoted as

\begin{matrix} p_{i j} = \frac{n_{i j}}{n}, \end{matrix}

(6)

where n is the total number of times that node A departed from location i to visit other locations in the data set.

Therefore, supposing that there are m locations in the set, an $m \times m$ transition probability matrix is generated as

\begin{matrix} P = \begin{bmatrix} p_{11} & p_{12} & \dots & p_{1 m} \\ p_{21} & p_{22} & \dots & p_{2 m} \\ \vdots & \vdots & \ddots & \vdots \\ p_{m 1} & p_{m 2} & \dots & p_{m m} \end{bmatrix} . \end{matrix}

(7)
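Formulas (6) and (7) can be estimated from one node's discretized trajectory by counting consecutive pairs of states. In the sketch below, remaining at the same location between two time slots is counted as a transition from i to i, and a state that never occurs as a departure point receives a uniform row; both choices are assumptions, since the paper does not specify them.

```python
def transition_matrix(sequence, m):
    """Estimate the first-order transition matrix of formula (7).

    sequence: list of state indices in 0..m-1, one per time slot
    m: number of states (locations)
    """
    counts = [[0] * m for _ in range(m)]
    for i, j in zip(sequence, sequence[1:]):
        counts[i][j] += 1  # n_ij

    matrix = []
    for row in counts:
        n = sum(row)  # total departures from this state
        # Assumption: unseen departure states get a uniform row.
        matrix.append([c / n for c in row] if n else [1 / m] * m)
    return matrix


P = transition_matrix([0, 0, 1, 0, 1, 1], m=2)
```

For this toy sequence, state 0 is left three times (once to itself, twice to state 1), giving the first row (1/3, 2/3); the second row is (1/2, 1/2).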

Given $p_{j}^{(l)}$ as the probability of node on state $X_{j}$ at initial moment l and computing the probability of each state, the initial distribution of Markov chain can be obtained as

\begin{matrix} P (l) = (p_{1}^{(l)}, p_{2}^{(l)}, \dots, p_{m}^{(l)}) . \end{matrix}

(8)

For example, if the initial state is $X_{2}$, the initial distribution is $P (l) = (0, 1, 0, \dots, 0)$, and the absolute distribution at time $l + 1$ is

\begin{matrix} P (l + 1) = P (l) P = (p_{1}^{(l + 1)}, p_{2}^{(l + 1)}, \dots, p_{m}^{(l + 1)}) . \end{matrix}

(9)

3.1.5. Update the Result from the New Evidence

The different daily time periods (daily samples) are regarded as the evidence variable e, so the set of evidences can be defined as $EV = \{e^{r}\}$, $r = 1, 2, \dots, d$, where d is the total number of daily time period states. For example, if the day is broken into four daily samples $(d = 4)$, $EV$ is denoted as

\begin{matrix} EV = \{\text{a.m.}, \text{noon}, \text{p.m.}, \text{evening}\} . \end{matrix}

(10)

For each daily sample r, a diagonal matrix $O (e^{r})$ is defined as

\begin{matrix} O (e^{r}) = [o_{i j}], \quad o_{i j} = \begin{cases} 0, & \text{if } i \neq j \\ P (e^{r} | X = i), & \text{if } i = j, \end{cases} \end{matrix}

(11)

where $P (e^{r} | X = i) = P (e^{r}, X = i) / P (X = i) = n_{i}^{r} / n_{i}$; $n_{i}^{r}$ represents the number of times that the node arrives at location i in the daily sample r, and $n_{i}$ represents the total number of times that the node arrives at location i.

Based on all of the above, the probability of the node arriving at location i at the next time slot $l + 1$ is calculated using

\begin{matrix} P' (l + 1) = α O (e_{l + 1}^{r}) P (l + 1) = α O (e_{l + 1}^{r}) P (l) P . \end{matrix}

(12)

It can be considered that the state obtained by the system at time $l + 1$ is $X_{j}$ with $j = \arg\max_{j} \{{p'}_{j}^{(l + 1)}\}$, where ${p'}_{j}^{(l + 1)}$ is the jth component of $P' (l + 1)$.
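The evidence update of formulas (11) and (12) might look as follows in code. Because the evidence matrix is diagonal, applying it to the projected distribution reduces to an element-wise product followed by normalization. The visit-record format and both function names are hypothetical.

```python
def evidence_likelihoods(visits, m, periods):
    """Estimate the diagonal entries of formula (11).

    visits: list of (period, state) arrival records (assumed format)
    Returns {period: [P(period | X = i) for each state i]}.
    """
    totals = [0] * m                              # n_i
    by_period = {r: [0] * m for r in periods}     # n_i^r
    for r, i in visits:
        totals[i] += 1
        by_period[r][i] += 1
    return {
        r: [by_period[r][i] / totals[i] if totals[i] else 0.0 for i in range(m)]
        for r in periods
    }


def apply_evidence(projected, likelihood):
    """Formula (12): reweight the projected distribution and normalize."""
    weighted = [o * p for o, p in zip(likelihood, projected)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("evidence has zero probability for every state")
    updated = [w / total for w in weighted]
    best = max(range(len(updated)), key=updated.__getitem__)
    return best, updated


visits = [("am", 0), ("am", 0), ("pm", 0), ("am", 1), ("pm", 1), ("pm", 1)]
O = evidence_likelihoods(visits, m=2, periods=("am", "pm"))
state, dist = apply_evidence([0.5, 0.5], O["am"])
```

In the example, location 0 is visited mostly in the morning, so an evenly split projection is pulled towards location 0 when the next slot falls in the "am" sample.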

The formula above incorporates a one-step prediction, and it is easy to derive the following recursive computation for prediction of the state at $t + k + 1$ from a prediction for $t + k$ ; therefore the state $X_{t + k + 1}$ can be obtained by

\begin{matrix} X_{t + k + 1} = \arg\max \{P (X_{t + k + 1})\} . \end{matrix}

(13)
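The recursive multi-step prediction behind formula (13) can be illustrated by repeatedly projecting the distribution through the transition matrix and taking the most probable state. This sketch deliberately omits the evidence reweighting at intermediate steps, which is an assumption; the paper does not spell out whether evidence is applied at each step. The name `predict_ahead` is hypothetical.

```python
def predict_ahead(current, transition, steps):
    """Project a state distribution `steps` slots ahead and pick the argmax.

    current: row vector of state probabilities at the present slot
    transition: m x m transition probability matrix
    """
    m = len(current)
    dist = list(current)
    for _ in range(steps):
        dist = [
            sum(dist[i] * transition[i][j] for i in range(m)) for j in range(m)
        ]
    best = max(range(m), key=dist.__getitem__)
    return best, dist


state, dist = predict_ahead([0.0, 1.0], [[0.9, 0.1], [0.5, 0.5]], steps=2)
```

Starting from state 1, one step gives (0.5, 0.5) and a second step gives (0.7, 0.3), so state 0 is predicted two slots ahead.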

3.2. Social-Aware Prediction Optimization

In a participatory sensing system, a mobile node can be a social node carrying data acquisition equipment. Thus, social relationships are used to estimate the future locations of mobile nodes and to optimize the prediction result of the hidden Markov model.

The aim in this paper is to capture the evolution of social interactions in the different periods of time (daily samples) over consecutive days, by computing social strength based on the average duration of contacts.

Figure 4 shows how social interaction (from the point of view of user A) varies during a day. For instance, it indicates a daily sample (8 a.m.–12 p.m.) over which the social strength of user A to users B and C is much stronger (less intermittent line) than the strength to users D, E, and F. Figure 4 aims to show the dynamics of a social network over a one-day period, where different social structures lead to different behavior when a user moves towards the social community that the user is related to.

Figure 4

Contacts a user A has with a set of users in different daily samples $\Delta T_{i}$.

As illustrated in Figure 4, the total contact time of mobile nodes A and B during a daily sample $Δ T_{i}$ in a day k is denoted as

\begin{matrix} M_{i}^{k} = \sum_{c = 1}^{n} (t_{c}^{e} - t_{c}^{s}), c = 1,2, 3, \dots, n, \end{matrix}

(14)

where n is the number of contacts in $Δ T_{i}$, $t_{c}^{s}$ indicates the start time of the cth contact of mobile nodes A and B, and $t_{c}^{e}$ indicates the end time of the cth contact of mobile nodes A and B.

Hence the social strength between any pair of nodes A and B in $Δ T_{i}$ is denoted as

\begin{matrix} W {(A, B)}_{i} = \frac{\sum_{k = 1}^{m} M_{i}^{k}}{m \times Δ T_{i}}, \end{matrix}

(15)

where m is the total number of days in the historical record.
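Formulas (14) and (15) reduce to the fraction of the daily sample that two nodes spend in contact, averaged over the recorded days. A minimal sketch, assuming contacts are given as (start, end) pairs in the same time unit as the length of the daily sample; the function name and input layout are illustrative.

```python
def social_strength(contacts_by_day, period_length):
    """Social strength of a node pair in one daily sample, formula (15).

    contacts_by_day: one list per recorded day, each holding (start, end)
                     contact times inside the daily sample; days without
                     contact are empty lists
    period_length: duration of the daily sample, same unit as the times
    """
    m = len(contacts_by_day)
    # Sum of M_i^k over all days, with M_i^k from formula (14).
    total_contact = sum(
        end - start for day in contacts_by_day for start, end in day
    )
    return total_contact / (m * period_length)


# Two days, a 240-minute daily sample, 60 minutes of contact on each day.
strength = social_strength([[(0, 30), (100, 130)], [(10, 70)]], 240)
```

The result lies between 0 (never in contact) and 1 (in contact for the whole daily sample on every day), provided the contacts do not overlap.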

According to formula (15), the social relationship matrix of nodes in $Δ T_{i}$ can be obtained. On the basis of this relation matrix, mobile nodes can be partitioned into communities, each of which groups closely related nodes into a subgroup. Since only the pairwise proximity of users is taken into account, partition-based clustering methods, such as k-means and fuzzy c-means, are not applicable. Therefore a hierarchical clustering method, namely, complete linkage clustering [29], is used as the community partition algorithm.
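Complete linkage clustering on a social strength matrix can be sketched in plain Python as below. The stopping rule (merge only while the weakest cross-pair strength stays at or above a threshold `min_strength`) is an assumption, since the paper does not state how the hierarchy is cut into communities; the function name and input layout are likewise hypothetical.

```python
def complete_linkage(nodes, strength, min_strength):
    """Agglomerative clustering with complete linkage on social strengths.

    nodes: iterable of node ids
    strength: {frozenset({a, b}): social strength}; missing pairs count as 0
    min_strength: assumed cut-off below which clusters are not merged
    """
    clusters = [{n} for n in nodes]

    def link(a, b):
        # Complete linkage: two clusters are only as close as their weakest pair.
        return min(strength.get(frozenset((x, y)), 0.0) for x in a for y in b)

    while len(clusters) > 1:
        best, i, j = max(
            (link(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < min_strength:
            break
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


strength = {
    frozenset("AB"): 0.8,
    frozenset("AC"): 0.6,
    frozenset("BC"): 0.7,
    frozenset("CD"): 0.1,
}
communities = complete_linkage("ABCD", strength, min_strength=0.5)
```

Only a pairwise strength matrix is needed, which is why this family of methods fits data where nodes have no coordinates. In the toy example A and B merge first, C then joins them, and D stays alone.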

Suppose that social relationships are used to calculate the probability of node A arriving at location i ($i = 1, 2, \dots, m$) in the next period. Given that node A belongs to community C and that the set of other nodes belonging to C at location i in the current time slot is denoted as $S = \{S_{1}, \dots, S_{j}, \dots, S_{n}\}$, where $S \subseteq C$, the following formula is obtained from the definition of conditional probability:

\begin{matrix} P_{i} (A | S_{j}) = \frac{P_{i} (A, S_{j})}{P_{i} (S_{j})}, \quad j = 1, \dots, n, \end{matrix}

(16)

where $P_{i} (A | S_{j})$ represents the probability of node A arriving at location i on the condition that node $S_{j}$ is already at location i; $P_{i} (S_{j})$ represents the probability that node $S_{j}$ stays at location i, which can be obtained from the Markov model; and $P_{i} (A, S_{j})$ represents the encounter probability of node A and node $S_{j}$ at location i, which is defined as

\begin{matrix} P_{i} (A, S_{j}) = \frac{f_{i} (A, S_{j})}{\sum_{i = 1}^{m} f_{i} (A, S_{j})}, \end{matrix}

(17)

where $f_{i} (A, S_{j})$ represents the number of encounters between node A and node $S_{j}$ at location i.

Given the relationship weight of node A and node $S_{j}$ as $W (A, S_{j}) = ρ_{j}$, the probability of node A arriving at location i at the next time slot is

\begin{matrix} P_{i} (A) = \sum_{j = 1}^{n} λ_{j} P_{i} (A | S_{j}), \quad λ_{j} = \frac{ρ_{j}}{\sum_{j = 1}^{n} ρ_{j}}, \end{matrix}

(18)

where $λ_{j}$ is the weight of each conditional probability, obtained by normalization so that $\sum_{j = 1}^{n} λ_{j} = 1$.
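Formulas (16) to (18) can be combined into one small function. The sketch assumes that encounter counts, staying probabilities, and relationship weights are already available as dictionaries keyed by the community members currently at the location; these structures, the node and place names, and the function name are hypothetical.

```python
def social_probability(location, encounters, stay_prob, strength):
    """Probability of node A arriving at `location`, formula (18).

    encounters[s][i]: number of encounters of A and s at location i
    stay_prob[s]:     probability that s stays at `location` (Markov model)
    strength[s]:      relationship weight rho between A and s
    All dictionaries are keyed by community members now at `location`.
    """
    rho_total = sum(strength.values())
    result = 0.0
    for s, rho in strength.items():
        f = encounters[s]
        joint = f.get(location, 0) / sum(f.values())   # formula (17)
        conditional = joint / stay_prob[s]             # formula (16)
        result += (rho / rho_total) * conditional      # formula (18)
    return result


p_lab = social_probability(
    "lab",
    encounters={"B": {"lab": 6, "cafe": 4}, "C": {"lab": 2, "cafe": 8}},
    stay_prob={"B": 0.8, "C": 0.5},
    strength={"B": 0.6, "C": 0.2},
)
```

Because the encounter probability and the staying probability are estimated separately, the ratio in formula (16) is not guaranteed to stay below 1 in this sketch; a practical implementation might cap or renormalize the values across locations.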

According to the location distribution of all the nodes belonging to C, the probability of node A arriving at each location can be obtained. This result is combined with the prediction of the hidden Markov model through the weighting formula (19), which gives the probability distribution of node A over all locations in the location set; the location with the maximum visiting probability is taken as the output of the prediction algorithm:

\begin{matrix} P_{i} = P_{i}^{HMM} + d (P_{i}^{social} - P_{i}^{HMM}), \end{matrix}

(19)

where $P_{i}^{HMM}$ is the prediction probability of state $X_{i}$ using the hidden Markov model, $P_{i}^{social}$ is the prediction probability of location i based on social relationships, and d is the damping factor, which is defined as the probability that the social relation between the nodes helps improve the accuracy of the prediction. This means that the higher the value of d is, the more the algorithm accounts for the social relation between the nodes.
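Formula (19) is a linear interpolation between the two predictions, which a few lines of code make explicit. The function name `combine_predictions` and the example numbers are illustrative assumptions; d is restricted to the interval from 0 to 1 because the text defines it as a probability.

```python
def combine_predictions(hmm, social, d):
    """Blend HMM and social predictions per formula (19) and pick the argmax.

    hmm, social: lists of per-location probabilities of equal length
    d: damping factor in [0, 1]
    """
    if not 0.0 <= d <= 1.0:
        raise ValueError("damping factor d must lie in [0, 1]")
    combined = [h + d * (s - h) for h, s in zip(hmm, social)]
    best = max(range(len(combined)), key=combined.__getitem__)
    return best, combined


best, combined = combine_predictions(
    hmm=[0.5, 0.3, 0.2], social=[0.2, 0.6, 0.2], d=0.5
)
```

With d = 0 the output equals the HMM prediction and with d = 1 it equals the social prediction; in the example, d = 0.5 gives (0.35, 0.45, 0.20), so the predicted location changes from the first to the second.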

It is beneficial to use social relationships to optimize the prediction result, as doing so makes the transition probability matrix sparse and improves the accuracy of the prediction model.

4. Experimental Analyses

4.1. Simulation Configuration

In this paper, the experimental data come from the dataset provided by Wireless Topology Discovery (WTD) [30], from which two months of data, 13,215,412 items in total, are chosen to simulate the prediction algorithm. There are 275 nodes and 524 APs (access points) in the dataset. The number of locations into which APs are clustered according to the vicinity of their positions is shown in Figure 5.

Figure 5

Quantity of locations.

Figure 5 shows that as the defined granularity λ becomes larger, the number of locations in the resulting scenario also becomes larger. When λ is defined as 1, the number of locations is equal to the total number of APs. The location granularity λ is set to 0.5 in the following experiments.

4.2. Similar User Clustering

In order to predict the future location of mobile nodes using social relationships, the social network structure in the system must first be considered. Based on the quantization formula (15), we calculate the relation strength between any pair of nodes A and B in the different daily samples (a.m., noon, p.m., and evening), and the resulting social network structures of the dataset are illustrated in Figure 6.

Figure 6

Social network structures of the dataset in different daily samples.

A hierarchical clustering method, complete linkage clustering, has been used to cluster mobile users. Figure 6(a) shows the social network clustering result in the a.m. period, and the clustering structures in the period of noon, p.m. and evening are, respectively, illustrated in Figures 6(b), 6(c), and 6(d).

4.3. Prediction Accuracy

In order to evaluate the accuracy of the prediction model, the processed node locations are divided into two parts: 50% of the original data is used to train the Markov model, and the rest is used as the test set for the prediction model. The prediction precision $P_{\mathrm{result}}$ is denoted as

\begin{matrix} P_{\mathrm{result}} = \frac{\sum_{i = 1}^{n} \mathrm{accuracy}_{i}}{n} . \end{matrix}

(20)

In formula (20), n represents the number of predictions, and $\mathrm{accuracy}_{i}$ is the result of the ith prediction, denoted as

\begin{matrix} \mathrm{accuracy}_{i} = \begin{cases} 1, & \text{when the result is right} \\ 0, & \text{when the result is wrong} . \end{cases} \end{matrix}

(21)
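Formulas (20) and (21) describe the fraction of correct predictions. A minimal sketch, with a hypothetical function name and toy location labels:

```python
def prediction_precision(predicted, actual):
    """Fraction of predictions that match the observed location, formula (20)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    # Each comparison is the 0/1 indicator of formula (21).
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(predicted)


precision = prediction_precision(
    ["lab", "cafe", "lab", "gym"], ["lab", "cafe", "gym", "gym"]
)
```

Three of the four predictions in the example match, so the precision is 0.75.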

Firstly, the training data set is used to train the prediction models, which include the standard Markov model (SMM) and the daily-routine-based prediction model (MLPR). Afterward the test cases are used to verify each of the two models. The prediction accuracies of the two models are shown in Figure 7, where Figure 7(a) shows the prediction accuracy of nodes 1 to 92, Figure 7(b) shows the prediction accuracy of nodes 93 to 184, and Figure 7(c) shows the prediction accuracy of nodes 185 to 275. Figure 7 indicates that the daily-routine-based mobile node location prediction algorithm (MLPR) performs better than the standard Markov model. This shows that taking daily routines into account improves the accuracy and the performance of the algorithm.

Figure 7

Prediction precision of SMM and MLPR.

Next, the proposed social-relationship-based mobile node location prediction algorithm using daily routines (SMLPR) is compared with O2MM and $\mathrm{SMLP}_{N}$. Among these algorithms, the second order Markov predictor (O2MM) has the best performance among order-k Markov predictors [15], and the social-relationship-based mobile node location prediction algorithm ($\mathrm{SMLP}_{N}$) performs as well as or better than O2MM, as reported in previous work [31]. The comparative result is shown in Figure 8, which indicates that SMLPR, by combining daily routines with social relationships, achieves a higher accuracy than O2MM and $\mathrm{SMLP}_{N}$. Figure 9 shows the number of users in different precision ranges for SMM, O2MM, $\mathrm{SMLP}_{N}$, and SMLPR, and it illustrates that SMLPR places the largest number of nodes in the higher precision ranges. For instance, the number of nodes with accuracy greater than 90% is 198 for SMLPR and 114 for O2MM, whereas SMM achieves only 55 nodes.

Figure 8

Prediction precision of O2MM, $\mathrm{SMLP}_{N}$, and SMLPR.

Figure 9

The number of users in different precision range.

Lastly, the performance of these algorithms is summarized in Table 1. The accuracy of SMLPR (0.9014) is about 28 percentage points higher than that of the standard Markov model (0.6164) and about 7 percentage points higher than that of the second order Markov model (0.8275). SMLPR also outperforms $\mathrm{SMLP}_{N}$ (0.8488).

Table 1

The algorithm performance comparison.

Metric	SMM	O2MM	$\mathrm{SMLP}_{N}$	SMLPR
Prediction accuracy	0.6164	0.8275	0.8488	0.9014
Time complexity	$O (N)$	$O (N^{2})$	$O (N)$	$O (N)$
Storage space	$O (N^{2})$	$O (N^{3})$	$O (N^{2})$	$O (N^{2})$

In terms of cost, Table 1 shows that the time complexity of SMLPR is $O (N)$ while that of O2MM is $O (N^{2})$, and the storage demand of SMLPR is $O (N^{2})$ while that of O2MM is $O (N^{3})$. This indicates that SMLPR achieves better performance than the order-2 Markov predictor at much lower expense and that SMLPR is more practical than the order-2 Markov predictor in the WLAN scenario.

4.4. Impact of Location Granularity

In a location-based mobility scenario, location granularity may have a significant influence on the prediction accuracy. In order to evaluate the impact of location granularity, the algorithms' performance is tested by adjusting the granularity value λ, and the result is shown in Figure 10.

Figure 10

The influence of location granularity to prediction accuracy.

As shown in Figure 10, as the location granularity λ increases, the number of locations in the scenario grows and the average accuracies of the four algorithms decrease. Among these algorithms, SMM and O2MM are affected more strongly by the location factor, and their accuracy reduces to approximately 25%. SMLPR shows a relatively moderate downward trend, and the effect of location granularity on SMLPR is not very pronounced.

5. Conclusion

In this paper, the influence of opportunistic characteristics in participatory sensing systems is introduced, and the problems of sensing nodes, such as intermittent connection, limited communication period, and heterogeneous distribution, are analyzed. This paper focuses on the mobility model of nodes in participatory sensing systems and proposes a mobile node location prediction algorithm that uses users' daily routines and is based on the social relationships between mobile nodes. According to the historical trajectories of mobile nodes, the state transition matrix is constructed with locations as the transition states, and a hidden Markov model is used to predict the mobile node location over a certain duration. Meanwhile, the social relationships between nodes are exploited to optimize and amend the prediction model. The prediction model is tested on the WTD data set and shown to be effective.

Footnotes

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant no. 61272529; the National Science Foundation for Distinguished Young Scholars of China under Grant nos. 61225012 and 71325002; the Ministry of Education-China Mobile Research Fund under Grant no. MCM20130391; the Specialized Research Fund of the Doctoral Program of Higher Education for the Priority Development Areas under Grant no. 20120042130003; the Fundamental Research Funds for the Central Universities under Grant nos. N120104001 and N130817003; and the Liaoning BaiQianWan Talents Program under Grant no. 2013921068.

References

1. Srivastava, Hansen, Burke. Wireless urban sensing systems. Tech. Rep. 65, Center for Embedded Networked Sensing at UCLA, 2006.

2. Eisenman S. B., Miluzzo, Lane N. D., Peterson R. A., Ahn G.-S., Campbell A. T. BikeNet: a mobile sensing system for cyclist experience mapping. ACM Transactions on Sensor Networks, vol. 6, no. 1, article 6, 2009. doi:10.1145/1653760.1653766

3. Pan, Lane N. D., Choudhury, Campbell A. T. SoundSense: scalable sound sensing for people-centric applications on mobile phones. Proceedings of the 7th ACM International Conference on Mobile Systems, Applications, and Services (MobiSys ′09), pp. 165–178, Kraków, Poland, June 2009. doi:10.1145/1555816.1555834

4. Miluzzo, Lane N. D., Fodor, Peterson, Musolesi, Eisenman S. B., Zheng, Campbell A. T. Sensing meets mobile social networks: the design, implementation and evaluation of the CenceMe application. Proceedings of the 6th ACM Conference on Embedded Networked Sensor Systems (SenSys ′08), pp. 337–350, Raleigh, NC, USA, November 2008. doi:10.1145/1460412.1460445

5. Miluzzo, Lane, Eisenman, Campbell. CenceMe - injecting sensing presence into social networking applications. In Smart Sensing and Context, Kortuem, Finney, Lea, Sundramoorthy, Eds., vol. 4793, pp. 1–28, 2007. doi:10.1007/978-3-540-75696-5_1

6. Eisenman S. B., Lane N. D., Miluzzo. MetroSense project: people-centric sensing at scale. Proceedings of the Workshop on World-Sensor-Web, pp. 6–11, Boulder, Colo, USA, 2006.

7. Lane N. D., Eisenman S. B., Campbell A. T. Bubble-sensing: binding sensing tasks to the physical world. Pervasive and Mobile Computing, vol. 6, no. 1, pp. 58–71, 2010. doi:10.1016/j.pmcj.2009.10.005

8. Deng, Cox L. P. Live compare: grocery bargain hunting through participatory sensing. Proceedings of the 10th Workshop on Mobile Computing Systems and Applications (HotMobile ′09), Santa Cruz, Calif, USA, February 2009. doi:10.1145/1514411.1514415

9. Kanjo. NoiseSPY: a real-time mobile phone platform for urban noise monitoring and mapping. Mobile Networks and Applications, vol. 15, no. 4, pp. 562–574, 2010. doi:10.1007/s11036-009-0217-y

10. Perez A. J., Labrador M. A., Barbeau S. J. G-Sense: a scalable architecture for global sensing and monitoring. IEEE Network, vol. 24, no. 4, pp. 57–64, 2010. doi:10.1109/mnet.2010.5510920

11. Oliveira L. M. L., Rodrigues J. J. P. C., Elias A. G. F., Han. Wireless sensor networks in IPv4/IPv6 transition scenarios. Wireless Personal Communications, vol. 78, no. 4, pp. 1849–1862, 2014. doi:10.1007/s11277-014-2048-9

12. Song, Blumm, Barabási A.-L. Limits of predictability in human mobility. Science, vol. 327, no. 5968, pp. 1018–1021, 2010.

13. González M. C., Hidalgo C. A., Barabási A.-L. Understanding individual human mobility patterns. Nature, vol. 453, no. 7196, pp. 779–782, 2008. doi:10.1038/nature06958

14. Qin S.-M., Verkasalo, Mohtaschemi, Hartonen, Alava. Patterns, entropy, and predictability of human mobility and life. PLoS ONE, vol. 7, no. 12, Article ID e51353, 2012. doi:10.1371/journal.pone.0051353

15. Song, Kotz, Jain. Evaluating location predictors with extensive Wi-Fi mobility data. Proceedings of the 23rd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ′04), pp. 1414–1424, 2004. doi:10.1109/INFCOM.2004.1357026

16. Scellato, Musolesi, Mascolo, Latora, Campbell A. T. NextPlace: a spatio-temporal prediction framework for pervasive systems. In Pervasive Computing, vol. 6696 of Lecture Notes in Computer Science, pp. 152–169, Springer, Berlin, Germany, 2011. doi:10.1007/978-3-642-21726-5_10

17. Mathew, Raposo, Martins. Predicting future locations with hidden Markov models. Proceedings of the 14th International Conference on Ubiquitous Computing (UbiComp ′12), pp. 911–918, September 2012.

18. Mozer M. C. The neural network house: an environment that adapts to its inhabitants. Proceedings of the AAAI Spring Symposium, pp. 110–114, Stanford, Calif, USA, 1998.

19. Karimi H. A., Liu. A predictive location model for location-based services. Proceedings of the 11th ACM International Symposium on Advances in Geographic Information Systems (GIS ′03), pp. 126–133, New Orleans, La, USA, November 2003.

20. Patterson J. D., Liao, Fox. Inferring high-level behavior from low-level sensors. Proceedings of the 5th Annual Conference on Ubiquitous Computing (UbiComp ′03), pp. 73–89, Seattle, Wash, USA, 2003.

21. Zhu, Wang, Han, Rodrigues J. J. P. C., Lloret. LPTA: location predictive and time adaptive data gathering scheme with mobile sink for wireless sensor networks. The Scientific World Journal, vol. 2014, Article ID 476253, 13 pages, 2014. doi:10.1155/2014/476253

22. Zhu, Wang, Han, Rodrigues J. J. P. C., Guo. A location prediction based data gathering protocol for wireless sensor networks using a mobile sink. Proceedings of the 2nd Smart Sensor Networks and Algorithms (SSPA ′14), Co-Located with 13th International Conference on Ad Hoc, Mobile, and Wireless Networks (Ad Hoc ′14), Benidorm, Spain, June 2014.

23. Y.-B., Fan S.-D., Hao Z.-X. Whole trajectory modeling of moving objects based on MOST model. Computer Engineering, vol. 34, no. 16, pp. 41–43, 2008.

24. Han, Zhang, Lloret, Shu, Rodrigues J. J. P. C. A mobile anchor assisted localization algorithm based on regular hexagon in wireless sensor networks. The Scientific World Journal, vol. 2014, Article ID 219371, 13 pages, 2014. doi:10.1155/2014/219371

25. Han, Jiang, Shu, Chilamkurti. The insights of localization through mobile anchor nodes in wireless sensor networks with irregular radio. KSII Transactions on Internet and Information Systems, vol. 6, no. 11, pp. 2992–3007, 2012.

26. Han, Zhang, Liu, Shu. MANCL: a multi-anchor nodes cooperative localization algorithm for underwater acoustic sensor networks. Wireless Communications and Mobile Computing. In press.

27. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society, vol. 7, pp. 48–50, 1956. doi:10.1090/S0002-9939-1956-0078686-7

28. Cho, Myers S. A., Leskovec. Friendship and mobility: user movement in location-based social networks. Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ′11), pp. 1082–1090, ACM, August 2011. doi:10.1145/2020408.2020579

29. Punj, Stewart D. W. Cluster analysis in marketing research: review and suggestions for application. Journal of Marketing Research, vol. 20, no. 2, pp. 134–148, 1983. doi:10.2307/3151680

30. McNett, Voelker G. M. UCSD Wireless Topology Discovery Project [EB/OL]. 2013, http://www.sysnet.ucsd.edu/wtd/wtd.html

31. Xia. Social-relationship-based mobile node location prediction algorithm in participatory sensing systems. Chinese Journal of Computers, vol. 35, no. 6, 2014.