Abstract
As location-based services have become popular, the user locations they expose have raised serious privacy concerns. A typical measure for location privacy is to report a blurred location and to ensure that other users coexist in the reported region. However, additional knowledge about the user's maximum speed and the territorial information in the user's vicinity can allow an adversary to effectively compromise the user's location privacy. In this paper, we present an anonymization algorithm that effectively counters such attacks while satisfying the k-anonymity requirement as well as a minimum acceptable cloaked region size. We evaluate our anonymization scheme using state-of-the-art simulators for both vehicular and pedestrian movements. The experimental results demonstrate the effectiveness and efficiency of the proposed algorithm.
1. Introduction
With advances in technologies such as wireless communication, the global positioning system (GPS), and cellular networks, location-based services (LBSs) are now available anywhere, anytime. They attract millions of mobile users by offering valuable and important services. Common examples include point-of-interest search (e.g., finding the closest hospital for heart patients), traffic monitoring (e.g., congestion warnings reported from probe vehicles), location-aware social networking (e.g., users sharing their locations with friends through Facebook Places or Google Latitude), and location-based advertising (e.g., distributing 50%-off coupons to all customers within two kilometers of a store). However, because LBSs need users' locations, as well as user profiles, to increase the value of their services, problems arise if the service provider is not trusted to respect user privacy. For example, an adversary could obtain user location information from the service provider to locate and track users. Users therefore endanger their privacy by sending their location information directly to an LBS.
One may attempt to preserve users' privacy by hiding their identities, for example, by using a pseudonym instead of an identity. However, this is not enough, because a user can be reidentified even from anonymized location data [1]. To process location-based requests, the LBS needs the location of the user. An attacker, which could be the LBS itself, can infer the identity of the user by associating the location and the request time with a particular individual. This can easily be done in practice. For example, if a user reports her or his location at 3 am every Wednesday, an attacker may infer who she or he is with the help of a public telephone directory. Another approach to protecting the user identity is to employ the concept of location k-anonymity [2]. This approach tries to find a set of at least k users such that the user is indistinguishable from the other users in the set.
In addition to their identity, users want to hide their exact location. Protecting user privacy with location k-anonymity alone, however, is not enough when the k users are located within an area small enough to pinpoint the user. Location obfuscation approaches [3–5] decrease the precision of a reported position so that the attacker receives only coarse-grained position information. A cloaked region, which is larger than a user-specified threshold called the minimum acceptable area, is sent to the service provider instead of the exact location. The attacker then knows that the user is located in the cloaked region but has no clue where exactly the user is situated. In this paper, we aim to protect both the user identity and the user location by combining two methods: location k-anonymity and location obfuscation.
Consider an attacker Bob who wants to keep track of his teenage daughter Alice. One day, on the way to a bar, Alice would like to find the nearest gas station to fuel her car. However, she does not want to disclose her location, so she reports a cloaked region to the LBS instead of her exact coordinates. In the bar, she makes another query to see if any of her friends nearby is interested in joining her, again hiding her coordinates, because she also does not want her father to know that she is in a bar after school. By accessing the service provider's data somehow, Bob may obtain the cloaked location information of Alice's two LBS queries. From the first query, he can infer with high probability that Alice is driving in the city, so her speed cannot exceed 30 mph. Using this maximum speed, he determines the maximum movement boundary (MMB), that is, the area Alice can have reached by the time of the second query. Knowing the area map and the MMB, Bob can then narrow the obfuscation area down to certain locations (i.e., the bar) by removing all unreachable regions. We call this type of attack the MMB attack with constrained movement (MMB-CM). The MMB attack has been studied in previous work [6–8]. However, the existing approach works only in open-space environments, and its effectiveness is significantly limited under constrained movement. An attack based on knowledge of reachable and unreachable areas in the victim's vicinity was mentioned in [9], but combinations of different types of attacks have rarely been considered [10]. We make the following contributions in this paper:
We propose URALP (Unreachable Region Aware Location Privacy), a location-cloaking algorithm against MMB-CM attack. For the first time in the literature, we evaluate location privacy algorithms using state-of-the-art transportation and pedestrian simulators for realistic vehicle and pedestrian movements. We show that the proposed anonymization algorithm is efficient and effective. In particular, we show that the algorithm achieves near-optimal entropy for the user locations.
The rest of this paper is structured as follows. In Section 2, we describe the system architecture, the attack model, MMB-CM attack, and privacy goals. Then we propose our cloaking algorithm in Section 3. Section 4 presents the performance evaluation results of our proposed algorithm. We discuss related work in Section 5, and we conclude in Section 6.
2. Problem Model
In this section, we describe the system architecture, the attack model, MMB-CM attack, and privacy goals.
2.1. System Architecture
For the system architecture, we adopt a three-tier model comprising a mobile device, a trusted anonymizer, and an LBS provider (Figure 1). We assume that the user does not trust the LBS provider to respect user privacy. Therefore, when the user wants to submit an LBS query to the provider, the user instead contacts the anonymizer. On behalf of the user, the anonymizer performs the following tasks: (1) receiving an LBS query from the user, containing the user identity, location, and privacy requirements, including the privacy level k and the minimum required area

System architecture.
Our system architecture supports both stateful and stateless services. When each query needs to be linked to previous queries by the provider (stateful), the anonymizer keeps the same pseudonym for the same user. When each query is independent (stateless), a fresh pseudonym is generated for each query.
2.2. Adversary Model
Any party with the following capabilities is a potential attacker:
(i) being able to access all or some anonymized LBS queries, including pseudonym, time, and cloaked region;
(ii) having knowledge of the upper bound of the victim's moving speed;
(iii) having knowledge of reachable and unreachable areas in the victim's vicinity.
For example, the attacker can be a malicious LBS provider or anyone who can access the provider's system such as a law enforcement agency or an intruder. We assume that the user trusts the anonymization server.
The upper bound of the victim's moving speed can be estimated from the victim's means of transportation and the road speed limits. If the user is driving in New York, for example, the speed limit is 40 km/h in residential areas, 104.6 km/h on freeways, and 88.5 km/h in rural areas [11]. The vehicle type can also help the attacker estimate the maximum speed more precisely. If the user is traveling on foot, the moving speed is unlikely to exceed 6 km/h.
An unreachable area is an area the victim is unlikely, if not unable, to move through. For example, a driving user can only travel on vehicular roads or parking lots, so one can easily identify unreachable areas for such a user from the area map. If the user is a pedestrian, unreachable regions are places where people cannot walk, such as water, vehicular roads, train tracks, or dangerous areas.
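The adversary model above combines two pruning criteria: the MMB derived from the speed bound and the reachability map. The following Python sketch illustrates the idea under assumed data structures (the function name, the grid encoding, and all parameters are ours, not the paper's):

```python
import math

def mmb_cm_prune(region1_center, region2_cells, v_max, dt, reachable, cell_size):
    """Return the cells of the second cloaked region the victim can occupy.

    region1_center : (x, y) center of the first cloaked region, in meters
    region2_cells  : iterable of (row, col) grid cells of the second region
    v_max          : victim's maximum speed, in m/s
    dt             : time elapsed between the two queries, in seconds
    reachable      : mapping (row, col) -> bool, True if the cell is reachable
    cell_size      : side length of a grid cell, in meters
    """
    radius = v_max * dt  # MMB radius around the first region's center
    survivors = []
    for (r, c) in region2_cells:
        # cell center in meters
        cx, cy = (c + 0.5) * cell_size, (r + 0.5) * cell_size
        dist = math.hypot(cx - region1_center[0], cy - region1_center[1])
        # keep only cells both inside the MMB and marked reachable
        if dist <= radius and reachable.get((r, c), False):
            survivors.append((r, c))
    return survivors
```

The fewer cells survive this pruning, the more precisely the attacker can localize the victim, which is exactly what the cloaking algorithm in Section 3 must prevent.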
2.3. MMB-CM Attack
We consider the following MMB-CM attack (Figure 2). Suppose the attacker wants to know the location of the user A. At time

Example of MMB attack with constrained movement.
2.4. Privacy Goals
We adopt k-anonymity as the privacy model. However, k-anonymity by itself is not enough to protect the user's location privacy. For example, even if more than k users are located in the same cloaked region, location privacy is compromised if the cloaked region is so small that the user is uncomfortable revealing it. In summary, we aim to achieve the following privacy goals to meet the user's privacy requirements.
(i) Every successfully anonymized request contains a cloaked region in which at least k users are located.
(ii) Every cloaked region should be larger than or equal to the minimum acceptable area size.
3. Anonymization Algorithm
In this section, we describe our anonymization algorithm.
3.1. Overview
For a newly arrived request, we find a set of k requests (including the new one), submitted by k different users, to be anonymized together. These users report the same cloaked region to the LBS so that each user is indistinguishable from the others. We thus aim to find a set of users and a cloaked region that satisfy the following conditions:
(1) The size of the set meets the k-anonymity requirements of all the users in the set.
(2) The cloaked region meets the minimum acceptable area requirement of all the users in the set.
(3) The cloaked region is contained inside the MMBs of all the users in the set.
The first and second conditions are necessary to meet the privacy goals described in Section 2.4. The last condition is also necessary: suppose the cloaked region is not contained in the MMB of some user in the set; then the part of the cloaked region outside that MMB should be removed, since the user cannot appear in that part.
We find such a k-set as follows. First, we find k requests from k different users such that each user's location is contained in the MMBs of all the other users in the set.
In the following, we describe the algorithm in more detail.
3.2. Finding k-Anonymity Set
All requests waiting for cloaking (called alive requests) are modeled as an undirected graph, in which each node represents an alive request and an edge connects two requests whose users' locations lie within each other's MMBs.

Illustration of graph model.
Given such a graph of alive requests, we can find a k-anonymity set by identifying the maximum clique (or max-clique) in the graph containing the new request node [12]. This ensures that each user is contained in the MMB of every other. We accept this anonymity set if its size is at least the k-anonymity requirement of each user.
Once the anonymity set is found, we set the initial cloaked region to the minimum bounding rectangle (MBR) of all the users in the clique. In Figure 3, v, w, and t form a max-clique, and thus a k-anonymity set.
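The graph model and the initial MBR can be sketched in a few lines of Python. The data layout below (a request as an `(id, x, y, mmb)` tuple with an axis-aligned MMB box) is an assumption for illustration, not the paper's implementation:

```python
from itertools import combinations

def inside(box, x, y):
    """True if point (x, y) lies in the axis-aligned box (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax

def build_edges(requests):
    """Edge between two requests iff each user lies inside the other's MMB."""
    edges = set()
    for a, b in combinations(requests, 2):
        if inside(a[3], b[1], b[2]) and inside(b[3], a[1], a[2]):
            edges.add(frozenset((a[0], b[0])))
    return edges

def is_clique(ids, edges):
    """A candidate anonymity set must be pairwise connected, i.e., a clique."""
    return all(frozenset(p) in edges for p in combinations(ids, 2))

def mbr(requests, ids):
    """Initial cloaked region: minimum bounding rectangle of the clique members."""
    pts = [(x, y) for rid, x, y, _ in requests if rid in ids]
    xs, ys = zip(*pts)
    return (min(xs), min(ys), max(xs), max(ys))
```

For the example of Figure 3, three mutually MMB-covered requests v, w, and t would form a triangle in this graph, and `mbr` would yield their initial cloaked region.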
The cloaking engine processes each arriving request from mobile users in three steps. The first step, called detection, updates all max-cliques containing the new request. The second step, called generation, generates candidate cloaked regions that satisfy the privacy requirements from the max-cliques; the candidate that achieves the best utility is chosen as the cloaked region. In the last step, called updating, the graph is updated: if the request is cloaked successfully, or fails to be cloaked because its life time expires, it is dropped. In the following sections, we detail each step of our algorithm.
3.3. Detection
Upon arrival of a new request, we first add the new node to the graph and detect all max-cliques containing it, incrementally, using the max-clique update algorithm in [6]. Finally, because we need a clique of at least k nodes, we add to the max-clique set only those max-cliques whose number of users is at least the privacy level k of the new request.
3.4. Generation
After updating all max-cliques, we generate the cloaked regions that meet all the requirements as follows:
(1) The number of users in the candidate cloaked region has to be at least the privacy level of every user in the region.
(2) The area of the MBR, excluding the unreachable region, has to be at least the minimum acceptable area of every user in the region.
First, we sort the max-clique set in descending order of clique size. For each max-clique, we generate a candidate cloaked region as the MBR of the cells containing all the clique members (see Figure 4) and check whether the two conditions are met. If the first condition is not satisfied, that is, the clique size is less than some member's privacy level k, we repeatedly remove the member with the highest k from the clique until the condition is satisfied. If the second condition is not met, that is, the current cloaked region is smaller than some member's minimum acceptable area, the MBR is expanded as follows.

Display unreachable region and the MBR in the grid.
At each step, we extend the MBR by one cell width in whichever of the four directions increases the reachable area the most, but never beyond the intersection of the MMBs (see Figure 5). We repeat this step until the reachable area of the MBR is at least the minimum acceptable area. If either of the two conditions cannot be satisfied, the anonymization fails.

Display reachable area outside the MBR at the first expanding step.
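The greedy expansion step can be sketched as follows. This is an illustrative Python version under assumed representations (cell-aligned boxes `(rmin, cmin, rmax, cmax)`, a 0/1 `reachable` grid, and `bound` for the intersection of the MMBs); it is not the paper's implementation:

```python
def reachable_area(box, reachable, cell_area):
    """Total reachable area inside a cell-aligned box."""
    rmin, cmin, rmax, cmax = box
    cells = sum(reachable[r][c]
                for r in range(rmin, rmax + 1)
                for c in range(cmin, cmax + 1))
    return cells * cell_area

def expand_mbr(box, a_min, reachable, bound, cell_area):
    """Grow `box` one cell-width at a time toward the direction adding the most
    reachable area, staying inside `bound`, until the reachable area inside the
    box reaches a_min. Returns the final box, or None on failure."""
    rmin, cmin, rmax, cmax = box
    brmin, bcmin, brmax, bcmax = bound
    while reachable_area((rmin, cmin, rmax, cmax), reachable, cell_area) < a_min:
        moves = []  # (reachable-area gain, candidate box) per direction
        if rmin > brmin:
            moves.append((sum(reachable[rmin - 1][c] for c in range(cmin, cmax + 1)),
                          (rmin - 1, cmin, rmax, cmax)))
        if rmax < brmax:
            moves.append((sum(reachable[rmax + 1][c] for c in range(cmin, cmax + 1)),
                          (rmin, cmin, rmax + 1, cmax)))
        if cmin > bcmin:
            moves.append((sum(reachable[r][cmin - 1] for r in range(rmin, rmax + 1)),
                          (rmin, cmin - 1, rmax, cmax)))
        if cmax < bcmax:
            moves.append((sum(reachable[r][cmax + 1] for r in range(rmin, rmax + 1)),
                          (rmin, cmin, rmax, cmax + 1)))
        if not moves:
            return None  # box already fills the MMB intersection: fail
        gain, best = max(moves)
        rmin, cmin, rmax, cmax = best
    return (rmin, cmin, rmax, cmax)
```

Because every iteration grows the box and the box can never leave the finite `bound`, the loop always terminates, either at a region satisfying the minimum acceptable area or with a failure.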
Algorithm 1 gives the pseudocode to generate the cloaked region. The procedure for extending the MBR is described in Algorithm 2.
Algorithm 1 (cloaked-region generation; most symbols were lost in extraction): sort the max-clique set; for each candidate clique, find max_k, min_k, and the maximum area requirement; if the requirements hold, add the region returned by Algorithm 2 to the candidate set; otherwise, sort the requests, drop the request with the highest k, update the clique, and retry.
Algorithm 2 (MBR extension; symbols likewise lost): while the gain in reachable area is maximal and the MBR has not reached the intersection of the MMBs, extend the MBR and update its reachable area.
We define MCSet to be max-clique set involving the new request u, CR being the cloaked region, canCS being each max-clique in MCSet, max_k (min_k) being the maximum (minimum) privacy level k of all users in canCS, max_
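Since the pseudocode listing lost most of its symbols in extraction, the following Python sketch restates the generation loop under assumed names (`make_region`, `region_area`, and `extend` are hypothetical hooks; `extend` stands in for Algorithm 2):

```python
def generate_cloaked_region(cliques, new_req, make_region, region_area, extend):
    """Try candidate cliques, largest first, returning (members, region) or None.

    Each request is a dict with fields "k" (privacy level) and "a_min"
    (minimum acceptable area); this layout is an assumption for illustration.
    """
    for clique in sorted(cliques, key=len, reverse=True):
        members = list(clique)
        while new_req in members:
            max_k = max(r["k"] for r in members)
            if len(members) < max_k:
                # Condition 1 violated: drop the member with the highest k
                members.remove(max(members, key=lambda r: r["k"]))
                continue
            region = make_region(members)          # MBR of the members' cells
            a_min = max(r["a_min"] for r in members)
            if region_area(region) < a_min:
                region = extend(region, a_min, members)  # analogue of Algorithm 2
                if region is None:
                    break  # expansion failed; try the next clique
            return members, region
    return None  # anonymization fails for this request
```

Dropping the highest-k member mirrors line (16) of Algorithm 1: the most demanding request is deferred so the rest of the clique can still be anonymized.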
3.5. Updating
Each request specifies its own life time
4. Evaluation
In this section, we present a set of experiments that show the effectiveness and efficiency of our proposed algorithm. In particular, we evaluate our algorithm for users moving by car and on foot. To the best of our knowledge, this is the first attempt to evaluate a location privacy method with realistic pedestrian movement data. In the following, we describe the experimental setup, evaluation metrics, and results.
4.1. Experiment Setup
We used the off-the-shelf transportation simulator Paramics [13] to generate vehicular traffic on a map of Manhattan, New York. The input road map was extracted from the TIGER/Line files [14]. Table 1 shows the experiment parameters. In each experiment, we randomly choose the user-specific k-anonymity requirement, minimum area, maximum speed, and number of users from the given ranges. We assume that each user's query inter-arrival times follow an exponential distribution with a mean of 20 seconds. The whole map is divided into 100 m by 100 m cells.
System parameters in vehicle simulation.
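The query-arrival model in the setup can be sketched as follows, assuming each user's inter-query gaps are drawn independently from an exponential distribution with a 20 s mean (i.e., a per-user Poisson process; the function name is ours):

```python
import random

def arrival_times(n_queries, mean_gap=20.0, seed=None):
    """Generate n_queries arrival times with exponential inter-arrival gaps."""
    rng = random.Random(seed)  # seeded for reproducible experiments
    t, times = 0.0, []
    for _ in range(n_queries):
        t += rng.expovariate(1.0 / mean_gap)  # gap with the given mean
        times.append(t)
    return times
```
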
For pedestrian movements, we used the pedestrian simulator SimWalk [15]. Table 2 shows the experiment parameters. The experiment contains 200 people walking in a bus station (Figure 6). The area is 164 m by 80 m, divided into 1.6 m by 1.6 m cells. All experiments were run on an i7-2600 3.4 GHz machine with 4 GB of RAM.
System parameters in pedestrian simulation.

Bus terminal scenario.
We compare our proposed algorithm URALP with the algorithm proposed by Pan et al. [6]. Their algorithm is designed for MMB attacks, but we evaluate it against MMB-CM attacks. We denote this algorithm by MMBPan.
4.2. Evaluation Metrics
We evaluate the scheme using cloaking time and anonymization time for performance, cloaked region size for utility, and success rate and entropy for privacy:
(i) Cloaking time is the time used to update all max-cliques and generate a cloaked region.
(ii) Anonymization time is the time between the arrival of a request and its successful cloaking; it includes the cloaking time and the waiting time.
(iii) Cloaked region size is the size of the final cloaked area when anonymization succeeds.
(iv) Success rate is the ratio of the number of successfully anonymized requests to the total number of requests.
(v) Entropy is defined below.
We measure the entropy of the mobile user's locations as follows. Given a set of cloaked regions
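The entropy definition is truncated above; a standard Shannon-entropy formulation consistent with the discussion in Section 4.3 (higher entropy meaning less attacker certainty about the user's cell) can be sketched as follows. The per-cell weighting is an assumption for illustration:

```python
import math

def location_entropy(cell_weights):
    """Shannon entropy (bits) of the location distribution over a region's cells.

    cell_weights: probability mass (or visit counts) per cell; normalized here.
    """
    total = sum(cell_weights)
    h = 0.0
    for w in cell_weights:
        if w > 0:
            p = w / total
            h -= p * math.log2(p)  # zero-weight cells contribute nothing
    return h
```

A uniform distribution over the cells maximizes this value, while mass concentrated on a few boundary cells (as the experiments observe for MMBPan) drives it toward zero.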
4.3. Experiment Results
In the experiments, we vary the maximum movement speed (Figure 7), request life time (Figures 8 and 9), minimum acceptable area (Figures 11 and 12), number of users (Figure 13), privacy level (Figures 14 and 15), and cell size (Figure 16). We also analyze the achieved privacy using the metric of entropy in Figure 10.

Effect of speed in vehicle simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of life time in vehicle simulation on the following: (a) success rate, (b) cloaking time, and (c) anonymization time.

Effect of life time in pedestrian simulation on the following: (a) success rate, (b) cloaking time, and (c) anonymization time.

Effect of (a) life time, (b) minimum acceptable area, and (c) privacy level on entropy in pedestrian simulation.

Effect of minimum acceptable area, varied over 0.001, 0.01, 0.1, 0.5, 1, 2, 3, 4, and 5 km², in vehicle simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of minimum acceptable area in pedestrian simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of number of users in vehicle simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of privacy level in vehicle simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of privacy level in pedestrian simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of cell size in vehicle simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.
In summary, the proposed algorithm URALP outperforms MMBPan in success rate, cloaking time, anonymization time, and entropy, for both vehicular and pedestrian movements.
As shown in Figures 7(b), 8(b), 9(b), 11(b), 12(b), 13(b), 14(b), 15(b), 7(c), 8(c), 9(c), 11(c), 12(c), 13(c), 14(c), and 15(c), the cloaking time and anonymization time of URALP are lower than those of MMBPan in all tested settings, indicating that our algorithm is more efficient than MMBPan. The reason is that the complex computation of the reachable area of each cell is done before anonymization, so during cloaking we only need additions to compute the reachable area of a cloaked region. In contrast, without such cell precomputation, MMBPan takes more time to finish anonymization.
(1) Effect of Speed. Figure 7(a) shows that a higher user speed increases the success rate. Since the size of a user's MMB is proportional to the user's velocity, a faster speed leads to a larger MMB. An edge between two nodes is therefore more likely to exist, which makes it easier to find a cloaked region.
Figures 7(b), 7(c), and 7(d) show that, for both algorithms, the cloaking time, the anonymization time, and the cloaked area are not affected by changes in speed.
(2) Effect of Life Time. Figures 8(a) and 9(a) study the effect of the life time
From Figures 8(b) and 9(b), as the anonymizer can take more time to extend the MBR, the cloaking time increases with longer
(3) Effect of Minimum Acceptable Area. This section gives the effect of more strict privacy profile on system performance by increasing the value of minimum acceptable area
Note that URALP generates larger cloaked regions than MMBPan. This is because URALP expands the cloaked area to meet the
(4) Effect of Number of Users. As shown in Figure 13(a), the success rate decreases as the number of users increases. This is mainly because of the increased workload (see Figure 13(b)), which makes more requests expire before they can be cloaked. From Figures 13(b) and 13(c), the cloaking time and the anonymization time increase with the number of users, since more users imply more max-cliques and longer search times. The average cloaked areas of both algorithms in Figure 13(d) drop as the number of users increases: the higher the user density, the smaller the cloaked region.
(5) Effect of Privacy Level. We now evaluate the effect of the privacy level on the performance of the cloaking algorithms. From Figures 14(a) and 15(a), the success rates decrease with a more constrained privacy requirement. Meanwhile, as shown in Figures 14(b), 15(b), 14(c), and 15(c), the cloaking time and the anonymization time increase with a higher privacy level. The reason is that more requests are removed from cliques whose number of users is smaller than the request's privacy level; those requests have to wait to be cloaked in other cliques. The average cloaked areas increase with a higher privacy level, since a request needs more neighbors to satisfy the privacy level requirement (Figures 14(d) and 15(d)).
(6) Effect of Cell Size. This section examines the effect of the cell size on the performance of URALP. Figures 16(c) and 16(b) show that, as the cell side length grows from 5 m to 100 m, the anonymization time and the cloaking time fall significantly, because expanding the cloaked region takes far fewer steps to reach the minimum area requirement. Fewer requests then expire before their anonymization succeeds, so the success rate in Figure 16(a) increases greatly; from 100 m to 800 m, it increases only slightly. The downside of large cells is larger cloaked areas, as shown in Figure 16(d), since the MBR is generated from all cells containing the users in the max-clique; in other words, large cells can decrease the quality of service. To maximize the success rate while minimizing the cloaked region size, the optimal cell size can be chosen for a specific map. In this paper, we pick cell sizes of 100 m and 1.6 m for vehicles and pedestrians, respectively.
(7) Entropy Analysis. In this section, we evaluate the algorithm by entropy. For the definition of the metric, see Section 4.2. In this experiment, we divided each reachable region into
In Figure 10, the entropy of our proposed algorithm is significantly higher than that of MMBPan under various conditions, such as life time, area, and privacy level. In MMBPan, the cloaked region is the MBR of the users in the max-clique, so users are mostly located on the boundary of the cloaked region. In contrast, URALP expands the cloaked area to meet the
5. Related Work
The concept of location k-anonymization was first studied by Gruteser and Grunwald [2]. Under the centralized location anonymization architecture, they introduced a scheme in which the spatial and temporal accuracy of location information is reduced such that at least k users are indistinguishable. This basic model has been extended in several ways. For example, Bettini et al. [16] have introduced the concept of historical k-anonymity for preserving privacy. Mokbel et al. [17] have integrated both anonymity and obfuscation to protect privacy using a centralized system. They calculated the obfuscation area of the k users in their Casper framework based on the user-defined values of k and an area value
The MMB attack has been studied in previous work. The problem was first introduced by Reynold et al. [8], who proposed two simple solutions, patching and delaying. Patching enlarges the previous cloaked region to cover the current one, so that the area overlapping the MMB is at least as large as the current cloaked region. Its disadvantage is that the cloaked region keeps growing over time, resulting in expensive query processing. If there is a constraint Amax, the maximum cloaking region size a user can tolerate, the growing cloaked region easily exceeds Amax, so the anonymization success rate will be low. Delaying postpones the query until the MMB covers the current cloaked region. However, the user may have moved by then, and the current cloaked region may have changed; treating the new cloaked region as the old one harms the accuracy of the query result, and the delay itself degrades service quality. Ghinita et al. [7] proposed spatial and temporal transformations to preserve user privacy and also considered the scenario in which the attacker knows the placement of sensitive locations. However, in these papers the identity of the user is known, and the objective is to protect only the exact location of the user. In other words, the privacy model concerns only the granularity of cloaked regions, without location k-anonymity, so these schemes fail to protect the user identity when there is only one user in the cloaked region. Moreover, they do not work effectively under constrained movement.
Gedik and Liu [12, 19] proposed a personalized k-anonymity model and CliqueCloak, which constructs an undirected graph over all requests that have not yet been anonymized in order to group users that can share the same cloaked region. Our work also employs a graph model but differs in the underlying problem and in the method for finding cliques. CliqueCloak exhaustively searches the graph for cliques covering the new request to generate candidate cloaked regions; in contrast, following [6], our algorithm incrementally maintains maximal cliques to reduce the computational complexity. Their cloaking algorithm protects location privacy against snapshot location attacks, whereas ours prevents the MMB attack by an adversary with knowledge of all the cloaked location updates.
Similarly, Pan et al. [6] proposed an incremental clique-based cloaking algorithm, called ICliqueCloak, which ensures that all users in the cloaked region are in the MMBs of each other. The intersection of the cloaked region and the MMBs is then the cloaked region itself; in other words, the cloaked area is not reduced by an MMB attack. They used a graph model to solve the location k-anonymity problem. Unlike our work, however, this study works only in open-space environments, and its effectiveness is significantly limited under constrained movement. Moreover, their implementation miscalculates the reachable areas of users: in their setting, moving objects can travel only on roads, not anywhere, yet the reachable area is computed as the MBR of the users.
In [20], the authors adapted the location entropy measure defined by Cranshaw et al. [21] to study the privacy of a location. Location entropy measures the frequency of users' visits to a given location. They investigated how entropy affects user perceptions of location privacy, showing that users are more comfortable sharing high-entropy locations than low-entropy ones. Xu et al. [22] also developed a location cloaking technique to resist the MMB attack, building cloaked regions based on two privacy metrics: entropy and minimum acceptable cloaking area. However, they did not consider k-anonymity to protect the user identity.
6. Conclusion
In this work, we have proposed a greedy algorithm against the MMB attack by an adversary who may infer the user's exact location with knowledge of the reachable and unreachable regions of the map. To address this problem, we employ a grid structure to extend cloaked regions that do not satisfy the minimum area requirement. We showed that the existing algorithm against the MMB attack does not work effectively once the reachable areas are calculated accurately. The presented experimental results demonstrate the effectiveness and efficiency of our proposed algorithm under various settings.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work was supported by ICT R&D Program of MSIP/IITP (B0101-15-1272, Development of Device Collaborative Giga-Level Smart Cloudlet Technology).
