Abstract
As location-based services have become popular, the user locations they expose have raised serious privacy concerns. A typical measure for location privacy is to report a blurred location and to ensure that other users coexist in the reported region. However, additional knowledge about the user's maximum speed and the territorial information in the user's vicinity can allow an adversary to effectively compromise the user's location privacy. In this paper, we present an anonymization algorithm that effectively counters such attacks while satisfying the k-anonymity requirement as well as a minimum acceptable cloaked region size. We evaluate our anonymization scheme using state-of-the-art simulators for both vehicular and pedestrian movements. The experimental results demonstrate the effectiveness and efficiency of the proposed algorithm.
1. Introduction
With advances in technologies such as wireless communication, the global positioning system (GPS), and cellular networks, location-based services (LBSs) are now available anywhere, anytime. They attract millions of mobile users by offering valuable and important services. Common examples include point-of-interest search (e.g., finding the closest hospital for heart patients), traffic monitoring (e.g., congestion warnings reported from probe vehicles), location-aware social networking (e.g., users sharing their locations with friends through Facebook Places or Google Latitude), and location-based advertising (e.g., distributing 50%-off coupons to all customers within two kilometers of a store). However, because LBSs need users' locations, as well as user profiles, to increase the value of their services, problems arise if the service provider is not trusted to respect user privacy. For example, an adversary could obtain user location information from the service provider to locate and track users. Users therefore endanger their privacy by sending their location information directly to an LBS.
One may attempt to preserve users' privacy by hiding their identities, for example, by using a pseudonym instead of an identity. However, this is not enough, because a user can be reidentified even from anonymized location data [1]. To process location-based requests, the LBS needs the location of the user. An attacker, which could be the LBS itself, can infer the identity of the user by associating the location and the request time with a particular individual. This can easily be done in practice. For example, if a user reports her or his location at 3 am every Wednesday, an attacker may infer who she or he is with the help of a public telephone directory. Another approach to protecting the user identity is to employ the concept of location k-anonymity [2]. This approach tries to find a set of at least k users such that the user is indistinguishable from the other users in the set.
In addition to their identity, users want to hide their exact location. Protecting user privacy with location k-anonymity alone, however, is not enough when the k users are located within an area small enough to pinpoint the user. Location obfuscation approaches [3–5] decrease the precision of a reported position so that the attacker receives only coarse-grained position information. A cloaked region, which is larger than a user-specified threshold called the minimum acceptable area, is sent to the service provider instead of the exact location. The attacker then knows that the user is located in the cloaked region but has no clue where exactly the user is situated. In this paper, we aim to protect both the user identity and the user location by combining two methods: location k-anonymity and location obfuscation.
Consider an attacker Bob who wants to keep track of his teenage daughter Alice. One day, on the way to a bar, Alice would like to find the nearest gas station to fuel her car. However, she does not want to disclose her location, so she reports a cloaked region to the LBS instead of her exact coordinates. In the bar, she makes another query to see if any of her friends nearby is interested in joining her, again hiding her coordinates, because she also does not want her father to know that she is in a bar after school. By accessing the service provider's data somehow, Bob may obtain the cloaked location information of Alice's two LBS queries. From the first query, he can infer with high probability that Alice is driving in the city, so her speed cannot exceed 30 mph. Using this maximum speed, he determines the maximum movement boundary (MMB), that is, the area Alice can have reached by the time of the second query. Knowing the area map and the MMB, Bob can then narrow the obfuscation area down to certain locations (i.e., the bar) by removing all unreachable regions. We call this type of attack the MMB attack with constrained movement (MMB-CM). The MMB attack has been studied in previous work [6–8]. However, the existing approach works only in open-space environments, and its effectiveness is significantly limited under constrained movement. An attack based on knowledge of reachable and unreachable areas in the victim's vicinity was mentioned in [9], but combinations of different types of attacks have rarely been considered [10]. We make the following contributions in this paper:
We propose URALP (Unreachable Region Aware Location Privacy), a location-cloaking algorithm against MMB-CM attack. For the first time in the literature, we evaluate location privacy algorithms using state-of-the-art transportation and pedestrian simulators for realistic vehicle and pedestrian movements. We show that the proposed anonymization algorithm is efficient and effective. In particular, we show that the algorithm achieves near-optimal entropy for the user locations.
The rest of this paper is structured as follows. In Section 2, we describe the system architecture, the attack model, MMB-CM attack, and privacy goals. Then we propose our cloaking algorithm in Section 3. Section 4 presents the performance evaluation results of our proposed algorithm. We discuss related work in Section 5, and we conclude in Section 6.
2. Problem Model
In this section, we describe the system architecture, the attack model, MMB-CM attack, and privacy goals.
2.1. System Architecture
For the system architecture, we adopt a three-tier model comprising a mobile device, a trusted anonymizer, and an LBS provider (Figure 1). We assume that the user does not trust the LBS provider to respect user privacy. Therefore, when the user wants to submit an LBS query to the provider, the user instead contacts the anonymizer. On behalf of the user, the anonymizer performs the following tasks: (1) receiving an LBS query from the user, containing the user identity, location, and privacy requirements, including the privacy level k and the minimum required area

System architecture.
Our system architecture supports both stateful and stateless services. When each query needs to be linked to previous queries by the provider (stateful), the anonymizer keeps the same pseudonym for the same user. When each query is independent (stateless), a fresh pseudonym is generated for each query.
2.2. Adversary Model
Any party with the following capabilities is a potential attacker:
(i) being able to access all or some anonymized LBS queries, including pseudonym, time, and cloaked region;
(ii) having knowledge of the upper bound of the victim's moving speed;
(iii) having knowledge of reachable and unreachable areas in the victim's vicinity.
For example, the attacker can be a malicious LBS provider or anyone who can access the provider's system such as a law enforcement agency or an intruder. We assume that the user trusts the anonymization server.
The upper bound of the victim's moving speed can be estimated from the victim's means of transportation and the road speed limits. If the user is driving in New York, for example, the speed limit is 40 km/h in residential areas, 104.6 km/h on freeways, and 88.5 km/h in rural areas [11]. The vehicle type can also help the attacker estimate the maximum speed more precisely. If the user is traveling on foot, the moving speed is unlikely to exceed 6 km/h.
An unreachable area is an area the victim is unlikely, if not unable, to move through. For example, a driving user can only travel on vehicular roads or parking lots, so one can easily identify unreachable areas for such a user from the area map. If the user is a pedestrian, unreachable regions are places where people cannot walk, such as water, vehicular roads, train tracks, or dangerous areas.
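The adversary model above combines two pruning criteria: the MMB derived from the speed bound and the reachability map. The following Python sketch illustrates the idea under assumed data structures (the function name, the grid encoding, and all parameters are ours, not the paper's):

```python
import math

def mmb_cm_prune(region1_center, region2_cells, v_max, dt, reachable, cell_size):
    """Return the cells of the second cloaked region the victim can occupy.

    region1_center : (x, y) center of the first cloaked region, in meters
    region2_cells  : iterable of (row, col) grid cells of the second region
    v_max          : victim's maximum speed, in m/s
    dt             : time elapsed between the two queries, in seconds
    reachable      : mapping (row, col) -> bool, True if the cell is reachable
    cell_size      : side length of a grid cell, in meters
    """
    radius = v_max * dt  # MMB radius around the first region's center
    survivors = []
    for (r, c) in region2_cells:
        # cell center in meters
        cx, cy = (c + 0.5) * cell_size, (r + 0.5) * cell_size
        dist = math.hypot(cx - region1_center[0], cy - region1_center[1])
        # keep only cells both inside the MMB and marked reachable
        if dist <= radius and reachable.get((r, c), False):
            survivors.append((r, c))
    return survivors
```

The fewer cells survive this pruning, the more precisely the attacker can localize the victim, which is exactly what the cloaking algorithm in Section 3 must prevent.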
2.3. MMB-CM Attack
We consider the following MMB-CM attack (Figure 2). Suppose the attacker wants to know the location of the user A. At time

Example of MMB attack with constrained movement.
2.4. Privacy Goals
We adopt k-anonymity as the privacy model. However, k-anonymity by itself is not enough to protect the user's location privacy. For example, even if more than k users are located in the same cloaked region, location privacy is compromised if the cloaked region is so small that the user is uncomfortable revealing it. In summary, we aim to achieve the following privacy goals to meet the user's privacy requirements.
(i) Every successfully anonymized request contains a cloaked region in which at least k users are located.
(ii) Every cloaked region should be larger than or equal to the minimum acceptable area size.
3. Anonymization Algorithm
In this section, we describe our anonymization algorithm.
3.1. Overview
For a newly arrived request, we find a set of k requests (including the new one), submitted by k different users, to be anonymized together. These users report the same cloaked region to the LBS so that each user is indistinguishable from the others. We thus aim to find a set of users and a cloaked region that satisfy the following conditions:
(1) The size of the set meets the k-anonymity requirements of all the users in the set.
(2) The cloaked region meets the minimum acceptable area requirement of all the users in the set.
(3) The cloaked region is contained inside the MMBs of all the users in the set.
The first and second conditions are necessary to meet the privacy goals described in Section 2.4. The last condition is also necessary: suppose the cloaked region is not contained in the MMB of some user in the set; then the part of the cloaked region outside that MMB should be removed, since the user cannot appear in that part.
We find such a k-set as follows. First, we find k requests from k different users such that each user's location is contained in the MMBs of all the other users in the set.
In the following, we describe the algorithm in more detail.
3.2. Finding k-Anonymity Set
All requests waiting for cloaking (called alive requests) are modeled as an undirected graph, in which each node represents an alive request and an edge connects two requests whose users' locations lie within each other's MMBs.

Illustration of graph model.
Given such a graph of alive requests, we can find a k-anonymity set by identifying the maximum clique (or max-clique) in the graph containing the new request node [12]. This ensures that each user is contained in the MMB of every other. We accept this anonymity set if its size is at least the k-anonymity requirement of each user.
Once the anonymity set is found, we set the initial cloaked region to the minimum bounding rectangle (MBR) of all the users in the clique. In Figure 3, v, w, and t form a max-clique, and thus a k-anonymity set.
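The graph model and the initial MBR can be sketched in a few lines of Python. The data layout below (a request as an `(id, x, y, mmb)` tuple with an axis-aligned MMB box) is an assumption for illustration, not the paper's implementation:

```python
from itertools import combinations

def inside(box, x, y):
    """True if point (x, y) lies in the axis-aligned box (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax

def build_edges(requests):
    """Edge between two requests iff each user lies inside the other's MMB."""
    edges = set()
    for a, b in combinations(requests, 2):
        if inside(a[3], b[1], b[2]) and inside(b[3], a[1], a[2]):
            edges.add(frozenset((a[0], b[0])))
    return edges

def is_clique(ids, edges):
    """A candidate anonymity set must be pairwise connected, i.e., a clique."""
    return all(frozenset(p) in edges for p in combinations(ids, 2))

def mbr(requests, ids):
    """Initial cloaked region: minimum bounding rectangle of the clique members."""
    pts = [(x, y) for rid, x, y, _ in requests if rid in ids]
    xs, ys = zip(*pts)
    return (min(xs), min(ys), max(xs), max(ys))
```

For the example of Figure 3, three mutually MMB-covered requests v, w, and t would form a triangle in this graph, and `mbr` would yield their initial cloaked region.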
The cloaking engine processes each arriving request from mobile users in three steps. The first step, called detection, updates all max-cliques containing the new request. The second step, called generation, generates candidate cloaked regions that satisfy the privacy requirements from the max-cliques; the candidate that achieves the best utility is chosen as the cloaked region. In the last step, called updating, the graph is updated: if the request is cloaked successfully, or fails to be cloaked because its life time expires, it is dropped. In the following sections, we detail each step of our algorithm.
3.3. Detection
Upon arrival of a new request, we first add the new node to the graph and detect all max-cliques containing it, incrementally, using the max-clique update algorithm in [6]. Finally, because we need a clique of at least k nodes, we add to the max-clique set only those max-cliques whose number of users is at least the privacy level k of the new request.
3.4. Generation
After updating all max-cliques, we generate the cloaked regions that meet all the requirements as follows:
(1) The number of users in the candidate cloaked region has to be at least the privacy level of every user in the region.
(2) The area of the MBR, excluding the unreachable region, has to be at least the minimum acceptable area of every user in the region.
First, we sort the max-clique set in descending order of clique size. For each max-clique, we generate a candidate cloaked region as the MBR of the cells containing all the clique members (see Figure 4) and check whether the two conditions are met. If the first condition is not satisfied, that is, the clique size is less than some member's privacy level k, we repeatedly remove the member with the highest k from the clique until the condition is satisfied. If the second condition is not met, that is, the current cloaked region is smaller than some member's minimum acceptable area, the MBR is expanded as follows.

Display unreachable region and the MBR in the grid.
At each step, we extend the MBR by one cell width in whichever of the four directions increases the reachable area the most, but never beyond the intersection of the MMBs (see Figure 5). We repeat this step until the reachable area of the MBR is at least the minimum acceptable area. If either of the two conditions cannot be satisfied, the anonymization fails.

Display reachable area outside the MBR at the first expanding step.
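The greedy expansion step can be sketched as follows. This is an illustrative Python version under assumed representations (cell-aligned boxes `(rmin, cmin, rmax, cmax)`, a 0/1 `reachable` grid, and `bound` for the intersection of the MMBs); it is not the paper's implementation:

```python
def reachable_area(box, reachable, cell_area):
    """Total reachable area inside a cell-aligned box."""
    rmin, cmin, rmax, cmax = box
    cells = sum(reachable[r][c]
                for r in range(rmin, rmax + 1)
                for c in range(cmin, cmax + 1))
    return cells * cell_area

def expand_mbr(box, a_min, reachable, bound, cell_area):
    """Grow `box` one cell-width at a time toward the direction adding the most
    reachable area, staying inside `bound`, until the reachable area inside the
    box reaches a_min. Returns the final box, or None on failure."""
    rmin, cmin, rmax, cmax = box
    brmin, bcmin, brmax, bcmax = bound
    while reachable_area((rmin, cmin, rmax, cmax), reachable, cell_area) < a_min:
        moves = []  # (reachable-area gain, candidate box) per direction
        if rmin > brmin:
            moves.append((sum(reachable[rmin - 1][c] for c in range(cmin, cmax + 1)),
                          (rmin - 1, cmin, rmax, cmax)))
        if rmax < brmax:
            moves.append((sum(reachable[rmax + 1][c] for c in range(cmin, cmax + 1)),
                          (rmin, cmin, rmax + 1, cmax)))
        if cmin > bcmin:
            moves.append((sum(reachable[r][cmin - 1] for r in range(rmin, rmax + 1)),
                          (rmin, cmin - 1, rmax, cmax)))
        if cmax < bcmax:
            moves.append((sum(reachable[r][cmax + 1] for r in range(rmin, rmax + 1)),
                          (rmin, cmin, rmax, cmax + 1)))
        if not moves:
            return None  # box already fills the MMB intersection: fail
        gain, best = max(moves)
        rmin, cmin, rmax, cmax = best
    return (rmin, cmin, rmax, cmax)
```

Because every iteration grows the box and the box can never leave the finite `bound`, the loop always terminates, either at a region satisfying the minimum acceptable area or with a failure.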
Algorithm 1 gives the pseudocode to generate the cloaked region. The procedure for extending the MBR is described in Algorithm 2.
Algorithm 1 (cloaked-region generation; most symbols were lost in extraction): sort the max-clique set; for each candidate clique, find max_k, min_k, and the maximum area requirement; if the requirements hold, add the region returned by Algorithm 2 to the candidate set; otherwise, sort the requests, drop the request with the highest k, update the clique, and retry.
Algorithm 2 (MBR extension; symbols likewise lost): while the gain in reachable area is maximal and the MBR has not reached the intersection of the MMBs, extend the MBR and update its reachable area.
We define MCSet to be max-clique set involving the new request u, CR being the cloaked region, canCS being each max-clique in MCSet, max_k (min_k) being the maximum (minimum) privacy level k of all users in canCS, max_
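Since the pseudocode listing lost most of its symbols in extraction, the following Python sketch restates the generation loop under assumed names (`make_region`, `region_area`, and `extend` are hypothetical hooks; `extend` stands in for Algorithm 2):

```python
def generate_cloaked_region(cliques, new_req, make_region, region_area, extend):
    """Try candidate cliques, largest first, returning (members, region) or None.

    Each request is a dict with fields "k" (privacy level) and "a_min"
    (minimum acceptable area); this layout is an assumption for illustration.
    """
    for clique in sorted(cliques, key=len, reverse=True):
        members = list(clique)
        while new_req in members:
            max_k = max(r["k"] for r in members)
            if len(members) < max_k:
                # Condition 1 violated: drop the member with the highest k
                members.remove(max(members, key=lambda r: r["k"]))
                continue
            region = make_region(members)          # MBR of the members' cells
            a_min = max(r["a_min"] for r in members)
            if region_area(region) < a_min:
                region = extend(region, a_min, members)  # analogue of Algorithm 2
                if region is None:
                    break  # expansion failed; try the next clique
            return members, region
    return None  # anonymization fails for this request
```

Dropping the highest-k member mirrors line (16) of Algorithm 1: the most demanding request is deferred so the rest of the clique can still be anonymized.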
3.5. Updating
Each request specifies its own life time
4. Evaluation
In this section, we present a set of experiments that show the effectiveness and efficiency of our proposed algorithm. In particular, we evaluate our algorithm for users moving by car and on foot. To the best of our knowledge, this is the first attempt to evaluate a location privacy method with realistic pedestrian movement data. In the following, we describe the experimental setup, evaluation metrics, and results.
4.1. Experiment Setup
We used the off-the-shelf transportation simulator Paramics [13] to generate vehicular traffic on a map of Manhattan, New York. The input road map was extracted from the TIGER/Line files [14]. Table 1 shows the experiment parameters. In each experiment, we randomly choose the user-specific k-anonymity requirement, minimum area, maximum speed, and number of users from the given ranges. We assume that each user's query inter-arrival times follow an exponential distribution with a mean of 20 seconds. The whole map is divided into 100 m by 100 m cells.
System parameters in vehicle simulation.
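The query-arrival model in the setup can be sketched as follows, assuming each user's inter-query gaps are drawn independently from an exponential distribution with a 20 s mean (i.e., a per-user Poisson process; the function name is ours):

```python
import random

def arrival_times(n_queries, mean_gap=20.0, seed=None):
    """Generate n_queries arrival times with exponential inter-arrival gaps."""
    rng = random.Random(seed)  # seeded for reproducible experiments
    t, times = 0.0, []
    for _ in range(n_queries):
        t += rng.expovariate(1.0 / mean_gap)  # gap with the given mean
        times.append(t)
    return times
```
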
For pedestrian movements, we used the pedestrian simulator SimWalk [15]. Table 2 shows the experiment parameters. The experiment contains 200 people walking in a bus station (Figure 6). The area is 164 m by 80 m, divided into 1.6 m by 1.6 m cells. All experiments were run on an i7-2600 3.4 GHz machine with 4 GB of RAM.
System parameters in pedestrian simulation.

Bus terminal scenario.
We compare our proposed algorithm URALP with the algorithm proposed by Pan et al. [6]. Their algorithm is designed for MMB attacks, but we evaluate it against MMB-CM attacks. We denote this algorithm by MMBPan.
4.2. Evaluation Metrics
We evaluate the scheme using cloaking time and anonymization time for performance, cloaked region size for utility, and success rate and entropy for privacy:
(i) Cloaking time is the time used to update all max-cliques and generate a cloaked region.
(ii) Anonymization time is the time between the arrival of a request and its successful cloaking; it includes the cloaking time and the waiting time.
(iii) Cloaked region size is the size of the final cloaked area when anonymization succeeds.
(iv) Success rate is the ratio of the number of successfully anonymized requests to the total number of requests.
(v) Entropy is defined below.
We measure the entropy of the mobile user's locations as follows. Given a set of cloaked regions
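The entropy definition is truncated above; a standard Shannon-entropy formulation consistent with the discussion in Section 4.3 (higher entropy meaning less attacker certainty about the user's cell) can be sketched as follows. The per-cell weighting is an assumption for illustration:

```python
import math

def location_entropy(cell_weights):
    """Shannon entropy (bits) of the location distribution over a region's cells.

    cell_weights: probability mass (or visit counts) per cell; normalized here.
    """
    total = sum(cell_weights)
    h = 0.0
    for w in cell_weights:
        if w > 0:
            p = w / total
            h -= p * math.log2(p)  # zero-weight cells contribute nothing
    return h
```

A uniform distribution over the cells maximizes this value, while mass concentrated on a few boundary cells (as the experiments observe for MMBPan) drives it toward zero.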
4.3. Experiment Results
In the experiments, we vary the maximum movement speed (Figure 7), request life time (Figures 8 and 9), minimum acceptable area (Figures 11 and 12), number of users (Figure 13), privacy level (Figures 14 and 15), and cell size (Figure 16). We also analyze the achieved privacy using the metric of entropy in Figure 10.

Effect of speed in vehicle simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of life time in vehicle simulation on the following: (a) success rate, (b) cloaking time, and (c) anonymization time.

Effect of life time in pedestrian simulation on the following: (a) success rate, (b) cloaking time, and (c) anonymization time.

Effect of (a) life time, (b) minimum acceptable area, and (c) privacy level on entropy in pedestrian simulation.

Effect of minimum acceptable area, varied over 0.001, 0.01, 0.1, 0.5, 1, 2, 3, 4, and 5 km², in vehicle simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of minimum acceptable area in pedestrian simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of number of users in vehicle simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of privacy level in vehicle simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of privacy level in pedestrian simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.

Effect of cell size in vehicle simulation on the following: (a) success rate, (b) cloaking time, (c) anonymization time, and (d) cloaking area.
In summary, the proposed algorithm URALP outperforms MMBPan in success rate, cloaking time, anonymization time, and entropy, for both vehicular and pedestrian movements.
As shown in Figures 7(b), 8(b), 9(b), 11(b), 12(b), 13(b), 14(b), 15(b), 7(c), 8(c), 9(c), 11(c), 12(c), 13(c), 14(c), and 15(c), the cloaking time and anonymization time of URALP are lower than those of MMBPan in all tested settings, indicating that our algorithm is more efficient than MMBPan. The reason is that the complex computation of the reachable area of each cell is done before anonymization, so during cloaking we only need additions to compute the reachable area of a cloaked region. In contrast, without such cell precomputation, MMBPan takes more time to finish anonymization.
(1) Effect of Speed. Figure 7(a) shows that a higher user speed increases the success rate. Since the size of a user's MMB is proportional to the user's velocity, a faster speed leads to a larger MMB. An edge between two nodes is therefore more likely to exist, which makes it easier to find a cloaked region.
Figures 7(b), 7(c), and 7(d) show that, for both algorithms, the cloaking time, the anonymization time, and the cloaked area are not affected by changes in speed.
(2) Effect of Life Time. Figures 8(a) and 9(a) study the effect of the life time
From Figures 8(b) and 9(b), as the anonymizer can take more time to extend the MBR, the cloaking time increases with longer
(3) Effect of Minimum Acceptable Area. This section gives the effect of more strict privacy profile on system performance by increasing the value of minimum acceptable area
Note that URALP generates larger cloaked regions than MMBPan. This is because URALP expands the cloaked area to meet the
(4) Effect of Number of Users. As shown in Figure 13(a), the success rate decreases as the number of users increases. This is mainly because of the increased workload (see Figure 13(b)), which makes more requests expire before they can be cloaked. From Figures 13(b) and 13(c), the cloaking time and the anonymization time increase with the number of users, since more users imply more max-cliques and longer search times. The average cloaked areas of both algorithms in Figure 13(d) drop as the number of users increases: the higher the user density, the smaller the cloaked region.
(5) Effect of Privacy Level. We now evaluate the effect of the privacy level on the performance of the cloaking algorithms. From Figures 14(a) and 15(a), the success rates decrease with a more constrained privacy requirement. Meanwhile, as shown in Figures 14(b), 15(b), 14(c), and 15(c), the cloaking time and the anonymization time increase with a higher privacy level. The reason is that more requests are removed from cliques whose number of users is smaller than the request's privacy level; those requests have to wait to be cloaked in other cliques. The average cloaked areas increase with a higher privacy level, since a request needs more neighbors to satisfy the privacy level requirement (Figures 14(d) and 15(d)).
(6) Effect of Cell Size. This section examines the effect of the cell size on the performance of URALP. Figures 16(c) and 16(b) show that, as the cell side length grows from 5 m to 100 m, the anonymization time and the cloaking time fall significantly, because expanding the cloaked region takes far fewer steps to reach the minimum area requirement. Fewer requests then expire before their anonymization succeeds, so the success rate in Figure 16(a) increases greatly; from 100 m to 800 m, it increases only slightly. The downside of large cells is larger cloaked areas, as shown in Figure 16(d), since the MBR is generated from all cells containing the users in the max-clique; in other words, large cells can decrease the quality of service. To maximize the success rate while minimizing the cloaked region size, the optimal cell size can be chosen for a specific map. In this paper, we pick cell sizes of 100 m and 1.6 m for vehicles and pedestrians, respectively.
(7) Entropy Analysis. In this section, we evaluate the algorithm by entropy. For the definition of the metric, see Section 4.2. In this experiment, we divided each reachable region into
In Figure 10, the entropy of our proposed algorithm is significantly higher than that of MMBPan under various conditions, such as life time, area, and privacy level. In MMBPan, the cloaked region is the MBR of the users in the max-clique, so users are mostly located on the boundary of the cloaked region. In contrast, URALP expands the cloaked area to meet the
5. Related Work
The concept of location k-anonymization was first studied by Gruteser and Grunwald [2]. Under the centralized location anonymization architecture, they introduced a scheme in which the spatial and temporal accuracy of location information is reduced such that at least k users are indistinguishable. This basic model has been extended in several ways. For example, Bettini et al. [16] have introduced the concept of historical k-anonymity for preserving privacy. Mokbel et al. [17] have integrated both anonymity and obfuscation to protect privacy using a centralized system. They calculated the obfuscation area of the k users in their Casper framework based on the user-defined values of k and an area value
The MMB attack has been studied in previous work. The problem was first introduced by Reynold et al. [8], who proposed two simple solutions, patching and delaying. Patching enlarges the previous cloaked region to cover the current one, so that the area overlapping the MMB is at least as large as the current cloaked region. Its disadvantage is that the cloaked region keeps growing over time, resulting in expensive query processing. If there is a constraint Amax, the maximum cloaking region size a user can tolerate, the growing cloaked region easily exceeds Amax, so the anonymization success rate will be low. Delaying postpones the query until the MMB covers the current cloaked region. However, the user may have moved by then, and the current cloaked region may have changed; treating the new cloaked region as the old one harms the accuracy of the query result, and the delay itself degrades service quality. Ghinita et al. [7] proposed spatial and temporal transformations to preserve user privacy and also considered the scenario in which the attacker knows the placement of sensitive locations. However, in these papers the identity of the user is known, and the objective is to protect only the exact location of the user. In other words, the privacy model concerns only the granularity of cloaked regions, without location k-anonymity, so these schemes fail to protect the user identity when there is only one user in the cloaked region. Moreover, they do not work effectively under constrained movement.
Gedik and Liu [12, 19] proposed a personalized k-anonymity model and CliqueCloak, which constructs an undirected graph over all requests that have not yet been anonymized in order to group users that can share the same cloaked region. Our work also employs a graph model but differs in the underlying problem and in the method for finding cliques. CliqueCloak exhaustively searches the graph for cliques covering the new request to generate candidate cloaked regions; in contrast, following [6], our algorithm incrementally maintains maximal cliques to reduce the computational complexity. Their cloaking algorithm protects location privacy against snapshot location attacks, whereas ours prevents the MMB attack by an adversary with knowledge of all the cloaked location updates.
Similarly, Pan et al. [6] proposed an incremental clique-based cloaking algorithm, called ICliqueCloak, which ensures that all users in the cloaked region are in the MMBs of each other. The intersection of the cloaked region and the MMBs is then the cloaked region itself; in other words, the cloaked area is not reduced by an MMB attack. They used a graph model to solve the location k-anonymity problem. Unlike our work, however, this study works only in open-space environments, and its effectiveness is significantly limited under constrained movement. Moreover, their implementation miscalculates the reachable areas of users: in their setting, moving objects can travel only on roads, not anywhere, yet the reachable area is computed as the MBR of the users.
In [20], the authors adapted the location entropy measure defined by Cranshaw et al. [21] to study the privacy of a location. Location entropy measures the frequency of users' visits to a given location. They investigated how entropy affects user perceptions of location privacy, showing that users are more comfortable sharing high-entropy locations than low-entropy ones. Xu et al. [22] also developed a location cloaking technique to resist the MMB attack, building cloaked regions based on two privacy metrics: entropy and minimum acceptable cloaking area. However, they did not consider k-anonymity to protect the user identity.
6. Conclusion
In this work, we have proposed a greedy algorithm against the MMB attack by an adversary who may infer the user's exact location with knowledge of the reachable and unreachable regions of the map. To address this problem, we employ a grid structure to extend cloaked regions that do not satisfy the minimum area requirement. We showed that the existing algorithm against the MMB attack does not work effectively once the reachable areas are calculated accurately. The presented experimental results demonstrate the effectiveness and efficiency of our proposed algorithm under various settings.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work was supported by ICT R&D Program of MSIP/IITP (B0101-15-1272, Development of Device Collaborative Giga-Level Smart Cloudlet Technology).
