A New Distributed User-Demand-Driven Location Privacy Protection Scheme for Mobile Communication Network

Abstract

With the development of mobile communication networks and intelligent terminals, recent years have witnessed a rapid popularization of location-based service (LBS). While users obtain convenient services, the exploitation of mass location data inevitably raises serious concerns about location privacy. In general, a higher quality of service (QoS) comes at the cost of weaker location privacy protection, so a trade-off is needed to fulfill users' individual demands on both sides. Although existing methods perform well in certain scenarios, few have considered this balance problem. Therefore, by combining the k-anonymity-based cloaking technique with the obfuscation method, this paper puts forward a new distributed user-demand-driven (DUDD) location privacy protection scheme. The basic idea is to select a subcloaking area within the cloaking area generated by Location Anonymization Server. Moreover, by using an improved LBS system model, this paper constructs a distributed framework in which location privacy protection is handled entirely on the server side and LBS provider is dedicated only to guaranteeing QoS. In addition, normalized privacy demand and QoS metrics are given, and a user-defined weight parameter is introduced to ensure location privacy without decreasing QoS. The feasibility of the proposed method is demonstrated through simulation.

1. Introduction

Recently, the popularization of intelligent terminals such as smart phones and the growing of mobile network coverage have provided location-based service (LBS) with various fields of applications. LBS employs mobile communication networks or Global Positioning System (GPS) to obtain the location information of mobile users. With the help of Geographic Information System (GIS) platform, it offers users many value-added services, including navigation services, life assisted services, social services, and commercial push services [1].

In order to get access to all these convenient services, it is a prerequisite for users to furnish their own location information to service providers. The acquisition as well as exploitation of mass location data inevitably leads to location privacy issues because sensitive data like location information can easily expose users' work places, places of residence, travel destinations, and even their precise locations at a certain moment. Associated with some available background information, it is enough for an adversary to infer more important information about users, which may constitute a threat for their lives and security of properties [2–4].

In particular, when users arrive at sensitive places such as hospitals and banks, they would like to preserve their location privacy and obtain services simultaneously. However, location privacy protection and QoS constrain each other, which means that improving location privacy protection comes at the expense of QoS. Conversely, an accurate location service is bound to bring location privacy threats. Therefore, for mobile wireless communication networks, QoS-guaranteed location privacy protection has become a big challenge. It is also a key factor affecting the future prospects of mobile wireless communication networks and LBS [5].

During the past few years, numerous approaches to location privacy protection have been proposed to handle the aforementioned problem, and they are gradually becoming more personalized as user demands increase [6]. Although those existing methods perform very well, they have some unavoidable limitations. The anonymity-based method is commonly used in LBS, but it is vulnerable to the background information attack (edge information attack) and, moreover, offers a limited degree of protection when applied alone [7]. The obfuscation-based method, in turn, has considerably high computational complexity. What is more, to the best of our knowledge, few approaches take users' individual demands for location privacy and QoS into consideration.

Hence, in order to improve the effectiveness of location privacy protection and to reduce the computational complexity, this paper proposes a new location privacy protection algorithm that combines the k-anonymity-based cloaking technique with the obfuscation method, so as to meet users' individual demands and to seek the balance between location privacy protection and QoS. First, users initialize and send their LBS queries to Location Anonymization Server and, at the same time, forward a parameter called type of service (ToS) to LBS provider to declare the types of services. Then, the server and LBS provider each return to users their restrictions on the distance between the real user location and the subcloaking area. Meanwhile, in the first step, Location Anonymization Server generates a large cloaking area according to users' desired privacy protection degree and returns it to users. Second, users select a subcloaking area a(c) within this large cloaking area by applying a selection algorithm and then send it to LBS provider. Finally, LBS provider returns service responses to users, and the entire LBS process is completed. Since the subcloaking area selection method involves randomness, the real user location may or may not be contained in the subcloaking area. Hence, even if an adversary manages to learn the cloaking area reported to LBS provider, it is still quite difficult to find out the real user location. In this way, effective location privacy protection is achieved. The main contributions of this paper can be summarized as follows:

(1) By using an improved LBS system model, this paper constructs a distributed framework in order to isolate Location Anonymization Server and LBS provider, which prevents user information disclosure through an untrusted server or LBS provider and separates the functions of QoS-guarantee and location privacy protection to improve the system efficiency.

(2) This paper designs a subcloaking area selection algorithm, which enables users to seek a specific cloaking area within a large cloaking area on the condition of satisfying their individual demands, making it truly user-demand-driven.

(3) This paper proposes new quantitative and normalized representations of QoS as well as location privacy demand and introduces two boundary parameters, namely, location privacy security distance $d_{1}$ and QoS-guaranteed distance $d_{2}$, to determine the value of QoS demand and location privacy protection demand, which makes the subcloaking area selection more reasonable.

(4) The entire cloaking area computation process in this paper does not need to assume that the server knows users' next locations, which is more realistic.

The remainder of this paper is organized as follows: The second section summarizes related works on location privacy protection. The third section defines problems and introduces the traditional system model. The fourth section presents the improved system model and three core algorithms of location privacy protection. The simulation analysis of feasibility is detailed in the fifth section, and section six concludes this paper and puts forward some future research directions.

2. Related Works

As shown in Figure 1, existing location privacy protection methods can be roughly divided into four categories: regulatory method, privacy policy-based method, anonymity-based method, and obfuscation-based method [8]. Among them, the development of anonymity-based and obfuscation-based methods is more thorough, making them two major directions of research on location privacy protection.

Figure 1

Categories of location privacy protection methods.

Anonymity-based method is also named time-space cloaking technology, among which k-anonymity method is the most well-known and is applied in location privacy protection by Gruteser and Grunwald for the first time [9]. By employing the quadtree data structure, k-anonymity method is able to guarantee that a cloaking area of one user contains at least $k - 1$ other users. Thus, k users in the same area are indistinguishable from each other. In this way, the possibility of distinguishing each user is reduced to 1/k from 1. Inspired by the k-anonymity method, Wang et al. came up with a perception-based method, where Location Anonymization Server was able to meet users' diversified privacy demands in both temporal and spatial dimensions [8]. Reference [10] proposed a new incremental clique-based cloaking algorithm using k-anonymity method to cope with the scenario where different location-based queries are continuously launched. Hu and Xu proposed a method in which users' concrete location information was not necessary [11]. Reference [12] designed a subcloaking area selection algorithm to protect location privacy by returning a smaller cloaking area. Reference [13] proposed a privacy personalization framework, where k is user-defined. However, this method was only suitable for the situations where the value of k was small, and, furthermore, the long delay and low anonymity success rate were two main defects of it. Bhuvan et al. adopted the privacy grid of Bottom-up and Top-down [14]. In this method, users could appoint their own privacy demands k and the minimum cloaking area A and realize a high anonymity success rate. However, it was time-consuming and with high update cost. In contrast, the circular partitioning method proposed by Zhao et al. can improve the anonymity degree but is less realistic [15].

Obfuscation-based method protects location privacy by producing a virtual user location or by separating locations from identities. Duckham and Kulik pointed out that the obfuscation-based method was one of the most important methods in location privacy protection and proposed a new obfuscation approach, in which users could balance privacy demands and QoS through negotiation with the LBS provider [16]. Reference [17] proposed a dummy method, in which the server produced virtual location information according to real user locations and mixed them to protect location privacy. However, attackers were able to distinguish real users from virtual users through long-term tracking. The algorithm improved by You et al. [18] increased the continuity of virtual location generation, making those virtual locations more authentic. Ghinita et al. came up with a new Private Information Retrieval based framework [19] to divide the whole space into different modules. In this way, users could extract the necessary information without leaking concrete locations because users could only get access to the information stored in their own modules. The two algorithms mentioned above have achieved remarkable results in privacy protection. However, their computational complexity and communication cost were high. To handle this resource consumption issue, [20] put forward a distributed obfuscation method, which was able to reduce the computation expense by one half, but it was still high-cost compared with the k-anonymity method.

3. System Model and Problem Definition

3.1. Problem Definition

3.1.1. Edge Information Attack

Edge information attack frequently occurs when the k-anonymity method is applied [12]. Consider the scenario shown in Figure 2, and suppose that user $u_{1}$ defines a location privacy protection degree of 3, that is, $k = 3$. When $t = t_{1}$, the cloaking area after anonymization is the blue shaded zone in Figure 2(a), and when $t = t_{2}$, the cloaking area changes to the orange shaded zone in Figure 2(b). The common user set of the two areas is $\{u_{1}, u_{4}\}$, so k is effectively reduced to 2.

Figure 2

Edge information attack illustration (revised from [12]).

The concept of edge information attack is that, by comparing the user information in two different cloaking areas, such as users' location coordinates and LBS query traces, an adversary can eliminate those who are not in the common user set (here $\{u_{3}, u_{5}\}$); that is, the edge user information is detected and removed, which decreases k and reduces the privacy protection degree.
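The elimination step can be illustrated with a few lines of Python. This is only a sketch: the exact membership of each cloaking area is an assumption chosen so that the common set and the eliminated users match the ones named in the text, since Figure 2 itself is not reproduced here.

```python
def common_user_set(snapshots):
    """Intersect the user sets observed in successive cloaking areas."""
    sets = [set(s) for s in snapshots]
    common = sets[0]
    for s in sets[1:]:
        common = common & s
    return common

# Hypothetical memberships (assumed, consistent with the sets in the text).
area_t1 = {"u1", "u3", "u4"}  # cloaking area at t1, satisfies k = 3
area_t2 = {"u1", "u4", "u5"}  # cloaking area at t2, satisfies k = 3

common = common_user_set([area_t1, area_t2])
eliminated = (area_t1 | area_t2) - common  # edge users removed by the adversary
effective_k = len(common)                  # anonymity actually achieved
```

Although each snapshot alone satisfies $k = 3$, the intersection leaves only two candidates, which is the reduction described above.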

3.1.2. Limitations of Existing Approaches

To counter the edge information attack, literature [8] proposed a cloaking area algorithm based on the common user set. In this algorithm, a cloaking area should meet not only the individual demand for k, but also the demand of all the users in the common user set for k. Although the algorithm is thoughtful, on the one hand, the calculation cost will dramatically increase due to the consideration of all the users in a set; on the other hand, since each user's location in the next moment will affect the calculation of the coverage area, the premise of using this algorithm is to know users' future locations, which is obviously unrealistic.

In summary, although existing approaches such as the method proposed in [8] have achieved good results, their limitations cannot be ignored:

(1) Calculating a common privacy value k per user set, instead of considering the different privacy levels demanded by different users, cannot meet individual demands for privacy security.

(2) The assumption that the next-moment location of each user is known by Location Anonymization Server is not feasible.

(3) The purpose of this algorithm is to find the smallest coverage area, namely, to maximize QoS, which neglects the trade-off between QoS and the degree of privacy protection.

(4) When this algorithm is applied in places with high population density (e.g., classrooms or meeting rooms), it is still likely to expose users' locations because, in this kind of scenario, the value of k is easily fulfilled.

(5) Excessive calculation may be caused, especially considering that some users may have continuous privacy demands.

3.2. Traditional LBS System Model

The traditional LBS system is composed of three parts [8], including user, Location Anonymization Server, and LBS provider, as shown in Figure 3.

Figure 3

Traditional LBS system model.

In this model, Location Anonymization Server acts as a bridge, which is responsible for cloaking area generation and for relaying messages between users and LBS provider. An LBS query is represented in the form of $(u_{i}, PM, t, r)$, where $u_{i}$ represents the user, t indicates the time when the query is launched, $PM = (x_{i}, y_{i})$ is the position reported by the location tracker built into the mobile device, with $(x_{i}, y_{i})$ the two-dimensional coordinates of the user location, and r represents the user-desired degree of location privacy protection. Users first send their LBS queries to Location Anonymization Server, which generates a cloaking area using the k-anonymity method and sends it to LBS provider. Then, according to the cloaking area received, LBS provider returns service responses to users via Location Anonymization Server.

However, several drawbacks exist in this model. On the one hand, the model assumes by default that Location Anonymization Server is trusted. But in reality, the trustworthiness cannot always be assured. LBS queries containing users' location information might be divulged to LBS provider easily via malicious servers. On the other hand, because of the central position of Location Anonymization Server, service responses cannot be returned to users directly, which will lead to a longer service delay. Furthermore, Location Anonymization Server has to allocate some extra resources to deal with the information transit, causing a waste of resource.

4. Proposed Location Privacy Protection Scheme

4.1. Improved LBS System Model

Hence, in order to make up for those defects present in the traditional system model, this paper draws on the experience of [12] and introduces an improved LBS system model as shown in Figure 4(a).

Figure 4

Improved LBS system model.

The improved model shifts from a Location Anonymization Server-centric design to a user-centric one, which highlights the dominant position of users. In this model, the LBS query is expressed in the form of $(u_{i}, PM, t, r, ToS)$. Here, the term "LBS query" refers exclusively to the query sent to Location Anonymization Server, which the user launches when he/she wants to obtain a location-based service. When initiating an LBS query, the user also extracts the parameter type of service (ToS) from the query $(u_{i}, PM, t, r, ToS)$ and sends it to LBS provider.

There are two differences from the conventional form.

(i) First, $r = (k, A_{\min})$, where k indicates the privacy protection degree, namely, the minimal number of users in a cloaking area, and $A_{\min}$ denotes the minimal size of a cloaking area.

(ii) Second, an extra parameter, type of service (ToS), is added to inform LBS provider of the service required by the user; LBS provider then generates the QoS-guaranteed distance $d_{2}$ according to ToS. The parameter $d_{2}$ indicates the maximum distance between the subcloaking area and the user position at which optimal QoS is still obtained.

$A_{\min}$ is introduced to handle situations where the population density is so high that k is easily satisfied within a small area. In places such as classrooms and cinemas, the number of users in the smallest square may exceed k. Without a lower bound $A_{\min}$, the cloaking area sent to LBS provider would then not be large enough to protect the location privacy of mobile users.
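As a minimal sketch of the query format described above, the tuple $(u_{i}, PM, t, r, ToS)$ with $r = (k, A_{\min})$ could be modeled as follows. The class names, field names, and the two helper methods are hypothetical and serve only to show which part of the query goes to which party.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class PrivacyDemand:
    k: int        # minimal number of users in the cloaking area
    a_min: float  # minimal size of the cloaking area

@dataclass(frozen=True)
class LBSQuery:
    user_id: str              # u_i
    pm: Tuple[float, float]   # PM = (x_i, y_i)
    t: float                  # time at which the query is launched
    r: PrivacyDemand          # r = (k, A_min)
    tos: str                  # type of service

    def for_anonymizer(self) -> "LBSQuery":
        # The whole query is sent to Location Anonymization Server.
        return self

    def for_provider(self) -> str:
        # Only ToS is extracted and sent to LBS provider.
        return self.tos

query = LBSQuery("u1", (1.0, 2.0), 0.0, PrivacyDemand(k=3, a_min=100.0), "navigation")
```

In this sketch LBS provider never sees `pm`, which reflects the separation between the two parties in the improved model.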

In this new system model, Location Anonymization Server provides independent service for each user. First, when the server receives a user's LBS query, it produces a cloaking area c and returns it to the user. Next, the user selects a smaller area, called the subcloaking area a(c), within c by using a selection algorithm and sends it to LBS provider. Finally, LBS provider returns the query response directly to the user. Thus, the bridge between users and LBS provider is no longer necessary. The user alone screens the cloaking area and obtains the service directly from LBS provider, which isolates Location Anonymization Server from LBS provider and avoids user information disclosure. Depending on the outcome of the selection algorithm, a(c) may or may not contain the real user location.

Compared to the conventional LBS system model, there are several advantages of the improved one.

(i) First, Location Anonymization Server provides independent service for each user and users can completely express their individual demands through the selection of subcloaking area.

(ii) Second, realizing the isolation of Location Anonymization Server and LBS provider can strongly reduce the probability of user information disclosure caused by malicious servers or providers.

(iii) Third, by applying the improved model, a function separation between Location Anonymization Server and LBS provider is realized, and a distributed system framework is formed, as shown in Figure 4(b). Location Anonymization Server is specifically responsible for location privacy protection, and LBS provider is dedicated to QoS-guarantee. Moreover, the balance between the two sides is determined by users, which will significantly reduce the service delay and save system resources.

4.2. Data Structure

Similar to the approach presented in [8], a quadtree T is used to recursively partition the whole spatial domain into squares, as shown in Figure 5.

Figure 5

Spatial domain partitioning method using a quadtree (revised from [8]).

The whole spatial domain is partitioned into L levels (L is predefined by Location Anonymization Server and does not change afterwards). Level n is partitioned into $4^{n - 1}$ squares, where n denotes the level index. When $n > 1$, every square in level $n - 1$ is partitioned into 4 smaller squares in level n. Thus, there is only one large square in level 1 and $4^{L - 1}$ small squares in the bottom level.

For each user location $PM = (x_{i}, y_{i})$, there is one and only one square containing it in each level, and any of these squares may serve as the cloaking area c returned by Location Anonymization Server; which level is actually chosen depends on the user's privacy demands.
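The correspondence between a location and its square at each level can be sketched as below. The function names, the square domain $[0, \mathit{side}) \times [0, \mathit{side})$, and the row/column indexing are assumptions made for illustration; the paper does not prescribe a coordinate convention.

```python
def cell_at_level(pm, n, side=1.0):
    """Return (row, col, cell_side) of the square containing pm at level n.

    Level 1 is the whole domain; level n has 2**(n-1) cells per axis,
    i.e. 4**(n-1) squares in total.
    """
    x, y = pm
    per_axis = 2 ** (n - 1)
    cell_side = side / per_axis
    # Clamp so that points on the upper boundary fall into the last cell.
    col = min(int(x // cell_side), per_axis - 1)
    row = min(int(y // cell_side), per_axis - 1)
    return row, col, cell_side

def cells_on_path(pm, levels, side=1.0):
    """One square per level, from the root (level 1) down to the leaf (level L)."""
    return [cell_at_level(pm, n, side) for n in range(1, levels + 1)]

path = cells_on_path((0.6, 0.3), levels=3)
```

Each entry of `path` is a candidate cloaking area for the same user, with the side length halving at every level.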

4.3. Definition of Parameters

Here again, in order to be clear, the definitions of several parameters are given. First of all, two practical scenarios are necessary to be introduced to facilitate the comprehension.

Scenario 1.

When a user asks for the nearest supermarket location, even if the cloaking area sent to LBS provider is only a few meters away from his/her real location, the service result may be far from reality, which means that the probability of errors in the LBS response is high.

Scenario 2.

When a user asks about the weather, even if the cloaking area reported to LBS provider is several kilometers away, the response is still likely to be correct.

As the two scenarios above show, different types of service have different requirements for QoS and location privacy protection. For example, a service such as navigation requires a small cloaking area to achieve a certain precision, which leaves room for only a lower degree of location privacy protection; a service such as a weather query tolerates a relatively large cloaking area, which allows a higher degree of location privacy protection.

Hence, we define a parameter called type of service (ToS) to indicate which type of service is being asked for by the user. By allocating different values to ToS, the service type is declared.

And when Location Anonymization Server receives ToS, it will generate a parameter called the location privacy security distance $d_{1}$ , which means that, in order to protect the location privacy, the distance between the subcloaking area and the real user position should be no less than $d_{1}$ .

Similarly, when LBS provider receives ToS, it will generate another parameter called the QoS-guaranteed distance $d_{2}$ , which means that, in order to guarantee QoS, the distance between the subcloaking area and the real user position should be no more than $d_{2}$ .

It is worth mentioning that, in this paper, $d_{1}$ and $d_{2}$ are two empirical bound values predefined without calculation. By means of analyzing massive actual instances, it is assumed that Location Anonymization Server and LBS provider could estimate empirically the two boundary values according to the type of service.

4.4. Quantification and Normalization of Privacy Demand Value and QoS

The design of a cloaking-area-based location privacy protection algorithm must take into consideration three factors: the privacy demand value (denoted by P), the QoS demand (denoted by Q), and the attack ability of adversaries [21]. From the users' point of view, the larger and farther away the cloaking area obtained by LBS provider is, and the greater the number of users in it, the higher the degree of privacy protection but the lower the quality of service. In short, the privacy protection value increases with the number of users in the cloaking area, whereas QoS decreases. In addition, the stronger the adversary, the less reliable the location privacy protection. In this paper, in order to simplify the analysis of the relation between location privacy protection and QoS, it is assumed that the attack ability of every adversary is equal.

(i) The Quantitative Representation of Privacy Demand Value P. The quantitative representation form of privacy demand value is given as below:

$$d_{p} = \mathrm{distance}(a(c), (x, y)), \qquad P = \begin{cases} 1, & \text{if } d_{p} \geq d_{1}, \\ \dfrac{d_{p}}{d_{1}}, & \text{if } d_{p} < d_{1}, \end{cases} \tag{1}$$

where $\mathrm{distance}(a(c), (x, y))$ denotes the Euclidean distance between the center of the cloaking area and the real user location $PM = (x_{i}, y_{i})$.
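A direct transcription of Equation (1) might look as follows. The function name and the representation of the area by its center point are assumptions for illustration.

```python
import math

def privacy_value(center, pm, d1):
    """Normalized privacy demand value P from Equation (1).

    center: center of the candidate area, pm: real user location,
    d1: location privacy security distance.
    """
    d_p = math.dist(center, pm)  # Euclidean distance
    return 1.0 if d_p >= d1 else d_p / d1

p_far = privacy_value((3.0, 4.0), (0.0, 0.0), d1=5.0)    # d_p = 5, so P = 1
p_near = privacy_value((3.0, 4.0), (0.0, 0.0), d1=10.0)  # d_p = 5, so P = 0.5
```

P saturates at 1 once the area is at least $d_{1}$ away, so moving the area farther than $d_{1}$ does not increase the privacy value.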

(ii) The Quantitative Representation of QoS Demand Q. In this paper, the QoS demand value is calculated using the distance between the real user location and the farthest point in the subcloaking area $a (c)$ , as given below:

$$d_{q} = \max\{\mathrm{distance}'(a(c), (x, y))\}, \qquad Q = \begin{cases} 1, & \text{if } d_{q} \leq d_{2}, \\ \dfrac{|d_{q} - d_{2}|}{d_{q}}, & \text{if } d_{q} > d_{2}, \end{cases} \tag{2}$$

where $\max\{\mathrm{distance}'(a(c), (x, y))\} = \max(|d_{A}|, |d_{B}|, |d_{C}|, |d_{D}|)$ and $|d_{A}|, |d_{B}|, |d_{C}|, |d_{D}|$ denote, respectively, the Euclidean distances between the real user location $PM = (x_{i}, y_{i})$ and the four vertices of $a(c)$. The lemma and proof are detailed in [12], and we do not recount them here.
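Equation (2) can be transcribed literally as in the sketch below. Representing a square as `(x0, y0, side)` with `(x0, y0)` its lower-left corner is an assumption, as are the function names.

```python
import math

def farthest_vertex_distance(square, pm):
    """d_q: largest Euclidean distance from pm to the four vertices of the square."""
    x0, y0, side = square
    vertices = [(x0, y0), (x0 + side, y0), (x0, y0 + side), (x0 + side, y0 + side)]
    return max(math.dist(v, pm) for v in vertices)

def qos_value(square, pm, d2):
    """QoS demand value Q, following Equation (2) as written."""
    d_q = farthest_vertex_distance(square, pm)
    return 1.0 if d_q <= d2 else abs(d_q - d2) / d_q

square = (0.0, 0.0, 1.0)
pm = (3.0, 4.0)                      # farthest vertex is (0, 0), so d_q = 5
q_ok = qos_value(square, pm, d2=5.0)
q_conflict = qos_value(square, pm, d2=4.0)
```

Using the farthest vertex means the whole subcloaking area lies within $d_{2}$ of the user whenever $Q = 1$.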

(iii) The Quantitative Representation of LBS Demands R. In order to balance the privacy demand and the QoS demand, a parameter $\alpha$ ($\alpha \in [0, 1]$) is introduced. Users can dynamically adjust the priority of privacy protection by defining the value of $\alpha$. Here, two different cases should be distinguished.

(1) When $d_{1} < d_{2}$, $d_{p} \geq d_{1}$, and $d_{q} \leq d_{2}$, P and Q are both equal to 1; that is, this is the optimal situation where QoS and location privacy protection are both completely satisfied and guaranteed. As a result, there is no need for the user to define $\alpha$ because the objective of protecting location privacy without decreasing QoS has already been achieved.

(2) When $d_{1} > d_{2}$, $d_{p} < d_{1}$, and $d_{q} > d_{2}$, P and Q both belong to (0,1) and there is a conflict between privacy protection and QoS-guarantee. In this case, P and Q are not both optimal, and therefore the user can adjust the value of $\alpha$ to express individual demands, meaning that the user can choose to attach more importance to location privacy protection or to QoS. A greater value of $\alpha$ represents a higher priority of location privacy protection. In this case, LBS demand R is given below. The objective is to find the minimal subcloaking area $a(c)$ which maximizes R:

$$R = d_{2} + \alpha (d_{1} - d_{2}) = \alpha d_{1} + (1 - \alpha) d_{2} = \alpha \cdot d_{p} \cdot P + (1 - \alpha) \cdot d_{q} \cdot (1 - Q). \tag{3}$$
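The weighting can be sketched in code using the right-most expression of Equation (3), which is also the form that appears in Algorithm 3. The function name and the argument order are assumptions.

```python
import math

def lbs_demand(alpha, d_p, p, d_q, q):
    """LBS demand R = alpha * d_p * P + (1 - alpha) * d_q * (1 - Q)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * d_p * p + (1.0 - alpha) * d_q * (1.0 - q)

# Illustrative values only: d_p = 4, P = 0.5, d_q = 10, Q = 0.2.
r_balanced = lbs_demand(0.5, 4.0, 0.5, 10.0, 0.2)
r_privacy_only = lbs_demand(1.0, 4.0, 0.5, 10.0, 0.2)
r_qos_only = lbs_demand(0.0, 4.0, 0.5, 10.0, 0.2)
```

Note that with Q as defined in Equation (2), $1 - Q = d_{2}/d_{q}$ whenever $d_{q} > d_{2}$, so the second term reduces to $(1 - \alpha) d_{2}$, in agreement with the middle expression of Equation (3).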

4.5. Algorithm Design and Analysis

In this section, the three core algorithms applied in DUDD location privacy protection scheme will be presented in detail, namely, parameters generation algorithm (composed of Algorithms 1 and 2), subcloaking area selection algorithm (composed of Algorithms 3 and 4), and user side process algorithm.

Algorithm 1: Parameter ( $d_{2}$ ) generation on LBS provider side.

Input: ToS;

Output: QoS-guaranteed distance $d_{2}$ ;

(1) Receive the parameter ToS from user side.

(2) LBS provider generates $d_{2}$ according to ToS;

(3) Return $d_{2}$ to user directly.

Algorithm 2: Parameters (cloaking area c, $d_{1}$ ) generation on Location Anonymization Server side.

Input: LBS query ( $u_{i}$ , $P M$ , t, r, ToS); A quadtree T; The total number of levels L;

The privacy demand value and the minimum area required (k, $A_{\min}$ ) is included in r.

Output: Cloaking Area c and $d_{1}$ .

(1) Initialize a LBS request ( $u_{i}$ , $P M$ , t, r, ToS), send it entirely to Location Anonymization Server.

(2) According to ToS, Location Anonymization Server generates $d_{1}$.

(3) Find the leaf area $c_{i}$ (n) (i.e., the minimum square area) in quadtree T that contains the user location $P M$, where n represents the current level in the quadtree T. At the beginning, n is set to L, that is, the bottom level of T.

(4) while the number of users in $c_{i}$ (n) is less than k, do

(5) Let $n = n - 1$ , that is, move upwards one level at a time until k is fulfilled.

(6) end while

(7) while A (the area of $c_{i}$ (n)) < $A_{\min}$ , do

(8) Let $n = n - 1$ , that is, move upwards one level at a time until $A_{\min}$ is fulfilled.

(9) end while

(10) Let $c = c_{i}$ (n).

(11) Return parameters $d_{1}$ and c to user.
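The level search of Algorithm 2 can be sketched as follows. Several choices here are assumptions and not part of the paper: user positions are given as a plain list (which includes the requesting user), squares are returned as `(x0, y0, side)`, the two loops are merged into one because both conditions can only improve when moving upwards, and `None` is returned when even the root cannot satisfy the demand. The generation of $d_{1}$ from ToS is omitted because the paper treats it as an empirical value.

```python
def cell_of(pm, n, side):
    """(row, col, cell_side) of the level-n square containing pm."""
    per_axis = 2 ** (n - 1)
    cell_side = side / per_axis
    col = min(int(pm[0] // cell_side), per_axis - 1)
    row = min(int(pm[1] // cell_side), per_axis - 1)
    return row, col, cell_side

def generate_cloaking_area(pm, users, k, a_min, levels, side=1.0):
    """Climb from the leaf level until both k and A_min are satisfied."""
    n = levels
    while n >= 1:
        row, col, cell_side = cell_of(pm, n, side)
        # Count the users located in the same square as pm at this level.
        count = sum(1 for u in users if cell_of(u, n, side)[:2] == (row, col))
        if count >= k and cell_side ** 2 >= a_min:
            return n, (col * cell_side, row * cell_side, cell_side)
        n -= 1  # move upwards one level
    return None  # demand cannot be met inside the spatial domain

users = [(0.1, 0.1), (0.2, 0.2), (0.6, 0.6), (0.9, 0.1)]
result = generate_cloaking_area((0.1, 0.1), users, k=2, a_min=0.2, levels=3)
```

In this example the leaf square already holds two users but is smaller than `a_min`, so the search stops one level higher.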

Algorithm 3: Subcloaking area $a (c)$ selection from two candidates subcloaking areas.

Input: A quadtree T; Parameters [ $d_{1}, d_{2}, c$ ] and the number of level n obtained from Algorithms 1 and 2,

User location $P M = (x, y)$ . Two sub-cloaking areas $c_{i}$ and $c_{j}$ .

Output: Sub-cloaking Area $a (c)$ .

(1) Let the user choose a value from [0, 1] for parameter α according to his/her individual demand for location privacy protection.

(2) Let $d_{p i} = distance (c_{i}, (x, y))$ , $d_{p j} = distance (c_{j}, (x, y))$ .

(3) if $d_{p i} \geq d_{1}$ , then Let $P (c_{i}) = 1$ ; else Let $P (c_{i}) = d_{p i} / d_{1}$ .

(4) if $d_{p j} \geq d_{1}$ , then Let $P (c_{j}) = 1$ ; else Let $P (c_{j}) = d_{p j} / d_{1}$ .

(5) Let $d_{q i} = \max \{{distance}^{'} (c_{i}, (x, y)) \}$ , $d_{q j} = \max \{{distance}^{'} (c_{j}, (x, y)) \}$ .

(6) if $d_{q i} \leq d_{2}$ , then Let $Q (c_{i}) = 1$ ; else Let $Q (c_{i}) = | d_{q i} - d_{2} | / d_{q i}$ .

(7) if $d_{q j} \leq d_{2}$ , then Let $Q (c_{j}) = 1$ ; else Let $Q (c_{j}) = | d_{q j} - d_{2} | / d_{q j}$ .

(8) if $P (c_{i}) = 1$ and $Q (c_{i}) = 1$ , then Let $a (c) = c_{i}$ .

(9) break, that is, the algorithm terminates and returns $a (c)$ .

(10) else if $P (c_{j}) = 1$ and $Q (c_{j}) = 1$ , then Let $a (c) = c_{j}$ .

(11) break, that is, the algorithm terminates and returns $a (c)$ .

(12) else Let $R (c_{i}) = \alpha P (c_{i}) \cdot d_{p i} + (1 - \alpha) \cdot d_{q i} \cdot (1 - Q (c_{i}))$ ,

(13) $R (c_{j}) = \alpha P (c_{j}) \cdot d_{p j} + (1 - \alpha) \cdot d_{q j} \cdot (1 - Q (c_{j}))$ .

(14) if $R (c_{i}) > R (c_{j})$ , then Let $a (c) = c_{i}$ .

(15) else Let $a (c) = c_{j}$ .

(16) Return $a (c)$ .
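A compact sketch of the pairwise choice in Algorithm 3 is given below. It assumes that squares are represented as `(x0, y0, side)` tuples and that the distance $d_{p}$ is measured to the square's center; the function names are hypothetical.

```python
import math

def metrics(square, pm, d1, d2):
    """Return (d_p, P, d_q, Q) for one candidate square."""
    x0, y0, s = square
    center = (x0 + s / 2, y0 + s / 2)
    vertices = [(x0, y0), (x0 + s, y0), (x0, y0 + s), (x0 + s, y0 + s)]
    d_p = math.dist(center, pm)
    d_q = max(math.dist(v, pm) for v in vertices)
    p = 1.0 if d_p >= d1 else d_p / d1
    q = 1.0 if d_q <= d2 else abs(d_q - d2) / d_q
    return d_p, p, d_q, q

def choose(ci, cj, pm, d1, d2, alpha):
    """Pick one of two candidates, checking c_i before c_j as in Algorithm 3."""
    scores = []
    for c in (ci, cj):
        d_p, p, d_q, q = metrics(c, pm, d1, d2)
        if p == 1.0 and q == 1.0:
            return c  # both demands fully satisfied: stop immediately
        scores.append(alpha * p * d_p + (1 - alpha) * d_q * (1 - q))
    # Conflict case: keep the candidate with the larger R (ties go to c_j).
    return ci if scores[0] > scores[1] else cj

pm = (0.1, 0.1)
ci, cj = (0.0, 0.0, 0.5), (0.5, 0.5, 0.5)
picked = choose(ci, cj, pm, d1=2.0, d2=0.1, alpha=1.0)
```

With `alpha=1.0` only the privacy term counts, so the candidate farther from the user is selected in the conflict case.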

Algorithm 4: Iterative selection process.

Input: The list of sub-cloaking areas $C = [c_{1}, c_{2}, c_{3}, c_{4}]$ ; Iterative variable k (a loop counter, unrelated to the privacy degree k).

Output: Sub-cloaking Area $a (c)$ .

(1) Let $c_{i} = C [0]$ , $c_{j} = C [1]$ , $k = 2$ .

(2) while $k \leq 4$ , do {

(3) $a (c) =$ Algorithm 3 $(c_{i}, c_{j})$ , that is, call Algorithm 3.

(4) if $a (c) \neq c_{i}$ , then Let $c_{i} = c_{j}$ ;

(5) if $k < 4$ , then Let $c_{j} = C [k]$ ;

(6) k++. }

(7) end while.

(8) Return $a (c)$ .
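The iteration in Algorithm 4 amounts to keeping a running winner and comparing it with each remaining candidate in turn. The sketch below abstracts the pairwise comparison of Algorithm 3 into a callable named `choose`; that name, and the toy scoring used in the example, are assumptions.

```python
def iterative_select(candidates, choose):
    """Reduce a list of candidates to one by repeated pairwise selection."""
    best = candidates[0]
    for challenger in candidates[1:]:
        best = choose(best, challenger)  # the winner stays in the c_i slot
    return best

# Toy example: each candidate carries a precomputed score (illustrative values).
calls = []
def by_score(a, b):
    calls.append((a[0], b[0]))
    return a if a[1] > b[1] else b  # strict '>' as in Algorithm 3, ties go to b

areas = [("c1", 0.2), ("c2", 0.7), ("c3", 0.5), ("c4", 0.7)]
selected = iterative_select(areas, by_score)
```

Four candidates require exactly three comparisons, and no index beyond the last element of the list is ever read.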

4.5.1. Algorithm of Parameters Generation

Employing the k-anonymity method, Location Anonymization Server generates and returns a cloaking area c for every user. Beginning with the smallest square that contains the user location, Location Anonymization Server moves upwards one level at a time until the value of k is satisfied. The server then checks whether $A_{\min}$ is guaranteed. In the end, the server returns to the user a cloaking area c that satisfies k and $A_{\min}$ simultaneously. Moreover, according to the parameter ToS in the LBS query, Location Anonymization Server and LBS provider each generate their restrictions on the distance between the user's actual location and the subcloaking area, as described in Algorithms 1 and 2.

4.5.2. Algorithm of Subcloaking Area Selection and User Side Process

After receiving the cloaking area c returned from Location Anonymization Server, users will select a subcloaking area a(c) within c and send $a (c)$ to LBS provider. Actually, $a (c)$ is selected from $\{c_{1}, c_{2}, c_{3}, c_{4}\}$ , where $\{c_{1}, c_{2}, c_{3}, c_{4}\}$ is a set of four candidate subcloaking areas of cloaking area c. It is worth mentioning that $c_{1}, c_{2}, c_{3}, c_{4}$ are four smaller squares obtained by dividing c according to the data structure presented in Section 4.2.

The algorithms of subcloaking area selection and user side process are detailed in Algorithms 3 and 5.

Algorithm 5: User side process.

Input: $P M = (x_{i}, y_{i})$; current time t; LBS query ($u_{i}$, PM, t, r, ToS).

(1) Initialize current location PM and let t = current time.

(2) Send an LBS query ($u_{i}$, PM, t, r, ToS) to Location Anonymization Server, where r represents the anonymity level, which contains the couple (k, $A_{\min}$), and ToS represents the type of service.

(3) Wait until the cloaking area c is received from Location Anonymization Server.

(4) Call Algorithms 3 and 4 to generate subcloaking area $a (c)$.

(5) if the area required by LBS provider is $b (c) < a (c)$:

(6) Make a circle of area $b (c)$ within $a (c)$ and send it to LBS provider.

(7) else send $a (c)$ to LBS provider.

(8) Let $P M = (x_{i}^{'}, y_{i}^{'})$; that is, if the user location changes, update $P M$.

(9) if $P M \in c$

(10) break;

(11) else go back to step (1).

Since there are four candidates in the set of subcloaking areas $\{c_{1}, c_{2}, c_{3}, c_{4}\}$, an iterative selection is necessary to finally pick out $a (c)$. The iterative selection algorithm is detailed in Algorithm 4.

By Algorithms 3 and 4, users can balance the privacy demand and QoS by adjusting α when there is a conflict between privacy protection and QoS-guarantee. The subcloaking area $a (c)$ is generated randomly, so it may or may not contain the real user location. Thus, even if the adversary obtains the subcloaking area $a (c)$, it is not easy to tell whether this area covers the real user location.

After the execution of Algorithms 3 and 4, the optimal subcloaking area $a (c)$ is picked out. In general, the user's next move is to send $a (c)$ to LBS provider and to get the service response in return. However, in order to guarantee QoS, LBS provider sometimes has its own restriction on the size of $a (c)$, and generally it demands a smaller area than $a (c)$. Therefore, step (5) of Algorithm 5 accounts for this restriction by selecting a smaller region $b (c)$ within $a (c)$ according to the requirement of LBS provider. Here a circle is used to represent $b (c)$. If LBS provider has no specific restriction, steps (5) and (6) are skipped and $a (c)$ is sent directly.
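As a small illustration of the circle construction, the Python sketch below computes a circle of a requested area inside a square $a (c)$. The text does not say where the circle is placed, so centering it in the square is an assumption, as is the function name `circle_within`. The sketch raises an error when a circle of the requested area cannot fit inside the square.

```python
import math
from typing import Tuple

Square = Tuple[float, float, float]  # (x_min, y_min, side)
Circle = Tuple[float, float, float]  # (center_x, center_y, radius)


def circle_within(a_c: Square, area: float) -> Circle:
    """Return a circle of the requested area centered in square a_c.

    Centering is an assumption; the text does not say where b(c) is placed.
    """
    x0, y0, side = a_c
    if area <= 0:
        raise ValueError("area must be positive")
    radius = math.sqrt(area / math.pi)  # from area = pi * r^2
    if 2 * radius > side:
        raise ValueError("a circle of this area does not fit inside a(c)")
    return (x0 + side / 2, y0 + side / 2, radius)


# Hypothetical request: a circle of area 0.5 inside a unit square.
b_c = circle_within((5.0, 9.0, 1.0), 0.5)
```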

Steps (8) to (11) of Algorithm 5 handle the case in which a user sends the same LBS query several times within a short time interval without leaving the cloaking area c. To avoid wasting server computation time, a location tracker PM is used to verify whether the user location is outside the boundaries of the cloaking area c. If the user location is no longer within the cloaking area c, the same query may be sent again. Otherwise, the response to the last query is returned to the user.
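The reuse of the last response can be sketched in Python as below. The class `QueryCache` and its callback `run_query`, which stands for the whole exchange with Location Anonymization Server and LBS provider, are hypothetical names introduced only for this illustration.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[float, float]
Square = Tuple[float, float, float]  # (x_min, y_min, side)


def inside(pm: Point, c: Square) -> bool:
    """True when location pm lies within cloaking square c (borders included)."""
    x0, y0, side = c
    return x0 <= pm[0] <= x0 + side and y0 <= pm[1] <= y0 + side


class QueryCache:
    """Reuse the last response while the user stays inside the last cloaking area."""

    def __init__(self, run_query: Callable[[Point], Tuple[Square, str]]):
        # run_query is a hypothetical callback that performs the full
        # exchange and returns (cloaking area, service response).
        self._run_query = run_query
        self._area: Optional[Square] = None
        self._response: Optional[str] = None

    def query(self, pm: Point) -> str:
        if self._area is not None and inside(pm, self._area):
            return self._response  # user has not left c: reuse the answer
        self._area, self._response = self._run_query(pm)
        return self._response
```

A new exchange is triggered only when the tracked location falls outside the stored cloaking area.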

5. Feasibility Simulation Results and Analysis

The feasibility simulation results of the proposed DUDD location privacy protection scheme are given in this section. The software MATLAB is used to conduct the simulation. Here, the value of the parameter ToS is predefined in order to simplify the simulation. The simulation parameters are shown in Table 1.

Table 1

Simulation parameters designed for LBS query.

User location $(x, y)$	Time stamp t	Privacy demand k	Minimum area $A_{\min}$	Type of service ToS
A: (4.5, 8.6); B: (4.8, 8.9)	System current time	53	$\geq s_{4}^{*}$	$ToS = 1$, navigation service

$^{*} s_{4}$ represents the area of one single square in level 4.

5.1. Location Anonymization Server Side

On Location Anonymization Server side, a 5-level quadtree model is constructed within a square region, and the whole square region is partitioned using the method presented in Section 4.1, as shown in Figure 6.

Figure 6

A quadtree with 5 levels. Panels (a) to (e) contain 1, 4, 16, 64, and 256 squares, corresponding to levels 1 to 5 of the quadtree, respectively.
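The square counts in the caption follow from each level splitting every square of the previous level into four. As a short worked relation, writing $S$ for the side length of the whole region and $\ell$ for the level (both symbols are introduced here only for illustration), $N_{\ell}$ is the number of squares at level $\ell$ and $s_{\ell}$ is the side of one such square:

```latex
N_{\ell} = 4^{\ell - 1}, \qquad s_{\ell} = \frac{S}{2^{\ell - 1}}, \qquad \ell = 1, \dots, 5,
\quad\text{so}\quad (N_{1}, \dots, N_{5}) = (1, 4, 16, 64, 256).
```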

Then, 53 points are generated at random to represent 53 different user locations, distributed within a square whose vertex coordinates are $(4,8), (6,8), (6,10)$, and (4,10), as shown in Figure 7(a). Suppose that the current location of user A is (4.5,8.6), which is one of the 53 points. When this user sends an LBS query to Location Anonymization Server, a cloaking area c is returned, shown as the red zone in Figure 7(a). Within the cloaking area c, the 4 subcloaking areas are named $c_{1}, c_{2}, c_{3}, c_{4}$ in the counterclockwise direction, as shown in Figure 7(b).

Figure 7

Cloaking area c returned to user (4.5,8.6) and its 4 subcloaking areas. (b) is a zoomed-in view of the shaded area in (a).

5.2. User Side

On user side, according to the value of α, a specific subcloaking area $a (c)$ is selected from $\{c_{1}, c_{2}, c_{3}, c_{4}\}$. Here, two different situations are distinguished.

(i) LBS Provider Has No Restriction on the Subcloaking Area Size Selected by the User. Suppose that user A is still at point (4.5,8.6) and α is set to 0.5; that is, the privacy protection demand and the QoS demand are weighted equally. By comparing the value of R in every subcloaking area, that is, $c_{1}, c_{2}, c_{3}, c_{4}$ in Figure 7(b), the program returns the subcloaking area $c_{4}$, where R attains its maximum, as shown in Figure 8.

Figure 8

Subcloaking area selection result for user (4.5,8.6) without any size restriction from LBS provider. The subcloaking area $c_{4}$ is selected and sent to LBS provider.

(ii) LBS Provider Has Restriction on the Subcloaking Area Size Selected by the User. Suppose that LBS provider requires that the area of the subcloaking area offered by the user be no more than 0.8. Then, a circle of area 0.8 is selected within $a (c)$ and sent to LBS provider, as shown in Figure 9.

Figure 9

Subcloaking area selection result for user $(4.5, 8.6)$ with size restriction 0.8 from LBS provider. A smaller circle within the subcloaking area $c_{4}$ is selected and sent to LBS provider.

5.3. Validation of Edge Information Attack Prevention

5.3.1. Two-User Situation

Take 15 unevenly spaced values of α between 0 and 1 and repeat the aforementioned simulation for the two points (4.5,8.6) and (4.8,8.9). The simulation results are presented in Tables 2(a) and 2(b).

Table 2

(a) Subcloaking area selection results for user position (4.5, 8.6) under different values of α. (b) Subcloaking area selection results for user position (4.8, 8.9) under different values of α.


(a)
α	0.010	0.040	0.070	0.100	0.200	0.300	0.400	0.500	0.600	0.700	0.800	0.900	0.930	0.960	0.990
$R_{\max}$	1.283	1.283	1.680	2.185	3.883	5.580	7.277	8.974	10.670	12.370	14.066	15.760	16.270	16.781	17.291
Subcloaking area	$c_{2}$	$c_{2}$	$c_{4}$	$c_{4}$	$c_{4}$	$c_{4}$	$c_{4}$	$c_{4}$	$c_{4}$	$c_{4}$	$c_{4}$	$c_{4}$	$c_{4}$	$c_{4}$	$c_{4}$

(b)
α	0.01	0.04	0.07	0.10	0.20	0.30	0.40	0.50	0.60	0.70	0.80	0.90	0.93	0.96	0.99
$R_{\max}$	0.88	1.02	1.47	1.81	2.95	4.09	5.23	6.37	7.52	8.66	9.80	10.94	11.28	11.62	11.97
Subcloaking area	$c_{2}$	$c_{2}$	$c_{3}$	$c_{3}$	$c_{3}$	$c_{3}$	$c_{3}$	$c_{3}$	$c_{3}$	$c_{3}$	$c_{3}$	$c_{3}$	$c_{3}$	$c_{3}$	$c_{3}$

Through data analysis, several conclusions can be drawn.

(1) The randomness of the algorithms:

(i) For point (4.5,8.6), as α takes different values in $[0,1]$ as shown in the table, the subcloaking area is either $c_{2}$ or $c_{4}$, randomly distributed; for point (4.8,8.9), the subcloaking area is either $c_{2}$ or $c_{3}$, randomly distributed.

(ii) For the same value of α, the result differs with the user's location. Moreover, for two points in the same area (e.g., the two points selected here are both in area $c_{2}$), even when they are very close to each other, the distribution of the subcloaking area selection result as α changes is different.

Because of this kind of randomness, it becomes more difficult for LBS provider to find out the real user location, making location privacy protection more effective.

(2) The assignment of α:

(i) For a single user, when α is assigned a small value, the selected subcloaking area tends to be nearer to the real user position; when α is assigned a large value, it tends to be farther from the real user position.

Therefore, if the user prioritizes location privacy protection over QoS, a larger α (usually greater than 0.5) should be selected; if the user needs a better QoS, a smaller α (usually less than 0.5) should be selected.

5.3.2. Multiuser Situation

In Section 5.3.1, only two users located in $c_{2}$ are taken as an example to simulate the two-user situation. In this subsection, all the user locations in areas $c_{1}$ and $c_{3}$ are selected to simulate the multiuser situation. For each user located in area $c_{1}$, the same simulation process is conducted and a subcloaking area is finally picked out from $\{c_{1}, c_{2}, c_{3}, c_{4}\}$. Then the number of users falling in each subcloaking area is counted, that is, how many users selected $c_{1}$ as the final subcloaking area, how many users selected $c_{2}$, and so forth. Finally, the proportion of users falling in each subcloaking area is calculated and plotted as a bar graph, shown in blue in Figure 10(a); the same is done for the users located in $c_{3}$, shown in red in Figure 10(a).

Figure 10

Subcloaking area selection result comparison under multiuser situation.
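The counting step behind the bar graph can be sketched in Python as follows. The function name `selection_proportions` and the eight sample outcomes are hypothetical and serve only to show the computation; they are not simulation data.

```python
from collections import Counter
from typing import Dict, Iterable

LABELS = ("c1", "c2", "c3", "c4")


def selection_proportions(selected: Iterable[str]) -> Dict[str, float]:
    """Fraction of users whose final subcloaking area is each label."""
    counts = Counter(selected)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        raise ValueError("no selections to summarize")
    return {label: counts[label] / total for label in LABELS}


# Hypothetical outcomes for eight users; not data from the paper.
example = ["c3", "c3", "c4", "c3", "c2", "c3", "c4", "c3"]
proportions = selection_proportions(example)
```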

Several conclusions can be drawn.

(1) When α takes identical values, for users in different areas, the subcloaking area selection results are unevenly distributed; when α takes different values, for users in the same area, the subcloaking area selection results are still unevenly distributed.

(2) In reality, α is defined entirely by users according to their own actual situations, which brings great randomness to the subcloaking area selection. Therefore, even if the adversary manages to learn the subcloaking area reported to LBS provider, it is extremely difficult to find out the real user location, which effectively prevents the edge information attack and realizes location privacy protection.

5.4. Subcloaking Area Selection Rule

Firstly, several definitions are given below.

(i) Initial Subcloaking Area. It is the subcloaking area where users are when they launch their LBS queries.

(ii) Diagonal Subcloaking Area. It is the subcloaking area which is in the diagonal direction of the initial subcloaking area.

(iii) Clockwise Neighbor Subcloaking Area. It is the adjacent subcloaking area of the initial subcloaking area, in the clockwise direction.

(iv) Counterclockwise Neighbor Subcloaking Area. It is the adjacent subcloaking area of the initial subcloaking area, in the counterclockwise direction.

For example, as shown in Figure 7(b), if $c_{3}$ is defined as the initial subcloaking area, then $c_{1}$ is the diagonal subcloaking area of $c_{3}$, and $c_{2}$ and $c_{4}$ are, respectively, its clockwise and counterclockwise neighbor subcloaking areas.
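Because the four areas are numbered counterclockwise, the three other roles follow from the index of the initial subcloaking area by modular arithmetic. A minimal Python sketch, with the hypothetical helper name `neighbor_roles`:

```python
from typing import Dict


def neighbor_roles(initial: int) -> Dict[str, int]:
    """Map an initial subcloaking area index (1..4) to the other three roles,
    assuming the areas are numbered counterclockwise as in Figure 7(b)."""
    if initial not in (1, 2, 3, 4):
        raise ValueError("initial must be 1, 2, 3 or 4")
    return {
        "initial": initial,
        "counterclockwise": initial % 4 + 1,   # next index in the numbering
        "diagonal": (initial + 1) % 4 + 1,     # two steps away
        "clockwise": (initial + 2) % 4 + 1,    # previous index in the numbering
    }


roles = neighbor_roles(3)
```

For the initial area $c_{3}$ this reproduces the example above: diagonal $c_{1}$, clockwise $c_{2}$, counterclockwise $c_{4}$.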

Then, based on the simulation results of all 53 users, a line graph illustrating the probability that the selected subcloaking area is the initial or the diagonal subcloaking area at different α values is given in Figure 11.

Figure 11

Probability distribution of initial and diagonal subcloaking areas selection under different values of α.

Similarly, the probability that the subcloaking area falls into clockwise and counterclockwise subcloaking areas at different α values is described in Figure 12.

Figure 12

Probability distribution of clockwise and counterclockwise subcloaking areas selection under different values of α.

From the analysis of Figures 11 and 12, a few conclusions can be drawn.

(1) In the interval $[0,1]$, when α approaches 0, the initial subcloaking area is more likely to be selected.

(2) When α approaches 1, the diagonal subcloaking area is more likely to be selected.

(3) As α increases within $[0.1,0.9]$, the probability of falling into the diagonal subcloaking area rises dramatically and, in contrast, that of falling into the initial subcloaking area falls sharply.

(4) As α increases, the probabilities of falling into the clockwise and counterclockwise subcloaking areas show the same tendency: they first increase and then decrease, reaching the maximum when α is around 0.5.

These conclusions fit well with the actual situation. When α is small, users pay more attention to QoS than to location privacy protection. Therefore, in order to gain a high QoS, the initial subcloaking area should be returned because it is closer to the real user location. Conversely, as α increases, users attach more importance to location privacy protection. As a consequence, the diagonal subcloaking area is returned because it is farther from the real user location. When α is around 0.5, users hope to preserve their location privacy and obtain a high QoS at the same time. Hence, adjacent subcloaking areas are returned.
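The tendency described above can be illustrated with a small Python sketch. It assumes, purely for illustration, that R is a weighted sum of a normalized privacy score and a normalized QoS score; the definition actually used in the scheme is the one in Section 4.4 and may differ. The function name `select_by_weight` and all scores below are hypothetical.

```python
from typing import Dict, Tuple


def select_by_weight(candidates: Dict[str, Tuple[float, float]],
                     alpha: float) -> str:
    """Pick the candidate maximizing alpha*privacy + (1-alpha)*qos.

    The weighted-sum form is an assumption of this sketch, not the
    paper's definition of R.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return max(candidates,
               key=lambda name: alpha * candidates[name][0]
               + (1.0 - alpha) * candidates[name][1])


# Hypothetical normalized (privacy, QoS) scores; not data from the paper.
areas = {
    "initial": (0.1, 1.0),
    "adjacent": (0.6, 0.7),
    "diagonal": (1.0, 0.1),
}
```

With these illustrative scores, a small α favors the initial area, a large α favors the diagonal area, and α = 0.5 favors an adjacent area.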

5.5. Simulation Result When k Increases

When k gets bigger, the cloaking area c generated via Algorithm 2 at the same user location (4.5,8.6) may or may not be the same as before. If it is not the same, the new cloaking area c must move to a higher level, which means that the corresponding subcloaking areas also change.

Here, it is assumed that k equals 100 and that the cloaking area c moves up only one level, to level 3. Then, according to Algorithm 2, the new cloaking area c becomes the square whose vertex coordinates are (4,8), (8,8), (8,12), and (4,12), as shown in Figure 13(a). The four subcloaking areas are named $c_{1}^{'}, c_{2}^{'}, c_{3}^{'}, c_{4}^{'}$, and each of them corresponds to a square in level 4, as shown in Figure 13(b).

Figure 13

Cloaking area returned to user (4.5,8.6) and its 4 subcloaking areas when $k = 100$. (b) is a zoomed-in view of the shaded area in (a).

Take all the different user locations in $c_{3}^{'}$ to conduct the same simulation process. The simulation result is shown in Figure 14(a).

Figure 14

(a) Subcloaking area selection results comparison of users located in $c_{3}^{'}$ when $k = 100$ . (b) Subcloaking area selection results comparison of users located in $c_{3}$ when $k = 53$ .

From Figure 14, several conclusions can be drawn.

(1) When k increases, the subcloaking area selection result after the anonymization process is still unevenly distributed; that is, the randomness of the algorithm remains unchanged regardless of the size of k.

(2) However, the probability of choosing the diagonal subcloaking area increases significantly as k increases. This is because a larger k means that a higher degree of location privacy protection is needed, so the subcloaking area will be farther from the real user location in order to realize a higher degree of location privacy protection.

6. Conclusion and Future Works

With the increasing importance of location privacy protection, various methods have been proposed over the past few years to preserve location privacy. However, few approaches take individual demands into consideration and seek a balance between privacy protection and QoS, which matters a lot in real scenarios. Hence, in this paper, a new distributed user-demand-driven location privacy protection scheme is proposed, an improved LBS system model is introduced, and a user-defined weight parameter is used to realize a balance between location privacy protection and QoS. A subcloaking area is found under the premise of protecting location privacy and guaranteeing QoS. The simulation results demonstrate the effectiveness and feasibility of the proposed method. Nevertheless, much future work remains. First, which value should be allocated to ToS in accordance with the type of service requires further consideration. Second, the accuracy of the two quantitative models should be improved in future research. Third, some integrated metrics are needed to evaluate whether user demands are really fulfilled.

Footnotes

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This paper is supported by New Century Excellent Talents in University (no. NCET-13-0657) and State Key Laboratory of Rail Traffic Control and Safety (no. RCS2014ZT31) and is also partially supported by NSFC (no. U1261109).


