Abstract
In moving environment, the positions of moving objects cannot be located accurately. Apart from the measuring instrument errors, movement of the objects is the main factor contributing to this uncertainty. This uncertainty makes dominant relationship of data instable, which will affect skyline operator. In this paper, we mainly study the continuous probabilistic skyline query for uncertain moving objects in road network. The query point is deemed to be stationary while moving objects are treated as targets with uncertainty described by a probability density function. After defining the notion of dominant probability and probabilistic skyline, we put forward a novel algorithm to deal with continuous probabilistic skyline query on road network. Firstly, we compute the dominant probability and skyline probability to get initial permanent p-skyline set. Then we define events to predict the time when dominant relationship between moving objects may change. Furthermore, we track and calculate events to update the probabilistic skyline in an incremental way. Two pruning strategies are proposed to cancel invalid events and objects in a bid to diminish search space. Finally, an extensive experimental evaluation on real datasets shows that probabilistic skyline sets in road network can be updated by the proposed algorithm. It demonstrates both efficiency and effectiveness.
1. Introduction
Skyline query aims to find a subset where all objects are not dominated by any other object in the dataset, helping users in multicriteria decision making, data mining and visualizing for database, and so forth.
In mobile database, a moving object reports its position and velocity to database service through wireless communication interface. Because of time delay and other technical limitations, the position obtained usually deviates from actual one. The deviation causes what is called uncertainty, which leads to instability of dominant relationship between objects. In real world, objects are restricted in road network, such as track, road, or highway. Position uncertainty of moving objects including its environment should be considered in skyline operation.
Until recently, a lot of work on skyline query had been focused on a static dataset, where the distances from query point to target objects are invariant. With rapid development of GPS technology and mobile devices, the location-based service (LBS) [1], which is pervasive in daily life, is becoming one of the most important applications in spatiotemporal database. However, in moving environment, moving objects whose position is detected by GPS system alternately are always in motion. Delay or error of the sensor in GPS, especially the self-motion of moving objects, makes the location of moving objects uncertain. Despite the existing uncertainty, the position is often treated as being accurate to be queried in many studies [2, 3]. Study on continuous skyline query on moving objects was first proposed by Huang et al. [4]. In his work, position of moving objects is considered as the data with certainty.
In fact, because of the sensors’ error and the objects’ movement, the position of objects in moving environment is uncertain and vague. Recently, a few researchers have been aware of this imprecision and made some contributions [5–7]. In our approach, the location of query object is exactly known as a stationary object, but the location of moving target objects is characterized by a certain distribution instead of a precise point. For example, a weapon-related crime takes place somewhere, which is located in point O, as Figure 1 shows. Police cars should be dispatched to intervene. The adequate police strength including the number of policemen and equipment in the dispatched car is needed. Also a car as near to the crime scene as possible is another factor for consideration in order to arrive in time. In Figure 1, table shows the parameters of each police car, including position of the cars, number of policeman, equipment level, and network distance to the query point O, where the value of equipment level is used to quantify the level of equipment, with greater value corresponding to higher level. If the police cars are treated as certain points, cars A and E are skyline objects. However, the cars are moving quickly, so their positions are uncertain to some extent. For example, suppose that the true position of car A is at point

An example of skyline query for uncertain data.
In this paper, we address the problem of continuous probabilistic skyline query for uncertain moving objects in road network. In our study case, the query object stays still, while the target objects are moving with uncertainty which is characterized by both a closed uncertainty region and a probability density function (pdf). The main contributions are as follows.
Probabilistic skyline on uncertain moving object in road network is introduced as a new important issue in decision support or navigation systems. By analyzing the dominant relationship between moving objects in uncertain position environment, the dominant probability and skyline probability are defined in case the query is stationary with targets moving. Based on trigger events which are presented to track the change of dominant relationship, a novel algorithm Probabilistic Skyline query with Uncertainty in Road network (PSUR) is proposed to deal with the dynamic skyline query with uncertainty. A series of pruning strategies are introduced to optimize and fasten this incremental algorithm. A great deal of experiments on two truth datasets are conducted to analyze the uncertain region, numbers of static dimensions, velocity of moving objects, and dataset size that affect the algorithm. Contrast experiments from literature [8] are made to validate the proposed PSUR. The experimental results show that PSUR is effective and efficient.
The rest of this paper is organized as follows. In Section 2, we summarize the related work of skyline computation. In Section 3, after describing basic notion of skyline and uncertain model for moving objects, we define dominant probability and probabilistic skyline. Section 4 introduces trigger events to track how dominant probability relationship varies when objects are moving with uncertainty. Two pruning strategies are adopted to reduce the search space. Section 5 describes PSUR algorithm to update p-skyline set. Section 6 gives experimental evaluation on two real datasets. Finally, the conclusion is reached in Section 7.
2. Related Work
Skyline queries are hot areas of current database research which recently have attracted more and more attention. It is introduced into relational database firstly by Borzsony et al. [9] with two proposed processing algorithms Block-Nested-loops algorithm (BNL) and Extended Divide & Conquer algorithm (D&C). As an improved method of BNL proposed by Dr. Chomicki et al. Sort-Filter-Skyline algorithm (SFS) [10] constructs the multidecision dominative order chains for the ordered data. NN algorithm [11], based on R-tree, searches the nearest neighbor recursively. It speeds up the skyline query by reducing the comparison numbers of BNL among objects. Nevertheless, it will cost more time and space in searching subspace repeatedly. Overcoming this limitation, the sorted R-tree based BBS [12] is proposed. It is one of the best skyline query methods in centralized datasets.
In recent years, skyline query has been extended to the dynamic datasets. The R-tree based I-Eager and I-Lazy algorithms were brought forth by Tao and Papadias [13] who firstly studied how to update and maintain the skyline results in dynamic datasets.. Tian et al. [14] introduced GICSC updating skyline query sets dynamically which better fits for low dimension metrics. Kontaki et al. [15] exerted efforts to maintain k-dominant skyline objects with maximum user's preference. Huang et al. [4] probed into the continuous skyline query for certain moving objects. It assumes that all points including the query point move in a predictable way. After analyzing dominating relationship between points in the space, Huang presented a continuous tracking algorithm-CSQ to maintain skyline sets dynamically.
In sensor networks and moving environment, the characters of the objects are not exactly known due to the limitations of measuring equipment and objects’ movement. A lot of work has focused on uncertain data. The authors in [5] studied the execution of probabilistic range and nearest-neighbor queries in mobile environment. The author in [16] focused on the situation in which the location of a query object is not exactly known. In research of skyline query field, Fiedler [17] firstly proposed skyline operator on uncertain data in his dissertation. Pei et al. [18] proposed a probabilistic skyline model for multiinstance data, where each object is part of the skyline with a certain probability. They presented two algorithms BUM and TDM to study skyline query on uncertain data of the possible world model (PWM). Lian and Chen [19] studied reverse-skyline query. They modeled the probabilistic reverse skyline query on uncertain data, in both monochromatic and bichromatic case, and proposed two effective pruning methods, MPRS and BPRS, to reduce the search space of query processing. Zhang et al. [20] explored how to maintain skyline sets when uncertain data are updated. An AR-tree is constructed for all uncertain data, and the maintenance is inserting, deleting, and updating on this AR-tree. The authors in [21] designed partitioning method to compute skyline probabilities for discrete data with uncertainty. The authors in [8] investigated skyline probabilities with a parametric form pdf (e.g., a Gaussian function or a Gaussian mixture model). The authors in[22] proposed a sliding window skyline model to study the execution of the probabilistic skyline query over uncertain data streams. [23]The authors studied a new problem of range-based skyline queries. Two novel algorithms I-SKY and N-SKY were presented to solve the probabilistic and continuous range-based skyline queries.
In road network, Huang and Jensen [24] assumed that the user's movement is constrained to a road network. The authors defined route nearest-neighbour skyline queries to consider the computation efficiency. Deng et al. [25] studied multisource skyline query in road networks. Three different road networks of multisource query methods were presented. The authors in [26] proposed a new method to process continuous skyline in road network based on precomputing the shortest range data of targets. The authors in [27] introduced route skyline computation in a multiattribute graph. Top routes are computed iteratively in an efficient way and pruning technique is adopted in order to reduce the search space. The authors in [28] focused on extracting the path skylines and proposed PathSL to generate an optimal skyline for moving objects.
In spite of much work focusing on uncertainty of skyline queries or route skyline queries, there is little work completed for skyline query concerning uncertainty of moving objects. The authors in[4, 22–25] ignored the uncertainty of moving objects, while the authors [8, 18, 21] dealt with discrete data with independent dominant relationship between objects. It is the first time to compute continuous probabilistic skyline query with regard to uncertainty of moving objects in road network.
3. The Probabilistic Skyline for Uncertain Moving Objects
3.1. Skyline on Certain Points
Suppose that the point set is
3.2. Uncertain Model for Moving Objects
After Wolfson et al. [1] firstly studied the uncertainty of moving objects, a lot of work has focused on this field [5–7]. An uncertain region [5] of a moving object
3.3. The Dominant Probability
What we mainly study on is as follows. There is a set of moving target objects whose centers are
Dissimilar to the traditional skyline query, in moving environment, the spatial location is changing with time. All attributes are divided into dynamic attributes and static ones. Let us suppose that there are m dynamic attributes, k static attributes, and n attributes where
Definition 1 (dominant probability).
Let
According to the definition of dominance in (1), we maintain that

Dominant probability.
3.4. The Skyline Probability
Definition 2 (skyline probability).
The skyline probability of object
Definition 3 (p-skyline).
Let
3.5. Uncertain Model in Road Network
The road network can be treated as a nondirection graph
In road network, the objects are limited movement, so the uncertain domain is denoted by a line segment with

Uncertain model in road network.
Denote
Definition 4 (minimum distance function).
The minimum distance between uncertain object
Definition 5 (maximum distance function).
The maximum distance between uncertain object
In continuous movement, the dominant relationship of two objects will change. For two moving objects

Transformation of dominant relationship.
4. Tracking by Events
The probabilistic skyline set is updated in an incremental way. The key step is how to predict the time when the dominant relationship changes. It's hard to know the accurate time when p-skyline just change. But we can estimate period of time during which the p-skyline may change.
4.1. Trigger Events
As shown in Figure 4, from time
4.2. The Pruning Strategy for Events
By analyzing relationship of objects we know that if static characters of any two objects
Pruning Strategy 1. If the dominant relationship in static attributes between two moving objects
Proof.
Suppose
Pruning Strategy 2. Suppose
Proof.
Because
5. Continuous Probabilistic Skyline Queries Algorithm
5.1. Initialization
The initialization framework is presented in this section. Each object has static and dynamic attributes, so we should compute the dominance relation on static attributes firstly. For two objects
(1) (2) for any point (3) compute_path( (4) compute_distance( (5) (6) for any point (7) compare( (8) if (9) flag( (10) push (11) else (12) flag( (13) if the (14)
In any case objects move,
5.2. PSUR Algorithm
Not all skyline probability of objects will change at one moment. If maximum/minimum distance function of one object cuts that of others in Cartesian coordinates, skyline probability of this object maybe vary (Algorithm 2). In other words, trigger events include all possible variations of p-skyline for each moving object.
(1) initialization(); (2) // compute events; (3) for any point (4) for each point (5) compute begin time s_time and end time e_time of each event; (6) if s_time (7) push it into (8) //handle events; (9) time = (10) while (time (11) while ( (12) e = (13) while ( (14) (15) e = (16) handle( (17) time++;
In order to simplify the problem, we suppose that all moving objects preserve their velocity with uniform speed. If not, the precomputing cannot be processed in advance. All events should be recomputed again. The movement of moving objects to query point can be picked up with its shortest path to query point. The intersecting time for the distance function can be recomputed and put into event queue sorted by time in ascending order. For each time, skyline probabilities of moving objects under trigger events need to be updated.
5.3. Algorithm Analysis and Discussion
The cost incurred by our method consists of three components: initialization, events computing, and updating by tracking.
In initialization period, computing static dominant relationship will cost
In second step, the worst cost of comparison for static dominant relationship between objects is
Handling events generates at most
In conclusion, the total cost is added together for these three periods, which is equivalent to
6. Experimental Evaluation
6.1. Datasets
Two real road networks are used to test the effect and efficiency of the proposed algorithm PSUR. One is the famous seashore city Oldenburg in Germany, which includes 6105 nodes and 7036 edges. The other is Cixi city of China, which contains 244 intersection nodes and 407 edges, much smaller than the first one. We assume that the uncertain area is segment along the road. Two distributions of uniform and Gaussian distribution for probability density function are adopted because these two functions are by far the most important and commonly used in statistics. Moving objects are generated randomly. Baseline and priority algorithms proposed in [8] are used to compare with our method PSUR to verify efficiency and effect. The main parameters are segment length Len, static dimension number d, numbers of moving objects Num, and time interval t, and probabilistic threshold p is 0.5 in all experiments.
We conducted our experiments on desktop PC running on Windows XP professional. The PC has Intel Core 2Duo 2.93 GHz and 3 GB RAM memory. All experiments were coded in Visual C++ 2008.
6.2. Numbers of Moving Objects
In this experiment, suppose

Numbers of moving objects effects of the performance in Cixi city
If there is the same number of moving objects in two real networks, the density of moving objects in Cixi is bigger than that in Oldenburg, because Cixi is smaller than Oldenburg. Therefore the chances for interaction of moving objects in Cixi are more those that in Oldenburg. Trigger events generated in Cixi are evidently more than those in Oldenburg, as Tables 2 and 3 show. The density of moving objects leads to this difference, because it is more sparsely distributed in Oldenburg than in Cixi for the same number of objects. As a result, by observing the performance between two city networks under the same data model, such as Figures 5(a) and 6(a), it is noted that the runtime in the smaller city Cixi costs a little more than that in Oldenburg.
Event size in Cixi road network (
Event size in Oldenburg road network (

Numbers of moving objects effect of the performance in Oldenburg city
The pruning strategy has done effective work on event size, as Tables 2 and 3 show. Event size will grow with the number of moving objects, because the number of the intersection of distance function rises.
6.3. The Effect of Uncertain Segment Length
In this section, parameters are as follows:

Segment length effects of the performance in Cixi city

Segment length effects of the performance in Oldenburg city
The magnitude of the uncertain length may also affect the performance of algorithm PSUR. It is seen that the longer the segment length is, the more runtime is required. If the length becomes much longer, for any two objects
It is shown that the segment length of uncertainty affects event's size, as shown in Tables 4 and 5. With increase of segment length, the number of events becomes great.
Event size in Cixi road work (
Event size in Oldenburg road work (
6.4. The Effect of Static Attribute Dimensionality
We will discuss the effect that multidimensional attributes impact on our algorithm over real road network. The result is shown in Figures 9 and 10. It is known that the higher dimension results in less run- time. This is because the computation of dominance relation on static attributes is carried out only once at the beginning. The higher the static dimensions are, the less objects that dominant

Static attribute dimensionality effects of the performance in Cixi city

Static attribute dimensionality effects of the performance in Oldenburg city
Tables 6 and 7 show the event size changes with static dimensionality of PSUR. Unlike preceding two experiments, in this case, when static dimensionality d increases, the event size reduces instead of increasing. This is due to the fact that, when d increases, the number of moving objects which have dominant relationship in static dimensionality decreases. The distance function will interact less, so will valid events.
Event size in Cixi road network (Len = 20; Num = 50;
Event size in Oldenburg road network (Len = 20; Num = 50;
6.5. The Effect of Time Span
Baseline and priority are somewhat static algorithms. They need to recompute skyline probability for each moving object, so runtime is proportional to the time span. However, as a dynamic method, PSUR can update the p-skyline set by tracking event size to decide which one might vary, so the time cost is approximately the same as in each timestamp, which saves more time than other two algorithms. Figures 11 and 12 show this evidence.

Time span effects of the performance in Cixi city

Time span effects of the performance in Oldenburg city
Tables 8 and 9 show the event size changes with time span length of PSUR. It is obvious that the event size grows with increasing time span length. Nevertheless, the difference is not significant. The runtime of PSUR varies a little in Figures 11 and 12.
Event size in Cixi road network (Len = 20; Num = 100;
Event size in Oldenburg road network (Len = 20; Num = 100;
7. Conclusion
To the best of our knowledge, this is the first work to compute probabilistic skyline queries for uncertainty in road network. In this paper, we have addressed the problem of continuous probabilistic skyline query for moving objects. Firstly attributes of objects are divided into static and dynamic to define dominant probability and skyline probability with continuous data. In order to update the skyline set continuously, trigger events are introduced to compute varying skyline probability of objects. Two pruning ways are proposed to save search space and speed up this computation. At last, the continuous probabilistic skyline query algorithm with uncertainty in road network named PSUR is proposed. Finally, a series of experiments are devised to verify the effectiveness and efficiency of PSUR. The results of our experimental study for different scales of datasets on two real road networks are very encouraging. The investigation demonstrates that the proposed updating method is more effective and efficient than periodic methods.
Footnotes
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This research is partly supported by the Zhejiang Natural Science Foundation (LY12F02020), Ningbo Natural Science Foundation (2013A610063, 2012A610066), and the K. C. Wong Magna Fund in Ningbo University.
