Abstract
Owning to the proliferation of cost-effective sensors, there has been an increased growth in a number of applications of wireless sensor networks (WSNs). In addition, the skyline operator as well as its variants such as the dynamic skyline and reverse skyline operator has attracted increasing attention since those are useful for multicriteria decision making applications. Since the energy efficiency is utmost important issue to prolong the network lifetime, in this paper, we proposed efficient algorithms to process a reverse skyline query over a sliding window in WSN environments. We first devise our algorithm for the data stream environments and extend it to WSN environments. To compute the reverse skyline, we partition the data space into several orthants with respect to a query point. And, in each orthant, we compute the reverse skyline independently using two buffers. In our experiment study, we demonstrate that our algorithm is much better than other algorithms.
1. Introduction
Since being introduced in the database community, the skyline operator [1] and its variants such as dynamic skyline [2] and reverse skyline [3] operators have attracted increasing attention in multicriteria decision making applications such as product recommendations [4, 5], querying wireless sensor networks [6], and graph analysis [7].
Given a d-dimensional point set P, a point
In general, a wireless sensor network (WSN) is considered as a cost effective platform to monitor environments. A WSN consists of spatially distributed devices with various sensors and a powered base station which serves as an access point for users to pose ad hoc queries. The research for diverse types of queries over WSNs, for example data gathering [8], aggregation queries [9, 10], join queries [11, 12], and skyline queries [6, 13, 14], has been conducted to satisfy the diverse application demands. Among the diverse types of queries, a reverse skyline query is very useful for environmental monitoring applications. For example, in an application of monitoring the forest environment, a lot of sensors are deployed in a forest to collect sensor readings such as temperature and humidity. Assume a query point q represents the thresholds of a possible fire disaster on different attributes.
A naive method to detect a forest fire with a query point q is that only the sensor nodes with sensor readings exceeding thresholds report their sensor reading. For instance, let each point in Figure 1(a) represent the sensor reading of each sensor node. Since many sensor readings, represented by dotted circles in Figure 1(a), exceed the thresholds, many sensor nodes consume much energy to transmit a lot of sensor readings. Because each sensor node is battery-powered and located in hazardous or hard-to-reach place, it is impossible or very difficult to change the batteries of sensor nodes. Thus, in WSN environments, the energy efficiency is the utmost important issue to prolong the network lifetime.

An example for a forest fire.
In contrast to the naive method, the reverse skyline operator considers the dominance relationship for attributes with respect to a query point q which indicates a potential fire disaster as shown in Figure 1(b). The reverse skyline points are represented by dotted circles in Figure 1(b). A point
In this paper, we investigate the problem of energy-efficient in-network reverse skyline computation in WSN environments. In particular, WSN can be considered as a source of data streams. The data stream can be broken into possibly overlapping partitions by specifying a window and computation can be carried out in each partition. While efficient processing techniques for window queries have been proposed in the area of data streams, most of the previous work on data stream processing assumes that query processing is conducted at a centralized server. On contrary, in-network processing is commonly used in sensor network where each sensor calculates a partial result. Therefore, in our work, we devise an energy efficient algorithm to compute the reverse skyline considering sliding window queries which return repeatedly reverse skyline points during a given time interval. In this paper, we consider the sensor readings that arrived in a sliding window with size w. Specifically, a sensor reading generated at time
Our Contributions. Our work has the following combination of contributions to perform the reverse skyline operator over a sliding window.
To make an efficient algorithm of the reverse skyline, we analyze the properties of the reverse skyline theoretically. At first, we divide the d-dimensional data space into Each sensor node can be regarded as a source of stream data since each sensor node measures its environment repeatedly. Thus, we first proposed an effective algorithm which computes reverse skyline for a sliding window over a data stream. The devised algorithm is running on each sensor node to generate partial result. To compute the reverse skyline progressively, our algorithm maintains two buffers We devise in-network reverse skyline processing technique in WSN environments. Each node in a WSN only transmits small number of points to its parent node when the points become newly dynamic skyline points or the states of them are changed. Accordingly, the energy consumption of each node decreases.
To evaluate our proposed algorithms, we implemented our algorithms. In our experiments, we use the synthetic data set and the real-world data set to show the effectiveness of our algorithms in data stream environments and WSN environments. In data stream environments, we measure the processing time of each algorithm and, in WSN environments, we measure the total energy consumption of every node. Our comprehensive empirical evaluation demonstrates that our algorithm delivers the best performance in all situations.
The rest of the paper is organized as follows. Section 2 introduces the various skyline operators and wireless sensor networks. Section 3 contains related work. We present the basic features of the reverse skyline and propose our basic algorithm to process the reverse skyline in Section 4. In Section 5, we present the energy efficient in-network processing technique to compute reverse skyline over sliding window in WSN environments. Section 6 presents the empirical evaluation results and Section 7 summarizes the paper.
2. Preliminaries
2.1. Various Skyline Operators
Formally, given a d-dimensional data set
Definition 1 (skyline).
Given a d-dimensional data set
Given a d-dimensional query point q, the dominance relationship extended the dynamic dominance relationship. We say that a point
Definition 2 (dynamic skyline).
Given a d-dimensional data set P and a query point q, the dynamic skyline of P with respect to q is represented by
Based on the dynamic skyline, the notion of the reverse skyline was proposed in [3]. The reverse skyline is defined as the follows.
Definition 3 (reverse skyline).
Given a d-dimensional data set P and a query point q, the reverse skyline, represented by
Example 4.
Consider the data set

An example of the dynamic skyline.
2.2. Wireless Sensor Networks
We consider a sensor network consisting of n stationary sensor nodes

A simple sensor network.
To form a routing tree, the base station first sends a request message which contains a hop-counter indicating the hop distance from the base station. When each node
3. Related Work
To reduce the energy consumption of WSNs, research on diverse types of queries such as aggregation, data gathering, join, and skyline has been conducted. One of the well-known approaches to reduce the energy consumption of WSNs is in-network processing. In the in-network processing techniques, the partial results are progressively merged at intermediate nodes on their way to the base station according to the tree routing.
3.1. Aggregation
The pioneering TAG work by Madden et al. in [9] studied in-network aggregation for reducing communication overhead using summary data (e.g., SUM) and/or exemplary data (e.g., MIN and MAX). In TAG, as climbing up a routing tree from leaf nodes to the base station, partial aggregation values are computed. Approximate aggregation techniques have been also proposed to reduce the energy consumption. The work of Considine et al. [18] was based on the FM sketch. Shrivastava et al. [19] developed the q-digest structure to support approximate processing for quantile queries such as MEDIAN. In [20], an effective aggregation technique for the situation that sensor nodes can detect an object duplicately was presented. To identify the duplicates eagerly, a variant of bloom filers was utilized in [20]. Refer to [21] for the summary of in-network aggregation.
3.2. Data Gathering
For the situations that require the sensor readings rather than an aggregate value, some approximate sensor data gathering techniques have been proposed since most applications of sensor networks do not require highly accurate data. Some correlations appear among sensor readings. Such correlations can be captured by standard techniques like the linear regression and statistical distribution functions. Basically, each sensor estimates its readings independently with its own model. And the mirror model for each sensor is in the base station. Thus, if a sensor node does not transmit a sensor reading, the base station can obtain an approximate reading using the mirror model.
BBQ [22] uses the multivariate Gaussian to model the sensor readings instead of data interval. Chu et al. [23] extends BBQ by partitioning the sensor field to cliques in order to utilize the spatial correlation. Since the optimal partitioning is NP-hard, Ken uses the greedy heuristics. Jain et al. suggested dual Kalman filter [24] which is based on the Kalman filter. In addition, Min and Chung proposed EDGES [25] based on a variant of the Kalman filter, that is, multimodel Kalman filter. In [8], by utilizing the spatial correlation such that the change patterns of sensor readings of the neighbor sensors are the same or similar, an effective data gathering technique was presented.
3.3. Join
In some applications, a user wants to identify the relationship between sensor readings in different regions. This regional correlation can be expressed as a join query of sensor readings in two regions. Thus, recently, research on in-network join processing has been proposed to reduce the communication overhead. Some works [26, 27] study how to find an optimal join location using the cost models. In these works, the optimal join location is near to the weighted centroid of three points: the center points of two regions and the base station.
Some in-network join techniques utilize a semijoin operator which filters out one of the relations based on the join attribute values of the other relation. However, due to a large number of join attribute values, a lot of energy is consumed. To alleviate this overhead, some work utilizes the synopsis of join attribute values. In [28], a histogram based semijoin approach is proposed. Stern et al. propose the method called SENS-Join [29], which is similar to that of [28], in order to avoid shipping tuples through the network that do not participate in the joins. As the compact representation, they use pointless quadtree representation.
3.4. Skyline
After Börzsönyi et al. [1] proposed the skyline operator, various techniques [30, 31] have been presented to improve the performance of skyline queries. The sort-filter skyline (SFS) algorithm [30] improves BNL using presorted data set according to the scores computed by a monotone function. By exploiting R*-tree, Kossmann et al. [31] presented an improved algorithm, called NN, based on the nearest neighbor search. The dynamic skyline was introduced by Papadias et al. [2]. Later on, the reverse skyline was proposed by Dellis and Seeger [3].
Since the skyline operator as well as its variants is useful to detect interesting events, there are some studies for in-network skyline processing in WSN environments. In [13], a filtering technique was proposed to reduce the energy consumption of WSNs in which some filter points are broadcasted to every sensor node and the data points dominated by the filter points are not transmitted since they cannot be in the skyline. Recently, a multiple filter-based algorithm called SKYFILTER was proposed to processing skyline over the sliding window in [14]. However, to compute filter points, every sensor node wastes its energy. The most related literature to our work is [6]. To obtain the reverse skyline points in WSNs, the 2-Skyband query that retrieves every point which is dominated by at most one point wrt q is used. However, this technique calculates the reverse skyline with the currently generated points only. In other words, the reverse skyline processing over a sliding window is not supported. In contrast to previous work, we investigate effective reverse skyline processing techniques over a sliding window in data stream environments as well as WSN environments.
4. Reverse Skyline Processing over Sliding Windows
Before the presentation of the overall behavior of our proposed in-network reverse skyline processing, we first present the properties of the reverse skyline in Section 4.1. Since each sensor node in WSNs generates its readings continuously, each sensor node can be considered as a source of stream data. Thus, in Section 4.2, we present our basic algorithm in the context of stream data. Our in-network processing technique based on the basic algorithm will be presented in Section 5.
4.1. Properties of Reverse Skyline
In this section, we present the properties of the reverse skyline. Park et al. [32] showed that, when the d-dimensional space is divided into

Subsets
Lemma 5.
Given a data set P, a query point q, and an orthant o, if and only if a point
Proof.
(:⇐) For
(⇒:) If a point
By Lemma 5, we have
The following lemma addresses that every reverse skyline point wrt q is also a dynamic skyline point wrt q but not vice versa.
Lemma 6.
Given an orthant o and a query point q,
Proof.
For the purpose of contradiction, we assume
From Lemma 6, in order to compute the reverse skyline of each
Lemma 7.
Given an orthant o and a query point q,
Proof (by contradiction).
Assume that
However, when
Example 8.
Consider a data set P in Figure 4. By Lemma 5, we can compute the reverse skyline on each orthant o independently.
Given a data set

Midpoints in an orthant.
4.2. Computing
over Sliding Windows
In this section, we present our algorithm, called RSPW, to compute reverse skyline over sliding windows in WSNs by utilizing the properties of reverse skyline presented in Section 4.1. Basically, RSPW is working on each sensor node to generate partial reverse skyline. In Section 5, we will present how to integrate the partial reverse skyline generated by each sensor node.
To compute the skyline over a sliding window, Tao and Papadias [33] proposed an effective method. Similarly, we need to keep the dynamic skyline based on Lemma 6 to obtain the reverse skyline. Thus, we adapt the sliding window skyline processing technique (denoted as SWSP) proposed in [33] to our reverse skyline processing over sliding windows.
Lemma 9 (see [33]).
Let
Since SWSP is for the skyline processing, SWSP considers a single data space. In addition, SWSP handles the database
For instance, as shown in Figure 6(a),

An example for a sliding window (
As shown in Figure 6(b),
In our work,
Proposition 10.
Given a query point q and an orthant o, every point
Proof.
It trivially holds by Lemma 7.
By Proposition 10, when a point
Lemma 11.
Given a query point q and an orthant o, every point
Proof.
By Definition 2, given a new point
By Lemma 11,
The pseudocode of our proposed algorithm, denoted by RSPW, is presented in Pseudocode 1. The algorithm RSPW computes the reverse skyline over a sliding window. RSPW consists of two parts. The first one is for processing a newly created point
// //the midpoint of a point
(1) Let (2) mark_time = 0 (3) is_dsky = (4) (5) (6) (7) (8) (9) } (10) ( ( ( (14) ( ( ( ( ( (20) } (21)
Recall that we maintain two buffers for each orthant o:
When a new point
When a point
Finally, the set of dynamic skyline point (i.e.,
The following example illustrates the behavior of our proposed algorithm RSPW within a single orthant o.
Example 12.
Let the size of window w be
As shown in Figure 7(b), since

The behavior of RSPW.
Up to now, we present our algorithm to compute the reverse skyline over a sliding window in data stream environments. In the next section, we will describe how to calculate the reverse skyline in WSN environments.
5. Energy Efficient RSPW for WSNs
As mentioned above, the energy efficiency is the utmost important in WSN environments. A brute-force algorithm, denoted as
Based on the following lemma, we can apply RSPW to each sensor node in WSNs.
Lemma 13.
Given an orthant o, a query point q, and two sensor nodes
Proof.
Assume that
Now, we assume that
By Lemma 13, a simple extension of RSPW to WSN environments is that, at each time, a sensor node performs RSPW with its sensor reading and the dynamic skyline points coming from its child nodes to maintain its
Since each sensor node transmits
//This is invoked at each time j // //Let (1) (2) (3) (4) (5) } (6) (7) (8) remove every point (9) unmark every point (10) (11) (12) (13) remove (14) (15) remove (16) (17) remove (18) (19) } (20) } (21) } (22) sendToParent( (23) }
The intuition of
At first, each sensor node
Since
6. Performance Study
We empirically evaluated the performances of our proposed algorithms in two environments: data stream environments and WSN environments. In data stream environments, we measured the processing time of our proposed algorithm RSPW and other algorithms with the synthetic data sets. On contrary, in WSN environments, we present the energy consumption of our algorithms
6.1. Experiments in Data Stream Environments
6.1.1. Experimental Environments
We performed this experiment to compare the execution time of RSPW with Naive and 2-Skyband [6]. In Naive algorithm, each point in a window is compared with the other points in a window to check whether it is a reverse skyline point or not. In addition, since 2-Skyband [6] did not consider the sliding window, we extended 2-Skyband to the sliding window context which computed 2-skyband with recent w points at each time.
In order to evaluate the performance of each algorithm over diverse environments, we used three synthetic data sets which are generated by independent, correlated, and anticorrelated as shown in Figure 8. These three data sets are commonly used to evaluate the performance of the skyline operator as well as its variants [1, 32].

Example of data sets (2-dimension).
Table 1 shows the parameters used in this experiment. Each synthetic data set consists of 100,000 points. We ran all algorithms 10 times with different query points generated randomly and report the average execution times. We varied the number of data points' dimension from 2 to 10 as well as the windows size of a query from 2 to 10.
Parameters.
6.1.2. Experimental Results
Figure 9 shows the execution time of each algorithm according to the data sets with default values of the parameters. As shown in Figure 9, our proposed algorithm RSPW is the best performer. In average over all data sets, RSPW achieves up 9.23 times faster than Naive and 4.73 times faster than 2-Skyband. In correlated and anticorrelated data sets, since the data distributions are skewed, a large number of points are dominated by a few points in each orthant and hence the number of reverse skyline points is small. Meanwhile, since the points are uniformly distributed in independent data set, the number of reverse skyline points is larger than those of the other data sets. Thus, the processing time for the independent data set is worse than those for the other data sets.

Execution time over each data set.
With varying d from 2 to 10, we plot the execution time of each algorithm in Figure 10. As shown in Figure 10, as the number of dimensions d increases, the running time of each algorithm also increases since the overhead evaluating dominance relationship becomes increase with increasing d. However, the performance gap between RSPW and the other algorithms increases over all data sets as d increases since RSPW calculates the reverse skyline efficiently using two buffers.

Varying d.
We varied w from 2 to 10 and present the running times of the algorithms in Figure 11. As illustrated in Figure 11, when the window size w is small (i.e.,

Varying w.
6.2. Experiments in WSN Environments
6.2.1. Experimental Environments
We show show the effectiveness of our proposed algorithms for WSNs with a real-life data set. As a real-life data set, we used the data LUCE provided by Audiovisual Communications Laboratory [34]. A sensor network is composed of 89 nodes deployed on the EPFL campus as shown in Figure 12 and they measured key environmental quantities at high spatial and temporal resolution over a year. The data set consists of 9 attributes such as surface temperature, solar radiation, relative humidity, rain meter, and wind speed. The size of the sensing field is 277 × 430 meter2 and the base station is located at the center of the sensing field. To make a routing tree, we set the communication distance to 55 meter. The average depth (i.e., average number of child nodes) and the maximum width of the routing tree are 4.94 and 13, respectively. Since, in the real-life data set, the values of sensor readings are fixed, it is hard to make diverse configuration. Instead, to simulate diverse environments, we used some parameters. The parameters used for our experimental study are summarized in Table 2.
Parameters.

Placement of sensor nodes.
For this experiment, we implemented
6.2.2. Experimental Results
We plotted the total energy consumption of the sensor network varying diverse parameter values in Figure 13. Figure 13(a) shows the consumed energy of each algorithm varying d. As the number of dimensions d increases, the energy consumption of each algorithm increases since the size of data to be transmitted increases. However, since our algorithms

The total energy consumption with the real-life data set.
With varying the window size w from 2 to 10, we plot the energy consumption of each algorithm in Figure 13(b). Since, in
Figure 13(c) presents the consumed energy of each algorithm varying the packet size p. As the packet size p increases, the number of transmissions decreases since many points can be in a packet. Thus, the energy consumption of each algorithm decreases with increasing p. However, our enhanced algorithm
7. Conclusion
In this paper, we present an algorithm RSPW to compute the reverse skyline over a sliding window. To calculate the reverse skyline, we divide d-dimensional data space into
Footnotes
Conflict of Interests
The author declares that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work was supported by Defense Acquisition Program Administration and Agency for Defense Development under Contract UD140022PD, Republic of Korea.
