Emergent Technologies in Big Data Sensing: A Survey

Abstract

When the number of data-generating sensors increases and the amount of sensing data grows to a scale that traditional methods cannot handle, big data methods are needed for sensing applications. However, big data is a fuzzy data science concept, and there is neither an established research architecture for it nor a generic application structure in the field of sensing. In this survey, we explore the many scattered results that have been achieved by combining big data techniques with sensing and present our vision of big data in sensing. First, we outline the application categories to summarize existing research achievements. Then we discuss the techniques proposed in these studies to demonstrate challenges and opportunities in this field. Finally, we present research trends and list some future directions for big data in sensing. Overall, mobile sensing and its related studies are hot topics, but other large-scale sensing research is flourishing too. Although there are no “big data” techniques acting as research platforms or infrastructures to support various applications, multiple data science technologies, such as data mining, crowd sensing, and cloud computing, serve as the foundations of big data in the world of sensing.

1. Introduction

Big data, as a concept, was first proposed by META Group analyst Doug Laney in a 2001 research report [1] and his related lectures. Increasing volume (amount of data), velocity (speed of data), and variety (range of data types and sources) are used as three important characteristics to define big data. More recently, two new characteristics, value and veracity, have been added by some organizations [2] to further illustrate the necessary properties of big data. This “5Vs” model, which is used for describing big data and its related challenges, such as data capture, storage, search, sharing, transfer, analysis, and visualization, is a hot topic in the current data science research field.

In the field of sensing, special issues arise. With the exponentially increasing number of data-generating devices (such as computers, tablets, sensors, and especially smartphones), a vast amount of data needs to be processed. By utilizing sensing techniques, research methods for big data can be applied to various fields, such as science, engineering, medicine, health care, finance, business, and ultimately the whole of society. However, there is currently still no generic and systematic big data research model in the world of sensing.

The vision of data processing in future sensing is vague, and the relevant infrastructures and structures have not yet been well defined. A road map has yet to be made, even though research papers have been published. Techniques to collect, analyze, or process sensing data are usually adapted from existing data sciences, and, until now, there has been no clear definition of what “big data” is. The most intuitive understanding that comes to mind is a large amount of data reflecting the space domain of data sourcing. In the 5Vs model, volume and variety are directly relevant to this understanding. In the world of sensing, a large amount of data is usually obtained from a large sensing area, for example, town or city level sensing or applications for the Internet of Things.

Town or city level sensing relies not only on sensors within city infrastructures, but also on a large number of device owners willing to sense and contribute their data to data aggregation platforms. A survey result shows that every day we create more than 2.5 quintillion bytes of data, and a prediction says that, in 2016, over 4.1 terabytes of data will be generated per day per square kilometer in urbanized land area. Furthermore, in 2016, it is estimated that 39.5 billion dollars will be spent on smart city technologies, up from 8.1 billion dollars in 2010 [3]. The pervasive use of mobile phones and other similar mobile sensing devices will account for a dominant portion of aforementioned increment. Smartphones enable everyone to collect data at any time and place. Although some sensing data may not be valuable to the sensor owner, they can be valuable to the scientific community.

Currently, building a generic sensing platform for a city-scale data application faces many challenges. The first challenge is how to design a system in which users can benefit from data sharing [4, 5]. Personal sensing devices, among the most important components of city-scale sensing, still follow the “owner-is-the-user” model. Since privacy and security are general concerns, delivering considerable benefits without leaking personal information is the baseline for making full use of individual sensing data. The second challenge is how to effectively collect the data scattered across individual sensing devices. The large amount of data generated by distributed sensors typically has no central control or centralized accounting device that can be notified when new data is generated.

Internet of Things (IoT) is a much broader concept, which was formally proposed by Kevin Ashton in 2009 [6] as a technique for uniquely identifiable objects and their virtual representations in an Internet-like structure. This concept later developed into a worldwide architecture for sensing, computing, and communication. Such a large amount of computing and communication resources enables sensing, capturing, collecting, and processing real-time data from billions of distributed devices and serves a great number of applications, including health care, climate monitoring, earthquake detection, volcano monitoring, power grid control, smart home, and business intelligence [7]. In the foreseeable future, IoT will not be restricted to uniquely identifiable objects and their virtual representations. It will include billions of devices that pour vast amounts of data into our existing networks. Sensor networks increasingly enable applications and services to interact with the physical world; such services may be located across the Internet from the sensing networks. Internet techniques, cloud services, and smart assets are being used to store and analyze these data and to improve network features, such as scalability and availability, which are required by future sensor networks that contain millions or even billions of devices.

Besides the “spatial domain,” “time domain” sensing data management is also a hot topic in data science. Real-time processing of large amounts of sensing data normally requires very high computing capabilities and large-scale hardware infrastructures. Even with sufficient resources, it is still challenging to reliably compile large-scale time-stamped data sets. As the examples in [8] demonstrate, the physical restrictions of the measurement systems, the limitations of computing capabilities, the energy capacity, and the difficulties posed by certain measurement problems result in data loss, data errors, and ambiguities in data inferences. Long-period sensing data analysis and storage are also important research topics in the “time domain,” especially in the fields of environmental monitoring and object behavior analysis [9]. Remote sensing technologies are widely applied in environment-related research fields. The data acquired and accumulated (usually in the form of images) require large storage space and highly efficient analysis methods. For object behavior analysis, various techniques are applied, and long-term monitoring is usually required. Take [9] as an example: accurate and continuous monitoring of lakes and inland seas has been applied since 1993 to analyze the impact of climate change and human activities on terrestrial water resources.

In the rest of this survey, we first introduce the applications that motivate big data sensing research in Section 2, then summarize the existing techniques for big data sensing in Section 3, and propose future research directions in Section 4. Finally, we conclude the paper in Section 5.

2. Applications

In this section, we first introduce smartphone-enabled big data applications, including Internet of Things, crowd sensing, environment monitoring, and health monitoring. Then, we discuss a common issue of smartphone-enabled applications.

2.1. Applications Enabled by Smartphones

Today's smartphones serve not only as important communication devices, but also as computing and sensing devices with rich sets of embedded sensors, such as accelerometers, digital compasses, gyroscopes, GPS, microphones, and cameras. Combined with growing computing capabilities, these sensors are enabling new applications across a wide variety of domains, such as human health care, social networks, safety, environmental or climate monitoring, and transportation. They have led to a new research area called mobile phone sensing [3, 10–12]. As the number of smartphone users increases rapidly across the whole world, a large amount of data is generated, transferred, aggregated, and analyzed. The ubiquity of mobile phones and the increasing size of the data generated by sensors and applications have led to a new research domain spanning computing and social science. Big data, as a data science for processing high-volume information, is consequently involved in this field. Researchers have begun to address big data issues by using large-scale mobile data as an input to characterize and understand real-life phenomena, including individual traits, human mobility, communication, and interaction patterns.

2.1.1. Smartphones for Internet of Things

The semantic-oriented vision, as one of the broader visions of Internet of Things (IoT), emphasizes data integration and management across a vast number of smart devices, such as smartphones, pads, sensor nodes, and other devices with the ability to send out information [13]. As one of the most important constituent parts of IoT, smartphones can not only provide more information than other devices, but also act as information collecting and distributing terminals. How to integrate diverse information is a big challenge in utilizing smartphones for IoT. In [14], the authors proposed an approach to optimize data collection performance by updating the routing structure of smartphones, which can also be applied to processing large amounts of data in IoT.

Mobile data collected from wireless sensor networks are strongly spatially correlated; however, traditional methods usually assume a static setting in which the so-called optimal data collection trees are fixed, and their performance suffers from link problems when mobile users change virtual sinks. The model proposed in [14] initializes an optimized tree and updates it according to the virtual sinks that users access by locally modifying the previously constructed data collection tree. The model is easy to implement, has low cost, and provides real-time data acquisition even while the tree structure is being updated. Similar techniques can be applied to large-scale data collection and distribution structures by dynamically modifying the mobile access routing structure to achieve optimal performance [15, 16]. Similarly to [14], the authors of [17] proposed a model for data collection using smartphones. Instead of optimizing data access routing, that paper focuses on the construction of a data center and the related database. By connecting smartphones and the data center to the Internet, users can monitor sensor information remotely and in real time.
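The idea of locally modifying an existing collection tree when the virtual sink changes can be illustrated with a minimal sketch. It is not the algorithm of [14]; it assumes the tree is stored as a hypothetical child-to-parent map and simply re-roots the tree by reversing the parent pointers on the path between the new sink and the old root, leaving all other nodes untouched.

```python
def reroot(parent, new_sink):
    """Re-root a data collection tree at new_sink.

    Only the parent pointers on the path from new_sink up to the old
    root are reversed; every other node keeps its parent (assumption:
    the tree is a dict mapping node -> parent, root -> None).
    """
    parent = dict(parent)          # work on a copy
    prev, node = None, new_sink
    while node is not None:
        nxt = parent[node]         # remember the old parent
        parent[node] = prev        # point back toward the new sink
        prev, node = node, nxt
    return parent


# Hypothetical 5-node tree rooted at "A".
tree = {"A": None, "B": "A", "C": "A", "D": "B", "E": "D"}
moved = reroot(tree, "D")
print(moved)
```

In this example only the nodes A, B, and D change their parent; C and E keep forwarding to the same neighbor, which is the kind of local update the paragraph above describes.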

2.1.2. Smartphones for Crowd Sensing

Static sensing is traditional and mature but has node coverage, maintenance, and scalability issues. Mobile crowd sensing is more flexible, manageable, and scalable, especially when vast numbers of smartphones are used as sensing nodes in cities or towns. The fast-growing number of smartphone users, the variety of built-in mobile applications, and the exponentially increasing capacity of 3G/4G networks have led to this new mobile sensing paradigm. Currently, smartphones are used as sensors for localization, personal/surrounding context recognition, traffic monitoring, and other applications related to daily life. In the near future, however, other applications, such as environmental pollution detection, health care monitoring, and social life analysis, will generate large amounts of sensing data. Unlike conventional sensor networks, mobile crowd sensing is more closely tied to people; therefore, privacy and security should be carefully considered. Otherwise, smartphone users will be unwilling to share their devices and the resulting data with others. To the best of our knowledge, there is no mature platform for mobile crowd sensing, and researchers are working in that direction. For example, researchers proposed Medusa [18], which provides high-level abstractions for the stages of a crowd sensing task and a distributed system that coordinates the execution of these tasks between smartphones and the cloud.

How to attract users to participate in crowd sensing projects has become a very important problem. Unlike conventional methods of constructing sensor networks, there is less support from institutions or organizations. The willingness of individual users determines the scale of mobile crowd sensing. In [19], two system models are proposed. The platform-centric model is designed to reward participating users who share information with others, and the user-centric model lets individuals ask for a reserve price for their sensing service. The former is run as a Stackelberg game to maximize the utility of the platform, and no user can improve its utility by unilaterally deviating from the current strategy. In this model, the total reward for users is fixed and competition exists. The second model introduces a strategy in which users calculate their own cost and ask for prices. In this model, users whose prices are accepted receive payments that are not lower than their asking prices. These two models regulate user behavior in crowd sensing networks to protect users' benefits, in order to encourage individuals to join sharing networks.
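The equilibrium property of the platform-centric model (no user gains by deviating unilaterally) can be made concrete with a small sketch. The utility function below is an assumption for illustration, not taken from the text: a fixed reward is shared in proportion to each user's sensing time, and each user pays a per-unit cost. All function names are hypothetical.

```python
import math


def best_response(reward, others_total, unit_cost):
    """Sensing time t maximizing reward * t / (t + others_total) - unit_cost * t.

    Setting the derivative to zero gives t = sqrt(reward * others_total / unit_cost)
    - others_total; negative values mean the user does not participate.
    """
    return max(0.0, math.sqrt(reward * others_total / unit_cost) - others_total)


def equilibrium(reward, unit_costs, rounds=100):
    """Iterate best responses from a positive start until they settle."""
    times = [1.0] * len(unit_costs)
    for _ in range(rounds):
        for i, cost in enumerate(unit_costs):
            others = sum(times) - times[i]
            times[i] = best_response(reward, others, cost)
    return times


# Two identical users competing for a fixed reward of 100 at unit cost 1.
times = equilibrium(100.0, [1.0, 1.0])
print(times)
```

Under these assumptions the two symmetric users settle at a sensing time of reward / (4 * unit_cost) each, and neither can raise its utility by changing its own time alone.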

In the above two paragraphs, we introduced two popular applications in mobile crowd sensing. With the rapidly increasing number of smartphones, more and more research topics are developed, like strategy of data collection, mobile sensing performance, communication quality, privacy and security, energy efficiency, and other categories of applications. The fast development of mobile crowd sensing not only leads to a generation of vast amounts of data, but also requires fast and efficient data processing abilities. Science of big data can be one of mobile crowd sensing's fundamental research fields [20].

2.1.3. Smartphones for Environment Monitoring

Weather and environment monitoring are usually the responsibility of governments and certain specific institutions. But if billions of mobile phones can be utilized for such jobs, more diversified and abundant information can be used to improve people's living conditions. Currently, combined with a cloud of supporting web services, the large number of smart mobile devices makes such a distributed data collection infrastructure possible, though not immediately usable. An appropriate platform can be used in this field for further applications. Paper [21] proposed the Personal Environmental Impact Report (PEIR), a system that combines web and personal mobile techniques to inform users of their environmental impact and exposure, which can help people make more informed and responsible decisions. PEIR is built on location tracing and sampled GPS records. Based on the GPS information, users' trips are predicted, and environmental impact or exposure measurements are aggregated for each trip. This platform can be used for a number of applications, such as traffic condition measurement, environmental pollution monitoring, and vehicle emission estimation. Though only four applications were proposed by the authors, new models can be developed based on this platform, and scalability, stability, performance, and usability are the foreseeable promising directions for this kind of platform.
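As a rough illustration of the first step in such a pipeline, the sketch below splits a sampled GPS trace into trips. The time-gap heuristic, the threshold, and the sample layout are assumptions made here for illustration; the text does not specify how PEIR identifies trips.

```python
def split_trips(samples, max_gap_s):
    """Split time-ordered (timestamp_s, lat, lon) samples into trips.

    A new trip starts wherever two consecutive samples are more than
    max_gap_s seconds apart (a simple heuristic, assumed here).
    """
    trips, current = [], []
    for sample in samples:
        if current and sample[0] - current[-1][0] > max_gap_s:
            trips.append(current)
            current = []
        current.append(sample)
    if current:
        trips.append(current)
    return trips


# Hypothetical trace: three samples, a long pause, then two more samples.
trace = [(0, 34.05, -118.25), (30, 34.06, -118.25), (60, 34.07, -118.24),
         (4000, 34.10, -118.20), (4030, 34.11, -118.19)]
trips = split_trips(trace, max_gap_s=600)
print([len(trip) for trip in trips])
```

Per-trip impact or exposure values could then be aggregated over each returned list of samples.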

While the above paper [21] shows an example of platform building for environment monitoring using smartphones, [22] is a good example of a specialized application. Nericell is a system designed to make full use of mobile phone sensing components to provide rich sensing information about road and traffic conditions. In this system, microphones, GSM radios, and GPS sensors are combined to detect potholes, bumps, braking, and honking. The large number of mobile phones and the variety of information from each mobile device can guarantee effective road and traffic condition detection without significant energy consumption. Unlike similar approaches that use meaningful digital information, Nericell also utilizes sharp changes in analog signals, such as acceleration changes reported by accelerometers, and then builds models to detect discontinuous vehicle behaviors. This type of application greatly enriches the utilization of smartphone sensors and shows a broader prospect for mobile sensing.
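Detecting sharp changes in an accelerometer signal can be sketched with a simple threshold test. This is not Nericell's actual detector; the median baseline, the threshold value, and the sample data are assumptions chosen for illustration.

```python
from statistics import median


def detect_spikes(samples, threshold):
    """Return indices of samples deviating from the trace median by more than threshold."""
    base = median(samples)
    return [i for i, a in enumerate(samples) if abs(a - base) > threshold]


# Hypothetical vertical acceleration trace in m/s^2 with two sharp excursions.
vertical_accel = [9.8, 9.7, 9.9, 14.2, 9.8, 9.6, 4.1, 9.8]
events = detect_spikes(vertical_accel, threshold=3.0)
print(events)
```

The flagged indices mark candidate bump or pothole events, which a real system would still need to confirm against speed and orientation.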

2.1.4. Smartphones for Health Monitoring

On-body sensing with small, inexpensive, and low-power sensors has led to a series of studies on human health monitoring. With the improvement of artificial intelligence and the computing capability of mobile devices, machine learning has been applied to provide health suggestions by analyzing data acquired by sensors [23]. Mobile phones, as the “most frequently carried devices,” are well suited to monitoring human behavior. Without buying expensive sensors or carrying additional heavy sensors, people can simply get their activity records and health suggestions from their cell phones. Researchers have found that regular daily activity is important to people's physical and psychological health, regardless of their static body conditions. Therefore, mobile phones can be the best choice over other approaches if they are carefully utilized. Paper [24] introduces UbiFit Garden, a system that is designed to interpret and reflect on data about people's physical activities and provides certain health information to users. This system comprises three parts: (i) a fitness device which uses a 3D accelerometer and a barometer to acquire and process data, (ii) an interactive application which runs on mobile phones to interact with users about their exercise activities, and (iii) a glanceable display that presents key information about the user's physical activities and goal attainment. Though a specially designed fitness device is used in this paper, the proposed technique can leverage the 3D accelerometers and barometers in smartphones as well. Based on this platform, a smartphone network can be built, and people's health information can be aggregated, compared, and analyzed by central servers; useful health suggestions are then sent back to individuals' smartphones based on machine learning or doctors' advice (if health institutions are involved).

2.1.5. Common Issue of Smartphone Related Applications

In previous sections, we introduced different applications enabled by smartphones. One common research issue among the wide variety of applications that use smartphones as sensing data sources is power consumption. With the development of smartphones, more and more embedded devices and powerful processors are attached. Therefore, smartphones consume significantly more energy than the previous generation of cellular phones. A smartphone which never stops using its GPS, not to mention those applications which might combine GPS with other components, may run out of energy within several hours. So, for every newly developed application, power consumption is an unavoidable problem.

Crowd sensing with smartphones (and its advantages) was discussed in a previous subsection; for example, it allows phenomena over a large area to be observed and measured by collecting and sharing data [25]. However, due to limited battery capacity, smartphones usually cannot support nonstop sensing tasks. The authors of [25] proposed a Mobile Publish/Subscribe (MoPS) middleware system which focuses on the requirements of mobile and resource-constrained environments, with the goal of reducing overall energy consumption and building a general platform for mobile crowd sensing. The basic idea of MoPS is to filter out uninteresting data from mobile Internet-connected objects so that redundant information is not transferred to the cloud. Sensor data are filtered according to context before transmission. For example, when the area of interest of a specific application is covered by multiple smartphones, only one of them needs to transfer data to the cloud.
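The last example, in which only one of several co-located smartphones uploads data, can be sketched as follows. The grid-cell notion of coverage and the rule of picking the phone with the most remaining battery are assumptions made for illustration; the text does not describe how MoPS chooses the transmitting device.

```python
def select_reporters(phones, cell_size):
    """Keep one phone per grid cell: the one with the most battery left.

    phones is a list of (phone_id, x, y, battery) tuples; coordinates
    share the unit of cell_size (hypothetical layout).
    """
    best = {}
    for phone_id, x, y, battery in phones:
        cell = (int(x // cell_size), int(y // cell_size))
        if cell not in best or battery > best[cell][1]:
            best[cell] = (phone_id, battery)
    return sorted(pid for pid, _ in best.values())


# p1 and p2 share a cell, so only the one with more battery reports.
phones = [("p1", 10, 12, 0.80), ("p2", 14, 18, 0.35), ("p3", 120, 40, 0.50)]
reporters = select_reporters(phones, cell_size=100)
print(reporters)
```

Every phone that is not selected can skip the upload for that round, which is where the energy saving would come from.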

Reference [26] focuses on how to save the power consumed by smartphones' presence services. The main idea of this paper is similar to MoPS. By analyzing a large mobile data challenge data set, smartphones learn and infer the user's presence status from available context data, which enables nonintrusive and energy-efficient automatic maintenance of that status. Besides using the calendar or other settings as static grounds for status changes, GPS, accelerometers, and microphones are used to sense the user's behavior. Whenever the user enters an “unavailable” status, or another status in which it is not convenient to respond to a real-time conversation, the presence service frequency is reduced. Since smartphones usually have a considerable number of presence-related applications, reducing or turning off the presence service is an effective method to save power.

2.2. Techniques for Smartphone Enabled Applications

Smartphones, due to their vast number, wide coverage range, multiple embedded sensing components, significant computing ability, and convenient network access, are currently considered to be the largest sensing data source. The potential of embedded components (e.g., cameras, microphones, GPS, compasses, and accelerometers) is not yet well developed. Every combination or new application of these components can provide a brand new direction for mobile sensing. For example, utilizing microphones to detect vehicle horns can help infer traffic conditions [22]. With the development of computing capabilities, every mobile phone can act as a high-performance terminal, in which case cloud and parallel computing can be applied with the help of multiple network access options such as WiFi, 3G, and Bluetooth. Based on these hardware advantages of smartphones, various software designs and policies have been proposed. These include information sharing tactics, data management, privacy preservation, and security protection. At the system level, scalability, robustness, and other requirements call for further research and novel techniques. On the other hand, the techniques used to study smartphone sensing are highly diversified. Multiple existing data science techniques (e.g., cloud computing [27] and data mining [28, 29]) have been applied in this field. In [27], an approach called Pickle was proposed to prevent privacy leakage when applying cloud computing to collaborative learning for mobile sensing. Pickle perturbs the training data by premultiplying the training feature vector matrices by a private random matrix. Since the private random matrix is visible only on the user side, the user's information is unavailable to the cloud server or other participants after perturbation.
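The perturbation step can be illustrated with a minimal sketch in plain Python. It shows only the premultiplication by a private random matrix; the matrix shape, the Gaussian entries, and the seed handling are assumptions, and the rest of Pickle's collaborative learning procedure is not reproduced here.

```python
import random


def make_private_matrix(rows, cols, seed):
    """Draw a random matrix that stays on the user's device (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def perturb(private_matrix, feature_vector):
    """Premultiply: return private_matrix @ feature_vector as a list."""
    return [sum(r * x for r, x in zip(row, feature_vector))
            for row in private_matrix]


# The user keeps R private and shares only the perturbed vector.
R = make_private_matrix(rows=3, cols=4, seed=7)
features = [0.2, 1.5, -0.7, 3.0]
shared = perturb(R, features)
print(shared)
```

Because the map is linear, structure such as scaling of the feature vector carries over to the shared vector, while the original values cannot be read off directly without R.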

Data mining is another frequently used technique for analyzing smartphone sensing information. Various embedded sensing devices (e.g., cameras, microphones, accelerometers, light sensors, and GPS) generate abundant information that enables innovative applications. When large amounts of sensing data are aggregated, data mining can be applied to extract useful and interesting information from them. The rapid growth in the number of smartphones offers great opportunities for data mining and introduces new challenges at the same time. Paper [30] (i) discusses in detail the limitations and impact of applying data mining to mobile sensing and (ii) introduces the authors' solution: a method based on their wireless sensor data mining, which is a smartphone-based sensor mining architecture. In this paper, the authors discussed issues which include the following: limited resources, scalability, real-time responsiveness, granularity, configurability of polling rate, interactions with normal phone functions, conflicts with the needs of sensor mining, convenience for developers, self-learning ability, trade-offs between application scalability and limited resources, database management, I/O bottleneck of real-time transmission, parallelism requirements, pipelining requirements, programming language choice, algorithms for different applications, secure connection/communication/storage, privacy control, trade-offs between sensor mining performance and energy/resources, and data compression (encoding).

Besides the above mature data analyzing sciences, other general or special purpose techniques are also developed. For example, [31] introduces a method which can utilize human-carried mobile phones to mule information from distributed sensors to other sensor nets.

2.3. Other Applications

Besides the smartphone enabled applications, wireless sensor networks [32–35] also enable a lot of applications. In this section, we introduce these applications including building energy management, pollution monitoring, and smart transportation systems.

2.3.1. Building Energy Management

Since sensor devices need to continuously collect data, energy management of sensor devices [36–38] is critical. On the other hand, utilizing sensors for building energy management [39–41] is an emerging application in the sensor network community. As one of the most important research fields in the world of sensing, building energy management investigates energy consumption information in both space and time domains by utilizing smart meters. The energy utility companies in the United States have deployed millions of “smart meters” in both residential and commercial buildings to better understand the electricity demand of consumers. This advanced metering infrastructure generates a huge amount of data about the energy consumption of a customer at high granularity (e.g., at second level). But the utility companies have been inefficient at getting the most out of such a wealth of data. About 27% of the total electricity consumption in the USA is used for thermal conditioning (HVAC), that is, heating and cooling of premises in response to the outside temperature. One recent work [42] focused on building thermal profiles of residential energy users from smart meter data. Another paper [43] by the same authors extended the concept by building thermal profiles at both individual and group levels and applying them in a dynamic model for studying the thermal sensitivity of a given sample of users. Such profiles can also be utilized by the utility companies in their demand-response programs that focus on temperature-dependent consumption. The paper also analyzed the seasonal and time-of-day effects on thermal sensitivity for both individuals and their neighborhoods. Finally, it presented a methodology for aggregating thermal profiles based on geographically homogeneous groups of users.
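One simple way to picture a thermal profile is as the slope of a user's consumption with respect to outdoor temperature. The sketch below fits that slope with ordinary least squares; this is a plain linear fit chosen for illustration, not the model of [42] or [43], and the readings are made up.

```python
def thermal_sensitivity(temps, loads):
    """Least-squares fit loads ~ slope * temps + intercept.

    Returns (slope, intercept); the slope is the extra consumption per
    degree of outdoor temperature under this simplified linear assumption.
    """
    n = len(temps)
    mean_t = sum(temps) / n
    mean_l = sum(loads) / n
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(temps, loads))
    sxx = sum((t - mean_t) ** 2 for t in temps)
    slope = sxy / sxx
    return slope, mean_l - slope * mean_t


# Hypothetical hourly readings: outdoor temperature (deg C) and load (kWh).
temps = [24, 26, 28, 30, 32]
loads = [1.0, 1.4, 1.8, 2.2, 2.6]
slope, intercept = thermal_sensitivity(temps, loads)
print(slope, intercept)
```

Computing such a slope per user, or per group of users, gives a crude counterpart to the individual and group level profiles mentioned above.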

The rate at which data are being generated from the current electric microgrids and smart grids is tremendous. Efficient utilization of the generated real-time streaming sensor data remains a challenging task considering the sheer volume, complexity, and the rate of acquisition. Therefore, there is an urgent need to effectively manage and control such data via advanced processing, modelling, optimization, real-time forecasting, and analytics. There are internal factors (related to the grid) and external factors (e.g., weather, user behavior, and user economics) that affect the management of real-time data. Paper [44] proposes large-scale predictive analytics for real-time energy management by deploying a microgrid in a university campus aiming at maximizing its operational benefits. This particular environment was chosen due to the rich resources of cutting-edge analytics and high performance computing available for studying the huge and complex real-time data streams generated by the deployed microgrid. The proposed model aims at improving operational efficiency, lowering operating costs, and reducing the overall carbon footprint of the microgrid by using novel time series prediction algorithms.

Today's residential and commercial buildings are equipped with a large number of different sensors and smart meters. These devices are primarily used by service providers to offer value-added services and to give customers important feedback on their usage patterns. But these devices can also be used to make unwanted inferences about occupants and their behaviors. The research paper [45] explores the possibility of such privacy-violating inferences from the sensor data available to the utility companies. It attempts to infer answers to the following questions: (i) is a particular space occupied? (ii) how many people are there in that space? (iii) if that space is occupied, what are its occupants' identities? and (iv) which particular subspaces do they occupy? The paper focuses on inferences from two different types of sources: motion sensors (i.e., passive infrared sensors) installed by security companies and smart electric meters deployed by utility companies.

In the current era of smart meters deployed by the utility companies, the rate at which data is being generated by such smart devices is immense. The consumers, who are the key stakeholders of the energy usage data, are often not involved in the analysis of this data. There are no existing systems which (i) empower users with access controls and (ii) provide control and access of their energy usage data with high granularity. In [46], the authors propose a new system design which (i) offers cloud-based personal data and execution containers for persistent data storage and (ii) at the same time gives independence to consumers in choosing their analytic algorithms. In this system, the consumers can also utilize third party applications which analyze data in a privacy-preserving fashion. Finally, the containers can also be utilized for secure and private control of home appliances from any Internet-enabled device.

2.3.2. Pollution Monitoring

Urban air pollution is one of the growing concerns in major cities worldwide. Large amounts of data in the form of air pollution maps help health protection agencies assess air quality. Ultrafine particles (UFPs) are often neglected as atmospheric pollutants, due to their small contribution to the total particle mass. The authors in [47] try to understand the impact of these highly spatially variable particles on human health by proposing a mobile measurement system for producing accurate UFP pollution maps with high spatiotemporal resolution. Static measurement systems are inefficient at measuring such highly spatially variable pollutants. Moreover, these systems have high acquisition and maintenance costs. To enable large urban coverage, the proposed system has its 10 sensor nodes installed on top of public transport vehicles. It also utilizes land-use regression models for modeling pollution concentrations at locations not covered by the mobile sensor nodes.

2.3.3. Smart Transportation System

Today's modern cities are one of the major contributors to the generation of big data. The different mobile sensing devices as well as the city infrastructure sensors produce large amounts of data, which provide a wealth of information about their surroundings and can be utilized for improving the social lives of human beings. In the current scenario of more precise and pervasive sensing, lots of dynamic information about individual cars becomes available through car-to-car (C2C) and car-to-infrastructure (C2I) communication. Paper [48] dwells on the possible research area of dynamic infrastructure-to-car communication where dynamic information about vehicles is exploited. The main contribution of the paper is a model of a distributed intelligent speed adaptation system. The authors also provide a formal proof about the correct dissemination of speed limit information by such a system. This information is in the form of speed advice from traffic centers, traffic sign detectors, or obstacle detectors. The paper proposes a global control system, to be used by highway authorities, for considering incidents (such as accidents, construction sites, or traffic jams) which are well beyond the scope of sensor coverage of a local vehicle. The paper also identifies the safely operable bounds of such a system.
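How speed advice from several sources might be combined for a single vehicle can be sketched as follows. The rule used here, taking the most restrictive advice that applies at the vehicle's position, is an assumption for illustration and is not claimed to be the model or the proof of [48]; the field names are hypothetical.

```python
def effective_speed_limit(position_km, advices, default_limit):
    """Return the lowest limit among advices covering position_km.

    Each advice is a dict with from_km, to_km, and limit (km/h);
    default_limit applies when no advice covers the position.
    """
    applicable = [a["limit"] for a in advices
                  if a["from_km"] <= position_km <= a["to_km"]]
    return min(applicable + [default_limit])


advices = [
    {"source": "traffic_center", "from_km": 10, "to_km": 15, "limit": 80},
    {"source": "obstacle_detector", "from_km": 12, "to_km": 13, "limit": 50},
]
print(effective_speed_limit(12.5, advices, default_limit=120))
print(effective_speed_limit(11.0, advices, default_limit=120))
print(effective_speed_limit(20.0, advices, default_limit=120))
```

A vehicle far from any incident falls back to the default limit, while overlapping advices resolve to the stricter one.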

In [49], the authors present the Context-Aware Platform using Integrated Mobile services (CAPIM), a platform that enables smart management of the large amount of available contextual information. CAPIM focuses on the collection and aggregation of context data (e.g., location, user's profile, and characteristics) through smart services offered by mobile devices such as smartphones and tablet PCs that have multiple sensors. The platform supports a collaborative environment by enabling its users to learn about their surroundings through shared data without much user interaction. The authors then present an intelligent transportation system built on top of CAPIM to improve the understanding of traffic-related problems. Finally, they propose a context-aware framework that deals with the efficient storage of context data on a larger scale.

3. Summary of Big Data Techniques

As discussed above, many applications are in urgent need of novel big data techniques. However, big data is itself a new field of data science, and there is currently no mature architecture for it. Some researchers in this field are devoting themselves to building general platforms, architectures, and analysis methodologies, while others are focusing on developing solutions for particular problems.

3.1. Platform Development

One of the significant features of future sensing is “gigantism.” Concepts like smart cities and IoT require a vast number of sensors to work together under certain control policies. Conventional topologies, policies, architectures, and methods are no longer suitable. Platforms that can handle sensor data at the city level, country level, or even world level are needed.

In [4], the authors explored five key challenges that researchers will face in developing a city-level sensing platform. The first challenge is crowd sourcing and collaboration, which is mainly about how to create a mature system from which users can get tangible benefits by sharing and using information. The current single-provider model no longer fits the requirements of future sensing, but the multiple-provider model suffers from a lack of structure and consistency. A mature platform must support operations for sharing, annotating, reusing, and analyzing the data itself. The second challenge is heterogeneity and disparity. Sensing data in a city are distributed everywhere, and it is impossible to aggregate them in one central location. Data collected by individuals under diverse regimes naturally differ. An effective informatics system that can extract useful information from different data formats is necessary. The third challenge is multiresolution and multiscale, which relates to the fact that there is no unified standard for sensing so far. When data from different sources are aggregated for new applications, multiresolution is the first problem researchers face. Even worse, conclusions based on these resources may lead to future ambiguity. The fourth challenge is data uncertainty and trustworthiness. Data from some sources may be wrongly calibrated or inaccurate because of the sensing devices. A sensor system should be able to identify uncertainty, distinguish trustworthy information sources from others, and ensure that users can manage and benefit from different sources. The fifth challenge is modeling and decision making. The quality of analysis depends on the data, and weighting different data sources is a key issue. Moreover, the time and resources needed to process and analyze large amounts of data are too costly when real-time decisions need to be made.

Paper [5] focuses on building a cloud-based big data architecture for supporting sensor services. Data quality is a key aspect of their system. The purpose of the paper is to build a sensing infrastructure for the federated sensor services paradigm, for which several design requirements must be considered. The first is models for feed content and quality. A cloud network designed for federated sensor services should be able to satisfy customers' requirements in terms of both content and quality. The second is techniques for feed discovery, composition, and adaptation. Techniques for a federated sensor services cloud should be able to adapt to various environmental dynamics. The third is a markup language. A semantics-rich markup language is required so that user applications can express their feed requirements and feed providers can describe their feeds. The fourth is massively scalable feed storage and analytics. A federated sensor service cloud should provide scalable storage and analytic services for feeds. The fifth is pricing models and service-level agreements (SLAs). Benefits are incentives for users to join certain services. A federated sensor service cloud should be able to support a real-time pricing model based on service quality, and an effective SLA is critical for sensor data markets.

The authors of [17] proposed another model, designed for wireless sensor networks, that aggregates sensor data from various devices. Nowadays, a vast number of mobile devices are connected to the Internet, and users can access sensing data anytime and anywhere through user-friendly mobile applications. Integrating all sorts of data through the Internet is therefore challenging. The proposed model fully utilizes existing infrastructure to aggregate, process, and distribute data. It can be considered ubiquitous since it is designed for general data integration scenarios. The model consists of a REST Web service, which relies on open standards such as Hypertext Transfer Protocol (HTTP) and Extensible Markup Language (XML), and a MySQL database that stores information from mobile devices. The data can then be delivered to mobile clients as XML messages by HTTP servers.
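As an illustration of what an XML message carrying a sensor reading might look like, the sketch below serializes and parses one reading with Python's standard library. The element names (reading, sensor, value, timestamp) and the sample values are hypothetical; the message schema used in [17] is not described here, and the HTTP and MySQL parts of the model are left out.

```python
import xml.etree.ElementTree as ET


def reading_to_xml(node_id, sensor, value, timestamp):
    """Serialize one sensor reading as an XML string (hypothetical schema)."""
    root = ET.Element("reading", attrib={"node": node_id})
    ET.SubElement(root, "sensor").text = sensor
    ET.SubElement(root, "value").text = str(value)
    ET.SubElement(root, "timestamp").text = timestamp
    return ET.tostring(root, encoding="unicode")


def xml_to_reading(message):
    """Parse the XML string back into a dictionary on the client side."""
    root = ET.fromstring(message)
    return {
        "node": root.get("node"),
        "sensor": root.findtext("sensor"),
        "value": float(root.findtext("value")),
        "timestamp": root.findtext("timestamp"),
    }


message = reading_to_xml("node-7", "temperature", 21.5, "2014-01-01T12:00:00Z")
reading = xml_to_reading(message)
```

In a deployment along the lines described above, a string like `message` would be the body of an HTTP response returned by the REST service to a mobile client.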

3.2. Data Processing Techniques

Big data, as its name implies, refers to data that cannot be easily processed using existing infrastructure or data processing methods. Currently, researchers are working in two directions to solve this problem. One is modifying and improving current infrastructures, for instance, by strengthening processing abilities or optimizing computing structures, to handle data more efficiently. The other is developing new data management methods. Various techniques are applied in each direction and it is hard to categorize them precisely, so we introduce only several representative papers in this section.

In [50], the authors introduce RACNet, a well-designed, large-scale sensor network for high-fidelity monitoring of a data center's environmental conditions. The sensor nodes of this network are custom-made. The protocol applied is a congestion control policy called Wireless Reliable Acquisition Protocol (WRAP), which is developed by leveraging frequency and time multiplexing. The experimental results show that RACNet can improve the data center's safety and energy efficiency. WRAP is the most important part of RACNet for reliable wireless data acquisition. It inherits advantages from both distributed and centralized data collection policies. A distributed system suffers from channel contention, which eventually leads to packet losses because of the lack of coordination, especially under high network load. A centralized data collection system requires additional communication to and from the gateway, especially when the number of nodes in the network is large, and this control traffic, which grows quadratically, places a great burden on a large-scale sensing network. As a hybrid approach, WRAP uses tokens, passed one by one through the distributed nodes, to hand over the authority to send control information. Token passing thus avoids the interflow contention that may lead to congestion and packet loss.
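The general idea of token passing, in which only the node currently holding the token may transmit, can be sketched as follows. This is a generic illustration and not the WRAP protocol itself: the visiting order, the `max_per_token` parameter, and the node names are assumptions, and WRAP's frequency and time multiplexing is not modeled.

```python
from collections import deque


def token_schedule(queues, max_per_token=1):
    """Return the transmission order when a single token circulates.

    queues maps a node name to the list of packets it wants to send.
    Only the node holding the token transmits, so no two nodes ever
    contend for the channel at the same time.
    """
    pending = {node: deque(packets) for node, packets in queues.items()}
    visiting_order = list(pending)  # the token visits nodes one by one
    order = []
    while any(pending.values()):
        for node in visiting_order:
            for _ in range(max_per_token):
                if pending[node]:
                    order.append((node, pending[node].popleft()))
    return order


schedule = token_schedule({"a": [1, 2], "b": [3]})
# schedule == [("a", 1), ("b", 3), ("a", 2)]
```

Because transmissions are serialized by the token, the schedule contains no simultaneous senders, which is the property that removes contention in this simplified model.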

In [51], the authors propose prediction models to improve the geometric monitoring framework. These models provide significant communication savings, ranging from two to three orders of magnitude, compared to the transmission cost of the original monitoring framework. Multiple predictor models are shown to suit this kind of large-scale monitoring network. The concepts behind the predictor models proposed in this paper have existed for a long time, but applying them to significantly reduce the communication burden is a key idea in building a big data sensing network. If the current infrastructure cannot cope with rapidly growing data volume, current systems need to be improved or redesigned for higher computing ability or data throughput.
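The way a shared predictor can save communication can be illustrated with a deliberately simple sketch: a node transmits a reading only when it deviates from the prediction by more than a tolerance. This assumes the simplest possible predictor (the value stays at the last transmitted reading) and a single scalar stream; it does not reproduce the geometric monitoring framework or the specific predictors of [51], and the function name and tolerance are illustrative.

```python
def suppressed_stream(readings, tolerance):
    """Return the (time, value) pairs a node actually has to transmit.

    Node and coordinator both assume the value stays at the last
    transmitted reading; the node sends an update only when the real
    reading differs from that prediction by more than `tolerance`.
    """
    sent = []
    last_sent = None
    for t, value in enumerate(readings):
        predicted = last_sent
        if predicted is None or abs(value - predicted) > tolerance:
            sent.append((t, value))
            last_sent = value
    return sent


readings = [10.0, 10.1, 10.2, 11.0, 11.1, 9.0]
updates = suppressed_stream(readings, tolerance=0.5)
# Only 3 of the 6 readings are transmitted: at times 0, 3, and 5.
```

A better predictor (for example, one that extrapolates a trend) would suppress more messages for the same tolerance, which is the kind of saving the paper reports at much larger scale.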

Paper [52] introduces a data management method designed for query processing. Packets sent by sensors usually lack time information, and even when timestamps are embedded, query processing remains challenging because the amount of sensor data is unbounded. Conventional model-based query processing approaches mostly employ the relational data model on top of modeled segments of sensor data. In the cloud era, MapReduce is applied and time series are stored in key-value stores. In this paper, the authors propose the KVI-index, which combines the advantages of key-value stores and MapReduce parallel computing to accommodate new sensor data segments dynamically and efficiently.

Opportunistic sensing is another new approach, which exploits the sensing capabilities of mobile devices. It can be used to enlarge the scale of mobile sensing without additional investment. Paper [53] describes a framework for fully distributed opportunistic sensing that can perform recruitment and collect data. Profile-cast and opportunistic geocast are used for recruitment. The original version of profile-cast aims at reaching nodes that match a certain target profile, but recruitment also needs to reach nodes that match only part of the target profile. Based on opportunistic geocast, geodissemination, which calculates EVR for the buildings in the traces instead of for the hexagonal cells, achieves better performance when recruiting nodes. Similar to the recruiting case, data collection aims to reach any of the nodes that match the target profile, since sensing nodes are usually greatly out of sync.

Another way of dealing with large amounts of data is compression. Different compression algorithms suit different application scenarios. Paper [54] introduces GAMPS, a compression method that processes sensing data before they are aggregated in a data center for mining. Though the compression is not lossless, the maximum error is acceptable given the significant savings. Two key ideas are proposed in this paper. One is dynamically compressing data in groups of related signals, and the other is accounting for the different amplitudes of signals and reconstructing the joint signal within the maximum allowed reconstruction error bound. Besides these two ideas, GAMPS maintains an index so that several important queries can be answered directly from the compressed data.
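What a "maximum allowed reconstruction error bound" means in practice can be shown with a small sketch of lossy compression under a maximum-error guarantee. This is a generic piecewise-constant scheme chosen for brevity, not the GAMPS algorithm: it handles one signal at a time and omits the grouping, amplitude scaling, and index described above. Function names and sample data are illustrative.

```python
def compress(signal, max_error):
    """Greedy piecewise-constant compression with a maximum-error bound.

    Consecutive samples are merged into one segment as long as their
    range (max - min) stays within 2 * max_error; the segment is then
    represented by its midpoint, so every sample is reconstructed to
    within max_error (up to floating-point rounding).
    Returns a list of (start_index, length, value) tuples.
    """
    if not signal:
        return []
    segments = []
    start = 0
    lo = hi = signal[0]
    for i in range(1, len(signal)):
        new_lo, new_hi = min(lo, signal[i]), max(hi, signal[i])
        if new_hi - new_lo > 2 * max_error:
            segments.append((start, i - start, (lo + hi) / 2))
            start, lo, hi = i, signal[i], signal[i]
        else:
            lo, hi = new_lo, new_hi
    segments.append((start, len(signal) - start, (lo + hi) / 2))
    return segments


def decompress(segments):
    """Rebuild an approximate signal from the stored segments."""
    out = []
    for _, length, value in segments:
        out.extend([value] * length)
    return out


signal = [1.0, 1.2, 0.9, 5.0, 5.1, 5.3]
segments = compress(signal, max_error=0.25)  # six samples become two segments
restored = decompress(segments)
```

The trade-off is explicit: a larger `max_error` yields fewer segments and therefore less data to ship to the data center, at the cost of a coarser reconstruction.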

The authors of [55] worked on a data set that is only relatively “big.” In wireless sensing, the deployed nodes are usually inexpensive and have limited computing ability, energy, bandwidth, and storage space, which creates new challenges in data processing and dissemination. Though the total amount of data is not that large, it is large relative to the limitations of the sensor nodes, so novel techniques are still required to improve the networks' data processing capabilities. The method proposed in this paper compresses data streams from different sensors based on the historical information they carry. Though not lossless, the compression algorithm has a lower compression error ratio than conventional methods. The method is designed to find correlation and redundancy in the measurements of the same sensors. A base signal is extracted based on the difference of correlation signals which are from real measurement features. These measurement features are used to encode signals as well. The proposed algorithm is not restricted to a particular sensing application scenario, so it can be applied to any data set in which correlation and redundancy exist.

Sensing will undoubtedly grow in scale, and large amounts of data can be aggregated in many physical systems over time. Because these time series usually exhibit various behaviors, it is challenging to build one static model that analyzes them efficiently and benefits from the growth of data. In [56], a dynamic model that integrates multiple existing models is proposed. It selects suitable models for different series based on their extracted features. Both linear and nonlinear methods are applied in the feature extraction techniques used for individual time series. The main idea, known as “trajectory mining,” is to model the evolution path of a time series in the feature space. This paper shows that combining and improving current techniques is a convenient way to solve upcoming sensing data problems.

3.3. Techniques for Specific Problems

The growing range of applications of wireless sensor networks is producing data at a much higher rate than before. Sudden inconsistencies in the data, or outliers, often affect applications that rely heavily on timely and reliable sensory data. Current approaches to identifying outlier values introduce an overwhelming communication overhead, which limits their practical implementation. The authors of [57] propose Tunable Approximate Computation of Outliers (TACO), an outlier detection framework that trades bandwidth for accuracy. TACO supports various similarity measures such as the cosine similarity, the correlation coefficient, and the Jaccard coefficient. It involves two levels of hashing. The first level performs dimensionality reduction using locality sensitive hashing. The second level comes into play during the intracluster communication phase. TACO also employs a boosting process to improve its accuracy. TACO's novel load balancing and comparison pruning mechanisms reduce the processing and communication load at clusterheads, resulting in more uniform intracluster power consumption. Therefore, TACO can prolong unhindered network operation.
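Locality sensitive hashing for the cosine similarity can be sketched with the standard random-hyperplane construction: each measurement vector is reduced to a short bit signature, and the Hamming distance between signatures reflects the angle between the original vectors. This is a generic illustration of the dimensionality reduction idea only; the hashing details, parameters, and second-level hashing of TACO are not reproduced, and the function names, seed, and sample vectors are assumptions.

```python
import random


def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random hyperplanes (as normal vectors) in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of differing bits; more differing bits means a larger angle."""
    return sum(x != y for x, y in zip(a, b))


planes = make_hyperplanes(dim=4, bits=16)
normal = signature([20.1, 20.3, 20.2, 20.4], planes)
scaled = signature([40.2, 40.6, 40.4, 40.8], planes)      # same direction
opposite = signature([-20.1, -20.3, -20.2, -20.4], planes)  # reversed direction
```

A node only needs to transmit the 16-bit signature instead of the full vector, and a clusterhead can flag readings whose signatures are far (in Hamming distance) from those of their neighbors as outlier candidates.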

Recently, wide-area shared sensing has been the center of attention. Unlike a typical wireless sensing application, it is characterized by a relatively diverse set of queries (e.g., Max/Min, Sum, Uniform Samples, Quantiles, Top-k readings, and frequent readings) and by push-based data collection. There are several reasons for using push-based data collection, for example, a large number of geographically dispersed sensors, a query rate to the shared sensors that is substantially higher than their data collection or reporting frequency, and the occasional connectivity of some sensors (e.g., once per hour) for data reporting. These reasons make pull-based data collection at query time infeasible. The portals usually outsource data collection and query processing to third parties, called aggregators, which provide data aggregation services. Such an outsourced aggregation model faces key security challenges, since aggregators can be untrusted, compromised, or even malicious. Thus the correctness of the answers provided by aggregators should be verified to prevent incorrect query answers.

Currently, there is a need to maximize the overall value of the collected data, subject to resource constraints, in a particular class of sensor networks that focus on the reliable collection of high-resolution signals. The main characteristic of such systems is that more data are collected than can be delivered to the base station, because of severe limitations on radio bandwidth and energy. These systems also cannot use in-network data aggregation because of the high data rates and the need for raw signals. Moreover, applications look for the most “interesting” signals rather than wasting resources on “uninteresting” ones. Examples of sensor network applications in which high-resolution signals are needed from low-power wireless sensor nodes include monitoring acoustic, seismic, and vibration waveforms in bridges, industrial equipment, volcanoes, and animal habitats. The researchers in [58] present Lance, a system that provides a value-driven bandwidth and energy management framework for high-data-rate sensor networks. Lance uses cost estimators to predict the energy cost of reliably downloading each Application Data Unit from the network. It also uses user-supplied policy modules to decouple resource allocation mechanisms from application-specific policies, allowing the system to be tailored to a broad range of applications.
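Value-driven downloading under an energy budget can be illustrated with a simple greedy rule: fetch data units in decreasing order of value per unit of energy until the budget is exhausted. This is a sketch under stated assumptions, not Lance's actual scheduler: the greedy rule, the unit names, and the value and cost numbers are all hypothetical, and Lance's cost estimators and policy modules are replaced here by fixed inputs.

```python
def select_downloads(units, energy_budget):
    """Greedily pick data units by value per unit of energy.

    units is a list of (name, value, energy_cost) tuples with positive
    costs. Units are considered in decreasing order of value / cost and
    downloaded whenever the remaining budget still covers their cost.
    """
    chosen = []
    remaining = energy_budget
    ranked = sorted(units, key=lambda u: u[1] / u[2], reverse=True)
    for name, value, cost in ranked:
        if cost <= remaining:
            chosen.append(name)
            remaining -= cost
    return chosen


# Hypothetical data units: (name, application-assigned value, energy cost).
units = [("quake", 9.0, 3.0), ("noise", 1.0, 2.0), ("tremor", 4.0, 2.0)]
downloads = select_downloads(units, energy_budget=5.0)
# downloads == ["quake", "tremor"]
```

Separating the value assignment (an application policy) from the selection rule (the resource allocation mechanism) mirrors, in miniature, the decoupling described in the paragraph above.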

3.4. Security and Privacy Preserving Techniques

In this field, researchers have investigated secure network protocols [59, 60] and privacy-preserving techniques [61, 62]. The design and evaluation of large-scale urban sensing networks often utilize mobility traces of people, and there is a growing privacy concern about the public availability of such real user traces. Because synthetic movement models produce inaccurate traces for network design, there are increasing efforts to involve real-world participants in such systems. The effectiveness of cloaking techniques, such as introducing noise or reducing the resolution of the recorded data, in protecting the privacy of real-world users is not known. Meanwhile, an adversary can obtain side information, that is, information about the whereabouts of the participants (victims) in public spaces, over an extended period of time. The researchers in [63] analyze, both theoretically and experimentally, how an adversary can carry out an attack through direct observations or indirect information sources, using the huge amounts of publicized data about real user traces available on consolidated data portals or websites. The results indicate that such attacks may lead to privacy breaches. The researchers of [64] present SECOA, the first unified framework with a family of optimally secure (i.e., no false positive/negative) protocols. SECOA supports a large set of aggregates, including Most Popular Readings and Frequent Readings, in a secure aggregation scheme. SECOA also utilizes RSA encryption in one-way chains, together with aggressive optimization to reduce computation overhead.

With the help of various embedded sensors, smartphones generate a huge amount of data, and the need to classify these data naturally arises. The researchers in [61] explore a new way of building robust classifiers through collaborative learning, in which users contribute sensor data, such as audio clips, as training samples. Such learning captures user diversity and thus helps train a model that robustly recognizes the environment the user is in. Using a cloud computing platform for classifier construction raises privacy concerns about the submitted samples. The authors propose Pickle, a new approach to privacy-preserving collaborative learning. It encourages user participation by ensuring the privacy of the contributed training samples. Pickle also offers many desirable properties such as high accuracy, independent user operation, a tunable level of privacy, and robustness to poisoning attacks.

There is a growing privacy concern about the large number of applications on the Apple iPhone App Store that access private user information without the user's consent. Private user information can include the user's location, address book, music, photos, and unique identifiers such as the IMEI number, UDID, and Wi-Fi MAC address. Free applications from untrusted developers often rely on third-party advertisement frameworks as a source of income, and these frameworks can gain access to private information when a user installs such an application. The authors in [65] compare Apple iOS with the other leading mobile OS platform, Android. Android puts the responsibility of reviewing app permissions on users at the time of download, while iOS checks apps before including them on the App Store. However, recent cases of private data leakage caused by some iOS applications have led to a public outcry. The authors propose the ProtectMyPrivacy system, which detects access to private information by apps at runtime. The unique feature of this system is its crowdsourced recommendation engine, which provides app privacy recommendations based on collected and analyzed user protection decisions.

Mobile devices such as smartphones and PDAs keep growing in sensing, computation, storage, and communication capabilities, and they generate huge amounts of data very rapidly. People are now active data contributors instead of just passive data users, as was the case several years ago. People-centric urban sensing is one of the promising fields in this new direction; it supports urban-scale distributed data collection, analysis, and sharing. However, privacy concerns in such a system make users reluctant to contribute personal data. For example, a study of the relationship between air quality and public health requires researchers to obtain people's health data, such as heart rates, blood pressure levels, and weights, for some aggregate statistics. Most people will not provide their personal data unless they are assured that the data will not be misused to invade their privacy. The researchers in [62] propose PriSense, a privacy-preserving data aggregation solution for people-centric urban sensing. PriSense consists of two main components: one for additive aggregation functions and the other for nonadditive aggregation functions. It utilizes the concept of data slicing and mixing. It can support functions such as Sum, Average, Variance, Count, Max/Min, Median, Histogram, and Percentile with accurate aggregation results. The level of user privacy can be increased substantially by tuning the threshold number of colluding users and aggregation servers.
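The data slicing idea for additive aggregation can be sketched as follows: each user splits a private value into random slices that sum to the value modulo a large constant, so no single slice reveals anything on its own, while the sum of all slices still equals the sum of all values. This is a simplified illustration and not the PriSense protocol: the modulus, the number of slices, and the heart-rate values are assumptions, and the mixing step (in which slices travel through other nodes before reaching the aggregation server) is not simulated.

```python
import random

MODULUS = 2 ** 32  # assumed to be much larger than any possible total


def slice_value(value, n_slices, rng):
    """Split `value` into n_slices random shares that sum to it mod MODULUS."""
    slices = [rng.randrange(MODULUS) for _ in range(n_slices - 1)]
    last = (value - sum(slices)) % MODULUS
    return slices + [last]


def aggregate(all_slices):
    """Sum every slice from every user; individual values are never seen."""
    return sum(sum(user_slices) for user_slices in all_slices) % MODULUS


rng = random.Random(1)          # fixed seed only to make the sketch repeatable
heart_rates = [72, 65, 80]      # made-up private readings of three users
shares = [slice_value(v, 3, rng) for v in heart_rates]
total = aggregate(shares)       # equals 72 + 65 + 80
```

Because addition is commutative, the aggregation server obtains the correct sum regardless of the order or route by which the slices arrive, which is what makes mixing them among participants possible.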

4. Future Research Directions

With the development of sensing techniques and the rapid growth of sensing devices (e.g., smartphones and tablets), large amounts of sensing data will be generated, and big data has thus become a hot topic. However, big data is a relatively new concept in the world of data science. Future research on big data in sensing presents many challenges and also great opportunities for researchers.

Mature infrastructures for sensing data generation, collection, classification, analysis, and processing are desired. For now, several key network techniques [66, 67] can be applied to build this kind of general-purpose infrastructure. Cloud computing and parallel structures are essential techniques for building high-performance platforms. Grid or stream computing and relevant programming models beyond Hadoop/MapReduce and STORM can be used to define the basic architectures of future sensing. Currently, sensor networks are usually restricted to small regions and are commonly developed and maintained by individuals, labs, or certain groups. In the future, however, sensor networks should operate at the town or city level, or even the world level, and they are expected to be maintained by large companies, institutions, or governments. Data will be aggregated and distributed in different ways to all potential users, so large profits can be gained from the data sharing process. Smartphone sensing is the forerunner of such large-scale networks and is one of the topics of greatest concern in this research field. Mobile sensing will lead this field in the near future. Therefore, existing localization techniques [68, 69] should be improved to support mobile sensing.

Based on such infrastructures, data management methods will bloom. Other data science techniques have already been introduced to solve problems in the world of big data, such as data mining, crowd sourcing, database techniques, data management, security and privacy, data protection and integrity, data storage, machine learning, and neural networks. Currently, researchers are focusing on data management performance based on existing techniques. In the future, with the development of sensing infrastructure, high-performance data management methods will flourish. These methods include (i) optimization techniques that improve data analysis ability, (ii) compression methods that condense data values, and (iii) search approaches that extract useful information from databases.

With the development of data infrastructures and data management methods, it is foreseeable that sensing will reach every corner of the world, for example, in smart grids [70–72]. More security and privacy problems will then arise. Unless security problems are solved, these techniques may cause damage instead of profit. Currently, researchers are mostly focusing on privacy leakage and user data protection. However, with the development of sensing infrastructures and data management techniques, more and more sensing data will flood in, and the sensor network itself can become a target of attackers, just like the Internet. Current sensor packets are usually not encrypted, and a single node that runs the same protocols can decode information from the network or even inject an attacker's malicious information. Addressing this problem requires encryption, which places an additional burden on sensor nodes and may reduce the energy efficiency of sensor networks. How to protect sensing information efficiently is a promising direction.

Applications and research methods are inseparably interconnected. Innumerable applications may be developed based on people's needs, as determined by the big data collected, processed, and analyzed over time. Although smartphone-enabled applications are currently the most popular applications in the sensing world, other sensing applications (such as monitoring systems, remote sensing, and sustainable computing) are also promising directions to be investigated in the future.

5. Conclusion

In this survey paper, we reviewed the state of research on big data in the field of sensing. We first introduced different applications that deal with big sensing data and then summarized techniques used to solve big sensing data problems. Finally, we proposed some future research directions. Many platforms capable of sensing at the city level are still at the conceptual design stage, but many research methods have been proposed. Though most of them are based on existing data processing and management techniques, they are still very useful. Mobile sensing and smartphone applications are still considered the most popular topic. Researchers will continue to dedicate themselves to smartphone applications in the near future because smartphones form the most mature large-scale sensor network so far.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work is supported by the NSF Grant CNS-1503590.

References

1. Laney, “3D data management: controlling data volume, velocity and variety,” Application Delivery Strategies, 2001.

2. IEEE BigData 2013, http://cci.drexel.edu/bigdata/bigdata2013/

3. Xhafa, Dobre, “Intelligent services for big data science,” Future Generation Computer Systems, vol. 37, pp. 267–281, 2014. doi:10.1016/j.future.2013.07.014

4. Silva, Tsinalis, Yan, Ghanem, Guo, Lee, C.-H., Birch, “Building a generic platform for big sensor data application,” Proceedings of the IEEE International Conference on Big Data, Silicon Valley, Calif, USA, October 2013, pp. 94–102. doi:10.1109/bigdata.2013.6691559

5. Ramaswamy, Lawson, Gogineni, S. V., “Towards a quality-centric big data architecture for federated sensor services,” Proceedings of the IEEE International Congress on Big Data (BigData '13), July 2013, pp. 86–93. doi:10.1109/bigdata.congress.2013.21

6. Mattern, Floerkemeier, “That ‘internet of things’ thing, in the real world things matter more than ideas,” RFID Journal, 2009.

7. Georgakopoulos, Zaslavsky, Perera, “Sensing as a service and big data,” Proceedings of the International Conference on Advances in Cloud Computing (ACC '12), Bangalore, India, July 2012.

8. Yang, Song, “LiveWeb: a sensorweb portal for sensing the world in real-time,” Tsinghua Science and Technology, vol. 16, no. 5, pp. 491–504, 2011. doi:10.1016/s1007-0214(11)70068-2

9. Crétaux, J.-F., Jelinski, Calmant, Kouraev, Vuglinski, Bergé-Nguyen, Gennero, M.-C., Nino, Abarca Del Rio, Cazenave, Maisongrande, “Sols: a lake database to monitor in the near real time water level and storage variations from remote sensing data,” Advances in Space Research, vol. 47, no. 9, pp. 1497–1507, 2011. doi:10.1016/j.asr.2011.01.004

10. Lane, N. D., Miluzzo, Peebles, Choudhury, Campbell, A. T., “A survey of mobile phone sensing,” IEEE Communications Magazine, vol. 48, no. 9, pp. 140–150, 2010. doi:10.1109/MCOM.2010.5560598

11. Laurila, Gatica-Perez, Aad, Blom, Bornet, Trinh-Minh-Tri, Dousse, Eberle, Miettinen, “The mobile data challenge: big data for mobile computing research,” Proceedings of the Mobile Data Challenge Workshop (MDC '12), June 2012.

12. Laurila, J. K., Gatica-Perez, Aad, Blom, Bornet, T. M. T., Dousse, Eberle, Miettinen, “From big smartphone data to worldwide research: the Mobile Data Challenge,” Pervasive and Mobile Computing, vol. 9, no. 6, pp. 752–771, 2013. doi:10.1016/j.pmcj.2013.07.014

13. Sheth, Aggarwal, C. C., Ashish, “The internet of things: a survey from the data-centric perspective,” in Managing and Mining Sensor Data, chapter 12, pp. 383–428, Springer, New York, NY, USA, 2013. doi:10.1007/978-1-4614-6309-2_12

14. Wang, Cao, “Ubiquitous data collection for mobile users in wireless sensor networks,” Proceedings of the IEEE INFOCOM, Shanghai, China, April 2011, IEEE, pp. 2246–2254. doi:10.1109/infcom.2011.5935040

15. Tracey, Sreenan, “A holistic architecture for the internet of things, sensing services and big data,” Proceedings of the 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid '13), Delft, The Netherlands, May 2013, pp. 546–553. doi:10.1109/CCGrid.2013.100

16. Atzori, Iera, Morabito, “From ‘smart objects’ to ‘social objects’: the next evolutionary step of the internet of things,” IEEE Communications Magazine, vol. 52, no. 1, pp. 97–105, 2014. doi:10.1109/mcom.2014.6710070

17. Elias, A. G. F., Rodrigues, J. J. P. C., Oliveira, L. M. L., Zarpelão, B. B., “A ubiquitous model for wireless sensor networks monitoring,” Proceedings of the 6th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS '12), Palermo, Italy, July 2012, pp. 835–839. doi:10.1109/imis.2012.33

18.

Porta

T. L.

Govindan

M.-R.

Liu

Medusa: a programming framework for crowd-sensing applications

Proceedings of the 10th International Conference on Mobile Systems, Applications, andServices (MobiSys ′12)

2012

337 350

19.

Fang

Tang

Yang

Xue

Crowdsourcing to smartphones: incentive mechanism design for mobile phone sensing

Proceedings of the 18th Annual International Conference on Mobile Computing and Networking (MobiCom ′12)

August 2012

173 184

10.1145/2348543.2348567

2-s2.0-84866627852

20.

Lenzini

Luconi

Vecchio

Faggiani

Gregori

Lessons learned from the design, implementation, and management of a smartphone-based crowdsourcing system

Proceedings of the 1st International Workshop on Sensing and Big Data Mining (SenseMine ′13)

November 2013

Rome, Italy

1 6

21.

Mun

Reddy

Shilton

Yau

Burke

Estrin

Hansen

Howard

West

Boda

PEIR, the personal environmental impact report, as a platform for participatory sensing systems research

Proceedings of the 7th ACM International Conference on Mobile Systems, Applications, and Services (MobiSys ′09)

June 2009

55 68

10.1145/1555816.1555823

2-s2.0-70450248480

22.

Mohan

Padmanabhan

V. N.

Ramjee

Nericell: Rich monitoring of road and traffic conditions using mobile smartphones

Proceedings of the 6th ACM Conference on Embedded Networked Sensor Systems (SenSys ′08)

November 2008

323 336

10.1145/1460412.1460444

2-s2.0-84866503356

23.

Qin

Zhang

Sun

Discovering human presence activities with smartphones using nonintrusive wi-fi sniffer sensors: the big data prospective

International Journal of Distributed Sensor Networks 2013 2013 12

927940

10.1155/2013/927940

24.

Toscos

Chen

M. Y.

Froehlich

Harrison

Klasnja

LaMarca

LeGrand

Libby

Smith

Landay

J. A.

Consolvo

McDonald

D. W.

Activity sensing in the wild: a field trial of UbiFit Garden

Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

April 2008

1797 1806

10.1145/1357054.1357335

2-s2.0-57649188943

25.

Pripužić

Žarko

I. P.

Antonić

Publish/subscribe middleware for energy-efficient mobile crowdsensing

Proceedings of the ACM Conference on Ubiquitous Computing (UbiComp ′13)

September 2013

Zurich, Switzerland

1099 1110

10.1145/2494091.2499577

2-s2.0-84885225815

26.

Antonić

Žarko

I. P.

Jakobovic

Inferring presence status on smartphones: the big data perspective

Proceedings of the 18th IEEE Symposium on Computers and Communications (ISCC ′13)

July 2013

600 605

10.1109/iscc.2013.6755013

2-s2.0-84897460888

27.

Liu

Jiang

Sha

Govindan

Cloud-enabled privacy-preserving collaborative learning for mobile sensing

Proceedings of the 10th ACM Conference on Embedded Networked Sensor Systems (SenSys ′12)

November 2012

57 70

10.1145/2426656.2426663

2-s2.0-84873449020

28.

Weiss

G. M.

Lockhart

J. W.

Identifying user traits by mining smart phone accelerometer data

Proceedings of the 5th International Workshop on Knowledge Discovery from Sensor Data (SensorKDD ′11)

August 2011

ACM

61 69

10.1145/2003653.2003660

2-s2.0-80051680184

29.

Lockhart

J. W.

Weiss

G. M.

A comparison of alternative client/server architectures for ubiquitous mobile sensor-based applications

Proceedings of the 14th International Conference on Ubiquitous Computing (UbiComp ′12)

September 2012

721 724

2-s2.0-84879477130

30.

Xue

J. C.

Gallagher

S. T.

Grosner

A. B.

Pulickal

T. T.

Lockhart

J. W.

Weiss

G. M.

Design considerations for the WISDM smart phone-based sensor mining architecture

Proceedings of the 5th International Workshop on Knowledge Discovery from Sensor Data (SensorKDD ′11)

2011

25 33

10.1145/2003653.2003656

31.

Park

Heidemann

Data muling with mobile phones for sensornets

Proceedings of the 9th ACM Conference on Embedded Networked Sensor Systems (SenSys ′11)

November 2011

162 175

10.1145/2070942.2070960

2-s2.0-83455208234

32.

Oliveira

Rodrigues

Elias

Zarpelão

Ubiquitous monitoring solution for Wireless Sensor Networks with push notifications and end-to-end connectivity

Mobile Information Systems 2014 10 1 19 35

10.3233/MIS-130170

33.

Diallo

Rodrigues

Sene

Lloret

Distributed database management techniques for wireless sensor networks

IEEE Transactions on Parallel and Distributed Systems 2015 26 2 604 620

10.1109/TPDS.2013.207

34.

Mendes

L. D. P.

Rodrigues

J. J. P. C.

Lloret

Sendra

Cross-layer dynamic admission control for cloud-based multimedia sensor networks

IEEE Systems Journal 2014 8 1 235 246

10.1109/jsyst.2013.2260653

2-s2.0-84897589532

35.

Ullah

Rodrigues

Khan

Verikoukis

Zhu

Protocols and architectures for next-generation wireless sensor networks

International Journal of Distributed Sensor Networks 2014 2014 3

705470

10.1155/2014/705470

36.

Zhu

Zhong

Zhang

Z.-L.

Energy-synchronized computing for sustainable sensor networks

Ad Hoc Networks 2013 11 4 1392 1404

10.1016/j.adhoc.2010.11.005

2-s2.0-84877579556

37.

Zhu

Achieving energy-synchronized communication in energy-harvesting wireless sensor networks

ACM Transactions on Embedded Computing Systems 2014 13 2, article 68

10.1145/2544375.2544388

2-s2.0-84893525811

38.

Zhu

Zhang

Achieving long-term operation with a capacitor-driven energy storage and sharing network

ACM Transactions on Sensor Networks (TOSN) 2012 8 4, article 32

10.1145/2240116.2240121

2-s2.0-84867545485

39.

Zhu

Mishra

Irwin

Sharma

Shenoy

Towsley

The case for efficient renewable energy management in smart homes

Proceedings of the 3rd ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Buildings (BuildSys ′11)

November 2011

Seattle, Wash, USA

ACM

67 72

10.1145/2434020.2434042

2-s2.0-84875174038

40.

Sharma

Gummeson

Irwin

Zhu

Shenoy

Leveraging weather forecasts in energy harvesting sensor systems

Proceedings of the IEEE Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON ′14)

2014

41.

Huang

Luo

Skoda

Zhu

E-Sketch: Gathering large-scale energy consumption data based on consumption patterns

Proceedings of the IEEE International Conference on Big Data (Big Data ′14)

October 2014

Washington, DC, USA

656 665

10.1109/bigdata.2014.7004289

42.

Albert

Rajagopal

Thermal profiling of residential energy use

IEEE Transactions on Power Systems 2014 30 2 602 611

10.1109/TPWRS.2014.2329485

43.

Albert

Rajagopal

Building dynamic thermal profiles of energy consumption for individuals and neighborhoods

Proceedings of the IEEE International Conference on Big Data

October 2013

723 728

10.1109/bigdata.2013.6691644

2-s2.0-84893207808

44.

Balac

Sipes

Wolter

Nunes

Sinkovits

Karimabadi

Large scale predictive analytics for real-time energy management

Proceedings of the IEEE International Conference on Big Data

October 2013

657 664

10.1109/bigdata.2013.6691635

2-s2.0-84893303609

45.

Yang

Ting

Srivastava

M. B.

Inferring occupancy from opportunistically available sensor data

Proceedings of the 12th IEEE International Conference on Pervasive Computing and Communications (PerCom ′14)

March 2014

60 68

10.1109/percom.2014.6813945

2-s2.0-84901297127

46.

Singh

R. P.

Keshav

Brecht

A cloud-based consumer-centric architecture for energy data analytics

Proceedings of the 4th ACM International Conference on Future Energy Systems (e-Energy ′13)

May 2013

Berkeley, Calif, USA

63 74

10.1145/2487166.2487174

2-s2.0-84878631638

47.

Hasenfratz

Saukh

Walser

Hueglin

Fierz

Thiele

Pushing the spatio-temporal resolution limit of urban air pollution maps

Proceedings of the 12th IEEE International Conference on Pervasive Computing and Communications (PerCom ′14)

March 2014

69 77

10.1109/percom.2014.6813946

2-s2.0-84901299179

48.

Mitsch

Loos

S. M.

Platzer

Towards formal verification of freeway traffic control

Proceedings of the IEEE/ACM 3rd International Conference on Cyber-Physical Systems (ICCPS ′12)

April 2012

171 180

10.1109/iccps.2012.25

2-s2.0-84861509153

49.

Dobre

Xhafa

Intelligent services for big data science

Future Generation Computer Systems 2014 37 267 281

10.1016/j.future.2013.07.014

2-s2.0-84901607859

50.

Liang

C.-J. M.

Liu

Luo

Terzis

Zhao

RACNet: a high-fidelity data center sensing network

Proceedings of the 7th ACM Conference on Embedded Networked Sensor Systems (SenSys ′09)

November 2009

15 28

10.1145/1644038.1644041

2-s2.0-74549167960

51.

Giatrakos

Deligiannakis

Garofalakis

Sharfman

Schuster

Prediction-based geometric monitoring over distributed data streams

Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD ′12)

May 2012

265 276

10.1145/2213836.2213867

2-s2.0-84862665086

52.

Papaioannou

T. G.

Guo

Aberer

Model-view sensor data management in the cloud

Proceedings of the IEEE International Conference on Big Data

October 2013

282 290

10.1109/bigdata.2013.6691585

2-s2.0-84893258469

53.

Benincasa

Tuncay

G. S.

Helmy

Participant recruitment and data collection framework for opportunistic sensing: a comparative analysis

Proceedings of the 19th Annual International Conference on Mobile Computing and Networking (MobiCom ′13)

October 2013

191 194

10.1145/2500423.2504573

2-s2.0-84887091422

54.

Gandhi

Nath

Suri

Liu

GAMPS: compressing multi sensor data by grouping and amplitude scaling

Proceedings of the International Conference on Management of Data and 28th Symposium on Principles of Database Systems (SIGMOD-PODS ′09)

July 2009

771 784

10.1145/1559845.1559926

2-s2.0-70849135402

55.

Roussopoulos

Deligiannakis

Kotidis

Compressing historical information in sensor networks

Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD ′04)

June 2004

Paris, France

ACM

527 538

10.1145/1007568.1007628

56.

Sharma

Jiang

Xiong

Liu

Chen

Modeling heterogeneous time series dynamics to profile big sensor data in complex physical systems

Proceedings of the IEEE International Conference on Big Data

October 2013

631 638

10.1109/bigdata.2013.6691632

2-s2.0-84893316651

57.

Giatrakos

Kotidis

Deligiannakis

Vassalos

Theodoridis

TACO: tunable approximate computation of outliers in wireless sensor networks

Proceedings of the International Conference on Management of Data (SIGMOD ′10)

June 2010

Indianapolis, Ind, USA

279 290

10.1145/1807167.1807199

2-s2.0-77954750313

58.

Werner-Allen

Dawson-Haggerty

Welsh

Lance: optimizing high-resolution signal collection in wireless sensor networks

Proceedings of the 6th ACM Conference on Embedded Networked Sensor Systems (SenSys ′08)

November 2008

Raleigh, NC, USA

169 182

10.1145/1460412.1460430

2-s2.0-84866514281

59.

Zhu

Xiao

Ping

Towsley

Gong

A secure energy routing mechanism for sharing renewable energy in smart microgrid

Proceedings of the IEEE 2nd International Conference on Smart Grid Communications (SmartGridComm ′11)

October 2011

143 148

10.1109/smartgridcomm.2011.6102307

2-s2.0-84855828842

60.

Zhu

Zhang

Green firewall: An energy-efficient intrusion prevention mechanism in wireless sensor network

Proceedings of the IEEE Global Communications Conference (GLOBECOM ′12)

December 2012

3037 3042

10.1109/glocom.2012.6503580

2-s2.0-84877639290

61.

Liu

Jiang

Sha

Govindan

Cloud-enabled privacy-preserving collaborative learning for mobile sensing

Proceedings of the 10th ACM Conference on Embedded Networked Sensor Systems (SenSys ′12)

November 2012

Toronto, Canada

ACM

57 70

10.1145/2426656.2426663

2-s2.0-84873449020

62.

Shi

Zhang

Liu

Zhang

PriSense: privacy-preserving data aggregation in people-centric urban sensing systems

Proceedings of the IEEE INFOCOM

March 2010

San Diego, Calif, USA

10.1109/infcom.2010.5462147

2-s2.0-77953308558

63.

C. Y. T.

Yau

D. K. Y.

Yip

N. K.

Rao

N. S. V.

Privacy vulnerability of published anonymous mobility traces

IEEE/ACM Transactions on Networking 2013 21 3 720 733

10.1109/TNET.2012.2208983

2-s2.0-84879157826

64.

Nath

Chan

Secure outsourced aggregation via one-way chains

Proceedings of the ACM International Conference on Management of Data (SIGMOD ′09)

June-July 2009

Providence, RI, USA

65.

Agarwal

Hall

ProtectMyPrivacy: detecting and mitigating privacy leaks on iOS devices using crowdsourcing

Proceedings of the 11th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys ′13)

June 2013

Taipei, Taiwan

97 109

10.1145/2462456.2464460

2-s2.0-84881124452

66.

Jun

Cheng

Zhu

Exploiting sender-based link correlation in wireless sensor networks

Proceedings of the IEEE 22nd International Conference on Network Protocols (ICNP ′14)

October 2014

Raleigh, NC, USA

445 455

10.1109/icnp.2014.67

67.

Zhou

Xie

Zhu

Huang

Zhang

Xiao

EEP2P: an energy-efficient and economy-efficient P2P network protocol

Proceedings of the International Green Computing Conference (IGCC ′14)

November 2014

Dallas, Tex, USA

1 6

10.1109/igcc.2014.7039171

68.

Zhong

Zhu

Wang

Tracking with unreliable node sequences

Proceedings of the 28th IEEE Conference on Computer Communications (INFOCOM ′09)

April 2009

1215 1223

10.1109/infcom.2009.5062035

2-s2.0-70349696146

69.

Zhang

Zhou

Guo

Zhu

Xiao

Fingerprint-free tracking with dynamic enhanced field division

Proceedings of the IEEE Conference on Computer Communications (INFOCOM ′15)

April-May 2015

Hong Kong

70.

Zhong

Huang

Zhu

Zhang

Jiang

Xiao

iDES: incentive-driven distributed energy sharing in sustainable microgrids

Proceedings of the International Green Computing Conference (IGCC ′14)

November 2014

Dallas, Tex, USA

IEEE

1 10

10.1109/igcc.2014.7039166

71.

Zhu

Liu

Shin

K. G.

SHARE: SoH-aware reconfiguration to enhance deliverable capacity of large-scale battery packs

Proceedings of the ACM/IEEE 6th International Conference on Cyber-Physical Systems (ICCPS ′15)

April 2015

Seattle, Wash, USA

169 178

10.1145/2735960.2735967

72.

Huang

Corrigan

Zhu

Luo

Zhan

Exploring power-voltage relationship for distributed peak demand flattening in microgrids

Proceedings of the ACM/IEEE 6th International Conference on Cyber-Physical Systems (ICCPS ′15)

2015