Abstract
In current carpooling systems, drivers and passengers offer and search for trips through whatever medium is available, for example, a carpool website accessed by smartphone, in order to find a matching journey. While efforts have been made to achieve fast matching for known trips, the need for accurate mobile tracking of individual users remains a bottleneck. For example, drivers are reluctant to input their routes before driving, and centralized systems have difficulty tracking a large number of vehicles in real time. In this paper, we present the idea of Mobility Crowdsourcing (MobiCrowd), which leverages private smartphones to collect individual trips for carpooling without any explicit effort on the part of users. Our scheme generates daily trips and mobility models for each user and then makes carpooling zero-effort by enabling travel data to be crowdsourced instead of tracking vehicles or asking users to input their trips. With prior mobility knowledge, a user's travel routes and positions for carpooling can be predicted from the current time and location and other mobility context. Based on a realistic travel survey and simulation, we show that our scheme can provide efficient and accurate position estimation for individual carpools.
1. Introduction
Quick and easy transportation has become an essential part of modern society. Vehicles offer flexibility and mobility in our work-related and personal lives and enable rapid and timely delivery of goods, but they also cause traffic jams, carbon emissions, pollution, accidents, energy crises, and other problems. On the one hand, vehicle transportation plays a vital role in the global economy; therefore, any efficiency improvement will yield great benefits. On the other hand, the efficiency of vehicle transportation reflects how we use vehicles and determines the social costs we pay for them. Although many institutions, resources, and research efforts are dedicated to improving transportation efficiency, wasted transportation capacity is still ubiquitous.
According to the NHTS [1] from the US Department of Transportation, the average occupancy rate of personal vehicle trips is 1.6 persons per vehicle mile. Since a regular vehicle carries 5 persons at full occupancy, 68% of transportation capacity is wasted during personal trips. In the US alone, this involves 204 million personal vehicles and causes a great deal of loss. Similar inefficiency has been observed in business transportation. Taxis, vans, trucks, and other vehicles often run at low occupancy or utilization and are sometimes even empty. The oversupply comes not only from wasted transportation capacity but also from information opacity between supply and demand.
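The 68% figure follows directly from the two numbers quoted above. A minimal check, assuming the five-seat vehicle used in the text (the function name is illustrative only):

```python
# Share of seat capacity that travels empty, from the NHTS occupancy figure.
AVERAGE_OCCUPANCY = 1.6  # persons per vehicle mile, as quoted from NHTS [1]
FULL_OCCUPANCY = 5       # seats in a regular vehicle, as assumed in the text


def wasted_capacity(occupancy: float, seats: int) -> float:
    """Return the fraction of seats that are unused."""
    return 1.0 - occupancy / seats


print(f"{wasted_capacity(AVERAGE_OCCUPANCY, FULL_OCCUPANCY):.0%}")
```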
Carpooling seems to be an effective way to achieve green and efficient transportation. Traditional carpools are neighbors or coworkers with similar routes, who can easily contact each other to arrange a shared ride. Casual carpools, that is, impromptu carpools formed among strangers, can team up in public areas near HOV lanes, but they are limited to the roads where such lanes are deployed [2]. Later, carpool associations allowed people to match their respective trips through the Internet, even if they are strangers. Casual carpooling is also proposed under the “dynamic carpooling” concept, and many researchers have tried to realize it through specially designed systems. In such carpooling, drivers and passengers offer and search for trips through whatever medium is available, for example, a carpooling website accessed by smartphone, in order to find a matching journey. However, these systems have failed to provide convenient and flexible carpool services for common users. While efforts have been made to achieve fast matching for known trips, the need for accurate mobile tracking of individual users remains a bottleneck in current carpooling systems. For example, drivers are reluctant to input their routes before driving, and centralized systems have difficulty tracking a large number of vehicles and acquiring their real-time positions.
Crowdsourcing has evolved in recent years as a distributed problem-solving and business production model, proposed to reduce production costs and make more efficient use of labor and resources with the aid of participants. One example of a crowdsourcing task is seen in indoor localization: the Zee system [3] makes calibration zero-effort by enabling WiFi fingerprint training data to be crowdsourced without any explicit effort on the part of users. In this paper, we present the idea of MobiCrowd, which leverages private smartphones to collect individual trips for carpooling without any explicit effort on the part of users. Our scheme acquires location information from the smartphone, generates daily trips and mobility models for each user, and then makes carpooling zero-effort by enabling travel data to be crowdsourced instead of tracking vehicles or asking users to input their trips. With prior mobility knowledge, a user's travel routes and positions can be predicted from the current time and location, and possible carpooling can then be arranged to fit the mobility context. Based on a realistic travel survey and simulation, we show that our scheme can provide efficient and accurate position estimation for individual carpools.
The remainder of this paper is structured as follows: Section 2 presents a brief overview of related work. In Section 3, we explain the design of MobiCrowd step by step, including human mobility patterns, system overview, driving sensing, trajectory prediction, and position estimation. Section 4 evaluates our scheme via a survey and simulation, while Section 5 finally summarizes the paper.
2. Related Work
Several centralized carpooling systems have been proposed based on the techniques used in current carpool web sites, taxi service centers, and bus-tracking systems. Composite traffic [4] builds a hypothetical large-scale public transport system for the Helsinki metropolitan area, where a centralized system collects information on all trip demands online and then merges trips with the same origin and destination into public vehicles with eight or four seats. Hartwig and Buchmann [5] investigate the challenges in casual carpooling and suggest a travel matching system that extends current cell phone services. Dynamic Ride Sharing Community [6] implements a carpooling system over the Traffic Information Grid so that users can order services via the Internet, in-vehicle terminals, PDAs, and mobile phones. Lue and Colorni [7] develop a practical scheme for carpooling in a university, where users are informed immediately of delays or changes via e-mail or short messages. In [8], Chen and Regan explore the feasibility and challenges of WiFi-based carpooling systems in metropolitan locations and indicate that WiFi connectivity works well while vehicles are traveling at slow speed. In [9], Lalos et al. describe how positioning systems can be utilized to support a dynamic network of car and taxi pool services. In [10], Agatz et al. define dynamic carpooling and outline the optimization challenges that arise when developing technology to support carpooling.
Generally, the centralized approaches have some intrinsic drawbacks. On the one hand, collecting vehicle trips is an obstacle for the system, because the available carpooling depends on the number of drivers willing to input their trips. This often leads to a closed system that runs on a few registered vehicles and excludes the many unregistered ones. On the other hand, the centralized framework is neither scalable nor flexible enough. Vehicles often arrive late or early because of traffic jams and other reasons. For a service center, tracking all vehicles and distributing updates to each passenger can be highly expensive.
Alternatively, approaches that implement dynamic carpooling on a distributed framework have also been considered. In [11], Winter and Nittel propose ad hoc shared-ride trip planning in Mobile Geosensor Networks and look into the communication strategies among mobile agents in colocated geographical space. Intelligent Self-Organizing Transport [12] develops the idea of shared-ride trip planning and demonstrates the feasibility of an ad hoc carpooling system. Our previous work, Vehicle-to-Passenger Communication [13], combines distributed carpooling and vehicular communication and develops roadside vehicle calling over vehicular ad hoc networks. However, the main disadvantage of these approaches is the assumption that vehicles support ad hoc communication. This means that such schemes are not practical until vehicular communication devices are widely deployed.
Compared with previous carpooling schemes, MobiCrowd decouples carpool matching and mobile tracking, develops mobility crowdsourcing and prediction-based tracking, and results in a simple, flexible, and scalable solution to casual carpooling.
3. System Design
Before discussing the idea of MobiCrowd, we first investigate the background of human mobility and the feasibility of mobility prediction. Then, we give the global framework and explain the process of crowdsourcing step by step. Next, we explain how to detect driving with a smartphone. Finally, we establish a trip history model to predict a user's driving trajectory and then use a Markov model to estimate the user's position along the predicted trajectory.
3.1. Human Mobility
Research on human mobility has developed rapidly in recent years. Some studies note the relationship between human social activities and geographic movements. The time-variant community mobility model [14] captures two properties of human mobility via empirical WLAN traces: skewed location visiting preferences and periodical reappearance of nodes at the same location. Another study [15] investigates the trajectories of 100,000 anonymous mobile phone users and finds that human trajectories show a high degree of temporal and spatial regularity: each individual can be characterized by a time-independent characteristic length scale and a significant probability to return to a few highly frequented locations. Recent research [16] has discussed the mobility patterns of 50,000 individuals from a 3-month-long record and shown a 93% potential predictability in user mobility across the whole user base, by measuring the entropy of each individual's movement.
According to NHTS, the majority of individual daily trips, 87 percent, are taken by personal vehicle. The daily activities of an individual person, including “going to work,” “having lunch,” “shopping,” and so on, often show regular features. As a kind of human activity, driving is controlled by individual drivers and partially follows their respective social activities. Although some unexpected driving occurs, individual or household driving in a certain vehicle usually shows the same spatial and temporal features. For instance, a commuter may drive his/her car from home to office at 9:00 and from office to home at 17:00. This makes a large amount of vehicle movement predictable at the microscopic scale. Research in the field of transportation also validates these regularities, both in human mobility and in vehicle mobility. The Mobidrive project [17, 18] monitored the trajectories of private cars by collecting their GPS data and found such regular activities in the 30,000 trips performed by 320 respondents over a six-week study. Figure 1 shows one such mobility pattern, formed by the spatial distribution of the locations that a traveler experienced personally during the six weeks. The spatial regularities in vehicle mobility are marked by the gray lines of vehicle movement, in which two to four main locations (including home) cover more than 70% of the overall trips.

Since the daily driving activities of a user exhibit a high degree of temporal and spatial regularity, it is possible to establish mobility models that describe such driving activities and predict the related mobility features for carpooling.
3.2. System Overview
The proliferation of smartphones is motivating the research community to look for more reliable and more convenient carpooling with smartphone support. Today's smartphones are not only programmable but also come with cellular and WiFi interfaces and a rich set of embedded sensors, including an accelerometer, digital compass, gyroscope, GPS, microphone, and camera. These provide the sensing and communication abilities needed to serve as carpooling terminals. Thus, smartphones carried by carpools can automatically observe driving behavior, record time-stamped locations, and establish a mobility model for carpooling, without requiring any explicit effort from users. Additionally, it is easy to identify a user by his/her private smartphone and to track each carpooling trip, which helps to establish assurance between carpools who do not know each other.
Figure 2 shows the envisioned phone/server architecture for casual carpooling. We briefly describe the system components, followed by a discussion on challenges and solutions.

Figure 2: MobiCrowd framework.
For a carpool participating in MobiCrowd, the smartphone periodically tries to locate itself as accurately as possible during daily use. Outdoor locations can be collected through GPS, while indoor locations can be found through WiFi and cellular localization services when the user accesses the Internet. A trail of the user's movement can be established as a time-ordered sequence of 〈index, position, time〉 tuples. At the same time, accelerometer and gyroscope readings are used to detect whether a driving event happens. During driving, the trajectory is recorded and added to the local data set. After a period of learning, the daily locations where the user often stays for a long time and the driving trajectories among these locations are combined into a mobility model similar to the one shown in Figure 1.
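The trail and the extraction of daily locations described above can be sketched as follows. This is only an illustration under stated assumptions: the `TrailPoint` type, the 100 m radius, and the 30-minute minimum stay are hypothetical choices that do not come from the paper, and positions are taken to be latitude/longitude pairs.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class TrailPoint:
    """One <index, position, time> tuple of the movement trail."""
    index: int
    lat: float   # degrees
    lon: float   # degrees
    time: float  # seconds since an arbitrary epoch


def haversine_m(a: TrailPoint, b: TrailPoint) -> float:
    """Great-circle distance between two trail points in metres."""
    earth_radius_m = 6_371_000.0
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_m * asin(sqrt(h))


def stay_points(trail, radius_m=100.0, min_stay_s=1800.0):
    """Return (first, last) point pairs of periods in which the user stayed
    within radius_m of the first point for at least min_stay_s.
    Both thresholds are assumed values for illustration."""
    stays = []
    i = 0
    while i < len(trail):
        j = i
        # Extend the group while the next point is still close to the anchor.
        while j + 1 < len(trail) and haversine_m(trail[i], trail[j + 1]) <= radius_m:
            j += 1
        if trail[j].time - trail[i].time >= min_stay_s:
            stays.append((trail[i], trail[j]))
        i = j + 1
    return stays
```

Frequently recurring stay points would then play the role of the daily locations ("home", "work place", and so on) in the mobility model.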
After building a mobility model for the user, the smartphone uploads it to a carpooling server, which performs the crowdsourcing of user-specific mobility models. Thus, the server maintains many mobility models as carpooling resources. With this knowledge, it can manage the mobility models, estimate driving trajectories and real-time vehicle positions, respond to carpooling requests from passengers, and then make proper vehicle-to-passenger matches. Compared to previous carpooling servers, the need for active mobile tracking is removed. With mobility crowdsourcing, MobiCrowd greatly reduces the communication cost between the server and each vehicle by adopting position estimation instead of mobile tracking. However, the estimation is not always accurate, so methods to correct estimation errors should be considered. First, the smartphone should send a message to activate its model as soon as it detects driving, which avoids picking up false vehicle targets and provides an accurate start time for each drive. Furthermore, the smartphone should estimate its own position while driving according to its mobility model. Since the server and the smartphone maintain the same model, the smartphone can send a correction message if the real position is far from the estimated value.
These issues raise the question of how to estimate the position of a moving vehicle accurately. Accurate estimation requires an accurate mobility model, and an accurate mobility model requires accurate locations and trajectories and a proper modeling methodology. To collect location and trajectory data during the daily use of a smartphone, we have to know what driving behavior looks like, for example, in terms of detailed sensor readings. At the same time, we have to predict the whole drive: not only the trajectory but also the real-time position. The rest of this study is meant as a step towards a deeper understanding of these fundamental issues.
3.3. Driving Sensing
In our MobiCrowd scheme, the first difficulty is how to define driving with a smartphone, that is, which features of the sensor readings indicate that a user is driving a car while carrying his/her smartphone. Some recent works [19–21] have used smartphones to collect and analyze different driving behaviors for driving safety, which demonstrates the effectiveness of driving recognition. Human driving behavior can be regarded as a set of driving events and times, which can be detected by the sensors of a smartphone in a vehicle. However, fine-grained classification and recognition of driving behavior are not necessary for our study. In fact, we focus only on the basic driving information that is directly related to where the vehicle is at a given time.
In this study, we adopt a simple method to detect driving by measuring the movement speed between two walking events. As shown in Figure 3(a), an acceleration signature is very easy to identify in human walking patterns [22]. This signature arises from the natural up and down bounce of the human body while walking and can be used to count the number of steps walked. In a typical drive, a user walks to a car at one place, drives it to another place, parks the car, and finally walks to the destination. Thus, we can capture two successive walking events and record the end position of the first one and the start position of the second one with timestamps, as 〈Position1, Time1〉 and 〈Position2, Time2〉, respectively. Compared to walking, driving does not have obvious characteristics in acceleration readings but shows a very high moving speed, that is, a large displacement within a small time interval. After calculating the Euclidean distance between the two positions, we can work out the moving speed of the user. If the speed is at least 30 km/h, we regard the user as having driven between the two walking events. Of course, this method cannot distinguish finely between driving and riding (e.g., taking a bus, taxi, subway, or flight). We simply assume that a driver providing carpooling usually drives his/her own car, and the resulting errors can be removed by the long-term trip history model in the next subsection.
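The speed test between two walking events can be sketched as below. The sketch assumes that positions have already been projected to planar (x, y) coordinates in metres and that times are in seconds; the function and parameter names are hypothetical, and only the 30 km/h threshold comes from the text.

```python
from math import dist

SPEED_THRESHOLD_KMH = 30.0  # threshold stated in the text


def is_driving(position1, time1, position2, time2) -> bool:
    """Decide whether the user drove between two walking events.

    position1/time1: end of the first walking event.
    position2/time2: start of the second walking event.
    Positions are assumed to be (x, y) in metres, times in seconds.
    """
    elapsed_s = time2 - time1
    if elapsed_s <= 0:
        return False  # timestamps out of order or identical: no decision
    distance_km = dist(position1, position2) / 1000.0  # Euclidean distance
    speed_kmh = distance_km * 3600.0 / elapsed_s
    return speed_kmh >= SPEED_THRESHOLD_KMH
```

For example, 6 km covered in 10 minutes gives 36 km/h and counts as driving, whereas 1 km in the same time gives 6 km/h and does not. Because the straight-line distance underestimates the distance actually travelled, the computed speed is a lower bound on the real average speed.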

Figure 3: Driving sensing.
3.4. Trajectory Prediction
Some mobility prediction schemes, such as the Greedy Mobility Pattern agent [23] and the map matching algorithm [24], try to establish the mobility pattern in Figure 1 in an individual node by recording node-specific trajectories from GPS data. With a trajectory history, a moving vehicle can compare its position to its previous trajectories and find the most probable route.
However, such patterns also have some intrinsic drawbacks. First, the time dimension of vehicle movement is completely neglected. If a driver starts his/her car at home on a workday morning, he/she is most likely driving to the work place. If the driver does so on a nonworkday evening, the probability that he/she is going to the work place is very slim. Since the trip history is simplified to a trajectory history, the temporal characteristics of vehicle movement are inevitably lost. Second, overlapping trajectories can cause prediction errors. If a vehicle is moving on the only path to the driver's home, the route is clear. If the path leads to both his/her home and favorite shop, destination prediction by simple route pruning is questionable. Finally, trajectory records place a heavy burden on data processing and storage, especially those from long-distance or infrequent trips. Discarding these records may result in false predictions, whereas keeping them may exceed the capacity of the smartphone.
Thus, we need a new mobility model to predict vehicle movement, one that captures spatiotemporal mobility features, is unambiguous in destination prediction, and is lightweight in data size. Here we explain the idea of trip history with a simple example. In Figure 3, a series of trips taken by the driver can be kept as records in Table 1, in which the temporal and spatial features of the trips are represented more concisely.
(1) To drive from home to work place at 8:40, Friday.
(2) To drive from work place to home at 17:35, Friday.
(3) To drive from home to sports at 14:05, Saturday, and send his wife to first shop on the way.
(4) To drive from sports to friend's home at 18:55, Saturday.
Table 1: Trip history record.
The locations “home,” “work place,” “business place,” “first shop,” “second shop,” “sports,” and “friend's place” are not exact GPS positions but rough geographic regions. Since the driver may have no dedicated parking space at these frequently visited places, nearby parking positions within a certain scope, for example, within 500 m, can be regarded as the same parking location of a given place. Similarly, the start time of the trip is divided into discrete sets: Day (Mon., Tues., Wed., Thurs., Fri., Sat., and Sun.) and Time (<10, 10–12, 12–14, 14–16, 16–18, >18). Whenever a driver starts his/her car, MobiCrowd logs the start time and position as “Day,” “Time,” and “Source.” During movement, it obtains position data from the GPS every several seconds. When the car stops, the last position becomes “Destination.” After a trip is finished, MobiCrowd can abbreviate the trip trajectory. If the trajectory from Source to Destination agrees with the shortest (nonzero) path in the electronic map, it discards all intermediate position records, as in records 1, 2, and 4. If the trajectory does not agree, it finds the longest part of the trajectory, starting from Source, that follows a shortest path, discards the intermediate positions from Source to this first “Midway Point,” and repeats the procedure until arriving at Destination. Record 3 shows such an abbreviation, in which the Midway Point can be the first shop. Sometimes, Source equals Destination in a trip. For example, the man drives his wife from home to the second shop and, after a live parking, drives back home. In this case, the trip is abbreviated similarly, with the second shop as a Midway Point. Since most daily driving trajectories take the geographically shortest paths, the trip records can be greatly reduced in data size.
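The discretization of a trip start into “Day,” “Time,” and “Source” can be sketched as follows. Several details are assumptions for illustration only: the text does not say which bin a boundary hour belongs to (here each two-hour bin includes its lower bound, and 18:00 onward falls into ">18"), positions are taken as planar (x, y) coordinates in metres, and the place coordinates and calendar dates in the usage note are hypothetical.

```python
from datetime import datetime
from math import dist

DAYS = ["Mon.", "Tues.", "Wed.", "Thurs.", "Fri.", "Sat.", "Sun."]
PLACE_RADIUS_M = 500.0  # scope suggested in the text


def time_bin(hour: int) -> str:
    """Map an hour of day to one of the six Time sets (boundary rule assumed)."""
    if hour < 10:
        return "<10"
    if hour >= 18:
        return ">18"
    start = hour - hour % 2
    return f"{start}-{start + 2}"


def match_place(position, known_places):
    """Return the nearest known place within PLACE_RADIUS_M, or None."""
    best_name, best_distance = None, PLACE_RADIUS_M
    for name, centre in known_places.items():
        d = dist(position, centre)
        if d <= best_distance:
            best_name, best_distance = name, d
    return best_name


def trip_key(start: datetime, source, known_places):
    """Return the (Day, Time, Source) triple logged when a trip starts."""
    return DAYS[start.weekday()], time_bin(start.hour), match_place(source, known_places)
```

With `places = {"home": (0, 0), "work place": (8000, 3000)}`, a start at 8:40 on a Friday about 150 m from home, `trip_key(datetime(2024, 1, 5, 8, 40), (120, -80), places)`, yields `("Fri.", "<10", "home")`, which corresponds to the first example trip.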
With the trip history, we develop a heuristic, context-dependent induction method based on decision trees to predict vehicle trajectories. In data mining and machine learning, decision trees are widely used as a predictive tool that maps observations about an item to conclusions about its target value. The related theories and algorithms can be found in [25]. When a car starts, MobiCrowd constructs a decision tree in which the previous trip records in Table 1 are expressed as branches and leaves, as in Figure 4. In each leaf node, the probability of selecting a destination is given by

Figure 4: Decision tree structure.
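The leaf-node formula is not reproduced in this text, so the following sketch should not be read as the authors' definition. It assumes the simplest plausible estimate, the relative frequency of each destination among past trips that share the same (Day, Time, Source) context, with a fallback to the Source alone when no record matches the full context. The record layout and function name are hypothetical.

```python
from collections import Counter


def destination_probabilities(history, day, time_slot, source):
    """Estimate destination probabilities for a trip that is just starting.

    history: list of (day, time_slot, source, destination) records.
    Returns a dict mapping destination -> estimated probability.
    Assumption: relative frequency at the matching leaf, falling back to
    all trips from the same source if the full context was never seen.
    """
    contexts = [
        lambda r: r[0] == day and r[1] == time_slot and r[2] == source,
        lambda r: r[2] == source,
    ]
    for matches in contexts:
        counts = Counter(r[3] for r in history if matches(r))
        if counts:
            total = sum(counts.values())
            return {dest: n / total for dest, n in counts.items()}
    return {}  # unknown source: no prediction possible
```

A fallback of this kind keeps the prediction usable for contexts that are rare in the training period, at the cost of ignoring the time of day in those cases.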
While driving, the smartphone periodically checks whether its position is on the way to the predicted destination or midway point. If the position disagrees with the predicted route, the car needs to calculate a new destination probability by
3.5. Position Estimation
Knowing a driving trajectory is still not enough for casual carpooling, because real-time vehicle-to-passenger matching requires the accurate position of the vehicle.
To predict the position of one vehicle based on driving trajectory, we can easily eliminate the influence of ambiguous past route. Thus, we model the sequence of vehicle's positions in one specific trajectory as
Even if we can predict the time when the vehicle will arrive at its destination, it is still not easy to calculate the vehicle's exact position from time alone. The problem therefore naturally becomes a probabilistic one. We redefine the sequence of the vehicle's positions as
A Markov model offers one way to estimate position along a known trajectory. The first-order Markov model is
This means that the position of the vehicle can be estimated from its last segment, and the last segment is determined by the last two landmarks. The probability distribution for each segment is correlated with the probability distributions of the past segments.
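Since the model equations are not reproduced in this text, the sketch below only illustrates the general first-order idea under stated assumptions: a segment is represented as a pair of consecutive landmarks, and the probability of the next segment given the current one is estimated from transition counts over past trajectories. The function names and the landmark labels are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(trajectories):
    """Estimate P(next segment | current segment) from past trajectories.

    trajectories: list of segment sequences, where each segment is a
    (landmark_a, landmark_b) pair of consecutive landmarks.
    """
    counts = defaultdict(Counter)
    for segments in trajectories:
        for current, following in zip(segments, segments[1:]):
            counts[current][following] += 1
    return {
        segment: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for segment, followers in counts.items()
    }


def most_likely_next(transitions, segment):
    """Return the most probable next segment, or None if none was observed."""
    followers = transitions.get(segment)
    if not followers:
        return None
    return max(followers, key=followers.get)
```

A higher-order variant would condition on the last two or more segments by using tuples of segments as dictionary keys, which needs more history to populate each context.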
As usual, taking two or more past trajectory segments into account yields higher accuracy. The most apparent advantage over the first-order Markov model is the sense of direction. Such a higher-order model can be built in the same way as the one above. However, in this paper the direction is already known at any time. As a result, counting a larger number of segments brings only a small gain in accuracy.
Although the Markov model can be transformed to estimate the position, for instance,
4. Performance Evaluation
To evaluate MobiCrowd, we first conduct a realistic travel survey to validate the trip history model and then examine the performance of position estimation in real driving.
4.1. Travel Survey
To evaluate the driving regularities of individual vehicles and the accuracy of our trip history model, we performed a travel survey over 9 weeks. Five volunteers from the academic staff, none of them authors or contributors of this study, were invited to take part. Each possesses a private car with a GPS device. During the survey, they and their family members driving the cars were asked to note down the start time, source position, and destination position of each trip, as well as the midway point positions, if there were any. Then, we collected the records and counted the trips for each car in Table 2. The number of trips per car ranged from 135 to 243, reflecting different household driving habits, and the average number of trips per car ranged from 2.14 to 3.86 per day.
Table 2: Collected trips in travel survey.
After the survey, the trip records were translated into discrete results in Table 2 and input into the electronic map of Chengdu City, China. All trips outside the city were discarded. To evaluate the accuracy of predictions, we established a trip history model with the trips in the first eight weeks as a training set and validated the predictions with the trips in the last week as a target set. We examined the up-to-date predictions generated at different stages of the vehicle's movement. The hit rate, shown in Figure 5, rises from 65.83% at the beginning of a journey to 97.04% at the end. This shows that the trip history model can provide a reasonably accurate prediction of vehicle movement, even when the vehicle first starts moving.
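The evaluation protocol above, training on the first eight weeks and measuring the hit rate on the last week, can be sketched as follows. The trip record fields and the stand-in predictor (the most frequent destination per source) are assumptions for illustration; the paper's own predictor is the trip history model, and the sketch scores only the prediction made at the start of a trip.

```python
from collections import Counter, defaultdict


def split_by_week(trips, training_weeks=8):
    """Split trips into a training set (weeks 1..training_weeks) and a target set."""
    train = [t for t in trips if t["week"] <= training_weeks]
    target = [t for t in trips if t["week"] > training_weeks]
    return train, target


def fit_most_frequent(train):
    """Stand-in predictor: the most frequent destination for each source."""
    counts = defaultdict(Counter)
    for trip in train:
        counts[trip["source"]][trip["destination"]] += 1
    return {source: c.most_common(1)[0][0] for source, c in counts.items()}


def hit_rate(model, target):
    """Fraction of target trips whose predicted destination is correct."""
    if not target:
        return 0.0
    hits = sum(1 for trip in target if model.get(trip["source"]) == trip["destination"])
    return hits / len(target)
```

Repeating the measurement with predictions refreshed at later stages of each trip would give a curve of the kind the text describes, in which the hit rate grows as more of the route has been observed.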

Figure 5: The accuracy of trajectory prediction.
4.2. Simulation Results
In this study, we compare MobiCrowd with the conventional method used in previous carpooling systems. Fortunately, GPS localization usually shows consistent errors of nearly 10 m, which makes it suitable as a reference position. In the conventional method, mobile tracking, the smartphone receives GPS signals and reports to the server periodically during the whole drive. Here we assume that the smartphone reports its GPS position every 5 seconds. As shown in Figure 2, MobiCrowd first sends an activating message to the server, estimates the vehicle position, and sends correcting messages if necessary. Since correcting messages are used, errors in trajectory prediction can be removed, because the smartphone can report its time and position. Here we set the maximum position error to 20 m, which is an acceptable distance for ordinary carpools. If the estimation error exceeds this limit, the smartphone sends a correcting message. In the simulations, all trip data are taken from the survey in the previous subsection.
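The two reporting schemes can be compared with a small message-counting sketch. It is a simplification under stated assumptions: the position is reduced to the distance travelled along the route in metres, and the shared estimate advances at a constant assumed speed, which stands in for the mobility model. Only the 5-second interval and the 20 m limit come from the text; all names are hypothetical.

```python
REPORT_INTERVAL_S = 5  # reporting interval of conventional tracking (from the text)
MAX_ERROR_M = 20.0     # maximum tolerated estimation error (from the text)


def tracking_messages(duration_s: int) -> int:
    """Messages sent by periodic mobile tracking over one trip."""
    return duration_s // REPORT_INTERVAL_S


def estimation_messages(actual_positions, estimated_speed_mps, step_s=1):
    """Messages sent when the phone only corrects a shared estimate.

    actual_positions: distance along the route (m), sampled every step_s seconds.
    The constant-speed estimate is an assumption made for this sketch.
    """
    messages = 1  # activating message when driving is detected
    estimate = actual_positions[0]
    for actual in actual_positions[1:]:
        estimate += estimated_speed_mps * step_s
        if abs(actual - estimate) > MAX_ERROR_M:
            messages += 1      # correcting message
            estimate = actual  # both sides reset to the reported position
    return messages
```

In this sketch the number of correcting messages depends on how fast the estimate drifts from the true position, so a better mobility model directly lowers the communication cost, while periodic tracking sends the same number of messages regardless of how predictable the trip is.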
As shown in Figure 6, the position accuracy differs depending on which of the two methods the carpooling server uses to acquire the position of a moving vehicle. With the conventional method, the accuracy equals that of GPS localization and stays constant during the whole trip. With a reporting interval of 5 seconds, the average errors are less than one meter most of the time. With MobiCrowd and auxiliary correcting messages, the average errors are usually close to 10 meters. Even for casual carpooling, the accuracy of MobiCrowd is sufficient to provide real-time matching without misleading drivers and passengers.

Figure 6: The accuracy of position.
Figure 7 shows the number of messages generated by the two methods. Although both methods produce a number of messages linear in the driving time, MobiCrowd has an overwhelming advantage in communication cost. Over a whole drive, MobiCrowd generates about ten messages on average, while the conventional method exceeds 200 messages on average. Considering that a carpooling server often tracks tens of thousands of vehicles or more simultaneously, mobile tracking inevitably becomes a huge burden on the server and the network. That is, the complexity and cost of the carpooling system increase sharply, which leads to poor performance and a poor user experience. In contrast, MobiCrowd relies on position estimation based on mobility crowdsourcing, which greatly reduces communication, simplifies the system architecture, and results in an efficient and accurate carpooling service.

Figure 7: The numbers of generated messages.
Overall, our MobiCrowd scheme performs well in both accuracy and communication cost. It also demonstrates the value of mobility prediction in daily human activities.
5. Conclusion
Motivated by the need for mobile tracking in current carpooling systems, we propose MobiCrowd to achieve simple and flexible mobility crowdsourcing for carpools. The basic idea of MobiCrowd is simple: if a private smartphone knows the daily movement of its owner, why not let it predict the owner's driving trajectory and real-time position for carpooling?
In this paper, we introduce human mobility patterns to investigate the daily movement of carpools. Since carpools drive cars with their smartphones, we use smartphones to build mobility models and acquire the accurate position of moving vehicles. First, we explain how to detect driving with a smartphone. Then, we establish a trip history model to predict a user's driving trajectory. Based on the trajectory prediction, we use a Markov model to estimate the accurate position of the moving vehicle. Finally, we show through a realistic travel survey and simulation that our scheme can provide efficient and accurate position estimation.
We believe that MobiCrowd has a bright future in next-generation traffic networks. In such networks, mobile tracking is no longer a bottleneck for casual carpooling, carpools are characterized by their mobility models, carpool matching is made without any explicit effort, vehicles run at higher occupancy in less traffic, and all of us benefit from green and efficient vehicle transportation.
Acknowledgment
This paper is supported in part by the China NSF Grants (61103226, 60903158, 61170256, 61173172, and 61272526), and the Fundamental Research Funds for the Central Universities Grants (ZYGX2010J074, ZYGX2011J102, and ZYGX2012J083).
