Abstract
Origin-destination (OD) data collection is steadily moving from conventional survey techniques (roadside interviews, license plate surveys, etc.) toward passively collected big data sources, such as those based on global positioning system (GPS) traces and cell phone call detail records (CDR). In this study, a new passive data source, Google’s Aggregated and Anonymized Trips (AAT), was used to derive hourly OD demand matrices for the San Francisco Bay Area. Because the AAT dataset contains relative flows (weights) rather than absolute trip counts, machine learning techniques were applied to convert these weights into trips, with observed OD flows from an expanded household travel survey serving as the reference. Several machine learning models performed well on both training and test data, with the multi-layer perceptron (MLP), a neural network approach, performing best for the conversion. Additionally, all models were used for prediction in a hypothetical application context in which the input AAT data were scaled by different growth factors. This exercise showed that, although the models’ trip predictions were initially close to one another, they diverged widely across OD markets of different magnitudes and across growth factors.
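The conversion from relative weights to absolute trips, and the effect of applying a growth factor, can be illustrated with a minimal sketch. This is not the study's method: a simple power-law regression stands in for the MLP, the data are synthetic, and all names (`fit_power_law`, `predict_trips`, `relative_weights`, `survey_trips`) are hypothetical.

```python
import math


def fit_power_law(weights, trips):
    """Fit trips ~= exp(a) * weight**b by least squares in log-log space."""
    xs = [math.log(w) for w in weights]
    ys = [math.log(t) for t in trips]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx               # slope: elasticity of trips w.r.t. weight
    a = mean_y - b * mean_x     # intercept: log of the scale constant
    return a, b


def predict_trips(a, b, weight, growth_factor=1.0):
    """Predict absolute trips for a relative weight scaled by a growth factor."""
    return math.exp(a) * (weight * growth_factor) ** b


# Synthetic stand-ins (assumptions): relative weights for five OD pairs and
# the matching "observed" trips from an expanded survey.
relative_weights = [0.5, 1.0, 2.0, 4.0, 8.0]
survey_trips = [40.0 * w ** 0.9 for w in relative_weights]

a, b = fit_power_law(relative_weights, survey_trips)

# Apply hypothetical growth factors to one OD pair's weight.
for g in (1.0, 1.5, 2.0):
    print(g, predict_trips(a, b, 2.0, g))
```

Because the fitted exponent `b` differs from 1 in this toy setup, doubling the input weight does not double the predicted trips. Models with different functional forms can therefore agree on the observed data yet diverge once the inputs are scaled, which is the kind of behavior the growth factor exercise examines.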
