Analyzing multiday route choice behavior of commuters using GPS data

Abstract

In this study, accurate global positioning system (GPS) and geographic information system (GIS) data were employed to reveal the routes people used over multiple days and to study multiday route choice behavior for trips with the same origin and destination, from home to work. The study offers a new way of thinking about route choice modeling. Travelers are classified into three types based on the deviation between their actual routes and the shortest travel time paths. Based on this classification, a two-stage route choice process is proposed, in which the first stage classifies the travelers and the second models their route choice behavior. After analyzing the characteristics of the different types of travelers, an artificial neural network was adopted to classify travelers and model route choice behavior. An empirical study was carried out using GPS data collected in the Minneapolis–St Paul metropolitan area. It finds that most travelers follow the same route during commute trips on successive days, and that the different types of travelers differ significantly in their route choice properties. The modeling results indicate that the neural network framework can classify travelers and model route choice well.

Keywords

Multiday travel behavior, day-to-day modeling, route choice behavior, global positioning system data

Introduction

Understanding the multiday route choice behavior of commuters is one of the most important tasks in travel behavior modeling. In traditional surveys, respondents were asked to list the paths they used over multiple days in order to reveal day-to-day differences, so the quality of the results was sensitive to the accuracy of respondents’ memories. Global positioning system (GPS) devices can capture location, speed, and other information every second. The extensive use of GPS-based travel surveys in the last few years provides an opportunity to trace vehicle movements; hence, GPS will likely become the main mode of travel data collection in the future as smartphone-based platforms for collecting travel data come into use. GPS data present some obvious advantages (and some disadvantages) compared with traditional diary-based surveys. GPS can capture the time and location of trips more precisely and leaves less room for respondents to omit trips from the survey. In particular, multiple days of travel can easily be captured for each respondent.

If travelers were perfectly rational decision makers, with complete information and perfect computational skills, who cared only about their own travel time, they would choose the shortest paths day by day.¹ However, previous research^2,3 has found that fewer than 40% of travelers use the shortest paths, though 90% of subjects took routes that were within 5 min of the shortest paths. Travelers consider criteria beyond travel time,^4,5 such as monetary cost,⁶ avoiding stops,⁷ travel time reliability,⁸ and aesthetics.⁹ In multiday route choice behavior, travelers might engage in route search or instead remain on habitual routes. Previous studies show that some commuters do not always follow exactly the same route to work,¹⁰ but some do. We therefore want to differentiate travelers who usually take the same route from those who do not. Whether travelers switch routes may depend on their socio-demographic characteristics, trip and route characteristics, or other circumstances such as weather conditions, time of day, and traffic information.¹¹

The classification of different route choice analysis models relies on the distinction between different model structures. Some model frameworks were based on artificial intelligence (AI) theory, such as fuzzy logic,^12–15 artificial neural networks (ANN),^16,17 as well as cognitive psychology.^18,19 In the discrete choice framework, multinomial logit (MNL) is the simplest method available, but does not recognize the similarity between routes that share common roadway links. The C-logit, path-size logit (PSL), and path-size correction logit (PSCL) models under this category address this issue by either making changes to the deterministic or the random component of the utility.^20–22

Given the importance of multiday route choice modeling and the accuracy of GPS data, this study focuses on multiday route choice behavior by evaluating routes followed by residents of the Minneapolis–St Paul metropolitan area, as measured by the GPS component of the 2010 Twin Cities travel behavior inventory (TBI). In previous studies, travelers’ route choice behavior was analyzed with a single model.^12–22 However, different travelers follow different decision processes. In this study, instead of analyzing all travelers uniformly, travelers are for the first time classified into three types based on the difference between the actual route and the shortest travel time path: same route shortest path (SRSP) travelers, same route not shortest path (SRNSP) travelers, and not same route (NSR) travelers. A new way of thinking about route choice modeling, a two-stage route choice process, is proposed. After analyzing the characteristics of the different types of travelers, an ANN was adopted to classify travelers and model route choice behavior. An empirical study using the collected GPS data is then presented.

The rest of this article is organized as follows: data, methodology, models, empirical result, and conclusion.

Data

Several kinds of data were adopted in this study, including travel data and base data (network and speed data).

The travel data were collected by the GPS component of the 2010 Twin Cities TBI, conducted by the Metropolitan Council in the Minneapolis–St Paul metropolitan area. The survey is detailed in the TBI report for the Metropolitan Council, and the data are available from the Transportation Secure Data Center. Valid data were collected from 278 persons in 250 households as part of the TBI survey. The data were processed through trip identification and exclusion steps, which are detailed in Table 1.

Table 1.

TBI GPS trips used/unused for analysis, by reason of exclusion.

Steps	Numbers	Description
Origin trips	16,902	Trips were identified from the time gap between two successive GPS points. If the dates of two GPS points were different and the points were not at midnight, the latter point was considered the origin of the next trip. If the dates of two GPS points were the same, their times were checked; if the time gap was greater than a threshold (300 s), the points were also assigned to different trips.
1	12,572	Remove trips in which every value in the speed column is zero.
2	8,461	Remove trips with a duration of less than 5 min.
3	4,895	Because in some trips the speed is only “2” or “0”, remove trips with an average speed of less than 2.
H2W, auto	142	Use the classification method to identify trips by purpose and mode.
H2W, auto	124	The destinations of two of the trips are not in the Twin Cities GIS network, so they were excluded. Some of the trips involved indirect travel from home to work; these indirect trips were excluded from the H2W category and were instead included in H20.
H2W, auto, multiday	109	Select multiday trips

GPS: global positioning system; GIS: geographic information system; H2W: home to work.
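The trip identification step in Table 1 can be illustrated with a short sketch. It is a simplified reading of the rule, not the processing code used in the study: only the 300 s gap threshold comes from Table 1, while the function name `split_into_trips`, the example timestamps, and the treatment of a date change as just another time gap are assumptions.

```python
from datetime import datetime, timedelta

GAP_THRESHOLD_S = 300  # time-gap threshold stated in Table 1


def split_into_trips(timestamps, gap_s=GAP_THRESHOLD_S):
    """Group time-ordered GPS timestamps into trips.

    Assumption: a new trip starts whenever the gap to the previous point
    exceeds gap_s, so a trace that crosses midnight without a long gap
    stays in one trip.
    """
    trips = []
    for t in timestamps:
        if trips and (t - trips[-1][-1]).total_seconds() <= gap_s:
            trips[-1].append(t)   # same trip continues
        else:
            trips.append([t])     # first point, or gap too long: new trip
    return trips


# Hypothetical trace: 10 one-second points, a pause of about 10 min, 5 more points.
start = datetime(2010, 5, 3, 7, 30, 0)
points = [start + timedelta(seconds=s) for s in range(10)]
points += [start + timedelta(seconds=609 + s) for s in range(5)]

trips = split_into_trips(points)
print([len(trip) for trip in trips])
```

The later exclusion steps (zero speed, duration under 5 min, low average speed) would then be simple filters over the resulting trips.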

Trips are selected based on mode classification and commute trip identification rules. Mode classification rules are shown in Table 2, while commute trips are identified from the locations of the trip origin and destination. If the trip origin is within 500 m of the home location and the destination is within 500 m of the work location, the trip is considered a commute trip.
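A minimal sketch of the commute identification rule is given below. The 500 m buffer is taken from the text; the use of great-circle (haversine) distance, the function names, and the example coordinates are assumptions made for illustration, since the distance measure used in the study is not stated.

```python
import math

EARTH_RADIUS_M = 6_371_000
BUFFER_M = 500  # distance threshold stated in the text


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_commute_trip(origin, destination, home, work, buffer_m=BUFFER_M):
    """True if the trip starts near home and ends near work."""
    return (haversine_m(*origin, *home) <= buffer_m
            and haversine_m(*destination, *work) <= buffer_m)


# Hypothetical home and work locations in the Twin Cities area.
home = (44.9778, -93.2650)
work = (44.9537, -93.0900)

near = is_commute_trip((44.9790, -93.2655), (44.9540, -93.0905), home, work)
far = is_commute_trip((45.0000, -93.2650), (44.9540, -93.0905), home, work)
print(near, far)
```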

Table 2.

Trips mode classification rules.

Steps	Modes	Interpretation
1	Walk	Maximum speed of all points ≤20 km/h
		Duration ≥60 s
		Percentile of speed of all points ≤10 km/h
		Average speed of all points ≤6 km/h
2	Rail	Distance from first point of speed accelerates to 10 km/h to the nearest rail station <150 m
		Distance from last point that speed is greater than 10 km/h to the nearest rail station <150 m
		Average speed of all points >10 km/h
3	Bus	Distance from first point of speed accelerates to 10 km/h to the nearest bus stop <50 m
		Distance from last point that speed is greater than 10 km/h to the nearest bus stop <50 m
		Average speed of all points >10 km/h
4	Bicycle	85th percentile of speed of all points ≥10 and <20 km/h
		Max. speed of all points ≤30 km/h
5	Car	The remaining trip segments with average speed of all points >10 km/h

The base data include network data and speed data. Data assembly requires a high-resolution geographic information system (GIS)-based roadway network for the study area. The Lawrence Group (TLG) Twin Cities network, with 290,231 links and 113,864 nodes, is used as the base road network in this study. It covers the seven-county Minneapolis metropolitan area and is considered by local planners to be the most accurate GIS street map of the regional network to date. However, the TLG GIS layer only has information on the functional classification and distance of most links in the roadway network; it lacks speed information. The speed data source is the TomTom road network data for 2010, which was acquired by the Metropolitan Council for the TBI. TomTom speed data cover seven periods in the 24-h day. The link travel speed was chosen according to the period containing the trip’s start time in the GPS data.

Once the travel data and base data were assembled, map matching was performed. In this process, the specific roadway links traversed by a vehicle are identified by mapping the points of its GPS trace onto an underlying GIS-based roadway network database. In this study, multiday auto commute (H2W) GPS data were matched to the TLG Twin Cities network to obtain the actual routes. The shortest travel time paths were found based on link travel speeds on the TomTom network. Each actual route was compared with the shortest travel time path, and the actual routes taken by the same traveler on different days were compared with each other.
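The shortest travel time path computation can be illustrated on a toy network. The sketch below uses Dijkstra's algorithm with link travel time derived from length and speed; it stands in for, and is not, the GIS tooling used in the study. The network, node names, and the function `shortest_time_path` are hypothetical.

```python
import heapq


def shortest_time_path(links, origin, destination):
    """Dijkstra search on travel time.

    links: dict {(from_node, to_node): (length_km, speed_kmh)}
    Returns (travel_time_min, list_of_nodes); (inf, []) if unreachable.
    """
    graph = {}
    for (u, v), (length_km, speed_kmh) in links.items():
        graph.setdefault(u, []).append((v, 60.0 * length_km / speed_kmh))

    best = {origin: 0.0}   # best known travel time to each node
    prev = {}              # predecessor on the best known path
    heap = [(0.0, origin)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == destination:
            break
        if t > best.get(u, float("inf")):
            continue       # stale heap entry
        for v, dt in graph.get(u, []):
            nt = t + dt
            if nt < best.get(v, float("inf")):
                best[v] = nt
                prev[v] = u
                heapq.heappush(heap, (nt, v))

    if destination not in best:
        return float("inf"), []
    path = [destination]
    while path[-1] != origin:
        path.append(prev[path[-1]])
    return best[destination], path[::-1]


# Hypothetical network: a short local route and a longer but faster freeway route.
links = {
    ("home", "A"): (2.0, 40.0),     # 3 min
    ("A", "work"): (6.0, 40.0),     # 9 min
    ("home", "ramp"): (3.0, 60.0),  # 3 min
    ("ramp", "work"): (8.0, 96.0),  # 5 min
}
minutes, path = shortest_time_path(links, "home", "work")
print(minutes, path)
```

In this example the freeway route is longer in distance (11 km against 8 km) but shorter in time, which is the kind of case in which an actual route and the shortest travel time path can differ.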

Descriptive statistics

Before examining individual’s classification and travel behavior in detail, we report some results of descriptive statistics.

In the sample, 35 travelers made multiple home-to-work trips, averaging 3.11 trips per traveler and giving 109 trips in total. Among these multi-trip travelers:

26 of 35 travelers (74%, comprising 83 trips) took the same route each day. Of these:

7 of the 26 travelers (27%, comprising 25 trips) took the shortest travel time paths; we call them SRSP travelers.

19 of the 26 travelers (73%) did not generally use the shortest path; these are SRNSP travelers.

9 of 35 travelers (26%, comprising 26 trips) did not take the same route each day; we call them NSR travelers.

Although the sample is small, these statistics show that most travelers (about 75%) choose the same route for their multiday commute trips. Moreover, most of these travelers (about 73%) choose a specific route that is not the shortest travel time route. This is consistent with previous research finding that most travelers take the same route for commute trips. In the study by Abdel-Aty et al.,¹⁰ about 15.5% of the respondents said they use more than one route to work. In our study, about 25% of the travelers used more than one route. Although this value is about 10 percentage points higher than that in the study by Abdel-Aty et al.,¹⁰ it still indicates a similar result, considering the now wide use of information systems.

Methodology

Classifier of travelers

Based on the comparison of the routes chosen by commuters during their morning commutes, travelers can be divided into two groups: same route (SR) travelers and NSR travelers. SR travelers choose the same route for their commute trips each day, while NSR travelers choose at least two different routes. SR travelers can be further classified into two types: SRSP travelers and SRNSP travelers. An SRSP traveler always chooses the shortest travel time path and is, in a sense, a perfectly rational traveler. SRNSP travelers choose a specific path that is not the shortest travel time path. The characteristics of these three types of travelers are presented in detail in the following sections.
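The three-way definition can be written as a small labeling rule. This is a sketch under the assumption that a route is represented as an ordered sequence of link IDs and that "same route" means an exact match of these sequences; the function name and link IDs are hypothetical.

```python
def classify_traveler(daily_routes, shortest_path):
    """Label a traveler as SRSP, SRNSP, or NSR from observed routes.

    daily_routes: one route per observed day, each a sequence of link IDs.
    shortest_path: the shortest travel time path as a sequence of link IDs.
    Assumption: routes are equal only if their link sequences match exactly.
    """
    distinct = {tuple(route) for route in daily_routes}
    if not distinct:
        raise ValueError("at least one observed route is required")
    if len(distinct) > 1:
        return "NSR"                 # at least two different routes
    (route,) = distinct
    return "SRSP" if route == tuple(shortest_path) else "SRNSP"


shortest = ["l1", "l2", "l3"]
print(classify_traveler([["l1", "l2", "l3"], ["l1", "l2", "l3"]], shortest))
print(classify_traveler([["l1", "l4", "l3"], ["l1", "l4", "l3"]], shortest))
print(classify_traveler([["l1", "l2", "l3"], ["l1", "l4", "l3"]], shortest))
```

This rule labels travelers from observed behavior; the classification model described later instead predicts the label from household, individual, and industry variables.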

Two-stage choice process

In the models proposed in previous studies, whether within the discrete choice modeling framework^20–22 or based on other theories, such as AI theory,^12–19 all commuters in the sample are analyzed with the same choice model. According to the traveler classification defined above, however, different types of travelers follow different route choice decision processes. Based on this, a two-stage choice process, consisting of a traveler classification stage and a route choice stage, is presented in this study. In the traveler classification stage, drivers are classified as SRSP, SRNSP, or NSR travelers based on rules or theory. In the route choice stage, each type of traveler is examined in its own way. SRSP travelers are easy to model because they choose the shortest travel time paths, so the result can be obtained from network topology and link performance alone, regardless of socio-demographics. For SRNSP and NSR travelers, route choice behavior can be modeled within either the discrete choice modeling framework or AI theory. The two-stage choice process framework is presented in Figure 1.

Figure 1.

The framework of two-stage choice process.

Neural network classifier

Here, a neural network classifier model is constructed and used to analyze the type of traveler as well as the route choice behavior of SRNSP travelers. A neural network can store a large number of input–output mappings without requiring an explicit mathematical description of the relationship. A typical ANN includes three layers: an input layer, a hidden layer, and an output layer. Each layer consists of units (neurons) that represent elements. In the learning phase, the outputs of one layer are fed to the units of the next layer, with no feedback to the previous layer. A back propagation (BP) neural network combines forward signal passing with backward error passing. In the backward error passing step, the weights of the hidden-layer elements are corrected based on the error assigned to the units. The learning phase stops when the error is acceptable or when the number of iterations reaches a preset limit. In the testing phase, the test data are passed through the trained classifier, and the output is compared with the actual values. The resulting error is used to evaluate the neural network. Figure 2 shows the connection scheme of a typical multi-layer network. In this structure, each input ($x_{i}$) is assigned an associated weight ($w_{ij}$) connecting the input layer to the hidden layer, while the weights $w_{jk}$ connect the hidden layer to the output layer. The mathematical formulations of the processing functions and error update functions can be found in many previous studies of ANN.^16,23–25
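The forward and backward passes of a three-layer BP network can be sketched in a few lines of plain Python. This is a generic textbook illustration, not the MATLAB toolbox configuration used in the study: the sigmoid activation, squared-error loss, per-sample weight updates, learning rate, and toy data are all assumptions. The names `w_ij` and `w_jk` follow the weight notation of Figure 2.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_ij, w_jk):
    """Pass the signal forward: input -> hidden -> output."""
    hidden = [sigmoid(sum(x[i] * w_ij[i][j] for i in range(len(x))))
              for j in range(len(w_jk))]
    output = [sigmoid(sum(hidden[j] * w_jk[j][k] for j in range(len(hidden))))
              for k in range(len(w_jk[0]))]
    return hidden, output


def train_bp(samples, n_hidden=4, lr=0.5, max_epochs=2000, tol=1e-3, seed=0):
    """Train until the mean squared error is below tol or max_epochs is reached."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    w_ij = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    w_jk = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_hidden)]
    history = []
    for _ in range(max_epochs):
        sq_err = 0.0
        for x, y in samples:
            hidden, output = forward(x, w_ij, w_jk)
            # Error terms at the output layer, then passed backward to the hidden layer.
            d_out = [(output[k] - y[k]) * output[k] * (1 - output[k])
                     for k in range(n_out)]
            d_hid = [hidden[j] * (1 - hidden[j])
                     * sum(d_out[k] * w_jk[j][k] for k in range(n_out))
                     for j in range(n_hidden)]
            for j in range(n_hidden):
                for k in range(n_out):
                    w_jk[j][k] -= lr * d_out[k] * hidden[j]
            for i in range(n_in):
                for j in range(n_hidden):
                    w_ij[i][j] -= lr * d_hid[j] * x[i]
            sq_err += sum((output[k] - y[k]) ** 2 for k in range(n_out))
        history.append(sq_err / len(samples))
        if history[-1] < tol:
            break
    return w_ij, w_jk, history


# Toy data (logical OR); the third input is a constant 1 acting as a bias term.
samples = [([0, 0, 1], [0]), ([0, 1, 1], [1]), ([1, 0, 1], [1]), ([1, 1, 1], [1])]
w_ij, w_jk, history = train_bp(samples)
print(len(history), history[0], history[-1])
```

The two stopping conditions in `train_bp` correspond to the acceptable-error and preset-iteration criteria described above.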

Figure 2.

A typical multi-layer network.

Models

Property analysis model

As defined in the previous section, travelers are classified into three types: SRSP travelers, SRNSP travelers, and NSR travelers. To analyze and compare the properties of the different types of travelers, five variables are proposed in this study.

The overall difference between the actual routes taken by one traveler day by day ($O_{aa}$) is given by

$$O_{aa} = \frac{1}{Q_{n}} \sum_{Q_{n}} \left(1 - \frac{D_{overlap, i, j}^{n}}{D_{shorter, i, j}^{n}}\right)$$

where $D_{overlap, i, j}^{n}$ is the overlapping distance between two actual routes i and j for traveler n, $D_{shorter, i, j}^{n}$ is the distance of the shorter of routes i and j between the origin and destination for traveler n, and $Q_{n}$ is the number of route pairs (i, j) for traveler n: $Q_{n} = C_{M_{n}}^{2} = M_{n} (M_{n} - 1) / 2$, where $M_{n}$ is the total number of actual routes for traveler n.

For SRSP and SRNSP travelers, $O_{aa}$ is equal to 0 because every pair of actual routes overlaps completely, so the overlapping distance equals the route distance (i.e. $D_{overlap, i, j}^{n}$ = $D_{shorter, i, j}^{n}$). For NSR travelers, $O_{aa}$ is larger than 0 but less than or equal to 1. A higher $O_{aa}$ means a greater difference; an $O_{aa}$ of 1 means no overlap (the routes are completely different).
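A short worked example may help make $O_{aa}$ concrete. The sketch assumes that each daily route is a list of link IDs, that the overlapping distance is the total length of the links shared by two routes, and that $M_{n}$ counts one route per observed day; the link lengths and route data are hypothetical.

```python
from itertools import combinations


def route_length(route, link_length):
    return sum(link_length[link] for link in route)


def overall_difference(routes, link_length):
    """O_aa: mean of (1 - overlap / shorter) over all route pairs (i, j)."""
    pairs = list(combinations(routes, 2))   # Q_n = M_n (M_n - 1) / 2 pairs
    if not pairs:
        raise ValueError("at least two routes are required")
    total = 0.0
    for r_i, r_j in pairs:
        overlap = sum(link_length[link] for link in set(r_i) & set(r_j))
        shorter = min(route_length(r_i, link_length), route_length(r_j, link_length))
        total += 1.0 - overlap / shorter
    return total / len(pairs)


# Hypothetical link lengths in km and three observed days for one traveler.
link_length = {"a": 2.0, "b": 3.0, "c": 5.0, "d": 4.0}
same_route = [["a", "b", "c"], ["a", "b", "c"], ["a", "b", "c"]]
mixed_routes = [["a", "b", "c"], ["a", "b", "c"], ["a", "d", "c"]]

o_aa_same = overall_difference(same_route, link_length)
o_aa_mixed = overall_difference(mixed_routes, link_length)
print(o_aa_same, o_aa_mixed)
```

For the mixed case, two of the three pairs share 7 km of a 10 km shorter route, so each contributes 0.3 and the third pair contributes 0, giving a mean of about 0.2.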

As described above, $O_{aa}$ reflects the overall difference among the actual routes. We also assess the variation between the actual route and the shortest time route (using TomTom data) in two respects: the percentage of overlap between them, and how far the actual route is from the shortest time route in travel time. Analogous to the overall difference between actual routes, the differences between the actual route and the shortest time route $(O_{as_p}, O_{as_t})$ can be written as

$$\begin{aligned} O_{as_p} &= \frac{1}{M_{n}} \sum_{M_{n}} P_{overlap, i, s}^{n} \\ O_{as_t} &= \frac{1}{M_{n}} \sum_{M_{n}} T_{i, s}^{n} \end{aligned}$$

where $P_{overlap, i, s}^{n}$ is the percentage of overlap between actual route i and the shortest time route, and $T_{i, s}^{n}$ is the relative time difference between actual route i and the shortest time route, $T_{i, s}^{n} = (t_{GPS, i}^{n} - t_{s}^{n}) / t_{s}^{n}$, which is positive.

Because the difference between the actual route and the shortest time route fluctuates from day to day, we also calculated, for each traveler, the standard deviation of the percentage of overlap $(S D_{p})$ and of the relative time difference $(S D_{t})$ between the actual route and the shortest travel time route. These five variables are described in detail in Table 3.
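The four measures relative to the shortest travel time path can be computed per traveler as sketched below. The input values are hypothetical, overlap is expressed as a share between 0 and 1, and the population standard deviation is an assumption, since the text does not say whether the population or sample form was used.

```python
from statistics import mean, pstdev


def shortest_path_measures(overlap_shares, gps_times, shortest_time):
    """Per-traveler O_as_p, O_as_t, SD_p and SD_t.

    overlap_shares: share of each actual route overlapping the shortest time path.
    gps_times: observed travel time of each actual route (same unit as shortest_time).
    shortest_time: travel time of the shortest travel time path.
    """
    rel_diff = [(t - shortest_time) / shortest_time for t in gps_times]
    return {
        "O_as_p": mean(overlap_shares),
        "O_as_t": mean(rel_diff),
        "SD_p": pstdev(overlap_shares),  # assumption: population standard deviation
        "SD_t": pstdev(rel_diff),
    }


# Hypothetical traveler observed on three days; times in minutes.
measures = shortest_path_measures(
    overlap_shares=[0.6, 0.8, 0.7],
    gps_times=[22.0, 24.0, 20.0],
    shortest_time=20.0,
)
print(measures)
```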

Table 3.

Significances of variables.

Variable	Significance
$O_{aa}$	It reflects the strength of fluctuation in route choice behavior. When it is equal to 0, the traveler uses the same route day by day, whether or not that route is the shortest travel time path. When it is equal to 1, the traveler takes a completely different route every day, although this is impossible in practice
$O_{as_p}$	It is an average percentage of overlap between actual route and shortest travel time path during the survey period. It reflects the traveler’s actual route choice behavior relative to perfect rational choice behavior (Under perfect rationality, travelers are assumed to choose the route with lowest cost)
$O_{as_t}$	It is an average relative time difference between actual route and shortest travel time path during the survey period. It indicates the difference of performance between actual route and shortest travel time path
$S D_{p}$	It is a measure that is used to quantify the dispersion of the set of percentage of overlap between every actual route and shortest travel time path
$S D_{t}$	It is a measure that is used to quantify the dispersion of the set of relative time difference between every actual route and shortest travel time path

Classification model

Traveler type may be influenced by many factors, including household characteristics, individual demographics, and employment. We developed a model to classify travelers

$$R^{n} = F\left(I_{hh}^{n}, T_{cur}^{n}, R_{n}, N_{veh}^{n}, A_{n}, G_{n}, E_{n}, I_{in}^{n}, S_{employment}^{n}, H_{work}^{n}, F_{work}^{n}\right)^{T}$$

where

$$R^{n} = \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{cases} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} & \text{if } n \in \text{SRSP} \\ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} & \text{if } n \in \text{SRNSP} \\ \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} & \text{if } n \in \text{NSR} \end{cases}$$

$I_{hh}^{n}$ is the weighted household income of traveler n, $T_{cur}^{n}$ is the weighted length of time at current address of traveler n, $R_{n}$ is the weighted household region classification, $N_{veh}^{n}$ is the weighted number of household vehicles, $A_{n}$ is the weighted age of traveler n, $G_{n}$ is the weighted gender of traveler n, $E_{n}$ is the weighted education level of traveler n, $I_{in}^{n}$ is the weighted type of industry, $S_{employment}^{n}$ is the weighted employment status of traveler n, $H_{work}^{n}$ is the weighted average number of hours worked in a week of traveler n, and $F_{work}^{n}$ is the weighted flexibility in work hours of traveler n.

The neural network used in this study consists of the input layer, the hidden layer, and the output layer, as shown in Figure 3. In our model, three groups of information, household properties, individual socio-demographics, and type of industry, are considered important in the traveler classification stage. The input layer has 11 elements, which distribute the various pieces of information to the network. A driver’s student status could also be used as an input variable, but it is not considered here because this study focuses on commute trips. Among the variables above, household variables reflect basic socio-economic and demographic characteristics and the degree of familiarity with the transportation network around the neighborhood; individual variables represent basic information about the travelers and can reflect their ability to draw on previous travel experience; and industry variables reflect the influence of industry characteristics on traveler classification.

Figure 3.

A three-layer neural network model for traveler classification.

A single processing element in the output layer is used to indicate the classification of a traveler as an SRSP, SRNSP, or NSR traveler. In previous studies, the testing or prediction results were treated as continuous variables.¹⁶ Although this might increase the output hit ratio in testing, it does not give a precise result. In this study, during the training of the neural network, the desired output is set to be a single-column matrix with three elements. If a traveler is an SRSP traveler, the first element is set to 1 and the others are set to 0. The matrices for SRNSP and NSR travelers are set in the same way.
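The three-element target vector can be illustrated as follows. The encoding matches the definition of $R^{n}$; reading the network output by taking the largest element is an assumption added here for illustration, as the text does not state how a continuous output is mapped back to a class.

```python
CLASSES = ("SRSP", "SRNSP", "NSR")


def encode(label):
    """Return the three-element target vector for a traveler class."""
    if label not in CLASSES:
        raise ValueError(f"unknown traveler class: {label}")
    return [1 if label == c else 0 for c in CLASSES]


def decode(outputs):
    """Map a three-element network output to a class.

    Assumption: the class with the largest output element is chosen.
    """
    best = max(range(len(CLASSES)), key=lambda k: outputs[k])
    return CLASSES[best]


print(encode("SRNSP"))
print(decode([0.15, 0.70, 0.15]))
```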

Route choice model

SRSP travelers choose the shortest travel time path day after day, so their route choice model is simply

$$P_{i} = \begin{cases} 1 & \text{if } i = \text{shortest travel time path} \\ 0 & \text{if } i \neq \text{shortest travel time path} \end{cases}$$

where $P_{i}$ is the probability that the traveler chooses route i.

SRNSP travelers choose the same route each day, but not the shortest travel time path. This decision-making behavior is influenced by a multitude of factors, including individual socio-demographic characteristics, household properties, and the attributes of the route alternatives. As with the traveler classification model, a neural network-based route choice analysis model is proposed for SRNSP travelers. The individual socio-demographic and household input variables are the same as those in the neural network model for traveler classification. The third group of information considered in the route choice decision involves road network topology. The additional input variables to the model are defined as follows:

Distance: weighted distance of routes;

Circuity: weighted ratio of length of route to the Euclidean distance;

Length of longest stretch of roadway (LLSR): weighted ratio of the length (distance units) of the longest stretch of roadway without an intersection to the length of the route;

Freeway distance share: weighted ratio of distance traveled on the freeway to the length of route;

Freeway access share: weighted ratio of the travel distance from each trip’s origin to the freeway entrance along the trip to the length of route, ranging from 0 to 1. When it is equal to 1, it means there are no freeway segments included in the route;

Intersection: weighted numbers of intersections;

Left turns: weighted numbers of left turns.
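Several of the route attributes listed above can be computed directly from an ordered list of route segments. The sketch below covers distance, circuity, freeway distance share, and freeway access share. It returns raw, unweighted values, because the weighting scheme is not described in the text, and it assumes planar coordinates in km; the segment representation and function name are hypothetical.

```python
import math


def route_attributes(segments, origin_xy, dest_xy):
    """Raw (unweighted) attributes of one route.

    segments: ordered list of (length_km, is_freeway) tuples.
    origin_xy, dest_xy: planar coordinates in km (assumption).
    """
    distance = sum(length for length, _ in segments)
    euclidean = math.dist(origin_xy, dest_xy)
    freeway = sum(length for length, is_freeway in segments if is_freeway)

    # Distance from the origin to the first freeway segment; if the route
    # has no freeway segment this equals the route length, so the share is 1.
    access = 0.0
    for length, is_freeway in segments:
        if is_freeway:
            break
        access += length

    return {
        "distance": distance,
        "circuity": distance / euclidean,
        "freeway_distance_share": freeway / distance,
        "freeway_access_share": access / distance,
    }


# Hypothetical route: 1 km local road, 6 km freeway, 1 km local road.
attrs = route_attributes(
    segments=[(1.0, False), (6.0, True), (1.0, False)],
    origin_xy=(0.0, 0.0),
    dest_xy=(3.0, 4.0),
)
local_only = route_attributes([(2.0, False), (3.0, False)], (0.0, 0.0), (3.0, 4.0))
print(attrs)
```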

As with the socio-demographic variables, other network variables may also play a role, for instance, the location of signals, crash data, or tolls. These variables might well be significant for route choice decisions, but they were not available for this analysis because of data collection limitations. On the other hand, our model includes three other variables: (1) the ratio of the length (distance units) of the longest stretch of roadway without an intersection to the length of the route (LLSR), (2) the number of intersections, and (3) the number of left turns. These variables might partly reflect the influence of signals because their influence on route choice behavior is likewise related to fewer stops or less deceleration. The neural network used is shown in Figure 4.

Figure 4.

A three-layer neural network model for route choice analysis of SRNSP travelers.

NSR travelers choose at least two routes. Their choices are influenced by many factors besides traveler characteristics and network topology, such as weather²⁶ and incidents. Such variables are not easy to obtain and are not included in the data collected for this study. Furthermore, the proportion of travelers who do not always follow exactly the same route to work is low,¹⁰ which indicates that, to some extent, the traffic flow generated by these commuters is much smaller than that generated by travelers who select the same route. In addition, the traffic flow generated by NSR travelers is distributed over the road network rather randomly, whereas that of SR travelers is relatively deterministic. For these reasons, the route choice analysis of NSR travelers is not considered in this study.

Results

In this section, the methodology and models proposed in the previous sections are applied to GPS data collected in the Minneapolis–St Paul metropolitan area. The data processing is described in section “Data.” The property analysis results and neural network analysis results are as follows.

Property analysis results

In the model construction phase, five variables were defined to describe the properties of the different types of travelers. The results computed from the data used in this study are shown in Figure 5, from which some tendencies can be identified based on the definitions of these variables.

Figure 5.

Property analyses of different types of travelers.

It is easy to understand that $O_{aa}$ of SRSP travelers and SRNSP travelers is 0; because they use a same route day-to-day, there is no fluctuation for them. About one-fifth of NSR travelers fluctuate a lot during their commute travel. Although other NSR travelers choose at least two routes, the similarity of the chosen alternatives is much higher.

Because SRSP travelers choose the shortest travel time path, their $O_{as_p}$ is always equal to 1. Comparing SRNSP and NSR travelers, we find that SRNSP travelers have, on average, a lower percentage of overlap with the shortest travel time path. This might be because some shortest travel time paths contain a high proportion of freeway, whereas SRNSP travelers prefer routes with many local roads.

The distribution of $O_{as_t}$ is reasonable for these travelers. SRSP travelers have the lowest relative time difference because they use the shortest travel time path. The value for SRNSP travelers is also not very high, which may be attributed to their travel experience. Although they choose a route different from the shortest travel time path, it can be conjectured that they select this specific route from a set of alternatives based on their knowledge, so it is a local rather than a global optimum. The $O_{as_t}$ of NSR travelers fluctuates considerably because they choose different routes from day to day. The results for $S D_{p}$ and $S D_{t}$ can be interpreted in a similar way.

From these results, we find that different types of travelers differ significantly in route choice behavior. It is therefore necessary to analyze each type of traveler in its own way.

Neural network analysis results

The neural network classifier toolbox in MATLAB is used in this study. The network has 3 layers, with 20 elements in the hidden layer. In the traveler classification phase, the 35 travelers are first sorted in random order, and the data are divided into two parts: training and testing. In each training cycle, the training vectors are presented to the network in sequential order from traveler 1 to traveler 15, while the remaining travelers are used in the testing cycle.

In the route choice analysis phase, the “chosen” route has already been identified in the data processing step, and the other routes available to the traveler for making the same trip are then determined. This is called the choice-set generation step in the discrete choice modeling framework. Because the choice set cannot be generated from the GPS survey data, we construct it by considering the network topology, TomTom data, and locations. To determine a set of alternative routes for each origin–destination (OD) pair, an enhanced version of the breadth-first search link elimination (BFS-LE) algorithm was employed. This algorithm has been described in detail in previous research.²⁷ In contrast with previous studies, TomTom congested travel time rather than free-flow travel time on links is used here as the travel cost. The built-in shortest-path calculation tools in ArcGIS are used. For SRNSP travelers, a choice set of 10 alternatives was generated, consisting of the used route, the shortest travel time path, and 8 other alternatives generated by the BFS-LE algorithm. As in traveler classification, the routes are sorted in random order; the last 50 options are set aside as a testing set, while the others are used to train the proposed neural network.

Considering the small sample in this study, in both the traveler classification and the route choice analysis phases the data are sorted in random order 10 times, and the corresponding outputs are collected. The pooled outputs are then compared with the actual outcomes. Tables 4 and 5 present the replication results for traveler classification and route choice decisions, respectively, for the neural network built in the model building phase. Figure 6 shows the neural network training performance. From Table 4, we find that 79.09% of travelers are classified into the correct category. The highest error rate occurs when the actual class is NSR and the output class is SRNSP. This may be because some NSR travelers use the same route on most days of the survey period but choose another one on 1 or 2 days, which makes them close to SRNSP travelers. The accuracy rate of the route choice analysis model, 67.27% (Table 5), is somewhat lower than that of the traveler classification model, and its two types of error are nearly equal. In Figure 6, the best validation performance is 0.15731 at epoch 16, which indicates that the proposed model converges efficiently.

Table 4.

Replication of traveler classification by ANN model.

		Actual class
		SRSP	SRNSP	NSR
Output class	SRSP	10.91%	1.36%	2.27%
	SRNSP	6.82%	54.55%	7.73%
	NSR	0.00%	2.73%	13.64%

SRSP: same route shortest path; SRNSP: same route not shortest path; NSR: not same route.

Table 5.

Replication of route choice modeling by ANN model and SWS model.

		ANN model
		Actual decision
		Chosen route	Not chosen route
Output decision	Chosen route	24.36%	16.00%
Output decision	Not chosen route	16.73%	42.91%
		SWS model
		Actual decision
		Chosen route	Not chosen route
Output decision	Chosen route	19.64%	31.82%
Output decision	Not chosen route	21.45%	27.09%

ANN: artificial neural network; SWS: simple weighted sum.
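The accuracy rates quoted for Tables 4 and 5 are the sums of the diagonal cells, where the output matches the actual class or decision. The short check below reproduces them from the tabulated percentages; small differences from the quoted values come from rounding of the individual cells. The variable and function names are illustrative.

```python
# Rows are output classes and columns are actual classes, as in Tables 4 and 5.
table4_ann = [
    [10.91, 1.36, 2.27],   # output SRSP
    [6.82, 54.55, 7.73],   # output SRNSP
    [0.00, 2.73, 13.64],   # output NSR
]
table5_ann = [
    [24.36, 16.00],        # output: chosen route
    [16.73, 42.91],        # output: not chosen route
]
table5_sws = [
    [19.64, 31.82],
    [21.45, 27.09],
]


def accuracy(confusion):
    """Share of cases on the diagonal (output matches actual), in percent."""
    return sum(confusion[k][k] for k in range(len(confusion)))


def largest_error(confusion):
    """(row, column, value) of the largest off-diagonal cell."""
    cells = [(r, c, confusion[r][c])
             for r in range(len(confusion))
             for c in range(len(confusion))
             if r != c]
    return max(cells, key=lambda cell: cell[2])


acc_class = accuracy(table4_ann)
acc_ann = accuracy(table5_ann)
acc_sws = accuracy(table5_sws)
print(round(acc_class, 2), round(acc_ann, 2), round(acc_sws, 2))
print(largest_error(table4_ann))
```

The diagonal sums are about 79.1% for traveler classification, 67.27% for ANN route choice, and 46.73% for the SWS model, and the largest misclassification cell in Table 4 is output SRNSP with actual NSR.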

Figure 6.

Neural network training performance.

In this study, we use the simple weighted sum (SWS) model as a comparative model of route choice behavior. In setting the decision matrix of this model, the influence of route characteristics was assumed to be at the same level as the influence of traveler socio-demographics, and the total influences of household, individual, and industry variables were assumed to be equal. This method requires minimal knowledge of the decision maker’s priorities and minimal input from the decision maker. The equal weights method has been popularized and applied in many decision-making problems.²⁸ The SWS model results are also given in Table 5. Compared with the ANN model, the weighted criteria model used in this study is easier to understand, but its results are less accurate. This might be because the ANN approach treats the influence of the variables as nonlinear, which is more realistic, and can therefore capture the influence of the factors better than the comparative model. More complex weighted criteria models might model route choice well, but they are beyond the scope of this study.
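The equal weights idea behind a simple weighted sum can be sketched generically. The example below is not the decision matrix used in the study, whose exact construction is not given: it assumes that every criterion is a cost (lower is better), that criteria are min-max normalized across the alternatives, and that all criteria carry the same weight. The route data and function name are hypothetical.

```python
def equal_weight_scores(alternatives):
    """Score alternatives with a simple weighted sum using equal weights.

    alternatives: dict {name: [criterion values]}, all criteria treated as
    costs (assumption), so a lower raw value gives a higher normalized value.
    """
    names = list(alternatives)
    n_criteria = len(alternatives[names[0]])
    scores = {name: 0.0 for name in names}
    for k in range(n_criteria):
        column = [alternatives[name][k] for name in names]
        lo, hi = min(column), max(column)
        for name in names:
            # Min-max normalization; a criterion that does not vary gives 1 to all.
            norm = 1.0 if hi == lo else (hi - alternatives[name][k]) / (hi - lo)
            scores[name] += norm / n_criteria
    return scores


# Hypothetical routes described by distance (km), intersections, and left turns.
routes = {
    "A": [10.0, 12, 3],
    "B": [12.0, 6, 1],
    "C": [11.0, 9, 5],
}
scores = equal_weight_scores(routes)
best = max(scores, key=scores.get)
print(scores, best)
```

Because the score is linear in the normalized criteria, such a model cannot represent interactions or thresholds among variables, which is consistent with the explanation offered above for its lower accuracy.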

Conclusion

This study contributes to the literature on traveler classification and route choice behavior analysis by analyzing multiday GPS data from the Minneapolis–St Paul region. In previous studies, travelers’ route choice behavior was analyzed with a single model, even though different travelers follow different decision processes. In this study, travelers’ route choice behavior is modeled for the first time by classifying travelers into three types: SRSP travelers, SRNSP travelers, and NSR travelers. Each type of traveler is then modeled separately. In addition to the classification and route choice models, five variables are proposed to describe the characteristics of each type of traveler.

Using the GPS data, we find that the descriptive statistics confirm the result of previous research:¹⁰ most travelers choose the same route for the commute trip from home to work. The property analysis shows clearly that different types of travelers have different route choice behaviors. This supports modeling different types of travelers in different ways, with traveler classification carried out before route choice modeling. Traveler classification and route choice behavior are then modeled using a neural network. The input layer includes the traveler’s demographics (household income, length of time at current address, household region, number of household vehicles, age, gender, education level, type of industry, employment status, number of work hours in 1 week, and flexibility in work hours) and route attributes (distance, circuity, ratio of the length of the longest stretch, ratio of distance traveled on the freeway, ratio of the travel distance from each trip’s origin to the freeway origin, intersections, and left turns). The testing results of the model are acceptable, with 79.09% of travelers classified correctly and 67.27% of route choice decisions reproduced. Compared with the SWS model, the ANN model reproduces route choice behavior better, possibly because it treats the variables as nonlinear, which is more realistic.

Overall, this study provides a new way of thinking about route choice modeling. We envision it as an important contribution toward analyzing travelers’ route choice decision-making process. The results can be further tested in other empirical studies with larger hold-out samples.

Footnotes

Acknowledgements

All analysis and errors are the responsibility of the authors. The raw data used for the research are accessible through the Transportation Secure Data Center (TSDC) of the National Renewable Energy Laboratory.

Academic Editor: Hongwei Wu

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The Federal Highway Administration of the US Department of Transportation is acknowledged for funding the work under a grant to RSG. This research is also supported by the National Natural Science Foundation of China (51178110 and 51378119).

References

1. Wardrop JG. Some theoretical aspects of road traffic research. P I Civil Eng 1952; 1: 325–378.

2. Zhu. The roads taken: theory and evidence on route choice in the wake of the I-35W Mississippi River bridge collapse and reconstruction. PhD Thesis, University of Minnesota, Minneapolis, MN, 2010.

3. Zhu, Levinson. Do people use the shortest path? An empirical test of Wardrop’s first principle. PLoS ONE 2015; 10: e0134322.

4. Ben-Akiva, De Palma, Isam. Dynamic network models and driver information systems. Transport Res A: Gen 1991; 25: 251–266.

5. Dia. An agent-based approach to modelling driver route choice behaviour under the influence of real-time information. Transport Res C: Emer 2002; 10: 331–349.

6. Carrion, Levinson. Valuation of travel time reliability from a GPS-based experimental design. Transport Res C: Emer 2013; 35: 305–323.

7. Zhang, Xie, Levinson. Illusion of motion. Transp Res Record 2009; 2135: 34–42.

8. Carrion, Levinson. Value of travel time reliability: a review of current evidence. Transport Res A: Pol 2012; 46: 720–741.

9. Zhang, Levinson. Determinants of route choice and value of traveler information: a field experiment. Transp Res Record 2008; 2086: 81–92.

10. Abdel-Aty, Vaughn, Kitamura. Models of commuters’ information use and route choice: initial results based on a southern California commuter route choice survey. Richmond, CA: California Partners for Advanced Transit and Highways (PATH), 1993.

11. Bovy, Stern. Route choice: wayfinding in transport networks (Studies in Operational Regional Science). Dordrecht: Kluwer Academic Publishers, 1990.

12. Lotan, Koutsopoulos HN. Models for route choice behavior in the presence of information using concepts from fuzzy set theory and approximate reasoning. Transportation 1993; 20: 129–155.

13. Lotan. Effects of familiarity on route choice behavior in the presence of information. Transport Res C: Emer 1997; 5: 225–243.

14. Henn. Fuzzy route choice model for traffic assignment. Fuzzy Set Syst 2000; 116: 77–101.

15. Ridwan. Fuzzy preference based traffic assignment problem. Transport Res C: Emer 2004; 12: 209–233.

16. Yang, Kitamura, Paul. Exploration of route choice behavior with advanced traveler information using neural network concepts. Transportation 1993; 20: 199–223.

17. Yamamoto, Kitamura, Fujii. Drivers’ route choice behavior: analysis by data mining algorithms. Transp Res Record 2002; 1807: 59–66.

18. Nakayama, Kitamura. Route choice model with inductive learning. Transp Res Record 2000; 1725: 63–70.

19. Nakayama, Kitamura, Fujii. Drivers’ route choice rules and network behavior: do drivers become rational and homogeneous through learning? Transp Res Record 2001; 1752: 62–68.

20. Bekhor, Ben-Akiva, Ramming. Evaluation of choice set generation algorithms for route choice models. Ann Oper Res 2006; 144: 235–247.

21. Bierlaire, Frejinger. Route choice modeling with network-free data. Transport Res C: Emer 2008; 16: 187–198.

22. Prato CG. Route choice modeling: past, present and future research directions. J Choice Model 2009; 2: 65–100.

23. Rumelhart, McClelland, PDP Research Group. Parallel distributed processing, vol. 1. Cambridge, MA: MIT Press, 1988.

24. Wang S-C. Artificial neural network. In: Wang S-C (ed.) Interdisciplinary computing in Java programming. Dordrecht: Springer, 2003, pp.81–100.

25. Murata, Yoshizawa, Amari S-I. Network information criterion-determining the number of hidden units for an artificial neural network model. IEEE T Neural Networ 1994; 5: 865–872.

26. Khattak, De Palma. The impact of adverse weather conditions on the propensity to change travel decisions: a survey of Brussels commuters. Transport Res A: Pol 1997; 31: 181–203.

27. Dhakar, Srinivasan. Route choice modeling using GPS-based travel surveys. Transp Res Record 2014; 2413: 65–73.

28. Wang J-J, Jing Y-Y, Zhang C-F. Review on multi-criteria decision analysis aid in sustainable energy decision-making. Renew Sust Energ Rev 2009; 13: 2263–2278.