
Abstract

To address the methodological limitation of cross-sectional studies and the data constraints of longitudinal/panel studies, this paper presents a model-based method to fuse repeated cross-sectional travel survey data based on the theory of rational inattention (RI) in discrete choice modeling. In the proposed framework, older cross-sectional data are used to model the prior probability of choice alternatives, and more recent cross-sectional data are used to capture conditional heterogeneous choices. The fusion method is theoretically more robust and computationally less burdensome than existing data pooling techniques. The method is empirically tested using data from two cycles of a large-sample post-secondary student travel survey in the Greater Toronto and Hamilton Area to investigate the commuting mode choices of post-secondary students. Parameter estimates of the RI-based multinomial logit (MNL) model indicate that the proposed method can generate behaviorally consistent results. Validation of the estimated model using a holdout sample indicates its improved forecasting performance compared with the classical random utility maximizing MNL model. The fusion method can be extended to more than two cycles of repeated cross-sectional data by updating the prior probabilities whenever new cross-sectional data become available. Thus, the study presents a continuous framework for fusing information from multiple time points using repeated cross-sectional datasets to capture preference evolution better and enhance the forecasting robustness of discrete choice models.

Keywords

repeated cross-sectional data, data fusion, mode choice, rational inattention in choice modelling

The transportation ecosystem is undergoing a major transformation owing to internet/smartphone-based shared on-demand services (e.g., ride-sourcing, car sharing, bike sharing), telecommuting and online shopping options, connected and automated vehicle technologies, and natural disruptions like the COVID-19 pandemic. Emerging mobility options and post-pandemic adaptations continually reshape travel-related behaviors, significantly affecting future travel demand patterns (1 –4). State-of-the-art travel demand modeling uses data from the most recent time point, even when cross-sectional data are available from multiple time points ( 5 ). This is because the most recently available data context is believed to be the most similar to the context of a future time point. Thus, travel demand forecasts based on the most current data are expected to be more accurate. However, completely ignoring older information is an inefficient use of available data. More importantly, data from a single time point (i.e., a single cross-sectional survey) cannot adequately capture the constant evolution of travel behavior.

Panel data, that is, data obtained by tracking behavior of the same individuals over multiple time points ( 6 ), can be used to account for the changing travel behavior. However, panel data suffer from sample attrition and response bias ( 7 ). One way to address both the methodological limitation of cross-sectional studies and the data constraints of longitudinal/panel studies is to use multiple cross-sectional data collected at different time points in the region of interest. These repeated cross-sectional surveys make similar measurements on samples from an equivalent population at different points in time without ensuring that any respondent is included in more than one round of data collection ( 8 ). Such repeated cross-sections or “pseudo-panels” have been used in the travel demand literature to investigate the temporal transferability of models and to account for changing preferences over time ( 5 , 9 –14). Most studies used repeated cross-sectional datasets by pooling them together and estimating meta-models with additional temporal factors in the year-specific constants and scale parameters. However, these temporal factors do not provide adequate behavioral explanation and, in many cases, can mask the effect of other factors not captured by the models.

This study adopts an alternative approach to fuse repeated cross-sectional travel survey data based on rational inattention (RI) theory in discrete choice modeling ( 15 , 16 ). RI relaxes the assumption of perfect knowledge and exhaustive computation of the utility functions of all choice alternatives of traditional random utility maximization (RUM)-based discrete choice models. Instead, it considers that choice makers have a prior probability of choices, and they process any further information at the expense of an information processing cost. Thus, RI-based models consider choice makers as Bayesian agents and require additional information about their prior probability that is difficult to obtain. By applying the RI concept for repeated cross-sectional survey data (i.e., pseudo-panel data), this study proposes a temporal data fusion framework to capture preference evolution and enhance the forecasting robustness of discrete choice models. It also demonstrates the feasibility of RI-based empirical investigations using multiple cycles of cross-sectional revealed preference data. Specifically, it uses data from two cycles of a large-sample, multi-institution post-secondary student travel survey in the Greater Toronto and Hamilton Area (GTHA) to investigate the commuting mode choices of post-secondary students in the area.

The remainder of the paper is organized as follows. The next section reviews previous studies that used repeated cross-sectional datasets for travel behavior analysis. Following this, the formulation of the RI-based fusion method proposed in this study is outlined. The empirical investigation of commuting mode choices of post-secondary students in the GTHA is then presented. The final section summarizes the findings and presents recommendations for future research.

Literature Review

Data from repeated cross-sectional surveys have been used in the transportation literature since Deaton ( 17 ) introduced the concept of pseudo-panels in 1985. A pseudo-panel is constructed by grouping individuals from cross-sectional surveys into cohorts, the averages of which are then treated as individual observations in an artificial panel. This approach approximates panel data when none are available by following virtual persons (created by the aggregation into cohorts) over time. Examples include the analysis of vehicle ownership evolution (18 –21) and travel demand forecasting (22 –24). However, obtaining enough artificial observations while avoiding biased estimates of cohort averages makes it challenging to estimate robust models from pseudo-panel data generated from repeated cross-sectional surveys.
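The cohort-averaging idea behind a pseudo-panel can be illustrated with a minimal sketch. The records, the birth-decade cohort rule, and the function names below are hypothetical and chosen only for illustration; they are not taken from any of the cited studies.

```python
from collections import defaultdict

# Hypothetical toy records: (survey_year, birth_year, vehicles_owned).
records = [
    (2015, 1992, 1), (2015, 1994, 0), (2015, 1981, 2), (2015, 1985, 1),
    (2019, 1993, 1), (2019, 1991, 2), (2019, 1983, 2), (2019, 1988, 3),
]

def cohort_of(birth_year):
    """Assign a birth-decade cohort label (an illustrative grouping rule)."""
    return f"{birth_year // 10 * 10}s"

def build_pseudo_panel(rows):
    """Average the outcome within each (cohort, survey year) cell."""
    cells = defaultdict(list)
    for year, birth_year, vehicles in rows:
        cells[(cohort_of(birth_year), year)].append(vehicles)
    return {cell: sum(values) / len(values) for cell, values in cells.items()}

# Each cohort is now a "virtual person" observed in both survey years.
pseudo_panel = build_pseudo_panel(records)
for (cohort, year), mean_vehicles in sorted(pseudo_panel.items()):
    print(cohort, year, mean_vehicles)
```

With only two respondents per cell, the cohort means here are very noisy, which mirrors the challenge noted above: cells must be large enough for the averages to be reliable, yet numerous enough to estimate a model.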

Repeated cross-sectional data have also been applied for disaggregate travel behavior analysis. In this regard, datasets collected in different years are pooled, and a meta-model is estimated with year-specific scale parameters and coefficients of key variables. For example, Sobhani et al. estimated a scaled version of the Multiple Discrete Continuous Extreme Value model to study the evolution of Canadians’ activity participation decisions using four waves of data from the General Social Survey ( 25 ). Habib and Weiss ( 26 ) accommodated a parameterized scale function in their logit captivity model to investigate the temporal evolution of commuting mode choice preference using data from three cycles of the Transportation Tomorrow Survey (TTS) ( 27 ). Similarly, Anowar et al. proposed the scaled generalized ordered logit model to study the evolution of vehicle ownership in Montreal, Canada using cross-sectional origin–destination survey datasets from 1998, 2003, and 2008 ( 9 ).

Recently, Dias et al. used two cycles of repeated cross-sectional data to analyze the evolution of ride-hailing adoption and usage in the Puget Sound region ( 28 ). They estimated a joint binary probit–ordered probit model on the pooled data to account for sample-selection differences between the two surveys. On the other hand, Borysov and Rich proposed a method to study travel behavior dynamics by constructing detailed synthetic pseudo-panels from repeated cross-sectional data ( 29 ). The method is based on modeling a high-dimensional joint distribution of travel preferences conditional on detailed socioeconomic profiles by using a conditional variational autoencoder.

Some studies in the literature have used repeated cross-sectional data to test the temporal transferability of travel demand models. By pooling available data and estimating meta-models with year-specific coefficients and scale parameters, these studies have demonstrated the capability of repeated cross-sectional data to account for changing travel preferences and enhance forecasting robustness. For example, Habib et al. used three waves of the TTS to estimate year-specific and pooled models with additional temporal factors in commuting mode choice ( 13 ). The authors found that the pooled model outperforms year-specific models in model transferability. Sanko also reached a similar conclusion while analyzing the temporal transferability of the alternative-specific constants and the level-of-service (LOS) and socioeconomic parameters ( 5 , 10 ). Similarly, Salem and Habib demonstrated the applicability of the repeated cross-sectional data for investigating the temporal transferability of activity generation and scheduling models ( 11 , 12 ).

However, these studies expressed the temporal factors and/or year-specific scales as a function of time ( 11 , 12 ) or macro-economic variables like gross domestic product ( 10 ). Such parameterization may not provide adequate behavioral explanation and can mask the effect of other factors not captured by the models. Moreover, estimating the pooled data model becomes increasingly expensive in relation to computational requirements as newer cycles of data become available. Deciding how many data cycles to pool is also a critical decision that might affect the model’s forecasting performance. To overcome these issues, this study proposes a fusion framework for repeated cross-sectional travel survey data that is based on RI theory in discrete choice modeling. The following section presents the detailed formulation of the fusion method, including the motivation for choosing the RI concept.

Methodology

RI has been discussed in general choice-making since the 1950s (30 –32). However, its application in discrete choice modeling is quite recent. Matějka and McKay first proposed the RI-based discrete choice model by measuring the cost of information processing and optimizing the random utility of choice-making ( 15 ). The proposed model takes the classical multinomial logit (MNL) formulation when information cost is modeled using the Shannon entropy ( 33 ), as shown below.

Suppose a choice maker faces a discrete choice context of $J$ possible alternatives. The utility or pay-off of a choice alternative, $j$ , is $V_{j}$ . For now, consider $V_{j}$ is unknown to the choice maker, so it has a distribution. Considering $v_{j}$ as an instance of these random variables:

$$f (v_{j}) = \Pr (V_{j} = v_{j}) > 0 \tag{1}$$

For convenience, we can consider that $V_{j}$ takes a value, $v_{j} \in Λ$, where $Λ \subset \mathbb{R}^{J}$. The probability of choosing an alternative $j$ is conditional on a specific value of $V_{j} = v_{j}$. The resulting unconditional probability of choice is the prior probability, $\Pr_{j}^{0}$, which is the choice probability before considering any further information.

$$\int_{v_{j}} \Pr (j | v_{j}) f (v_{j}) = \Pr_{j}^{0} \tag{2}$$

Given $\Pr_{j}^{0}$ , the choice maker needs to process any information she receives on the choice alternatives and contexts, which incurs psychological costs. So, an RI choice maker optimizes the expected utility gains over the incurred cost of information processing. This means the posterior choice probability, $\Pr (j | v_{j})$ , is the value that optimizes the balance between the expected pay-off and the cost of information processing $\tilde{C}$ .

$$\max_{\Pr (j | v_{j})} \left\{ \sum_{j = 1}^{J} \int_{v_{j}} v_{j} \Pr (j | v_{j}) f (v_{j}) - \tilde{C} \right\} \tag{3}$$

Subject to the constraints of non-negativity of probability: $\Pr (j | v_{j}) \geq 0$ and the law of total probability: $\sum_{j = 1}^{J} \Pr (j | v_{j}) = 1$. The cost of information processing $\tilde{C}$ is the difference between the entropy (expressing the uncertainty) based on expected unconditional probability (the prior probability) and the expectation of entropy based on the conditional probability (the posterior probability after collecting/receiving information). The measure of entropy in $\tilde{C}$ shapes the final choice probability as the solution to the optimization. Matějka and McKay ( 15 ) proposed using Shannon’s ( 33 ) entropy measurement to define the information cost as follows:

$$\tilde{C} = μ \left( - \sum_{j = 1}^{J} \Pr_{j}^{0} \log (\Pr_{j}^{0}) + \int_{v_{j}} \sum_{j = 1}^{J} \Pr (j | v_{j}) \log (\Pr (j | v_{j})) f (v_{j}) \right) \tag{4}$$

Here, $μ$ is the unit cost of information processing. Substituting $\tilde{C}$ into the objective function of (3) and solving the optimization problem, we get:

$$\Pr (j | v_{j}) = \frac{\Pr_{j}^{0} \exp (v_{j} / μ)}{\sum_{k = 1}^{J} \Pr_{k}^{0} \exp (v_{k} / μ)} = \frac{\exp (v_{j} / μ + \log (\Pr_{j}^{0}))}{\sum_{k = 1}^{J} \exp (v_{k} / μ + \log (\Pr_{k}^{0}))} \tag{5}$$
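For readers who want the intermediate step, Equation (5) can be reached from (3) and (4) by a standard Lagrangian argument, sketched here. The sketch is not reproduced from the paper. It introduces $ξ$ as the multiplier on the adding-up constraint for a given realization of the pay-offs (so it does not vary across alternatives), assumes the non-negativity constraint does not bind, and divides the first-order condition by $f (v_{j}) > 0$ from (1). The prior $\Pr_{j}^{0}$ is treated as the integral in (2) when differentiating.

$$
\begin{aligned}
& v_{j} + μ \left( \log \Pr_{j}^{0} + 1 \right) - μ \left( \log \Pr (j | v_{j}) + 1 \right) - ξ = 0 \\
& \Rightarrow \; \Pr (j | v_{j}) = \Pr_{j}^{0} \exp \left( \frac{v_{j} - ξ}{μ} \right) \\
& \Rightarrow \; \exp \left( \frac{ξ}{μ} \right) = \sum_{k = 1}^{J} \Pr_{k}^{0} \exp \left( \frac{v_{k}}{μ} \right) \quad \text{since } \sum_{k = 1}^{J} \Pr (k | v_{k}) = 1
\end{aligned}
$$

Substituting the last line into the second eliminates $ξ$ and yields the first expression in Equation (5).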

This is equivalent in formulation to the MNL model where the systematic utility of a choice alternative is offset by the prior probability of choosing that alternative. The key challenge to estimating (5) is the specification of the prior probability function, $\Pr_{j}^{0}$ and collecting relevant data. Some studies, like Caplin et al. ( 34 ) and Joo ( 35 ), proposed using an observed market share as an alternative to measuring the prior probability. Habib recently proposed individual choice-maker-specific market share (through a simple MNL model) to measure the prior probability ( 36 ). The study empirically showed the better performance of RI-based discrete choice models compared with classical RUM-based models. In another study, Shakib and Habib demonstrated the feasibility of using latent preferences produced from an efficient-adaptive stated preference survey to measure prior beliefs ( 37 ).
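As a minimal sketch of Equation (5), the function below computes posterior choice probabilities from utilities, priors, and a unit information cost. The function name and the example numbers are illustrative assumptions, not part of the original study.

```python
import math

def ri_mnl_probabilities(utilities, priors, mu):
    """Posterior choice probabilities in the form of Equation (5)."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    # Offset each scaled utility by the log of its prior probability.
    scores = [v / mu + math.log(p) for v, p in zip(utilities, priors)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

priors = [0.5, 0.3, 0.2]

# Identical pay-offs cancel out, so the posterior equals the prior.
same_payoffs = ri_mnl_probabilities([1.0, 1.0, 1.0], priors, mu=1.0)

# A uniform prior adds the same offset everywhere, leaving the ordinary MNL form.
uniform = ri_mnl_probabilities([0.0, 1.0, 2.0], [1 / 3, 1 / 3, 1 / 3], mu=1.0)

# A very large unit cost makes utilities nearly irrelevant: posterior is close to prior.
costly = ri_mnl_probabilities([1.0, 2.0, 3.0], priors, mu=1e6)
```

The three calls show the limiting behavior implied by the formula: the prior acts as a fixed offset, and $μ$ controls how strongly the pay-offs can move the choice maker away from it.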

In this study, we propose to measure the prior probability using the individual-specific perceived market share of alternatives from an older dataset of the same population (i.e., an older cycle of the repeated cross-sectional survey). Thus, we propose to utilize the concept of pseudo-panels in the context of RI. In this case, as the same individual is not observed between the datasets, $\Pr_{j}^{0}$ is based on the cohort mean of the individual from the older dataset. This assumption is substantiated by the statistical properties of repeated cross-sectional data or pseudo-panel data, where dynamic autocorrelation (traceable through aggregates) likely remains even though observations are not repeated ( 38 ). In other words, the preference of an individual belonging to cohort $c$ at time $t$ is expected to be similar to the aggregate preference of cohort $c$ at time $t - 1$. Based on this assumption, we first estimate a “perceived market share model” (i.e., an MNL model with a full set of alternative-specific constants and choice-maker attributes) using data from time $t - 1$. Then, we apply the estimated model to the data of time $t$ to generate the individual choice-maker-specific market share at time $t$, which is taken as the prior probability $\Pr_{j}^{0}$ at time $t$. With the prior probability identified, the RI-based choice model of Equation (5) is estimated using repeated cross-sectional data from time $t$ via the classical maximum likelihood estimation technique. In this way, we fuse information from two time periods $t - 1$ and $t$ by applying the RI theory to the context of repeated cross-sectional survey data (i.e., pseudo-panel data). For the estimation of empirical models of this paper, the GAUSS MAXLIK function was used ( 39 ). Figure 1 summarizes the proposed RI-based fusion method of this study.

Figure 1.

Proposed RI-based temporal fusion method.
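The two-stage workflow described above can be sketched on synthetic data as follows. This is an illustration under strong simplifying assumptions and not the authors' implementation: the data are simulated, the person attribute `z` and the travel-time feature are hypothetical, $μ$ is fixed at 1 rather than parameterized, and estimation uses plain gradient ascent instead of the GAUSS MAXLIK routine.

```python
import math
import random

ALTS = 3  # alternative 0 is the reference

def mnl_probs(beta, feats, offsets):
    """Logit probabilities for one person; feats[j] is the feature vector of alternative j."""
    scores = [sum(b * x for b, x in zip(beta, f)) + o for f, o in zip(feats, offsets)]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def log_lik(beta, feats_all, choices, offsets_all):
    return sum(math.log(mnl_probs(beta, f, o)[c])
               for f, c, o in zip(feats_all, choices, offsets_all))

def fit(feats_all, choices, offsets_all, steps=200, rate=0.2):
    """Maximum likelihood by plain gradient ascent (adequate for this toy problem only)."""
    n_params = len(feats_all[0][0])
    beta = [0.0] * n_params
    for _ in range(steps):
        grad = [0.0] * n_params
        for f, c, o in zip(feats_all, choices, offsets_all):
            p = mnl_probs(beta, f, o)
            for k in range(n_params):
                grad[k] += f[c][k] - sum(pj * fj[k] for pj, fj in zip(p, f))
        beta = [b + rate * g / len(choices) for b, g in zip(beta, grad)]
    return beta

def prior_feats(z):
    """Stage-1 features: alternative-specific constants and a person attribute z."""
    return [[0, 0, 0, 0], [1, 0, z, 0], [0, 1, 0, z]]

def simulate(n, rng, time_coef):
    """Synthetic respondents: attribute z, travel times (hours), and a simulated choice."""
    sample = []
    for _ in range(n):
        z = rng.random()
        times = [rng.uniform(0.1, 1.0) for _ in range(ALTS)]
        base = [0.0, 0.5 - z, 1.0 * z]
        probs = mnl_probs([time_coef], [[t] for t in times], base)
        choice = rng.choices(range(ALTS), weights=probs)[0]
        sample.append((z, times, choice))
    return sample

rng = random.Random(0)
old_wave = simulate(150, rng, time_coef=-2.0)  # plays the role of time t-1
new_wave = simulate(150, rng, time_coef=-3.0)  # plays the role of time t

# Stage 1: "perceived market share" MNL on the older wave (no offset).
zero = [0.0] * ALTS
beta_prior = fit([prior_feats(z) for z, _, _ in old_wave],
                 [c for _, _, c in old_wave], [zero] * len(old_wave))

# Stage 2a: apply the stage-1 model to the newer wave to get individual priors.
priors = [mnl_probs(beta_prior, prior_feats(z), zero) for z, _, _ in new_wave]

# Stage 2b: estimate Equation (5) on the newer wave with log(prior) as a fixed offset.
log_priors = [[math.log(p) for p in row] for row in priors]
los_feats = [[[t] for t in times] for _, times, _ in new_wave]
new_choices = [c for _, _, c in new_wave]
beta_fused = fit(los_feats, new_choices, log_priors)

ll_prior_only = log_lik([0.0], los_feats, new_choices, log_priors)
ll_fused = log_lik(beta_fused, los_feats, new_choices, log_priors)
```

The key design point is that the older wave enters the second stage only through the fixed offset `log_priors`, so the two datasets are never pooled. Because $μ$ is fixed at 1 here, `beta_fused` should be read as the utility coefficient divided by the unit information cost.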

The proposed RI-based fusion method can be easily extended to more than two cycles of repeated cross-sectional data. An interesting feature of the framework is that only the estimated model from each time period needs to be carried over to the next period, like a Markov process. The current state thus depends only on the immediately preceding state, and there is no need to estimate computationally expensive pooled data models.

Empirical Analysis

Data Description and Study Area

Data for the empirical investigation of this paper come from the two cycles of a large-scale post-secondary student travel survey conducted in the GTHA. The survey is named StudentMoveTO (SMTO) ( 40 ). The two datasets were collected at an interval of 4 years (Fall 2015 and Fall 2019). Four universities participated in both survey cycles, representing nearly 184,000 post-secondary students across seven campuses. All of these campuses are situated in Toronto; however, the students’ home locations are spread across the entire study area. The surveys collected information on individual and household socio-demographics, mobility tool ownership, a one-day travel diary, and typical commute characteristics. As the objective of the empirical exercise is to investigate post-secondary students’ commuting mode choice decisions, we took only respondents who reported complete information about socio-demography, mobility tool ownership, and travel diary. Table 1 compares the two samples used in this study.

Table 1.

Descriptive Statistics of the 2015 and 2019 StudentMoveTO Samples

	2015 sample	2019 sample		2015 sample	2019 sample
N	10,715	4,851	N	10,715	4,851
Gender			Bicycle owner
Female	66.3%	68.7%	No	51.8%	58.4%
Male	32.8%	28.8%	Yes	48.2%	41.6%
Other	0.9%	2.5%	Bikeshare member
Age*	23.61 (6.82)	23.08 (6.19)	No	98.60%	96.87%
Post-secondary institution			Yes	1.40%	3.13%
OCAD University	2.9%	3.3%	Home zone
Ryerson University	19.3%	28.7%	Toronto	70.1%	65.5%
University of Toronto	55.8%	49.7%	Durham	2.4%	2.9%
York University	21.9%	18.3%	York	10.7%	12.3%
Level of education			Peel	13.2%	14.8%
Undergraduate	73.1%	72.9%	Halton	1.9%	2.6%
Graduate	25.3%	25.1%	Hamilton	0.4%	0.6%
Other	1.7%	2.0%	Other	1.3%	1.3%
Student status			Household living situation
Full-time	91.4%	92.1%	Live with family/parents	52.7%	55.0%
Part-time	6.9%	5.9%	Live with roommates	23.2%	23.3%
Other	1.7%	2.0%	Live with partner	12.6%	10.9%
Program type			Live alone	11.5%	9.4%
Art & science	68.7%	69.1%	Live with host family/friend	0.0%	1.4%
Business	7.2%	8.1%	Number of household members*	3.31 (1.67)	3.50 (2.52)
Engineering	11.3%	11.5%	Number of household vehicles*	1.07 (1.05)	1.17 (1.13)
Health & medicine	10.1%	10.0%	Building type
Law	1.5%	1.3%	Single detached house	33.6%	35.1%
Continue	1.1%	0.0%	Semi-detached house	13.3%	9.8%
Driving license owner			Row/Townhouse (attached)	6.2%	10.4%
No	38.7%	35.1%	Apartment or Condo	39.1%	36.1%
Yes	61.3%	64.9%	On-campus residence	5.8%	5.1%
Transit pass owner			Other (Ex: mobile home, chalet…)	0.7%	0.8%
No	59.2%	67.9%	I don’t know	1.3%	1.0%
Yes	40.8%	32.1%	Prefer not to answer	0.0%	1.7%
			Home-to-campus distance (km)*	12.57 (18.59)	13.57 (14.63)

Note: The table shows the mean value and standard deviation within parentheses for continuous variables only.

Values in bold denote cases where $p < 0.05$ based on the Z-test and thus the two surveys differ significantly on the specific categorical attribute.

*$p < 0.05$ based on the Mann–Whitney U test, and thus the non-normally distributed continuous attribute is significantly different between the two surveys.

During the 4-year time lag between the two survey waves, the most important transportation infrastructure change in the study area was the extension of a subway line where six new stations were added to the network. Thus, in general, the students of the two waves were making mode choices based on similar characteristics of the travel alternatives available to them. However, the cost of living in Toronto increased significantly within these 4 years, thereby forcing the students to live further away from their campuses, which in turn affected their travel mode choices.

In general, the samples appear to be similar in relation to individual and household socio-demographics. The average respondent age in both datasets is about 23 years. Both samples are skewed toward female respondents, a common issue in the post-secondary student travel data collection process ( 41 ). Some 73% of the respondents in the datasets are undergraduate students, and about 91% are full-time students. The distribution of program types in the datasets is also quite similar (69% Arts & Science students, 11% Engineering students, 10% Health & Medicine students, and about 7% Business students). In relation to household attributes, more than 50% of the students live with family, whereas 23% live with roommates. About 5% of the students live in on-campus residences. The average household size in both samples is above three, and the average number of household vehicles is about one. However, appropriate statistical tests (the Z-test and the Mann–Whitney U test) show that the two surveys differ significantly on specific attributes such as gender, home zone, household size, number of household vehicles, and home-to-campus distance. This is expected given that the two surveys were conducted on different cohorts of students at different time periods.

Table 1 shows that students lived further from campus in 2019 than in 2015. In relation to mobility tools, more students had driver’s licenses and bikeshare memberships, whereas fewer students owned transit passes and bikes in 2019 compared with 2015. Interestingly, the proportion of transit pass holders decreased significantly in the second wave, although the proportion of students using “Transit with walk access” increased (see Figure 3 below). This may be because students living further from campus in 2019 consolidated their class schedules into a few days per week and therefore commuted less frequently. In that case, a monthly transit pass might cost more than paying for individual trips, so fewer students bought one in 2019 than in 2015.
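As a worked illustration of the Z-test applied to a categorical attribute, the sketch below compares transit pass ownership between the two samples using the shares and sample sizes reported in Table 1. The paper does not state which form of the Z-test was used; a pooled two-proportion test is assumed here, and the function name is hypothetical.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-sample Z statistic for a difference in proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_err

# Transit pass ownership from Table 1: 40.8% of 10,715 (2015) vs 32.1% of 4,851 (2019).
z_pass = two_proportion_z(0.408, 10715, 0.321, 4851)
significant = abs(z_pass) > 1.96  # two-sided test at the 5% level
```

Under this assumed test form, the statistic is far above the 5% critical value, which is consistent with the significant decline in transit pass ownership described above.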

The travel diary components of the two surveys collected information about all trips made by the respondent on the survey day, including trip start time, travel mode, trip origin and destination locations, and trip purpose. The diary data indicate that more than one-third of all student trips are for commuting purposes (i.e., home-to-campus trips) (Figure 2). This highlights the importance of better understanding the factors that affect their commute mode choice decisions. Among the other destination types, more work trips and fewer shopping/errand & leisure/entertainment trips were observed in 2019.

Figure 2.

Distribution of trip purposes observed in the two samples.

The SMTO does not collect user-defined LOS (travel time and travel cost) information of the trips. These are imputed using a calibrated deterministic user equilibrium traffic assignment model called the GTA model. The GTA model is calibrated using the 2016 TTS data. TTS is a large-scale (5% sample) household travel survey conducted in the GTHA ( 27 ). The assignment model consistently produces expected travel time and cost by auto and transit for any given pair of traffic analysis zones in the study area. It should be noted that the GTA model, like most other large-scale transport models, is susceptible to endogeneity ( 42 ). However, addressing this limitation is outside the scope of this paper.

The choice set for each individual was determined using feasibility constraints: one must have a driver’s license and a car to use the car drive mode; a total transit travel time over 150 min is considered infeasible for commuting; a distance over 3 km is considered infeasible for walking; and a distance over 10 km is considered infeasible for cycling. The feasibility constraints for transit, walk, and bicycle are based on the sample data. After removing the observations with infeasible mode choices, missing personal and household socioeconomic information, and missing LOS attributes, final datasets of 6,941 and 2,497 commuting trips from the same number of commuters are obtained for 2015 and 2019, respectively (i.e., each trip in the final datasets is made by a different individual, so repeated observations from the same person cannot bias the data).
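The feasibility rules above can be expressed as a small filter. This is a sketch under stated assumptions: "a car" is read as at least one household vehicle, the 150-minute rule is applied only to transit with walk access, and the remaining modes are left always available because the text does not specify rules for them. The function and argument names are hypothetical.

```python
MODES = ["car drive", "car passenger", "transit with walk access",
         "park and ride", "kiss and ride", "walk", "cycle"]

def feasible_modes(has_license, household_vehicles, transit_minutes, distance_km):
    """Return the commute modes that pass the feasibility rules stated in the text."""
    excluded = set()
    if not (has_license and household_vehicles > 0):
        excluded.add("car drive")                 # needs a license and a car
    if transit_minutes > 150:
        excluded.add("transit with walk access")  # transit trip too long
    if distance_km > 3:
        excluded.add("walk")                      # too far to walk
    if distance_km > 10:
        excluded.add("cycle")                     # too far to cycle
    return [mode for mode in MODES if mode not in excluded]

# A student without a license, living 6.5 km from campus, 45 minutes by transit.
example = feasible_modes(has_license=False, household_vehicles=1,
                         transit_minutes=45, distance_km=6.5)
```

An observation whose chosen mode is not in the returned list would be treated as an infeasible choice and dropped, as described in the text.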

In the datasets, seven major commute modes are observed: car drive, car passenger, transit with walk access, park and ride, kiss and ride, walk, and cycle. The distributions of the modes are shown in Figure 3. Students are highly reliant on transit, with almost half of the respondents using transit for their commute trips (45% in 2015 and 53% in 2019). A gradual shift from active to motorized modes (especially transit) is observed between the samples. This can be partly associated with students living further from campus in 2019. A similar shift is also observed for the non-commuting trips, indicating a transition in travel mode preference among students, which necessitates an in-depth mode choice investigation. To accurately capture the evolution of commute mode choice behavior over the years, this study fuses the two samples using the proposed RI-based temporal fusion method. Specifically, the 2015 data are used to estimate the “perceived market share model” to calculate the prior probabilities for the 2019 data. About 80% of the 2019 sample is used to estimate the proposed RI-based MNL model of commute mode choices, and the remaining 20% is used as a holdout sample for model performance validation.

Figure 3.

Distribution of commute modes observed in the two samples.

Results

Table 2 presents the final RI-MNL model. To reach the final model, we first estimated the “perceived market share model” or “prior probability sub-model” (which is a classical MNL model containing alternative-specific constants and choice-maker attributes) using the 2015 data. Then, we applied the estimated model to the 2019 estimation data to calculate prior probabilities for the 2019 respondents. Finally, the full RI-MNL model was estimated using the 2019 estimation sample along with the calculated prior probabilities. The model specification was derived by retaining variables with the expected signs and statistical significance. The critical t-statistic at the 95% confidence level was used as the threshold for considering variables in the model. However, a few parameters with lower t-stat values are retained in the model because the corresponding variables provide considerable insight into the behavioral process.

Table 2.

Empirical Models

		RI-MNL model
		Prior probability sub-model		Full model		RUM-MNL model
Loglikelihood of full model		na		−1440.45		−1462.15
Likelihood ratio against equiprobable model		na		0.5113		0.5040
Likelihood ratio against aggregate market share model		na		0.1614		0.1487
		Parameter	t-stat	Parameter	t-stat	Parameter	t-stat
Systematic utility function/pay-off function
	Alternative-specific constant
	Car drive	0.00	na	0.00	na	0.00	na
	Car passenger	6.16	4.34	0.28	0.66	4.77	3.12
	Transit with walk access	8.07	11.08	−0.99	−1.93	8.34	5.60
	Park and Ride	−1.26	−4.29	−0.39	−0.51	−2.11	−0.72
	Kiss and Ride	4.29	3.39	−1.03	−0.69	2.86	1.70
	Walk	9.14	9.03	0.35	1.00	9.56	6.36
	Cycle	3.87	3.30	−2.06	−2.72	−2.42	−0.93
	Travel time (minutes)
	Car drive	na	na	−0.06	−4.23	−0.04	−7.25
	Car passenger	na	na	−0.07	−3.83	−0.05	−6.67
	Transit with walk access	na	na	−0.02	−4.31	−0.02	−8.10
	Park and Ride	na	na	−0.01	−1.64	−0.01	−1.34
	Kiss and Ride	na	na	−0.01	−0.83	−0.004	−0.71
	Travel cost ($)
	All motorized modes	na	na	−0.10	−1.47	−0.07	−1.76
	Travel distance (km)
	Walk & Cycle	na	na	−0.19	−3.45	−0.09	−4.14
	Female
	Car drive	na	na	0.69	2.02	na	na
	Car passenger	0.51	3.37	na	na	na	na
	Transit with walk access	0.26	2.92	0.74	2.60	na	na
	Park and Ride	0.52	2.65	na	na	na	na
	Kiss and Ride	0.59	3.66	1.26	1.93	na	na
	Walk	0.28	2.71	na	na	−0.64	−2.50
	Cycle	na	na	−0.93	−2.04	−1.49	−4.64
	Log of age (years)
	Car passenger	−2.46	−6.12	na	na	na	na
	Transit with walk access	−1.83	−8.47	na	na	na	na
	Park and Ride	na	na	na	na	2.24	2.94
	Kiss and Ride	−1.82	−5.14	na	na	na	na
	Walk	−2.08	−7.19	na	na	na	na
	Cycle	−1.47	−4.38	na	na	2.17	3.41
	Graduate student
	Kiss and Ride	na	na	1.04	1.75	na	na
	Cycle	0.76	6.67	na	na	na	na
	Full-time student
	Car passenger	0.91	2.15	na	na	na	na
	Transit with walk access	na	na	0.85	2.16	na	na
	Park and Ride	na	na	na	na	−0.95	−1.66
	Kiss and Ride	1.39	2.70	−2.56	−2.08	na	na
	Walk	0.45	2.40	na	na	na	na
	Engineering student
	Transit with walk access	0.42	2.86	na	na	na	na
	Kiss and Ride	0.61	2.91	na	na	na	na
	Walk	0.84	4.09	na	na	na	na
	Cycle	0.77	3.64	na	na	0.54	1.32
	Works off-campus
	Car passenger	na	na	−1.00	−2.09	−0.57	−2.49
	Bicycle owner
	Cycle	na	na	4.38	5.21	2.74	6.77
	Family supports education
	Car drive	na	na	na	na	−0.52	−2.32
	Car passenger	na	na	na	na	0.65	1.78
	Live with family
	Kiss and Ride	na	na	na	na	1.05	1.66
	Cycle	na	na	−4.14	−3.15	na	na
	Not live with family
	Cycle	na	na	na	na	1.78	3.50
	Household vehicle number
	Car passenger	−0.33	−3.56	na	na	0.40	3.26
	Transit with walk access	−0.84	−13.12	na	na	na	na
	Park and Ride	0.24	2.42	na	na	0.89	5.45
	Kiss and Ride	−0.28	−3.19	na	na	0.45	2.86
	Walk	−0.94	−11.99	na	na	na	na
	Cycle	−1.08	−13.63	na	na	−0.58	−2.41
	Apartment/condo
	Transit with walk access	na	na	1.17	3.76	0.84	4.49
	Park and Ride	na	na	na	na	−1.84	−1.75
	Walk	na	na	0.92	2.38	0.85	3.43
	On-campus residence
	Walk	1.62	8.09	na	na	1.46	3.08
	Family income < 15k
	Car passenger	−0.51	−2.20	na	na	na	na
	Kiss and Ride	−0.57	−2.18	na	na	na	na
	Cycle	0.28	2.39	na	na	na	na
	Toronto resident
	Transit with walk access	0.48	4.91	na	na	na	na
	Park and Ride	−1.34	−5.21	na	na	na	na
	Kiss and Ride	−1.21	−7.47	na	na	na	na
	Walk	0.80	4.66	na	na	na	na
	Cycle	2.31	5.69	na	na	na	na
Unit cost of information processing
	Family supports education	na	na	−0.37	−2.46	na	na
	Household size greater than 3	na	na	−0.48	−3.14	na	na
	More than 2 vehicles in the household	na	na	0.60	3.34	na	na
	Art & science student	na	na	−0.27	−1.80	na	na
	Law student	na	na	0.81	2.79	na	na

Note: RI = rational inattention; RUM = random utility maximization; MNL = multinomial logit; na = not applicable.
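The goodness-of-fit rows of Table 2 can be cross-checked with a short calculation. The table does not define its "likelihood ratio" measure; the sketch assumes it is the McFadden-style index, 1 - LL(model)/LL(reference), and the function names are hypothetical. Under that assumption, the reference log-likelihood implied by each column should be roughly the same, since both models are estimated on the same 2019 estimation sample.

```python
def likelihood_ratio_index(ll_model, ll_reference):
    """McFadden-style index: 1 - LL(model) / LL(reference)."""
    return 1.0 - ll_model / ll_reference

def implied_reference_ll(ll_model, index):
    """Back out the reference log-likelihood from a reported index."""
    return ll_model / (1.0 - index)

# Values reported in Table 2.
ll_ri, ll_rum = -1440.45, -1462.15

# Equiprobable reference implied by each model's reported index.
ref_from_ri = implied_reference_ll(ll_ri, 0.5113)
ref_from_rum = implied_reference_ll(ll_rum, 0.5040)

# Aggregate market share reference implied by each model's reported index.
share_from_ri = implied_reference_ll(ll_ri, 0.1614)
share_from_rum = implied_reference_ll(ll_rum, 0.1487)

# Log-likelihood advantage of the RI-MNL model over the RUM-MNL model.
ll_gain = ll_ri - ll_rum
```

The two implied reference values agree to within about one log-likelihood unit in each case, which is consistent with the assumed definition given that the indices are rounded to four decimals. Because the two models are non-nested, the log-likelihood gain is a descriptive comparison and not a formal test.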

The estimates of the prior probability sub-model meet expectations and are in line with other studies in the literature. For example, female students are more likely to share rides, use transit, or walk to campus than to drive or cycle. The negative association of females with driving and cycling has been well documented in previous studies ( 41 , 43 , 44 ). Based on the model estimates, older students are more likely to choose “car drive” and “park and ride” for commuting, whereas full-time students are more likely to choose “car passenger,” “kiss and ride,” and “walk.” The number of household vehicles is positively associated with driving modes and negatively associated with non-driving modes. Students who live on campus are more likely to walk to class. Students from lower-income households are less likely to share rides and more likely to cycle to campus. Students who live in the City of Toronto are more likely to choose transit, walking, or cycling for commuting, which makes sense given the city’s high transit accessibility and well-connected cycling network.

The estimates of the full RI-MNL model show that LOS attributes (travel time, travel cost, and trip length) have negative signs that match expectations. In relation to socio-demographic attributes, students who work off-campus are less likely to share rides, perhaps because they have more off-campus commitments and cannot synchronize their schedules with others for ridesharing. Students who live in apartments or condominiums are more likely to use transit or walk. This makes sense because apartments and condominiums are usually constructed in areas well served by transit.

A key feature of the RI model is that it allows us to capture the unit cost of information processing, which, formulation-wise, is equivalent to the scale parameter of the classical model. In this study, we specified the information processing cost as an exponential of a linear-in-parameter function to ensure positivity. Thus, a variable with a positive parameter in the function raises the unit cost of information processing, which tends to make the choice maker less calculative in decision-making; a negative parameter has the opposite effect. The estimates indicate that information processing costs are lower for students whose families financially support their education. In other words, these students are more attentive to the performance of the choice alternatives than students who do not receive financial support from their families. Students whose households have more than three members pay more attention to the performance of choice alternatives than students belonging to smaller households. Commuters with more than two household vehicles tend to be less attentive to the performance of alternative modes as they are more likely to choose private car-based modes. Arts and Science students pay more attention, whereas law students pay less attention, to the performance of the commuting mode choice alternatives.
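As a rough illustration of the exponential specification, the sketch below converts the information-cost coefficients reported in Table 2 into multiplicative effects. It assumes that each variable is a 0/1 indicator, so that switching it on multiplies the unit information cost by the exponential of its coefficient. The names `INFO_COST_COEFFICIENTS` and `cost_multiplier` are illustrative and are not taken from the original estimation code.

```python
import math

# Coefficients of the information-cost function as reported in Table 2.
INFO_COST_COEFFICIENTS = {
    "Family supports education": -0.37,
    "Household size greater than 3": -0.48,
    "More than 2 vehicles in the household": 0.60,
    "Art & science student": -0.27,
    "Law student": 0.81,
}


def cost_multiplier(coefficient: float) -> float:
    """Factor applied to exp(linear function) when a 0/1 indicator equals 1.

    Assumption: the variable enters the linear function as a dummy.
    """
    return math.exp(coefficient)


MULTIPLIERS = {
    name: cost_multiplier(beta) for name, beta in INFO_COST_COEFFICIENTS.items()
}

if __name__ == "__main__":
    for name, factor in MULTIPLIERS.items():
        direction = "raises" if factor > 1.0 else "lowers"
        print(f"{name}: x{factor:.2f} ({direction} the unit information cost)")
```

A factor below 1 corresponds to a more attentive (more calculative) choice maker, and a factor above 1 to a less attentive one, which matches the interpretation given in the paragraph above.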

To assess the performance of the RI-MNL model estimated using data from two survey cycles, we compare it with a classical RUM-based MNL model estimated using the 2019 estimation sample only. The purpose of this RUM-MNL model is to serve as a baseline for assessing the benefit of using data from two survey cycles (via the RI-based MNL model) in comparison to using data from one cycle only. As such, it is estimated using data from 2019 only and not by pooling data from the two survey cycles. The parameters of the base model are shown in Table 2. Overall, the estimates align with those of the RI-MNL model. However, some socio-demographic variables, like respondent age, are significant in the prior probability model but not in the full fused data model. This suggests that, for post-secondary students, age tends to affect only the prior probability of choosing an alternative (like an underlying generational factor), whereas the cost of additional information processing for choice-making is affected by other, more direct factors like mobility tool ownership, living situation, and work status.

Interestingly, the fused data model fits the 2019 estimation data slightly better than the classical MNL model. This suggests that introducing pseudo-panel-based prior probabilities in the RI-MNL model improves its explanatory power. We also estimate the subjective value of travel time savings (SVTTS) to allow a more intuitive evaluation of the models. SVTTS is defined as the ratio of the travel time coefficient to the travel cost coefficient. It captures the time–money trade-offs in mode choices. Table 3 presents the SVTTS estimates of the RI-MNL and RUM-MNL models presented in Table 2.

Table 3.

Subjective Value of Travel Time Savings (SVTTS) Estimates

	RI-MNL model	RUM-MNL model
Car drive	$36	$38
Car passenger	$44	$44
Transit with walk access	$15	$16
Park and ride	$9	$7
Kiss and ride	$5	$3

Note: RI = rational inattention; RUM = random utility maximization; MNL = multinomial logit.
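
The ratio that defines SVTTS can be written compactly as follows. The notation is assumed here for illustration and is not taken from the paper: $\beta_{m}^{time}$ denotes the travel time coefficient of mode $m$, $\beta^{cost}$ the travel cost coefficient, and $c$ a unit conversion factor (for example, 60 if travel time enters the utility in minutes and the value is reported per hour).

$$SVTTS_{m} = c \cdot \frac{\beta_{m}^{time}}{\beta^{cost}}$$

Because both coefficients are expected to be negative, the ratio is positive, and a mode with a larger travel time coefficient in absolute terms has a higher SVTTS for the same cost coefficient.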

Alternative-specific travel time coefficients and generic travel cost coefficients are specified for the motorized modes to capture differences in their SVTTS estimates. Although there is no ground truth for SVTTS, the classical RUM-MNL model yields higher estimates than the RI-MNL model for larger market share alternatives like “Car drive” and “Transit with walk access,” whereas the RI-MNL model yields higher estimates for smaller market share alternatives like “Park and Ride” and “Kiss and Ride.” More in-depth analysis is required to understand the underlying cause of this variation in mode-specific SVTTS estimates between the two models. Overall, the results provide evidence that the proposed method offers an appropriate framework to fuse repeated cross-sectional datasets based on the theory of RI in discrete choice modeling.

Validation

In this section, we compare the forecasting performance of the RI-based model (estimated using data from two survey cycles) with that of the classical RUM-based model. For this validation exercise, we use the holdout sample of the 2019 data, which comprises 500 commuting trips. We apply both models to this dataset and compare the predictions using multiple metrics, as shown below.

First, we compute the first preference recovery (FPR) (8) for both models. FPR, also referred to as “percentage of correct predictions,” is an aggregate measure that shows the proportion of individuals who actually choose the alternative (travel mode) with the highest modeled utility. The measure is given by Equation 6:

$$FPR = \frac{100}{N} \sum_{i = 1}^{N} \mathbb{1}\left(y_{i}^{p} = y_{i}^{o}\right) \tag{6}$$

Here, $y_{i}^{p}$ is the model-predicted alternative (i.e., the alternative with the highest modeled utility) for individual $i$, $y_{i}^{o}$ is the observed choice (travel mode) for individual $i$ in the validation data, and $N$ is the number of individuals (observations) in the data. Based on this definition, the RI-MNL model (i.e., the RI-based fused data model) has an FPR of 70.8%, which is slightly higher than the 70.4% FPR of the RUM-MNL model (i.e., the classical MNL model estimated using 2019 survey data only). This suggests marginally better prediction performance for the RI-based fused data model. However, a major limitation of the FPR measure is its inability to differentiate between the range of probabilities assigned to the chosen alternatives (8). Therefore, we also calculate other measures of predictive performance that consider the probabilities assigned to the chosen and non-chosen alternatives, as shown in Table 4.
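The FPR definition above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation (the reference list cites GAUSS for estimation). The function names and the toy data in the usage note are hypothetical.

```python
from typing import Dict, Hashable, Sequence


def predicted_alternative(utilities: Dict[Hashable, float]) -> Hashable:
    """Return the alternative with the highest modeled utility."""
    return max(utilities, key=lambda alt: utilities[alt])


def first_preference_recovery(
    predicted: Sequence[Hashable], observed: Sequence[Hashable]
) -> float:
    """Percentage of individuals whose predicted and observed choices match."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("predicted and observed must be non-empty and equal in length")
    hits = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * hits / len(observed)
```

For example, with the hypothetical predictions `["walk", "transit", "walk", "cycle"]` and observations `["walk", "transit", "cycle", "cycle"]`, three of four choices match and the function returns 75.0. On the 500-trip holdout sample, FPR values of 70.8% and 70.4% correspond to 354 and 352 correctly predicted trips, respectively.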

Table 4.

Forecasting Performance on the Holdout Sample

	Observed trips	RI-based fused data model	MNL model
Car drive	44	44	42
Car passenger	27	20	21
Transit with walk access	250	250	253
Park and ride	18	19	18
Kiss and ride	16	18	17
Walk	126	131	131
Cycle	19	18	18
Mean absolute error		2.286	2.571
Chi-squared index		2.371	1.774

Note: RI = rational inattention; MNL = multinomial logit.

Table 4 presents the observed mode choices and the predictions of each model, along with the mean absolute error and the chi-squared index (the sum over modes of the squared difference between observed and predicted trips, normalized by the corresponding observed trips). At a 5% significance level, the critical chi-squared value for 6 degrees of freedom is 12.59. Both indices (2.371 and 1.774) fall well below this threshold, so neither model’s predicted distribution differs significantly from the observed one, and both models perform quite well in immediate short-term prediction. The fused data model has a slightly lower mean absolute error, which hints toward improved forecasting performance, although its chi-squared index is slightly higher than that of the MNL model.
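The two aggregate measures in Table 4 can be recomputed from the trip counts in the table. The sketch below is an illustration with hypothetical function names. It computes the chi-squared index on trip counts, which is the form that reproduces the reported values, and it takes the critical value of 12.59 from the text rather than from a statistics library.

```python
# Trip counts by mode from Table 4, in the order: car drive, car passenger,
# transit with walk access, park and ride, kiss and ride, walk, cycle.
OBSERVED = [44, 27, 250, 18, 16, 126, 19]
RI_PREDICTED = [44, 20, 250, 19, 18, 131, 18]
MNL_PREDICTED = [42, 21, 253, 18, 17, 131, 18]

CRITICAL_CHI2_6DF_5PCT = 12.59  # value quoted in the text


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted trips."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def chi_squared_index(observed, predicted):
    """Sum of squared differences, each normalized by the observed count."""
    return sum((o - p) ** 2 / o for o, p in zip(observed, predicted))


if __name__ == "__main__":
    for label, predicted in (("RI", RI_PREDICTED), ("MNL", MNL_PREDICTED)):
        mae = mean_absolute_error(OBSERVED, predicted)
        chi2 = chi_squared_index(OBSERVED, predicted)
        below = chi2 < CRITICAL_CHI2_6DF_5PCT
        print(f"{label}: MAE={mae:.3f}, chi-squared={chi2:.3f}, below critical={below}")
```

Under these inputs, the functions return the mean absolute errors and chi-squared indices listed in the last two rows of Table 4, to three decimal places.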

To better understand the forecasting performance, we present the validation results of each model in the form of a confusion matrix, which includes the observed and predicted number of trips for each mode, as well as recall and precision accuracies. The equations used to calculate precision and recall are presented below:

$$\text{Precision} = \frac{\text{observations that are correctly predicted as mode } M}{\text{all observations that are predicted as mode } M}$$

$$\text{Recall} = \frac{\text{observations that are correctly predicted as mode } M}{\text{observations from mode } M}$$

Tables 5 and 6 present the confusion matrices for the RI-based fused data model and the classical MNL model, respectively. For both models, the recall accuracies for transit and walk are the highest, probably because of the significantly larger number of transit and walk trips in the estimation data. The two models are also evaluated based on their F-scores. The F-score is an accuracy measure defined as the harmonic mean of recall and precision, and a higher F-score represents higher prediction accuracy. It is computed for each travel mode in the two models. Table 7 compares the F-scores across modes and models.

Table 5.

Confusion Matrix of Rational Inattention (RI)-based Fused Data Model

Observed mode (rows) vs. predicted mode (columns)	Car drive	Car passenger	Transit with walk access	Park and ride	Kiss and ride	Walk	Cycle	Total	Recall (%)
Car drive	15	3	17	5	3	1	1	44	35.3
Car passenger	4	3	16	2	3	0	0	27	11.3
Transit with walk access	16	10	176	6	9	25	8	250	70.4
Park and ride	4	1	8	3	2	0	0	18	18.9
Kiss and ride	2	1	8	3	2	0	0	16	10.7
Walk	2	1	19	0	0	99	4	126	78.8
Cycle	1	0	6	0	0	6	6	19	30.2
Total	44	20	250	19	18	131	18	500	na
Precision (%)	35.3	15.6	70.4	17.9	9.6	75.6	31.6	na	na

Note: RI = rational inattention; na = not applicable.

The diagonal elements represent the number of correctly predicted trips for each mode.

Table 6.

Confusion Matrix of Classical Multinomial Logit (MNL) Model

Observed mode (rows) vs. predicted mode (columns)	Car drive	Car passenger	Transit with walk access	Park and ride	Kiss and ride	Walk	Cycle	Total	Recall (%)
Car drive	15	3	18	4	2	1	0	44	34.3
Car passenger	3	3	17	1	2	0	0	27	11.0
Transit with walk access	15	11	174	6	9	26	8	250	69.8
Park and ride	4	1	9	3	1	0	0	18	18.2
Kiss and ride	2	1	9	2	2	0	0	16	10.3
Walk	2	1	20	0	0	99	3	126	78.3
Cycle	1	0	6	0	0	7	5	19	27.8
Total	42	21	253	18	17	131	18	500	na
Precision (%)	36.3	14.5	68.8	18.3	9.7	75.0	29.8	na	na

Note: MNL = multinomial logit; na = not applicable.

The diagonal elements represent the number of correctly predicted trips for each mode.

Table 7.

Comparison of F-scores across Modes and Models

	Car drive	Car passenger	Transit with walk access	Park and ride	Kiss and ride	Walk	Cycle
RI-based fused data model	0.3532	0.1314	0.7039	0.1838	0.1010	0.7715	0.3088
MNL model	0.3527	0.1254	0.6928	0.1827	0.0996	0.7665	0.2879

Note: MNL = multinomial logit; RI = rational inattention.
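
As a check on the harmonic-mean definition, the sketch below recomputes the F-scores of the RI-based fused data model from the precision and recall percentages in Table 5 and compares them with the values in Table 7. The function name `f_score` is illustrative. Small differences in the fourth decimal place are expected because the inputs are rounded to one decimal place.

```python
# Mode order: car drive, car passenger, transit with walk access,
# park and ride, kiss and ride, walk, cycle.
RI_PRECISION = [35.3, 15.6, 70.4, 17.9, 9.6, 75.6, 31.6]  # Table 5, %
RI_RECALL = [35.3, 11.3, 70.4, 18.9, 10.7, 78.8, 30.2]    # Table 5, %
RI_F_REPORTED = [0.3532, 0.1314, 0.7039, 0.1838, 0.1010, 0.7715, 0.3088]  # Table 7


def f_score(precision_pct: float, recall_pct: float) -> float:
    """Harmonic mean of precision and recall, returned on a 0-1 scale."""
    p = precision_pct / 100.0
    r = recall_pct / 100.0
    if p + r == 0.0:
        return 0.0  # convention when a mode is never predicted nor observed
    return 2.0 * p * r / (p + r)


if __name__ == "__main__":
    for p, r, reported in zip(RI_PRECISION, RI_RECALL, RI_F_REPORTED):
        print(f"computed={f_score(p, r):.4f}  reported={reported:.4f}")
```

Because the harmonic mean is dominated by the smaller of the two inputs, a mode such as “Car passenger” with low recall keeps a low F-score even where its precision is somewhat higher.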

Although the models’ performances are quite similar, the weighted average precision, recall, and F-score of the RI-based fused data model (60.3%, 61.0%, and 0.61) are slightly better than those of the MNL model (59.4%, 60.3%, and 0.59), suggesting improved forecasting accuracy. However, to reach a firm conclusion, future studies should stress test the models using independent data from a more recent time point.
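The weighting scheme behind these averages is not stated in the text. The sketch below assumes that each mode is weighted by its number of observed trips in the holdout sample; under that assumption, the precision and recall values from Tables 5 and 6 give averages within about 0.1 percentage points of the reported figures. All names are illustrative.

```python
# Mode order: car drive, car passenger, transit with walk access,
# park and ride, kiss and ride, walk, cycle.
OBSERVED_TRIPS = [44, 27, 250, 18, 16, 126, 19]  # assumed weights (Table 4)

RI_PRECISION = [35.3, 15.6, 70.4, 17.9, 9.6, 75.6, 31.6]    # Table 5, %
RI_RECALL = [35.3, 11.3, 70.4, 18.9, 10.7, 78.8, 30.2]      # Table 5, %
MNL_PRECISION = [36.3, 14.5, 68.8, 18.3, 9.7, 75.0, 29.8]   # Table 6, %
MNL_RECALL = [34.3, 11.0, 69.8, 18.2, 10.3, 78.3, 27.8]     # Table 6, %


def weighted_average(values, weights):
    """Average of values weighted by the given non-negative weights."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


if __name__ == "__main__":
    for label, precision, recall in (
        ("RI", RI_PRECISION, RI_RECALL),
        ("MNL", MNL_PRECISION, MNL_RECALL),
    ):
        wp = weighted_average(precision, OBSERVED_TRIPS)
        wr = weighted_average(recall, OBSERVED_TRIPS)
        print(f"{label}: weighted precision={wp:.1f}%, weighted recall={wr:.1f}%")
```

With observed-trip weights, the weighted recall coincides with the overall share of correctly classified trips, which is why “Transit with walk access” and “Walk” dominate the averages.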

Conclusion and Future Works

State-of-the-art travel demand models use data from the most recent time point, even when cross-sectional data are available from multiple time points, because forecasts based on the most current data are expected to be more accurate. However, data from a single time point cannot capture the constant evolution of travel behavior. To overcome this limitation, this study proposes a method to fuse repeated cross-sectional travel survey data based on RI theory in discrete choice modeling. For empirical application, it uses data from two cycles of a large-sample post-secondary student travel survey in the GTHA to investigate the commuting mode choices of post-secondary students in the area. The study contributes to the literature in two ways. First, methodologically, it proposes an approach to fuse repeated cross-sectional datasets to generate better insights into the evolution of travel preferences. Second, empirically, it contributes to the literature on university students’ commute mode choice by estimating the proposed model using a rich set of exogenous variables. Although the proposed method is empirically tested for mode choice decisions in this study, it can be extended to other contexts of disaggregate travel behavior analysis.

The study proposes to measure the prior probability of choice alternatives in the RI model using an individual-specific perceived market share of alternatives from an older dataset of the same population (i.e., data from an earlier cycle of a repeated cross-sectional survey). A more recent dataset is then used to model the conditional heterogeneous choices. The proposed fusion method is theoretically more robust than existing pooling techniques (where repeated cross-sectional datasets are pooled together to estimate meta-models with year-specific temporal factors and scale parameters). Moreover, it is computationally less burdensome. The empirical investigation demonstrates that the proposed method generates behaviorally consistent results and thus provides an appropriate framework for fusing repeated cross-sectional datasets.
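To make the roles of the two datasets concrete, the sketch below evaluates choice probabilities in the general RI-MNL form of Matějka and McKay (15), in which prior probabilities weight the exponentiated utilities scaled by the unit information cost. It is an assumption-laden illustration and not the estimable specification used in this study: the function name, argument names, and example numbers are hypothetical, with `priors` standing in for probabilities derived from the older survey cycle and `utilities` for systematic utilities from the more recent cycle.

```python
import math
from typing import List, Sequence


def ri_mnl_probabilities(
    utilities: Sequence[float], priors: Sequence[float], info_cost: float
) -> List[float]:
    """P_i = p0_i * exp(v_i / lam) / sum_j p0_j * exp(v_j / lam).

    utilities: systematic utilities v_i of the alternatives
    priors:    prior probabilities p0_i (non-negative, not all zero)
    info_cost: unit cost of information processing lam (> 0)
    """
    if info_cost <= 0:
        raise ValueError("info_cost must be positive")
    if len(utilities) != len(priors) or len(utilities) == 0:
        raise ValueError("utilities and priors must be non-empty and equal in length")
    scaled = [v / info_cost for v in utilities]
    shift = max(scaled)  # subtract the maximum for numerical stability
    weights = [p * math.exp(s - shift) for p, s in zip(priors, scaled)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("priors must contain at least one positive value")
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical three-alternative example.
    print(ri_mnl_probabilities([-1.0, -0.5, -2.0], [0.2, 0.6, 0.2], info_cost=1.5))
```

Two limiting cases help with interpretation. When all utilities are equal, the probabilities reduce to the priors, so the older data fully determine the prediction. When the priors are uniform, the expression reduces to a standard MNL model with scale equal to the inverse of the information cost.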

Validation of the estimated RI-MNL model using a holdout sample indicates slightly improved forecasting performance compared with the classical MNL model. Future studies should stress test the models using independent data from a more recent time point. Future work should also attempt to overcome the independence of irrelevant alternatives (IIA) limitation of the RI-MNL model by adopting the information cost function proposed by Fosgerau et al. (16). Research is currently underway to apply the proposed method to fuse multiple cross-sectional travel surveys conducted at different time points during the pandemic to offer a better understanding of the potential trend of post-pandemic mobility.

Footnotes

Author Contributions

The authors confirm contribution to the paper as follows: study conception and design: S. Hossain, K.N. Habib; data collection: S. Hossain; analysis and interpretation of results: S. Hossain, K.N. Habib; draft manuscript preparation: S. Hossain, K.N. Habib. All authors reviewed the results and approved the final version of the manuscript.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research was funded by an NSERC Discovery Grant.

ORCID iDs

Sanjana Hossain

Khandker Nurul Habib

The authors bear sole responsibility for all results, comments, and interpretations made in the paper.

References

1. Bansal and K. M. Kockelman. Forecasting Americans’ Long-Term Adoption of Connected and Autonomous Vehicle Technologies. Transportation Research Part A: Policy and Practice, Vol. 95, 2017, pp. 49–63.

2. Fagnant, D. J., and Kockelman. Preparing a Nation for Autonomous Vehicles: Opportunities, Barriers and Policy Recommendations. Transportation Research Part A: Policy and Practice, Vol. 77, 2015, pp. 167–181.

3. Vij. Understanding Consumer Demand for New Transport Technologies and Services, and Implications for the Future of Mobility. In Data-Driven Multivalence in the Built Environment (N. Biloria, ed.), Springer International Publishing, Cham, 2020, pp. 91–107. https://doi.org/10.1007/978-3-030-12180-8_5.

4. Beck, M. J., and D. A. Hensher. Australia 6 Months After COVID-19 Restrictions- Part 1: Changes to Travel Activity and Attitude to Measures. Transport Policy, Vol. 128, 2021, pp. 286–298.

5. Sanko. Travel Demand Forecasts Improved by Using Cross-Sectional Data from Multiple Time Points. Transportation, Vol. 41, No. 4, 2014, pp. 673–695. https://doi.org/10.1007/s11116-013-9464-7.

6. Golob, T. F., Kitamura, and Long. Panels for Transportation Planning. Springer, Boston, MA, 1997.

7. Kitamura. Panel Analysis in Transportation Planning: An Overview. Transportation Research Part A: General, Vol. 24, No. 6, 1990, pp. 401–415.

8. Ortúzar and L. G. Willumsen. Modelling Transport. John Wiley and Sons, Chichester, 2011.

9. Anowar, Eluru, and L. F. Miranda-Moreno. Analysis of Vehicle Ownership Evolution in Montreal, Canada Using Pseudo Panel Analysis. Transportation, Vol. 43, No. 3, 2016, pp. 531–548. https://doi.org/10.1007/s11116-015-9588-z.

10. Sanko. Travel Demand Forecasts Improved by Using Cross-Sectional Data from Multiple Time Points: Enhancing Their Quality by Linkage to Gross Domestic Product. Transportation, Vol. 45, No. 3, 2018, pp. 905–918. https://doi.org/10.1007/s11116-016-9755-x.

11. Salem and K. M. N. Habib. Use of Repeated Cross-Sectional Travel Surveys to Develop a Meta Model of Activity-Travel Generation Process Models: Accounting for Changing Preference in Time Expenditure Choices. Transportmetrica A: Transport Science, Vol. 11, No. 8, 2015, pp. 729–749. https://doi.org/10.1080/23249935.2015.1066900.

12. Salem and K. M. N. Habib. Use of Repeated Cross-Sectional Travel Surveys for Developing Meta Models of Activity-Travel Scheduling Processes. Transportation, Vol. 46, No. 2, 2019, pp. 395–423. https://doi.org/10.1007/s11116-018-9954-8.

13. Habib, K. M. N., Swait, and Salem. Using Repeated Cross-Sectional Travel Surveys to Enhance Forecasting Robustness: Accounting for Changing Mode Preferences. Transportation Research Part A: Policy and Practice, Vol. 67, 2014, pp. 110–126.

14. Forsey, K. M. N. Habib, E. J. Miller, and Shalaby. Temporal Transferability of Work Trip Mode Choice Models in an Expanding Suburban Area: The Case of York Region, Ontario. Transportmetrica A: Transport Science, Vol. 10, No. 6, 2014, pp. 469–482. https://doi.org/10.1080/23249935.2013.788100.

15. Matějka and McKay. Rational Inattention to Discrete Choices: A New Foundation for the Multinomial Logit Model. American Economic Review, Vol. 105, No. 1, 2015, pp. 272–298. https://www.aeaweb.org/articles?id=10.1257/aer.20130047.

16. Fosgerau, Melo, de Palma, and Shum. Discrete Choice and Rational Inattention: A General Equivalence Result. International Economic Review, Vol. 61, No. 4, 2020, pp. 1569–1589. https://doi.org/10.1111/iere.12469.

17. Deaton. Panel Data from Time Series of Cross-Sections. Journal of Econometrics, Vol. 30, No. 1–2, 1985, pp. 109–126.

18. Dargay, J. M., and P. C. Vythoulkas. Estimation of a Dynamic Car Ownership Model: A Pseudo-Panel Approach. Journal of Transport Economics and Policy, Vol. 33, No. 3, 1999, pp. 287–301. http://www.jstor.org/stable/20053816.

19. Dargay, J. M. Determinants of Car Ownership in Rural and Urban Areas: A Pseudo-Panel Analysis. Transportation Research Part E: Logistics and Transportation Review, Vol. 38, No. 5, 2002, pp. 351–366.

20. Huang. The Use of Pseudo Panel Data for Forecasting Car Ownership. University of London, 2007.

21. Matas and J. L. L. Raymond. Changes in the Structure of Car Ownership in Spain. Transportation Research Part A: Policy and Practice, Vol. 42, No. 1, 2008, pp. 187–202.

22. Bush. Forecasting 65+ Travel: An Integration of Cohort Analysis and Travel Demand Modeling. Doctoral dissertation. Massachusetts Institute of Technology, Cambridge, 2003.

23. Goulias, K. G., Blain, Kilgren, Michalowski, and Murakami. Catching the Next Big Wave: Do Observed Behavioral Dynamics of Baby Boomers Force Rethinking of Regional Travel Demand Models? Transportation Research Record: Journal of the Transportation Research Board, 2007. 2014: 67–75.

24. Weis and K. W. Axhausen. Induced Travel Demand: Evidence from a Pseudo Panel Data Based Structural Equations Model. Research in Transportation Economics, Vol. 25, No. 1, 2009, pp. 8–18.

25. Sobhani, Eluru, and Pinjari. Evolution of Adults’ Weekday Time Use Patterns from 1992 to 2010: A Canadian Perspective. Presented at 93rd Annual Meeting of the Transportation Research Board, Washington, D.C., 2014.

26. Habib, K. M. N., and Weiss. Evolution of Latent Modal Captivity and Mode Choice Patterns for Commuting Trips: A Longitudinal Analysis Using Repeated Cross-Sectional Datasets. Transportation Research Part A: Policy and Practice, Vol. 66, No. 1, 2014, pp. 39–51.

27. Data Management Group (DMG). Transportation Tomorrow Survey. 2018. http://dmg.utoronto.ca/pdf/tts/2016/2016TTS_Conduct.pdf.

28. Dias, F. F., Kim, C. R. Bhat, R. M. Pendyala, W. H. K. Lam, A. R. Pinjari, K. K. Srinivasan, and Ramadurai. Modeling the Evolution of Ride-Hailing Adoption and Usage: A Case Study of the Puget Sound Region. Transportation Research Record: Journal of the Transportation Research Board, 2020. 2675: 81–97.

29. Borysov, S. S., and Rich. Introducing Synthetic Pseudo Panels: Application to Transport Behaviour Dynamics. Transportation, Vol. 48, No. 5, 2021, pp. 2493–1520. https://doi.org/10.1007/s11116-020-10137-5.

30. Sims, C. A. Implications of Rational Inattention. Journal of Monetary Economics, Vol. 50, No. 3, 2003, pp. 665–690.

31. Sims, C. A. Chapter 4 - Rational Inattention and Monetary Economics. In Handbook of Monetary Economics (B. M. Friedman and Woodford, eds), Elsevier, North-Holland, 2010, pp. 155–181. https://www.sciencedirect.com/science/article/pii/B9780444532381000041.

32. Simon, H. A. Theories of Decision-Making in Economics and Behavioral Science. American Economic Review, Vol. 49, No. 3, 1959, pp. 253–283.

33. Shannon, C. E. A Mathematical Theory of Communication. The Bell System Technical Journal, Vol. 27, No. 3, 1948, pp. 379–423.

34. Caplin, Leahy, and Matějka. Rational Inattention and Inference from Market Share Data. 2016. https://www.ecb.europa.eu/pub/conferences/shared/pdf/20170925_2nd_ecb_annual_research_conference/04_John_Leahy_paper.pdf.

35. Joo. Rational Inattention as an Empirical Framework: Application to the Welfare Effects of New Product Introduction. 2019. https://econ2017.sites.olt.ubc.ca/files/2019/10/pdf_seminar-paper_Joonhwi-Joo_25-Oct.pdf.

36. Habib, K. N. Rational Inattention in Discrete Choice Models: Estimable Specifications of RI-Multinomial Logit (RI-MNL) and RI-Nested Logit (RI-NL) Models. Transportation Research Part B: Methodological, Vol. 172, 2023, pp. 53–70.

37. Shakib and K. M. N. Habib. The Application of Rational Inattention Theory in Modelling Residential Location Choices: A Cross-Sectional Investigation Using a Stated Preference Dataset. Presented at 102nd Annual Meeting of the Transportation Research Board, Washington, D.C., 2023.

38. Lebo, M. J., and Weber. An Effective Approach to the Repeated Cross-Sectional Design. American Journal of Political Science, Vol. 59, No. 1, 2015, pp. 242–258. https://doi.org/10.1111/ajps.12095.

39. GAUSS Programming Language. Aptech Inc., 2022. https://www.aptech.com/.

40. Mitra, K. M. N. Habib, Siemiatycki, Keil, and Bowes. StudentMoveTO - From Insight to Action on Transportation for Post-Secondary Students in the GTHA: 2019 Transportation Survey Findings. 2020. http://www.studentmoveto.ca/wp-content/uploads/2020/10/StudentMoveTO-2019-Report-Final-5-Updated-October-15-2020.pdf. Accessed August 1, 2021.

41. Daisy, N. S., M. H. Hafezi, Liu, and Millward. Understanding and Modeling the Activity-Travel Behavior of University Commuters at a Large Canadian University. Journal of Urban Planning and Development, Vol. 144, No. 2, 2018, p. 04018006.

42. Guerrero, T. E., C. A. Guevara, Cherchi, and de D. Ortúzar. Addressing Endogeneity in Strategic Urban Mode Choice Models. Transportation, Vol. 48, No. 4, 2021, pp. 2081–2102. https://doi.org/10.1007/s11116-020-10122-y.

43. Akar and K. J. Clifton. Influence of Individual Perceptions and Bicycle Infrastructure on Decision to Bike. Transportation Research Record: Journal of the Transportation Research Board, 2009. 2140: 165–172.

44. Wang, C. H., Akar, and J. M. Guldmann. Do Your Neighbors Affect Your Bicycling Choice? A Spatial Probit Model for Bicycling to The Ohio State University. Journal of Transport Geography, Vol. 42, 2015, pp. 122–130.

Fusing Repeated Cross-Sectional Revealed Preference Datasets based on Rational Inattention Theory: Accounting for Changing Modal Preferences
