Abstract
Accurate destination choice modeling is vital for transportation and urban planning, influencing demand prediction, infrastructure development, and policy assessment. Conventional methods for choice set generation—both deterministic and stochastic—often face limitations such as failing to capture individual variability, requiring extensive data, and inadequately representing decision-making processes, thus compromising reliability. This study introduces a novel application of variational autoencoders (VAEs) for generating choice sets in destination choice modeling, focusing on home-based work trips in Montreal, Canada. Leveraging the deep generative capabilities of VAEs enhances the realism and diversity of choice sets, addressing implicit perception and availability of alternatives. Our methodology integrates VAE-generated choice sets with a multinomial logit model to predict destination choices. The results show significant improvements in model fit and predictive accuracy. The VAE-based model achieved a
Introduction
Destination choice modeling is an important component in transportation and urban planning, involving the prediction of individual location choice among numerous potential destinations. Accurate destination choice models are essential for applications such as demand prediction, infrastructure planning, and policy evaluation. The process typically comprises two main steps: choice set generation and the modeling of choices from the generated choice sets. These two steps are required because the actual set of alternatives considered by individuals before making a decision is generally unobservable and can vary significantly among individuals; therefore, any misspecification in the choice set can lead to biased and inconsistent parameter estimates, affecting the reliability of the model’s predictions (
The generation of choice sets poses several challenges. First, it is generally impossible to identify the exact alternatives an individual considers before making a decision. Moreover, when dealing with a vast number of potential alternatives, enumerating all feasible options becomes impractical (
Deterministic methods involve predefined rules or criteria to generate choice sets. Common examples include rule-based methods where alternatives are included or excluded based on specific attributes or thresholds. For instance, a rule might exclude destinations beyond a certain travel time or distance. While straightforward, deterministic methods often oversimplify the decision-making process by excluding viable alternatives that do not meet rigid criteria, potentially leading to biased choice sets (
Stochastic methods incorporate randomness into the choice set generation process, utilizing techniques such as random sampling and importance sampling. These methods aim to represent the decision-making process more flexibly by allowing for variability in the choice sets generated for different individuals or scenarios. While stochastic methods are often computationally manageable in essence, their complexity can increase in practical applications, particularly when dealing with large datasets, numerous alternatives, or the need for repeated sampling and weighting adjustments (
The limitations with traditional methods necessitate a need for developing a more comprehensive modeling framework for choice set formation. It is also important to note that the concept of choice set generation is prevalent in other areas of transportation research, especially when dealing with many potential alternatives. A notable example is route choice modeling, where recent advancements have been particularly noteworthy. For instance, Yao and Bekhor combined machine learning (ML) with discrete choice models to generate choice sets with high prediction accuracy (
Similarly, deep generative methods such as generative adversarial networks (GANs) and variational autoencoders (VAEs) have demonstrated significant success in generating realistic high-dimensional data across various domains, including image synthesis and natural language processing (
In contrast, VAEs stand out as particularly well-suited for choice set generation because of several key attributes. VAEs are typically self-supervised; their performance is directly evaluated by comparing regenerated samples against the original input data. This characteristic is particularly beneficial in scenarios involving unlabeled datasets, as is the case in choice set generation. Given the typically unknown true consideration sets, the VAE framework provides a flexible means of estimating the likelihood that an alternative belongs within an individual’s implicit consideration set. Furthermore, VAEs explicitly infer structured latent representations from observed samples. This explicit probabilistic modeling enables meaningful interpretation of latent factors underlying decision-making processes. Together, these features make VAEs especially appropriate and effective for generating realistic and behaviorally meaningful choice sets. In the context of route choice modeling, Yao and Bekhor proposed a VAE-based approach that considers the implicit perception and availability of alternatives (
The potential of VAEs in choice set generation, extends beyond route choice modeling, as the underlying principles of choice set generation and perception of alternatives are applicable to other similar domains such as destination choice modeling. However, as far as we know, these deep generative models, especially VAE, have not yet been adapted for choice set generation in destination choice modeling. By leveraging the VAE approach, we can enhance the generation of destination choice sets, accounting for the implicit perception of alternatives and improving the robustness of destination choice models.
In this paper, we draw inspiration from the VAE-based choice set generation method proposed by Yao and Bekhor and adapt its principles to the context of choice set generation in destination choice modeling (
This study makes two key contribution. First, we address the challenges of large and unobservable true choice sets in destination choice modeling by using a VAE to generate choice sets. Secondly, we provide a comprehensive evaluation of the VAE approach using the Montreal Origin-Destination (OD) Survey dataset, demonstrating its applicability and effectiveness in real-world scenarios (
The rest of the paper is structured as follows. The Literature Review discusses previous methods of choice set generation, while the Methodology section details the VAE method used in this paper for choice set generation. The Data section details the dataset used in the study and the attributes incorporated into the models. The Results and Discussion section presents the findings, and the Conclusion summarizes the key insights and implications of the study.
Literature Review
Choice set formation is a critical step in the specification, estimation, and application of destination choice models. As noted by Thill, misspecification of choice sets can adversely affect parameter estimates and the accuracy of predicted choice probabilities (
Deterministic Methods
Deterministic methods, such as those used by Scott and He, and Ben-Akiva and Lerman, rely on predefined rules to generate alternatives (
Stochastic Methods
Stochastic methods introduce randomness to capture variability in decision-making processes. For example, Pel et al. used random utility models to generate choice sets, integrating randomness to better reflect real-world decision-making processes (
In addition, sampling methods are frequently used to generate choice sets by incorporating randomness. Two commonly employed sampling approaches are random sampling and importance sampling. In random sampling approaches, a random sample of locations is drawn from the universe of feasible choices to constitute the consideration choice set. Each alternative in the universe of feasible choices has an equal probability of being selected. This method is straightforward and ensures that all potential choices have an equal chance of inclusion, which can provide a broad representation of possible alternatives. In contrast, in importance sampling approaches, some destinations are considered more desirable based on variables like size and distance. Monte Carlo simulations sample destinations from an importance probability distribution, requiring correction factors to maintain the properties of the maximum likelihood estimator.
Stochastic methods that incorporate these sampling techniques have been extensively applied in transportation and urban planning research. Jonnalagadda et al. employed a combination of stratification by importance and random sampling to generate choice sets for work and intermediate trips in San Francisco (
Recent Advances in Choice Set Generation
As mentioned before, recent advances in choice set generation using ML and deep learning (DL) algorithms have been primarily applied in route choice models. Because of the similarities between route choice models and destination choice models, particularly in handling many alternatives, these advanced methods hold significant potential for application in choice set generation for destination choice models.
In this regard, Yao and Bekhor proposed a novel data-driven framework for choice set generation in route choice modeling by combining ML techniques with observed route data (
VAEs have been used for choice set generation in route choice modeling. Yao and Bekhor proposed using VAEs to generate choice sets by maximizing the likelihood of including chosen alternatives (
In summary, traditional choice set generation methods, whether deterministic or stochastic, face limitations that can affect their practicality and accuracy. Deterministic methods often exclude feasible alternatives because of strict criteria, while stochastic methods, despite their flexibility, may still struggle with retrieving the true consideration set and rely on simplified representations of the decision-making process. Both approaches have inherent challenges that can affect the reliability of destination choice models. Recent advancements in using ML methods for choice set generation highlight the potential of these techniques to enhance the predictive power and reliability of choice models. Despite these advancements, there remains a significant gap in evaluating the adoption of such advanced methods for choice set generation in destination choice modeling. Given the features of VAEs, such as their self-supervised nature which is advantageous for handling unlabeled data, and their strength in learning the latent distribution of observed samples, along with promising results in previous studies, this paper implements a VAE approach to generate choice sets in destination choice modeling. Our findings demonstrate the effectiveness of this approach in this new context.
Methodology
The methodology for choice set generation in destination choice modeling consists of three main components: choice set generation using VAEs, computing the implicit availability/perception of alternatives, and the discrete choice model for destination choice prediction. This section closely follows Yao and Bekhor and we refer interested readers to their paper for further discussions and in-depth explanations of the mathematical foundations (
Choice Set Generation Using VAE
The VAE model architecture comprises an encoder and a decoder. The encoder maps observed alternatives to a lower-dimensional latent space capturing the essential features of the alternatives. The decoder reconstructs the alternatives from the latent space, generating new samples consistent with the observed data.
The encoder
where
The decoder
where
The VAE model aims to maximize the likelihood of including chosen alternatives in the generated choice set by learning the underlying distribution of the alternatives. The underlying distribution learned by the VAE represents the distance distribution of chosen trip destinations across census tracts in the study area. Specifically, it captures the normalized distance between origin and destination zones as the primary attribute. This focus on distance ensures the generated alternatives reflect realistic spatial patterns observed in the data and aligns with its critical role in destination choice modeling. While the VAE framework is capable of incorporating multiple distributions, such as population density, total residents, and trip density, this study focuses on distance as a foundational attribute for model evaluation. Including additional distributions in the latent space would enhance the richness of generated choice sets and is an important direction for future work.
The objective function, specifically formulated in this paper for training the VAE, includes three main components: the reconstruction loss, the Kullback-Leibler (KL) divergence, and the total correlation regularizer. This combination helps in learning a compact representation of the data while promoting independence among the latent variables. The objective function is expressed as:
where
The reconstruction loss (mean squared error) measures how well the model reconstructs the input data, ensuring that the generated alternatives are close to the original ones (
The training procedure for the VAE involves several key steps. First, the attributes of the observed alternatives are normalized to ensure consistent scaling across features. Next, the encoder maps the preprocessed input data into a lower-dimensional latent space, capturing the underlying structure of the alternatives in relation to compact latent variables. Once these latent representations are learned, the decoder reconstructs new alternatives by sampling from this latent space, generating alternatives that follow the same underlying distribution as the observed data. Finally, the model optimizes an importance-weighted log-likelihood objective, ensuring that the reconstructed and generated alternatives remain statistically consistent with the patterns observed in the original dataset.
Implicit Perception of Alternatives
In choice modeling contexts, individuals typically do not explicitly reveal all alternatives they consider; instead, some alternatives are implicitly perceived or considered within their choice sets without direct indication. We define the implicit availability/perception of alternatives as the likelihood that an alternative is implicitly included within an individual’s choice set, even though such perception is not directly observable or explicitly stated. This approach assumes that each alternative has an implicit degree of availability or perception represented as a probability. To quantify these implicit perception probabilities, we use the Bayesian conditional (BC) metric (
where
Because of the intractable nature of the posterior distribution
Consequently, the
Further theoretical details and broader context are comprehensively discussed by Yao and Bekhor (
Alternative Generation Steps
Draw
Draw a new alternative
Implicit Perception Estimation Steps
Draw
Draw from the decoder
Draw from prior distribution
Compute
Repeat steps 1–4 for
Calculate
To validate the generated choice sets, we adopted a two-step approach. First, we compared the distribution of generated alternatives with the distribution of chosen alternatives in the dataset. This comparison involved visual methods, such as density plots, and statistical measures. These methods were employed to assess how well the generated choice sets represent observed behavior.
In the second step of validation, the generated alternatives were used to create choice sets for individuals, incorporating the
To provide robust benchmarks, we included two traditional methods for choice set generation: simple random sampling and importance sampling. Simple random sampling serves as a baseline, where alternatives are selected purely at random, without any behavioral considerations. The importance sampling method was implemented following a gravity-based framework introduced by Ben-Akiva and Lerman (
where
The implementation of importance sampling involves several steps. First, a size measure, such as population density or area, is assigned to each destination zone. When specific size data is unavailable, zones are assumed to have equal size, making distance the primary determinant of selection probability. The distance decay parameter
Discrete Choice Model for Destination Prediction
We employ an MNL model for destination choice prediction. The utility function
where
The probability of choosing alternative
Figure 1 illustrates the methodology employed in this paper for choice set generation and destination choice modeling using a VAE. The process starts with data preprocessing to normalize the data. The VAE generates choice sets through an encoder, which compresses input data into latent variables, and a decoder, which reconstructs the data. The encoder consists of two hidden layers producing mean (

Process of implementing the suggested variational autoencoder (VAE) approach in destination choice modeling.
Study Area and Data
The primary focus of this study is the Island of Montreal, a part of the Greater Montreal metropolitan area. The study area covers a population of approximately 1.9 million people over an area of 462 square kilometers.
The dataset used in this study originates from the Enquête Origine-Destination 2018 survey which has been conducted every 5 years since 1970 (
The OD data covered the Greater Montreal area, but in this study, we filtered our data and focused on the Island of Montreal, as most trip destinations were located within this area. Moreover, the data were filtered to focus specifically on home-based work trips. Individuals under 16 years old and trips with unreasonable travel times (exceeding 3 hours or having zero travel time) were excluded from the analysis. The final database comprised 19,389 observations, representing 4.92% of the total dataset. Figure 2 shows the density distribution of each zone selected as a destination in the study area.

Density (per square kilometer) distribution of each zone selected as a destination in the study area.
Census tract data further enriched the dataset, providing additional attributes such as population, number of dwellings, employment rate, density, and the number of occupations in different categories. The study area was divided into 533 census tract zones, forming the basis for analyzing travel patterns and modeling destination choices in the Island of Montreal. Figure 3 illustrates the distribution of total number of all kinds of residents’ occupations across census tracts and the distribution of population density per square kilometer in each census tract of the study area.

Distribution of: (
Travel time and distance were initially considered as impedance measures for each trip between the zone centroids. However, because of a high correlation between distance and travel time, the models were estimated using only the distance attribute. Table 1 provides summary statistics of the data and Table 2 details the attributes used in the estimated destination choice models.
Summary Statistics of the Data Features
Summary of Attributes Applied in Destination Choice Models
Results and Discussion
To train the VAE, a vector of normalized distances between the origin and all destinations is used as the attributes for each chosen alternative. The VAE is then trained to learn the distribution of chosen alternatives. Based on this distribution, 30 new alternatives are generated for each observation, resulting in a total of 581,670 generated alternatives (30 alternatives per observation multiplied by 19,389 observations). The BC term for these generated alternatives is calculated and incorporated into Equation 8 to predict destination choices. We normalized the input data using MinMaxScaler and used the Tanh activation function in hidden layers of both the encoder and decoder. Batch normalization and a dropout rate of 0.5 were applied to improve training stability and prevent overfitting. Weights were initialized using He initialization (
Specifications of Neural Network Variational Autoencoder Model
The primary criteria for evaluating each hyperparameter combination were the alignment of the generated alternatives’ distribution with the chosen alternatives’ distribution and the final log-likelihood of the destination choice model. We visually and statistically compared the density plots of generated and chosen alternatives to assess how well the VAE model captured the underlying data distribution, and we used higher log-likelihood values to indicate better performance in capturing the observed data. Based on these evaluation criteria, we selected the hyperparameters that resulted in the generated alternatives having a distribution most similar to that of the chosen alternatives and achieved the highest log-likelihood in the destination choice model.
Alternative Generation Using VAE
To analyze the performance of generated attributes of alternatives by the VAE in recognizing the distribution of the chosen alternatives, we used a KDTree algorithm for efficient nearest neighbor search (
Evaluating the quality of generated choice sets is inherently challenging because of the unobservable nature of true consideration sets. In our study, we compared the distribution of chosen alternatives from the survey with the distribution of alternatives generated by VAE. While this is not a common approach in the literature, traditional deterministic and stochastic methods do not explicitly aim to capture the underlying distribution of consideration sets. By contrast, the VAE approach seeks to approximate the true distribution of individuals’ consideration sets, making this comparison a meaningful way to evaluate its performance.
The plot shown in Figure 4 presents the comparison between the density of chosen alternatives and generated alternatives by VAE across all zones. The chosen alternatives show specific patterns and the alignment of peaks and variations in the density of generated alternatives with those of the chosen alternatives suggests that the VAE model accurately learns the underlying data distribution, enabling it to generate plausible and diverse alternatives. However, the trained VAE model tends to produce higher density peaks in certain zones, particularly noticeable at the first peak. This happens because the model might generate alternatives biased toward regions with higher prior probabilities when sampling from the latent space, causing the generated density to be more concentrated around popular zones. This bias could actually be beneficial for the choice set generation case, as zones with higher probabilities of being chosen are more represented in the choice sets. Further validation of the quality of generated choices will be discussed in the estimation results of the destination choice model with these generated choices by VAE. Also, to quantitatively assess the similarity between these distributions, we employed two metrics: the histogram intersection index and the Jensen-Shannon (JS) divergence. The histogram intersection index, introduced by Swain and Ballard, measures the overlap between the density distributions of the chosen and generated alternatives (

Density plot of chosen and generated alternatives.
The JS divergence, as introduced by Lin, is a symmetric measure of divergence between two probability distributions, providing an additional perspective on the similarity between the distributions (
Further validation is provided by the cumulative distribution functions (CDFs) of chosen and generated alternatives in Figure 5. The VAE-generated alternatives exhibit a cumulative distribution that closely follows the trajectory of the chosen alternatives’ CDF, further validating the effectiveness of the VAE model in replicating the observed choice behavior.

Cumulative distribution functions of chosen and generated alternatives.
To provide additional support for the behavioral realism of choice sets generated by different methods, we analyzed how frequently each generated set included the actual chosen destination. Figure 6 provides a visual comparison among the VAE-based method, importance sampling, and simple random sampling, each generating sets of 30 alternatives per individual. As depicted in Figure 6, the VAE-generated choice sets included the actual chosen alternatives in approximately 25.15% of cases, significantly outperforming importance sampling (7.66%) and simple random sampling (5.59%). These results provide additional support that the VAE approach generates choice sets that more closely reflect observed decision-making behavior.

Inclusion frequency of chosen alternatives by choice set generation method.
Model Estimates
The model estimates for the destination choice models provide a comparative analysis of the performance of four different models: the destination choice model with all alternatives available (DC-FA), the destination choice model with choice sets generated by a simple random sample method (DC-SRS), the destination choice model with choice sets generated by importance sampling (DC-IS), and the destination choice model with choice sets generated by VAE (DC-VAE). All these models follow the MNL framework.
Validation is essential to prevent overfitting, where the model fits the estimation dataset well but performs poorly on other datasets. The validation process involves estimating the model with a subset of the total dataset, then applying the estimated parameters to the remaining validation subset. Accuracy is used as a performance measure. In this study, 80% of the database (15,517 out of 19,389 observations) is used for estimation and 20% (3,872 out of 19,389 observations) for validation. The percentage match rate between estimated and observed individual choices is analyzed as a performance metric. Performance measures are detailed in Table 4, including final log-likelihood,
Goodness-of-Fit and Prediction Performance Results
The goodness-of-fit and prediction performance metrics provide a comparative evaluation of the destination choice models. While DC-IS achieves the best (least negative) final log-likelihood at −36,295.9, followed by DC-VAE at −41,566.3, DC-SRS at −39,815.2, and DC-FA at −82,497.8, final log-likelihood alone does not fully capture overall model performance. Each model’s strengths must also be assessed in relation to their explanatory power and predictive accuracy, where clear differences are observed.
The
Prediction accuracy index measures the percent of correct predictions which highlights further distinctions between the models. DC-VAE achieves the highest accuracy at 49.61%, followed by 41.19% for DC-IS, 30.10% for DC-SRS, and 10.82% for DC-FA. The improved accuracy of DC-VAE underscores its ability to generate realistic choice sets that closely align with observed behavior, thereby enhancing predictive performance. DC-IS also demonstrates strong predictive accuracy compared with DC-SRS and DC-FA, confirming the effectiveness of probabilistic approaches such as importance sampling in creating behaviorally consistent choice sets.
To investigate the practicality and scalability of the proposed VAE method in large-scale applications, we compared the computational complexity and resource consumption across the different choice set generation methods (DC-FA, DC-SRS, DC-IS, and DC-VAE) and subsequent discrete choice model estimations. All experiments were executed on a system with an Intel Core i7 14700K CPU, 32 GB of RAM, and Nvidia RTX 4000 Ada GPU. The VAE model was implemented using PyTorch, while discrete choice models were estimated using the Apollo package (version 0.3.2) (
Computational Times for Different Methods
The VAE approach has an initial non-trivial computational overhead associated with training (approximately 55 s for 100 epochs), but the training is a one-time investment. After training, the VAE’s inference phase for generating choice sets is extremely efficient, requiring only about 2 s, which is significantly faster than the importance sampling method (70 s). Although the simple random sampling (DC-SRS) method involves negligible computational effort because of its straightforward random draw process, it performs considerably worse in predictive accuracy than the VAE-based approach. Furthermore, while the DC-IS method offers improved predictive accuracy over DC-SRS, it still requires substantially longer inference times and exhibits some behavioral inconsistencies in its parameter estimations. In contrast, the DC-VAE method strikes a balance: after its initial training period, it achieves excellent inference-time efficiency combined with superior predictive performance.
The parameter estimates for the destination choice models are presented in Table 6. The results indicate that all parameter estimates are statistically significant at the 5% level, as evidenced by t-test values exceeding the critical threshold of 1.96. Across most models, the coefficient for distance (B_DIST) is negative, aligning with the well-established behavioral principle that individuals prefer closer destinations, typically characterized by a negative exponential relationship between distance and destination utility. This negative exponential form represents the diminishing likelihood of selecting a destination as its distance increases, reflecting realistic travel cost minimization and distance decay effects. However, the positive B_DIST coefficient in DC-IS contradicts this expectation and suggests a deeper examination. In the importance sampling approach, alternatives closer to the origin are disproportionately oversampled in the choice set. This results in a choice set where the actual chosen destination often has a longer distance than the majority of sampled alternatives. When the model estimates the likelihood of choosing a destination, it finds that longer distances are systematically associated with chosen alternatives relative to the sampled set, leading to an artificially positive relationship between distance and destination selection. This is a methodological artifact rather than a genuine behavioral preference. Our findings underline that the importance sampling approach, while offering improved predictive performance compared with simpler sampling methods such as random sampling, inherently carries significant methodological bias. Therefore, using importance sampling naively in choice set generation can lead to unrealistic behavioral interpretations and compromised parameter validity.
Parameter Estimation Results
In continuation, the negative coefficient for total zone population (B_POP) across all models suggests that zones with higher populations are less likely to be chosen. This effect is strongest in the DC-IS model and weakest in the VAE-based model, reflecting variations in how these models account for population effects. For total private dwellings (B_TPD), the positive coefficient demonstrates that zones with more dwellings are more likely to be chosen, with the strongest effect observed in the DC-FA and DC-SRS models, followed by DC-IS, while the VAE-based model shows a more modest effect. The negative coefficient for population density (B_PDENS) suggests that higher density zones are less attractive as destinations, with the effect being most pronounced in the DC-IS model and least in the VAE-based model. The coefficient for employment (B_EMP) is positive in most models, indicating that zones with higher employment are generally more attractive. However, the DC-IS model shows a negative coefficient, which could suggest complex interactions within the choice set generated through importance sampling. Employment-related attributes, such as B_BFA, B_NAS, B_HLTH, and others, exhibit varying effects across the models, highlighting differences in how each method interprets these attributes. For instance, the positive coefficient for B_NAS is strongest in the DC-IS model, whereas the negative coefficient for B_HLTH is most pronounced in the VAE-based model. These variations underscore the differing strengths and limitations of each model in capturing the impact of destination attributes on choice behavior.
By closely examining the estimated parameters in Table 6 alongside the performance metrics in Table 4, we see that the DC-VAE model produced the most intuitive and realistic parameters, as well as showing strong overall performance. This alignment suggests the DC-VAE method better captures genuine decision-making behavior. In contrast, simpler methods such as DC-FA and DC-SRS, despite yielding generally reasonable parameter estimates, struggled to achieve similarly high predictive accuracy and model fit. This emphasizes that both behavioral realism and predictive performance need to be jointly considered when comparing alternative approaches. The DC-IS method fell between these extremes: it improved predictive performance compared with simpler methods but still exhibited some behavioral inconsistencies in its parameter estimates.
According to Bernardin et al., the most frequently used validation method for destination choice models involves comparing trip length frequency distributions (

Comparison of trip length distributions.
To ensure a fair and robust comparison, kernel density estimation with a standardized bandwidth was applied to smooth all distributions consistently, and the analysis was restricted to the range of 0–40 km, which covers the most relevant distances in the dataset. The KL divergence results demonstrate that the DC-VAE model achieves the smallest divergence (0.0197), indicating the closest match to the real data. The DC-IS and DC-SRS models follow with divergences of 0.0694 and 0.0891, respectively, showing reasonable alignment but some discrepancies. Conversely, the DC-FA model has the largest divergence (4.8588), highlighting significant deviations from the real data. These findings confirm that the DC-VAE model not only visually aligns better with the real data but also statistically provides the most accurate representation of travel distance distributions.
Further illustration of the comparison of distance distributions between origin and destination across the four models and the real data using box plots (Figure 8) provides additional insights. The real data show a relatively wide spread with a median distance around 10 km and a broad interquartile range (IQR), indicating significant variability in travel distances. Several outliers at longer distances highlight the diversity of observed destination choices.

Distance frequency distribution comparison across models and real data.
The DC-FA model underestimates the spread of the real data, with a narrower IQR and a lower median distance. It fails to effectively capture longer-distance choices, as shown by the smaller number of outliers. This misrepresentation may stem from its rigid assumption of equal availability for all alternatives, which disregards key behavioral factors. The DC-SRS model shows a spread closer to the real data than DC-FA but has a slightly lower median. It captures some longer-distance choices but does so less effectively than the real data, as evidenced by a smaller number of outliers and a flatter distribution overall. The DC-IS model demonstrates notable improvements over both DC-FA and DC-SRS. It has a wider IQR and a median that is closer to the real data. Additionally, it captures a reasonable number of outliers, representing longer-distance trips more accurately. However, DC-IS slightly overestimates choices at longer distances, which introduces some deviations from the real data. The DC-VAE model aligns most closely with the real data, effectively capturing the median, IQR, and outliers. It replicates the spread of travel distances observed in the real data and accurately represents longer-distance trips.
Overall, the box plot analysis highlights that DC-VAE is the best-performing method, as it most closely matches the observed variability and central tendency of the real data. DC-IS follows as the second-best method, with noticeable improvements over DC-SRS and DC-FA. This analysis reinforces conclusions from the density plot comparison, further confirming that the VAE-based model offers the most effective approach for capturing realistic destination choice behavior.
Conclusions
This study has explored the potential of using VAEs to enhance choice set generation in destination choice models, focusing on home-based work trips in Montreal, Canada.
The VAE method integrates variational Bayesian techniques with autoencoders to estimate the probability distribution for the underlying choice set generation process, aiming to maximize the likelihood of including the chosen alternatives in the choice set. Initially, the VAE maps alternatives to a lower-dimensional latent space using an encoder model, then generates new alternatives with a decoder model based on these latent representations. The choice set is implicitly created by sampling from this latent space and feeding the samples into the decoder model. Additionally, the VAE model can produce the implicit perception measure when generating new alternatives.
The findings from this study revealed the VAE model’s capability to effectively capture the underlying distribution of chosen alternatives, producing plausible and diverse alternatives. This capability was demonstrated through density plots and CDFs, which exhibited a high degree of similarity between the chosen and generated alternatives. Such visual and quantitative alignment suggests that VAE can mimic the decision-making process of individuals, thereby generating realistic and varied choice sets that enhance the robustness of the destination choice model. This alignment underscores VAE’s potential to address the biases and limitations associated with traditional deterministic and stochastic methods which often fail to capture the full spectrum of potential alternatives.
Moreover, the application of the VAE method to generate choice sets for destination choice models was validated using real-world data and several model structures: DC-FA, DC-SRS, DC-IS, and DC-VAE. The results indicated that models utilizing VAE-generated choice sets (DC-VAE) outperformed those with traditional choice sets with regard to model fit and predictive accuracy. Specifically, the model’s
While this study focused on destination choice modeling for home-based work trips, future research should extend the VAE-based approach to other trip types and different urban areas. This would involve comparing the prediction accuracy of discrete choice models with choice sets generated by VAEs across various contexts and trip purposes. Additionally, using actual job data instead of resident occupation data for census tracts may provide more explanatory power in destination models, making it a valuable direction for future research. Future studies should also examine the performance of destination choice models with various choice set sizes generated by different methods. This would help identify the optimal choice set size and method that balances model complexity with predictive accuracy. Furthermore, integrating VAEs with other DL techniques developed for choice modeling (e.g., Wang et al., Wong and Farooq, and Kamal and Farooq) could enhance model performance and predictive accuracy (
Investigating whether the training process of the VAE and the ML-based choice modeling can be done simultaneously presents an exciting opportunity. This simultaneous training could streamline the modeling process and potentially lead to even more-accurate and efficient models, as the generative and predictive components would be optimized together. Another future direction could be to adopt other methods for choice set generation in route choice modeling, such as Metropolis-Hastings-based sampling and path size logit-based sampling, for choice set generation in destination choice modeling (
Although the VAE-based approach shows promise, it also has potential limitations. The quality of generated choice sets depends heavily on the quality and quantity of the training data. Additionally, while the computational complexity of VAEs can be significant during training, they benefit from modern hardware acceleration (e.g., GPUs) and efficient optimization algorithms, making them scalable for large datasets. Compared with stochastic methods, VAEs offer a unified framework for choice set generation that avoids the need for iterative sampling during inference. Future work should explore techniques for further optimizing VAE training and evaluating the impact of different data preprocessing methods on choice set generation.
Footnotes
Acknowledgements
The authors would like to thank Autorité régionale de transport métropolitain (ARTM) for providing the origin-destination survey data used in this study.
Author Contributions
The authors confirm contribution to the paper as follows: study conception and design: H. Haghi, Z. Patterson, B. Farooq; data collection: xxx; analysis and interpretation of results: H. Haghi, Z. Patterson, B. Farooq; draft manuscript preparation: H. Haghi, Z. Patterson, B. Farooq. All authors reviewed the results and approved the final version of the manuscript.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Canada First Research Excellence Fund and the Bridging Divides program funded under it.
