A Comprehensive Business Location Choice Model Leveraging Machine Learning in Systematic Choice Set

Abstract

This study develops a comprehensive two-stage location choice framework for business establishments as part of goods movement modeling. It aims to formulate a systematic methodology for investigating the location choice of business establishments within Halifax Regional Municipality. The study presents a novel approach that leverages machine learning techniques to generate a systematic choice set, thereby improving the representation of realistic and reasonable location alternatives. The 2022 Info Canada Business Establishments dataset is used for this purpose. Combining an unsupervised machine learning technique with the mixed multinomial logit model provides a data-driven way to improve the precision and robustness of business establishment location choice models. This approach can reveal latent patterns and heterogeneity among potential choice alternatives that may remain obscured under a conventional multinomial logit model. The analysis offers insights into the factors influencing the location choice of business establishments. The findings suggest that wholesalers prioritize proximity to highways and positions within business parks for their operations while avoiding higher population density and central business district proximity. Transportation businesses seek larger sites and locations near highways, favoring clustering with related transport companies and valuing accessibility and cost-effectiveness over proximity to business parks or rural settings. The findings of this study could provide valuable insights for commercial vehicle and goods movement modeling, business location strategies, and policymaking concerning sustainable urban development.

Keywords

location choice, goods movement, machine learning, systematic choice set, business establishment

Business establishments play a crucial role in spatial development and are a fundamental component of integrated land use and transportation models. As the primary sources of economic activity, business establishments draw and generate a significant number of personal and commercial trips in the morning and evening peaks ( 1 , 2 ). Commercial trips involving both light and heavy-duty trucks lead to traffic congestion and contribute significantly to emissions, including greenhouse gases (GHGs). Concern is growing because commercial vehicle emissions have trended upward over the past few decades ( 3 , 4 ). Although the existing literature has extensively examined passenger travel demand, travel demand forecasting has not given comparable attention to commercial vehicles ( 5 – 7 ). The volume and frequency of commercial vehicle movements, along with the resulting emissions, are greatly influenced by the location of businesses. Therefore, a better understanding of the process involved in choosing business establishment locations is essential to develop a comprehensive freight demand model.

The location choice of business establishments is a discrete choice involving two distinct steps: the search process and the location choice process ( 8 ). During the search process, businesses gather information, evaluate market conditions, and identify potential locations that meet their specific requirements ( 9 ). This search procedure attempts to create a group of alternatives that are feasible and realistic. Following the search process, businesses carefully evaluate the shortlisted options based on predetermined criteria and preferences before making a decision on the final location ( 10 ). However, most existing studies on the location choice of businesses have focused primarily on the second step while neglecting the initial search process ( 11 ). The location search step is imperative for reliable and precise location choice estimations and predictions, and by ignoring this step, the findings might be unable to capture the full complexity of businesses’ location decisions.

The business establishment location choice model is primarily implemented within the framework of multinomial logit (MNL) models ( 12 , 13 ). Most earlier studies used random sampling of alternatives to generate the choice set ( 14 ). However, these methodologies do not fully depict the behavioral dynamics of a business establishment’s location search process. One or more implausible alternatives are therefore quite likely to enter the choice set, resulting in biased parameter estimates for the choice model. Few studies have attempted to create a systematic choice set to address this issue ( 10 , 14 ). Moreover, business location choice decisions are influenced by various factors, including distance to the central business district (CBD), proximity to transport infrastructure, socioeconomic attributes, and agglomeration factors ( 12 , 15 ). Considering several characteristics at once to generate a reasonable choice set requires building different types of location alternatives, which may be challenging from a methodological and computational perspective. Therefore, the primary research question of this study is how to develop a more sophisticated approach that captures these multi-dimensional factors and yields a better location choice model of businesses. To this end, this study advances a novel machine learning (ML)-based clustering method to generate a plausible choice set for business location choice modeling.

The rest of this paper is organized as follows: In the next section, a literature review synthesizing establishment location choice models is presented. The study area and relevant data are then introduced. Following this, the conceptual framework of the study is outlined. The ensuing section details the methodology employed in the study. The subsequent section unveils the findings of this study, accompanied by a thorough discussion. Finally, the paper concludes by summarizing key findings and suggesting directions for future research.

Literature Review

Integrated land use and transportation models have primarily focused on residential location choices and associated commuting patterns. However, businesses play a crucial role in generating trips involving commercial vehicle movements. Although the development of large-scale integrated models has grown over the last two decades, commercial vehicles have been underrepresented in travel demand modeling. In recent years, there has been an increase in the number of commercial vehicle movement models ( 16 ). Most existing studies have implemented freight demand forecasting modeling utilizing two main methods: (1) commodity-based modeling and (2) trip/tour-based modeling. Commodity-based modeling focuses on analyzing the flow of goods based on their characteristics, origin, destination, and volume, providing insights into infrastructure needs and policy impacts ( 17 ). On the other hand, trip/tour-based modeling examines individual vehicle trips and interactions with the transportation network, offering detailed insights into congestion, travel times, and operational decisions ( 18 ). In both commodity-based modeling and trip/tour-based modeling approaches for freight demand modeling, the location of business establishments plays a vital role.

Business location choice modeling is a multidisciplinary field that draws insights and methodologies from various academic disciplines ( 19 , 20 ). The location decisions of businesses can often be categorized into three main factors: accessibility, office profile, and business profile ( 21 ). These factors encompass various aspects that influence where a business chooses to locate its operations ( 22 , 23 ). Transportation and infrastructure are critical considerations, as businesses seek locations with good access to highways, airports, ports, and public transit systems ( 24 – 27 ). Such proximity minimizes transportation costs and facilitates the efficient movement of goods and personnel. The locational preferences of businesses are influenced by household income, population density, and proximity to the CBD ( 24 , 28 ). High household incomes attract businesses seeking affluent consumers, densely populated areas appeal to those targeting larger customer bases, and proximity to the CBD affects accessibility and networking opportunities for various firms. Additionally, businesses often favor locations with agglomeration benefits, where the concentration of similar or related industries fosters synergies, knowledge-sharing, and improved access to skilled labor and resources ( 29 – 33 ). In the existing literature, a limited number of studies specifically address business establishment location choice from a freight perspective, focusing on the movement and transportation of goods and commodities ( 34 – 38 ).

Business establishment location choice modeling involves considering a wide range of available alternatives for a business, which can vary from hundreds at the zonal level to thousands at the parcel level. The two-step business establishment location choice model that aims to mimic the search process has some difficulty in dealing with such a wide range of spatial alternatives. One of the most widely adopted techniques to address the challenges associated with many alternatives is random sampling of alternatives. Manski introduced a two-step discrete-choice modeling framework ( 39 ), wherein the first stage involves deriving a subset of choice alternatives from the universal choice set. This initial selection can be executed by applying predetermined criteria for choice set selection or through random selection ( 40 , 41 ). McFadden demonstrated that a subset including the observed choice and a random sample from the potential choices might be used as a substitute for the entire choice set ( 42 ). Although random sampling-based approaches offer computational advantages, it is important to be aware of the potential bias and inconsistent parameter estimates that can arise through improper representation of alternative choice sets ( 43 ). Several studies on modeling the location choice of business establishments have been conducted to address this issue, utilizing observed data to create structured choice sets. For instance, Elgar et al. modeled the location search of businesses using two geographical anchor points: the existing location and the location of the firm’s owner ( 10 ). The choice set was then constructed by drawing an ellipse around these two points, and an MNL model based on random sampling was estimated using the constructed choice set. De Bok and Sanders applied route choice modeling principles to generate systematic choice sets for each identified business relocation ( 14 ). A representative collection of possible business locations was constructed by developing progressive subsets of alternatives ( 44 ). These systematic choice sets have each considered a specific location choice attribute, although multiple factors affect the location choice process of businesses. Recently, there have been attempts to utilize ML techniques in the residential location choice model to develop a systematic choice set. For instance, Orvin and Fatmi employed the Gaussian mixture model for their residential location search model ( 45 ). However, the application of ML in business location choice models remains scarce in the literature.

The literature review suggests that there are limited studies which represent the behavioral process of business location choice. Therefore, the primary research question of this study revolves around developing a more refined approach to include the multi-dimensional location choice attributes in the two-step location choice model of businesses. The significance of this research lies in its potential to deepen the understanding of business location choice behavior, improve predictive precision, guide strategic decision-making for businesses, inform policymaking, and optimize resource allocation. In essence, bridging this research gap holds both theoretical implications and practical applications for urban freight movement. In this study, a novel ML approach has been introduced in the two-step location choice model of the business establishment to generate a finite number of plausible alternatives while precisely preserving the multi-dimensional location choice attributes in a single frame. An extensive dataset, Info Canada Business Establishment data for 2022, is utilized in this study to develop the location choice model of businesses as part of a comprehensive freight demand modeling for Halifax Regional Municipality (HRM).

Contribution of the Study

This study contributes to the transportation and land use modeling and geography literature by modeling the location choice of business establishments. The strength and novelty of this study lie in its proposed framework, which combines ML techniques with conventional econometric modeling to develop a comprehensive two-stage location choice model of business establishments. One substantial contribution is the adoption of an unsupervised ML algorithm in the first stage of the location decision to handle the multi-dimensionality of the factors that influence business location choices. A Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm was implemented to identify clusters of locations based on multi-dimensional attributes of business location choices. Following the clustering process, econometric modeling (logit/mixed logit model) was used to model the location choice of businesses as part of comprehensive freight demand modeling.

Potential of DBSCAN for Generating Business Location Choice Sets

DBSCAN holds significant potential for generating business location choice sets. It is a popular clustering algorithm that efficiently identifies dense regions in spatial data. When applied to business location data, it can help identify clusters of potential business sites that share similar characteristics and are located near each other. Its applicability lies in its ability to handle noise and outliers in the data, which is crucial when dealing with real-world business location datasets that may contain inaccuracies or irregularities. By eliminating noise and focusing on dense regions, DBSCAN can identify meaningful clusters of potential business locations, helping decision-makers understand the distribution and spatial patterns of suitable areas for new business establishments. In addition, unlike some other clustering algorithms, it does not require the number of clusters to be specified beforehand. This attribute is particularly advantageous in generating business location choice sets, as it allows for a more flexible and adaptive approach when dealing with varying market demands and changing geographical conditions. Furthermore, its ability to detect irregularly shaped clusters makes it well-suited for capturing complex spatial patterns that might not be easily identifiable through other clustering methods. Consequently, it assists in creating more accurate and realistic location choice sets, providing valuable insights for businesses to make informed decisions.

Study Area and Data

The area of interest for this research is the HRM, the capital city of Nova Scotia, Canada (Figure 1). This region covers an estimated area of 5,577 square kilometers. A substantial dataset sourced from Info Canada Business Establishments for 2022 has been employed for this research. Table 1 summarizes information about the business establishments in the study area. This reliable and comprehensive dataset contains detailed records of firms with a 7-digit NAICS (North American Industry Classification System) code for the HRM. The data include valuable information about each establishment’s name, type, geographic location, total number of employees, sales volume, and year of establishment. In total, there are twenty different types of establishments. Businesses are categorized into five broad economic sectors—industry, retail, service, transportation, and wholesale—based on their NAICS code to group them more effectively (Table 1).

Figure 1.

Distribution of establishments within the study area—the Halifax Regional Municipality.

Table 1.

Descriptive Statistics of Info Canada Dataset 2022

Economic sector	Elements (parenthesis numbers indicate 2-digit NAICS code)	Observations (%)
Industry	Manufacturing (31–33)	3.32
	Construction (23)	9.43
	Agriculture, Forestry, Fishing and Hunting (11)	0.16
	Mining and Oil and Gas Extraction (21)	0.21
Service	Information and Cultural Industries (51)	1.85
	Professional, Scientific and Technical Services (54)	9.40
	Administrative and Support, Waste Management, and Remediation Services (56)	4.11
	Finance and Insurance (52)	5.50
	Arts, Entertainment and Recreation (71)	2.48
	Real Estate and Rental and Leasing (53)	5.11
	Accommodation and Food Services (72)	7.48
	Management of Companies and Enterprises (55)	0.04
	Utilities (22)	0.04
	Health Care and Social Assistance (62)	9.31
	Public Administration (92)	2.99
	Educational Services (61)	3.29
	Other Services (except Public Administration) (81)	12.98
Retail	Retail Trade (44, 45)	13.84
Transportation	Transportation and Warehousing (48, 49)	2.56
Wholesale	Wholesale Trade (41)	4.66

Note: NAICS = North American Industry Classification System.

For geospatial analysis, the exact longitude and latitude coordinates of each business were used to geocode them in ArcGIS Pro. Additionally, the 219 traffic analysis zones in the Halifax Transport Network Model ( 46 ) were spatially joined with the establishments to extract several zonal variables. This study uses entropy, which is a crucial metric for capturing the diversity and heterogeneity of businesses within a specific zone. It allows us to evaluate the range of establishments present in an area, shedding light on regions with dynamic and competitive commercial landscapes. A high zonal entropy value indicates a mixture of consumer preferences, empowering businesses to cater to a broader customer base and potentially discover niche markets that align with specialized offerings. The zonal entropy formula is as follows:

H (i) = - \sum_{k} \Pr (k) \times \log (\Pr (k))

(1)

where $H (i)$ : Zonal entropy value for zone $i$ , and $\Pr (k)$ : Proportion of establishments in zone $i$ belonging to industry type $k$ .
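As an illustration of Equation 1, the following minimal sketch computes the entropy of a single zone from a list of industry types. The function name `zonal_entropy` and the example zone are hypothetical, and the natural logarithm is an assumption, since the base of the logarithm is not stated above.

```python
import math
from collections import Counter


def zonal_entropy(industry_types):
    """Return H(i) = -sum_k Pr(k) * log(Pr(k)) for the establishments of one zone.

    industry_types: one industry label per establishment located in the zone.
    The natural logarithm is an assumption; another base only rescales H(i).
    """
    counts = Counter(industry_types)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty zone has no diversity to measure
    entropy = 0.0
    for count in counts.values():
        share = count / total  # Pr(k): share of establishments of type k
        entropy -= share * math.log(share)
    return entropy


# Hypothetical zone with four establishments in three sectors
example_zone = ["retail", "retail", "service", "wholesale"]
example_entropy = zonal_entropy(example_zone)
```

A zone containing a single industry type returns 0, and the value grows as establishments are spread more evenly across types.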

Additional information was collected from the 2021 Canadian Census. In this study, the location choice of a business establishment is considered at the parcel level. Table 2 presents the descriptive statistics of the explanatory variables used in the location choice model. Because the census data are presented in an aggregate manner, it was necessary to disaggregate the information to obtain data at the individual parcel level. To achieve this, this study followed the method proposed by Bracken and Martin ( 47 ), known as cross-area interpolation. This technique utilizes kernel estimation to convert a spatial distribution based on census centroid data into a continuous density surface. Subsequently, this density surface is overlaid onto parcels, making it compatible with other Geographic Information System data. By implementing this method, several variables were generated at the parcel level.

Table 2.

Descriptive Statistics of Explanatory Variables

Descriptive statistics	Minimum	Maximum	Mean
Parcel area (m²)	10.43	80,881.38	308.90
Distance to central business district (m)	0	92,233.72	9,223.37
Distance to highway (m)	8.15	82,777.46	4,468.84
Distance to business park (m)	0	115,952.00	9,223.37
Distance to bus stop (m)	1.80	9,223.37	3,972.54
Distance to local street (m)	0	9,223.37	51.50
Distance to mall (m)	0	92,233.72	9,223.37
Population density	1.1	18,897.80	1,489.98
Employment number	90	1,300	416.73
Zonal entropy	0.01	0.57	0.14
Industry count within 500 m	0	64	2.03
Retail count within 500 m	0	130	3.14
Transport count within 500 m	0	21	0.44
Wholesale count within 500 m	0	50	0.69

Conceptual Framework

A comprehensive two-stage location choice model of business establishments is developed in this study (see Figure 2). The developed business establishment location choice model operates at the parcel level, representing the utmost precision in its characterization.

Figure 2.

Conceptual framework for two-stage business establishment location choice model.

DBSCAN generates clusters of parcels based on the relative importance of the multi-dimensional business location choice attributes. It takes as input the data points, the value of epsilon ($ε$), which defines the maximum distance between two points for them to be considered neighbors, and the minimum number of points (minPts) required to form a dense region. The algorithm initializes all points as “unvisited.” It then takes the first unvisited point and finds its neighbors within the specified $ε$ distance. If the number of neighbors is greater than or equal to minPts, a new cluster is created and the current point is assigned to it; otherwise, the point is provisionally marked as noise. The algorithm then expands the cluster by visiting the unvisited neighbors and marking them as visited. If unvisited points remain, the algorithm continues with the next unvisited point. Once all points are visited, the algorithm ends, and every point is either assigned to a cluster or marked as noise. This approach enables DBSCAN to identify clusters of arbitrary shape while also detecting outliers as noise points. The generated clusters, with noise points removed, are then used as a reliable selection pool to create the choice set of a business. The decision to discard outliers from the selection pool follows from a specific objective: to identify viable business locations within the study area of 162,623 real estate objects, encompassing both residential and commercial real estate. DBSCAN helps identify and eliminate individual real estate objects considered as noise in the context of business location choice, such as those with unusual characteristics. These outliers may not align with the typical profile of a viable business location. Consequently, they are excluded during the development of business location choice sets, which keeps the focus on clusters with higher potential.
Next, econometric modeling is implemented to model business establishment location choice using the generated plausible choice sets. In this study, the final choice set consists of ten parcels, selected randomly but always including the predetermined (observed) parcel. Ten alternatives were used to limit computation time.
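The construction of a ten-parcel choice set can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the pool from which the random alternatives are drawn, so the sketch assumes they come from the same DBSCAN cluster as the observed parcel. The function `build_choice_set` and its arguments are hypothetical names.

```python
import random


def build_choice_set(chosen_parcel, cluster_labels, set_size=10, seed=None):
    """Sample a choice set that always contains the observed parcel.

    cluster_labels: dict mapping parcel id -> DBSCAN label (-1 marks noise).
    Assumption: the remaining alternatives are drawn at random, without
    replacement, from the cluster that contains the observed parcel.
    """
    rng = random.Random(seed)
    label = cluster_labels[chosen_parcel]
    if label == -1:
        raise ValueError("observed parcel is labelled as noise")
    # Candidate pool: every other parcel that shares the observed parcel's cluster
    pool = [p for p, lab in cluster_labels.items()
            if lab == label and p != chosen_parcel]
    if len(pool) < set_size - 1:
        raise ValueError("cluster too small for the requested choice set size")
    choice_set = [chosen_parcel] + rng.sample(pool, set_size - 1)
    rng.shuffle(choice_set)  # avoid always placing the observed parcel first
    return choice_set


# Hypothetical labels: parcels 0-19 in cluster 0, 20-38 in cluster 1, 39 is noise
labels = {parcel: (0 if parcel < 20 else 1) for parcel in range(40)}
labels[39] = -1
example_set = build_choice_set(chosen_parcel=3, cluster_labels=labels, seed=1)
```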

To depict the first stage of the location choice process, which mimics the search process, this study hypothesizes that businesses prioritize multi-dimensional location choice attributes when selecting a location. For instance, industries encompassing transportation and warehousing, real estate and rental and leasing, manufacturing, construction, and public administration exhibit a propensity to cluster within regions characterized by low population density. This phenomenon is conjectured to be attributable to several influential factors, notably the cost of land, accessibility to transportation routes, zoning regulations, availability of space for expansion, considerations pertaining to noise and environmental impacts, as well as the advantages arising from specialization and agglomeration economies. The most significant multi-dimensional location choice attributes are included in the unsupervised ML algorithm to categorize real estate objects for each establishment. The clusters generated from the multi-dimensional location choice attributes have the potential to provide a comprehensive understanding of the economic landscape and potential market opportunities in an area.

The second stage of the location choice model assumes that businesses maximize utility. In this stage, the decision-maker is assumed to evaluate a set of alternatives and choose the one that maximizes their utility or satisfies their preferences. This holistic approach aids decision-making processes related to business establishments, enabling stakeholders to identify optimal locations based on market size, competition level, and economic conditions.

Methodology

Density-Based Spatial Clustering of Applications with Noise

DBSCAN is a widely used clustering algorithm for multi-dimensional spatial data based on density. This algorithm seeks to effectively partition data points into clusters, concurrently detecting noise points that do not exhibit strong association with any cluster. This algorithm fundamentally depends on two key parameters: the neighborhood radius denoted as $ε$ , and the minimum number of points required to establish a dense region (minPts).

Let $X$ be a dataset consisting of $n$ data points { $x_{1}, x_{2}, . . ., x_{n}$ }.

A data point $x_{i}$ is considered a core point if the number of points within its neighborhood $N (x_{i})$ , defined by the threshold distance $ε$ , is greater than or equal to minPts. Mathematically, it is expressed as:

| N (x_{i}) | \geq minPts

(2)

The neighborhood $N (x_{i})$ of a data point $x_{i}$ consists of all data points $x_{j}$ from the dataset $X$ that are within a distance $ε$ from $x_{i}$ , formally represented as:

N (x_{i}) = \{ x_{j} \in X \mid dist (x_{i}, x_{j}) \leq ε \}

(3)

The reachability distance is a crucial concept in DBSCAN, which is the maximum of the distance between a core point $p$ and another data point $q$ (denoted as dist( $p$ , $q$ )) and the threshold $ε$ . The reachability distance is represented as:

Reachability distance (p, q) = \max (ε, dist (p, q))

(4)

This measure enables the identification of directly density-reachable points, wherein a data point $q$ is directly density-reachable from a core point $p$ if $q$ is included in the neighborhood $N (p)$ of $p$ , i.e., $q$ ∈ $N (p)$ .

Additionally, DBSCAN employs the concepts of density-reachability and density-connectivity. A data point $q$ is density-reachable from a core point $p$ if there exists a sequence of points { $p_{1}, p_{2}, . . ., p_{n}$ } such that $p_{1}$ = $p$ , $p_{n}$ = $q$ , and each $p_{i}$ is directly density-reachable from $p_{i - 1}$ for $i$ = 2, 3, ..., $n$ . Every point in the sequence except possibly $q$ must therefore be a core point. Furthermore, a data point $q$ is density-connected to a point $p$ if both $p$ and $q$ are density-reachable from a shared core point $o$ .

DBSCAN identifies clusters as non-empty sets of data points that satisfy specific conditions. A cluster $C$ contains points where each point $p$ in $C$ is directly density-reachable from some core point $q$ within $C$ , and all core points in $C$ are density-connected. Data points that are not part of any cluster are considered noise points or outliers. To identify optimal values for $ε$ and minPts in the DBSCAN algorithm, a comprehensive approach can be employed. The analysis commences with reachability plots, facilitating a visual assessment of distances between data points and aiding in the initial determination of a suitable ε. Following this, an iterative testing process involves experimenting with different ε and minPts values to fine-tune parameters while mitigating the risk of over-segmentation. To ensure a robust parameter selection, validation metrics like silhouette analysis and cluster stability measures can be applied, providing quantitative insights into cluster quality and consistency. Thus, DBSCAN effectively discovers clusters of arbitrary shapes and demonstrates resilience to noisy datasets, making it a valuable tool in various spatial data clustering applications.
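The definitions above can be tied together in a compact, pure-Python sketch of the clustering loop. It is meant only to illustrate the core-point test and cluster expansion, not to reproduce the implementation used in this study; Euclidean distance, the function names, and the toy coordinates are assumptions.

```python
import math

NOISE = -1
UNVISITED = None


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (the point itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [UNVISITED] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # not a core point; may later become a border point
            continue
        cluster_id += 1  # points[i] is a core point, so it seeds a new cluster
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # only core points expand the cluster
    return labels


# Toy data: two dense groups of four points and one isolated point
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11),
              (50, 50)]
toy_labels = dbscan(toy_points, eps=1.5, min_pts=3)
```

In this toy case the two groups form separate clusters and the isolated point is labelled as noise, mirroring the noise label −1 used for parcels that are excluded from the selection pool.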

Silhouette Coefficient

To assess the quality of clustering results obtained through the application of the DBSCAN algorithm, one of the commonly used metrics is the Silhouette Coefficient. The Silhouette Coefficient value represents the similarity of each data point to its assigned cluster compared with other clusters, with a range from −1 to +1. The Silhouette Coefficient plot showcases horizontal bars, one for each cluster, with their widths indicating the number of data points within the respective clusters. Interpretation of the Silhouette Plot involves examining the Silhouette Coefficient values for each cluster. Positive values close to +1 indicate well-clustered data points appropriately assigned to clusters, whereas values near 0 suggest data points near decision boundaries and suboptimal clustering, and negative values suggest data points that may have been assigned to the wrong cluster. The Silhouette Coefficient for a single data point $i$ is calculated using the following equation:

S (i) = \frac{b (i) - a (i)}{\max (a (i), b (i))}

(5)

where

$S (i) =$ Silhouette coefficient for data point $i$

$a (i) =$ average distance from $i$ to other data points in the same cluster (intra-cluster distance)

$b (i) =$ average distance from $i$ to data points in the nearest cluster $i$ is not a part of (inter-cluster distance).
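Equation 5 can be illustrated with a short sketch that computes $S (i)$ for one labelled point. Euclidean distance, the function names, and the toy data are assumptions, and noise points (label −1) are assumed to have been removed before the function is called.

```python
import math


def mean_distance(point, others):
    """Average Euclidean distance from point to each element of others."""
    return sum(math.dist(point, o) for o in others) / len(others)


def silhouette_coefficient(i, points, labels):
    """Return S(i) = (b(i) - a(i)) / max(a(i), b(i)) for data point i."""
    own = labels[i]
    same = [points[j] for j, lab in enumerate(labels) if lab == own and j != i]
    other_labels = set(labels) - {own}
    if not other_labels:
        raise ValueError("at least two clusters are required")
    if not same:
        return 0.0  # common convention for a cluster with a single member
    a = mean_distance(points[i], same)  # intra-cluster distance a(i)
    b = min(  # mean distance to the nearest other cluster b(i)
        mean_distance(points[i],
                      [points[j] for j, lab in enumerate(labels) if lab == c])
        for c in other_labels
    )
    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom


# Toy data: two well-separated clusters of two points each
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
toy_labels = [0, 0, 1, 1]
s0 = silhouette_coefficient(0, toy_points, toy_labels)
```

Averaging $S (i)$ over all clustered points gives a single score that can be compared across candidate values of $ε$ and minPts.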

Mixed Logit Model

This study has formulated distinct location choice models for each economic sector: Industry, Retail, Service, Wholesale and Transportation. To comprehensively address unobserved heterogeneity within business establishments in a given sector, this study employs a mixed multinomial logit (MMNL) modeling approach. Recognizing the diverse preferences inherent in businesses, the adoption of a standard logit model assuming homogeneity proves insufficient in adequately capturing these intricacies. The utilization of the mixed logit model allows for the accommodation of variations in preferences and considers unobservable factors that affect location decisions. This sophisticated modeling approach enhances predictive accuracy, thereby enabling businesses and policymakers to make more informed decisions.

Let $U_{ni}$ represent the random utility of an establishment $n$ derived from a chosen location $i$ . The random utility $U_{ni}$ can be described according to the following equation:

U_{ni} = V_{ni} + ε_{ni}

(6)

where

$V_{ni}$ = the deterministic part in the utility (a function of attributes related to location of businesses)

$ε_{ni}$ = a random error term (assumed to be identically and independently distributed across individuals and alternatives).

Here $β$ is the vector of coefficients that captures differences in preferences, and $f (β | θ)$ denotes its density function under the overall parameter $θ$ . In the scenario where there is no difference in preferences ( $β$ remains fixed), the probability $L_{ni}$ of choosing location $i$ from a choice set is calculated using the following equation:

L_{ni} (β) = \frac{e^{[V_{ni} (β)]}}{\sum_{j = 1}^{I} e^{[V_{nj} (β)]}}

(7)
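For a fixed $β$ , Equation 7 is the standard logit formula, which can be sketched as below. The function name and the example utilities are hypothetical; subtracting the largest utility before exponentiating is a numerical-stability device that leaves the probabilities unchanged.

```python
import math


def mnl_probabilities(utilities):
    """Return logit choice probabilities for one decision-maker.

    utilities: deterministic utilities V_nj, one per alternative in the choice set.
    """
    max_v = max(utilities)
    exp_v = [math.exp(v - max_v) for v in utilities]  # shifted to avoid overflow
    total = sum(exp_v)
    return [e / total for e in exp_v]


# Hypothetical utilities for a choice set of three parcels
example_probs = mnl_probabilities([1.2, 0.4, -0.3])
```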

In the mixed logit model, the parameter $β$ varies randomly across establishments. Consequently, the logit probability $L_{ni} (β)$ must be weighted by the density of $β$ and integrated over its distribution to obtain the unconditional choice probability that accounts for random preference differences. The resulting form of the mixed logit model is as follows:

P_{ni} = \int L_{ni} (β) f (β | θ) d β

(8)

The form of the distribution $f (β | θ)$ is typically chosen based on practical considerations and can include options such as a normal distribution, a lognormal distribution, or a uniform distribution, among others.

The expression for the random coefficient, utilizing a normal distribution as the mixing distribution, can be stated as follows:

β = \bar{β} + σ * η

(9)

where $\bar{β}$ is the mean of the random coefficient, $σ$ is the standard deviation of the random coefficient, and $η$ is a standard normal variable. With 200 Halton draws, the model was found to be stable and to yield reliable parameter estimates.
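Because the integral in Equation 8 has no closed form, it is usually approximated by simulation. The sketch below averages the logit probability over Halton-based draws of a single normally distributed coefficient, following Equation 9. It is a simplified illustration rather than the NLOGIT estimation routine: the single-attribute utility, the base-2 Halton sequence, and all names are assumptions.

```python
import math
from statistics import NormalDist


def halton(index, base=2):
    """Return the index-th element (index >= 1) of the Halton sequence."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def simulated_probability(x, chosen, beta_mean, beta_sd, n_draws=200):
    """Approximate P_ni by averaging logit probabilities over Halton draws.

    x: one attribute value per alternative (utility assumed to be beta * x_j).
    chosen: index of the alternative whose probability is returned.
    """
    inv_cdf = NormalDist().inv_cdf
    total = 0.0
    for r in range(1, n_draws + 1):
        eta = inv_cdf(halton(r))  # uniform Halton point -> standard normal draw
        beta = beta_mean + beta_sd * eta  # Equation 9
        v = [beta * xj for xj in x]
        max_v = max(v)
        exp_v = [math.exp(vj - max_v) for vj in v]
        total += exp_v[chosen] / sum(exp_v)  # Equation 7 evaluated at this draw
    return total / n_draws


# Hypothetical two-alternative example with a random coefficient
p_mixed = simulated_probability([1.0, 0.0], chosen=0, beta_mean=1.0, beta_sd=0.5)
```

When `beta_sd` is zero, every draw gives the same coefficient and the result collapses to the standard logit probability.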

The model is estimated in NLOGIT 6.0, and its goodness of fit is assessed using the log-likelihood function, the Akaike information criterion (AIC), and R² values.

Results and Discussion

Choice Set Generation Based on Cluster Analysis

In the clustering process, three distinct clusters (Cluster 0, Cluster 1, and Cluster 2) are identified. The cluster label −1 denotes noise points that were not assigned to any cluster.
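To illustrate how DBSCAN produces both cluster labels and the noise label −1, the following is a minimal pure-Python sketch of the algorithm applied to toy two-dimensional points. The `eps` and `min_samples` values and the points themselves are hypothetical; they are not the parameters or parcel attributes used in this study, and a library implementation would normally be used in practice.

```python
import math

NOISE = -1

def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    n = len(points)
    labels = [None] * n  # None means "not yet visited"

    def neighbours(i):
        # All points within eps of point i, including i itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_samples:  # core point: keep expanding the cluster
                queue.extend(nb)
        cluster += 1
    return labels

# Two dense groups and one isolated point (toy data)
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_samples=3)
print(labels)
```

The isolated point has too few neighbours to be a core point and is not reachable from any cluster, so it keeps the label −1.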

Figure 3 shows reasonably well-separated clusters obtained with the DBSCAN algorithm, indicating a reliable and meaningful clustering outcome. The following insights are drawn from the clusters using density plots (Figure 4) and a heatmap (Figure 5) of all attributes within each cluster:

Cluster 0 (Accessible but less affluent suburban areas): This cluster comprises locations that are relatively close to key urban amenities, such as the CBD, highways, business parks, public transit, and local streets. However, these areas are notably farther from shopping malls. Population density and income levels in this cluster are around the average, whereas employment levels are below average. The diversity of employment in Cluster 0 is also below average, indicating a narrower range of economic activities. These characteristics collectively suggest that Cluster 0 represents accessible but less affluent suburban areas with some clustering of economic activity, albeit with lower diversity. Businesses that might thrive in this cluster include discount retailers catering to value-conscious customers, fast food restaurants located close to residential communities, and various service-based establishments serving the surrounding suburban population.

Cluster 1 (Less accessible, lower density suburban areas): Cluster 1 is situated at considerable distances from the CBD, highways, and business parks. However, these areas have average accessibility to public transit, local streets, and shopping malls. Cluster 1 exhibits significantly lower population density and slightly below-average income and employment levels. The diversity of employment in this cluster is approximately average. This suggests that Cluster 1 represents less accessible, lower density suburban areas with a moderate mix of industries. Businesses likely to thrive in this cluster include big box retailers that require substantial land area, automotive-related ventures such as car dealerships and repair shops, and storage facilities well suited to lower density regions.

Cluster 2 (Accessible, higher density areas near business parks and highways): It comprises locations that are farthest from public transit, local streets, and shopping malls but enjoy the closest proximity to highways and business parks. The distance to the CBD is approximately average. This cluster exhibits significantly higher population density, whereas income and employment levels are slightly below average. Moreover, the diversity of employment in Cluster 2 is below average, indicating a more limited mix of industries. These characteristics collectively suggest that Cluster 2 represents accessible, higher density areas situated near business parks and highways. Businesses that might flourish in this cluster include office buildings seeking locations with excellent transportation access and amenities, hotels catering to business travelers, restaurants targeting affluent customers willing to travel further for a premium dining experience, and specialty retailers offering unique products to an upscale demographic.

Figure 3.

Silhouette plot for density-based spatial clustering of applications with noise (DBSCAN).

Figure 4.

Density plots for all attributes within each cluster.

Figure 5.

Heatmap for all attributes within each cluster.

The cluster structure provides the basis for generating the choice set. Once the noise points are removed, the remaining clusters of parcels form a rational selection pool for a business establishment. Ten parcels, comprising the observed (pre-selected) parcel and nine randomly drawn parcels, are then taken from this pool to form the final choice set. A choice set of ten alternatives is used to limit computation time. In addition, it reduces the likelihood of violating the independence of irrelevant alternatives property, as the unobserved attributes of locations in the same neighborhood are likely to be similar ( 42 ). McFadden ( 42 ) demonstrated that a subset comprising the observed choice and a random sample of the other potential choices can serve as a substitute for the entire choice set. This study hypothesized that the systematic approach would eliminate unreasonable alternatives from the selection pool. To test this hypothesis, the systematic (two-stage) choice set was compared with a one-stage benchmark using the MMNL model. In the one-stage model, the ten alternatives, including the pre-selected one, were drawn at random without regard to the systematic choice set generation. The findings (Table 3) show that the two-stage model achieves a better goodness of fit than the one-stage model in every sector, which supports the hypothesis that removing unreasonable alternatives through systematic choice set generation improves the goodness of fit.
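The sampling step described above can be sketched as follows. The sketch assumes that the selection pool for an establishment is the cluster containing its observed parcel and that establishments whose observed parcel is a noise point are excluded; both are interpretations made for illustration, and the parcel identifiers and function name are hypothetical.

```python
import random

NOISE = -1

def build_choice_set(chosen, labels, set_size=10, rng=None):
    """Return the observed parcel plus (set_size - 1) random parcels from its cluster.

    labels maps parcel id -> cluster label, with -1 marking noise points.
    """
    rng = rng or random.Random()
    cluster = labels[chosen]
    if cluster == NOISE:
        # Assumption: noise parcels have no pool and are handled separately.
        raise ValueError("observed parcel is a noise point")
    pool = [p for p, lab in labels.items() if lab == cluster and p != chosen]
    sampled = rng.sample(pool, min(set_size - 1, len(pool)))
    return [chosen] + sampled

# Toy labels: 30 parcels in cluster 0, 30 in cluster 1, one noise parcel
labels = {f"P{i}": (0 if i < 30 else 1) for i in range(60)}
labels["P60"] = NOISE

choice_set = build_choice_set("P5", labels, rng=random.Random(42))
print(choice_set)
```

The observed parcel is always included, and the remaining alternatives are drawn without replacement, in line with the sampling-of-alternatives result attributed to McFadden ( 42 ).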

Table 3.

Comparison between One-stage and Two-stage Location Choice Models

Goodness of fit	Industry		Retail		Service		Wholesale		Transportation
Goodness of fit	One-stage model	Two-stage model	One-stage model	Two-stage model	One-stage model	Two-stage model	One-stage model	Two-stage model	One-stage model	Two-stage model
Restricted log likelihood	−2516.13	−2309.49	−3009.72	−2763.14	−892.53	−626.34	−1003.84	−849.65	−564.41	−439.79
Log likelihood function	−1683.56	−1409.23	−1863.23	−1550.76	−311.47	−173.15	−563.43	−411.49	−237.62	−155.39
R-squared (constants only)	0.331	0.39	0.381	0.4388	0.651	0.7235	0.439	0.5157	0.579	0.6467
R-squared (constants only) (adjusted)	0.326	0.3889	0.373	0.4382	0.641	0.7224	0.431	0.5142	0.571	0.6448
AIC	3395.12	2846.5	3746.46	3121.5	642.8	366.3	1146.36	843.62	491.24	328.8

Note: AIC = Akaike information criterion.
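The goodness-of-fit measures in Table 3 can be cross-checked from the reported log likelihoods. The sketch below uses the two-stage Industry model and assumes that the reported R-squared is the McFadden pseudo-R² computed against the restricted log likelihood, and that the model has 14 estimated parameters (11 mean coefficients and 3 standard deviations, as counted from the Industry column of Table 4). Under these assumptions the results agree with the reported 0.39 and 2846.5 up to rounding.

```python
def mcfadden_r2(ll_model, ll_restricted):
    """Pseudo R-squared: 1 - LL(model) / LL(restricted)."""
    return 1.0 - ll_model / ll_restricted

def aic(ll_model, n_params):
    """Akaike information criterion: 2k - 2 LL."""
    return 2 * n_params - 2 * ll_model

# Two-stage Industry model, values copied from Table 3
ll_restricted = -2309.49
ll_model = -1409.23
n_params = 14  # assumption: counted from the Industry column of Table 4

r2 = mcfadden_r2(ll_model, ll_restricted)
aic_value = aic(ll_model, n_params)
print(round(r2, 4), round(aic_value, 2))
```

A lower AIC and a higher pseudo-R² both indicate a better fit, which is the basis for preferring the two-stage model in each sector.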

Location Choice Model of Five Economic Sectors

The estimated coefficients of the explanatory variables, accompanied by their respective t-statistics in parentheses, are presented in Table 4. The coefficients represent the effect of each explanatory variable on the likelihood of a business choosing that location, holding all else constant. Positive coefficients indicate a higher likelihood and negative coefficients a lower likelihood.

Table 4.

Parameter Estimation of Business Location Choice Model

Explanatory variables	Industry	Retail	Service	Wholesale	Transportation
(Number of employees < 10)×(Distance to highway < 500 m)					.80 (2.18)**
(Number of employees 10–99)×(Distance to business park < 500 m)	.28 (2.20)*
Parcel area (ln)	.85 (18.49)***		1.39 (8.78)***		1.04 (4.25)***
Distance to CBD < 500 m	−.21 (−1.98)**	.26 (1.81)*	2.78 (4.63)***	−0.60 (−1.82)*	−1.01 (−2.85)***
Distance to highway < 500 m		.49 (5.29)***	−1.87 (−3.69)***	.44 (2.09)**	0.55
Distance to business park < 500 m	−1.26 (−2.93)***	.72 (1.92)*	−3.58 (−2.67)***	1.24 (2.09)**	−2.19 (−1.96)**
Distance to bus stop < 1000 m	.28 (1.84)*	.61 (3.32)***	1.53 (3.23)***		1.004 (1.72)*
Distance to local street < 200 m	−.57 (−1.94)*	−.92 (−2.48)**
Distance to mall < 500 m		−.62 (−3.25)***
Population density per square km (ln)				−.12 (−1.98)**
Employment (ln)	.23 (2.58)***
Entropy	1.91 (1.86)*	.74 (5.89)***	1.03 (2.81)***	.68 (2.75)***
Rural	.26 (1.67)*			−1.14 (−3.75)***	.83 (1.75)*
Suburban		.34 (3.18)***
Urban			−6.66 (−6.91)***
Industry count within 500 m	.29 (12.59)***		.09 (2.29)**	−.09 (0.10345)***	−.08 (−2.03)**
Retail count within 500 m		.21 (18.04)***	.14 (4.77)***
Transport count within 500 m				.23 (2.23)**	2.45 (9.10)***
Wholesale count within 500 m	−.13 (−4.10)***			.57 (8.72)***
Entropy (SD)	5.69 (0.04)**
Employment number (ln) (SD)	.349 (0.02)*
Rural (SD)	.42 (0.04)*
Suburban (SD)		2.01 (3.75)***
Distance to highway < 500 m (SD)			1.43 (2.27)**
Distance to CBD < 500 m (SD)				1.99 (2.96)***
Parcel area (ln) (SD)					.95 (2.82)***
Goodness of fit
Restricted log likelihood	−2309.49	−2763.10	−626.30	−849.65	−439.79
Log likelihood function	−1409.23	−1550.76	−173.15	−411.49	−155.39
R-squared (constants only)	0.39	0.4388	.7235	0.5157	.6467
R-squared (constants only)(adjusted)	0.3889	0.4382	.7224	0.5142	.6448
AIC	2846.5	3121.5	366.3	843	328.8

Note: CBD = central business district; AIC = Akaike information criterion; SD = standard deviation.

***, **, * = significance at the 1%, 5%, and 10% levels, respectively.
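To make the magnitudes in Table 4 more concrete, a logit coefficient can be converted into an odds ratio, $e^{β}$. For a dummy variable with a fixed coefficient, this is the factor by which the odds of choosing a parcel change relative to an otherwise identical parcel without that attribute. The sketch below applies this to three Retail coefficients copied from Table 4; the variable names are illustrative only.

```python
import math

# Fixed coefficients copied from the Retail column of Table 4
retail_coefficients = {
    "Distance to highway < 500 m": 0.49,
    "Distance to bus stop < 1000 m": 0.61,
    "Distance to mall < 500 m": -0.62,
}

# Odds ratio relative to an otherwise identical parcel without the attribute
odds_ratios = {name: math.exp(b) for name, b in retail_coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.2f}")
```

A ratio above one indicates that the attribute raises the odds of selection, and a ratio below one indicates that it lowers them.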

Industry: Proximity to a business park within a 500 m radius has a substantial positive effect on attracting medium-sized industry businesses with 10–99 employees. These firms seem to find the presence of business parks important, likely because of the access to shared resources, suppliers, and expertise that can support their growth and expansion. Business parks typically offer large, flexible spaces, which meet the requirements of growing medium-sized firms. On the other hand, being close to the CBD decreases the likelihood of industry businesses choosing that location. The higher land costs near the CBD make such locations less attractive, and the limited, inflexible spaces may not suit industrial needs. Additionally, traffic congestion near the CBD leads to longer travel times for suppliers and employees, making such areas less appealing. Interestingly, despite the advantages of business parks for medium-sized firms, overall proximity to these parks reduces the likelihood of industry firms choosing to locate there, suggesting a general preference for locations away from business parks. Other significant factors, such as proximity to bus stops and local streets, also play important roles in determining location choices for industry businesses. Moreover, locating in a rural area slightly increases the likelihood of industry businesses selecting that option, and proximity to other industry businesses has a positive impact, likely because of potential supply chain links and synergies. Conversely, proximity to wholesale businesses decreases the likelihood of selection, possibly because of competition for similar spaces. The significant standard deviations for certain variables indicate heterogeneity within the industry sector, suggesting that different factors influence location choices for different industry firms, depending on their specific needs and characteristics. These determinants might include establishment-specific attributes such as employment numbers, annual sales volume, and floor area. Furthermore, unobserved contextual effects and potential measurement errors may contribute to the observed dispersion, underscoring the multifaceted nature of the decision-making processes within this sector.

Retail: In the context of choosing a location for retail, certain factors have been identified as influential in the decision-making process. One of the key findings is that proximity to the CBD, highways, bus stops, business parks, and existing retail outlets is associated with a higher likelihood of selection. This suggests that being closer to customers, essential infrastructure, and other businesses improves accessibility and overall desirability for retail establishments. Additionally, larger retail outlets, such as supermarkets and big box stores, tend to see higher volumes of goods movements and truck deliveries as a result of their extensive inventory. Consequently, having easy access to highways and major roads, which the model confirms as desirable, facilitates efficient truck deliveries and the smooth movement of goods to and from these retail locations. Another significant factor affecting the choice of retail location is land use diversity, quantified by the concept of entropy. The results indicate that higher land use diversity positively influences the likelihood of selecting a particular location. This suggests that having a diverse mix of potential customers in the vicinity can be advantageous for retail businesses.

Moreover, the suburban setting emerges as a notable aspect affecting the decision of retail location. On average, being in a suburban area increases the likelihood of selection. However, the situation is not uniform among all retailers, as indicated by the standard deviation term. Some retailers strongly prefer suburban locations, whereas others do not find them as appealing. The attractiveness of suburban areas to retailers may stem from various factors, including potentially lower costs, larger land parcels, and more parking options. Nevertheless, the observed variation in retailers’ preferences for factors like customer base, costs, and agglomeration contributes to the heterogeneity in the desirability of suburban locations for retail businesses. Thus, understanding these diverse factors is essential for making informed decisions when choosing the most suitable retail location.

Service: The choice of location for service uses is influenced by several key factors. Larger parcel areas, proximity to the CBD, and access to public transit are associated with higher log odds of selecting a suitable location for a service business. This is because larger sites can accommodate the operational needs of service enterprises, such as storage, vehicle parking, and maneuvering areas. Being close to the CBD is advantageous for service uses as it allows them to reach corporate clients located in downtown areas, thereby expanding their customer base. Moreover, easy access to public transit is particularly beneficial for lower-income workers, as it facilitates their commute to service locations.

However, certain factors are found to have a negative impact on the log odds of choosing a location for service uses. Proximity to highways, business parks, and urban areas is associated with a decrease in the likelihood of selection. Although major road access is still necessary, service businesses often prefer locations with less highway proximity to avoid noise and congestion. Standalone locations are also preferred over business parks because of the flexibility and cost-effectiveness they offer. Non-urban locations are favored as well, as they provide ample space, lower costs, and less congestion, while still being accessible, as indicated by the positive effect of CBD proximity. Higher land use diversity and the presence of surrounding industries and retailers also play a role in location selection, albeit to a lesser extent. One notable finding is the significant heterogeneity in preferences in relation to highway proximity among service uses. This variation is attributed to the diverse needs and customer bases of different service businesses, leading to differing opinions on the importance of highway proximity in their choice of location.

Wholesale: Wholesalers consider various factors when choosing a location for their operations. The findings show that proximity to highways and positions within business parks positively influence the likelihood of wholesalers selecting a specific site. Easy access to highways enables efficient transportation networks for shipping and receiving goods, and business parks offer valuable benefits such as larger spaces, well-established infrastructure, and access to other businesses that can cater to the wholesalers’ needs. On the other hand, certain factors are associated with lower odds of wholesalers choosing a location. Higher population density, rural settings, and proximity to the CBD tend to discourage their selection. Wholesalers typically prefer less densely populated, more suburban areas that provide ample space for storage, parking, and logistics activities. However, accessibility remains essential, and non-rural areas with lower density still provide the necessary infrastructure and accessibility. Interestingly, wholesalers’ preferences for proximity to the CBD show significant heterogeneity. The variations in their supply chains and customer bases lead to diverse needs for CBD accessibility and location preferences in general. Overall, the findings align with wholesalers’ focus on efficient goods movement.

Transportation: Results reveal several distinct preferences within the transportation industry. Larger sites are favored because of their ability to accommodate crucial operational needs such as vehicle staging, loading/unloading, and material storage. On the other hand, smaller transport uses prefer locations near highways, as they heavily rely on efficient road networks. Moreover, businesses within the transportation sector tend to opt for locations that are farther from the CBD, which offer more flexibility, ample space, and cost advantages while still ensuring accessibility. Another key finding is the preference for clustering with related transport businesses, indicating a desire for collaboration and synergy. However, surrounding industries are not preferred, suggesting a desire for a dedicated space within the transportation sector. Surprisingly, proximity to business parks and rural locations has only marginal effects, with public transit access having a relatively smaller impact on location preferences. Standalone locations are favored over business parks because of their advantages of flexibility, space, and cost. Despite being non-urban, such locations offer large and cost-effective sites, making them appealing options for transportation businesses despite their requirement for accessibility. Site size preferences also vary significantly among different transportation uses, reflecting the diverse needs and operations within the industry.
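The heterogeneity noted in the sector discussions above can be quantified from the estimated means and standard deviations. Under the normal mixing distribution of Equation 9, the share of establishments with a positive coefficient is $Φ (\bar{β} / σ)$. The sketch below applies this to four random coefficients copied from Table 4; the resulting shares are illustrative calculations under the normality assumption and are not results reported in this study.

```python
from statistics import NormalDist

# (mean, SD) pairs copied from Table 4
random_coefficients = {
    "Retail: Suburban": (0.34, 2.01),
    "Service: Distance to highway < 500 m": (-1.87, 1.43),
    "Wholesale: Distance to CBD < 500 m": (-0.60, 1.99),
    "Transportation: Parcel area (ln)": (1.04, 0.95),
}

def share_positive(mean, sd):
    """P(beta > 0) when beta ~ Normal(mean, sd^2)."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(0.0)

shares = {name: share_positive(m, s) for name, (m, s) in random_coefficients.items()}
for name, share in shares.items():
    print(f"{name}: {share:.0%} of establishments with a positive coefficient")
```

For example, the retail suburban coefficient has a small positive mean but a large standard deviation, so a little over half of retailers would be attracted to suburban parcels and the remainder would be deterred by them.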

Conclusions

The framework presented in this study for developing behaviorally plausible location choice alternatives for business establishments incorporates multi-dimensional built environment and neighborhood attributes. The integration of the unsupervised ML technique with the MMNL model allows a data-driven approach to understanding the two-stage location choice process of business establishments. This approach can help reveal underlying patterns and connections that might not be visible through the traditional one-step MNL model. The analysis yields insights into how certain types of businesses conduct the initial search process, considering the influential built environment and neighborhood characteristics in a two-step location choice model. The outcomes of this study indicate that the choice of location for service businesses is influenced by several factors, such as larger parcel areas, proximity to the CBD, and access to public transit, which are associated with higher log odds of selection. On the other hand, factors like proximity to highways, business parks, and urban areas have a negative impact on the likelihood of selection, with standalone, non-urban locations preferred for their flexibility, cost-effectiveness, and access to diverse customer bases. The transportation industry prefers larger sites for crucial operational needs, and smaller transport businesses choose locations near highways. Transportation businesses prioritize standalone locations over business parks and prefer clustering with related transport businesses, and the volume of goods movements and truck traffic varies widely with the scale and type of each transportation use.

However, this study has certain limitations that point to directions for future research. It focuses exclusively on a constrained set of accessibility factors (transportation infrastructure, proximity to major roads, highways, airports, and public transportation options) as the features governing business establishments’ location choice. Although these attributes significantly influence business location decisions, they do not encompass the entirety of the factors that affect the location choice of businesses. The inclusion of office and business profile factors in the location choice model for business establishments holds the potential to significantly enhance decision-making accuracy. By incorporating these factors, businesses can make well-informed choices about their preferred locations. The office profile factor allows businesses to identify spaces that cater to their specific spatial needs, work environment preferences, and technological requirements, resulting in increased productivity and employee satisfaction. Simultaneously, the business profile factor aids in selecting locations aligned with strategic objectives and target markets, promoting growth opportunities and networking prospects within the region. By reinforcing brand identity and considering long-term viability, businesses optimize resource allocation and cost-effectiveness.

Furthermore, to enhance the comprehensiveness of location choice modeling, the proposed model can be extended to incorporate qualitative data, including interviews with business owners. With regard to practical considerations, exploring efficient data collection methods is crucial, with a focus on participant selection, interview methodologies, and the reliability of gathered information. The integration of qualitative data offers potential advantages, providing a more holistic understanding of business location choice dynamics. Moreover, incorporating qualitative insights can refine and validate the quantitative model, potentially improving its predictive accuracy and making the location choice model more relevant to real-world scenarios. Additionally, the developed model has the potential to address temporal dynamics in real-world applications through the development of hybrid models (a combination of a time-series forecasting model to predict future conditions and then using DBSCAN to cluster locations based on those predicted conditions). However, the proposed model may encounter certain challenges in practical implementation, particularly in the areas of parameter tuning, cluster adaptability, and memory efficiency. Making strategic parameter choices and implementing optimizations are crucial for ensuring accurate modeling.

The developed two-step location choice model, which covers the major economic sectors of industry, retail, service, wholesale, and transportation, constitutes a noteworthy contribution to the existing literature. The method provides in-depth insights for various domains, such as commercial vehicle movement modeling, urban planning, business location strategies, and policymaking concerning economic development. Most importantly, this study represents a substantial advancement in location choice modeling toward the development of a commercial goods movement model for an integrated transport, land use, and energy modeling system.

Footnotes

Acknowledgements

The authors would like to thank the Climate Action Awareness Fund (CAAF) and the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant for their contributions in supporting this research. The authors are grateful to Maria Lutes for taking the time to proofread the manuscript.

Author Contributions

The authors confirm contribution to the paper as follows: study conception and design: NM, MAH; data collection: NM, MAH; analysis and interpretation of results: NM, MAH; draft manuscript preparation: NM, MAH. All authors reviewed the results and approved the final version of the manuscript.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Niaz Mahmud

Muhammad Ahsanul Habib

References

1. Bell, D. A. Travel Impacts Arising from Office Relocation from City to Suburbs. Transportation, Vol. 18, 1991, pp. 239–259.

2. Armstrong, R. B. The Office Industry: Patterns of Growth and Location. MIT Press, Cambridge, 1972.

3. Environment Canada. Canada’s Emissions Trends. 2014. https://www.canada.ca/en/environment-climate-change/services/climate-change/publications/emission-trends-2014.html

4. Environment and Climate Change Canada. Greenhouse Gas Emissions Canadian Environmental Sustainability Indicators. 2022. https://www.canada.ca/en/environment-climate-change/services/environmental-indicators/greenhouse-gas-emissions.html

5. US Environmental Protection Agency. Smart Growth Network. 2011. https://www.epa.gov/smartgrowth

6. Ziomas, Suppan, Papayannis, Melas, Fabian, Tzoumaka, Rappenglück, Balis, and Zerefos. A Contribution to the Study of Photochemical Smog in the Greater Athens Area. Contributions to Atmospheric Physics, Vol. 68, No. 3, 1995, pp. 191–204.

7. Grenzeback, L. R., W. R. Reilly, P. O. Roberts, and J. R. Stowers. Urban Freeway Gridlock Study: Decreasing the Effects of Large Trucks on Peak-Period Urban Freeway Congestion. Transportation Research Record: Journal of the Transportation Research Board, 1989. 1256: 16–26.

8. Alcácer and Chung. Location Strategies for Agglomeration Economies. Strategic Management Journal, Vol. 35, No. 12, 2014, pp. 1749–1761. https://doi.org/10.1002/smj.2186.

9. Balbontin and D. A. Hensher. Firm-Specific and Location-Specific Drivers of Business Location and Relocation Decisions. Transport Reviews, Vol. 39, No. 5, 2019, pp. 569–588. https://doi.org/10.1080/01441647.2018.1559254.

10. Elgar, Farooq, and E. J. Miller. Modeling Location Decisions of Office Firms. Transportation Research Record: Journal of the Transportation Research Board, 2009. 2133: 56–63.

11. Maoh, H. F. Modeling Firm Demography in Urban Areas with an Application to Hamilton, Ontario: Towards an Agent-Based Microsimulation Model. McMaster University, Hamilton, Canada, 2005.

12. Abraham, J. E., and J. D. Hunt. Firm Location in the MEPLAN Model of Sacramento. Transportation Research Record: Journal of the Transportation Research Board, Vol. 1685, No. 1, 1999, pp. 187–198. https://doi.org/10.3141/1685-24.

13. Khan, A. S. A System for Microsimulating Business Establishments: Analysis, Design and Results. University of Calgary, Canada, 2002.

14. De Bok and Sanders. Firm Relocation and Accessibility of Locations. Transportation Research Record: Journal of the Transportation Research Board, 2005. 1902: 35–43.

15. Maoh, H. F., and P. S. Kanaroglou. Agent-Based Firmographic Models: A Simulation Framework for the City of Hamilton. Proc., PROCESSUS Second International Colloquium on the Behavioural Foundations of Integrated Land-use and Transportation Models: Frameworks, Models and Applications, Toronto, Canada, 2005.

16. Eisele, W. L., D. L. Schrank, Bittner, and Larson. Incorporating Urban-Area Truck Freight Value into the Urban Mobility Report. Transportation Research Record: Journal of the Transportation Research Board, 2013. 2378: 54–64.

17. Fischer and Ang-Olson. External Urban Truck Trips Based on Commodity Flows: A Model. Transportation Research Record: Journal of the Transportation Research Board, 2000. 1707: 73–80.

18. Bela, P. L., and M. A. Habib. Development of an Urban Transport Network and Emission Model for the Port City of Halifax, Canada. Proc., Transportation Association of Canada and ITS Canada 2019 Joint Conference and Exhibition, Halifax, Canada, 2019.

19. van Wissen. A Micro-Simulation Model of Firms: Applications of Concepts of the Demography of the Firm. Papers in Regional Science, Vol. 79, No. 2, 2000, pp. 111–134. https://doi.org/10.1007/s101100050039.

20. Kumar and K. M. Kockelman. Tracking Size, Location, and Interactions of Businesses. Transportation Research Record: Journal of the Transportation Research Board, 2008. 2077: 113–121.

21. Balbontin and D. A. Hensher. Understanding Business Location Decision Making for Transport Planning: An Investigation of the Role of Process Rules in Identifying Influences on Firm Location. Journal of Transport Geography, Vol. 91, 2021, p. 102955. https://doi.org/10.1016/j.jtrangeo.2021.102955.

22. Carlton, D. W. Why New Firms Locate Where They Do: An Econometric Model. Joint Center for Urban Studies of MIT and Harvard University, Cambridge, MA, 1979.

23. Lee, K. S. A Model of Intraurban Employment Location: An Application to Bogota, Colombia. Journal of Urban Economics, Vol. 12, No. 3, 1982, pp. 263–279. https://doi.org/10.1016/0094-1190(82)90018-3.

24. Maoh and Kanaroglou. Intrametropolitan Location of Business Establishments. Transportation Research Record: Journal of the Transportation Research Board, 2009. 2133: 33–45.

25. Iseki and R. P. Jones. Analysis of Firm Location and Relocation in Relation to Maryland and Washington, DC Metro Rail Stations. Research in Transportation Economics, Vol. 67, 2018, pp. 29–43. https://doi.org/10.1016/j.retrec.2016.11.003.

26. Weterings and Knoben. Footloose: An Analysis of the Drivers of Firm Relocations over Different Distances. Papers in Regional Science, Vol. 92, No. 4, 2013, pp. 791–809. https://doi.org/10.1111/j.1435-5957.2012.00440.x.

27. Bodenmann, B. R., and K. W. Axhausen. Destination Choice for Relocating Firms: A Discrete Choice Model for the St. Gallen Region, Switzerland. Papers in Regional Science, Vol. 91, No. 2, 2012, pp. 319–341. https://doi.org/10.1111/j.1435-5957.2011.00389.x.

28. Waddell and Shukla. Manufacturing Location in a Polycentric Urban Area: A Study in the Composition and Attractiveness of Employment Subcenters. Urban Geography, Vol. 14, No. 3, 1993, pp. 277–296. https://doi.org/10.2747/0272-3638.14.3.277.

29. Hansen, E. R. Industrial Location Choice in São Paulo, Brazil. Regional Science and Urban Economics, Vol. 17, No. 1, 1987, pp. 89–108. https://doi.org/10.1016/0166-0462(87)90070-6.

30. Gabe, T. M., and K. P. Bell. Tradeoffs between Local Taxes and Government Spending as Determinants of Business Location. Journal of Regional Science, Vol. 44, No. 1, 2004, pp. 21–41. https://doi.org/10.1111/j.1085-9489.2004.00326.x.

31. Rosenthal, S. S., and W. C. Strange. Evidence on the Nature and Sources of Agglomeration Economies. In Handbook of Regional and Urban Economics (J. V. Henderson and J.-F. Thisse, eds.), Elsevier, Amsterdam, Netherlands, 2004, pp. 2119–2171.

32. De Bok and Van Oort. Agglomeration Economies, Accessibility and the Spatial Choice Behavior of Relocating Firms. Journal of Transport and Land Use, Vol. 4, No. 1, 2011, p. 5. https://doi.org/10.5198/jtlu.v4i1.144.

33. Backman and Karlsson. Location of New Firms: Influence of Commuting Behaviour. Growth and Change, Vol. 48, No. 4, 2017, pp. 682–699. https://doi.org/10.1111/grow.12200.

34. de Jong and Ben-Akiva. A Micro-Simulation Model of Shipment Size and Transport Chain Choice. Transportation Research Part B: Methodological, Vol. 41, No. 9, 2007, pp. 950–965. https://doi.org/10.1016/j.trb.2007.05.002.

35. Wisetjindawat, Sano, Matsumoto, and Raothanachonkun. Microsimulation Model for Modeling Freight Agents Interactions in Urban Freight Movement. Presented at 86th Annual Meeting of the Transportation Research Board, Washington, D.C., 2007.

36. Roorda, M. J., Cavalcante, McCabe, and Kwan. A Conceptual Framework for Agent-Based Modelling of Logistics Services. Transportation Research Part E: Logistics and Transportation Review, Vol. 46, No. 1, 2010, pp. 18–31. https://doi.org/10.1016/j.tre.2009.06.002.

37. Pourabdollahi, (Kouros) Mohammadian, and Kawamura. A Behavioral Freight Transportation Modeling System. Proc., 14th Annual International Conference on Electronic Commerce, August 7, 2012, Singapore, Association for Computing Machinery, New York, NY, pp. 196–203.

38. Cavalcante, R. A., and M. J. Roorda. Freight Market Interactions Simulation (FREMIS): An Agent-Based Modeling Framework. Procedia Computer Science, Vol. 19, 2013, pp. 867–873. https://doi.org/10.1016/j.procs.2013.06.116.

39. Manski, C. F. The Structure of Random Utility Models. Theory and Decision, Vol. 8, No. 3, 1977, pp. 229–254. https://doi.org/10.1007/BF00133443.

40. Swait and Ben-Akiva. Incorporating Random Constraints in Discrete Models of Choice Set Generation. Transportation Research Part B: Methodological, Vol. 21, No. 2, 1987, pp. 91–102. https://doi.org/10.1016/0191-2615(87)90009-9.

41. Ben-Akiva, M. E., and S. R. Lerman. Discrete Choice Analysis: Theory and Application to Travel Demand. MIT Press, London, 1985.

42. McFadden. Modelling the Choice of Residential Location. In Spatial Interaction Theory and Planning Models (A. Karlqvist, L. Lundqvist, F. Snickars, and J. W. Weibull, eds.), Vol. 25, North Holland, Amsterdam, 1978, pp. 75–96.

43. Rashidi, T. H., and (Kouros) Mohammadian. Behavioral Housing Search Choice Set Formation. International Regional Science Review, Vol. 38, No. 2, 2015, pp. 151–170. https://doi.org/10.1177/0160017612461356.

44. Adler, J. L. Route Choice: Wayfinding in Transport Networks. Transportation Research Part A: Policy and Practice, Vol. 27, No. 4, 1993, pp. 338–339. https://doi.org/10.1016/0965-8564(93)90007-8.

45. Orvin, M. M., and M. R. Fatmi. A Residential Location Search Model Based on the Reasons for Moving Out. Transportation Letters, 2023, pp. 1–15. https://doi.org/10.1080/19427867.2023.2222990.

46. Bela, P. L. A Framework of Multiclass Travel Demand Forecasting and Emission Modelling, Incorporating Commercial Vehicle Movement for the Port City of Halifax, Canada. Dalhousie University, Halifax, Canada, 2018.

47. Bracken and Martin. The Generation of Spatial Population Distributions from Census Centroid Data. Environment and Planning A: Economy and Space, Vol. 21, No. 4, 1989, pp. 537–543. https://doi.org/10.1068/a210537.