
Abstract

Direct demand (DD) models are an important tool for estimating annual average daily pedestrian traffic (AADPT) for all intersections in a jurisdiction. These models associate socioeconomic and land-use variables with pedestrian exposure and allow the estimation of AADPT for sites where pedestrian counts are not (readily) available. However, some jurisdictions lack pedestrian volume counts from a sufficiently large number of intersections to develop their own DD model or do not have the institutional resources to carry out the model development. Under these circumstances, a cost-effective alternative is to use DD models that were developed in other jurisdictions. Previous research evaluated the spatial transferability of DD models in scenarios where no pedestrian counts are available (i.e., naïve transferability) and showed that this resulted in large estimation errors. This paper examines methods to improve the estimation accuracy of spatially transferred DD models by using AADPT that is readily accessible to jurisdictions (we call this local calibration). Five local calibration models were proposed and evaluated using observed field counts and synthesized counts from three jurisdictions. The best model to use is a function of the number of local jurisdiction sites for which pedestrian counts are available. When pedestrian volume is available for 10% of the sites, Model C presented the best results for the synthetic approach: an average improvement of 8.7% when comparing the locally calibrated and naïve estimates. Using real AADPTs and very limited samples for local calibration, Model C also presented the best performance: an average improvement of 35.0%.

Keywords

Pedestrian volume; Direct Demand model; Spatial transferability; Local calibration; Synthetic volume

Recent North American road safety reports have shown alarming trends in the frequencies and proportions of fatal crashes involving pedestrians. For instance, the yearly number of pedestrians killed on U.S. roadways increased from 4,092 to 6,205 between 2009 and 2019 ( 1 , 2 ). In Canada, although the number of pedestrians killed has remained relatively stable over the last few decades, the proportion of pedestrian fatalities has increased. Between 2005 and 2009, pedestrians accounted for 13% of the total road fatalities in Canada; this number grew to 17% in the period between 2015 and 2019 ( 3 , 4 ). These statistics show the importance and need for jurisdictions to develop strategies focused on pedestrian safety.

The set of opportunities for a jurisdiction to improve pedestrian safety is wide and varied, ranging from long-term educational actions to relatively simple infrastructure changes such as the installation of a raised crosswalk. In the context of signalized intersections (the object of study in this work), the selection of optimal countermeasures to benefit pedestrians is commonly preceded by a network screening step, in which sites are ranked according to some measure of risk ( 5 ). The development of the site ranking relies on information about pedestrian exposure at all the sites in a jurisdiction ( 6 – 8 ), which is often challenging to acquire.

The opportunity for jurisdictions to obtain pedestrian exposure for the entire road network depends on the type and quantity of data available. The two typical sources for pedestrian volume data at signalized intersections are short-term counts (STCs) and continuous counts (CCs). STCs are normally the product of fieldwork carried out for signal timing design purposes, where vehicular turning movement counts and pedestrian volumes are counted over one or a few nonholiday weekdays. On the other hand, CCs provide long-term information (i.e., years) and are useful for detecting systematic spatial and temporal oscillations in vehicular and pedestrian volumes. In sites where CC stations are available, annual average daily pedestrian traffic (AADPT) can be directly calculated. In addition, when CCs are available from a sufficient number of intersections then the CC data can be used to expand STCs to AADPT. The literature suggests that CCs should be available from between 6 and 25 stations as a minimum, depending on the characteristics of the jurisdiction ( 9 ), for the expansion of STCs to provide reliable estimates of AADPT. Because of resource limitations, jurisdictions may have sites for which no recent pedestrian counts are available, which prevents appropriate jurisdictionwide exposure estimates (according to the Federal Highway Administration [ 10 ], pedestrian STCs must be renewed every 3 to 7 years). To overcome this limitation, researchers have developed direct demand (DD) models to estimate pedestrian volumes for sites where pedestrian counts are not available.

DD models are statistical models calibrated using sites where pedestrian exposure is known by associating it with explanatory variables that are often readily available to jurisdictions: socioeconomic features (e.g., population and employment densities), land-use attributes (e.g., number of commercial and residential places), and geometric and operational aspects (e.g., presence of transit stops) ( 11 – 13 ). Once the DD model has been developed, practitioners can estimate the pedestrian exposure for all the sites in a given jurisdiction.

There is a relatively large body of literature on DD models. Different models considering varied explanatory variables, calibration sample sizes, modeling techniques, and types of jurisdictions have been developed and are available for the community (a comprehensive literature review on the development of DD models is available in other works [9, 11, 12]). This leads us to the prime motivator behind this study: the development of adequate DD models requires that jurisdictions have valid pedestrian data (ideally in the form of AADPT) for a sufficiently large number of sites to support the development (i.e., calibration and validation) of a DD model. For example, researchers have developed DD models with an R² of 0.70 using calibration sample sizes in the order of 50 to 70 sites ( 7 , 14 , 15 ). Considering that some jurisdictions may lack data to develop their own DD model, a cost-effective alternative is for them to apply a model available in the literature. However, very limited research has been done on the examination of the spatial transferability of DD models, raising the question of whether this alternative is appropriate.

To the authors’ knowledge, only one study investigates the spatial transferability of DD models to other jurisdictions. Sobreira and Hellinga evaluated the transferability of six DD models available in the literature to three different jurisdictions and found that the accuracy of the model estimation varied considerably depending on the characteristics of the jurisdiction that the model is being transferred to (referred to in this work as the target jurisdiction) ( 9 ). The evaluation methodology assumed that no information on pedestrian volume was known for any sites in the target jurisdiction, resulting in a “naïve” transferability. This represents an extreme case, as it can be reasonably expected that most jurisdictions have pedestrian counts for at least some of their signalized intersections, even when they do not have sufficient data to develop their own DD model. Consequently, there is an opportunity for jurisdictions that fall into this category to try to enhance the naïve transferability of DD models by employing the existing pedestrian count data, a procedure that is referred to in this work as local calibration. Thus, this research evaluates the local calibration of DD models transferred from other jurisdictions by using AADPT data that are readily available for signalized intersections in the target jurisdiction.

The next sections of this work are organized as follows: Section 2 summarizes the results found by Sobreira and Hellinga ( 9 ). Sections 3, 4, and 5 present the methods proposed for the local calibration. Section 6 comprises the results and discussion of the local calibration, and Section 7 presents the conclusions of this work.

Naïve Transferability of DD Models

Sobreira and Hellinga investigated the spatial transferability of DD models available in the literature ( 9 ). The authors chose six DD models with varying features in relation to the number of significant explanatory variables in the model, calibration dataset sample size, and level of pedestrian activity in the original calibration jurisdiction. The models were applied to three jurisdictions with quite different characteristics for pedestrian activity, land use, and socioeconomics. The first jurisdiction was Milton, Ontario, Canada, which presents a mid-level of pedestrian activity (average AADPT = 313) and similar land-use and socioeconomic characteristics to many other suburban U.S. and Canadian jurisdictions. The second jurisdiction was Pima County, AZ, representing sites with very low pedestrian activity (average AADPT = 28). The third jurisdiction was Downtown Toronto, Ontario, Canada, an environment with extremely high pedestrian activity (average AADPT = 23,481) and high population and employment densities.

To help the reader understand the main differences between jurisdictions, Figure 1 presents the distribution of several variables for each jurisdiction. Figure 1 also shows the mean, minimum, and maximum statistics for the jurisdiction in which the model was originally developed. It was observed that the distribution of the variables followed the level of pedestrian activity found in each jurisdiction (i.e., Pima County < Milton < Toronto) and that the variables for Milton and Pima County, in general, were within the range of values of the original DD model calibration datasets. More information on the jurisdictions and the chosen DD models is available in Sobreira and Hellinga ( 9 ).

Figure 1.

Boxplots of variables for each jurisdiction.

To measure the accuracy of the transferability, Sobreira and Hellinga considered sites where at least 30 days of COVID-19 pandemic-free data were available to estimate the observed AADPT ( 9 ). The authors made use of expansion factors to deal with sites where less than a year of data was available (see Sobreira and Hellinga [ 9 ] for further details on this procedure). Table 1 presents the summary of the results for the mean absolute error (MAE). The ratio between the MAE and the average observed AADPT was also considered to standardize the errors across different studies and jurisdictions. The column “Original study” of Table 1 provides the performance reported by the original DD model authors based on the original jurisdiction calibration or validation dataset. It can be noted that a benchmark value for the standardized error metric (MAE/AADPT) is around 0.60 to 0.65. The main findings obtained in the study are summarized below:

Although some of the models performed acceptably well when applied to other jurisdictions, there was no single DD model that performed well across all three target jurisdictions.

The models performed well in Milton. The authors linked this to the similarities between Milton and the jurisdictions in which the DD models were originally developed in relation to pedestrian activity, land-use, and socioeconomic attributes.

In general, the models performed poorly in Pima County and Toronto. The authors associated this with the peculiar environment observed in the considered sites of both jurisdictions (i.e., very low and very high pedestrian activity, respectively), which is very different from the environments of the jurisdictions in which the models were originally developed.

Table 1.

Summary of the Naïve Spatial Transferability

Study	Original study						Milton (n = 21), avg. observed AADPT = 313			Pima County (n = 81), avg. observed AADPT = 28			Toronto (n = 28), avg. observed AADPT = 23,481
Study	Location (USA)	Number of sites	Avg. AADPT	MAE	MAE/AADPT	MAPE	Avg. AADPT¹	MAE	MAE/AADPT²	Avg. AADPT¹	MAE	MAE/AADPT²	Avg. AADPT¹	MAE	MAE/AADPT²
Griswold et al. ( 13 )	California	1,270	2,433	NA	na	NA	265	164	0.52	91	69	2.46	101,925	80,789	3.44
Hankey et al. ( 15 )	Blacksburg	72	192	NA	na	NA	329	244	0.78	172	145	5.18	5,448	18,190	0.77
Munira et al. ( 7 )	Austin	44	605	379	0.63	NA	138	243	0.78	74	57	2.04	38,473	40,721	1.73
Sanders et al. ( 14 )	Seattle	50	NA	NA	na	39%	2,668	2,355	7.52	1,295	1,267	45.25	89,602	86,847	3.7
Schneider et al. ( 12 )	Milwaukee,	260	186	120	0.65	NA	179	205	0.65	47	29	1.04	Inf.	Inf.	Inf.
Singleton et al. ( 11 )	Utah	1,494	267	NA	na	NA	360	166	0.53	112	87	3.11	13,185	15,461	0.66

Note: Avg. = average; AADPT = annual average daily pedestrian traffic; MAE = mean absolute error; MAPE = mean absolute percent error; Inf. = infinite; NA = not available; na = not applicable.

¹ Average AADPT for the application.

² MAE divided by the average observed AADPT.

Source: Sobreira and Hellinga ( 9 ).
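As a worked check of the standardized metric, the MAE/AADPT entries for the target jurisdictions in Table 1 can be reproduced by dividing each listed MAE by the average observed AADPT of that jurisdiction (not by the average AADPT predicted by the model). Using the Griswold et al. ( 13 ) row:

```latex
\text{Milton: } \frac{164}{313} \approx 0.52, \qquad
\text{Pima County: } \frac{69}{28} \approx 2.46, \qquad
\text{Toronto: } \frac{80{,}789}{23{,}481} \approx 3.44
```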

These findings suggest that naïvely transferring DD models from the literature may be sufficient when the target jurisdiction shares characteristics with the original model jurisdiction (i.e., level of pedestrian activity and land-use and socioeconomic features). However, when this does not happen, the accuracy of the transferability is frequently substantially worse than the benchmark standard, demonstrating a clear need for methods to improve the transferability. This is the focus of the present work.

The present work was developed using the same target jurisdictions and datasets as Sobreira and Hellinga ( 9 ). The three DD models that presented the best performance when being naïvely transferred were selected for the local calibration task: Singleton et al. ( 11 ), Schneider et al. ( 12 ), and Griswold et al. ( 13 ). The next section describes the local calibration method and the five local calibration models evaluated in this paper.

Proposed Local Calibration Models

The idea behind the local calibration is that the pedestrian volume data that are readily available to target jurisdictions may be used to enhance the naïve AADPT estimates from DD models that were developed in other jurisdictions. Figure 2 presents the proposed method for estimating AADPT using the local calibration approach. Some key terms of the method are described below:

$Y_{1}^{'}$ = predicted AADPT for the target jurisdiction (referred to as Jurisdiction #2) using a DD model developed in another jurisdiction (referred to as Jurisdiction #1). This represents the naïve transferability.

$Y_{2}$ = observed AADPT for the target jurisdiction.

$Y_{2}^{'}$ = predicted AADPT for the target jurisdiction using the locally calibrated DD model.

Figure 2.

Method for estimating AADPT using local calibration.

Step #1 of the method consists of detailing a given DD model from the literature in relation to significant independent variables, model coefficients, and model structure (e.g., log-linear or negative binomial distributions). In Step #2.1, the independent variables identified in Step #1 are collected for the target jurisdiction, then the AADPT is naïvely predicted ( $Y_{1}^{'}$ ). In Step #2.2, the observed AADPT is estimated using signalized intersections where pedestrian volume data are available. We decided to choose sites for which CCs are available so that AADPT can be either directly estimated when at least 1 year of data is available or estimated with the use of expansion factors when less than a year of data is available (we considered sites with time series of daily counts across at least 30 days). This avoids the errors that would be introduced by expanding STCs from just one or a few days to AADPT.

Steps #1, #2.1, and #2.2 represent the naïve transferability approach and are pre-requisites for the local calibration. More details on these steps, on the features of the DD models selected from the literature, and on how the observed AADPTs were estimated can be found in Sobreira and Hellinga ( 9 ).

Five models (A to E) were proposed for the development of the local calibration (Equations 2 to 6). In addition, a base model representing the naïve transferability was also considered for comparison purposes (Equation 1). Models A and B correspond to scaling factor models. Models C and D consist of a calibration of linear and power regression, respectively, using $Y_{2}$ as the dependent variable and $Y_{1}^{'}$ as the independent variable. In Model E, a coefficient is calibrated for each pair of independent variables and model coefficient in the DD model. It was expected that Models A to D would capture the “macro” aspects of the differences in pedestrian activity between the two jurisdictions (e.g., different levels of motorization or transit supply), whereas Model E could incorporate variations in pedestrian activity at an “independent variable” level. For example, a school in Jurisdiction #1 may produce N times more pedestrian trips than a school in Jurisdiction #2.

The selected DD models for this task were developed using either log-linear or negative binomial distributions ( 11 – 13 ). In both distributions, the dependent variable was ln-transformed before the model calibration. We tried the application of the local calibration models using both the natural logarithm of AADPT and AADPT itself. The former approach presented better results and was the one adopted. The exponential of the natural logarithm of AADPT was then taken to obtain the locally calibrated AADPT.

Base model: no change to the original model from Jurisdiction #1 (naïve transferability)

\ln (Y_{base, i, 2}^{'}) = \ln (Y_{i, 1}^{'}) = β_{0} + β_{1} X_{i, 1} + \dots + β_{j} X_{i, j}

(1)

Model A: calibration factor ( $α$ )

α = \frac{1}{k} \sum_{i = 1}^{k} \frac{\ln (Y_{i, 2})}{\ln (Y_{i, 1}^{'})} ∴ \ln (Y_{A, i, 2}^{'}) = α \ln (Y_{i, 1}^{'})

(2)

Model B: calibration factor Version 2 ( $α$ )

α = \frac{\sum_{i = 1}^{k} \ln (Y_{i, 2})}{\sum_{i = 1}^{k} \ln (Y_{i, 1}^{'})} ∴ \ln (Y_{B, i, 2}^{'}) = α \ln (Y_{i, 1}^{'})

(3)

Model C: linear regression to obtain slope ( $α$ ) and intercept ( $γ$ )

\ln (Y_{C, i, 2}^{'}) = α \ln (Y_{i, 1}^{'}) + γ

(4)

Model D: power regression to obtain coefficients $α$ and $γ$

\ln (Y_{D, i, 2}^{'}) = α \ln (Y_{i, 1}^{'})^{γ}

(5)

Model E: linear regression to obtain adjustment ( $α$ ) to each independent variable and coefficient

\ln (Y_{E, i, 2}^{'}) = α_{0} β_{0} + α_{1} β_{1} X_{i, 1} + \dots + α_{j} β_{j} X_{i, j}

(6)

where

$i$ = a given signalized intersection for which AADPT is known,

$j$ = number of explanatory variables in the DD model,

$β$ = model coefficients from the literature DD model,

$X$ = explanatory variables extracted from Jurisdiction #2,

$k$ = number of sites used where AADPT is known,

$α$ and $γ$ = local calibration coefficients, and

$Y_{M, i, 2}^{'}$ = locally calibrated prediction of AADPT for site i in the target jurisdiction using local calibration model, M, where M = {A, B, C, D, E}.
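The calibration in Equations 2 to 6 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, NumPy is assumed, and the fit of Model D through a log-log linearization is an assumption made here for simplicity (it requires the ln-transformed values to be positive, i.e., AADPT greater than 1). Model E is fitted by ordinary least squares and needs more calibration sites than coefficients.

```python
import numpy as np

def calibrate_a_to_d(ln_naive, ln_obs):
    """Fit Models A-D on k calibration sites; inputs are ln(AADPT) arrays."""
    x = np.asarray(ln_naive, dtype=float)   # ln(Y'_{i,1}), naive predictions
    y = np.asarray(ln_obs, dtype=float)     # ln(Y_{i,2}), observed values
    alpha_a = float(np.mean(y / x))         # Eq. 2: mean of per-site ratios
    alpha_b = float(y.sum() / x.sum())      # Eq. 3: ratio of sums
    alpha_c, gamma_c = np.polyfit(x, y, 1)  # Eq. 4: slope and intercept
    # Eq. 5, fitted by OLS on ln(y) = ln(alpha) + gamma * ln(x).
    # ASSUMPTION: this linearization is illustrative and needs x, y > 0.
    gamma_d, ln_alpha_d = np.polyfit(np.log(x), np.log(y), 1)
    return {"A": alpha_a, "B": alpha_b,
            "C": (float(alpha_c), float(gamma_c)),
            "D": (float(np.exp(ln_alpha_d)), float(gamma_d))}

def calibrate_e(beta, X, ln_obs):
    """Fit Model E (Eq. 6): one alpha for beta_0 and one per beta_m * X_m term."""
    beta = np.asarray(beta, dtype=float)    # [beta_0, beta_1, ..., beta_j]
    X = np.asarray(X, dtype=float)          # shape (k, j), target-jurisdiction data
    terms = np.column_stack([np.full(len(X), beta[0]), X * beta[1:]])
    alpha, *_ = np.linalg.lstsq(terms, np.asarray(ln_obs, dtype=float), rcond=None)
    return alpha

def predict_aadpt(model, params, ln_naive):
    """Return calibrated AADPT for Models A-D, back-transformed with exp."""
    x = np.asarray(ln_naive, dtype=float)
    if model in ("A", "B"):
        return np.exp(params[model] * x)
    a, g = params[model]
    return np.exp(a * x + g) if model == "C" else np.exp(a * x ** g)
```

Models A and B differ only in where the averaging happens: Model A averages the per-site ratios, so a site with a small naïve estimate can weigh heavily, whereas Model B takes the ratio of the sums.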

Evaluation Methodology

The performance of the different proposed local calibration models will depend on (i) the number of sites ( $k$ ) in the target jurisdiction for which AADPT are available, (ii) the characteristics of these sites, (iii) the DD model adopted, and (iv) the characteristics of the target jurisdiction. The first two factors can be considered random variables. As a result, a Monte-Carlo simulation method was proposed to assess the performance of each of the five local calibration models across a range of sites. The steps are described below. Steps 3 to 6 were simulated 300 times for each model and each value of $k$ . The reason why we performed the simulation 300 times is discussed in the next section.

1. Identify sites where AADPT is known. Sample size is equal to $n$ .

2. Set sample sizes of sites used for local calibration ( $k = 5, 10, 15, \dots, n$ ).

3. Randomly select $k$ sites from the entire sample to perform the local calibration.

4. Calibrate the coefficients $α$ and $γ$ .

5. Obtain $Y_{2}^{'}$ for the entire sample.

6. Compute the error metric ( $E$ ) as the ratio between MAE and average observed AADPT (Equation 7).

E = \frac{MAE}{\bar{Y_{2}}} = \frac{\frac{1}{n} \sum_{i = 1}^{n} | Y_{i, 2} - Y_{i, 2}^{'} |}{\frac{1}{n} \sum_{i = 1}^{n} Y_{i, 2}}

(7)

The median of the error indicators ( $E$ ) obtained in Step 6 was calculated for each local calibration model and each value of $k$ and then compared with the result achieved with the base model (i.e., naïve transferability). The magnitude of this indicator obtained in the development of the original models is typically between 0.60 and 0.65 (Table 1). This value was used as a benchmark for determining the quality of the local calibration. Note that the error metric carries errors from the original DD model development and from the spatial transferability itself. We were not able to separate these errors.
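The simulation loop for one local calibration model and one value of $k$ can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the function names are hypothetical, NumPy is assumed, Model C (Equation 4) stands in for any of the five models, sites are sampled without replacement, and all AADPT values are assumed to be positive so that the logarithm is defined.

```python
import numpy as np

def error_metric(observed, predicted):
    """Equation 7: MAE divided by the mean observed AADPT."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(observed - predicted)) / np.mean(observed))

def median_error_model_c(naive, observed, k, runs=300, seed=0):
    """Median of E over repeated random calibration samples of size k."""
    rng = np.random.default_rng(seed)
    naive = np.asarray(naive, dtype=float)        # Y'_1 for all n sites
    observed = np.asarray(observed, dtype=float)  # Y_2 for all n sites
    ln_naive, ln_obs = np.log(naive), np.log(observed)
    errors = []
    for _ in range(runs):
        # Step 3: draw k calibration sites (ASSUMPTION: without replacement)
        idx = rng.choice(len(naive), size=k, replace=False)
        # Step 4: calibrate slope and intercept of Model C on those sites
        slope, intercept = np.polyfit(ln_naive[idx], ln_obs[idx], 1)
        # Step 5: apply the calibrated model to the entire sample
        calibrated = np.exp(slope * ln_naive + intercept)
        # Step 6: error metric over all n sites
        errors.append(error_metric(observed, calibrated))
    return float(np.median(errors))
```

The value returned for each $k$ would be compared with `error_metric(observed, naive)`, which corresponds to the base model and does not depend on $k$.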

As the reader may have noticed, small samples of sites where AADPT is known were available in the jurisdictions of the study (Table 1: Milton = 21 sites, Pima County = 81, and Toronto = 28). This constrained our ability to carry out in-depth analyses, for example, defining the optimal number or percentage of sites that maximize the cost–benefit of the local calibration. Thus, we complemented our evaluation using the empirical datasets with an evaluation based on synthesized AADPT data for real sites in the same three jurisdictions. The method of generating the synthetic AADPT data is described in the next section.

Generating Synthetic AADPT Data

The objective of synthesizing AADPT data was to create a larger dataset on which to evaluate the five local calibration models. The main challenge in this task was to generate synthetic AADPT values that were realistic given the characteristics of the site and the jurisdiction. Considering that, we proposed the use of DD models to generate synthetic estimates. The rationale behind this decision was that the DD models had been calibrated to capture the relationships between site characteristics and AADPT and were calibrated in different jurisdictions, so each DD model represented a particular environment with regard to the relationship between pedestrian activity and socioeconomic and land-use features. These site-specific and jurisdiction-specific impacts were captured by the DD model via the explanatory variables and associated coefficients. As long as the explanatory variables of a given DD model are available for the sites, synthetic AADPTs can be generated. An example is provided to illustrate the method: Griswold et al. developed their DD model using sites from the state of California in the United States ( 13 ). We applied this DD model to all sites in a target jurisdiction (for example, Milton) to create synthetic AADPT to represent $Y_{2}$ . Then, it was possible to assess the spatial transferability and local calibration of a different DD model.

Generation of the synthetic AADPT starts with the naïve estimate using a given DD model (Equation 8).

Y_{i, 2}^{*} = e^{β_{0} + β_{1} X_{i, 1} + \dots + β_{n} X_{i, n}}

(8)

where $Y_{i, 2}^{*}$ = initial synthetic AADPT estimate.

We decided to set an upper limit threshold ( $Y_{\max, j}$ , where j refers to the jurisdiction) for $Y_{i, 2}^{*}$ for two reasons: 1) to avoid unrealistic AADPT estimates, and 2) to avoid datasets with a very wide range of AADPT values because sites with large AADPT estimates would dominate performance evaluations (i.e., the $E = MAE / {\bar{Y}}_{2}$ indicator would be biased toward the performance of these sites). Thus, we investigated the $Y_{i, 2}^{*}$ estimates obtained for all sites in each jurisdiction and selected a value for $Y_{\max, j}$ for each jurisdiction, j, by considering the maximum AADPT value observed in the “real” dataset and the number of sites that would be available for the analysis (i.e., if $Y_{i, 2}^{*} > Y_{\max, j}$ , then site i was eliminated from the dataset). For example, in Pima County, $Y_{\max, Pima}$ was set at 489. This number represents the maximum “real” AADPT value (163) multiplied by three, and resulted in the removal of 160 out of 744 sites (signalized intersections).

After filtering the datasets based on the $Y_{\max, j}$ threshold, an error term was included in the synthetic AADPT estimates to ensure realistic variability across sites was considered (Equation 9) and so that different populations of synthesized AADPTs could be generated for the sites in each jurisdiction from each DD model. The error term follows a normal distribution with a mean equal to zero and a given variance (Equation 10). A variance (represented by the standard deviation) was obtained for each site as a function of a given coefficient of variation (COV) and the $Y_{i, 2}^{*}$ estimate (Equation 11). A single COV was calibrated for the application of each jurisdiction and DD model with the goal of generating synthetic datasets with an error metric ( $E = MAE / {\bar{Y}}_{2}$ ) of approximately 0.60 to 0.65 when comparing the AADPT estimates with and without the error term. In other words, we aimed to create synthetic datasets with AADPT variability (i.e., errors between observed and predicted AADPTs) similar to that achieved in the original DD model calibration.

Y_{i, 2} = Y_{i, 2}^{*} + ε_{i}

(9)

ε_{i} \sim Normal (0, s_{i}^{2})

(10)

s_{i} = COV \times Y_{i, 2}^{*}

(11)

where

$ε_{i}$ = random error assigned to site i. This term was constrained by $\pm 2 s_{i}$ ;

$s_{i}^{2}$ = variance of $ε_{i}$ ;

$s_{i}$ = standard deviation of $ε_{i}$ ; and

$COV$ = coefficient of variation calibrated for the application of each jurisdiction and DD model.

The inclusion of the random error resulted in the estimation of sites with zero or negative AADPTs, especially for sites with small values of $Y_{i, 2}^{*}$ (i.e., the estimate obtained from the naïve transferability of the DD model). Although negative AADPTs could be treated as zero, we would get 10% to 15% of sites with zero AADPT. One would expect several sites with very low pedestrian activity, but this quantity of null activity was unrealistic. To prevent an overrepresentation of AADPT = 0, a secondary error term was added to $Y_{i, 2}$ when its result was nonpositive (Equation 12),

ε_{i}^{*} = \text{random between } \left( 1, \max \left\{ 50, 0.25 \times Y_{i, 2}^{*} \right\} \right)

(12)

where $ε_{i}^{*}$ = secondary error term added to $Y_{i, 2}$ when it is nonpositive.

The functional form and parameters of Equation 12 are subjective but were based on examination of the observed AADPT data from the three jurisdictions used in this study and on the requirements to retain stochasticity in the synthesized data while limiting the number of sites with synthesized AADPT values equal to zero.
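The generation procedure in Equations 9 to 12 can be sketched as below. This is a hedged illustration, not the authors' implementation: the function name is hypothetical, NumPy is assumed, the $\pm 2 s_{i}$ constraint is implemented by clipping (resampling would be another valid reading), the "random between" draw of Equation 12 is assumed to be uniform, and nonpositive values are assumed to be set to zero before the secondary term is added.

```python
import numpy as np

def synthesize_aadpt(y_star, cov, y_max, seed=0):
    """Return (kept naive estimates, synthetic AADPT) for one jurisdiction."""
    rng = np.random.default_rng(seed)
    y_star = np.asarray(y_star, dtype=float)
    y_star = y_star[y_star <= y_max]              # drop sites above Y_max
    s = cov * y_star                              # Eq. 11: site standard deviation
    eps = rng.normal(0.0, s)                      # Eq. 10: zero-mean normal error
    eps = np.clip(eps, -2.0 * s, 2.0 * s)         # ASSUMPTION: clip to +/- 2 s_i
    y = y_star + eps                              # Eq. 9
    nonpos = y <= 0
    upper = np.maximum(50.0, 0.25 * y_star)       # Eq. 12: upper bound of the draw
    # ASSUMPTION: nonpositive values are treated as zero, then a uniform
    # secondary error term is added so that the site keeps some activity.
    y[nonpos] = rng.uniform(1.0, upper[nonpos])
    return y_star, y
```

Repeating the call with different seeds gives different synthetic populations for the same jurisdiction and DD model. The COV itself would be tuned separately until the error metric between the estimates with and without the error term falls near the 0.60 to 0.65 benchmark.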

Because of the addition of the random error terms, there is an unlimited number of synthetic AADPTs for a given jurisdiction using a given DD model. As a result, 10 sets of synthetic AADPTs were created for each jurisdiction and DD model combination, and 30 repetitions for each set were used in the simulation process outlined in the previous section. The 300 simulations used in the real AADPT approach were set to match the number of simulations in the synthetic approach. The performance of the five local calibration models considering both “real” and synthetic AADPTs is presented in the next section.

Results and Discussion

Table 2 provides an overview of the number of sites and descriptive statistics of the site AADPT values as obtained from the real AADPT and synthetic AADPT approaches. For the synthetic approach, we decided to split the City of Toronto sites into two datasets—Downtown and Other—because we observed that the AADPT estimates in Downtown were much larger than the ones in the rest of the city. This division was made to avoid biasing the results of the local calibration toward the sites in Downtown. Synthetic datasets were generated for each jurisdiction using the following DD models: Singleton et al. ( 11 ), Schneider et al. ( 12 ), and Griswold et al. ( 13 ), except for the application of Schneider et al. ( 12 ) in Downtown Toronto, which was not possible owing to the generation of unrealistically large estimates (the Schneider et al. model [ 12 ] provides highly unrealistic AADPT estimates when employment density becomes large, as was the case for the Downtown Toronto sites). The evaluation using the real AADPT data used the datasets from Milton, Pima County, and Downtown Toronto.

Table 2.

Descriptive Summary: Sample Size and AADPT

$Y_{2}$ obtained from	Number of sites	Average AADPT¹	Median AADPT¹	Min. AADPT¹	Max. AADPT¹
Milton
Real AADPT	21	313	226	22	873
Griswold et al. ( 13 )	94	180	153	0	922
Schneider et al. ( 12 )	94	106	74	11	824
Singleton et al. ( 11 )	94	120	73	2	500
Pima County
Real AADPT²	79	28	16	2	163
Griswold et al. ( 13 )	584	119	90	0	469
Schneider et al. ( 12 )	584	89	51	12	478
Singleton et al. ( 11 )	584	152	124	1	489
Toronto (Downtown)
Real AADPT	28	23,481	22,409	2,443	55,707
Griswold et al. ( 13 )	197	32,174	18,200	368	107,873
Singleton et al. ( 11 )	197	11,805	7,303	288	73,986
Toronto (Other)
Griswold et al. ( 13 )	1,275	317	217	0	3,979
Schneider et al. ( 12 )	1,275	1,181	618	22	5,960
Singleton et al. ( 11 )	1,275	1,109	664	9	5,871

Note: AADPT = annual average daily pedestrian traffic; Max. = maximum; Min. = minimum.

¹ Statistics for the synthetic approach were calculated before the inclusion of the error term ( $Y_{i, 2}^{*}$ ).

² Two sites were removed because they presented very small AADPT estimates using Griswold et al.’s ( 13 ) model, which was biasing the local calibration process.

From Table 2 it can be observed that the synthetic data showed similar, but not exactly the same, tendencies across jurisdictions as did the real AADPT data. For example, the real data showed that the Downtown Toronto sites had much larger AADPT than the other jurisdictions. This difference was also reflected in the synthesized data. However, the real data also exhibited a larger mean and median for Milton than Pima County, but for the synthesized data, the mean and median values were much more similar. It is also worth noting that some of the synthetic datasets presented AADPTs with similar central tendencies; for example, the models of Singleton et al. ( 11 ) and Schneider et al. ( 12 ) provided very similar synthesized data when applied to Milton and when applied to Toronto-Other. This had an impact on the performance of the local calibration models, which is discussed in the next section.

Local Calibration Using Synthetic AADPTs

Figure 3 presents a sample of the results obtained with the local calibration using the synthetic datasets. Each graph represents the local calibration of a given DD model to a given target jurisdiction in which synthetic AADPTs were generated using a given DD model. In total, 31 graphs were created: nine for each of Milton, Pima County, and Toronto-Other, and four for Toronto-Downtown, where Schneider et al.’s ( 12 ) model could not be applied. Note that we also considered the case where a DD model was transferred to a jurisdiction in which the synthetic AADPTs were generated using the same DD model (Figure 3f). Our intention was to evaluate the local calibration in scenarios where the target jurisdiction has an identical relationship between pedestrian activity and socioeconomic/land-use features to the one observed in the jurisdiction where the model was originally developed.

Figure 3.

Results of local calibration using synthetic AADPTs: (a) Milton (Griswold et al./Schneider et al.); (b) Pima County (Singleton et al./Griswold et al.); (c) Toronto-Downtown (Griswold et al./Singleton et al.); (d) Toronto-Other (Singleton et al./Schneider et al.); (e) Pima County (Griswold et al./Singleton et al.); and (f) Toronto-Other (Singleton et al./Singleton et al.)

Each point in the Figure 3 graphs represents the median of the error indicator, $E = MAE / {\bar{Y}}_{2}$ , for the 300 observations considered in the simulation procedure. The x-axis presents the percentage of sites used for the local calibration, ranging from 2.5% to 100% (with the associated number of sites, k, provided in parentheses). Each color represents one local calibration model (Equations 1 to 6). As a reminder, Models A and B correspond to scaling factor models; Models C and D consist of a calibration of linear and power regression, respectively. In Model E, a coefficient is calibrated for each pair of independent variables and model coefficient in the DD model. The calibration of Model E requires the number of observations to be greater than the number of explanatory variables in the DD model, which is why Model E is not visible when the number of sites used for local calibration is small. The y-axis scale of each graph is selected to be appropriate for the associated jurisdiction. The performance of the base model (i.e., naïve transferability) does not vary with the number of sites available for local calibration and is therefore a horizontal line.

The performance of the local calibration models varied significantly across the graphs in Figure 3. Figure 3, a to d, presents situations in which one or more local calibration models produced improvements when compared with the base model (i.e., naïve transferability). However, Figure 3, e and f, shows cases where the effect of local calibration was null or negative. The main reason for the variability in the performance was associated with the level of similarity of the relationship between pedestrian activity and socioeconomic/land-use features in the target jurisdiction and in the one where the DD model was originally developed. One would expect the local calibration to provide improvements when the two jurisdictions exhibit different pedestrian activity relationships, especially if systematic underestimation or overestimation in the AADPT obtained from naïve transferability is observed. Local calibration Models A to D (Equations 2 to 5) were formulated to address scaling differences between pedestrian activity in the two jurisdictions. Conversely, the local calibration would be expected to be less beneficial when the two jurisdictions are very similar.

Model E (Equation 6) was expected to be beneficial when differences in the two jurisdictions were not predominantly a global scaling but reflected different levels of pedestrian activity for specific explanatory variables (e.g., a retail place generates X times more trips in the target jurisdiction than in the jurisdiction where the model was developed, but schools generate Y times more trips and X and Y are not similar). However, a review of all 31 graphs produced revealed that it was necessary for observed AADPT data to be available from at least 70% of the total sites before Model E provided results that were better than the base model. From a practical perspective, this suggests that local calibration Model E was only preferred under conditions when it would be both possible and desirable to develop a DD model directly for the target jurisdiction using the available data.

Figure 4 illustrates the effect of similarity of jurisdictions on local calibration model performance. Figure 4a shows a scatterplot between the naïve AADPT estimates using Griswold et al. ( 13 ) and the synthetic AADPT estimates using Schneider et al. ( 12 ) for Milton. This represents one of the 10 datasets used to obtain Figure 3a, where the local calibration performed well. One may notice that overestimation prevailed in Figure 4a, which is the reason why the $E = MAE / {\bar{Y}}_{2}$ indicator of 1.06 from the base model quickly improved to a magnitude of 0.80 when local calibration Models B, C, or D were used. Figure 4b shows one of the 10 datasets used to produce Figure 3e, a scenario where the local calibration showed a null or negative effect. Although notable differences between the estimates were observed, no systematic under/overestimation occurred, which is why the local calibration models were not able to provide any improvement over the base model.

Figure 4.

Scatterplots of estimated AADPT versus synthetic AADPT: (a) Milton, and (b) Pima County.

Local calibration Model A presented a different pattern (i.e., error increased as more sites were available for local calibration) than the other local calibration models in Figure 3, a and b, and this difference was also observed in some of the graphs that were not displayed in this work. This pattern occurred because of how Model A was formulated (Equation 2). Take the example where the ratios between the observed AADPT ( $Y_{2}$ ) and predicted AADPT ( $Y_{1}^{'}$ ) for two sites were 2 and 0.5. Ideally, one would set the local calibration coefficient ( $α$ ) to be equal to one. However, a value of 1.25 was obtained when using Equation 2. Therefore, the sites where $Y_{2}$ was greater than $Y_{1}^{'}$ dominated how Model A was calibrated, resulting in poor performances when no under/overestimation was systematically observed.
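The arithmetic behind this example can be checked with a short sketch. It assumes that Equation 2 computes the coefficient as the arithmetic mean of the site-level ratios of observed to predicted AADPT; this is consistent with the value of 1.25 quoted above, but the equation is not restated in this section. The geometric mean is shown only as a hypothetical point of comparison that treats overestimation and underestimation symmetrically; it is not one of the proposed models.

```python
from statistics import geometric_mean, mean

# Ratios of observed to predicted AADPT for the two sites in the example.
ratios = [2.0, 0.5]

# Assumed form of Model A: arithmetic mean of the ratios, (2 + 0.5) / 2.
alpha_arithmetic = mean(ratios)

# Hypothetical alternative for comparison: geometric mean, sqrt(2 * 0.5).
alpha_geometric = geometric_mean(ratios)

print(f"arithmetic mean of ratios: {alpha_arithmetic:.2f}")
print(f"geometric mean of ratios:  {alpha_geometric:.2f}")
```

Under this assumption, a site underestimated by a factor of 2 pulls the coefficient upward by more than a site overestimated by a factor of 2 pulls it downward, which matches the explanation that sites with observed AADPT above the prediction dominate the calibration.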

Consistent patterns for the performance of Models B to E across the graphs were observed in Figure 3. When small percentages of the total sites were available for local calibration (e.g., 2.5% and 5%), Models B to D outperformed Model E. Model E turned out to be the best option when a relatively large sample of sites was available for the local calibration (between 50 and 100 sites, depending on the jurisdiction). However, if this amount of information was available, it would probably be preferable for the jurisdiction to develop its own DD model instead of transferring and locally calibrating a DD model developed in another jurisdiction. Further research is required to confirm this statement.

Table 3 presents a summary of all 31 synthetic applications, considering the use of 10% of the sites for the local calibration. Each cell of Table 3 represents the median of $E = MAE / {\bar{Y}}_{2}$ . The shading represents the comparison of the performance within each application. When the same DD model was used for both synthetic AADPT generation and local calibration, the error metric $(E)$ of the base model was between 0.60 and 0.65, as imposed by the procedure of generating the synthetic estimates. Three main findings were made based on the results in Table 3:

Although similar performances were observed between Models B to D, Models B and C performed slightly better than the others.

The “cost” of applying the local calibration to a jurisdiction that is similar to the one where the DD model was originally developed was low. This was exemplified by the 11 cases where the same DD model was used for both synthetic generation and local calibration. For these 11 cases, the use of local calibration Models B and C resulted in errors that were, on average, 5.7% (95% CI: 4.4% to 7.0%) larger than using naïve transferability (base model). Note that in these cases, the base model represented the best estimate that one could get, since the only difference between observed and predicted AADPTs was the random error term that was added when generating the synthetic AADPTs. Therefore, the cost may be overestimated for practical applications because it is unlikely that two different jurisdictions will present exactly the same relationship between pedestrian activity and socioeconomic/land-use features.

The “benefit” of applying the local calibration to a jurisdiction that is not necessarily similar to the one where the DD model was originally developed was about 2.5 times greater than the estimated cost in the previous item. This scenario was represented by the 20 cases in which different DD models were used for generating the synthetic AADPTs and for transferability. For these cases, the use of local calibration models B and C resulted in errors that were, on average, 14.6% (95% CI: 6.7% to 22.4%) lower than using naïve transferability (base model).

Table 3.

Summary of the Synthetic AADPT Approach using 10% of Sites for Local Calibration

Note: AADPT = annual average daily pedestrian traffic; na = not applicable.

The shading represents the comparison within each application: green = lowest value, gray = mid value, and red = highest value.

Detailing the performance of Models B and C, Figure 5 presents the average and the 95% confidence interval around the average of the percent difference in error indicator, E, when comparing Models B and C to the base model. Each element of Figure 5 was calculated using information from the 31 different applications (i.e., including cases where jurisdictions were similar and when they were different). Positive values mean an increase in error indicator E with the local calibration, whereas negative values refer to improvements in the transferability (decrease in error). It was observed that Model B outperformed Model C when 2.5% of the sites were used for the local calibration, but we could not statistically confirm the difference in the average performance of the models (p-value for a two-tailed hypothesis test = 0.13). When 5% or more was available, the performance was similar, with Model C showing slightly better results. It was also seen that the performance was steady after reaching 10% of sites. Consequently, collecting AADPT for 10% of the sites may be the ideal number for jurisdictions that are interested in applying this approach. The application of Model C using 10% of sites resulted in an average improvement of 8.7% (95% CI: 0.6% to 16.9%). The smaller improvement values were associated with cases in which the jurisdictions were the same or similar, and large improvement values with cases in which jurisdictions exhibited greater differences.
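The percent difference in the error indicator and its confidence interval can be sketched as follows. The interval uses a normal approximation, which is an assumption made here for illustration; the method used to obtain the reported intervals is not stated in this section. The function names and the list of percent differences are hypothetical; only the Milton values of 1.06 and 0.80 come from the discussion of Figure 4a.

```python
from statistics import NormalDist, mean, stdev

def percent_difference(e_calibrated, e_base):
    """Percent change in E relative to the base model (negative = improvement)."""
    return 100.0 * (e_calibrated - e_base) / e_base

def mean_with_ci(values, confidence=0.95):
    """Mean with a normal-approximation confidence interval (assumed method)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * stdev(values) / len(values) ** 0.5
    centre = mean(values)
    return centre, centre - half_width, centre + half_width

# Milton dataset discussed with Figure 4a: E drops from 1.06 to about 0.80.
milton_change = percent_difference(0.80, 1.06)

# Hypothetical percent differences for five applications (illustration only).
changes = [-24.5, -10.0, 5.0, -15.0, 0.0]
avg, lower, upper = mean_with_ci(changes)

print(f"Milton change in E: {milton_change:.1f}%")
print(f"average change: {avg:.1f}% (95% CI: {lower:.1f}% to {upper:.1f}%)")
```

With the sign convention of Figure 5, an interval that lies entirely below zero would indicate an improvement over the base model at the chosen confidence level.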

Figure 5.

Performance of Models B and C when compared with the base model (negative values indicate improved accuracy versus base model).

We frequently associated the performance of the local calibration with the level of similarity between the target jurisdiction (where AADPT was synthetically generated using a DD model) and the jurisdiction where the DD model that was being transferred was developed. We tested three DD models that included different explanatory variables collected within different radii. In addition, we only had access to the brief descriptive analysis (e.g., average, standard deviation, minimum, and maximum) presented in the original papers to characterize each jurisdiction. Consequently, our ability to comprehensively understand the synthetic AADPTs that were generated and compare the land-use and socioeconomic attributes of the jurisdictions was constrained. This is why we could not perform a thorough analysis and limited ourselves here to investigating the differences between synthetic and estimated AADPT (Figure 4). The local calibration using real AADPTs is assessed in the next section. This allowed us to have better control over the observed AADPTs, the characteristics of the target jurisdictions (Figure 1), and the performance of the naïve transferability (Table 1); therefore, we were able to explore the effect of the level of similarity between the two jurisdictions in more detail.

Local Calibration Using Real AADPTs

Table 4 summarizes the local calibration using real AADPTs. As the number of sites where real AADPT was known was very limited, a local calibration was conducted employing small samples (at five or six sites). The results indicated that on average Model C performed the best, an outcome that was consistent with the results from the synthetic approach.

Before detailing the results, we suggest the reader revisit Figure 1 and Table 1. Milton was observed to be similar to the jurisdictions where the DD models were originally calibrated with regard to attributes (Figure 1) and pedestrian activity (similar average AADPT for the application in Table 1). Pima County presented reduced magnitudes in both attributes and pedestrian activity (average observed AADPT = 28); however, the transferability of the DD models resulted in systematic overestimation of the AADPTs, suggesting that the relationship between attributes and pedestrian activity (i.e., model coefficients) was different in Pima County when compared with the jurisdictions where the DD models were developed. Downtown Toronto was a particular case because overestimation and underestimation were observed when transferring the Griswold et al. ( 13 ) and Singleton et al. ( 11 ) models, respectively.

Table 4.

Summary of the Real AADPT Approach

Note: na = not applicable.

The shading scale represents the comparison within each application: green = lowest value, gray = mid value, and red = highest value.

The naïve transferability (base model) of the DD models for Milton performed well, providing accuracy that was compatible with, and sometimes better than, the benchmark measure of $E = MAE / {\bar{Y}}_{2}$ between 0.60 and 0.65. As expected, the application of local calibration provided only modest improvements over the base model. The performance was slightly improved with the local calibration of the Schneider et al. ( 12 ) and Singleton et al. ( 11 ) DD models and marginally worsened with the local calibration of the Griswold et al. ( 13 ) model.

The results for the Pima County and Toronto jurisdictions were different. The base model (naïve transferability) performed poorly, and systematic overestimation and underestimation were observed for Pima County and Toronto, respectively. Significant improvements in the error metric $(E)$ were observed even with the use of very small sample sizes for local calibration, leading to a reduction in the errors to the benchmark level of accuracy. This indicated that the locally calibrated DD model performed as well as we expected a locally developed DD model to do, demonstrating the potential of the technique for real-world applications.

Considering the results of the eight applications presented in Table 4, Model C provided an average improvement of 35.0% (95% CI: 4.5% to 65.4%) when compared with the base model (naïve transferability). These results corroborated the findings using the synthetic AADPT approach, specifically:

If the target jurisdiction and the one where the DD model was originally developed are similar in the relationship between pedestrian activity and socioeconomic/land-use features, local calibration provided modest improvements over naïve transferability. This situation can be observed in Table 4 for the application of Griswold et al. ( 13 ) and Singleton et al. ( 11 ) models to the Milton sites. That local calibration did not provide significant benefits for these cases is acceptable because when the jurisdictions are very similar, then naïve transferability can be expected to provide adequate results.

When the jurisdictions were different, local calibration showed great potential for improving performance versus naïve transferability, even when the number of sites where AADPT was known was very limited. This is evident from Table 4 for the application of the local calibration models to the Toronto and Pima County sites.

Conclusions

This work examined the enhancement of the spatial transferability of DD models available in the literature. The models were locally calibrated in the target jurisdictions by making use of sites where AADPT is readily known. To perform the local calibration, five models were proposed and tested using datasets with (a) synthetic AADPT and (b) real AADPT. The results showed great potential for the local calibration to improve the naïve transferability of DD models.

For the synthetic application, 31 datasets were generated. In general, it was observed that the local calibration provided substantial benefits when the target jurisdiction and the one where the DD model was originally calibrated differed in the relationship between pedestrian activity and socioeconomic/land-use features. In this case, an average improvement of 14.6% (95% CI: 6.7% to 22.4%) was observed when compared with the naïve transferability. Conversely, when the two jurisdictions were identical, use of the local calibration made the prediction slightly worse (average: 5.7%; 95% CI: 4.4% to 7.0%). The benefits were about 2.5 times greater than the “costs,” and one would not expect two jurisdictions to show the exact same pattern between pedestrian activity and socioeconomic/land-use features, indicating that the costs were probably overestimated.

As to the best model for performing the local calibration, Models B and C stood out. When 2.5% of the sites were used for local calibration, Model B outperformed Model C. The performances were comparable when 5% or more was available, with Model C producing slightly better results. The application of Model C using 10% of sites resulted in an average improvement of 8.7% (95% CI: 0.6% to 16.9%).

The results achieved from the real AADPT approach corroborated the findings from the synthetic approach. The naïve transferability for Milton already provided very reasonable results ( $E$ of 0.52 to 0.65), so local calibration provided only modest improvements. Concerning Pima County and Downtown Toronto, the naïve estimates showed systematic overestimation and underestimation, respectively. As a result, local calibration provided substantial improvements, even with the use of very small sample sizes. Considering the three jurisdictions, Model C provided an average improvement of 35.0% (95% CI: 4.5% to 65.4%) when compared with the naïve transferability. The magnitude of errors observed in the locally calibrated DD models was similar to that expected if a locally developed DD model was used, demonstrating the potential of the technique for real-world applications.

From a practical point of view, we offer some recommendations if one is willing to transfer and locally calibrate a given DD model:

Examine the similarities between the target jurisdiction and the one where the DD model was developed by comparing the magnitude of pedestrian activity (e.g., central tendencies of AADPT) and socioeconomic/land-use features considered in the DD model. This may give a clue as to the level of similarity between the two jurisdictions.

If a very limited sample size of sites where AADPT is known is available (i.e., 2.5% or lower), the use of Model B is recommended. Otherwise, Model C should be used.

If the target jurisdiction contains zones with different relationships between pedestrian activity and socioeconomic/land-use features, it is prudent to develop the local calibration separately for each zone. The City of Toronto illustrates this case. Downtown Toronto has a unique environment where, although trip generation attributes are also increased, the level of pedestrian activity is extremely high. One would expect the relationship between socioeconomic/land-use features and AADPT (i.e., DD model coefficients) to be different in Downtown than in the rest of the city.

Finally, we make three suggestions for future work:

This study used a combination of synthesized data and empirical data because a sufficiently large empirical dataset (in which pedestrian counts had not been affected by COVID-19) was not available. Although the empirical data analysis provided results that were consistent with those obtained from the analysis of the synthesized data, it would be prudent to confirm the results obtained from the synthetic approach using a database with larger sample sizes of sites where the real AADPT is known.

This work used only sites where CC stations were available, so relatively reliable AADPT estimates were considered. Many jurisdictions have STCs for a large proportion of their intersections, but a relatively small number of continuous count stations. It is recommended to extend the work from this study to examine the potential for performing local calibration of DD models with STCs expanded to AADPT or even unexpanded STCs.

It is recommended to investigate the optimal threshold (i.e., number or percentage of sites with AADPT available), separating the use of the local calibration approach from the development of a DD model for the target jurisdiction.

Footnotes

Acknowledgements

The authors gratefully acknowledge the financial support provided by Transport Canada and Natural Sciences and Engineering Research Council of Canada. The authors further gratefully acknowledge (i) the jurisdictions of Milton, Pima County, and Toronto for providing permission to use the pedestrian volume data and for providing rich open data portals that were essential sources of information for this research; and (ii) Miovision for providing access to the pedestrian data.

Author Contributions

The authors confirm contribution to the paper as follows: study conception and design: L. T. P. Sobreira, B. Hellinga; data collection: L. T. P. Sobreira, B. Hellinga; analysis and interpretation of results: L. T. P. Sobreira, B. Hellinga; draft manuscript preparation: L. T. P. Sobreira, B. Hellinga. All authors reviewed the results and approved the final version of the manuscript.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research received financial support from Transport Canada and Natural Sciences and Engineering Research Council of Canada.

ORCID iDs

Lucas Tito Pereira Sobreira

Bruce Hellinga

Data Accessibility Statement

Some or all data, models, or code used during the study were provided by a third party. Direct requests for these materials may be made to the provider as indicated in the Acknowledgments.

The research was carried out by the authors, and the work presented in this paper solely reflects the views of the authors. No explicit or implicit endorsement of the methods or findings by funding agencies or by any of the aforementioned jurisdictions/companies is claimed or implied.

References

1. NHTSA. Highlights of 2009 Motor Vehicle Crashes. Traffic Safety Facts Research Note. Report No. DOT HS 811 363. National Highway Traffic Safety Administration, 2010.

2. NHTSA. Overview of the 2019 Crash Investigation Sampling System. Traffic Safety Facts Research Note. Report No. DOT HS 813 060. National Highway Traffic Safety Administration, 2020.

3. Transport Canada. Canadian Motor Vehicle Traffic Collision Statistics: 2009. 2009. https://tc.canada.ca/en/road-transportation/publications/canadian-motor-vehicle-traffic-collision-statistics-2009. Accessed 5 May 2022.

4. Transport Canada. Canadian Motor Vehicle Traffic Collision Statistics: 2019. 2019. https://tc.canada.ca/en/road-transportation/statistics-data/canadian-motor-vehicle-traffic-collision-statistics-2019. Accessed 5 May 2022.

5. American Association of State Highway Transportation Officials. Highway Safety Manual. AASHTO, Washington, D.C., 2010.

6. Stipancic, Miranda-Moreno, Strauss, Labbe. Pedestrian Safety at Signalized Intersections: Modelling Spatial Effects of Exposure, Geometry and Signalization on a Large Urban Network. Accident Analysis & Prevention, Vol. 134, 2020, p. 105265. https://doi.org/10.1016/j.aap.2019.105265

7. Munira, Sener I. N., Dai. A Bayesian Spatial Poisson-Lognormal Model to Examine Pedestrian Crash Severity at Signalized Intersections. Accident Analysis & Prevention, Vol. 144, 2020, p. 105679. https://doi.org/10.1016/j.aap.2020.105679

8. Pulugurtha S. S., Sambhara V. R. Pedestrian Crash Estimation Models for Signalized Intersections. Accident Analysis & Prevention, Vol. 43, No. 1, 2011, pp. 439–446. https://doi.org/10.1016/j.aap.2010.09.014

9. Sobreira L. T. P., Hellinga. Comparing Direct Demand Models for Estimating Pedestrian Volumes at Intersections and Their Spatial Transferability to Other Jurisdictions. Transportation Research Record: Journal of the Transportation Research Board, 2023. 2677: 260–271. https://doi.org/10.1177/03611981231161061

10. FHWA. Traffic Monitoring Guide. Federal Highway Administration, U.S. Department of Transportation, Washington, D.C., 2016.

11. Singleton P. A., Park, Lee D. H. Varying Influences of the Built Environment on Daily and Hourly Pedestrian Crossing Volumes at Signalized Intersections Estimated From Traffic Signal Controller Event Data. Journal of Transport Geography, Vol. 93, 2021, p. 103067. https://doi.org/10.1016/j.jtrangeo.2021.103067

12. Schneider R. J., Schmitz, Qin. Development and Validation of a Seven-County Regional Pedestrian Volume Model. Transportation Research Record: Journal of the Transportation Research Board, 2021. 2675: 352–368. https://doi.org/10.1177/0361198121992360

13. Griswold J. B., Medury, Schneider R. J., Amos, Grembek. A Pedestrian Exposure Model for the California State Highway System. Transportation Research Record: Journal of the Transportation Research Board, 2019. 2673: 941–950. https://doi.org/10.1177/0361198119837235

14. Sanders R. L., Frackelton, Gardner, Schneider, Hintze. Ballpark Method for Estimating Pedestrian and Bicyclist Exposure in Seattle, Washington: Potential Option for Resource-Constrained Cities in an Age of Big Data. Transportation Research Record: Journal of the Transportation Research Board, 2017. 2605: 32–44. https://doi.org/10.3141/2605-03

15. Hankey, Mondschein, Buehler. Spatial Models of Active Travel in Small Communities: Merging the Goals of Traffic Monitoring and Direct-Demand Modeling. Journal of Transport & Health, Vol. 7, 2017, pp. 149–159. https://doi.org/10.1016/j.jth.2017.08.009

Enhancing the Spatial Transferability of Direct Demand Models for Estimating Pedestrian Volumes at Intersections
