Abstract
This paper describes a straightforward method for calculating an open-source Walkable Accessibility Score (WAS) that measures walkability at the block group scale based on walking distance to business establishments, schools, and parks. Exploratory analysis of the WAS reveals high concentrations of walkable accessibility in the centres of the densest and/or largest cities. Our optimised specification (decay = 0.008, upper = 800, k = 30) performs very well, achieving a Spearman rank correlation of 0.912 with proprietary Walk Score® values (for 2011). We provide pre-calculated data for each year from 1997 to 2019 and Python code for calculating the WAS at the project’s GitHub repository. The method is particularly useful in that it uses simple Euclidean distance calculations, and thus can be run at scale on a laptop or personal computer.
Introduction
Measuring walkability – the ability to access everyday services and amenities comfortably by foot – and its relationship to health, equity, the environment, and economic development has long been of interest to urban planners, urban designers, and researchers (Jacobs, 1961). This largely stems from the fact that urban development in the 20th century – particularly in North America – focused almost exclusively on creating land use patterns and built environments that were conducive to widespread and easy travel by automobile. Unfortunately, this auto-orientation has produced numerous negative externalities for urban areas, including increased air pollution and greenhouse gas emissions (Fang et al., 2015; Frank et al., 2006), higher rates of obesity and physical inactivity (De Nazelle et al., 2011; Lindström, 2008), and lack of access to employment opportunities (Kain, 1968; Kawabata, 2003; Lee et al., 2018; Ong and Houston, 2002).
Many of these negative externalities arise because the physical layout of the city structures people’s transportation choices, life opportunities, and outcomes through proximity to various necessities and amenities. In a completely auto-oriented environment, lack of regular use of an automobile significantly restricts an individual’s ability to reach the daily necessities and opportunities of life (Hägerstrand, 1970). A large body of research has measured the negative effects of the disparities in access to fresh food and grocery stores (Beaulac et al., 2009; Wrigley et al., 2002), healthcare (Kwan, 2013; Saxon and Snow, 2020; Shi et al., 2005; Starfield et al., 2005), childcare (Van Ham and Mulder, 2005), green and blue space (Jarosz, 2022; White et al., 2021), and jobs (Andersson et al., 2018; Clampet-Lundquist and Massey, 2008) that are a feature of most North American cities.
Widespread acknowledgement of the benefits of walkability (and dissatisfaction with auto-oriented sprawl) has fuelled a growing interest – both inside and outside of academia – in measuring walkable neighbourhoods, including recently popularised concepts such as the ‘15-min city’ (Ewing and Cervero, 2010; Moreno et al., 2021; Talen and Koschinsky, 2013). This work often focuses on accessibility metrics, including the ‘gravity’-based access score, the floating catchment area (FCA) approach (and extensions), and the rational agent access model (Isard, 1960; Luo, 2004; Saxon and Snow, 2020; Wang and Luo, 2005; Luo and Qi, 2009; Wan et al., 2012). A number of useful open-source packages and resources aimed at measuring walkability have recently been created (Pereira and Herszenhut, 2023; Saxon et al., 2020). However, there is also interest in metrics that incorporate the pedestrian experience of walking (Harvey, 2022). For example, one particular methodological focus has been on the use of street level imagery to assess walkability (Wang et al., 2024). Other studies link GIS and 3D virtual environments to provide a more pedestrian-focused measure of walkability (Ki et al., 2023). Audits using Google Street View images found that high-quality pedestrian infrastructure had a significant impact on urban walkability and accessibility (Koo et al., 2023).
However, while there is widespread interest in walkability, and the academic literature on walkability measurement and effect is robust, the fact remains that for many professional practitioners, members of the public, and researchers interested in studying walkability, employing current methods is beyond the scope of their available data and/or technical capabilities. The most well-known walkability measures, such as Walk Score® (https://www.walkscore.com/), a tool that estimates walkability based on the proximity of amenities to a given location, have been developed and provided by commercial entities. As Figure 1 shows, Walk Score® continues to enjoy widespread use among academics, despite concerns over the lack of purpose fit for many research questions (Brown et al., 2023; Frank et al., 2021). For proprietary reasons, the specific configuration of this metric is not made directly available to the public, although some information can be gleaned from the Walk Score® patent and methodology documents (Lerner et al., 2008; Walk Score, 2011). While some researchers have used the general principles of gravity-based accessibility and available insights from Walk Score® to develop walkability measures tailored to specific contexts (Zhou and Homma, 2022), there is a need for a direct assessment of Walk Score® data alongside various combinations of methodological parameters (e.g. distance decay functions and amenity weights). Such an analysis would clarify the exact features that Walk Score® (and similar metrics) captures and, perhaps more importantly, explore ways to make walkable accessibility methods more open-source, customisable for user-specific needs, and ultimately improved.
Figure 1. Web of Science publications (all fields) containing ‘WalkScore’ or ‘Walk Score’ through 2023.
The purpose of this paper is to outline a flexible, open-source methodology for computing a Walkable Accessibility Score (WAS) – similar to the commercially available Walk Score® – that allows researchers to empirically operationalise walkability concepts. We then apply this method to create a comprehensive measure of walkability at the block group scale for the continental United States for every year from 1997 to 2019 and examine interesting spatial patterns and trends in the data. Ultimately, the goal of this project is to provide historical data and reproducible code as an open resource for academics, planners, and members of the public to use to study relationships and better understand their own neighbourhoods. The code to operationalise this method and a link to download the full set of aggregated WAS values at the block group level for the continental US from 1997 to 2019 are available at the project’s GitHub repository.
Methodology
Spatial accessibility metrics can generally be divided into two broad types: the floating catchment area (FCA) method and its various extensions (Luo, 2004; Saxon and Snow, 2020; Wang and Luo, 2005), and the gravity-based ‘access score’ (Isard, 1960). In both, the data are structured as demand units (i.e. where the population lives, often administrative areal units) and supply sites (i.e. the facilities where services are provided, often points of interest of some type), and the goal is to aggregate the accessibility from each demand unit to nearby supply sites. This means that the resulting aggregate measure of accessibility is ultimately a feature of the demand units rather than the supply sites.
Generally, the FCA family of methods considers the ratio of supply to demand and is thus most useful in contexts where limited or differential supply is a concern, for example, physician availability. However, in the walkability application this makes less sense; proximity to many nearby retail businesses improves the walkability of every nearby demand unit no matter the size of the total demand in the area. For this reason, we conceptualise walkability as a function purely of the number of nearby supply sites using the ‘access score’ or ‘gravity potential’ method (Isard, 1960). As shown in equation (1), the access score (a) for a given demand unit i (Census block groups, in this case) is simply the count of nearby supply sites j (out of a total k) discounted by some distance decay function.
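For reference, the access score described above can be written compactly as follows. This is a sketch in assumed notation rather than a reproduction of equation (1): the symbols w_j (amenity weight), d_ij (distance from demand unit i to its j-th nearest supply site), and f (the distance decay function) are labels introduced here for illustration.

```latex
a_i = \sum_{j=1}^{k} w_j \, f(d_{ij})
```

With equal weights (w_j = 1 for every j), this reduces to a count of the k nearest supply sites, each discounted by its distance from the demand unit.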
Table 1. Amenity types, NAICS codes, and data sources used in the calculation of the walkable accessibility score (WAS).
While the importance of varying amenity weights has been highlighted both in the Walk Score® methodology and in the literature (Brown et al., 2023; Lerner et al., 2008; Walk Score, 2011; Zhou and Homma, 2022), the choice of weights is often arbitrary (based on the needs of a particular user or context), so for this paper’s primary comparison, we weighted all amenity types equally. We nevertheless added code to allow users to weight amenities according to their needs, with the default weighting being equal to 1. To preserve computational efficiency – and thus enhance the usability of the code for small-scale and non-enterprise users – we calculated distances between demand units and supply sites using Euclidean distance.
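The calculation described above (weighted amenities, Euclidean distance, and a sum over the nearest sites) might be sketched as below. This is an illustrative sketch and not the project's code: the function name `access_score`, the `(x, y, weight)` record layout, and the placeholder `step_decay` function are all assumptions made for the example, and coordinates are assumed to be in metres in a projected coordinate system.

```python
import math
from typing import Callable, Iterable, Tuple

# Hypothetical record layout: (x, y, weight), with x and y in metres in a
# projected coordinate system so that Euclidean distance is meaningful.
Supply = Tuple[float, float, float]


def access_score(
    demand_xy: Tuple[float, float],
    supply_sites: Iterable[Supply],
    k: int,
    decay_fn: Callable[[float], float],
) -> float:
    """Sum weight * decay_fn(distance) over the k nearest supply sites."""
    x0, y0 = demand_xy
    # Euclidean distance from the demand point to every supply site.
    dists = [(math.hypot(x - x0, y - y0), w) for x, y, w in supply_sites]
    # Keep only the k nearest sites (brute force; fine for a small example).
    nearest = sorted(dists, key=lambda dw: dw[0])[:k]
    return sum(w * decay_fn(d) for d, w in nearest)


def step_decay(d: float) -> float:
    """Placeholder decay for illustration only: full credit within 1 km."""
    return 1.0 if d <= 1000.0 else 0.0


# Four equally weighted sites at roughly 500 m, 900 m, 3000 m and 922 m.
sites = [
    (300.0, 400.0, 1.0),
    (0.0, 900.0, 1.0),
    (3000.0, 0.0, 1.0),
    (600.0, 700.0, 1.0),
]
score_k2 = access_score((0.0, 0.0), sites, k=2, decay_fn=step_decay)
score_k4 = access_score((0.0, 0.0), sites, k=4, decay_fn=step_decay)
print(score_k2, score_k4)
```

The brute-force sort keeps the sketch dependency-free; for hundreds of thousands of demand units a spatial index would normally replace it, but the quantity being computed is the same.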
With the general form of the distance decay function, amenities, weights, and distance metric chosen, two remaining features need to be specified, which form the basis for the paper’s analysis. The first is the choice of the k nearest amenities to sum over. In this paper, we compute the WAS for all multiples of five between 5 and 155 (31 values of k), which allows us to assess the impact of the choice of k on the resulting score. Second, the specific parameters for the logistic distance decay function must be determined; we test three values of the upper parameter (800, 1600, and 2400), giving 93 tests in total. The specific form of the logistic distance decay function is given in equation (2).
In particular, the decay parameter, which influences the slope of the function, is set to 0.008 based on the Walk Score® methodology document (Walk Score, 2011).
Figure 2. Graphs of the tested distance decay function specifications for three different values of upper: (A) 800, (B) 1600, and (C) 2400.
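To make the role of the decay and upper parameters concrete, the sketch below evaluates one logistic form that is consistent with the description in the text. The exact expression is an assumption (it is not transcribed from equation (2)): it is chosen so that the weight is close to 1 at zero distance, exactly 0.5 at d = upper, and close to 0 at twice upper. The function name `logistic_decay` is likewise hypothetical.

```python
import math


def logistic_decay(d: float, decay: float = 0.008, upper: float = 800.0) -> float:
    """Assumed logistic form: ~1 near d = 0, 0.5 at d = upper, ~0 near 2 * upper."""
    z = decay * (d - upper)
    # Guard against overflow in math.exp for very distant sites.
    if z > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


# Evaluate the three tested values of upper at d = 0, d = upper and d = 2 * upper.
for upper in (800.0, 1600.0, 2400.0):
    values = [round(logistic_decay(d, upper=upper), 3) for d in (0.0, upper, 2 * upper)]
    print(upper, values)
```

Under this assumed form, a larger upper shifts the midpoint of the curve outward, so amenities further away keep more of their weight, while decay controls how sharply the weight falls around that midpoint.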
Results
The goal of the analysis is to test 93 total combinations of the k and upper parameters, calculated according to equations (1) and (2), against the 2011 ‘Street Smart’ Walk Score®, to see which parameter combination best matches the Walk Score® data. The demand units for the test are the 2015 US block group centroids (for the 48 contiguous States), and the supply sites are the (equally weighted) amenities listed in Table 1 (see Appendix A.2 for a note about the comparison). As shown in Figure 3, we calculated the WAS for 93 combinations of k (every multiple of five between 5 and 155) and upper values (800, 1600, 2400) for the 48 contiguous states at the block group level, computing the Spearman rank correlation with the Street Smart Walk Score® data each time.
Figure 3. Comparison of 93 combinations of upper and k parameters for judging similarity to proprietary 2011 Street Smart Walk Score® values using a logistic (decay = 0.008) function with equal amenity weights. The graphs show the distribution of Spearman rank correlations (
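The parameter sweep described above can be sketched as a grid of (k, upper) pairs, each scored by a Spearman rank correlation against the benchmark. The code below is a minimal, dependency-free illustration; the helper names `average_ranks` and `spearman` are assumptions for this example, and the ranking uses the standard average-rank treatment of ties.

```python
from itertools import product
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for idx in order[i : j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


k_values = list(range(5, 156, 5))  # 5, 10, ..., 155
upper_values = [800, 1600, 2400]
grid = list(product(k_values, upper_values))
print(len(k_values), len(grid))  # 31 values of k, 93 combinations
```

In a full run, each pair in `grid` would be used to compute a vector of WAS values across block groups, which would then be passed to `spearman` together with the benchmark scores; the pair with the highest correlation is retained.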
In the end, summing over the nearest 30 amenities with an upper value of 800 (1600 m decay threshold) provided a correlation of 0.911566, which is surprisingly high, especially given the fact that we used a simple Euclidean distance measure. The performance of our Euclidean distance-based metric is especially noteworthy given the difficulty in computing street network distances to large numbers of amenities at scale. Thus, these results show that for some purposes, using computational resources to extract street network distances is unnecessary, since the correlation with Walk Score® that we can achieve using Euclidean distance is already very high. In essence, this means that a user can calculate the WAS using this approach for hundreds of thousands of demand units and a large k on their laptop with a relatively simple piece of open-source code.
With the parameters for computing a high-performing open-source walkable accessibility score (WAS) identified (upper = 800, k = 30), we then used those to compute the WAS at the block group scale for the contiguous US for all years (1997-2019) for which we have relevant data available on amenities (i.e. InfoUSA). The measure ranges from 0 (no specified amenities within roughly 1600 m of the block group centroid) to 30 (all 30 of the nearest amenities are located within around 100 m of the block group centroid); the observed maximum value across the time span is 29.641.
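The approximate bounds quoted above can be checked by hand under an assumed logistic form for the decay function. The expression for f(d) below is an assumption chosen to be consistent with the description in the text (a weight of 0.5 at d = upper and near zero at twice upper); it is not a transcription of equation (2).

```latex
\begin{aligned}
f(d) &= \frac{1}{1 + e^{\,0.008\,(d - 800)}} \\
f(100) &= \frac{1}{1 + e^{-5.6}} \approx 0.9963
  &&\Rightarrow\quad 30 \times f(100) \approx 29.89 \\
f(1600) &= \frac{1}{1 + e^{6.4}} \approx 0.0017
  &&\Rightarrow\quad 30 \times f(1600) \approx 0.05
\end{aligned}
```

Under this assumed form the score approaches, but never exactly reaches, 0 and 30, which is consistent with the observed maximum of 29.641 falling just short of 30.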
The full pre-computed dataset from 1997 to 2019 is available for download (as a CSV or shapefile) on the project’s GitHub. Initial exploratory analysis of the data indicates that high values of the WAS using these parameters are generally concentrated in the centres of the country’s densest and/or largest cities (with a few exceptions), as shown in Figure 4 and described in Appendix A.3.
Figure 4. Average Walkable Accessibility Score (WAS), 1997-2019, by block group.
Conclusions and future directions
Beyond the ability to provide the aggregated WAS dataset as an open resource for researchers and the public, these data also offer a variety of interesting avenues for future research that we are just beginning to explore. Perhaps most fundamentally, further analysis of any systematic deviations between WAS (calculated using the parameters specified in this paper) and Street Smart Walk Score® is needed to further improve the fit of our open-source measure (while maintaining the computational efficiency of Euclidean distance). While 0.912 is a very high correlation, it is possible that further parameter tuning, including specifying amenity weights and including features of street network design (such as intersection density and average block length), might provide an even better fit. Certainly, the use of Euclidean distance, while computationally efficient, does not fully capture real-world pedestrian travel patterns, such as barriers or street network complexity, and thus it is possible that our measure may indicate ‘high walkability’ in areas with high levels of business establishment density that we would not typically associate with walkable urban environments from a conceptual or experiential perspective (e.g. a suburban shopping centre or mall). Also, while our method allows for flexible weighting of amenities and distance decay functions, different populations may have distinct needs and preferences that are not explicitly accounted for in the default specification. In our comparison, the uniform weighting of amenities may not reflect the varying importance of different destinations in different contexts.
Interestingly, our preliminary investigations that include the street network design variables intersection density and average block length – which are often identified as components of the Walk Score® methodology (Brown et al., 2023; Frank et al., 2021; Lerner et al., 2008; Walk Score, 2011) – do not appear to improve the correlation between WAS and Street Smart Walk Score®. Additionally, the inclusion of these street network design variables adds considerable complexity and computation time to the calculation of the WAS for the entire US (due to the need to query street networks). However, more work is needed to confirm whether these – or other – features might improve the performance of the open-source WAS in comparison to Street Smart Walk Score®.
Beyond that, there is also a need to conduct external validation of the WAS with more dynamic measures of walking behaviour, for example, footfall data. It is very likely that the true benchmark for the WAS should be some objective measure of walking behaviour (and/or satisfaction), rather than Walk Score®. As many have argued, despite its popularity, the proprietary Walk Score® may not actually capture the conceptual essence of walkability – or real walking behaviour – very well (Brown et al., 2023; Frank et al., 2021). This is a significant problem when we are not able to fully access the methodology used for calculating a given accessibility measure, as we have no way to modify or tweak it. However, given the open nature of the WAS methodology presented here – and the flexibility inherent in the code – any changes necessary to better match footfall data or other measures of walking behaviour can easily be incorporated and shared.
Finally, these data – as both dependent and independent variables – offer a number of interesting opportunities to examine important urban relationships, including the relationship between socio-economic factors and walkability, the causal impact of infrastructure investments on walkability, and spatio-temporal changes in walkability. In particular, an examination of the relationships between the WAS and specific urban retail typologies, gentrification, and density – and how these relationships evolve over time – would be a valuable direction for future work.
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded, in part, under the National Institutes of Health (NIH) project number 5R01AR078342-03.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
Data are available at https://github.com/kcredit/Walkable-Accessibility-Score as part of the Walkable Accessibility Score project (Credit and Farah, n.d.).
