Abstract
Location and accessibility are core concepts for land-value research. However, their conceptual and methodological application to cities from the Global South is still limited. The objective of this research is to bridge concepts and definitions to comprehensively operationalize accessibility indicators and uncover their relation with residential land-values in Guatemala City. We developed a multivariate regression model using the following access metrics: (1) geographic-access indices that were computed using time-based analyses per transport mode; (2) geometric-access metrics estimated via Space Syntax at various spatial scales; (3) a proposed geometric via geographic-access metric computed as potential access to network centrality. A variable selection process allowed us to assess the information contribution of each variable in building a parsimonious model. We assessed the model in the context of model variations that represent common approaches used in existing literature. Geographic access to the core business district has the highest impact on the land-values, followed by proximity to urban areas with high geometric-access, measured as geometric via geographic access. Geometric accessibility at neighbourhood and city-wide scales adds spatialized information that contributes to a parsimonious model and reduces spatial dependence. The model yielded the highest goodness of fit and prediction accuracy compared with the model variations. We concluded that Guatemala City land-values follow a predominantly monocentric structure. Additionally, potential access to vital urban areas as identified via Space Syntax denotes the presence of economic activities, or potential for such, which were not explicitly addressed through the geographic-access metrics. The results have limitations but pose methodological possibilities relevant for research and practice in similar Latin American cities.
Introduction
Following Krause and Bitter (2012), trends in land and property value research can be grouped into three. First, the relation between the various factors that can influence value formation is commonly uncovered using hedonic pricing, implemented through multivariate linear regression techniques (Liu et al., 2010; Munroe, 2007). However, there is a growing body of literature concerned with the presence of spatial dependence and with the statistical and methodological extensions that address it, such as spatial econometrics (Bourassa et al., 2007; Lesage and Pace, 2009). Second, the literature is paying attention to the value of land itself, after recognizing differences between this value and the value of the improvements (i.e. construction). Land is fixed both in territorial location and in its supply (Evans, 1987), which could make it more volatile and sensitive to the effects of demand, location and economic shocks (Krause and Bitter, 2012). Third, there is growing interest in the value effects derived from sustainable urban processes such as mixed-uses, mass transport and street connectivity (Anantsuksomsri and Tontisirin, 2015; Matthews and Turnbull, 2007). It can be said that this has always been central in this research area, but only implicitly by means of the urban accessibility concept (Adair et al., 2000; Ahlfeldt, 2007; Des Rosiers et al., 2000; Giuliano et al., 2010; Liu et al., 2010).
Location and accessibility are core concepts in land-value and house-price research (Orford, 2002; Webster, 2010). Important urban economic theories (Alonso, 1964; Evans, 1987; Muth, 1969) and a large body of literature show that relative location and access externalities are associated with the economic value of various land-uses (Dale-Johnson and Brzeski, 2001; Evans, 1987; Kivell, 1993; Liu et al., 2010; Peiser, 1987). Basic accessibility concepts and methods have commonly been at the core of previous research on land-value modelling, particularly in urban areas (Iacono and Levinson, 2011; Saeid, 2011). However, cities are not simple phenomena and there is still room for exploring the application of these concepts, particularly in cities from the Global South.
We address location using two types of accessibility. Geographic-accessibility reflects the ease of reaching a location, or of being reached, given the available infrastructure to overcome impedance between origin and destination (Batty, 2009). Distance, travel-time, cumulative opportunities and gravity-type measurements are metrics commonly used in research and practice (Geurs and Van Eck, 2001). In turn, geometric-accessibility is defined as a resource given the topologic-geometric characteristics of a city's urban layout (Jiang et al., 1999; Webster, 2010). Network-closeness and network-betweenness are common network centrality metrics that are associated with geometric-accessibility (Freeman, 1977; Porta et al., 2005). Space Syntax (SSx) is a set of theories and methods based on graph theory and dual representations of urban layouts that pioneered the application of network centrality to urban areas (Bafna, 2003; Hillier et al., 2012; Webster, 2010). Integration and choice are the SSx metrics adopted in this research; they are equivalent to closeness and betweenness respectively.
The neoclassical trade-off theory (Alonso, 1964; Evans, 1987) constitutes a fundamental basis for land-value and house-price literature (Ahlfeldt and Wendland, 2011; Kivell, 1993). Accessibility is then mainly addressed as proximity to a core business district (CBD) and other location externalities. Proximity measurements are commonly included as aerial or network-based distances (Bourassa et al., 2007; Dale-Johnson and Brzeski, 2001; Heikkila et al., 1989; Liu et al., 2010; Munroe, 2007; Orford, 2002; Waddell et al., 1993). Metric distances are not realistic measurements, as road characteristics and transport modes imply different travel times or costs over the same distance (Mavoa et al., 2012; Ryan, 1999). Some literature used travel-times instead (Ahlfeldt and Wendland, 2011; Iacono and Levinson, 2015; Ottensmann et al., 2008; Pujol et al., 2013). However, the focus has mostly been placed on travel times by private mobility. Relatively little research has tested more complex methods, such as potential measurements, to replace the traditional CBD proximity (Adair et al., 2000; Ahlfeldt, 2007; Du and Mulley, 2012; Giuliano et al., 2010; Osland and Thorsen, 2013). The rationale is that a monocentric CBD assumption might not adequately capture access to economic opportunities due to the emergence of polycentric structures.
The availability of public transport is mostly addressed as proximity to the access points (i.e. stops and stations) of such infrastructure (Ryan, 1999) (e.g. Anantsuksomsri and Tontisirin, 2015; Iacono and Levinson, 2011; Ibeas et al., 2012; Rodríguez and Mojica, 2009; Waddell et al., 1993). Such an approach is commonly used to isolate the effects of public transport investments on house values, again assuming mono-centricity (Ryan, 1999). Very little research has simultaneously used travel-time measurements by more than one transport mode (Adair et al., 2000; Des Rosiers et al., 2000; Du and Mulley, 2012). Adair et al. (2000) used travel times per transport mode and estimated mean access values for each neighbourhood in their study area.
Only a small body of literature has addressed geometric accessibility combined with basic geographic access metrics. Desyllas (1997) investigated the relationships between the evolution of land-use and geometric-access with land-values in Berlin. Matthews and Turnbull (2007) investigated the effects of integration and proximities to selected land-uses on house-prices in Washington. Enström and Netzell (2008) investigated the effects of integration on office-rent variation in downtown Stockholm. Chiaradia et al. (2009) examined the associations between geometric access and dwelling values in London. Saeid (2011) analysed the relation between integration and land-values in Wroclaw. Xiao et al. (2016) studied the spatial scales at which geometric-accessibility, together with selected geographic-access metrics, best explains house-price variability in Cardiff.
Overall, the outcomes of these works provide evidence on how geometric access can explain land-values more accurately. SSx metrics at various spatial scales provide additional information about the quality of the urban layout that other access metrics cannot capture. Furthermore, these metrics could have the potential to improve the quality of the model itself in terms of reducing heteroskedasticity and spatial autocorrelation (Xiao et al., 2016). SSx methodologies have evolved into more sophisticated analyses compared to the early applications (Steadman, 2004; Turner, 2007), but only limited research has implemented these newer techniques in land-value investigations (Chiaradia et al., 2009; Xiao et al., 2016). Additionally, these works are mostly limited to case studies from developed and planned cities.
We define the research gap as the need to bridge available concepts and methods in the field of accessibility studies with the land-value modelling task. We introduce the following hypotheses. (1) Addressing the disparity of geographic-access opportunities due to available transport modes, together with geometric-access at various spatial scales, could contribute to an increased capacity to explain land-value variability and to improved prediction accuracy. (2) Geometric-accessibility capitalizes land not only at location, but also as a reachable resource by means of geographic-access. Thus, geometric via geographic accessibility is defined as the ease of reaching geometric access by means of private or public transport-based mobility. Such metrics could reflect access to urban areas with presence of facilities or economic activities (or a potential for such), which are not explicitly addressed in other geographic-access metrics.
The objective of this research is to bridge concepts and definitions to comprehensively operationalize accessibility indicators and uncover their relation with residential land-values in Guatemala City. We developed one multivariate regression model that used the following access metrics: (1) geographic access indexes that were computed using time-based analyses per transport mode; (2) geometric access metrics estimated via SSx at various spatial scales; (3) a proposed geometric via geographic access metric computed as a potential access to network centrality as analysed in SSx. A parsimonious model is estimated following a variable selection procedure and then assessed using various performance statistics.
The remainder of this paper is organized as follows. Materials and methods introduce the study area, data pre-processing and methods. Then, the results are presented and discussed. Here, we interpret the impacts of the various access metrics on the value of residential land, as well as the performance of the model. Finally, we address the conclusions where we reflect on the limitations and implications of the results in further research and practice.
Materials and methods
Case study
Our case study city is Guatemala City, Guatemala. Similar to other Latin American cities with colonial origins, Guatemala’s historic centre has a gridiron urban structure (Gellert, 1990; Pacione, 2005: 447–602). The first expansions, from the end of the 1800s up to the mid-1900s, were carried out by local planners. Later, a combination of socio-political conditions, natural events and a massive migration from rural areas resulted in unplanned expansion towards the periphery. A white-collar CBD emerged towards the south of, and linked to, the historic core following important infrastructure such as the inter-American road (CA-1) and the international airport (Figure 1). Jobs and important economic activities are highly centralized there, whilst population density tends to show a decentralized pattern (DPU, 2009). The case study is relevant for the region as various Latin American cities display similar city structure and dynamics (Ford, 1996; Ingram and Carroll, 1981).
Figure 1. Appraisals location (top), and land-value frequency distributions before and after transformation (bottom).
Data pre-processing, variables and descriptive statistics
Land-value data were collected during fieldwork (August 2014–April 2015). We built a spatial database indexing 2000 records of real-estate property appraisals (observations) dating from the years 2008–2014, and carried out by a Guatemalan private office (AO). Observations were geo-referenced to the centroid of each property using the WGS84 co-ordinate system (Decker, 1986). According to the AO, the observations report “arm-length values” reflecting optimum transaction values where there is no pressure to sell or buy, and parties have complete information. Land-values are in local currency Quetzal (Q) per square meter of plot surface area. These were deflated to the year 2014 using the Guatemalan consumer price indexes (Ine, 2016), and transformed to natural logarithms to deal with a non-normal distribution and potential non-linear relations with the predictors. We only used observations of residential uses (excluding flats) and plots with surface areas between 100 and 1000 m2. Figure 1 shows the location of the 1026 observations used in this research.
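The deflation and log-transformation step can be sketched as follows, assuming a simple ratio adjustment between the consumer price index of the appraisal year and that of the base year. The index values and the function name are hypothetical placeholders for illustration, not the official series used in this study.

```python
import math

# Hypothetical CPI values by appraisal year (placeholders, not official figures).
CPI = {2008: 80.0, 2011: 90.0, 2014: 100.0}

def deflate_and_log(value_q_per_m2, year, base_year=2014, cpi=CPI):
    """Express a nominal land-value (Q/m2) in base-year terms and take its natural log."""
    real_value = value_q_per_m2 * cpi[base_year] / cpi[year]
    return real_value, math.log(real_value)

# A Q1600/m2 appraisal from 2008 expressed in 2014 terms, then log-transformed.
real_2008, nl_lv_2008 = deflate_and_log(1600.0, 2008)
# A 2014 appraisal is left unchanged by the deflation step.
real_2014, nl_lv_2014 = deflate_and_log(1600.0, 2014)
```

Under these placeholder indices, the 2008 appraisal is scaled up before the logarithm is taken, while a 2014 appraisal keeps its nominal value.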
Accessibility metrics were mostly produced in previous work and are aggregated in a hexagonal tessellation (Morales et al., 2016). The size of the hexagons (300 m inner-diameter) provided an adequate resolution at an affordable computational demand. Geographic access metrics were available per transport mode. Additional variables, described below, were also mapped and aggregated to this tessellation (except property-level characteristics). Where one hexagon contained more than one observation, the same predictor values were attached to all of those observations.
Table 1. List of variables and descriptive statistics.
Grey highlighted cells indicate measurements that were transformed into natural logarithms (nl).
The CBD, jobs location, large-scale malls, large-scale groceries, universities, culture, hospitals and large-scale sport facilities are destinations for which people are more willing to overcome impedance, and influence location quality at a city scale. Access was measured as shortest travel-times, except for jobs access (Morales et al., 2016). This was mapped as potential access using the Hansen (1959) formulation. Decay parameters (
Geometric access includes the SSx metrics integration and choice at various spatial scales (radii). High geometric access at lower radii (e.g. r_0.8 Km) benefits areas associated with pedestrian flows and important streets at a neighbourhood level. High access at higher or global (r_N) radii benefits areas associated with important economic activity and reveals a road hierarchy at a city-wide scale. Proximity to mobility infrastructure includes distance to main roads and whether or not a location is within a threshold distance of access points to a bus line.
Following Bourassa et al. (2007), we used submarket variables to address auto-correlated errors. Auto-correlated errors indicate the presence of a spatial structure due to nearby observations sharing similar locational and unmeasured externalities (Basu and Thibodeau, 1998; Lesage and Pace, 2009). Submarket classifications were mapped based on location (east or west periphery) and using expert knowledge from the AO. Submarket predictors classify the urban areas in reference to the selling-price segmentation of available new residential projects. They thus indicate the absence of projects in an area (0) and incremental selling-price segments coded from 1 to 3.
Neighbourhood characteristics include the following information: population density projected to 2015 (World_Pop, 2014); classification of socio-economic groups per census tract (Urbanistica, 2009); percentage of private mobility users per traffic analysis zone (TAZ) (percent_priv); and density of new residential projects. The “percent_priv” was estimated as the proportion of trips generated by private vehicles out of the total trips generated per TAZ. Densities of new residential projects (dens_flats, dens_condos) were assumed to capture popularity and demand for residential location. Geo-referenced locations of projects dating from 2013 and 2014, provided by the AO, were processed using a point-density analysis at a 2 km radius (average neighbourhood size).
The year variable aimed to capture any temporal trend that was independent of monetary inflation. The plot surface area aimed to capture the relation where appraised value per m2 decreases with increasing surface area (Lin and Evans, 2000). Construction area was a proxy variable included to isolate the land-value increase due to residence structural characteristics. We included dummy variables indicating whether the plot had a regular geometry or not, and the property intra-block location. These are locally used as appraisal “adjustment factors” (Dicabi, 2005). The POT variable, ranging from 1 to 5, attempted to capture the relation between the land-value and the potential for future development into non-residential uses and/or larger constructions following the current Territorial Ordinance Plan (DPU, 2009). Although the plan only applies to properties within Guatemala municipal boundaries, we classified the remaining plots using the same guiding principles.
Observation co-ordinates were used as predictors, also following Bourassa et al. (2007). The left side of Figure 2 shows initial scatterplots of the “x” and “y” co-ordinates in the upper and lower panels respectively. Standardized co-ordinate values on the horizontal axis are plotted against land-values on the vertical axis. When a quadratic function is fitted, the concave trends on both co-ordinates (x and y) reflect that the highest land-values are in the city core (matching the CBD location) and that these decrease as the co-ordinates move towards east or west (x co-ordinate), and south or north (y co-ordinate). Such trends were expected to be captured by the accessibility variables. Yet, we kept the co-ordinates as predictors to either capture any remaining trends or to reduce auto-correlated errors. Using the quadratic functions, we estimated new values to use as predictors (f_x and f_y) to remove the non-linearity. The corresponding scatterplots are on the right side of the figure.
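The co-ordinate transformation can be sketched as a least-squares quadratic fit of nl_land-value on a standardized co-ordinate, whose fitted values then serve as the predictor (f_x or f_y). The data and helper names below are illustrative assumptions; the text does not specify the fitting routine that was used.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x^2 via the normal equations."""
    # Power sums S[k] = sum(x^k) and moments T[k] = sum(y * x^k).
    S = [sum(x ** k for x in xs) for k in range(5)]
    T = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system of normal equations.
    A = [[S[i + j] for j in range(3)] + [T[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(3):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return tuple(A[i][3] / A[i][i] for i in range(3))

def f_coord(x, coeffs):
    """Evaluate the fitted quadratic; the result is the new predictor value."""
    a, b, c = coeffs
    return a + b * x + c * x * x

# Illustrative data: land-values peak at the centre (x = 0) and fall off on both sides.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [5.0, 6.5, 7.0, 6.5, 5.0]
coeffs = fit_quadratic(xs, ys)
f_x = [f_coord(x, coeffs) for x in xs]
```

A negative quadratic coefficient corresponds to the concave trend described above, and the fitted values f_x relate linearly to the land-values by construction.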
Figure 2. Scatterplots of x and y co-ordinates against the nl_land-values (left side); and scatterplots of estimated f_x and f_y plotted against nl_land-values (right side).
Methods
We used multivariate regression as a commonly accepted method in this research area, particularly in hedonic pricing studies. The regression model included the following access metrics: (1) geographic access indexes were computed based on travel-times per transport mode to various land uses; (2) geometric access incorporated SSx network metrics at various spatial scales; (3) we proposed a geometric via geographic-access metric formulated as the potential access to network integration (as measured in SSx). Additionally, other relevant variables were also included, such as submarket information, neighbourhood and property-level information. Following an initial correlation exploration, the model was first fully specified. Then, a variable selection process was implemented to assess the relevance of the information added by each variable and deduce the most parsimonious model. Finally, the model was assessed using performance statistics contrasted with alternative model versions representing common approaches in land-value modelling literature.
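The potential-access formulation (Hansen, 1959), used here for jobs access and for the proposed geometric via geographic metric, can be sketched as a sum of opportunities discounted by travel time. The negative-exponential impedance function, the decay value, the min-max rescaling and all numbers below are assumptions made for illustration only; they are not the calibrated specification of this study.

```python
import math

def potential_access(travel_times, opportunities, beta):
    """Hansen-type potential access for one origin.

    travel_times[j]  : travel time from the origin to destination j (minutes)
    opportunities[j] : size of the opportunity at j (e.g. jobs, or SSx integration)
    beta             : decay parameter of the ASSUMED negative-exponential impedance
    """
    return sum(o * math.exp(-beta * t) for t, o in zip(travel_times, opportunities))

def min_max(values):
    """Rescale scores to the range 0 (poor access) to 1 (high access)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Three origin hexagons and three destinations (hypothetical travel times, in minutes).
times = [[5, 20, 40], [20, 5, 20], [40, 20, 5]]
jobs = [1000, 200, 50]  # hypothetical opportunities at each destination

raw = [potential_access(row, jobs, beta=0.1) for row in times]
scores = min_max(raw)
```

Replacing the job counts with integration values at each destination gives the geometric via geographic reading of the same formula: access to network centrality as a reachable resource.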
Geographic access indexes per facility were estimated using equation (1). Computations were done at the tessellation level for the entire city area. Access measurements per transport mode were all standardized to ranges from 0 (poor access) to 1 (high access). Then,
Geometric via geographic access was estimated as “
Figure 3. Geometric via geographic-accessibility from low (red) to high (green).

The regression model assumes that land-values can be decomposed into numerical contributions from the locational predictor attributes (Liu et al., 2010). A generalized specification follows equation (3). Where the land-value (nl_lv) of a property is estimated as a function of a constant intercept
Our model takes the form of equation (4), where the land-value predictors are grouped into geographic access, geometric access, geometric via geographic access, infrastructure proximity, submarket, neighbourhood and plot-level characteristics. The model was fitted via ordinary least squares in R software (Team, 2016) using 879 randomly selected observations (75%) to train the model, and the remaining observations for cross validation.
It was expected that not every predictor would add meaningful information to the model and that access variables would introduce some multicollinearity, which makes it almost impossible to discriminate predictors based on their coefficients and significance. Therefore, we applied an automated bi-directional stepwise regression using the MASS package (Ripley et al., 2002), where variables are retained or discarded based on their contribution to the Akaike Information Criterion (AIC) statistic (Bozdogan, 1987). For the final reduced model we used the Breusch–Pagan test to detect heteroskedasticity and White’s correction to adjust the coefficient standard errors accordingly, and the Variance Inflation Factor (VIF) to detect multicollinearity (Fox and Weisberg, 2011).
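For reference, the two diagnostics can be written in their textbook forms, where k is the number of estimated parameters, \hat{L} is the maximized likelihood of the model, and R_j^2 is the coefficient of determination obtained when predictor j is regressed on all remaining predictors. These are the standard definitions rather than a statement of the software's internals; a reported AIC may differ from this form by an additive constant, which does not affect model comparison.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```

Under this definition a VIF of 50 corresponds to R_j^2 = 0.98, that is, 98% of the variance of predictor j is explained by the other predictors.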
The estimated coefficients represent the impact of each variable on the land-value. However, they are not comparable with each other as they are scale-dependent. First, not all the predictors are expressed in the same units (e.g. logarithmic scales versus standardized access-scores). Second, the predictor values come from a systematic mapping for the whole city area, so the value ranges differ even for predictors expressed in the same units (Table 1). Therefore, standardized beta-coefficients (Fletcher, 2015) are also reported.
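A standardized beta-coefficient can be obtained by rescaling the unstandardized coefficient with the sample standard deviations of the predictor and of the response. The sketch below uses this common conversion; the data and the function name are illustrative assumptions.

```python
import statistics

def standardized_beta(b, x_values, y_values):
    """Rescale an unstandardized OLS coefficient b to standard-deviation units."""
    return b * statistics.stdev(x_values) / statistics.stdev(y_values)

# Illustrative data: an access score on a 0-1 scale and nl_land-values
# that follow y = 5.75 + 2.5 * x exactly.
x = [0.1, 0.3, 0.5, 0.7, 0.9]
y = [6.0, 6.5, 7.0, 7.5, 8.0]
b = 2.5  # unstandardized slope of the illustrative relation

beta = standardized_beta(b, x, y)
```

Because the illustrative relation is exactly linear with a single predictor, the standardized coefficient equals 1: one standard deviation of change in the predictor corresponds to one standard deviation of change in the response.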
We used the following performance statistics to assess the models: adjusted R2, AIC, semivariogram modelling to assess the presence of auto-correlated errors (Gräler et al., 2016), the root mean squared error of prediction (RMSEP) on the test-data, and the normalized RMSEP to compare the proportional difference of error variance among models. The normalized RMSEP divides the RMSEP by the difference between the maximum and minimum observed values of the test-data (2.82).
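The two prediction-error statistics can be sketched as follows, with the normalization taken as the range of the observed test values, as described above. The observed and predicted values are illustrative only.

```python
import math

def rmsep(observed, predicted):
    """Root mean squared error of prediction over held-out observations."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))

def normalized_rmsep(observed, predicted):
    """RMSEP divided by the range (max - min) of the observed test values."""
    return rmsep(observed, predicted) / (max(observed) - min(observed))

# Illustrative held-out nl_land-values and model predictions.
obs = [6.0, 7.0, 8.0, 9.0]
pred = [6.5, 6.5, 8.5, 8.5]

error = rmsep(obs, pred)
error_share = normalized_rmsep(obs, pred)
```

The normalized value expresses the typical prediction error as a share of the observed range, which makes models fitted to the same test-data directly comparable.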
Semivariogram models assume that the similarity between any pair of residuals decreases with the lag (distance) between them. Dissimilarity is estimated as gamma values. Three parameters can be read from the semivariogram: (1) the nugget is the value of gamma at lag = 0 and is associated with statistical noise; (2) the partial sill is the difference between the nugget and the value of gamma when auto-correlation becomes negligible; (3) the range indicates the distance between observations at which auto-correlation becomes negligible and the partial sill is reached. The variograms fitted here are exponential and assume isotropy, meaning that the spatial dependence at a given lag is the same in every direction. Because an exponential model approaches its sill asymptotically, the effective range is about three times the fitted range parameter.
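The exponential semivariogram model can be sketched as below, using the usual parameterization in which gamma rises from the nugget towards the sill (nugget plus partial sill). The parameter values are hypothetical and serve only to show why the effective range of an exponential model is commonly taken as three times its range parameter.

```python
import math

def exp_semivariogram(h, nugget, partial_sill, range_param):
    """Exponential semivariogram model evaluated at lag h > 0."""
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))

# Hypothetical parameters: gamma in squared residual units, distances in metres.
nugget, partial_sill, range_param = 0.05, 0.10, 500.0

gamma_at_range = exp_semivariogram(range_param, nugget, partial_sill, range_param)
gamma_at_3_ranges = exp_semivariogram(3 * range_param, nugget, partial_sill, range_param)

# Share of the partial sill reached at one and at three times the range parameter.
share_at_range = (gamma_at_range - nugget) / partial_sill
share_at_3_ranges = (gamma_at_3_ranges - nugget) / partial_sill
```

At one range parameter the model has reached roughly 63% of the partial sill, and at three times that distance roughly 95%, which is the conventional threshold for treating the remaining auto-correlation as negligible.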
The model was assessed in comparison to six alternative models, which were formulated as variations of equation (4) and estimated using the same approach. First and second are model variations where corresponding access predictors only address one of the transport modes. The third model is as the original, but excludes all of the geometric-access predictors. The fourth considers exclusively a mono-centric assumption and excludes potential jobs access. In turn, the fifth model replaces CBD access with potential job access, relaxing the mono-centric assumption. The sixth model is a variation that excludes all of the submarket predictors and the observations co-ordinates.
Results and discussion
Accessibility and land-values
Figure 4 shows pairwise correlations between the access metrics relying on transport modes (geographic access and geometric via geographic access) with the land-values. Correlations of the private and public transport-based access metrics were included for reference purposes. A label indicates in which definition each metric (travel time, cumulative opportunity or potential access) was originally measured.
Figure 4. Pearson’s correlations between accessibility metrics and nl_land-value.
The strengths of the correlations for each variable are similar for both transport modes except for the first five land uses. Private mobility strongly increases access opportunities to concentrations of such land uses. This is reflected in stronger pairwise correlations with land-values than those obtained for access by public mobility. This comparison seems applicable to all the variables considered. However, cumulative-opportunity metrics, which aim to analyse location quality at a neighbourhood scale, made the mobility disparity more evident.
Correlation strengths of the combined indices fall between the correlation strengths by transport mode. This was expected as the indices incorporate and reflect the difference in the strength of the pairwise correlations by transport mode, previously discussed. However, the combined indices are sensitive to spatial variations of the access benefits derived from simultaneous availability of transport modes as well as to the residents’ predominant preference for those. This was expected to contribute additional information to the model, compared to only addressing one transport mode.
Importance of a monocentric land-value structure is preliminarily detected as the strongest correlation is observed with CBD access (∼0.6). Access to jobs, geometric via geographic access (integration_gravity), universities, cultural institutions and large-scale malls reach correlations up to ∼0.5. Then, access to banks and restaurants, clinics, hospitals and large-scale sports infrastructure reach correlations up to 0.4.
Figure 5 shows correlations of the geometric access metrics computed via SSx, all of which are positive. The correlations increase with the radius of analysis (from 0.8 km to global), from 0.09 to 0.16 for the choice values and from 0.07 to 0.4 for the integration values. This means that, at first glance, geometric access is a locational quality that tends to be more important when indicating relative location within the city as a whole and less so at the neighbourhood scale. Also, compact grids with small blocks are less favoured than longer blocks.
Figure 5. Pearson’s correlations between geometric accessibility and nl_land-value.
Regression coefficients and normalized coefficients. Grey colour bars on the normalized coefficients indicate relative importance.
VIF: Variance Inflation Factor.
Significance: ***0.001, **0.01, *0.05.
From the cumulative access metrics, only access to banks and restaurants and access to schools were retained. They have positive impacts of 0.80% and 0.42% respectively per 1% increase of access to those facilities. A 1% increment in CBD access has a positive impact of 2.53% on the value of land, meaning that land-values at the CBD are ∼253% more expensive than those at the periphery. Job access has an unexpected negative sign. Values decline 1.26% per 1% increase in the access index. We tested the sign reversal by manually removing, one by one, the following predictors that showed multicollinearity and correlation with job access: integration_gravity, global integration, CBD and culture access. Nevertheless, we found the sign remained negative. We hypothesise that in our case study, when job access is added to the model simultaneously with other predictors, the index might actually be capturing the negative effects of urban centrality (e.g. pollution and congestion), particularly when considering that this metric uses the number of trips attracted to an area as a proxy for the availability of jobs.
Land-value increases by 1.16% with a 1% increase in access to large-scale malls. In turn, a 1% increase in access to large-scale grocery shops means a decline of 1.17%. The comparison between these two coefficients could be revealing that the attractiveness of agglomerated commerce and recreational activities is what outweighs the potential negative impacts (noise and congestion) that the vicinity of either facility poses. Furthermore, a 1% increase in the access score of the following externalities implies an increase of land-values: universities (1.21%) and hospitals (1.21%). Unexpectedly, access to cultural facilities has a negative effect of a 1.87% decrease in land-values. In Guatemala City, cultural facilities are mostly located in the core and historic areas of the city. Negative externalities, such as pollution, congestion, street robbery, and informal commerce are common in these areas, which could explain the positive association with the land-values in the pairwise correlation, but a negative sign when placed jointly with other predictors.
As in Chiaradia et al. (2009), SSx integration has contrasting effects according to the analysed spatial scale. An increase of 1% in integration at lower radii (0.8 km and 2.5 km) implies a decline of land-values of 0.09% and 0.12% respectively. In Guatemala City most of the areas highlighted at low radii integration correspond to deteriorated single family neighbourhoods. These are areas with compact gridiron street patterns built during the first planned expansions. Middle-low and low-income groups predominantly reside in these areas, which are vulnerable to pollution and street robbery. In turn, the effect of integration is positive at a city-wide scale (global). Land-values increase by 0.28% per 1% increase of global integration. SSx choice at 1.5 km has a positive impact of 0.38%. Neither distance to main roads nor normalized choice at higher radii were selected. From the results we interpret that residential land-values benefit only when located near streets that are important at a neighbourhood scale, even though such streets are commonly part of, or connected to, city-structuring roads outlined by choice analyses at higher radii.
Geometric via geographic access has a positive influence of 0.62% per 1% increase in the access index, implying that quality locations benefit from geometric access not only at location, but also as a reachable resource. This metric might be capturing access to vital urban areas, or with potential for doing so, containing economic activities not addressed explicitly by the other predictors.
The strongest multicollinearity (VIF ∼50) was observed with jobs and geometric via geographic access, meaning, as expected, there is a strong correlation between them and implying that their interpretation should not be considered without caution. However, we are able to claim that each predictor still adds complementary information to the model following their impact on the AIC statistic. This is further discussed in the model assessment.
Observations within a 300 m distance of the bus lines network showed a decline of 0.11%. This was expected as previous access predictors address the true benefit of public transport, while the mere physical proximity to the network itself might be associated with negative externalities such as noise and pollution.
We observe a difference between the sub-market variables indicating east and west peripheral municipalities. Assuming that the rest of the variables are held constant, residential land on the eastern periphery is more expensive by 0.25% compared to the rest of the observations, whereas land is 0.10% cheaper in the western peripheral areas. For the average plot, this means a difference of +Q445/m2 ($60) and –Q115/m2 ($15) respectively. Density of new condominiums predominates in both peripheral areas, particularly in the west. However, selling-price segmentation clearly points to a more expensive market on the eastern periphery. The selling-price segmentation variables of residential projects (horizontal and vertical) both have a positive impact on the land-value, of 0.07% and 0.09% respectively.
Neighbourhood characteristics all have a positive contribution. An increase of 1% in population density is associated with land-values that are 0.12% higher. The predominance of different socio-economic groups, incremental from 1 (low incomes) to 5 (high incomes), is associated with a land-value increase of 0.43%. This is equivalent to a difference of Q345/m2 (∼$45) between an average plot located in an area classified as 1 and one located in an area classified as 5. An increase of 1% in the proportion of private vehicle users in an area is associated with a land-value increase of 0.33%. Furthermore, an increase of 1% in the density of new vertical and horizontal residential projects has a positive influence of 0.01%. This is equivalent to a difference of ∼5–7 new residential projects within a radius of 2 km.
The coefficient of the year predictor reveals that, in these data, land-values increase by 0.09% every year. This increase is additional to monetary inflation. The variable could be reflecting the influence of dynamics not captured by any other predictor, for example urban transformations over time (e.g. city expansion, redevelopment) and overall macroeconomic growth. The average annual growth of the gross domestic product (GDP) per capita was 0.93% between 2008 and 2014 (Bank, 2017). Furthermore, nl_land-value per m2 is appraised 0.17% lower per 1% increase in the plot surface area. In turn, a 1% increase in built-up areas has a positive impact of 0.14%. Lastly, the co-ordinate function f_x reflects an increase of 0.29% per one-unit increase, meaning that there is a trend of increasing value along the east-west direction, additional to the access benefits and any other location characteristic addressed by the included predictors.
Finally, the normalized coefficients can be interpreted as changes in standard deviations of the land-values per one standard deviation change in the predictor variable. These coefficients are comparable. Access to CBD was the most important variable that positively influenced the residential land-values in Guatemala City, thus confirming a mono-centric land-value structure. It is followed by the year predictor, outlining the relevance of land-value gains over the years. Interestingly, geometric via geographic accessibility becomes the third most important predictor that capitalizes residential land, then followed by access to universities, malls and hospitals.
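The relation between raw and normalized coefficients can be sketched as follows. This is a generic illustration on synthetic data, assuming the usual definition in which a raw slope is multiplied by the ratio of the predictor's standard deviation to that of the response; the data and function names are hypothetical and the section does not confirm the exact procedure used.

```python
import numpy as np

def ols_slopes(X, y):
    """OLS slopes (an intercept is fitted but not returned)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[1:]

def standardized_coefficients(X, y):
    """Rescale raw slopes to standard-deviation units."""
    return ols_slopes(X, y) * X.std(axis=0, ddof=1) / y.std(ddof=1)

# Synthetic predictors on very different scales (hypothetical values).
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 2)) * np.array([1.0, 25.0])
y = 0.8 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(scale=0.5, size=300)

beta_std = standardized_coefficients(X, y)

# The same numbers result from regressing z-scored y on z-scored predictors.
Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
yz = (y - y.mean()) / y.std(ddof=1)
beta_from_z = ols_slopes(Xz, yz)
```

Because the rescaling removes the units of each predictor, the resulting magnitudes can be ranked against each other, which is what allows the ordering of predictors given above.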
Model performance and diagnostics
Model performance in contrast with alternative models.
The original model was found to have the highest performance of all the models. It has the highest goodness of fit on the training data, explaining up to 72.8% of the variability in land-values. It has the lowest AIC value (167), meaning it is the most parsimonious model and the closest to reflecting the process that generates the land-value observations. Comparison of the AICs reflects an improvement of 31% and 16% over a model based only on public (1st) and private (2nd) mobility respectively. Furthermore, it improves by 22% over a model that excludes geometric access information (3rd). We do not observe major differences (up to 4%) in the adjusted R2 across the models. However, judging by the AIC comparisons, there is an important improvement in the land-value modelling approach when addressing more than one transport mode.
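To make the AIC comparison concrete, the sketch below computes the statistic for ordinary least squares with Gaussian errors, AIC = 2k - 2 ln L, and compares single-mode models against a model using both modes. The data are synthetic and the variable names are hypothetical; the sketch illustrates the comparison logic only and does not reproduce the values reported here.

```python
import numpy as np

def gaussian_aic(X, y):
    """AIC = 2k - 2 ln L for an OLS model with Gaussian errors."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    k = A.shape[1] + 1  # regression coefficients plus the error variance
    log_lik = -0.5 * n * (np.log(2.0 * np.pi) + np.log(rss / n) + 1.0)
    return 2.0 * k - 2.0 * log_lik

# Synthetic data in which both access measures matter (hypothetical).
rng = np.random.default_rng(7)
n = 400
public_access = rng.normal(size=n)
private_access = rng.normal(size=n)
log_value = (0.5 * public_access + 0.7 * private_access
             + rng.normal(scale=0.4, size=n))

aic_public_only = gaussian_aic(public_access.reshape(-1, 1), log_value)
aic_private_only = gaussian_aic(private_access.reshape(-1, 1), log_value)
aic_both = gaussian_aic(
    np.column_stack([public_access, private_access]), log_value
)
print(aic_public_only, aic_private_only, aic_both)
```

The model with both modes obtains the lowest AIC despite the penalty for its extra parameter, mirroring the argument that addressing more than one transport mode improves the model.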
By comparing the original model against the 4th and 5th alternatives, we determined that CBD and “jobs” access together contribute to a better model. CBD access is clearly a dominant predictor, judging by its estimated coefficient (Table 2) and by the effect on the AIC statistic when it is excluded in the 5th model. Some caution is needed, however, when interpreting what “jobs access” reveals, given its sign reversal and its multicollinearity with CBD access and with geometric via geographic access.
Furthermore, addressing geometric accessibility seems not only to improve the goodness of fit but also to reduce the spatial dependence of the regression residuals by almost 200 m. This means that less variability in the land-value observations is left unexplained, and indicates that geometric access predictors add spatialized information at a localized scale. Such information is comparable to that contributed by the submarket predictors and the observation co-ordinates. This is also deduced from the AIC statistic of the 6th model and from the larger number of geometric access predictors retained in that model.
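One common way to express the distance over which residuals remain spatially dependent is the range of an empirical semivariogram. This section does not state which estimator was used, so the sketch below is an assumption: a binned semivariogram computed on a synthetic residual field, with hypothetical coordinates in metres.

```python
import numpy as np

def empirical_semivariogram(coords, residuals, bin_edges):
    """Mean of 0.5 * (r_i - r_j)^2 over point pairs, per distance bin."""
    coords = np.asarray(coords, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    i, j = np.triu_indices(len(residuals), k=1)  # all unique pairs
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    half_sq_diff = 0.5 * (residuals[i] - residuals[j]) ** 2
    which_bin = np.digitize(dist, bin_edges) - 1
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        in_bin = which_bin == b
        if in_bin.any():
            gamma[b] = half_sq_diff[in_bin].mean()
    return gamma

# Synthetic residuals with a smooth spatial pattern plus noise (hypothetical).
rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 1000.0, size=(300, 2))
residuals = (np.sin(coords[:, 0] / 150.0) + np.cos(coords[:, 1] / 150.0)
             + 0.1 * rng.normal(size=300))

bin_edges = np.arange(0.0, 501.0, 50.0)  # 50 m lag bins up to 500 m
gamma = empirical_semivariogram(coords, residuals, bin_edges)
print(np.round(gamma, 3))
```

Semivariance that rises with distance and then levels off indicates spatial dependence up to roughly the distance at which it flattens; a model whose residuals flatten at a shorter distance leaves less spatial structure unexplained.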
The cross-validation with the test data shows that variations in prediction accuracy among models are modest. The original model shows the lowest RMSEP (0.3150), which is equivalent to 1.37 units in local currency. The N_RMSEP shows that the standard deviation of the prediction errors remains roughly the same for all the models. These statistics were somewhat unexpected. However, even at this minor scale, the largest differences are observed when excluding the geometric-access and the submarket predictors.
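The prediction-error statistics can be sketched as follows. The RMSEP definition is the standard root mean squared error on held-out data. Dividing by the range of the observed values is only one possible normalization and is an assumption here, since the section does not define N_RMSEP. The back-transformation assumes land-values were modelled in natural logarithms, under which exp(0.3150) matches the 1.37 quoted above.

```python
import math

def rmsep(observed, predicted):
    """Root mean squared error of prediction on held-out observations."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def normalized_rmsep(observed, predicted):
    """RMSEP divided by the observed range (assumed normalization)."""
    return rmsep(observed, predicted) / (max(observed) - min(observed))

# Back-transforming the reported log-scale RMSEP (assumes a natural log).
reported_rmsep = 0.3150
back_transformed = math.exp(reported_rmsep)
print(round(back_transformed, 2))
```

Under the log assumption, the back-transformed figure reads most naturally as a multiplicative factor on the predicted land-value.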
Finally, Figure 6 shows the regression diagnostics. From left to right, we observe that the dispersion of the residuals follows an approximately constant variance. The QQ plot shows that prediction errors closely follow a normal distribution. The plot on the right shows that errors are mostly centred on 0, with a slight tendency to overestimate extremely low values and underestimate extremely high values. This indicates slightly non-linear behaviour in the data. However, by visually judging this figure and considering the magnitude of the errors in the cross-validation, we determined that such behaviour has minor effects on the performance of the model.
Diagnostics of model residuals.
Conclusions
We implemented a comprehensive operationalization of accessibility indicators to develop a residential land-value model in Guatemala City. The model incorporated metrics of geographic, geometric and geometric via geographic access. Those were computed as indices that incorporate the variation of the access disparity due to current availability of transport modes and adjusted to the predominant use of each mode across the city.
Following our first hypothesis, land-value research would benefit from considering more comprehensive definitions of accessibility. Integrating the available transport modes into the model modestly improved the goodness of fit to the observed data (adjusted R2) and the prediction accuracy of non-observed data. Nevertheless, it yielded a model that better represents the true process generating the observations (AIC), compared to similar models addressing access by only one transport mode. Furthermore, geometric accessibility brings spatialized and localized information that contributes to a parsimonious model and better explains the variability of land-values. This was deduced from observing a reduction of the spatial dependence in the model residuals.
Following our second hypothesis, we conclude that geometric accessibility does capitalize land at location (to some extent) and as a reachable resource. Similar to the results of Chiaradia et al. (2009), integration (network closeness), analysed at a localized radius of 0.8–1.5 km, has a negative effect on residential land-values. In Guatemala City, local integration outlines areas with compact gridiron layouts in old neighbourhoods. In turn, relative closeness to anywhere in the city in areas with larger blocks seems to capitalize land-values. Only hierarchy of roads on the neighbourhood scale, as analysed by SSx choice (network betweenness), has a positive impact on residential land-values. Furthermore, we determined that the potential reachability of geometric access by available transport modes is positively associated with the value of land. We consider this metric as complementary, as it could reflect access to economically vital areas (or with the potential for such) not addressed explicitly in other access metrics.
From the model results, we found that land-values in Guatemala City differ by up to 253% following CBD access, which was the strongest predictor. Thus, we conclude that there is a predominantly mono-centric land-value structure, which strongly reflects centralized economic activities, as described by Ford (1996). Access to jobs was the most important negative externality. We assume that when other predictors better explain such opportunities, this metric reveals the negative effects of urban centrality (pollution and congestion), also considering that it is computed using trip attraction as a proxy for job opportunities.
The results have some limitations and open a few paths for further research. Geographic access metrics per transport mode might bring multicollinearity problems when using a multivariate regression method. In this research we addressed this situation by computing access indices that, unlike Adair et al. (2000), are weighted by the predominance of, or preference for, each transport mode across the city. However, it is necessary to consider potential mobility shifts over time, to test improvements from other approaches to estimating the indices, or to use other regression techniques that deal with multiple collinear predictors. Unlike Bourassa et al. (2007), the use of submarket variables and geographic co-ordinates did not fully remove spatial dependence, meaning that some unexplained localized information could be further addressed with the help of spatial econometrics or geostatistical approaches. The ability of the geometric via geographic accessibility variable to explain land-values should be investigated in different urban contexts. Different time-decay parameters could increase or reduce the information that is added to the land-value model.
Several Latin American cities show centralized economic activities reflected in mono-centric structures, similar to Guatemala City. Yet Guatemala City also has its unique characteristics, such as its urban form, topography, the immediate proximity of the airport to the CBD, the arrangement of main infrastructure and particular local values. The methodological approach presented in this research could therefore be replicated in other Latin American cities and other areas, but not without adjusting it using local knowledge.
Finally, some implications can be drawn for planning and land administration practice. More robust models for mass valuation purposes, with added transparency to the process, could be achieved by simultaneously addressing public and private accessibility from a geographic perspective. Furthermore, it could be relevant for transport planners to explicitly model the relations between accessibility by public transport infrastructure and the value of land. In developing countries, this could mean producing useful information to help the planning of transport projects in relation to the attractiveness and capitalization of land. Moreover, the approach could be an initial step towards foreseeing the potential impact of mobility-related projects on land-value. This is highly relevant for grounding financial mechanisms, such as land-value capture, that strengthen the economic viability and sustainability of projects.
Acknowledgements
We would like to thank the local experts from “Inspecciones Globales” in Guatemala City for their support in collecting real estate appraisal data and for granting access to their reports on the local real estate market.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work reported in this research was funded by NUFFIC through the NICHE Project.
