A Spatial Econometric Analysis of Productivity Variations Across US Cities

Abstract

Using a dynamic spatial panel model applied to 377 US Metropolitan Statistical Areas (MSAs), estimated over the period 2011-2021, significant differences are found between large MSAs regarding the relationship between labour productivity and economic mass, as measured by GDP. The methodology adopted illustrates the state of the art for spatial econometric modeling as it is often needed in practice, allowing for multiple endogenous regressors and dynamic effects. The estimation method applies synthetic instruments designed to limit negative effects of instrument overabundance.

Keywords

spatial econometrics; spatial statistics and spatial econometrics; spatial analysis methods; dynamic spatial panel; productivity; US metropolitan statistical areas; synthetic instruments

Introduction

This paper develops a methodology for analyzing productivity variation across cities and regions in the context of limited data availability and pervasive endogeneity. Informal data analysis shows that there are differences in economic performance as measured by productivity across Metropolitan Statistical Areas¹ (MSAs) and more formal modelling via dynamic spatial panel data models provides a more detailed and precise analysis than the signals provided by the initial informal analysis.

A distinctive contribution in the paper is a method that, rather than a single elasticity, allows the heterogeneity of productivity elasticities across cities and regions to be estimated. Based on GDP and employment data, this paper explores the productivity trends in the post-crisis era 2011 to 2021 for 377 MSAs and in particular for the 8 largest MSAs of the USA. These are the MSAs with employment greater than the 0.98 quantile of the employment distribution in 2001, amounting to 8 MSAs, namely those centered on Boston, Chicago, Dallas, Houston, Los Angeles, New York, Philadelphia and Washington.² We refer to these throughout as the top 8 MSAs. The approach adopted could in principle extend to other MSAs, but in the interest of practicality and ease of demonstration the number is purposefully kept low.

The analysis focusses on the effect of output, measured by GDP, on labour productivity, measured by the ratio of GDP to employment. This focus is chosen because persistent welfare inequalities between cities evidently depend on variations in productivity. Estimation is based on a dynamic spatial panel data model using a standard GMM estimator. Attention is paid to the possibility of endogeneity bias in parameter estimates, the possibility of nonstationarity of estimates, and issues caused by an overabundance of instrumental variables. In particular the paper highlights the application of synthetic instrumental variables as advocated by Fingleton (2023) in dynamic spatial panel data modelling, which helps to put estimation and inference on a more secure basis. This is because synthetic instruments can help reduce problems caused by a large number of instrumental variables, such as biased standard errors and unreliable diagnostic tests of model validity. Most notably, the standard test of overidentification, the Sargan-Hansen J test, can be unreliable. The paper provides estimates of short- and long-run elasticities that take account of spillovers across space and time.

Theoretical Basis

There are several theoretical approaches that could be adopted to establish an equation that is the basis for empirical analysis, and although each commences with different assumptions, ultimately we do seem to have a case of equifinality, whereby each theory results in the same, or a very similar, estimating equation in which productivity is a partial function of economic mass or some equivalent, and typically there is evidence of increasing returns to scale. Some alternatives are summarized in the Appendix. Here the focus is on the approach of Fingleton et al. (2023), in which the starting point is a fundamental equation that establishes a connection between a city’s output level and its economic mass. Representing the output level at time t in city i by Q_it, and approximating the economic mass in city i at time t by its level of employment (L_it), this fundamental equation asserts that an increase in economic mass, and hence variety of inputs (Fujita, Krugman, and Venables 1999; Jacobs 1969), which can be perceived as the city’s productive capacity, leads to a rise in production. It is crucial to note that this relationship exhibits nonlinearity. In fact, an increase in economic mass may generate more than a proportionate increase in production, so that a 1 percent increase in mass could potentially yield a production increase surpassing 1 percent. Increasing returns to scale are encapsulated in equation (1), and their presence in the data would be confirmed if the estimated parameter γ exceeds 1.

\begin{array}{ll} Q_{i t} = ϕ L_{i t}^{γ} & γ \geq 0, i = 1, \dots, n; t = 1, \dots, T \end{array}

(1)

Equation (1) does not explicitly incorporate productivity, but we can introduce it by taking logarithms and reorganizing the equation so that the logarithm of productivity level, ln P_it appears as the left-hand side variable, hence

\begin{array}{rl} \ln Q_{i t} & = \ln ϕ + γ \ln L_{i t} \end{array}

(2a)

\begin{array}{rl} \ln L_{i t} & = \frac{\ln Q_{i t}}{γ} - \frac{\ln ϕ}{γ} \end{array}

(2b)

\begin{array}{rl} \ln Q_{i t} - \ln L_{i t} & = \ln Q_{i t} - \frac{\ln Q_{i t}}{γ} + \frac{\ln ϕ}{γ} \end{array}

(2c)

\begin{array}{rl} \ln \left( \frac{Q_{i t}}{L_{i t}} \right) & = \ln P_{i t} = \frac{\ln ϕ}{γ} + \left( \frac{γ - 1}{γ} \right) \ln Q_{i t} \end{array}

(2d)

\begin{array}{rl} \ln P_{i t} & = a + b \ln Q_{i t} \end{array}

(2e)
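The mapping from the returns-to-scale parameter γ to the slope b = (γ − 1)/γ implied by equations (2d) and (2e) can be illustrated with a minimal sketch. The function name and the example values of γ are illustrative assumptions and are not taken from the paper's estimates.

```python
def productivity_elasticity(gamma: float) -> float:
    """Return b = (gamma - 1) / gamma, the slope on ln Q in equation (2e)."""
    if gamma <= 0:
        # b is undefined at gamma = 0 because of the division by gamma
        raise ValueError("gamma must be strictly positive for b to be defined")
    return (gamma - 1.0) / gamma


# Illustrative values of gamma only (not estimates from the paper)
for gamma in (0.5, 1.0, 1.25, 2.0, 10.0):
    b = productivity_elasticity(gamma)
    print(f"gamma = {gamma:5.2f}  ->  b = {b:7.4f}")
```

Values of γ below 1 give a negative b, γ = 1 gives b = 0, and larger γ moves b towards, but never beyond, 1.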

It is evident that when γ ≥ 1, indicating constant or increasing returns to scale, 1 > b = (γ − 1)/γ ≥ 0, and as γ approaches infinity, b approaches 1.0. Therefore, a positive value of b, which is less than 1.0, indicates increasing output causing increasing productivity. With 0 ≤ γ < 1, a 1 percent increase in labour causes a less than 1 percent increase in output, so we have diminishing returns for output with respect to labour, and falling productivity with respect to output, as indicated by b < 0. The basic theory, summarised by equations (1) through to (2e), is that higher output enhances externalities associated with economic mass, causing higher levels of productivity, and higher productivity is associated with higher wage levels and higher returns on other factors and possibly increases over a range of other indicators of social and economic well-being. As noted by Fingleton et al. (2023), ‘a large spatial economic mass not only provides the diversity and variety of inputs needed for production, and large ‘home’ (local) market opportunities for new enterprises (Jacobs 1969), but also a concentration of other assets that aid local businesses, such as hard and soft infrastructures, educational centres, skilled labour, institutional networks, and public and private research centres’. The large economic mass typical of the top 8 MSAs means that they in particular should have more extensive externalities deriving from the additional diversity and variety of inputs available for production in large cities, causing increasing returns to scale. A circular relationship between economic mass and productivity means that economic mass is best treated as endogenous, and this assumption is fundamental to the econometric methods adopted.

Figures 1 and 2 provide an initial glimpse of the data, taken from the Bureau of Economic Analysis, as updated on December 8, 2022 to include new statistics for 2021 and revised statistics for 2017-2020. These data comprise GDP and employment levels for the years 2011 to 2021. Figure 1 shows that the level of output varies quite dramatically from the high point of New York, with Chicago, Miami, Los Angeles and San Francisco also very prominent. Figure 2 shows a much less dramatic variation in mean productivity level across cities, but with mean productivity about 4 times greater in the highest MSA than in the lowest, some MSAs are possibly less productive than they could be. Increasing productivity can be a source of increasing output without the costs otherwise associated with additional inputs, such as additional labor, so that the economy as a whole produces more goods and services for the same amount of work. So for the US economy as a whole, raising productivity among lagging MSAs can have a beneficial effect on material welfare. Productivity is a critical source of prosperity. Notice however that Figures 1 and 2 do not show a clear-cut relationship between GDP and productivity, as evidenced also by the data, from the same source, in Tables 1 and 2. Clearly there are additional factors to take into account in trying to explain productivity variation across MSAs. There are numerous papers taking various approaches to understanding US regional and urban productivity variation; to cite just a few, see for example Moomaw (1983), Gerking (1993), Burger and Meijers (2010), Melo et al. (2017), Parilla and Muro (2017), and Caliendo et al. (2018).

Figure 1.

Mean GDP 2011-2021 across MSAs.

Figure 2.

Mean productivity 2011-2021 across MSAs.

Table 1.

Range of GDP Levels From Lowest 8 MSAs to Highest 8.

MSA	GDP 2011	MSA	GDP 2021
Bottom 8
Toledo, OH	1982474	Sebring-Avon Park, FL	2428533
Grants Pass, OR	2028517	Gadsden, AL	2596715
Sebring-Avon Park, FL	2255579	Lewiston, ID-WA	2676639
Lewiston, ID-WA	2294737	Grants Pass, OR	2954596
Walla Walla, WA	2705274	Walla Walla, WA	2997301
Gadsden, AL	2802342	Danville, IL	3024701
Portland, ME	2863836	Portland, ME	3068263
Hot Springs, AR	3006322	Hot Springs, AR	3164010
Highest 8
Boston, MA	348885073	Boston, MA	444402874
Philadelphia	359488674	Houston, TX	463233301
Dallas, TX	365601169	Washington, DC	511253994
Houston, TX	392977179	Dallas, TX	513979216
Washington, DC	446255168	San Francisco, CA	577347865
Chicago, IL	556699079	Chicago, IL	630126315
Los Angeles, CA	759138833	Los Angeles, CA	950157776
New York, NY	1346878804	New York, NY	1598387648

Table 2.

Range of Productivity Levels From Lowest 8 MSAs to Highest 8.

MSA	PRO 2011	MSA	PRO 2021
Bottom 8
St. George, UT	49.15	McAllen-Edinburg-Mission, TX	51.35
Brownsville-Harlingen, TX	51.68	Brownsville-Harlingen, TX	52.44
McAllen-Edinburg-Mission, TX	52.90	Gadsden, AL	55.96
Gadsden, AL	57.87	Portland, ME	56.27
Missoula, MT	57.92	St. George, UT	58.48
Grants Pass, OR	57.94	Hot Springs, AR	59.12
Daphne-Fairhope-Foley, AL	58.39	Hammond, LA	60.07
Hot Springs, AR	58.74	Hattiesburg, MS	60.30
Highest 8
San Francisco, CA	124.30	Wheeling, WV-OH	126.08
Lake Charles, LA	124.70	New York, NY	127.32
Lima, OH	125.98	Greeley, CO	129.07
Midland, TX	128.33	Vallejo, CA	135.18
Odessa, TX	128.41	Seattle, WA	155.72
Bridgeport-Stamford-Norwalk, CT	128.74	San Francisco, CA	176.15
Beaumont-Port Arthur, TX	131.48	Midland, TX	264.14
San Jose-Sunnyvale-Santa Clara, CA	160.72	San Jose-Sunnyvale-Santa Clara, CA	268.47

Introducing Spillovers

As possibly suggested by Figure 2, productivity in city $i (i = 1, \dots, n)$ may partly depend on productivity in ‘nearby’ cities, both at time t and time t − 1, and it may also depend on i′s own productivity at time t − 1. Mainly following Baltagi et al. (2019), a rationale for the presence of such spatio-temporal spillovers can be suggested. Assume initially that unless disturbed, productivity tends to an equilibrium so that P_i = P_it = P_it−1. However, if a shock occurs so that P_it ≠ P_it−1, we assume that the subsequent path of log productivity levels follows an autoregressive process with |δ| < 1 so that

\begin{array}{ll} \ln P_{i t} = ς_{i} + δ \ln P_{i t - 1} & i = 1, \dots, n; t = 1, \dots, T \end{array}

(3)

In equation (3), ς_i is the i′th element of a time-constant (n by 1) vector ς, and assuming that T is a large number, and |δ| < 1, then equation (3) converges to $\ln P_{i T} = ς_{i} / (1 - δ)$. In order to introduce spatial lags, some measure of inter-city relationships is required. Invariably in the spatial econometrics literature this is in the form of a so-called W matrix of dimension n representing the interconnectivity of MSAs. The W matrix³ is time-invariant and contains fixed non-negative values, so we avoid having to estimate n² − n parameters, one for each city pair, but instead use a single estimated scalar parameter ρ.
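The convergence of the autoregressive process in equation (3) to ς_i/(1 − δ) can be checked numerically. The sketch below iterates the recursion for a single city; the parameter values and the function name are illustrative assumptions only.

```python
def iterate_log_productivity(varsigma, delta, start, periods):
    """Iterate ln P_t = varsigma + delta * ln P_{t-1} and return the full path."""
    path = [start]
    for _ in range(periods):
        path.append(varsigma + delta * path[-1])
    return path


# Illustrative values only: any |delta| < 1 gives convergence
varsigma, delta = 1.2, 0.7
path = iterate_log_productivity(varsigma, delta, start=0.0, periods=200)
limit = varsigma / (1.0 - delta)  # the equilibrium implied by equation (3)

print(f"value after 200 periods: {path[-1]:.6f}")
print(f"varsigma / (1 - delta):  {limit:.6f}")
```

The gap between the path and the limit shrinks by the factor δ each period, which is why |δ| < 1 is required.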

Various designations of W are possible, but in this paper the W matrix is given by

W_{i j} = d_{i j}^{- 1} / ψ

(4)

in which d_ij is the straight line distance between MSAs i and j, and ψ = max(eig) is the maximum eigenvalue of the matrix with elements $d_{i j}^{- 1}$. Thus W is a symmetrical matrix with off-diagonal cells diminishing in value according to the distance between i and j. Scaling the matrix by ψ aids interpretation of the estimate of ρ, which one would reasonably expect to be within the bounds 1/min(eig) < ρ < 1/max(eig) = 1 for dynamic stability and stationarity. We detail stationarity conditions subsequently.
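A minimal sketch of how a matrix of the form in equation (4) might be built is given below. It assumes planar coordinates with Euclidean distance and a zero main diagonal; the coordinates are randomly generated placeholders, and the function name is hypothetical. The paper's own distance calculation may differ.

```python
import numpy as np


def inverse_distance_weights(coords):
    """Inverse-distance matrix scaled by its maximum eigenvalue, as in equation (4).

    Assumptions of this sketch: Euclidean distance between planar points,
    distinct points, and zeros on the main diagonal.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    inv = np.zeros_like(dist)
    off_diag = ~np.eye(len(coords), dtype=bool)
    inv[off_diag] = 1.0 / dist[off_diag]
    psi = np.linalg.eigvalsh(inv).max()  # largest eigenvalue of the symmetric matrix
    return inv / psi


rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(6, 2))  # placeholder locations, not MSA centroids
W = inverse_distance_weights(coords)
eig = np.linalg.eigvalsh(W)
print("max eigenvalue of W:", eig.max())
print("lower bound 1/min(eig):", 1.0 / eig.min())
```

After scaling, the largest eigenvalue of W equals 1, so the upper bound on ρ is 1, while the lower bound 1/min(eig) is negative.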

To obtain spatial lags, we first introduce spatial dependence by multiplying equation (3) by the (n by n) matrix ρW to give

\begin{array}{ll} ρ W \ln P_{t} = ρ W ς + ρ W δ \ln P_{t - 1} & t = 1, \dots, T \end{array}

(5)

Subtracting equation (5) from equation (3) gives

\ln P_{t} - ρ W \ln P_{t} = ς + δ \ln P_{t - 1} - (ρ W ς + ρ W δ \ln P_{t - 1})

(6)

hence

(I - ρ W) \ln P_{t} = (δ I - ρ δ W) \ln P_{t - 1} + (I - ρ W) ς

(7)

in which I is an identity matrix of order n. This gives equation (8) in which ln P_t in each MSA is a function of ln P_t in other MSAs, particularly spilling over from nearby MSAs according to the structure of W, and is also a function of ln P_t−1 and time-invariant heterogeneity across MSAs denoted by ς.

\ln P_{t} = ρ W \ln P_{t} + δ \ln P_{t - 1} - ρ δ W \ln P_{t - 1} + (I - ρ W) ς

(8)
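The algebra leading from equations (3) and (5) to equation (8) can be verified numerically. In the sketch below, the weights matrix, parameter values and data are arbitrary placeholders; log productivity is generated from equation (3) and then compared with the right-hand side of equation (8).

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho, delta = 5, 0.4, 0.6            # illustrative dimension and parameters

# Placeholder symmetric weights with zero diagonal, scaled by the largest eigenvalue
A = rng.uniform(size=(n, n))
W = (A + A.T) / 2.0
np.fill_diagonal(W, 0.0)
W = W / np.linalg.eigvalsh(W).max()

varsigma = rng.normal(size=n)          # time-invariant heterogeneity
lnP_prev = rng.normal(size=n)          # ln P_{t-1}
lnP_now = varsigma + delta * lnP_prev  # equation (3) in vector form

I = np.eye(n)
rhs_eq8 = (rho * W @ lnP_now + delta * lnP_prev
           - rho * delta * W @ lnP_prev + (I - rho * W) @ varsigma)

print("max abs difference:", np.abs(lnP_now - rhs_eq8).max())
```

The difference is zero up to rounding error, which confirms that equation (8) is a rearrangement of equation (3) under the restriction that the coefficient on W ln P_t−1 equals −ρδ.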

Spillovers across space are likely to occur because MSA boundaries are permeable; labour quality, various other socio-economic attributes, capital, ownership and management structures, supply-chain effects, spillovers of technology, etcetera, and consequently productivity, in ‘nearby’ MSAs are likely to affect, and be affected by, productivity in MSA i.

The Model

The fundamental theory relating productivity to output is encapsulated by equation (2e), while equation (8) expresses how productivity is related to productivity across space and time. These two fundamental causes of productivity variation are combined in equation (9) which is the result of adding a + β₁ ln Q_t to equation (8) in accordance with equation (2e). In addition the equation assumes that productivity is affected by variations in capital stock K_t and human capital H_t across MSAs, since these feature widely in many accounts of the causes of productivity variation, for example Mankiw, Romer, and Weil (1992), Becker (1993), Barro and Lee (2013), Ciccone and Peri (2006), and the World Bank (2019), to mention just a few. Notice however that spatial lags of regressors such as W ln Q_t, W ln K_t and W ln H_t are not included in equation (9) because they do not follow directly from theory.

\begin{array}{l} \ln P_{t} = ρ W \ln P_{t} + δ \ln P_{t - 1} - ρ δ W \ln P_{t - 1} + a + β_{1} \ln Q_{t} \\ \qquad + β_{2} \ln K_{t} + β_{3} \ln H_{t} + (I - ρ W) ς \end{array}

(9)

In equation (9) we have ln P_t as a function of the spatial lag, the temporal lag and the spatial lag of the temporal lag. Regarding the variables K_t and H_t, we do not have explicit measures but nonetheless attempt to capture their effects in estimation. The term $(I - ρ W) ς$ in equation (9) represents other time-invariant variables. Naturally this very simple model will fail to capture all of the factors affecting productivity variation across MSAs, so we make the simplifying assumption that through the period considered, 2011-2021, some omitted variables are effectively constant and can be treated as part of the time-invariant individual MSA heterogeneity represented by $μ = (I - ρ W) ς$ with $μ \sim N (0, σ_{μ}^{2})$. The remaining random variation across cities and time is represented by $ν_{t} \sim N (0, σ_{ν}^{2})$, so together we have a compound error process ɛ_t = μ + ν_t (t = 1, …, T), with the assumption that the individual effects μ_i and disturbances ν_it are uncorrelated. The error term ν_t is important because it attempts to capture random shocks. Of course there are shocks throughout the study period, but the largest is undoubtedly the COVID pandemic. One might expect this to have an impact on fundamental relationships in the model, notably as given by β₁ ln Q_t. Throughout we are aware of the possible correlation between the error term and ln Q_t, and shock-impacts are another reason to treat ln Q_t as an endogenous variable.

\begin{array}{l} Δ \ln P_{t} = ρ W Δ \ln P_{t} + δ Δ \ln P_{t - 1} - ρ δ W Δ \ln P_{t - 1} \\ \qquad + β_{1} Δ \ln Q_{t} + β_{2} Δ \ln K_{t} + β_{3} Δ \ln H_{t} + Δ ε_{t} \end{array}

(10)

Equation (10) is obtained by taking first differences of equation (9). Section 5 outlines the rationale for estimating a model similar to equations (9) and (10).
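The effect of first differencing on time-invariant terms can be illustrated with a small simulated panel. The sketch below uses a single stand-in regressor rather than the full specification of equation (9), and all values are simulated placeholders.

```python
import numpy as np

rng = np.random.default_rng(42)
n, T, beta = 5, 6, 0.3                    # illustrative dimensions and slope

mu = rng.normal(size=(n, 1))              # time-invariant individual effects
x = rng.normal(size=(n, T))               # stand-in regressor
nu = rng.normal(scale=0.1, size=(n, T))   # idiosyncratic disturbances
y = mu + beta * x + nu                    # levels equation including the effects

dy = np.diff(y, axis=1)                   # first differences over time
dx = np.diff(x, axis=1)
dnu = np.diff(nu, axis=1)

# If mu has been eliminated, dy is explained by beta * dx + dnu alone
residual_after_differencing = dy - beta * dx - dnu
print("max abs residual:", np.abs(residual_after_differencing).max())
```

Differencing costs one time period per MSA (T − 1 columns remain) but removes μ exactly, whatever its correlation with the regressors.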

Further Preliminary Insights

Using Bureau of Economic Analysis data described above, taking averages across 377 cities, log GDP⁴ rises each year from 16.4203 in 2011 to 16.5668 in 2021, with the exception of 2020 when GDP is below its 2019 level. To look at trends in output disparities across cities, we use the coefficient of variation, which is the standard deviation divided by the mean so as to obtain a scale invariant measure. The coefficient of variation of log GDP rises inexorably each year from 0.0719 in 2011 to 0.0740 in 2021. With regard to employment, a similar picture emerges for mean log employment⁵ averaging over the 377 cities, which increases each year from 12.0594 in 2011 to 12.1591 in 2021, with the exception of a dip in 2020. The coefficient of variation for log employment also rises each year from 0.0912 in 2011 to 0.0937 in 2021. Overall it appears that there is consistent post-crisis growth in output and employment with the exception of a COVID inspired slump in 2020, but the increasing coefficients of variation through time suggest widening differences between cities. This suggests that over the 2011-2021 period, urban economies experienced varying rates of recovery from the 2008 economic crisis and differing reactions to the shock of the COVID pandemic.
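The coefficient of variation used above can be sketched as follows. The input values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which divisor is used.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population SD assumed here)."""
    return statistics.pstdev(values) / statistics.fmean(values)


levels = [120.0, 95.0, 240.0, 60.0]        # hypothetical values, not BEA data
rescaled = [1000.0 * v for v in levels]    # same values in different units

cv_levels = coefficient_of_variation(levels)
cv_rescaled = coefficient_of_variation(rescaled)
print(f"CV original units: {cv_levels:.6f}")
print(f"CV rescaled units: {cv_rescaled:.6f}")
```

Multiplying every observation by a constant leaves the ratio unchanged, which is the sense in which the measure is scale invariant.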

Figure 3, by focussing on a particular year and by taking logs, gives some insight into the otherwise somewhat concealed nexus between output and productivity. Figure 3 shows the relationship between log level of productivity and log level of GDP in 2011 for all 377 MSAs, indicating a strong positive correlation. Figure 4 emphasizes this by showing the path of log productivity (total GDP divided by total employment) for the top 8 MSAs versus the path for the smaller MSAs over the 2011-2021 period. Taken as a group, top 8 productivity is always higher than smaller MSA productivity. Overall we see that productivity was higher in 2021 than in 2011, but the path for the top 8 MSAs indicates no growth from 2012 to 2017, whereas the smaller MSAs show fairly consistent growth from 2013. For both MSA groups, the effect of the 2019 COVID pandemic is evident from the levelling off of productivity growth between 2019 and 2020, with a sharp upturn from 2020. So the evidence is that while there were similarities between the MSA groups, there are also differences. One reason for the relatively muted growth of productivity for the top 8 MSAs as a group is the diverse within-group performance, as is evident in Figure 5.

Figure 3.

MSA Output versus Productivity in 2011.

Figure 4.

Productivity trends: Top 8 group compared with other MSAs.

Figure 5.

Productivity trends: Top 8 individuals compared with other MSAs.

Econometric Analysis

Estimation

Our estimator, equation (12), is in terms of differences in logs, or exponential growth rates, and this helps us cope with an immediate problem in trying to estimate equation (9), which is lack of data. Clearly it would be ideal to include the level of capital stock in each MSA and for each year among the set of explanatory variables. However, capital stock data at the subregional level are rarely if ever available, but as suggested in relation to equation (33), it is possible to assume that physical capital growth is proportionate to the rate of growth of output. According to Fingleton and McCombie (1998), ‘it is a stylized fact that for the advanced countries the capital stock grows either at approximately the same rate as output or slightly slower, and this assumption is applied in an analysis of productivity growth across European Union regions. McCombie and Thirlwall (1994, p.557) found that regressing capital growth on output growth for a sample of advanced countries, gave a slope coefficient that was not significantly different from unity’. Likewise, the author finds that a simple linear OLS regression of log US capital stocks⁶ on log real GDP⁷ gives a slope coefficient of b₂ = 0.9051, R² = 0.998. As McCombie and Roberts (2007) observe, ignoring the effect of capital accumulation on the growth of productivity is permissible if the capital-output ratio does not display any growth.⁸ Accordingly we assume that, for MSA i, ln K_it = b₂ ln Q_it. Bernat (1996) found a large impact on a state’s productivity growth by the productivity growth of the surrounding states, and McCombie and Roberts (2007) suggest that this could reflect a misspecification problem. Bernat (1996) did not include the growth of the capital stock [in the Verdoorn law] and “to the extent that this is not perfectly correlated with output growth, this will affect the error term”. In fact MSA output growth over 2011 to 2021 is also significantly spatially correlated. 
Based on W defined by equation (4), Moran’s I_M = 0.0795, with expected value under the null hypothesis E(I_M) = −0.0027 and with var(I_M) = 2.4656e−5 under the randomization assumption, Z = (I_M − E(I_M))/var(I_M)^0.5 = 16.55, which is very significantly different from 0 when referred to the N(0,1) distribution. One would anticipate that omitted capital stock growth would be correlated with spatially correlated output growth, and it is possible that the error induced by the omission of capital stock is spatially correlated, and the presence of the spatial lag in the model, which turns out to be highly significant, may to some extent be correcting for this misspecification.
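The standardized Moran statistic quoted above can be reproduced from the reported quantities. The sketch below uses only the figures given in the text, together with the standard result that the expected value of Moran's I under the null hypothesis is −1/(n − 1); the variable names are illustrative.

```python
import math

n = 377                       # number of MSAs
moran_i = 0.0795              # reported Moran's I
variance_i = 2.4656e-5        # reported variance under randomization

# Standard expectation of Moran's I under the null hypothesis
expected_i = -1.0 / (n - 1)

# z statistic using the rounded expectation reported in the text
z_stat = (moran_i - (-0.0027)) / math.sqrt(variance_i)

print(f"E(I) = {expected_i:.4f}")
print(f"z    = {z_stat:.2f}")
```

Both printed values agree with the figures reported in the text to the precision shown there.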

Lack of human capital data presents a similar problem. Ciccone and Hall (1996) use cross-sectional data at State level and at county level, but a problem arises finding credible MSA-specific educational attainment time series. Some series are available from the American Community Survey (ACS), but there are omissions across time and space, measurement errors, and a mismatch between ACS MSAs and MSAs in this study. Alternatively, we consider the number of high school graduates (including equivalents) and the number of bachelor’s degrees and find that these indicators at the national (USA) level are both closely correlated with national (metropolitan portion) real GDP levels. Regressing log high school graduates on log real GDP for 2011 to 2021 gives R² = 0.84. Regressing log bachelor’s degrees on log real GDP for 2011 to 2021 gives R² = 0.97. Using a much longer series, a simple linear OLS regression of log Index of Human Capital per Person for the United States,⁹ based on years of schooling and returns to education, on log real GDP gives b₃ = 0.18539, R² = 0.9619. We therefore assume that, for MSA i, ln H_it = b₃ ln Q_it. So we get around the problem of lack of data for physical and human capital by attempting to capture their effects via their relationship with Q_t. In equation (9), the term β₁ ln Q_t + β₂ ln K_t + β₃ ln H_t can be replaced by β₁ ln Q_t + β₂b₂ ln Q_t + β₃b₃ ln Q_t, and therefore the elasticity of P_t with respect to Q_t, broadly defined, can be represented as $β = (β_{1} + β_{2} b_{2} + β_{3} b_{3})$ in equation (11). Also the time-space diffusion parameter is θ = −ρδ, and while for simplicity this restriction is not maintained in estimation, we expect the estimate of θ to be negative. The levels equation encompassing these assumptions is

\begin{array}{l} \ln P_{t} = ρ W \ln P_{t} + δ \ln P_{t - 1} + θ W \ln P_{t - 1} + β \ln Q_{t} \\ \qquad + {\tilde{β}}_{1} D_{1} \ln Q_{t} + \dots + {\tilde{β}}_{8} D_{8} \ln Q_{t} + ε_{t} \end{array}

(11)

Equation (11) also includes the terms ${\tilde{β}}_{1} D_{1} \ln Q_{t} + \dots + {\tilde{β}}_{8} D_{8} \ln Q_{t}$ to enable estimation of MSA-specific elasticities for the j = 1, …, 8 top 8 MSAs. In these, D_j is an n by T matrix corresponding to top 8 MSA j, in which the row i belonging to that MSA contains 1s and all other cells are zeros. Element by element multiplication (or Hadamard product) of D_j and the n by T matrix ln Q gives the n by T matrix D_j ln Q, which comprises zeros except for row i, which contains MSA j′s values of ln Q_t for t = 1, …, T.
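The construction of an MSA-specific regressor by Hadamard product can be sketched with a toy example. The dimensions, the placeholder values of ln Q, the chosen row and the function name are all illustrative assumptions; stacking into a column with years varying fastest within each MSA is likewise an assumed ordering.

```python
import numpy as np

n, T = 4, 3                                   # toy dimensions (the paper has n = 377, T = 11)
lnQ = np.arange(1, n * T + 1, dtype=float).reshape(n, T)   # placeholder values


def msa_dummy(row, n, T):
    """n by T matrix of zeros with 1s in the row of the chosen MSA."""
    D = np.zeros((n, T))
    D[row, :] = 1.0
    return D


D1 = msa_dummy(row=2, n=n, T=T)               # hypothetical position of the chosen MSA
D1_lnQ = D1 * lnQ                             # element-by-element (Hadamard) product

# Stack as an nT by 1 column: all years for MSA 1, then all years for MSA 2, ...
stacked = D1_lnQ.reshape(n * T, 1)
print(D1_lnQ)
print(stacked.ravel())
```

Only the chosen MSA's row survives the product, so the coefficient on this regressor measures that MSA's deviation from the common elasticity β.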

\begin{array}{l} Δ \ln P_{t} = ρ W Δ \ln P_{t} + δ Δ \ln P_{t - 1} + θ W Δ \ln P_{t - 1} + β Δ \ln Q_{t} \\ \qquad + {\tilde{β}}_{1} D_{1} Δ \ln Q_{t} + \dots + {\tilde{β}}_{8} D_{8} Δ \ln Q_{t} + Δ ε_{t} \end{array}

(12)

Following standard practice, estimation of the corresponding difference equation (12) is via the Arellano and Bond (1991) difference GMM estimator for dynamic panel data¹⁰ using appropriate instrumental variables to obtain consistent estimates. Differencing nullifies the interdependence of lagged variables and the error term, because the time-invariant individual effects μ, which are correlated with the time and space-lagged dependent variables, are eliminated. The estimates of the parameters of difference equation (12) are also estimates of the parameters of the corresponding levels equation (11). Additional sources of endogeneity relate to ln Q. Given Z, which is an instrumental variable for ln Q, then D_jZ (j = 1, …, 8) is calculated in the same way as D_j ln Q. Reshaping each variable as an nT by 1 column vector gives, for illustration purposes, Table 3, which is a selection from the full nT (4147 = 377 times 11) by 21 data matrix pertaining to Boston (mainly), comprising ln Q, ln P, W ln P, W ln P₋₁, D₁ ln Q, Z and D₁Z. The full nT by 21 data matrix has columns ln Q, ln P, W ln P, W ln P₋₁, D₁ln Q, D₂ln Q, D₃ln Q, D₄ln Q, D₅ln Q, D₆ln Q, D₇ln Q, D₈ln Q, Z, D₁Z, D₂Z, D₃Z, D₄Z, D₅Z, D₆Z, D₇Z, D₈Z.

Table 3.

Sample of Data.

year	MSA	lnQ	lnP	WlnP	WlnP₋₁	D₁lnQ	Z	D₁Z
2020	Boise	17.32126	4.255461	2.1927889	2.1903998	0.00000	17.25677	0.00000
2021	Boise	17.4072	4.289242	2.201593	2.1927889	0.00000	17.34287	0.00000
2011	Boston	19.67025	4.712776	3.5004628	3.5027215	19.67025	19.63333	19.63333
2012	Boston	19.69349	4.717762	3.4992771	3.5004628	19.69349	19.65666	19.65666
2013	Boston	19.69802	4.698321	3.4954685	3.4992771	19.69802	19.65717	19.65717
2014	Boston	19.71468	4.691829	3.4949007	3.4954685	19.71468	19.67583	19.67583
2015	Boston	19.75402	4.688375	3.4963735	3.4949007	19.75402	19.71208	19.71208
2016	Boston	19.77376	4.689332	3.4944153	3.4963735	19.77376	19.73751	19.73751
2017	Boston	19.79305	4.693485	3.4963585	3.4944153	19.79305	19.75888	19.75888
2018	Boston	19.83607	4.719387	3.4996598	3.4963585	19.83607	19.80383	19.80383
2019	Boston	19.86599	4.74331	3.5124416	3.4996598	19.86599	19.83331	19.83331
2020	Boston	19.84189	4.775228	3.5183914	3.5124416	19.84189	19.80259	19.80259
2021	Boston	19.91224	4.812411	3.5366961	3.5183914	19.91224	19.87468	19.87468
2011	Boulder	16.85129	4.486357	2.8552694	2.8558235	0.00000	16.85340	0.00000
2012	Boulder	16.8552	4.468974	2.8578201	2.8552694	0.00000	16.86565	0.00000

Key: lnQ = log GDP; lnP = log productivity; WlnP = Spatial lag of log productivity; WlnP₋₁ = Spatial lag of lagged log productivity; D₁lnQ = log GDP for Boston; Z = Synthetic instrument for log GDP; D₁Z = Synthetic instrument for log GDP of Boston.

Given that endogenous variables are contemporaneously related to the errors, there needs to be a careful choice of instruments in order to satisfy moments equations. An important prerequisite is that there is no serial correlation in the errors, in which case lags of the regressors can legitimately act as instrumental variables and satisfy orthogonality of the instruments and the differenced errors. For example, some moments conditions are

E (\ln P_{i l} Δ ε_{i t}) = 0, \forall i, l = 1, \dots, t - 2, t = 3, \dots, T

(13)

E (w_{i} \ln P_{l} Δ ε_{i t}) = 0, \forall i, l = 1, \dots, t - 2, t = 3, \dots, T

(14)

where w_i is the 1 by n vector which corresponds to the i′th row of W.

Instruments created in this way are referred to as GMM instruments, or HENR instruments (after Holtz-Eakin, Newey and Rosen 1988), with one instrument per variable, time period and lag distance amounting to (T − 2)(T − 1)/2 instruments for each endogenous variable. It is apparent that creating instruments in this way is rather expensive, since GMM instruments lead to quadratic growth in the number of instruments with respect to T. Instrument overabundance can have a detrimental effect on inference. For instance, the estimated asymptotic standard errors of the efficient, two-step, GMM estimator are downwardly biased in small samples (Windmeijer 2005).¹¹ Additionally, Sargan–Hansen’s J test (Sargan 1958; Hansen 1982), which tests the null hypothesis of joint validity of the moments conditions under overidentification,¹² can be greatly weakened by instrument proliferation (Andersen and Sørensen 1996; Bowsher 2002; Roodman 2009a, 2009b).

One solution to the problem of instrument overabundance is to collapse the instrument set over time, so that there is only one instrument for each variable and lag distance, giving (T − 2) instruments per endogenous variable. One can further limit the number of (collapsed) instruments by dropping instruments and reducing the lags¹³ used to provide instruments from 2 to less than the maximum of T − 1. However with numerous endogenous variables requiring instruments, the number can still mount up and compromise estimation.
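The difference between the two instrument counts can be made concrete by applying the formulas in the text to T = 11, the number of annual observations from 2011 to 2021. The function names are illustrative.

```python
def gmm_instrument_count(T: int) -> int:
    """Uncollapsed GMM-style instruments per endogenous variable: (T - 2)(T - 1) / 2."""
    return (T - 2) * (T - 1) // 2


def collapsed_instrument_count(T: int) -> int:
    """Collapsed instruments per endogenous variable: T - 2."""
    return T - 2


T = 11  # annual observations, 2011 to 2021
print("uncollapsed:", gmm_instrument_count(T))
print("collapsed:  ", collapsed_instrument_count(T))
```

The uncollapsed count grows quadratically in T while the collapsed count grows linearly, and both are multiplied by the number of endogenous variables instrumented in this way.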

To further restrict the number of instruments, we apply so-called IV instruments for some of the endogenous regressors. Typically IV instruments can be entered into the instrument matrix as a single column per instrument, rather than as multiple columns as occur with GMM instruments, and so are not a cause of instrument proliferation. The problem is finding suitable IV instruments. In one of its estimated models, the seminal paper by Ciccone and Hall (1996) utilizes the presence or absence of a railroad in 1860 as an instrument for a 1988 density index, which is an endogenous factor affecting productivity in 1988, on the basis that the early development of railroads underpinned the early development of agglomeration and thus correlates with contemporary density, but is not related to model residuals. Likewise, early patterns of agglomeration and distance from the Eastern seaboard of the USA are additional instruments which correlate with density but are orthogonal to model residuals. While these instruments may be appropriate for their cross-sectional analysis, finding successful time-varying instruments in a panel data context is more demanding. Generally, IV instruments that maximize correlation between regressor and instrument tend also to increase the correlation between the instrument and the errors, hence causing the J test to reject, while less correlated instruments are typically weak and irrelevant with respect to the endogenous regressors. As explained by Fingleton (2023), and as advocated for cross-sectional regression by Le Gallo and Páez (2013), one solution to this dilemma comes from the spatial filtering literature (typically associated with the work of Griffith 1988, 1996, 2000, 2003; Getis and Griffith 2002; Boots and Tiefelsdorf 2000; Patuelli et al. 2006, among others).

In the current study, a synthetic instrument is obtained from a symmetrical ‘contiguity’ matrix, the starting point for which is an n by n matrix of random numbers, r_c, in the range 0 to 1. For a matrix cell defined by row i and column j, if r_c(i, j) < 0.5 then w_c(i, j) = 1 for i = 1, …, n, j = 1, …, n. Otherwise w_c(i, j) = 0 and there are also zeros on the main diagonal, thus w_c(i, i) = 0. The matrix is symmetrized by calculating w_s = w_c + w_c′, and if w_s(i, j) > 0, then w_s(i, j) = 1, otherwise w_s(i, j) = 0. The outcome is a symmetrical matrix w_s of 1s and 0s. Given that the matrix simply reflects the completely hypothetical spatial connectivity of n regions and is not dependent on the dependent variable or equivalently the errors, the eigenvectors of w_s are exogenous, so are an appropriate basis for so-called synthetic instruments, since exogeneity is one of the ideal properties of an instrumental variable. The second property is close correlation between instrument and endogenous variable, and this is possible because with n exogenous orthogonal eigenvectors to consider, it is very likely that some of them will be correlated with the endogenous regressor. Accordingly, a synthetic instrument is the fitted values resulting from regressing the endogenous regressor, ln Q_t, given t = 1, on weighted linear combinations of the orthogonal eigenvectors. The estimated linear OLS regression coefficient multiplied by eigenvector i gives a weighted eigenvector. Repeating this for i = 1, …, n eigenvectors produces n weighted eigenvectors, and ln Q_t is then regressed on the sum of the weighted eigenvectors. The fitted values from similar regressions for t = 2, …, T are collected as the columns of the n by T matrix Z used as a synthetic instrument for ln Q. This is also the basis of the set of instruments for the endogenous MSA-specific regressors D_j ln Q given by the Hadamard product of top 8 MSA-specific one-zero variables D_j with Z, denoted D_jZ (j = 1, …, 8). 
The example given in Table 3 illustrates the data set-up, and Table 4 gives the correlations between the endogenous regressors and their synthetic instruments.

Table 4.

Correlation Between Synthetic Instruments and Regressors.

Variable	Synthetic Instrument	Correlation
$\ln Q_{t}$	$Z$	0.7857
$D_{1} \ln Q_{t}$	$D_{1} Z$	0.8230
$D_{2} \ln Q_{t}$	$D_{2} Z$	0.8011
$D_{3} \ln Q_{t}$	$D_{3} Z$	0.8102
$D_{4} \ln Q_{t}$	$D_{4} Z$	0.7941
$D_{5} \ln Q_{t}$	$D_{5} Z$	0.8055
$D_{6} \ln Q_{t}$	$D_{6} Z$	0.8265
$D_{7} \ln Q_{t}$	$D_{7} Z$	0.8325
$D_{8} \ln Q_{t}$	$D_{8} Z$	0.8155
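The construction of the synthetic instrument described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the code used for the paper: the data are random placeholders, the toy dimensions are far smaller than 377 by 11, and the eigenvector regressions are run without an intercept. Because the n eigenvectors of a symmetric matrix form an orthonormal basis, weighting all n of them in this way would reproduce ln Q_t exactly, so the sketch keeps only the m eigenvectors with the largest absolute coefficients. The cap m and this selection rule are hypothetical and are not taken from the text.

```python
import numpy as np


def random_contiguity(n, rng):
    """Symmetric 0/1 'contiguity' matrix w_s with a zero main diagonal."""
    r_c = rng.uniform(0.0, 1.0, size=(n, n))    # random numbers in the range 0 to 1
    w_c = (r_c < 0.5).astype(float)             # w_c(i, j) = 1 if r_c(i, j) < 0.5
    np.fill_diagonal(w_c, 0.0)                  # w_c(i, i) = 0
    return ((w_c + w_c.T) > 0).astype(float)    # symmetrize, then recode to 0/1


def synthetic_instrument(ln_q, w_s, m):
    """Return the n by T matrix Z of fitted values, one column per period."""
    n, T = ln_q.shape
    _, vecs = np.linalg.eigh(w_s)               # columns are orthonormal eigenvectors
    Z = np.empty((n, T))
    for t in range(T):
        y = ln_q[:, t]
        slopes = vecs.T @ y                     # no-intercept OLS slope on each eigenvector
        keep = np.argsort(-np.abs(slopes))[:m]  # ASSUMPTION: keep the m largest |slopes|
        s = vecs[:, keep] @ slopes[keep]        # sum of the weighted eigenvectors
        X = np.column_stack([np.ones(n), s])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        Z[:, t] = X @ coef                      # fitted values for period t
    return Z


rng = np.random.default_rng(0)
n, T, m = 40, 5, 10                             # toy sizes, not the paper's data
ln_q = rng.normal(size=(n, T))                  # placeholder for the ln Q data
w_s = random_contiguity(n, rng)
Z = synthetic_instrument(ln_q, w_s, m)

D_1 = np.zeros((n, 1))
D_1[0] = 1.0                                    # hypothetical: unit 0 is a top 8 MSA
D_1Z = D_1 * Z                                  # Hadamard product D_1 Z via broadcasting
corr = [np.corrcoef(ln_q[:, t], Z[:, t])[0, 1] for t in range(T)]
```

The list `corr` plays the role of the correlations reported in Table 4, although with placeholder data its values carry no empirical meaning.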

Calculating Direct, Indirect, and Total Effects

The raw elasticities do not fully represent the total effect of a 1 percent change in GDP, because they ignore spillovers across space. Three effects can be distinguished: a direct effect, where an increment in GDP in MSA i affects productivity in MSA i; an indirect effect, where a GDP increment in the other MSAs spills over to affect MSA i's productivity; and a total effect, equal to the sum of the direct and indirect effects. Note that the direct effect will not equal the raw elasticity exactly, because an increment in GDP in i causing a change in productivity in MSA i will cause other MSAs to also increase in productivity, because of the effect of the W ln P_t term, and these induced productivity changes in neighbours and neighbours of neighbours will spill back to affect productivity in MSA i. The magnitude of these spillover effects can be measured by the difference between the raw elasticities and the direct effects.

\left[\begin{array}{ccccc} \frac{\partial \ln P_{1 t}}{\partial \ln Q_{1 t}} & \cdots & \cdots & \cdots & \frac{\partial \ln P_{1 t}}{\partial \ln Q_{n t}} \\ \frac{\partial \ln P_{2 t}}{\partial \ln Q_{1 t}} & \frac{\partial \ln P_{2 t}}{\partial \ln Q_{2 t}} & \cdots & \cdots & \frac{\partial \ln P_{2 t}}{\partial \ln Q_{n t}} \\ \vdots & \vdots & \ddots & & \vdots \\ \frac{\partial \ln P_{n t}}{\partial \ln Q_{1 t}} & \cdots & \cdots & \cdots & \frac{\partial \ln P_{n t}}{\partial \ln Q_{n t}} \end{array}\right]_{t} = B_{N}^{-1} (β I + \tilde{β} O)

(15)

In order to calculate the elasticities that do take these spillovers into account, we calculate the matrix of derivatives (15), in which $B_{N} = (I - ρ W)$ and I is an n by n identity matrix, similar to the matrix described by LeSage and Pace (2009) and Elhorst (2014) among others. The assumption in this calculation is that there is a unit increase in ln Q in all MSAs at time t.

Each MSA will have its own elasticity. For example, for i = 1, we can see from the left-hand side of (15) that the first derivative is the elasticity ∂ ln P_1t/∂ ln Q_1t, which is the direct effect of a change in GDP in i = 1 on productivity in i = 1. The off-diagonal terms in the first row, ∂ ln P_1t/∂ ln Q_jt (j = 2, …, n), are the indirect effects. So the total effect of an increment in GDP in all n MSAs on productivity in MSA i = 1 is the sum of the first row of (15). Given the set of MSAs for which the parameter estimate $\hat{β}$ alone is applicable, the true derivatives are given by the estimate of $B_{N}^{- 1} (β I)$. However, for the top 8 MSAs, there are the additional terms $\tilde{β} = β_{1}, \dots, β_{8}$ to capture MSA-specific effects. In (15), the term βI places the estimate of β in each of the n main diagonal cells of an otherwise n by n matrix of zeros. The extra term $\tilde{β} O$ is equivalently an n by n matrix of zeros O with the estimated β₁, …, β₈ added to the main diagonal cells corresponding to the top 8 MSAs. The other main diagonal cells remain zero. Without $\tilde{β} O$, different MSA-specific derivatives depend on estimated $B_{N} = (I - ρ W)$ and β, so they vary according to their location, given by W, with ρ and β constant across all MSAs. Including $\tilde{β} O$ introduces elasticities that vary across MSAs for reasons other than W. In this example, given the thesis that the large MSAs (top 8) will have different elasticities compared to each other and to other MSAs, the estimated derivatives are given by W, $\hat{ρ}, \hat{β}$ and the different elements of estimated $\tilde{β}$.

The outcome of applying equation (15) is a set of n direct, indirect and total short-term elasticities, one for each MSA. Because each derivative is different, LeSage and Pace (2009) suggest summary measures of direct, indirect and total effects. Accordingly, the direct short-term effect of a unit increase at time t is equal to the mean of the leading diagonal of (15). The total effect is equal to the mean row sum of (15) and the indirect effect is equal to the total effect minus the direct effect. The MSA-specific effects are obtained from the specific rows of matrix (15), so for MSA i, the direct effect is ∂ ln P_it/∂ ln Q_it and the total effect is the sum of row i. The direct effects tend to be similar to the raw elasticities. The total effects are, however, very different, since these take full account of the instantaneous spillover effects at time t.
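As an illustration of how the direct, indirect and total short-term effects follow from (15), the sketch below evaluates the matrix of derivatives for a small artificial system. The weights matrix is an assumption made purely for the example (a row-normalised ring in which each unit has two neighbours), as is the choice to give unit 0 the Los Angeles shift. The values of ρ and β are the rounded estimates from Table 5, so the numbers produced are not those of Table 8.

```python
import numpy as np

n = 12
# ASSUMPTION: ring lattice, each unit linked to its two neighbours, rows sum to 1
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

rho, beta = 0.783, 0.069                 # rounded estimates from Table 5
beta_tilde = np.zeros(n)
beta_tilde[0] = 0.203                    # hypothetical: unit 0 takes the Los Angeles shift

B = np.eye(n) - rho * W                  # B_N = I - rho W
rhs = beta * np.eye(n) + np.diag(beta_tilde)   # beta I + beta_tilde O
derivs = np.linalg.solve(B, rhs)         # matrix of derivatives in (15)

direct_i = np.diag(derivs)               # MSA-specific direct effects
total_i = derivs.sum(axis=1)             # MSA-specific total effects (row sums)
indirect_i = total_i - direct_i

direct = direct_i.mean()                 # summary measures in the sense of
total = total_i.mean()                   # LeSage and Pace (2009)
indirect = total - direct
```

With a row-normalised W and a common β, every row sum equals β/(1 − ρ), which is a convenient check on the code. The MSA-specific terms are what break this equality.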

Note that the short-term derivatives in equation (15) pertain to a specific point in time t. If we are prepared to assume constancy of parameters and W over time, and given that the model generating the parameters passes the test of dynamic stability and stationarity, it is also possible to calculate the n by n matrix of long-term derivatives (16), where C_N = γI + θW, based on effects of a persistent increase in ln Q across all n MSAs as t goes to infinity. So the long-term derivatives are the ones that would hold given that the interactions through space and time have stabilized to a steady-state at which log productivity no longer changes.

\left[\begin{array}{ccccc} \frac{\partial \ln P_{1 t}}{\partial \ln Q_{1 t}} & \cdots & \cdots & \cdots & \frac{\partial \ln P_{1 t}}{\partial \ln Q_{n t}} \\ \vdots & \vdots & \ddots & & \vdots \\ \vdots & \vdots & & \ddots & \vdots \\ \frac{\partial \ln P_{n t}}{\partial \ln Q_{1 t}} & \cdots & \cdots & \cdots & \frac{\partial \ln P_{n t}}{\partial \ln Q_{n t}} \end{array}\right] = (- C_{N} + B_{N})^{-1} (β I + \tilde{β} O)

(16)

Again the total effect is equal to the mean row sum of (16), and the direct effect is the mean of the main diagonal. And again the elasticities for specific MSAs can be picked out from the matrix of derivatives (16).
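The form of the long-run multiplier in (16) can be motivated by a short steady-state argument. The lines below are a sketch that assumes the deterministic part of the model can be written as a first-order recursion in the vector of log productivities y_t, using only the definitions of B_N and C_N given above. The symbols y_t, y^{*} and A are introduced here for illustration.

```latex
% Assumed recursion for the n by 1 vector of log productivities
B_{N} y_{t} = C_{N} y_{t-1} + X β
% Steady state: y_{t} = y_{t-1} = y^{*}
(B_{N} - C_{N}) y^{*} = X β
\quad \Rightarrow \quad
y^{*} = (B_{N} - C_{N})^{-1} X β
% Equivalent series form, with A = B_{N}^{-1} C_{N}
(B_{N} - C_{N})^{-1} = (I - A)^{-1} B_{N}^{-1} = \sum_{k=0}^{\infty} A^{k} B_{N}^{-1}
```

The series converges only if all eigenvalues of A lie inside the unit circle, which is why the long-run derivatives presuppose dynamic stability. The first term of the series, B_N^{-1}, reproduces the short-term multiplier in (15).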

The standard errors for each elasticity are obtained by generating numerous (S = 5,000) sets of parameter combinations based on the estimated k + 4 by k + 4 parameter variance-covariance matrix V for the k + 4 parameters γ, ρ, θ, β, β₁, …, β₈ (here k = 8). As shown by equation (17), the estimated parameter variance-covariance matrix is decomposed via the Cholesky decomposition (chol) to an upper triangular matrix, as described by Elhorst (2014). The matrix product of the transpose of this matrix with a k + 4 by 1 vector Φ of samples drawn from an N(0, 1) distribution embodies the covariance, but with zero mean. Adding to this the k + 4 by 1 vector of estimates of [γ ρ θ β β₁, …, β₈]′ gives outcomes [γ_s ρ_s θ_s β_s β_1s, …, β_8s]′ for draw s which have the desired means and covariance.

[γ_{s} \; ρ_{s} \; θ_{s} \; β_{s} \; β_{1 s}, \dots, β_{8 s}]' = \mathrm{chol}(V)' Φ + [\hat{γ} \; \hat{ρ} \; \hat{θ} \; \hat{β} \; \hat{β}_{1}, \dots, \hat{β}_{8}]'

(17)

As noted by LeSage and Pace (2009), the empirical distribution of the parameters can also be obtained using a large number of simulated parameters drawn from a multivariate normal distribution. In this case

[γ_{s} \; ρ_{s} \; θ_{s} \; β_{s} \; β_{1 s}, \dots, β_{8 s}] \sim N (ξ, V)

(18)

ξ = [\hat{γ} \; \hat{ρ} \; \hat{θ} \; \hat{β} \; \hat{β}_{1}, \dots, \hat{β}_{8}]

(19)

Another approach described in LeSage and Pace (2009) obtains dispersion measures based on Bayesian Markov chain Monte Carlo estimation, but this is beyond the scope of this paper. Applying either (17) or (18) gives almost identical outcomes, but the estimates given in Table 8 are based on equation (17).
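The two ways of simulating parameter vectors, equation (17) and equations (18) and (19), can be compared with a short NumPy sketch. The point estimates and standard errors are the rounded values from Table 5. The full variance-covariance matrix V is not reported in the paper, so the sketch assumes zero covariances purely for illustration, and the results will therefore differ from those underlying Table 8. Note also that `numpy.linalg.cholesky` returns the lower triangular factor, which corresponds to the transpose of the upper triangular matrix referred to in the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Order: gamma, rho, theta, beta, beta_1, ..., beta_8 (rounded, from Table 5)
est = np.array([0.836, 0.783, -0.640, 0.069,
                0.046, -0.081, -0.019, -0.258, 0.203, -0.182, -0.416, -0.006])
se = np.array([0.1960, 0.2408, 0.5973, 0.0281,
               0.0308, 0.0761, 0.0174, 0.1066, 0.0251, 0.0156, 0.0867, 0.0348])
V = np.diag(se ** 2)     # ASSUMPTION: zero covariances; the paper uses the full V

S = 5000
L = np.linalg.cholesky(V)                   # lower triangular, L @ L.T equals V
phi = rng.standard_normal((len(est), S))    # N(0, 1) draws, one column per draw
draws_chol = (L @ phi).T + est              # equation (17), one row per draw s

draws_mvn = rng.multivariate_normal(est, V, size=S)   # equations (18) and (19)

# Both sets of draws should reproduce the target means and covariance
mean_gap = np.abs(draws_chol.mean(axis=0) - draws_mvn.mean(axis=0)).max()
cov_gap = np.abs(np.cov(draws_chol, rowvar=False) - V).max()
```

Both `mean_gap` and `cov_gap` should be small relative to the standard errors, which is the sense in which the two approaches give almost identical outcomes.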

Each parameter combination drawn (γ_s, ρ_s and θ_s; s = 1, …, S) is tested to see if it complies with the dynamic stability and stationarity conditions.

\left[\begin{array}{ccccc} \frac{\partial \ln P_{1 t s}}{\partial \ln Q_{1 t s}} & \cdots & \cdots & \cdots & \frac{\partial \ln P_{1 t s}}{\partial \ln Q_{n t s}} \\ \vdots & \vdots & \ddots & & \vdots \\ \vdots & \vdots & & \ddots & \vdots \\ \frac{\partial \ln P_{n t s}}{\partial \ln Q_{1 t s}} & \cdots & \cdots & \cdots & \frac{\partial \ln P_{n t s}}{\partial \ln Q_{n t s}} \end{array}\right]_{t} = B_{N s}^{-1} (β_{s} I + \tilde{β}_{s} O)

(20)

Those that do not are rejected, but the P = 1081 that pass the test are then turned into direct, indirect and total short-term effects using equation (20), in which $B_{N s} = (I - ρ_{s} W)$ and β_sI and ${\tilde{β}}_{s} O$ contain the simulated values of the β parameters. Direct, indirect and total effects for stationary draw s are calculated from the relevant cells of (20) as in the case of (15).

Following the approach for short-term derivatives, the distribution of the long-run effects is obtained from the set of P simulations that pass the stationarity test, so from each set of simulated parameters $C_{N s} = γ_{s} I + θ_{s} W, B_{N s} = (I - ρ_{s} W)$ and β_sI and ${\tilde{β}}_{s} O$ , one calculates (21) and therefore direct, indirect and total simulated effects. The set of P such calculations gives the distribution of the elasticities.

\left[\begin{array}{ccccc} \frac{\partial \ln P_{1 t}}{\partial \ln Q_{1 t}} & \cdots & \cdots & \cdots & \frac{\partial \ln P_{1 t}}{\partial \ln Q_{n t}} \\ \vdots & \vdots & \ddots & & \vdots \\ \vdots & \vdots & & \ddots & \vdots \\ \frac{\partial \ln P_{n t}}{\partial \ln Q_{1 t}} & \cdots & \cdots & \cdots & \frac{\partial \ln P_{n t}}{\partial \ln Q_{n t}} \end{array}\right] = (- C_{N s} + B_{N s})^{-1} (β_{s} I + \tilde{β}_{s} O)

(21)
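The simulation procedure built around (20) and (21) can be sketched as follows: draw parameters, discard draws that fail the stability test, compute the effects for the remaining draws, and use their dispersion as the standard error. The sketch is deliberately reduced and rests on several assumptions that are not in the paper: a small row-normalised ring for W, zero covariances in V, no MSA-specific terms, and only the total effects. It is meant to show the mechanics rather than to reproduce Table 8.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10
W = np.zeros((n, n))                       # ASSUMPTION: row-normalised ring lattice
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
I = np.eye(n)

est = np.array([0.836, 0.783, -0.640, 0.069])   # gamma, rho, theta, beta (Table 5)
se = np.array([0.1960, 0.2408, 0.5973, 0.0281])
V = np.diag(se ** 2)                       # ASSUMPTION: zero covariances


def is_stable(gamma, rho, theta):
    """True if all eigenvalues of A = (I - rho W)^-1 (gamma I + theta W) are below 1."""
    A = np.linalg.solve(I - rho * W, gamma * I + theta * W)
    return np.max(np.abs(np.linalg.eigvals(A))) < 1.0


def total_effects(gamma, rho, theta, beta):
    """Mean row sums of the short-run (20) and long-run (21) derivative matrices."""
    B, C = I - rho * W, gamma * I + theta * W
    short_run = np.linalg.solve(B, beta * I).sum(axis=1).mean()
    long_run = np.linalg.solve(B - C, beta * I).sum(axis=1).mean()
    return short_run, long_run


S = 2000
draws = rng.multivariate_normal(est, V, size=S)
kept = np.array([d for d in draws if is_stable(d[0], d[1], d[2])])  # the P stable draws
effects = np.array([total_effects(*d) for d in kept])               # P by 2

point = np.array(total_effects(*est))      # effects at the point estimates
std_err = effects.std(axis=0, ddof=1)      # dispersion across stable draws
z_ratio = point / std_err
```

Here `len(kept)` plays the role of P. Extending the sketch to direct and indirect effects, or to the top 8 terms, only requires replacing `beta * I` with the full diagonal matrix and reading off the diagonal as well as the row sums.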

Results

Table 5 summarizes the results of estimating equation (12) over the period 2011 to 2021 via difference-GMM using only 19 instruments: 10 collapsed GMM instruments based on ln P_t and W ln P_t with lags restricted to 2 to 6, plus the 9 IV (synthetic) instruments. This relatively small number limits the standard error bias associated with instrument proliferation, and the estimated model satisfies the critical diagnostic tests.

Table 5.

Estimates of Dynamic Spatial Panel Model With Top 8 MSAs.

Variable	Parameter	Est.	s.e.	z-ratio	Raw Elasticities
$\ln p_{t - 1}$	$γ$	0.836	0.1960	4.26
$W \ln p_{t}$	$ρ$	0.783	0.2408	3.25
$\ln Q_{t}$	$β_{1}$	0.069	0.0281	2.43
Boston	$β_{2}$	0.046	0.0308	1.51	$β_{2}$ + $β_{1}$	0.115
Chicago	$β_{3}$	−0.081	0.0761	−1.06	$β_{3}$ + $β_{1}$	−0.012
Dallas	$β_{4}$	−0.019	0.0174	−1.07	$β_{4}$ + $β_{1}$	0.050
Houston	$β_{5}$	−0.258	0.1066	−2.42	$β_{5}$ + $β_{1}$	−0.190
Los Angeles	$β_{6}$	0.203	0.0251	8.08	$β_{6}$ + $β_{1}$	0.272
New York	$β_{7}$	−0.182	0.0156	−11.65	$β_{7}$ + $β_{1}$	−0.113
Philadelphia	$β_{8}$	−0.416	0.0867	−4.8	$β_{8}$ + $β_{1}$	−0.347
Washington	$β_{9}$	−0.006	0.0348	−0.17	$β_{9}$ + $β_{1}$	0.063
$W \ln p_{t - 1}$	$θ$	−0.640	0.5973	−1.07
Diagnostics			Reference
AR(1)		−3.38	N(0,1)	p = 0.001
AR(2)		−1.00	N(0,1)	p = 0.317
Hansen J		8.11	$χ_{7}^{2}$	p = 0.323
Instruments	Number of instruments = 19. GMM: collapsed, lags 2 to 6 of $\ln p_{t}$, $W \ln p_{t}$. IV: Z, D1Z, D2Z, D3Z, D4Z, D5Z, D6Z, D7Z, D8Z

Test of dynamic stability and stationarity: the maximum absolute real eigenvalue of $A = {(I - ρ W)}^{- 1} (γ I + θ W)$ is 0.90089, so the model does not violate the assumption of dynamic stability and stationarity.

The first critical diagnostic test is of the viability of lagging variable ln P_t and W ln P_t by two periods to form GMM instruments, which depends on zero second-order serial correlation in the first differenced residuals (see Arellano and Bond 1991; Bond 2002). An informal indication that this is a reasonable assumption is given by Figure 6 which is based on estimated residuals.

Figure 6.

Evidence of lack of second order residual serial correlation.

Figure 6 points to an absence of second-order serial correlation, which is the key requirement. More formally, we reject the null hypothesis of no first-order serial correlation¹⁴ in first differences (AR(1) test) but critically do not reject the null hypothesis of no second-order serial correlation in first differences (AR(2) test), as shown by Table 5, in which the z-ratio for AR(2) is not extreme with respect to the N(0,1) distribution. So on this basis one can assume that the instruments are independent of the error term, which is a necessary requirement for valid instrumental variables. The second crucial diagnostic is the Sargan-Hansen J test for overidentifying restrictions, as described in the Appendix. Typically the J test suffers from low power when there are too many instruments, but this problem has been averted by limiting the use of GMM instruments, collapsing and restricting lags, and applying IV instruments. Table 5 shows that the J test statistic is not significant when referred to the $χ_{7}^{2}$ distribution, so the overidentifying restrictions are not rejected, which supports the consistency of the estimates. The third diagnostic test relates to the dynamic stability and stationarity of the model. This boils down to a consideration of the estimates of γ, ρ and θ, as explained by Elhorst (2001, 2014), Parent and LeSage (2011, 2012), Debarsy, Ertur, and LeSage (2012) and Lee and Yu (2010). If the maximum absolute real eigenvalue of A = (I − ρW)⁻¹(γI + θW) is less than 1.0, it lies within the unit circle (Elhorst 2001; Parent and LeSage 2011) defining the stationary region for the model. The value of 0.90089 given in Table 5 indicates that the estimated ln P_t converge to stable and stationary paths as t becomes large.
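The stability condition can be checked numerically in a few lines. The sketch below uses the rounded Table 5 estimates together with an assumed weights matrix (a symmetric, row-normalised ring, which is not the W used in the paper). For a symmetric W with eigenvalues ω, the eigenvalues of A have the closed form (γ + θω)/(1 − ρω), and the code confirms that this agrees with a direct eigenvalue computation.

```python
import numpy as np

n = 16
W = np.zeros((n, n))               # ASSUMPTION: symmetric, row-normalised ring lattice
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

gamma, rho, theta = 0.836, 0.783, -0.640     # rounded estimates from Table 5

# Direct computation of A = (I - rho W)^-1 (gamma I + theta W)
A = np.linalg.solve(np.eye(n) - rho * W, gamma * np.eye(n) + theta * W)
max_abs_eig = np.max(np.abs(np.linalg.eigvals(A)))

# Closed form based on the real eigenvalues of the symmetric W
omega = np.linalg.eigvalsh(W)
closed_form = (gamma + theta * omega) / (1.0 - rho * omega)

stable = max_abs_eig < 1.0         # inside the unit circle
```

With these rounded inputs the largest eigenvalue corresponds to ω = 1 and equals (γ + θ)/(1 − ρ) = 0.196/0.217, or about 0.903. This is close to the 0.90089 reported in Table 5, and the small gap may reflect rounding of the published estimates and the assumed W.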

The estimates shown in Table 5 indicate that there are highly significant temporal and spatial lags, with both $\hat{γ}$ and $\hat{ρ}$ significantly greater than 0, although $\hat{θ}$ is not significant. The level of productivity is spatially correlated and evidently there is a memory effect, with productivity levels correlated over time. Also, the elasticity of productivity with respect to GDP given by $\hat{β} = 0.069$ is significant, with z = 2.43, which has an upper-tail p-value of 0.01 when referred to the N(0, 1) distribution. The estimated (raw) elasticity indicates that doubling GDP causes productivity to increase by about 7 percent on a linear approximation (about 5 percent if the elasticity is compounded, since 2^0.069 ≈ 1.049). This estimate is similar to that of Ciccone and Hall (1996) relating county employment density to state-level productivity. Using different methods, they find that doubling employment density increases average labour productivity by about 6 percent.

We are also particularly interested in estimating MSA-specific elasticities. In particular, we wish to identify differences in elasticity among the top 8 MSAs and between them and the smaller MSAs. The effect of a 1 percent change in GDP on top 8 productivity has two elements. First, it depends, as for all other MSAs, on $\hat{β}$. Secondly, it depends on ${\hat{β}}_{1}, \dots, {\hat{β}}_{8}$ given D₁, …, D₈ = 1. For example, for Boston, the effect of output is given by $\hat{β} \ln Q$ + ${\hat{β}}_{1} D_{1} \ln Q = (\hat{β} + {\hat{β}}_{1}) \ln Q$, so the elasticity is $(\hat{β} + {\hat{β}}_{1})$. Likewise the other top 8 MSA elasticities are given by $\hat{β} + {\hat{β}}_{2} D_{2}, \dots, \hat{β} + {\hat{β}}_{8} D_{8}$. In the case of all the other smaller MSAs, each elasticity equals $\hat{β}$.

For comparison, Table 6 gives the estimates using GMM-style instruments, as described by Holtz-Eakin, Newey, and Rosen (1988) and Arellano and Bond (1991), without collapsing and with lags extending from 2 to 10 rather than restricted to 2 to 6. Additionally, the Table 6 estimates are not based on synthetic instruments but on the lagged values of D₁ln Q, …, D₈ln Q. In this case there are 3(T − 1)(T − 2)/2 + 8(T − 2) = 207 instruments.¹⁵ Table 6 reaffirms the significance of the temporal and spatial lags, and of ln Q, though there are some differences in estimated parameters compared with the estimates in Table 5. However, the large number of instruments casts doubt on the insignificance of the Hansen J test of overidentifying restrictions, and therefore on the consistency of the parameter estimates. Table 7 contains estimates given by a generalized least squares random effects estimator, which takes no account of endogeneity. While these estimates also suggest that the temporal and spatial lags and ln Q are significant, the coefficient estimates mean that the maximum absolute real eigenvalue of A = (I − ρW)⁻¹(γI + θW) is equal to 1.1331, indicating non-stationarity. Thus it becomes impossible to calculate, for example, total long-run elasticities, as the levels of ln P will never converge to a steady state.
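The instrument count quoted above can be checked by hand. The worked lines below take T = 11 annual observations, which is an inference from the 2011 to 2021 estimation period rather than a value stated in this passage, and they count 12 estimated parameters (γ, ρ, θ, β and the eight MSA-specific coefficients).

```latex
% Instrument count for the uncollapsed specification, assuming T = 11
3 \cdot \frac{(T - 1)(T - 2)}{2} + 8 (T - 2)
  = 3 \cdot \frac{10 \cdot 9}{2} + 8 \cdot 9
  = 135 + 72
  = 207
% Degrees of freedom of the Hansen J test = instruments - parameters
207 - 12 = 195, \qquad 19 - 12 = 7
```

The two differences agree with the reference distributions reported for the Hansen J test, $χ_{195}^{2}$ in Table 6 and $χ_{7}^{2}$ in Table 5.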

Table 6.

Estimates Based on GMM Instruments.

Variable	Parameter	Est.	s.e.	z-ratio
$\ln p_{t - 1}$	$γ$	0.589	0.0832	7.08
$W \ln p_{t}$	$ρ$	0.758	0.0873	8.68
$\ln Q_{t}$	$β_{1}$	0.093	0.0182	5.11
Boston	$β_{2}$	0.581	1.2198	0.48
Chicago	$β_{3}$	−0.026	0.5654	−0.05
Dallas	$β_{4}$	0.036	0.3824	0.1
Houston	$β_{5}$	−0.654	1.2603	−0.52
Los Angeles	$β_{6}$	0.207	0.3463	0.6
New York	$β_{7}$	1.209	2.0094	0.6
Philadelphia	$β_{8}$	−0.469	1.3078	−0.36
Washington	$β_{9}$	−3.292	2.2655	−1.45
$W \ln p_{t - 1}$	$θ$	−0.383	0.1599	−2.4
Diagnostics			Reference
AR(1)		−4.80	N(0,1)	p < 0.001
AR(2)		−0.58	N(0,1)	p = 0.565
Hansen J		216.57	$χ_{195}^{2}$	p = 0.138
Instruments	Number of instruments = 207. GMM: $\ln p_{t}$, $W \ln p_{t}$, $\ln Q_{t}$, $D_{1} \ln Q_{t}$, $D_{2} \ln Q_{t}$, $D_{3} \ln Q_{t}$, $D_{4} \ln Q_{t}$, $D_{5} \ln Q_{t}$, $D_{6} \ln Q_{t}$, $D_{7} \ln Q_{t}$, $D_{8} \ln Q_{t}$

Test of dynamic stability and stationarity: the maximum absolute real eigenvalue of $A = {(I - ρ W)}^{- 1} (γ I + θ W)$ is 0.85124, so the model does not violate the assumption of dynamic stability and stationarity.

Table 7.

Estimates Based on Random Effects GLS.

Variable	Parameter	Est.	s.e.	z-ratio
$\ln p_{t - 1}$	$γ$	0.9786	0.0091	107.73
$W \ln p_{t}$	$ρ$	1.1697	0.0546	21.43
$\ln Q_{t}$	$β_{1}$	0.0033	0.0007	5.03
Boston	$β_{2}$	0.0001	0.0002	0.61
Chicago	$β_{3}$	−0.0004	0.0001	−3.77
Dallas	$β_{4}$	−0.0003	0.0001	−3.06
Houston	$β_{5}$	−0.0007	0.0002	−4.37
Los Angeles	$β_{6}$	−0.0003	0.0001	−2.1
New York	$β_{7}$	−0.0002	0.0002	−1.25
Philadelphia	$β_{8}$	−0.0005	0.0001	−3.9
Washington	$β_{9}$	−0.0003	0.0001	−2.04
$W \ln p_{t - 1}$	$θ$	−1.1709	0.0546	−21.43
Constant	$c$	0.0441	0.0334	1.32

Test of dynamic stability and stationarity: the maximum absolute real eigenvalue of $A = {(I - ρ W)}^{- 1} (γ I + θ W)$ is 1.1331, so the model violates the assumption of dynamic stability and stationarity.

The elasticities in Table 5 are outcomes assuming spillover effects are nullified. However, taking into account spillovers gives the short-term and long-term estimates given in Table 8. Table 8 also gives the standard errors and z-ratios for each of the parameter estimates, using the distribution of the P direct, indirect and total elasticities. In the case of z-ratios greater than 1.96 in absolute value, the 95 percent confidence interval does not include zero, so the parameter estimate is significantly different from zero.

For the smaller MSAs, the estimated short-term total effect of a 1 percent increase in GDP is a 0.27 percent increase in productivity. The outcomes for the top 8 MSAs are varied, with total short-run elasticities ranging from 0.36 for Los Angeles to −0.05 for Philadelphia.

The long-term elasticities in Table 8 are more speculative because they rely on the determinants of the long-term elasticities remaining constant through time, and on parameter estimates made through a period of economic turbulence. Iteration provides both a reaffirmation of the stationarity of the estimated model, which makes long-run elasticity estimation feasible, and a good visual representation of the total long-term elasticities. Consider the n by 1 vector ${\hat{y}}_{j}$ of simulated log productivities, where n is the number of MSAs. This vector is calculated at every iteration j = 1, …, S, where in the empirics described here S = 60.

{\hat{y}}_{j} = B_{N}^{- 1} (C_{N} {\hat{y}}_{j - 1} + X β)

(22)

X (1 : n, 1 : 9) = [\ln Q_{T} D_{1} \ln Q_{T} \dots D_{8} \ln Q_{T}]

(23)

In equation (23) the n by 1 vector ln Q_T is the Tth observation of ln Q_t, t = 1, …, T, and D_k, k = 1, …, 8, is an n by 1 vector of zeros except for a 1 in the row specific to top 8 MSA k. The vector β is the 9 by 1 vector of estimated coefficients on ln Q and the eight MSA-specific terms given in Table 5. Given stationarity, ${\hat{y}}_{j}$ converges to steady state levels as j goes to S.

Repeating the iterations but increasing each variable in X by Δ = 1 gives

{\hat{y}}_{j}^{Δ} = B_{N}^{- 1} (C_{N} {\hat{y}}_{j - 1}^{Δ} + X^{Δ} β)

(24)

in which

X^{Δ} (1 : n, 1 : 9) = [\ln Q_{T} + Δ D_{1} (\ln Q_{T} + Δ) \dots D_{8} (\ln Q_{T} + Δ)]

(25)

Using equations (24) and (25), ${\hat{y}}_{j}^{Δ}$ converges to n (different) steady state levels as j goes to S. It is then the case that ${\hat{y}}_{j}^{Δ} - {\hat{y}}_{j}$ goes to a steady state as j goes to S, as shown by Figure 7, and these steady states equate to the total long-run elasticities given in Table 8. Figure 7 illustrates the diversity of the total long-run elasticities, ranging from Washington, where a permanent increase in GDP of 1 percent causes productivity to increase by 3.74 percent in the long run, to Houston, where productivity increases by 0.37 percent. However, following LeSage (2014) in particular, caution should be exercised when interpreting individual-level elasticities, which may not be very precise. Instead we focus on the z-ratios and test the null hypothesis that there is zero long-run total elasticity. This null hypothesis is not rejected in the case of Houston, New York and Philadelphia. In contrast, for Boston, Chicago, Dallas, Los Angeles and Washington, the null hypothesis is rejected and it is apparent that increased GDP induces an increase in productivity.
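The iteration in equations (22) to (25) can be sketched in NumPy to show why the difference between the two simulated paths settles at the total long-run elasticities. The parameter values are the rounded Table 5 estimates. Everything else is an assumption made for illustration: the weights matrix is a small row-normalised ring, the ln Q_T values are random placeholders, units 0 to 7 stand in for the top 8 MSAs, and the starting values are zeros.

```python
import numpy as np

n, S = 12, 60
W = np.zeros((n, n))                      # ASSUMPTION: row-normalised ring lattice
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
I = np.eye(n)

gamma, rho, theta = 0.836, 0.783, -0.640            # Table 5, rounded
beta = np.array([0.069, 0.046, -0.081, -0.019, -0.258,
                 0.203, -0.182, -0.416, -0.006])    # ln Q coefficient, then top 8 terms
B, C = I - rho * W, gamma * I + theta * W

rng = np.random.default_rng(3)
ln_q = rng.normal(10.0, 1.0, size=n)      # placeholder for ln Q_T
D = np.zeros((n, 8))
D[np.arange(8), np.arange(8)] = 1.0       # hypothetical: units 0..7 are the top 8


def design(x):
    """n by 9 matrix [x, D_1 x, ..., D_8 x], as in (23) and (25)."""
    return np.column_stack([x, D * x[:, None]])


def iterate(X, steps):
    """Apply the recursion in (22) for the given number of steps."""
    y = np.zeros(n)                       # arbitrary starting values
    for _ in range(steps):
        y = np.linalg.solve(B, C @ y + X @ beta)
    return y


delta = 1.0
diff = iterate(design(ln_q + delta), S) - iterate(design(ln_q), S)

# Steady-state counterpart: row sums of the long-run derivative matrix (16)
elasticity = design(np.ones(n)) @ beta    # beta plus the MSA-specific shift, per unit
long_run_total = np.linalg.solve(B - C, np.diag(elasticity)).sum(axis=1)
```

After S = 60 iterations `diff` is close to `long_run_total`. The remaining gap shrinks at a rate governed by the largest eigenvalue of A, which is why the convergence is slow when that eigenvalue is near 1.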

Table 8.

Short and Long-term Elasticities.

Short-term
	Direct			Indirect			Total
	Est.	s.e.	z-ratio	Est.	s.e.	z-ratio	Est.	s.e.	z-ratio
$\ln Q_{t}$	0.0673	0.0206	3.26	0.2029	0.1131	1.79	0.2702	0.1095	2.47
Boston	0.1159	0.0325	3.56	0.182	0.1032	1.76	0.2979	0.1001	2.97
Chicago	−0.0123	0.0398	−0.31	0.2825	0.16	1.77	0.2702	0.1699	1.59
Dallas	0.0502	0.0115	4.39	0.152	0.0808	1.88	0.2022	0.0783	2.58
Houston	−0.1904	0.0538	−3.54	0.1411	0.0748	1.88	−0.0493	0.1075	−0.46
LA	0.2721	0.0187	14.58	0.088	0.0428	2.06	0.3601	0.0465	7.74
NY	−0.1147	0.0271	−4.23	0.2482	0.1445	1.72	0.1335	0.1315	1.02
Phil.	−0.3543	0.0541	−6.55	0.3005	0.1751	1.72	−0.0538	0.1856	−0.29
Wash.	0.0637	0.0217	2.94	0.2939	0.1719	1.71	0.3576	0.1651	2.17

Long-term
	Direct			Indirect			Total
	Est.	s.e.	z-ratio	Est.	s.e.	z-ratio	Est.	s.e.	z-ratio
$\ln Q_{t}$	0.4126	0.1971	2.09	2.2558	0.8493	2.66	2.6683	0.4626	5.77
Boston	0.7088	0.2576	2.75	2.0436	0.7522	2.72	2.7524	0.563	4.89
Chicago	−0.076	0.2436	−0.31	3.1655	1.1467	2.76	3.0895	0.7804	3.96
Dallas	0.3065	0.1345	2.28	1.6528	0.6628	2.49	1.9593	0.4748	4.13
Houston	−1.1613	0.3577	−3.25	1.5327	0.6171	2.48	0.3714	1.0164	0.37
LA	1.6567	0.8275	2	0.9188	0.3979	2.31	2.5755	0.9823	2.62
NY	−0.7054	0.4254	−1.66	2.8196	0.9948	2.83	2.1142	1.5014	1.41
Phil.	−2.1905	0.9408	−2.33	3.4167	1.1962	2.86	1.2262	2.2826	0.54
Wash.	0.3936	0.287	1.37	3.3453	1.1779	2.84	3.7389	0.7662	4.88

Figure 7.

Equilibrium long-run total elasticities.

However, it should be emphasized that these results are conditional on parameters estimated over the turbulent period 2011-2021. The negative effects for Houston, for example, undoubtedly reflect the strong negative shocks that Houston experienced through this period: the ongoing consequences of the 2008 financial crisis, which had a significant impact on the city’s energy sector; the 2014 oil price crash, which also affected the energy industry; and Hurricane Harvey in 2017, which caused significant damage to the city and disrupted many businesses. New York and Philadelphia may have been prone to sector-specific impacts following the financial crisis, and more recently to a move to home-based working leading to low office occupancy rates, lost tax revenue and reduced spending on goods and services in central city areas.

Conclusions

This paper shows the significant positive causal effect of GDP growth on productivity growth across the MSAs of the USA, in the context of very limited data at the MSA level, endogeneity and temporal and spatial spillover effects. The paper applies a model driven by theory relating output to productivity, but highlights the diverse nature of this relationship, showing that some of the large MSAs are characterized by output having a significant effect on productivity, but for others the relationship is evidently non-existent. The conclusion is that the broad theoretical link between output and productivity, which is a sine qua non of much of contemporary urban economics and economic geography, is less than law-like when examined at a disaggregated level. The paper shows that Boston, Chicago, Dallas, Los Angeles and Washington stand out as having significant positive total long-run elasticities. This implies that as these agglomerations grow, productivity will also advance, and therefore a range of associated beneficial social and economic attributes, particularly higher wages, should follow. However, the analysis also finds no significant long-run elasticity involving output and productivity for Houston, New York and Philadelphia. This could occur if in the long run output and labor increased at the same rate, leaving productivity constant. Alternatively, while output might increase faster than labor, static or even falling productivity could be due to the inefficiencies of production in a spatially confined urban economy negating the positive externalities one normally associates with large cities, what one might call congestion effects in the broadest sense. One can speculate that this may be the fate of many very large cities, with many producers competing for limited services, property and land, and basically getting in each other's way. In the long run, static or falling productivity may dissuade investment. 
On the other hand the prospect of rising productivity could attract additional investment in physical, social and economic capital. While large cities may not disappear instantaneously, their growth may be stymied because output growth may not induce productivity growth feeding back to further output growth, with static productivity resulting in comparatively low wages and less in-migration. So what we could be observing, if maintained in the long run, is a process of adjustment in city status and an upper limit on city size beyond which negative externalities become dominant. From a policy perspective, this type of outcome could be avoided by encouraging physical and human capital growth and by attempting to minimize negative externalities of congestion by adopting advanced communications technologies and by planning solutions that decentralize congested urban places.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Bernard Fingleton

Notes

Appendix

References

Abdel-Rahman

Fujita

1990. “Product Variety, Marshallian Externalities, and City Sizes.” Journal of Regional Science 30 (2): 165-183.

Andersen

T. G.

Sørensen

B. E.

1996. “GMM estimation of a Stochastic Volatility Model: a Monte Carlo study.” Journal of Business and Economic Statistics 14: 328-352.

Angeriz

McCombie

Roberts

2008. “New Estimates of Returns to Scale and Spatial Spillovers for EU Regional Manufacturing, 1986—2002.” International Regional Science Review 31 (1): 62-87.

Arellano

Bond

1991. “Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations.” Review of Economic Studies 58: 277-297.

Baltagi

B. H.

2021. Econometric Analysis of Panel Data. 6th ed. New York: Springer.

Baltagi

Fingleton

Pirotte

2019. “A Time-Space Dynamic Panel Data Model with Spatial Moving Average Errors.” Regional Science and Urban Economics 76: 13-31.

Barro

R. J.

Lee

J. W.

2013. “A New Data Set of Educational Attainment in the World, 1950–2010.” Journal of Development Economics 104: 184-198.

Becker

G. S.

1993. Human Capital: A Theoretical and Empirical Analysis, with Special Reference to Education. Chicago: University of Chicago Press.

Bernat

G. A.

1996. “Does Manufacturing Matter? A Spatial Econometric View of Kaldor’s Laws.” Journal of Regional Science 36: 463-477.

10.

Bond

S. R.

2002. “Dynamic Panel Data Models: a Guide to Micro Data Methods and Practice.” Portuguese Economic Journal 1: 141-162.

11.

Boots

Tiefelsdorf

2000. “Global and Local Spatial Autocorrelation in Bounded Regular Tessellations.” Journal of Geographical Systems 2: 319-348.

12.

Bowsher

C. G.

2002. “On Testing Overidentifying Restrictions in Dynamic Panel Data Models.” Economics Letters 77: 211-220.

13.

Burger

M. J.

Meijers

E. J.

2010. “Spatial Structure and Productivity in US Metropolitan Areas.” Environment and Planning A: Economy and Space 42: 1383-1402.

14.

Caliendo

Parro

Rossi-Hansberg

Sarte

P.-D.

2018. “The Impact of Regional and Sectoral Productivity Changes on the U.S. Economy.” The Review of Economic Studies 85 (4): 2042-2096.

15.

Ciccone

Hall

R. E.

1996. “Productivity and the Density of Economic Activity.” American Economic Review 86 (1): 54-70.

16.

Ciccone

Peri

2006. “Identifying Human-Capital Externalities: Theory with Applications.” The Review of Economic Studies 73 (2): 381-412.

17.

Dall’erba

Percoco

Piras

2009. “Service Industry and Cumulative Growth in the Regions of Europe.” Entrepreneurship and Regional Development 21: 333-349.

18.

Doran

Fingleton

2018. “US Metropolitan Area Resilience: Insights from Dynamic Spatial Panel Estimation.” Environment and Planning A: Economy and Space 50 (1): 111-132.

19.

Elhorst

J. P.

2001. “Dynamic Models in Space and Time.” Geographical Analysis 33 (2): 119-140.

20.

Elhorst

J. P.

2014. Spatial Econometrics: From Cross Sectional Data to Spatial Panels. Heidelberg: Springer.

21.

Debarsy

Ertur

LeSage

J. P.

2012. “Interpreting Dynamic Space–Time Panel Data Models.” Statistical Methodology 9: 158-171.

22.

Feenstra

R. C.

Inklaar

Timmer

M. P.

2015. “The Next Generation of the Penn World Table.” American Economic Review 105 (10): 3150-3182.

23.

Fingleton

2003. “Externalities, Economic Geography and Spatial Econometrics : Conceptual and Modeling Developments.” International Regional Science Review 26 (2): 197-207.

24.

Fingleton

2009. “Prediction Using Panel Data Regression with Spatial Random Effects.” International Regional Science Review 32 (2): 195-220.

25.

Fingleton

2023. “Estimating Dynamic Spatial Panel Data Models with Endogenous Regressors Using Synthetic Instruments.” Journal of Geographical Systems 25: 121-152. doi:10.1007/s10109-022-00397-3.

26.

Fingleton

McCombie

1998. “Increasing Returns and Economic Growth : Some Evidence for Manufacturing from the European Union Regions.” Oxford Economic Papers 50: 89-105.

27.

Fingleton

Gardiner

Martin

Barbieri

2023. “The Impact of Brexit on Regional Productivity in the UK.” ZFW – Advances in Economic Geography 67 (2:3): 142-160. doi:10.1515/zfw-2021-0061.

28.

Fujita

Thisse

J. F.

2002. Economics of Agglomeration: Cities, Industrial Location, and Regional Growth. Cambridge: Cambridge University Press.

29.

Fujita

Krugman

Venables

A. J.

1999. The Spatial Economy: Cities, Regions and International Trade. Cambridge: MIT Press.

30.

Gerking

1993. “Measuring Productivity Growth in U.S. Regions: A Survey.” International Regional Science Review 16 (1:2): 155-185.

31.

Getis

Griffith

D. A.

2002. “Comparative Spatial Filtering in Regression Analysis.” Geographical Analysis 34 (2): 130-140.

32.

Griffith

D. A.

1988. Advanced Spatial Statistics. Dordrecht: Kluwer Academic.

33.

Griffith

D. A.

1996. “Spatial Autocorrelation and Eigenfunctions of the Geographic Weights Matrix Accompanying Geo-Referenced Data.” Canadian Geographer 40: 351-367.

34. Griffith, D. A. 2000. “A Linear Regression Solution to the Spatial Autocorrelation Problem.” Journal of Geographical Systems 2: 141-156.

35. Griffith, D. A. 2003. Spatial Autocorrelation and Spatial Filtering: Gaining Understanding through Theory and Scientific Visualization. New York: Springer.

36. Guo; Dall’erba; Le Gallo. 2013. “The Leading Role of Manufacturing in China’s Regional Economic Growth: A Spatial Econometric Approach of Kaldor’s Laws.” International Regional Science Review 36 (2): 139-166.

37. Hansen. 1982. “Large Sample Properties of Generalized Method of Moments Estimators.” Econometrica 50: 1029-1054.

38. Holtz-Eakin; Newey; Rosen, H. S. 1988. “Estimating Vector Autoregressions with Panel Data.” Econometrica 56: 1371-1395.

39. Jacobs. 1969. The Economy of Cities. New York: Random House.

40. Kaldor. 1957. “A Model of Economic Growth.” The Economic Journal 67: 591-624.

41. Lee, L. F. 2010. “Some Recent Developments in Spatial Panel Data Models.” Regional Science and Urban Economics 40: 255-271.

42. Le Gallo; Páez. 2013. “Using Synthetic Variables in Instrumental Variable Estimation of Spatial Series Models.” Environment and Planning A 45: 2227-2242.

43. LeSage, J. P.; Pace, R. K. 2009. Introduction to Spatial Econometrics. London: CRC Press.

44. LeSage, J. P. 2014. “What Regional Scientists Need to Know about Spatial Econometrics.” The Review of Regional Studies 44 (1): 13-32.

45. Mankiw, N. G.; Romer; Weil, D. N. 1992. “A Contribution to the Empirics of Economic Growth.” The Quarterly Journal of Economics 107 (2): 407-437.

46. McCombie. 1982. “Economic Growth, Kaldor’s Law and the Static-Dynamic Verdoorn Law Paradox.” Applied Economics 14: 279-294.

47. McCombie, J. S. L.; Thirlwall, A. P. 1994. Economic Growth and the Balance-of-Payments Constraint. London: Palgrave Macmillan.

48. McCombie, J. S. L.; Roberts. 2007. “Returns to Scale and Regional Growth: The Static-Dynamic Verdoorn Law Paradox Revisited.” Journal of Regional Science 47: 179-208.

49. McCombie, J. S.; Spreafico, M. R. 2018. “Productivity Growth of the Cities of Jiangsu Province, China: A Kaldorian Approach.” International Review of Applied Economics 32: 1-22.

50. Melo, P. C.; Graham, D. J.; Levinson; Aarabi. 2017. “Agglomeration, Accessibility and Productivity: Evidence for Large Metropolitan Areas in the US.” Urban Studies 54 (1): 179-195.

51. Moomaw, R. L. 1983. “Spatial Productivity Variations in Manufacturing: A Critical Survey of Cross-Sectional Analyses.” International Regional Science Review 8 (1): 1-22.

52. Parent; LeSage, J. P. 2011. “A Space–Time Filter for Panel Data Models Containing Random Effects.” Computational Statistics and Data Analysis 55 (1): 475-490.

53. Parent; LeSage, J. P. 2012. “Spatial Dynamic Panel Data Models with Random Effects.” Regional Science and Urban Economics 42: 727-738.

54. Parilla; Muro. 2017. “Understanding US Productivity Trends from the Bottom-Up.” https://www.brookings.edu/articles/.

55. Patuelli; Griffith, D. A.; Tiefelsdorf; Nijkamp. 2006. “The Use of Spatial Filtering Techniques.” Tinbergen Institute Discussion Paper 2006-049/3.

56. Pesaran, M. H. 2015. Time Series and Panel Data Econometrics. Oxford: Oxford University Press.

57. Piras; Postiglione; Aroca. 2012. “Specialization, R&D and Productivity Growth: Evidence from EU Regions.” Annals of Regional Science 49: 35-51.

58. Pons-Novell; Viladecans-Marsal. 1999. “Kaldor’s Laws and Spatial Dependence: Evidence for the European Regions.” Regional Studies 33: 443-451.

59. Quigley, J. M. 1998. “Urban Diversity and Economic Growth.” Journal of Economic Perspectives 12: 127-138.

60. Rivera-Batiz. 1988. “Increasing Returns, Monopolistic Competition, and Agglomeration Economies in Consumption and Production.” Regional Science and Urban Economics 18: 125-153.

61. Roodman. 2009a. “A Note on the Theme of Too Many Instruments.” Oxford Bulletin of Economics and Statistics 71: 135-158.

62. Roodman. 2009b. “How to Do xtabond2: An Introduction to Difference and System GMM in Stata.” The Stata Journal 9 (1): 86-136.

63. Sargan, J. D. 1958. “The Estimation of Economic Relationships Using Instrumental Variables.” Econometrica 26: 393-415.

64. Setterfield. 1997. Rapid Growth and Relative Decline: Modelling Macroeconomic Dynamics with Hysteresis. Basingstoke: Macmillan Press.

65. Toner. 1999. Main Currents in Cumulative Causation: The Dynamics of Growth and Development. Basingstoke: Macmillan Press.

66. United States Census Bureau. 2012. https://www.census.gov/en.html

67. Windmeijer. 2005. “A Finite Sample Correction for the Variance of Linear Efficient Two-step GMM Estimators.” Journal of Econometrics 126: 25-51.

68. World Bank. 2019. World Development Indicators 2019. Washington, DC: World Bank.