Abstract
In a recent issue of Political Analysis, Grant and Lebo authored two articles that forcefully argue against the use of the general error correction model (GECM) in nearly all time series applications of political data. We reconsider Grant and Lebo’s simulation results based on five common time series data scenarios. We show that Grant and Lebo’s simulations (as well as our own additional simulations) suggest the GECM performs quite well across these five data scenarios common in political science. The evidence shows that the problems Grant and Lebo highlight are almost exclusively the result of either incorrect applications of the GECM or the incorrect interpretation of results. Based on the prevailing evidence, we contend the GECM will often be a suitable model choice if implemented properly, and we offer practical advice on its use in applied settings.
In a recent issue of Political Analysis, Taylor Grant and Matthew Lebo author the lead and concluding articles of a symposium on time series analysis. These two articles argue forcefully against the use of the general error correction model (GECM). In their lead article, Grant and Lebo declare: “we recommend the GECM in only one rare situation: when all of the variables are strictly unit-root series,
Grant and Lebo identify two primary concerns with the GECM. First, when time series are stationary, the GECM cannot be used as a test of cointegration. This is a useful and often under-appreciated point. However, if this were Grant and Lebo’s only concern, their analysis would not fundamentally alter the conclusions of past research and could be easily dealt with in future studies by not using the GECM to test for cointegration with stationary time series. Grant and Lebo’s second concern is much more troubling. They argue that across most (and perhaps all) political science time series, the GECM will produce “an alarming rate of Type I errors” (p.4). This threat of spurious findings is the primary reason Grant and Lebo advocate abandoning the GECM.
If the GECM regularly produces spurious results, scholars would indeed be well-advised to abandon this approach. However, the problems Grant and Lebo highlight follow almost entirely from either incorrect applications of the GECM or incorrect interpretation of results. Indeed, a careful examination of Grant and Lebo’s results, as well as the other contributions to the Political Analysis symposium, shows the GECM performs quite well across a variety of common data scenarios in political science. In this article, we examine five of the scenarios that Grant and Lebo consider and we find that for four of the data scenarios, if applied correctly, the GECM can be estimated without concern for spurious relationships. With the fifth data type (fractionally integrated series), the GECM sometimes offers a suitable approach. Our analysis pays particularly close attention to the cases of bounded unit roots and near-integrated time series. We devote extra attention to these types of time series because they are common in political science and because none of the other symposium articles reconsider Grant and Lebo’s claims about these types of data. We show that when applied correctly, there is no inherent problem when using the GECM with bounded unit roots or near-integrated data.
Although our conclusions differ greatly from Grant and Lebo’s recommendations, we do not expect our findings to be controversial. Most of our evidence comes directly from Grant and Lebo’s own simulations. We also support our claims with additional simulation results. Although our findings are straightforward, our conclusions about the GECM are important for multiple reasons. First, a correct understanding of the GECM holds implications for how we understand existing research. Grant and Lebo identified five prominent articles and in each case they critiqued the authors’ use of the GECM. Grant and Lebo also pointed out that none of the other symposium articles provide “a defense of any GECM results published by a political scientist” (p.70). Our findings show that Grant and Lebo were too quick to criticize these researchers’ use of the GECM. Indeed, we reconsider two of the articles that Grant and Lebo critiqued (Casillas et al., 2011; Kelly and Enns, 2010) and we demonstrate that a correct understanding of the GECM indicates that the methods and findings of these two articles are sound.
Understanding the GECM also holds implications for future research. For time series analysis, Grant and Lebo recommend fractional integration (FI) methods. Although FI methods are certainly an important statistical approach (e.g. Box-Steffensmeier et al., 1998; Box-Steffensmeier and Tomlinson, 2000; Clarke and Lebo, 2003), substantial disagreement exists regarding their utility for political science data (Box-Steffensmeier and Helgason, 2016). Given the debate about the utility of FI methods (especially with short time series), it is important for researchers to know when the GECM avoids the errors that Grant and Lebo ascribe to it. Until alternate methods are shown to perform better, based on our findings we recommend that researchers use the GECM for bounded unit roots (when statistical tests indicate the dependent variable contains a unit root and cointegration is present) and with near-integrated data (again, cointegration must be established when statistical tests suggest a unit root). We also remind readers that the GECM is appropriate with FI data in some contexts (Esarey, 2016; Helgason, 2016) and it is appropriate with stationary time series (although we point out that the mathematically equivalent autoregressive distributed lag model (ADL) is less likely to produce errors of researcher interpretation with stationary data).
We conclude with a detailed summary of our recommendations for time series practitioners, highlighting where we agree with Grant and Lebo and where our recommendations differ. The conclusion also discusses avenues for future research. These include studying the performance of the GECM with other types of time series. Wlezien (2000), for example, discusses “combined” time series (which contain both integrated and stationary processes) and he shows they can be modeled with a GECM. Future research also needs to continue to evaluate the performance of FI techniques. Of particular interest is resolving the debate between Lebo and Grant (2016) and Keele et al. (2016) regarding how long a time series must be to reliably estimate the FI parameter
Reconsidering five of Grant and Lebo’s data scenarios
In this section, we revisit Grant and Lebo’s first five data scenarios. We find that across all five scenarios, the GECM typically avoids spurious relationships. We explain why our results differ from Grant and Lebo’s conclusions and highlight when we agree with their recommendations.
Case 1: The dependent variable contains a unit root,
Grant and Lebo begin with a very important, and often forgotten, point. They show that if the dependent and independent variables contain a unit root, cointegration must be established prior to interpreting the results of a GECM (see also Enns et al., 2014).
Grant and Lebo explain, “Without cointegration, however, the model is unbalanced and the practitioner should set aside the estimates and choose a different specification” (p.7).
Grant and Lebo also correctly emphasize that when
must be less than the corresponding critical value in Ericsson and MacKinnon (2002). Grant and Lebo’s Table 2 (second row) confirms that when an analyst uses the appropriate critical values, this cointegration test performs well even when
At first glance, the bottom row of Grant and Lebo’s Table 2 might appear to contradict this statement, because these results show that Grant and Lebo incorrectly rejected the null hypothesis that
Case 2: The dependent variable is a bounded unit root
Grant and Lebo’s second case considers bounded unit roots, which are time series that “can exhibit the perfect-memory of integrated data” but are bound between an upper and lower limit (p.10). Grant and Lebo cite the public’s policy mood as an example (Stimson, 1991). They find that policy mood contains a unit root, but because it is based on survey percentages, it is clearly bound between 0 and 100. According to Grant and Lebo, bounded unit roots should not be analyzed with a GECM. They write, “Even if we find series that are strictly unit-roots and we use MacKinnon CVs, mistakes are still rampant if our dependent variable is one of the vast majority of political times series that is bounded” (p.12).
A reconsideration of Grant and Lebo’s simulations demonstrates, however, that the GECM performs no worse with bounded unit roots than it does with the integrated time series discussed in Case 1 (where Grant and Lebo recommend the use of the GECM). First, consider Grant and Lebo’s finding that, “Boundedness does not seem to affect the estimation of
To understand why the results in Grant and Lebo’s Table 3 are problematic, recall that Grant and Lebo’s key point about bounded unit roots is that despite containing a unit root, bounded unit roots behave differently than pure
As noted above (also see Case 3 below), Grant and Lebo show that tests of cointegration based on a GECM will incorrectly find evidence of cointegration too often if the dependent variable is stationary. This realization provides a rather simple modeling strategy for bounded unit roots. When a bounded unit root behaves as if it is stationary, the GECM should not be used to test for cointegration. When a bounded unit root behaves as if it contains a unit root, the GECM should be appropriate. To test whether this strategy avoids Type I errors, we replicate Grant and Lebo’s bounded unit root simulations, adding a test for the time series properties of
After generating these series, we use an augmented Dickey-Fuller (ADF) test to evaluate whether we reject the null hypothesis of a unit root in the dependent series. At first, our use of an ADF test may seem surprising. It is well known that ADF tests are underpowered against the alternative hypothesis of stationarity (Blough, 1992; Cochrane, 1992). Thus, the ADF may incorrectly conclude that a series that behaves as if stationary follows a unit root process in the observed data. However, this means we are biasing our simulations against support for the GECM since we are more likely to incorrectly conclude the series contains a unit root and thus inappropriately utilize the GECM as a test of cointegration (thereby inflating the rate of Type I errors with those cointegration tests). If the ADF rejects the null of a unit root, we do not use the GECM to test for cointegration. Even though the true DGP in our simulations is a bounded unit root, if the series behave as if stationary (because the bounds and mean reversion generate a series with a constant variance and mean), the GECM should not be used to test for cointegration. Not only would the cointegration test be wrong if the dependent series behaves as if stationary, but we have no reason to expect a cointegrating relationship between an integrated predictor and an outcome variable that appears stationary (Keele et al., 2016).
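The pre-testing step described above can be illustrated with a short sketch. This is not the simulation code used for Table 1: the bounded data-generating process (a random walk clamped to its bounds), the function names, and the use of a plain Dickey-Fuller regression without lag augmentation are all assumptions made for illustration, and the critical value is the approximate asymptotic 5 per cent value for a regression with a constant rather than a sample-size-specific one.

```python
import math
import random

# Assumption of this sketch: approximate asymptotic 5% Dickey-Fuller critical
# value (constant, no trend). Finite-sample values are somewhat more negative.
DF_CRIT_5PCT = -2.86


def bounded_random_walk(T, lower, upper, sd, rng):
    """Illustrative bounded unit root: a random walk clamped to [lower, upper]."""
    y = [(lower + upper) / 2.0]
    for _ in range(T - 1):
        proposal = y[-1] + rng.gauss(0.0, sd)
        y.append(min(max(proposal, lower), upper))
    return y


def df_tstat(y):
    """t-statistic on y_{t-1} in the regression dy_t = c + rho * y_{t-1} + e_t."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    lag = y[:-1]
    n = len(dy)
    mean_lag, mean_dy = sum(lag) / n, sum(dy) / n
    sxx = sum((v - mean_lag) ** 2 for v in lag)
    sxy = sum((v - mean_lag) * (d - mean_dy) for v, d in zip(lag, dy))
    rho = sxy / sxx
    const = mean_dy - rho * mean_lag
    ssr = sum((d - const - rho * v) ** 2 for v, d in zip(lag, dy))
    se_rho = math.sqrt(ssr / (n - 2) / sxx)
    return rho / se_rho


def may_test_cointegration(y, crit=DF_CRIT_5PCT):
    """True when the unit-root null is NOT rejected for the dependent series.

    Only in that case would the GECM be used as a cointegration test.
    """
    return df_tstat(y) >= crit


if __name__ == "__main__":
    rng = random.Random(0)
    mood_like = bounded_random_walk(60, 0.0, 100.0, 3.0, rng)
    print("DF t-statistic:", round(df_tstat(mood_like), 3))
    print("Proceed to GECM cointegration test:", may_test_cointegration(mood_like))
```

In applied work a full ADF test with an appropriate lag length, such as the one provided by standard statistical packages, would replace `df_tstat`; the decision rule in `may_test_cointegration` is the part that mirrors the strategy in the text.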
Table 1 reproduces Grant and Lebo’s original results (from their Table 3, p.11) and the results from our simulations. Recall that our simulations are identical to Grant and Lebo’s except we do not use the GECM to test for cointegration if an ADF test on
Table 1. The percentage of simulations that provide incorrect evidence of cointegration based on Grant and Lebo’s Table 3 and based on first testing for a unit root.
Table entries report Type I error rates when testing for cointegration with simulated data, depending on the specified sample size, measurement bounds, and variance. “Grant & Lebo reproduction” presents values from Grant and Lebo, Table 3. “With unit root test” presents the corresponding values after first testing for a unit root in
Our primary interest is evaluating whether the GECM yields correct inferences if we first test the sample properties of the bounded unit root. To test this expectation, the “With unit root test” rows only use the GECM to test for cointegration if the ADF test does not reject the null hypothesis of a unit root in the dependent series. Across all parameters, the Type I error rate is about 5 per cent. By first diagnosing the time series properties of the dependent series (which is standard practice in time series analysis), we avoid the Type I errors in the cointegration tests. These results show that there is no inherent problem with using the GECM with bounded unit roots. This is an important result. Grant and Lebo state, “Even if we find series that are strictly unit-roots and we use MacKinnon CVs, mistakes are still rampant if our dependent variable is one of the vast majority of political times series that is bounded” (p.12). Yet, the results in Table 1 show that if we follow these guidelines (i.e. find evidence of unit-roots and use MacKinnon critical values), the mistakes Grant and Lebo found essentially disappear.
In addition to offering guidance about the appropriate use of the GECM, these simulation results hold implications for Grant and Lebo’s analysis of Kelly and Enns (2010). Grant and Lebo use Kelly and Enns’ analysis of the relationship between income inequality and policy mood to illustrate the pitfalls of analyzing a bounded unit root with a GECM. Specifically, based on Kelly and Enns’ analysis, Grant and Lebo conclude that there is “No cointegration” and thus the “GECM model [is] inappropriate” (p.26). Yet, looking at Kelly and Enns’ most parsimonious analysis (Table 1, Column 2) we find clear evidence of cointegration.
We select the most parsimonious specification because in Keele, Linn, and Webb’s first contribution to the symposium, they suggested that Kelly and Enns over-fit their model. By focusing on this parsimonious model (which Keele, Linn, and Webb did not consider) we mitigate concerns that the results are due to over-fitting the model. The evidence of cointegration in Kelly and Enns’ analysis, combined with the simulation results above in Table 1, supports the use of the GECM. Grant and Lebo also conclude that there is “No support for short- or long-term effect of income inequality on public mood” (p.26). This conclusion is surprising because, as noted above, Grant and Lebo conclude that “Boundedness does not seem to affect estimation of
Since the various simulations validate Kelly and Enns’ estimates and conclusions about cointegration, we wondered why Grant and Lebo’s replication of Kelly and Enns with nonsense regressions produced spurious results. In their nonsense regressions, Grant and Lebo replicated Kelly and Enns’ analysis, replacing the key predictor variables with variables that are surely unrelated to the public’s policy mood: beef consumption, coal emissions, tornado fatalities, and onion acreage (see Grant and Lebo’s Table E.13). Grant and Lebo write, “Based on our past replications and simulations we expect spurious regressions, and that is what we find” (Grant and Lebo, supplementary materials, p.43). This conclusion depends, however, on an incorrect application of the GECM. Across the eight nonsense regressions, none show evidence of cointegration. If Grant and Lebo had followed their own advice to “set aside the estimates” without cointegration, they would never have reported the results from these nonsense regressions. The spurious results in their nonsense regressions arise because they did not test for cointegration. We should also note that even if the GECM were implemented correctly, we do not recommend the use of atheoretical variables to demonstrate the possibility of spurious findings. Instead, we recommend the standard approach of simulating data and conducting Monte Carlo experiments.
Grant and Lebo’s final critique of Kelly and Enns comes from their Tables E.16 and E.17, where they re-analyze Kelly and Enns’ data with a FECM and find no significant relationships in the data. What Grant and Lebo fail to consider is the possibility that FECMs under-identify true relationships in small samples. Since Grant and Lebo’s fractional differencing analyses failed to fully replicate four out of five influential articles, this is a critical consideration. In all of their simulations, Grant and Lebo never report how often fractional differencing methods identify true relationships. Helgason (2016) considered prediction error and found that the performance of the FECM depends heavily on whether short-term dynamics are present and the sample size, but he did not test how often FECMs correctly identify true relationships. We also do not know how the FECM performs if applied to data that are not fractionally integrated. If the FECM is overly conservative, Grant and Lebo’s re-analysis and critiques of the other articles would also be highly problematic. This is an important area for future research.
Case 3: The dependent variable and all independent variables are stationary
As we noted earlier, Grant and Lebo make an important contribution by highlighting the fact that
Here we wish to clarify that the increased risk of Type I errors that Grant and Lebo refer to is due entirely to the potential for incorrect interpretation of GECM results. Although the GECM and ADL contain the exact same information, the two models present this information differently. When researchers fail to realize that the two models present information differently, errors of interpretation can emerge. Thus, we agree with Grant and Lebo that when the dependent variable is stationary, the parameterization of the GECM is more likely than the ADL to lead to errors of interpretation. Specifically, when estimating a GECM with a stationary
Case 4: The dependent variable is strongly autoregressive/near-integrated
Grant and Lebo’s fourth case focuses on near-integrated data, which are time series with a root close to, but not quite, unity (Phillips, 1988). Grant and Lebo again draw a confusing distinction between the ADL and GECM. In their first article, they write: “Our findings for the ADL match those of De Boef and Granato (1997), who find that the model has acceptable spurious regression rates with near-integrated data. But we also find that this does not translate for the same data in the GECM” (p.15). They present additional simulation results in support of this claim in their concluding article, stating that when estimating a GECM, “With sixty observations there is a significant threat of Type I errors” (p.75). Because the ADL and GECM are mathematically equivalent, as with the stationary dependent variable example above, the threat of errors stems entirely from potential researcher errors. And since Grant and Lebo found evidence that the ADL avoids spurious correlations (see their Table 6), the same must be true for the GECM. There is no inherent problem with the GECM and near-integrated data. If the GECM is implemented correctly, no problems emerge.
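The equivalence of the two parameterizations can be checked numerically. The sketch below is illustrative only and is not drawn from Grant and Lebo’s or our replication files. It assumes a single-predictor ADL(1,1), y_t = a0 + a1*y_{t-1} + b0*x_t + b1*x_{t-1} + e_t, and the corresponding GECM, dy_t = a0 + (a1 - 1)*y_{t-1} + b0*dx_t + (b0 + b1)*x_{t-1} + e_t. The symbols, the function names, and the AR(1) coefficient of 0.9 used to mimic near-integrated series are assumptions of the example.

```python
import random


def solve(a, b):
    """Solve a small linear system a z = b by Gauss-Jordan elimination."""
    k = len(a)
    m = [list(row) + [bi] for row, bi in zip(a, b)]
    for c in range(k):
        pivot_row = max(range(c, k), key=lambda r: abs(m[r][c]))
        m[c], m[pivot_row] = m[pivot_row], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(k):
            if r != c:
                factor = m[r][c]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[c])]
    return [row[k] for row in m]


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def ar1(T, phi, rng):
    """Simulate an AR(1) series; phi close to 1 mimics near-integration."""
    s = [0.0]
    for _ in range(T - 1):
        s.append(phi * s[-1] + rng.gauss(0.0, 1.0))
    return s


def fit_both(y, x):
    """Estimate the ADL(1,1) and the GECM on the same data."""
    idx = range(1, len(y))
    adl = ols([[1.0, y[t - 1], x[t], x[t - 1]] for t in idx],
              [y[t] for t in idx])
    gecm = ols([[1.0, y[t - 1], x[t] - x[t - 1], x[t - 1]] for t in idx],
               [y[t] - y[t - 1] for t in idx])
    return adl, gecm


def implied_gecm(adl):
    """Map ADL coefficients (a0, a1, b0, b1) to their GECM counterparts."""
    a0, a1, b0, b1 = adl
    return [a0, a1 - 1.0, b0, b0 + b1]


if __name__ == "__main__":
    rng = random.Random(42)
    y, x = ar1(60, 0.9, rng), ar1(60, 0.9, rng)  # two unrelated series
    adl, gecm = fit_both(y, x)
    labels = ["constant", "y_{t-1}", "dx_t", "x_{t-1}"]
    for label, est, imp in zip(labels, gecm, implied_gecm(adl)):
        print(f"{label:>9}: GECM estimate {est: .6f} | implied by ADL {imp: .6f}")
```

Up to floating-point error, the two columns agree, because the GECM is a linear reparameterization of the ADL. Any difference in conclusions must therefore come from how the coefficients are interpreted and which critical values are applied, not from the estimates themselves.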
Here we review Grant and Lebo’s simulation results to illustrate potential errors that researchers need to avoid. One potential error is evaluating
A second error can result from using the incorrect critical values when testing for cointegration with
The percentage of spurious relationships for near-integrated series, results from Grant and Lebo Tables G.1–G.5 (
Notes: Long-term estimates (
The left half of our Table 2 reports the false rejection rates for
As noted above, the bottom half of Grant and Lebo’s Table 6 found false rejection rates above 5 per cent because they did not use the appropriate MacKinnon critical values. The right half of Table 2 reports the same information (i.e. the false rejection rate for
Grant and Lebo’s lead and concluding articles recommend that the GECM should not be estimated with near-integrated data and short time series. We agree that if scholars implement the GECM incorrectly with near-integrated data, incorrect results will emerge. However, Grant and Lebo’s simulation results (based on the ADL as well as the GECM) show that if the GECM is implemented and interpreted correctly (i.e. long-run relationships in the GECM are only considered if there is evidence of cointegration), it is completely appropriate with near-integrated data.
Case 5: The dependent variable is fractionally integrated, (0, d, 0) and
Grant and Lebo’s fifth case focuses on fractionally integrated time series. Often researchers assume a time series is stationary (
Despite Grant and Lebo’s enthusiasm for FI methods, two other contributions to the Political Analysis symposium demonstrate that even when data are fractionally integrated, the GECM is often appropriate. Esarey (2016) examined fractionally integrated time series where
The top half of Grant and Lebo’s concluding Table 2 reports the rate of false rejections for LRMs. The results seem to contradict Esarey’s findings because in almost every case, the false rejection rate was greater than 5 per cent. However, Grant and Lebo did not first test for cointegration, which means the rate of Type I errors is greatly inflated. As Grant and Lebo explained in the context of near-integrated data, “relying on the significance of the LRM rather than the joint hypothesis test of the
Helgason (2016) performs additional simulations with fractionally integrated time series and finds that the performance of the GECM and FECM depends on the length of the time series and whether or not short-run dynamics are present. When a cointegrating relationship exists between fractionally integrated variables, both models provide similar results when
The fact that neither Esarey nor Helgason found evidence of increased Type I error rates when the GECM was applied to fractionally integrated series makes Grant and Lebo’s conclusions regarding Casillas et al. (2011) seem surprising. Casillas, Enns, and Wohlfarth examined the relationship between the public’s policy mood and Supreme Court decisions. Grant and Lebo argue that the dependent variables analyzed by Casillas, Enns, and Wohlfarth were fractionally integrated and they should not have estimated a GECM. However, a closer look at the data and Grant and Lebo’s analysis suggests that the GECM was indeed appropriate.
First, Grant and Lebo’s estimate of the FI parameter
To understand the necessary steps when testing for FI, recall that if the ARFIMA model that estimates
Portmanteau (Q) test for autocorrelation in ARFIMA and ARIMA models of the per cent of liberal Supreme Court decisions that reversed the lower court.
Notes: Table entries report the results of Ljung and Box (1978) Portmanteau (Q) white noise tests across different lag lengths. The maximum lag length (20) is determined by
A second concern with Grant and Lebo’s approach is the model they use to estimate
It may be that with short time series, we cannot draw firm conclusions about the time series properties of variables. Yet, the balance of evidence from the various tests suggests that these series contain a unit root. Of course, because these series are percentages, they are clearly bounded. We saw above that when we cannot reject the null of a unit root (as is the case here), as long as the MacKinnon critical values show evidence of cointegration, researchers can model bounded unit roots with a GECM. In addition, the
Why then did Grant and Lebo’s nonsense regressions, where they replaced the predictors in Casillas, Enns, and Wohlfarth’s analysis with the annual number of shark attacks, tornado fatalities, and beef consumption, show evidence of spurious relationships? We again emphasize that simulations, not nonsense regressions, are the most appropriate way to test for the rate of spurious regression. However, even if we take these nonsense regressions at face value, once again we find that Grant and Lebo’s conclusions result because they interpreted the GECM results despite no evidence that shark attacks, tornado fatalities, and beef consumption are cointegrated with the dependent variables. If Grant and Lebo had followed their own advice to “set aside the estimates” without cointegration, they would never have reported these results from their shark attack/tornado/beef analysis. Instead, they conclude, “our nonsense IVs are significant far too often” (p.22). This is an erroneous conclusion that emerged because Grant and Lebo failed to follow their own recommendations regarding the GECM.
Conclusions and recommendations
We applaud Grant and Lebo for trying to clarify the time series literature. Their lead and concluding articles to the recent Political Analysis symposium on time series error correction methods make some important contributions. Accordingly, we agree with the following recommendations.
When analyzing integrated time series, researchers must establish cointegration with appropriate MacKinnon critical values prior to interpreting the results of a GECM.
When analyzing a dependent variable that is stationary, researchers cannot use
Although the ADL and GECM produce the same information (in different formats), the ADL is less likely to yield errors of interpretation when
However, we are not convinced that Grant and Lebo have presented sufficient evidence to raise fundamental questions about the applicability of the GECM to political time series. In this article we have re-examined many of Grant and Lebo’s own simulations and have supplemented these with our own new analyses. The results here point to the conclusion that, when executed properly, the GECM is an analytically appropriate model choice when a dependent variable is:
a bounded unit root (with cointegration);
near-integrated (with cointegration).
These scenarios are common in social science applications, meaning that Grant and Lebo’s skepticism toward the GECM is largely misplaced.
In addition to making the broad point that the GECM can be usefully applied to a variety of political time series, we also showed that Grant and Lebo’s critiques of Casillas et al. (2011) and Kelly and Enns (2010) were highly flawed. Taking Grant and Lebo’s critiques at face value, Keele, Linn, and Webb suggested that the problem with these analyses could be over-fitting. But in reality there was no problem to solve. When analyzed appropriately, the core results of these earlier studies remain intact, and it is incorrect to conclude that the GECM was inappropriately applied in these cases.
Some of our conclusions are echoed by other contributors to the Political Analysis symposium. But none of the responses directly question Grant and Lebo’s core argument about the highly constrained set of circumstances in which a GECM would be appropriate. We have attempted to highlight substantial problems with key conclusions in Grant and Lebo’s work. In sum, our results show that their methodological concerns about the inappropriateness of the GECM for political science time series are far too broad. Moreover, the critiques of at least two of the substantive analyses that Grant and Lebo replicate are not supported by their own evidence. We therefore believe that it would be a mistake for applied researchers to adopt Grant and Lebo’s recommendation to use “the GECM in only one rare situation” (p.27). The evidence does not support this recommendation. The GECM should not be set aside. It should remain a technique that is regularly applied alongside other techniques in the tool kit of social science research.
Based on our reanalysis, there is still much to learn from Grant and Lebo’s work. Most particularly, they have done a great service by drawing additional attention to FI techniques. Our primary purpose here was not to explore models of FI, and the papers in the symposium leave several considerations about FI techniques unresolved. First, debate exists regarding the ability of FI tests to accurately identify FI. Second, as we demonstrated by reviewing Grant and Lebo’s re-analysis of Casillas et al. (2011), researchers have multiple parametric and semi-parametric methods available to estimate fractional integration, yet different assumptions made in these tests (e.g. presence and identification of short-term dynamics) may lead to different conclusions about FI and the dynamic properties of a time series. Third, it is not yet evident that FI modeling approaches can reliably identify true relationships in the data, especially with short time series. Fourth, it is not clear how FI techniques perform if incorrectly applied to series that are not fractionally integrated. We look forward to seeing new contributions in political methodology that help to sort out these and other remaining issues regarding FI. Another important avenue for future research is the consideration of “combined” time series. Wlezien (2000) has shown that “combined” time series (where a process combines both integrated and stationary components) are likely common in political science data and that these series can be modeled with a GECM. Combined time series are particularly important to consider because, as Wlezien (2000) explains, they tend to look like FI series in finite samples, which further calls into question the ability of FI tests to correctly identify time series when
Footnotes
Acknowledgements
We would like to thank Bryce Corrigan, Mike Hanmer, Jeff Harden, two anonymous reviewers, the associate editor, and the editor at Research and Politics for helpful comments and suggestions. All information necessary to reproduce the simulations reported in this article is provided in the Supplementary Appendix as well as the Research and Politics Dataverse site.
Declaration of conflicting interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Notes
Carnegie Corporation of New York Grant
The open access article processing charge (APC) for this article was waived due to a grant awarded to Research & Politics from Carnegie Corporation of New York under its ‘Bridging the Gap’ initiative.
