Abstract
Oppenheim et al. (2015) provides the first empirical analysis of insurgent defection during armed rebellion, estimating a series of multinomial logit models of continued rebel participation using a survey of ex-combatants in Colombia. Unfortunately, many of the main results from this analysis are an artifact of separation in these data: that is, one or more of the covariates perfectly predicts the outcome. We demonstrate that this can be identified using simple cross tabulations. Furthermore, we show that Oppenheim et al.’s (2015) results are not supported when separation is explicitly accounted for. Using a generalization of Firth’s (1993) penalized-likelihood estimator, a well-known solution for separation, we are unable to reproduce any of their conditional results. While our (re-)analysis focuses on Oppenheim et al. (2015), this problem appears in other research using multinomial logit models as well. We believe that this is both because the discussion of separation in political science has primarily focused on binary-outcome models, and because software (Stata and R) does not warn researchers about separation in multinomial logit models. Therefore, we encourage researchers using multinomial logit models to be especially vigilant about separation, and discuss simple red flags to consider.
The analysis of qualitative outcomes (i.e. binary, ordinal, or nominal data) is ubiquitous in political science, with research into conflict, coalition formation, vote choice, policy adoption, etc. When predictors are also discrete, researchers need to be mindful of possible separation: that is, perfect prediction of the outcome. Under separation, sample analysis produces implausible estimates of population parameters. While this is now well understood in binary-outcome models (Rainey, 2016; Zorn, 2005), applied research often fails to recognize that the same concern is present with ordinal- and nominal-valued outcomes.
Oppenheim, Steele, Vargas, and Weintraub (2015) provides a useful example of the consequences of failing to recognize such separation. In their multinomial logistic models of insurgent defection, Oppenheim et al. (2015) includes several interactions of rare, binary predictors. In so doing, they partition the data such that there are several combinations with no observations. As a result, the reported risk ratios on these predictors are in the hundreds of thousands – implying, for example, that less than 1 in 1 million similar rebels would be captured instead of demobilizing. However, when we explicitly account for separation in Oppenheim et al.’s (2015) models – adding a penalty to the score vector (Kosmidis and Firth, 2011) – none of their conditional expectations are supported.
The Oppenheim et al. (2015) piece presents a larger question: why does separation often go unrecognized in multinomial logistic models? We discuss two possible reasons: i) most discussions of separation in political science have focused on the binary-outcome case, and ii) statistical software (Stata and R) does not handle separation in a consistent manner. As such, we encourage researchers with nominal outcomes to be particularly vigilant about possible separation.
Separation in (multinomial) logistic regression
With discrete data, separation occurs when one or more covariates correctly classifies – that is, predicts the outcome for – each observation. More formally, complete separation occurs when a subvector
While these issues seem well understood by political scientists when estimating binary-outcome models (Rainey, 2016; Zorn, 2005), less care is taken by researchers estimating multinomial models (e.g. Forsberg, 2013; Koga, 2011). For example, Fortna (2015) explicitly notes that the effect of a predictor, Africa, cannot be reported in a set of binary logistic models due to perfect prediction. Yet, when several variables – including the main predictor, Terrorist Rebel Group – perfectly classify the outcomes in subsequent multinomial logistic models, the issue is not discussed.
What explains this oversight given that the consequences from separation are exactly the same as those confronted in the binary logit case? First, many researchers may be unaware of the correspondence between separation in the binary and multinomial logistic models. While many of the seminal works on separation discuss the more general
Second, statistical software packages often have different conventions for handling separation, which may confuse researchers. Worse still, some packages vary in their own treatment of separation across different commands. In Stata, for example, if there is perfect prediction in a binary logit model (logit), offending observations are dropped and users receive a warning message. However, with multinomial logit (mlogit), these observations are retained and no warning message is provided (Long and Freese, 2006). Similarly, in R, neither the multinom nor the mlogit function warns of possible separation. As such, researchers cannot consistently rely on statistical software warnings to identify separation.
To demonstrate the consequences of failing to recognize separation in multinomial logistic models, we next reanalyze Oppenheim et al. (2015).
Oppenheim et al. (2015)
Conflict studies has increasingly turned to individual-level data to better understand the microfoundations of political violence. While others have focused on initial joining behavior, Oppenheim et al. (2015) offers the first empirical analysis of continued participation in an ongoing rebellion. Specifically, why do rebels choose to defect or remain loyal? Oppenheim et al. (2015) argues that a combatant’s decision is a function of their initial reason for joining (ideological or economic), the subsequent behavior of the group (undergoing ideological indoctrination and/or participating in peasant abuse), wartime experiences (pressure from armed forces), and interactions therein. For concision, we summarize Oppenheim et al.’s (2015) theoretical expectations in Table 1.
Table 1. Oppenheim et al.’s (2015) expectations for demobilization.
Note: Arrows indicate whether demobilization/side-switching is more (↑) or less (↓) likely under the given conditions. The column titles give the unconditional expectations (H1 and H2), while each of the elements give the conditional expectations (i.e. interactions of the row and column predictors).
To test these, Oppenheim et al. (2015) uses survey data from Fundación Ideas para la Paz on Colombian ex-combatants. Specifically, their analysis uses a sample of 582 respondents who joined left-wing guerilla groups (e.g. FARC, ELN) but were subsequently captured (49), individually demobilized (506), or switched sides to the paramilitary group (27). These data are used to construct a nominal dependent variable (i.e. captured, demobilized, switched), which is then analyzed in a series of multinomial logistic models. Using a set of binary predictors and interactions (e.g. Economic need × Political indoctrination) on the rebels’ histories, Oppenheim et al. (2015) concludes that there is broad support for their expectations.
However, we find that several of their main results are due solely to separation. While the unconditional relationships hold (H1 and H2), none of the conditional arguments, which serve as the basis for much of their theory, find support. We demonstrate this in two ways. First, a simple cross tabulation presented in Table 2 shows that there are no observations for several of the conditions – meaning there is no variation within that category. For example, the sample contains no instances of a captured rebel who both joined for economic reasons and was not politically indoctrinated. The same is true for two other combinations of conditions (as indicated by the bold zeros in Table 2). This absence of observations is not surprising given they occur under conditions when rare outcomes (i.e. only 49 individuals were captured and 27 switched) are intersected with rare predictors (i.e. only 38 respondents reported peasant abuse). In the presence of these empty cells, maximum likelihood estimates do not exist.
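The cross-tabulation check described above can be sketched in a few lines. The following is a minimal illustration in Python using only the standard library; the function name `empty_cells`, the variable names, and the toy counts are all hypothetical and are not taken from the Oppenheim et al. (2015) data. It lists every combination of outcome category and predictor values that has no observations, which is the warning sign discussed here.

```python
from collections import Counter
from itertools import product


def empty_cells(rows, outcome_key, predictor_keys):
    """Return every (outcome, predictor-values) combination with zero observations."""
    counts = Counter(
        (row[outcome_key], tuple(row[k] for k in predictor_keys)) for row in rows
    )
    outcomes = sorted({row[outcome_key] for row in rows})
    levels = [sorted({row[k] for row in rows}) for k in predictor_keys]
    return [
        (outcome, combo)
        for outcome in outcomes
        for combo in product(*levels)
        if counts[(outcome, combo)] == 0  # a Counter returns 0 for unseen keys
    ]


def make_rows(outcome, econ, indoc, n):
    """Build n identical toy records (hypothetical data, for illustration only)."""
    return [{"outcome": outcome, "econ": econ, "indoc": indoc}] * n


# Toy sample: no captured respondent has econ=1 and indoc=0.
toy = (
    make_rows("captured", 0, 0, 5)
    + make_rows("captured", 0, 1, 4)
    + make_rows("captured", 1, 1, 3)
    + make_rows("demobilized", 0, 0, 40)
    + make_rows("demobilized", 0, 1, 30)
    + make_rows("demobilized", 1, 0, 20)
    + make_rows("demobilized", 1, 1, 10)
    + make_rows("switched", 0, 0, 2)
    + make_rows("switched", 0, 1, 2)
    + make_rows("switched", 1, 0, 1)
    + make_rows("switched", 1, 1, 1)
)

empty = empty_cells(toy, "outcome", ["econ", "indoc"])
print(empty)
```

Any combination this returns deserves a closer look before an interaction of those predictors enters a multinomial logit model.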
Table 2. Cross tabulation of outcome categories and interactions from Oppenheim et al. (2015).
What can researchers do? Heinze and Schemper (2002) show that the penalized-likelihood strategy of Firth (1993) can recover finite-valued estimates in the presence of separation – a solution that is now widely used in political science (Zorn, 2005). Kosmidis and Firth (2011) extend this penalized-likelihood strategy to multinomial logit, enabling us to reanalyze Oppenheim et al. (2015) using a bias-corrected MNL. In short, this penalizes the likelihood by the square root of the determinant of the information matrix (i.e. Jeffreys prior). More intuitively, this is roughly analogous to adding a small value to the frequencies in Table 2, ensuring no separation. We prefer this over other solutions as: i) it is the natural extension of a solution already familiar to political scientists, and ii) functions are available (brglm2 in R) to easily implement this procedure.
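The "adding a small value to the frequencies" intuition can be made concrete in the simplest possible setting. The sketch below assumes a single binary predictor and two outcome categories; for that saturated case, penalizing the likelihood by Jeffreys prior is equivalent to adding 0.5 to each cell count. The counts and the helper name `log_odds` are hypothetical, chosen only for illustration. In models with further covariates the penalty is not a constant added to each cell, so this should be read as an analogy rather than as the estimator used in the reanalysis.

```python
import math


def log_odds(successes, failures, add=0.0):
    """Log odds from two cell counts, optionally adding a constant to each cell."""
    s = successes + add
    f = failures + add
    if s == 0 and f == 0:
        return math.nan
    if s == 0:
        return -math.inf  # empty "success" cell: the estimate diverges
    if f == 0:
        return math.inf
    return math.log(s / f)


# Hypothetical 2x2 table: rows are predictor values, columns are outcomes.
# x = 1: 0 captured, 20 demobilized (an empty cell)
# x = 0: 30 captured, 150 demobilized
ml_estimate = log_odds(0, 20) - log_odds(30, 150)
penalized_estimate = log_odds(0, 20, add=0.5) - log_odds(30, 150, add=0.5)

print("maximum likelihood log odds ratio:", ml_estimate)
print("penalized log odds ratio:", round(penalized_estimate, 3))
```

With the empty cell, the unpenalized log odds ratio is negative infinity, while the penalized version is finite and of moderate size.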
The results are presented in Table 3, which first reproduces the findings from Oppenheim et al. (2015) using multinomial logit (MNL) and then attempts to replicate these results using Kosmidis and Firth’s (2011) penalized multinomial logit (Firth-MNL).
In the absence of separation, these estimators produce similar results, but here we see dramatic differences. To illustrate, consider the interaction Economic need × Indoctrination in Model 3. With MNL, we observe an implausibly low estimate (
Table 3. Replication of Models 3–5 from Oppenheim et al. (2015): naïve multinomial logit (MNL) vs. bias-corrected multinomial logit (Firth-MNL).
Following Oppenheim et al. (2015), Captured is used as the reference category. Standard errors in parentheses. * = p < 0.1, ** = p < 0.05, *** = p < 0.01.
With Firth-MNL, the coefficient estimates on these same covariates are dramatically attenuated (–0.856 and 1.113, respectively) and do not approach significance even at the p < 0.1 level. The same is true for the interactions considered in Models 4 and 5, corresponding exactly to the combinations for which there were empty cells in Table 2. In short, rather than providing evidence in support of Oppenheim et al.’s (2015) conditional expectations, these large coefficient values actually indicate separation.
Discussion
Failing to recognize separation in models with qualitative outcomes can produce inaccurate inferences. As Oppenheim et al. (2015) demonstrates, researchers may even conclude that the anomalous estimates provide very strong support for their theories. Moreover, it illustrates that even where researchers recognize the presence of perfect prediction, its consequences for estimation may not be understood. Rather than offer support, parameter values recovered under separation convey little useful information about the underlying population parameters of interest.
This is not to say that the conditional expectations articulated in Oppenheim et al. (2015) are wrong, only that they are not supported by these data. There is insufficient information from which to draw reliable inferences about the claims they make. The descriptive evidence (the observed frequencies) suggests that they may indeed be right, but to establish this empirically would require more data exhibiting greater variation.
Separation is not a problem unique to Oppenheim et al. (2015). As such, we highlight several points in concluding. First, researchers should recognize that separation concerns apply analogously to nominal-outcome models. Second, researchers should undertake simple diagnostics such as cross tabs, which can reveal sparse data coverage. Third, researchers should scale predictors (as suggested in Gelman, Jakulin, Pittau et al., 2008) and beware of large coefficients and standard errors, as these can be indicative of separation. Finally, researchers should consider principled solutions to separation (often through a penalty or a prior) for robustness. Here we have demonstrated one of these approaches and shown its efficacy in helping to avoid unsubstantiated inferences.
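The third recommendation, watching for large coefficients and standard errors, lends itself to a simple automated screen. The sketch below is only illustrative: the function name `flag_suspect_terms`, the thresholds of 10, and the example estimates are all assumptions made here rather than values drawn from any of the models discussed. The thresholds presume binary or comparably scaled predictors, for which a logit coefficient beyond 10 in absolute value implies an odds ratio above roughly 22,000.

```python
import math


def flag_suspect_terms(estimates, max_abs_coef=10.0, max_se=10.0):
    """Return the names of terms whose coefficient or standard error is implausibly large.

    `estimates` is an iterable of (name, coefficient, standard_error) tuples.
    The default thresholds are illustrative assumptions, not established cutoffs.
    """
    flagged = []
    for name, coef, se in estimates:
        if abs(coef) > max_abs_coef or se > max_se:
            flagged.append(name)
    return flagged


# Hypothetical output from a multinomial logit fit (made-up numbers).
example = [
    ("econ_need", 0.41, 0.30),
    ("indoctrination", -0.75, 0.42),
    ("econ_need:indoctrination", -14.2, 812.0),
]

suspects = flag_suspect_terms(example)
print(suspects)
print("implied ratio for -14.2:", math.exp(14.2))
```

A flagged term is not proof of separation, but it is a prompt to return to the cross tabulation for the variables involved.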
Supplemental Material
Supplemental material, rap_appendix for A warning on separation in multinomial logistic models by Scott J. Cook, John Niehaus and Samantha Zuhlke in Research and Politics
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplementary material
The supplementary files are available at http://journals.sagepub.com/doi/suppl/10.1177/2053168018769510. The replication files are available at:
.
Carnegie Corporation of New York Grant
This publication was made possible (in part) by a grant from Carnegie Corporation of New York. The statements made and views expressed are solely the responsibility of the author.
