Abstract
This paper intends to remind communication scientists that the indirect effect as estimated in mediation analyses is a statistical synonym for omitted variable bias (i.e. confounding or suppression). This simple fact questions the interpretability of statistically significant ‘indirect effects’ when using observational data: in social reality, all variables correlate with each other to some extent – the so-called ‘crud factor’ – which means that omitted variable bias and ‘indirect effects’ at the population level are virtually guaranteed regardless of the actual variables involved in the statistical mediation model. As a result, there can be no inferential link between the observation of a significant indirect effect and a theoretical claim of mediation. Through this argument, the paper hopes to add to the existing warnings on mediation analyses and cultivate a more critical interpretation of ‘indirect effects’ in communication science.
Establishing the working mechanisms and explanatory processes underlying communication phenomena is considered to be one of the most important goals in communication science (e.g. Holbert and Stephenson, 2003; Preacher and Hayes, 2008; Valkenburg et al., 2016). To contribute to this end researchers often resort to a statistical technique known as mediation analysis: a statistical mediation model defines the relationship between three sets of variables – a set of explanatory variables, consisting of (presumed) causal predictors and covariates, a set of mediators, and a set of dependent variables. In the simplest case this boils down to a model consisting of one observed independent variable x, one observed mediator m, and one observed dependent variable y (see Figure 1). The underlying theoretical hypothesis is that variable x indirectly influences y through m or, in other words, that variable m serves as the mediator of the effect of x on y. The associated statistical test proceeds by estimating the product of regression coefficients (

Figure 1. A simple three-variable mediation model.
A well-known problem: Significant indirect effects do not validate causal interpretations
Despite the popularity of mediation analyses, methodologists have long warned against their application, especially in research using non-experimental data (e.g. Bullock et al., 2010; Bullock and Ha, 2011; Kline, 2015). The reason for this is that the results of mediation analyses have little to say about the implied causality underlying a mediation hypothesis: the mediation analysis itself does nothing to establish temporal order or rule out third-variable explanations and, as such, significant indirect effects in observational studies should not be taken as direct evidence of a specific causal pathway (Fiedler et al., 2011; Kline, 2015). Of course, the basic distinction between causality and correlation is thoroughly acknowledged in limitation sections of communication research, and few researchers would literally claim to have found definitive evidence for a causal mechanism based on a significant ‘indirect effect’ alone. However, Chan et al. (2020) recently found that communication scholars do, in fact, use causal terminology to describe significant indirect effects, suggesting that mediation analyses are at least implicitly assumed to provide evidence about process-oriented hypotheses. This is also attested to by the fact that the widespread adoption of mediation has been explicitly applauded by communication scholars – as a sign of theoretical progress and methodological sophistication in the field (e.g. Perloff, 2013).
Unfortunately, there is good reason to argue that significant indirect effects often do not provide any evidence – not even tentative evidence – for process-related theoretical claims. This is not just true because there might be omitted confounders or alternative variable orders at play – an issue that has been acknowledged and discussed at length (Chan et al., 2020; Fiedler et al., 2011; Kline, 2015). Arguably, an even more fundamental issue is that, in many cases, the inferential test of the indirect effect is nothing short of a logical truism. This is also the point stressed throughout this paper: it is statistically guaranteed that any random constellation of three observed variables will generate a significant ‘indirect effect’ in a mediation model at some fixed but unknown sample size n. This is true regardless of whether or not the variables in the mediation model are actually related in any theoretically meaningful way, which means that there is little value at all in interpreting ‘significant’ products-of-coefficients from mediation analyses as meaningful evidence of some sort.
A lesser-known problem: Significant indirect effects are logical truisms
To understand this criticism it is enough to reiterate the basic (though rarely considered) fact that mediation is a statistical synonym for confounding or suppression (see also MacKinnon et al., 2000). This is easily seen when y is written as a function of x alone. Given (1) and (2) it holds that
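A minimal sketch of the identity in question, assuming the conventional linear three-variable specification. The symbols a, b, c and c' and the intercepts i_1, i_2 are assumptions introduced for illustration, and the lines below need not correspond one-to-one to the numbered equations referred to in the text:

```latex
\begin{align*}
m &= i_1 + a\,x + e_1 && \text{(mediator regressed on } x\text{)} \\
y &= i_2 + c'\,x + b\,m + e_2 && \text{(outcome regressed on } x \text{ and } m\text{)} \\
y &= (i_2 + b\,i_1) + (c' + ab)\,x + (b\,e_1 + e_2) && \text{(substituting the first line into the second)} \\
c &= c' + ab \quad\Longleftrightarrow\quad c - c' = ab && \text{(with } c \text{ the slope of } y \text{ on } x \text{ alone)}
\end{align*}
```

Read this way, the product ab is at once the 'indirect effect' and the amount by which the slope of x changes when m is left out of the model for y, which is the textbook expression for omitted variable bias in a linear model.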
The fact that omitted variable bias and indirect effects are mathematical synonyms has two logical implications. First, it means that there will always be omitted variable bias in a population-level model if m serves as a mediator of the x–y-relationship in the population. This makes conceptual sense, because for a variable to be a mediator it needs to explain shared variance between x and y. This reasoning was also made explicit in traditional mediation techniques such as Judd and Kenny’s (1981) causal-steps approach, where a change in the relationship between x and y after controlling for variable m was taken as evidence for statistical mediation. A second but less intuitive – and, arguably, less appealing – implication of equation (5) is that any variable m acting as a population-level confounder or suppressor of the x–y-relationship will serve as a statistical mediator in the population-level statistical model. Indeed, the population-level counterpart of equation (6) implies that there will be no mediation or omitted variable bias at the population level if and only if the product of population-level regression coefficients
While this fact might not come as a surprise to statistically versed readers, it seems important to reiterate as it underlines that, in many common cases, testing for indirect effects in communication research is nigh meaningless. In essence, the statistical test of the ‘indirect effect’ only serves to address the question: does variable m serve as an omitted variable – a confounder or suppressor – in the relationship between x and y? When posited like this, a test of statistical mediation might not appear all that interesting anymore, and it is even less so when we consider that the answer will nearly always be ‘yes’ when a study relies on observational data. The reason for this is known as the crud factor: ‘in the social sciences, everything is somewhat correlated with everything’ (Meehl, 1990: 108). All social, behavioral and personality variables are part of a complex, intractable constellation of interdependencies, which means that any randomly measured set of such variables can be expected to be at least somewhat interrelated at the population level (see also Orben and Lakens, 2020). But if it is true that virtually all relationships between observed variables at the population level are nonzero, then it follows that virtually no omitted variable bias in a finite linear constellation of variables will ever be literally zero. And if virtually no omitted variable bias in a finite linear constellation of variables will ever be literally zero, then, by equation (6), no so-called ‘indirect effect’ in a population-level statistical model will ever be equal to zero! This means that any ‘indirect effects’ coefficient
Again, it seems useful to underline that this criticism is different in spirit from more oft-heard cautions against mediation analyses in observational designs: the problem addressed here is not with the fact that
Does crud really render observational indirect effects meaningless?
While the crud-criticism fundamentally challenges the value of testing for indirect effects, one could formulate several rebuttals to it. First, not all readers will be convinced that the crud factor plays such a fundamental role in communication research. It has been argued, for instance, that the crud factor is an empirical hypothesis in itself, and that there is ‘no a priori reason to believe that one will always reject the null hypothesis at any given sample size’ (Mulaik et al., 1997: 80). One problem with this counterargument is that it is not feasible to ask for the crud factor to be tested (as it would require us to exhaust population-level observations, which is impossible). Another problem is that the assumption of there being a crud factor seems much more parsimonious than the assumption of there being no such thing: saying that a crud factor does not exist requires us to take seriously that exact, literal null parameters occur in social reality (and, in fact, that they are prevalent). For a correlation coefficient this would imply that exhausting the population of possible observations would have us end up with a correlation of exactly zero.
A little thought reveals a fact widely understood among statisticians: The null hypothesis, taken literally (and that's the only way you can take it in formal hypothesis testing), is always false in the real world. It can only be true in the bowels of a computer processor running a Monte Carlo study (and even then a stray electron may make it false). If it is false, even to a tiny degree, it must be the case that a large enough sample will produce a significant result and lead to its rejection.
But if this is true, then it necessarily follows that the inference of non-zero omitted variable bias (through the statistical rejection of the null hypothesis for an ‘indirect effect’) is a truism.
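The large-sample logic of this argument can be illustrated with a small simulation. The sketch below is not part of the original analysis. It assumes a hypothetical population in which x, m and y are linked only by crud-sized coefficients of 0.05, estimates the paths by ordinary least squares, and uses the first-order Sobel standard error as a simple stand-in for the bootstrap tests that are more common in practice. The function names `simulate_crud` and `sobel_z` are illustrative.

```python
import math
import random


def simulate_crud(n, crud=0.05, seed=1):
    """Draw n cases from a hypothetical population with crud-sized paths only."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    m = [crud * xi + rng.gauss(0, 1) for xi in x]
    y = [crud * xi + crud * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
    return x, m, y


def centered(values):
    """Subtract the mean so that sums of products act as (co)variances."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def sobel_z(x, m, y):
    """Return (a*b, z) using OLS paths and the first-order Sobel standard error."""
    n = len(x)
    xc, mc, yc = centered(x), centered(m), centered(y)
    sxx, smm, syy = dot(xc, xc), dot(mc, mc), dot(yc, yc)
    sxm, sxy, smy = dot(xc, mc), dot(xc, yc), dot(mc, yc)

    # Path a: slope of m regressed on x, with its standard error.
    a = sxm / sxx
    se_a = math.sqrt((smm - a * sxm) / (n - 2) / sxx)

    # Path b: slope of m when y is regressed on x and m together.
    det = sxx * smm - sxm ** 2
    b = (sxx * smy - sxm * sxy) / det
    c_prime = (smm * sxy - sxm * smy) / det
    rss = syy - c_prime * sxy - b * smy
    se_b = math.sqrt(rss / (n - 3) * sxx / det)

    ab = a * b
    se_ab = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    return ab, ab / se_ab


# Same hypothetical population, increasingly large samples.
results = {}
for n in (100, 1_000, 10_000, 100_000):
    results[n] = sobel_z(*simulate_crud(n))
    ab, z = results[n]
    print(f"n={n:>7}  ab={ab: .4f}  z={z: .2f}")
```

Under these assumptions the population value of the product of coefficients stays at 0.05 × 0.05 = 0.0025 throughout, while the test statistic grows roughly with the square root of n. The test therefore ends up rejecting the null at large n without the 'indirect effect' becoming any larger or more meaningful.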
Another rebuttal to the paper could be that the crud factor argument is actually a criticism of null-hypothesis testing in general, rather than a criticism of mediation analyses per se. That is certainly correct, but the crud factor poses problems that are particularly pressing in the context of mediation analyses. There are at least three reasons for this. First, as mentioned earlier, the mediation literature relies very heavily on causal terminology (indirect, direct, and total ‘effects’) that is typically avoided when describing correlational findings in observational research. Such use of terminology is reasonable as long as one sticks to the theoretical rationale guiding research efforts, but it becomes problematic when the rejection of a null value for a product of coefficients is also taken as an empirical corroborator for an ‘indirect effect’. With this in mind, it seems valuable to always entertain the literal interpretation of the product of coefficients as omitted variable bias: instead of interpreting
A second reason why the crud factor is pressing in the context of mediation analyses is that there is no commonly agreed-upon metric of effect size to evaluate the strength of an ‘indirect effect’. While various different metrics have been proposed, many of them are known to have statistical and conceptual issues, and a consensus on their application is yet to be found (see Lachowicz et al., 2018, for a discussion and promising development). This is unfortunate, because an estimate of effect size may help counter allegations of crud-factor relationships: crud-factor relations, being the outcome of a complex chain of interdependencies, are likely to be very small in size. For this reason, a relatively strong effect size already lends more evidence for some theoretically viable indirect effect rather than crud per se. This is why it seems crucial for researchers to substantively interpret at least some type of available effect size – the most accessible simply being the estimated products-of-coefficients in unstandardized or standardized form (the latter is known as the index of mediation: Preacher and Hayes, 2008). As noted above, the interpretation should focus on the reasons why the size of these coefficients can be thought to reflect corroborations of the indirect effects hypotheses rather than just crud-like omitted variable bias. This issue is currently not properly addressed in the communication literature; mediation in the field is simply inferred through the statistical significance of products-of-coefficients (Chan et al., 2020).
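To make the distinction between significance and size concrete, the sketch below rescales a product of coefficients into standard-deviation units. It assumes the standardized form is obtained by multiplying ab by the ratio of the standard deviation of x to that of y. The function name and all input values are hypothetical and chosen only for illustration.

```python
def standardized_indirect(a, b, sd_x, sd_y):
    """Rescale the product a*b to standard-deviation units of x and y.

    Assumed definition: ab * (sd_x / sd_y).
    """
    if sd_x <= 0 or sd_y <= 0:
        raise ValueError("standard deviations must be positive")
    return a * b * sd_x / sd_y


# Hypothetical estimates: a crud-sized pattern versus a substantial one.
crud_sized = standardized_indirect(a=0.05, b=0.05, sd_x=1.0, sd_y=1.0)
substantial = standardized_indirect(a=0.5, b=0.4, sd_x=1.0, sd_y=1.0)
print(f"crud-sized: {crud_sized:.4f}  substantial: {substantial:.4f}")
```

Both hypothetical patterns would pass a null hypothesis test in a sufficiently large sample, yet they differ in size by a factor of 80. This is the kind of information that a significance test alone does not convey.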
A third reason why the crud factor problem is pressing in the context of mediation analysis is that establishing statistical mediators is commonly believed to be a more impressive theoretical contribution than simply finding x–y-relationships (Hayes et al., 2011). It should be clear by now that this is not necessarily the case: the crud factor guarantees non-null ‘indirect effects’ in observational studies, so there is no principled reason to value research more if it reports some significant indirect effects. On the contrary, there is a fundamental risk for a literature relying too heavily on these types of findings: if mediation hypotheses passing null hypothesis tests are considered to be theoretical contributions in a field, then nearly all process-related conjectures – regardless of their validity or logical content – may be counted as theoretical contributions. As a result, the knowledge base of the discipline risks diverging into a patchwork of idiosyncratic ‘process models’ that are not necessarily meaningful and have no clear connection to one another (see also Rohrer et al., 2020). This clarifies an additional benefit of adopting a more critical attitude toward mediation analyses: if we no longer consider significant indirect effects as meaningful corroborators for process-related hypotheses, then we might prevent communication theory from further increasing “in complexity without increasing in explanatory power” (Lang, 2013: 14).
Conclusion: The scientific insignificance of significant indirect effects
In sum, the takeaway message of this paper is pretty straightforward: the communication literature should be much more critical toward the theoretical viability of significant ‘indirect effects’. While this recommendation is certainly not new, it deserves being reiterated – especially given that previous cautions do not appear to have changed scholarly practice all that much (as is reflected in Chan et al., 2020). The arguments raised in the current paper also used a different frame compared to previous discussions: rather than reiterating that mediation results can be plagued by third variables or reverse causal order, this paper unpacked a simple statistical argument to show that the significance of indirect effects is nothing but a truism. Hopefully, this alternative perspective can convince communication scholars that null rejections alone have very little to say about the theoretical viability of an indirect effect.
The only circumstance under which null hypothesis tests for indirect effects are valuable is when a study uses an experimental design with random assignment for both the independent variable and the mediator (see Bullock et al., 2010; Bullock and Ha, 2011): if subjects are randomly assigned to conditions for x and m, the crud factor is no longer a concern because randomization and manipulation break pre-existing dependencies between all variables in the mediation model. However, the same is not true when only the independent variable x is manipulated! While such a set-up breaks the crud-factor dependency between x and m, and between x and y, the relationship between m and y remains untouched. This means that as long as the manipulation of x has a non-zero population-level influence on m, a population-level indirect effect on y is again guaranteed. Under conditions where it is not feasible to manipulate both x and m – for instance, when a research question requires naturalistic observation – mediation analyses might still have their place. However, under such circumstances, researchers will need to provide a very sound theoretical and methodological argument to embed their interpretations – not just when it comes to the order of variables and their accounting for confounders (to address inverse causality and third variable explanations), but also when it comes to the size of the indirect effect (to address crud). An editorial requirement to always report directed acyclic graphs together with the sizes of the respective relationships would help in this regard (Pearl, 2009; Rohrer, 2018). In any case, the statistical significance of a product-of-coefficients alone should no longer be considered as a meaningful basis to evaluate process-related claims in communication science.
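The point about manipulating only x can be sketched with a small simulation. The set-up below is an assumption made for illustration and is not taken from the paper: x is randomly assigned, x has a genuine influence on m, and m and y are related only through a hypothetical unobserved common cause u, so that m has no causal effect on y at all. The function names and coefficient values are likewise illustrative.

```python
import random


def centered(values):
    """Subtract the mean so that sums of products act as (co)variances."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def ols_paths(x, m, y):
    """Return (a, b, c_prime, c) from the OLS fits m ~ x, y ~ x + m and y ~ x."""
    xc, mc, yc = centered(x), centered(m), centered(y)
    sxx, smm, sxm = dot(xc, xc), dot(mc, mc), dot(xc, mc)
    sxy, smy = dot(xc, yc), dot(mc, yc)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx                            # slope of m on x
    b = (sxx * smy - sxm * sxy) / det        # slope of m in y ~ x + m
    c_prime = (smm * sxy - sxm * smy) / det  # slope of x in y ~ x + m
    c = sxy / sxx                            # slope of x in y ~ x
    return a, b, c_prime, c


def simulate_randomized_x(n, seed=7):
    """Randomized x affects m; m and y share only an unobserved cause u."""
    rng = random.Random(seed)
    x = [float(rng.random() < 0.5) for _ in range(n)]  # random assignment
    u = [rng.gauss(0, 1) for _ in range(n)]            # crud: common cause
    m = [0.3 * xi + 0.4 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    y = [0.4 * ui + rng.gauss(0, 1) for ui in u]       # no x or m effect on y
    return x, m, y


a, b, c_prime, c = ols_paths(*simulate_randomized_x(50_000))
print(f"a={a:.3f}  b={b:.3f}  ab={a * b:.3f}  c'={c_prime:.3f}  c={c:.3f}")
```

In this assumed population the product ab converges to 0.3 × 0.16 / 1.16, or roughly 0.04, even though neither x nor m has any causal influence on y. Randomizing x secures the a path, but the b path still reflects whatever dependency exists between m and y.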
Footnotes
Acknowledgement
The author would like to thank soon-to-be-dr. Anneleen Meeus and all members of the DCC Statistics Meeting for their useful comments on earlier versions of the manuscript.
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Part of the work for this paper was done during a postdoctoral fellowship supported by the Research Foundation – Flanders under Grant 12J7619N.
