Abstract
Causal decomposition analysis is among the rapidly growing number of tools for identifying factors (“mediators”) that contribute to disparities in outcomes between social groups. An example of such mediators is college completion, which explains later health disparities between Black women and White men. The goal is to quantify how much a disparity would be reduced (or remain) if we hypothetically intervened to set the mediator distribution equal across social groups. Despite increasing interest in estimating disparity reduction and the disparity that remains, various estimation procedures are not straightforward, and researchers have scant guidance for choosing an optimal method. In this article, the authors evaluate the performance in terms of bias, variance, and coverage of three approaches that use different modeling strategies: (1) regression-based methods that impose restrictive modeling assumptions (e.g., linearity) and (2) weighting-based and (3) imputation-based methods that rely on the observed distribution of variables. The authors find a trade-off between the modeling assumptions required in the method and its performance. In terms of performance, regression-based methods operate best as long as the restrictive assumption of linearity is met. Methods relying on mediator models without imposing any modeling assumptions are sensitive to the ratio of the group-mediator association to the mediator-outcome association. These results highlight the importance of selecting an appropriate estimation procedure considering the data at hand.
A key objective of decomposition analysis is to identify risks or resources (“mediators”) that contribute to disparities between groups of individuals defined by social characteristics, such as race, ethnicity, gender, class, and sexual orientation. Examples of such mediators include incarceration, which explains the racial earnings gap among men (Western and Pettit 2005); socioeconomic status (SES), which explains the cardiovascular health (CVH) gap across race-gender groups (Lee, Park, and Boylan 2021); and the opportunity to learn, which explains the math achievement gap across racial/ethnic groups (Schmidt, Guo, and Houang 2021). The key to this approach is to single out contributing factors that could play a role in reducing such disparities across social groups.
Several studies have situated decomposition analysis within the counterfactual framework of causal inference. VanderWeele and Robinson (2014) first advanced the idea of focusing on observed social disparities rather than the causal effects of social groups such as race and gender. Jackson and VanderWeele (2018) further developed the approach and proposed various definitions of disparity reduction and disparity remaining using an interventional perspective (Nguyen, Schmid, and Stuart 2021). In this article, we use one of their definitions that quantifies the extent to which the observed disparity would be reduced or remain if we hypothetically intervened to set the mediator distributions equal between social groups among individuals with similar demographic backgrounds.
Another counterfactual approach to mediation is causal mediation analysis on the basis of natural direct and indirect effects (Pearl 2001; Robins 2003). We include a detailed discussion regarding the similarities and differences between these methods in the section “Relations to Causal Mediation Analysis,” but one notable difference is that causal decomposition analysis requires fewer assumptions than causal mediation analysis. One critical assumption required in causal mediation analysis, but not in causal decomposition analysis, is “no intermediate confounding.” Given that a myriad of factors contributes to social disparities in health outcomes, intermediate confounding (effects of social groups confounding the mediator-outcome relationship) is likely to occur in disparities research, so causal decomposition analysis has a substantial advantage over causal mediation analysis.
Despite this merit, allowing intermediate confounders in causal decomposition analysis adds a modeling burden, because the identification of disparity reduction and remaining depends on the conditional probability of intermediate confounders in addition to that of the mediator and outcome. To reduce the modeling burden, estimation methods for causal decomposition analysis use different strategies. In this study, we focus on six methods: two regression methods based on the difference-in-coefficients and the product-of-coefficients estimators (Jackson and VanderWeele 2018), two weighting methods based on ratio of mediator probability weighting (RMPW) and inverse odds ratio weighting (IORW) estimators (Jackson 2021), and two imputation methods based on the single-mediator imputation estimator (Lundberg 2022; Sudharsanan and Bijlsma 2021) and the multiple-mediator imputation estimator (Park, Qin, and Lee 2022). The two regression methods impose restrictive modeling assumptions such as linearity. The weighting and imputation methods rely on the observed distribution of variables instead of imposing a restrictive modeling assumption. Specifically, the weighting methods rely on the observed distribution of the outcome; the single-mediator imputation method relies on the observed distribution of the intermediate confounders; and the multiple-mediator imputation method relies on the observed distribution of the mediator. However, the consequences, in terms of performance, of relying on the modeling assumptions or observed distributions are poorly understood, particularly when combined with varying data conditions critical to mediation settings.
Therefore, the goal of this study is to review the modeling strategies of each method and assess their performance in various data conditions. Comparing the performance of methods that use different strategies to reduce the modeling burden is a focal point of our simulation study, which makes it distinct from other simulation studies that compare performance across traditional mediation methods (e.g., MacKinnon et al. 2002). To achieve this goal and for simplicity, this review and simulation study focuses on continuous outcomes.
To empirically ground our investigation, we examine disparities in CVH across race-gender groups using the Midlife Development in the United States (MIDUS) study. Specifically, we focus on the research question “To what extent would the CVH disparity between Black women and White men be reduced if the college completion rate was equal between the groups?” and we illustrate estimation methods in the context of this example.
Causal Decomposition Analysis: Review
In this section, we review causal decomposition analysis and then discuss several issues that emerge when one uses traditional mediation analysis or causal mediation analysis to study health disparities across socially defined groups.
Using the motivating example, we consider the setting in which the groups are Black women
Initial Disparity
We are interested in the CVH disparity between Black women and White men, controlling for age and genetic vulnerability. Formally, the initial disparity between Black women and White men is defined as
Disparity Reduction and Remaining
Once we observe the disparity between Black women and White men, we would also want to identify how to reduce the disparity, for example, by increasing Black women’s college completion rate to that of White men. We thus equalize the college completion rate between the groups among those with the same covariate level, because we consider potential differences in the college completion rate that arise through age (e.g., cohort differences) or genetic vulnerability as unrelated to disparities in the college completion rate across gender and racial groups. Then, conditional on baseline covariates, the disparity reduction is defined as the difference between the average CVH of Black women and their counterfactual CVH after setting their college completion rate equal to that of White men among those with the same baseline covariate level. Formally,
Interpretation of these conditional estimands would apply to a specific level of baseline covariates
To give causal interpretations to disparity reduction and remaining, we need to make the following identification assumptions:
All these assumptions are strong, and whether the assumptions are met or not depends on a substantive example. Assessing the plausibility of the assumptions is essential but beyond the scope of this study. Given the assumptions, disparity reduction and remaining are nonparametrically identified as
where
Relations to Traditional Mediation Analysis
Traditionally, decomposition analysis has been understood, formulated, and conducted within the linear framework on the basis of the difference-in-coefficients estimator (Freedman and Schatzkin 1992; Olkin and Finn 1995). For example, a seminal work by Fryer (2011) used this difference-in-coefficients approach to examine whether controlling for test scores reduced observed racial/ethnic disparities in wages, unemployment, incarceration, and health. In the context of our motivating example and on the basis of the traditional difference-in-coefficients estimator, disparity reduction is estimated by
The disparity reduction and remaining obtained from this traditional approach differ from those defined at the beginning of this section. Specifically, if identification assumptions A1, A2, and A3 are met,
Another widely used approach in decomposition is the Kitagawa-Oaxaca-Blinder (KOB) decomposition (Kitagawa 1955; Oaxaca 1973; Blinder 1973). The KOB is often used to decompose social disparities in an outcome into an explained portion (attributable to the groups having different means for the mediator) and an unexplained portion. Hou (2014) extended the KOB decomposition to address intermediate confounders using a regression-based mediation analysis framework. His article shows the equivalence between product-of-coefficients and difference-in-coefficients estimators when no exposure-mediator interaction exists.
The explained and unexplained portions defined in the KOB decomposition correspond to disparity reduction and disparity remaining, respectively. The only difference is that the typical KOB decomposes a marginal disparity, that is, one that is not conditional on baseline covariates
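As an illustration only, a twofold decomposition with a single mediator and no covariates can be sketched in Python as follows. The function names, the toy data, and the choice to value the gap in mediator means at group A's slope are assumptions of this sketch; they are not taken from the cited work, and other choices of reference coefficients are equally valid.

```python
import statistics


def ols_simple(x, y):
    """Return (intercept, slope) of a one-predictor least-squares fit."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def kob_twofold(m_a, y_a, m_b, y_b):
    """Split mean(y_a) - mean(y_b) into explained and unexplained parts.

    Assumption of this sketch: the gap in mediator means is valued at
    group A's slope, and the remainder is labeled unexplained.
    """
    a_a, b_a = ols_simple(m_a, y_a)
    a_b, b_b = ols_simple(m_b, y_b)
    gap = statistics.fmean(y_a) - statistics.fmean(y_b)
    explained = b_a * (statistics.fmean(m_a) - statistics.fmean(m_b))
    unexplained = (a_a - a_b) + statistics.fmean(m_b) * (b_a - b_b)
    return gap, explained, unexplained


# Hypothetical toy data: group A has lower mediator values and a flatter slope.
gap, explained, unexplained = kob_twofold(
    m_a=[0, 1, 2, 3], y_a=[1, 2, 3, 4],
    m_b=[1, 2, 3, 4], y_b=[3, 5, 7, 9],
)
print(gap, explained, unexplained)
```

Because each least-squares line passes through its group's means, the explained and unexplained parts add up to the raw gap by construction.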
Relations to Causal Mediation Analysis
One popular counterfactual approach to mediation is causal mediation analysis on the basis of natural direct and indirect effects (Pearl 2001; Robins 2003). Bauer and Scheim (2019) adopted this approach and applied VanderWeele’s (2013) three-way decomposition method for disparities research. However, prior literature (e.g., Jackson and VanderWeele 2019; Lundberg 2022; Park et al. 2022) argues that causal decomposition analysis is preferable to causal mediation analysis when studying contributing factors to disparities, for the following three reasons.
First, causal decomposition analysis adopts the framework of a descriptive disparity, focusing on estimating the causal effects of manipulable factors rather than social groups. In our example, we are interested in the causal effect of college completion in reducing a CVH disparity but are agnostic about the causal effect of social groups on CVH. This framework circumvents the issue of assigning counterfactual outcomes to nonmanipulable factors such as race and gender. In contrast, causal mediation analysis is often applied to settings in which a manipulated treatment affects an outcome. Hence, it focuses on estimating the causal effects of the treatment as well as the mediator.
Second, causal decomposition analysis is based on interventional effects (Didelez, Dawid, and Geneletti 2012), which provide a straightforward interpretation of direct and indirect effects defined in disparities research. If causal mediation analysis were applied to our example, natural indirect effects would compare each Black woman’s CVH with the potential CVH outcome of that same woman after setting her mediator (college completion status) to the value that would have naturally resulted had she been born a White man. Considering this potential outcome is somewhat strange because a Black woman cannot be reborn with a different race-gender status and then experience the mediator. In contrast, disparity reduction computes the difference between the average Black woman’s CVH and the average counterfactual outcome of Black women after hypothetically intervening to equalize the college completion rate between groups. Compared with natural indirect effects, disparity reduction is more straightforward to interpret.
Third, identifying disparity reduction requires a weaker assumption than natural indirect effects. Identifying natural indirect effects requires no omitted confounding in the (1) exposure-outcome, (2) exposure-mediator, and (3) mediator-outcome relationships. In identifying disparity reduction, no counterfactual outcome is assigned to social groups, so we do not need to assume there is no omitted confounding in the exposure-outcome and exposure-mediator relationships. Most importantly, natural indirect effects require an additional assumption, that is, no intermediate confounding (Pearl 2009) or no interaction in the group-mediator relationship at the individual level (Robins 2003). Each assumption is restrictive and unrealistic, and neither assumption is met in our example. Early-life adversity (childhood SES and abuse) can affect the risk for dropping out of college and CVH, and the effect of college completion on CVH might vary by social group. In contrast, disparity reduction on the basis of interventional indirect effects requires neither assumption. That is, observed intermediate confounders are allowed, and interaction in the group-mediator relationship is allowed.
However, estimating disparity reduction and remaining in causal decomposition analysis is challenging because of these added intermediate confounders. As shown in equation (1), the identification result for disparity reduction and remaining depends on the conditional probability of intermediate confounders given the group status and baseline covariates. This dependence implies that, unless we either make restrictive modeling assumptions or rely on the observed distribution of variables, intermediate confounders need to be modeled to estimate disparity reduction and remaining.
Estimation Methods
The following section details the estimation procedure of each method and how each method addresses the modeling burden of intermediate confounders. To estimate disparity reduction and remaining conditional on
Regression-Based Approaches
Difference-in-Coefficients Method
The estimation procedure requires modeling the following four successive outcome models regressed on group status and baseline covariates, and additionally intermediate confounding (childhood SES
where
Given equations (3) to (6), disparity reduction is estimated as
Note that
Product-of-Coefficients Method
A product-of-coefficients approach is obtained by posing a model for the causal KOB decomposition discussed by Jackson and VanderWeele (2018). There are different ways to pose a model to estimate disparity reduction and disparity remaining. For example, Jackson and VanderWeele require modeling intermediate confounders in addition to the mediator and the outcome (see page 17 of their appendix).
Here, we implement the causal KOB decomposition by only fitting the mediator and outcome models, which has an advantage in terms of reducing the modeling burden. For illustration, we assume the mediator is continuous (education) and we will show that the method can address a discrete mediator (college completion status). The estimation procedure requires modeling the following mediator and outcome models as
where
Alternatively, as shown in Jackson and VanderWeele’s (2018) appendix, we can estimate the disparity remaining by modeling intermediate confounders as
This product-of-coefficients estimator allows a group-mediator interaction, and it can be easily modified to address binary mediators. For instance, a logistic or probit regression should be fitted for a binary mediator as
Weighting-Based Approaches
Jackson (2021) proposed two weighting-based estimators on the basis of adaptation of the RMPW and IORW estimation. These estimators were originally developed in the causal mediation literature by Hong, Deutsch, and Hill (2015) and Tchetgen Tchetgen (2013), respectively.
RMPW
The RMPW estimator can be applied to a single discrete mediator. The following estimation procedure relies on two mediator models in which any linear and nonlinear relationships are allowed (steps 1 and 2) while using the observed distribution of the outcome (step 3).
Fit a mediator model, regressing college completion status on baseline covariates among White men
Fit another mediator model, regressing college completion status on baseline covariates and the intermediate confounders (childhood SES and abuse) among Black women
Calculate the average CVH
The disparity reduction is estimated as
IORW
IORW can also be applied to a single discrete mediator. The estimation procedure relies on two mediator models (step 1) and four exposure models (steps 2 and 3) while using the observed distribution of the outcome. The procedure is similar to RMPW, so we briefly describe the estimation procedure here.
Fit two mediator models regressing college completion status on baseline covariates and intermediate confounders. On the basis of the two fitted models, compute the predicted probabilities of
Fit two exposure models regressing group status on baseline covariates and intermediate confounders. On the basis of the two fitted models, compute the predicted probabilities of being in a specific group as
Fit two exposure models regressing group status on college completion status and baseline covariates and intermediate confounders. On the basis of the two fitted models, compute the predicted probabilities of being in a specific group as
The remaining steps are the same as with the RMPW estimator, except the weight is given as
One advantage of these weighting estimators is their flexibility to accommodate linear and nonlinear relationships, as the form of the estimators does not change with the fitted models. However, these estimators address discrete mediators only, as most weighting-based approaches do not work well with continuous variables. Also, weighting-based approaches generally yield larger standard errors than regression-based approaches (VanderWeele 2010).
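The three RMPW steps described earlier in this section can be sketched in Python for a single baseline covariate stratum. This is a minimal illustration under stated assumptions, not the authors' implementation: cell frequencies stand in for the two fitted mediator models, the weight is assumed to be the ratio of the reference-group mediator probability to the target-group mediator probability evaluated at each person's observed mediator value, and all names and data are hypothetical.

```python
from collections import defaultdict


def rmpw_counterfactual_mean(target_rows, reference_rows):
    """Weighted mean outcome of the target group within one covariate stratum.

    Each row is a dict with keys "x" (intermediate confounder level),
    "m" (binary mediator, 0 or 1) and "y" (outcome).
    """
    # Step 1: mediator probability in the reference group (stand-in for model 1).
    p_ref = sum(r["m"] for r in reference_rows) / len(reference_rows)

    # Step 2: mediator probability in the target group by confounder level
    # (stand-in for model 2).
    count, ones = defaultdict(int), defaultdict(int)
    for r in target_rows:
        count[r["x"]] += 1
        ones[r["x"]] += r["m"]

    # Step 3: weight each target-group member by the ratio of the two
    # probabilities of the mediator value that person actually has.
    weighted_sum = weight_total = 0.0
    for r in target_rows:
        p_tgt = ones[r["x"]] / count[r["x"]]
        numerator = p_ref if r["m"] == 1 else 1.0 - p_ref
        denominator = p_tgt if r["m"] == 1 else 1.0 - p_tgt
        weight = numerator / denominator
        weighted_sum += weight * r["y"]
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical toy data for one covariate stratum.
target = [
    {"x": 0, "m": 1, "y": 10}, {"x": 0, "m": 0, "y": 6},
    {"x": 1, "m": 0, "y": 4}, {"x": 1, "m": 0, "y": 2}, {"x": 1, "m": 1, "y": 8},
]
reference = [
    {"x": 0, "m": 1, "y": 9}, {"x": 0, "m": 1, "y": 11},
    {"x": 1, "m": 1, "y": 10}, {"x": 1, "m": 0, "y": 8},
]

observed_target = sum(r["y"] for r in target) / len(target)
observed_reference = sum(r["y"] for r in reference) / len(reference)
counterfactual = rmpw_counterfactual_mean(target, reference)

reduction = observed_target - counterfactual      # observed minus counterfactual
remaining = counterfactual - observed_reference   # counterfactual minus reference
print(reduction, remaining)
```

In this sketch the reduction and the remaining disparity sum to the initial disparity between the two groups. In practice the cell frequencies would be replaced by fitted mediator models, and confidence intervals would come from bootstrapping the whole procedure.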
Imputation-Based Approaches
Single-Mediator Imputation Method
Sudharsanan and Bijlsma (2021) and Lundberg (2022) proposed an estimator on the basis of the parametric g-formula (Robins 1986). Their algorithm predicts potential outcomes by randomly drawing values of mediators and outcomes from probability distributions. To address the uncertainty associated with this procedure, random draws for mediators and outcomes are conducted hundreds or even thousands of times. Combined with bootstrapping, the algorithm requires substantial computational power and time. Here, we extend this approach by using a predicted mediator value for continuous mediators (and a randomly drawn value from the mediator distribution for binary mediators), and we directly address the uncertainty associated with predicting a mediator value by bootstrapping rather than randomly drawing from mediator probability distributions multiple times. Although the difference is minor, it substantially reduces the computational power and time required. The following estimation procedure relies on modeling a mediator and an outcome in which any linear and nonlinear relationships are allowed (steps 1 and 2) while using the observed distribution of the intermediate confounder (step 2).
Fit a mediator model, regressing the mediator (college completion status) on group status and baseline covariates. Using the coefficients from the fitted model, we compute the predicted value of the mediator for each subject (denoted as
Fit an outcome model, regressing CVH on group status, intermediate confounder, mediator, and baseline covariates as
The predicted outcome values obtained from step 2 will be averaged over
The disparity reduction is estimated as
This single-mediator imputation estimator is flexible in addressing linear and nonlinear relationships as well as discrete and continuous mediators and outcomes.
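The logic of the single-mediator imputation steps can be sketched in Python for one baseline covariate stratum. Several simplifications here are assumptions of the sketch rather than the procedure described above: cell frequencies and cell means stand in for the fitted mediator and outcome models, the mediator is binary, and the expected outcome over both mediator values is used in place of a random mediator draw. All names and data are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def single_mediator_counterfactual(target_rows, reference_rows):
    """Counterfactual mean outcome of the target group in one covariate stratum.

    Each row is a dict with keys "x" (intermediate confounder level),
    "m" (binary mediator, 0 or 1) and "y" (outcome).
    """
    # Step 1: mediator "model": probability of m = 1 in the reference group.
    p_ref = fmean([r["m"] for r in reference_rows])

    # Step 2: outcome "model": mean outcome in the target group by (x, m).
    cells = defaultdict(list)
    for r in target_rows:
        cells[(r["x"], r["m"])].append(r["y"])
    mu = {key: fmean(values) for key, values in cells.items()}

    # Step 3: keep each target member's observed confounder level, replace
    # the mediator by the reference-group distribution, and average.
    predictions = [
        p_ref * mu[(r["x"], 1)] + (1.0 - p_ref) * mu[(r["x"], 0)]
        for r in target_rows
    ]
    return fmean(predictions)


# Hypothetical toy data for one covariate stratum.
target = [
    {"x": 0, "m": 1, "y": 10}, {"x": 0, "m": 0, "y": 6},
    {"x": 1, "m": 0, "y": 4}, {"x": 1, "m": 0, "y": 2}, {"x": 1, "m": 1, "y": 8},
]
reference = [{"m": 1}, {"m": 1}, {"m": 1}, {"m": 0}]

counterfactual = single_mediator_counterfactual(target, reference)
reduction = fmean([r["y"] for r in target]) - counterfactual
print(counterfactual, reduction)
```

The sketch requires every combination of confounder level and mediator value to be observed in the target group; with regression models in place of cell means, that requirement becomes a modeling assumption instead.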
Multiple-Mediator Imputation Method
Park et al. (2022) proposed the multiple-mediator imputation estimator by adopting the result in VanderWeele and Vansteelandt (2014), which was originally developed for causal mediation analysis. Park et al. (2022) developed this estimator to address the case of intervening on multiple mediators simultaneously, which is useful when the causal ordering of the mediators cannot be easily determined. Although the method is for multiple mediators, it can also address a single mediator. The following estimation procedure relies on modeling the intermediate confounders and the outcome in which any linear and nonlinear relationships are allowed (steps 1 and 2) while using the observed distribution of the mediator (step 2).
Fit a confounder model, regressing each intermediate confounder (childhood SES and abuse) on group status and baseline covariates. Using the coefficients from the fitted model, we compute a predicted value of each confounder for each subject (denoted as
Fit an outcome model, regressing CVH on social groups, intermediate confounders, mediator, and baseline covariates as
The predicted outcome values obtained from step 2 will be averaged over
The disparity reduction is estimated as
This multiple-mediator imputation estimator is highly flexible because it can address (1) any nonlinear terms, (2) multiple mediators and a single mediator, and (3) different variable types of mediators and outcomes.
However, depending on the causal structure of variables, there could be more burden in correctly specifying models than in the single-mediator imputation method. This estimator requires modeling intermediate confounders instead of mediators. From a modeling perspective, this estimator is advantageous only when the number of mediators exceeds or equals the number of intermediate confounders.
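The contrast with the single-mediator approach can be sketched in Python for one baseline covariate stratum. This is an illustration under assumptions, not the published algorithm: the intermediate confounder is modeled through its observed frequencies in the target group, the outcome model is a table of cell means, and predicted outcomes are averaged over the mediator values observed in the reference group. The mediator key can be any hashable value, so a tuple of several mediators would also work. All names and data are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import fmean


def multiple_mediator_counterfactual(target_rows, reference_rows):
    """Counterfactual mean outcome of the target group in one covariate stratum.

    Target rows have keys "x" (intermediate confounder level), "m" (mediator
    value or tuple of values) and "y" (outcome); reference rows need "m" only.
    """
    # Step 1: confounder "model": distribution of x in the target group.
    x_counts = Counter(r["x"] for r in target_rows)
    n_target = len(target_rows)
    p_x = {x: c / n_target for x, c in x_counts.items()}

    # Step 2: outcome "model": mean outcome in the target group by (x, m).
    cells = defaultdict(list)
    for r in target_rows:
        cells[(r["x"], r["m"])].append(r["y"])
    mu = {key: fmean(values) for key, values in cells.items()}

    # Step 3: keep each reference member's observed mediator value, average
    # the predicted target-group outcome over the confounder distribution.
    predictions = [
        sum(p * mu[(x, r["m"])] for x, p in p_x.items())
        for r in reference_rows
    ]
    return fmean(predictions)


# Hypothetical toy data for one covariate stratum.
target = [
    {"x": 0, "m": 1, "y": 10}, {"x": 0, "m": 0, "y": 6},
    {"x": 1, "m": 0, "y": 4}, {"x": 1, "m": 0, "y": 2}, {"x": 1, "m": 1, "y": 8},
]
reference = [{"m": 1}, {"m": 1}, {"m": 1}, {"m": 0}]

counterfactual = multiple_mediator_counterfactual(target, reference)
reduction = fmean([r["y"] for r in target]) - counterfactual
print(counterfactual, reduction)
```

The design choice to illustrate is where the modeling effort goes: here the confounder distribution is modeled and the mediator distribution is taken as observed, which is the reverse of the single-mediator approach.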
Simulation Study
Weighting- or imputation-based methods are generally more flexible than regression-based methods because no restrictive modeling assumptions are required. However, this flexibility comes at the cost of relying on the observed distribution of variables. We conducted a simulation study to assess the performance of the methods, either relying on the observed distribution of variables or imposing restrictive modeling assumptions with various data conditions to help researchers choose an optimal method given the data at hand. For simplicity, we refer to difference-in-coefficients, product-of-coefficients, RMPW, IORW, single-mediator imputation, and multiple-mediator imputation as estimators 1, 2, 3, 4, 5, and 6, respectively, in this section. Table 1 shows the summary of available estimation methods depending on conditions.
Summary of Available Methods
Note: Estimators are as follows: 1, difference-in-coefficients; 2, product-of-coefficients; 3, ratio of mediator probability weighting; 4, inverse odds ratio weighting; 5, single-mediator imputation; and 6, multiple-mediator imputation.
Data Generation
To generate synthetic data that mimics real data, we use the distribution of each variable in the MIDUS data used for the motivating example, which contains the group status
Specifically, we create a binary treatment
For the binary mediator case, we use the same procedure, but dichotomize
On the basis of the synthetic data, we compute the true average outcome values for
Coefficient Values for Each Scenario and Corresponding Parameters
Note:
Simulation Setting
We consider three conditions critical to mediation settings: type of mediator, sample size, and the ratio between the
Last, an important condition that we vary for each fixed sample size is the ratio between the
The ratio is defined as
Thus, we consider 15 scenarios with different
In this study, we used the following metrics to compare the performance of each estimation method: relative bias, the normalized root mean squared errors (nRMSEs), and 95 percent confidence interval coverage using the percentile bootstrap method (Efron 1982) with the number of bootstrap replicates of 1,000. The relative bias measures the difference between the average of the estimates and the true value relative to the true value. The nRMSE measures the square root of the average squared difference between the estimate and the true value relative to the true value. For each scenario, we make 1,000 replicates of the sample from the population, and the performances are averaged over the 1,000 repetitions. The coverage rate for the 95 percent confidence interval is defined as the proportion of replications where the true value is covered by the 95 percent confidence interval out of 1,000 replications.
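The performance metrics described above can be sketched in Python as follows. The function names are hypothetical, the nRMSE is scaled by the absolute true value (an assumption that matters only when the true value is negative), and the percentile interval uses one of several common quantile conventions.

```python
import math
import statistics


def relative_bias(estimates, truth):
    """(Mean of the estimates minus the true value), relative to the true value."""
    return (statistics.fmean(estimates) - truth) / truth


def nrmse(estimates, truth):
    """Root mean squared error, scaled by the absolute true value."""
    mse = statistics.fmean([(e - truth) ** 2 for e in estimates])
    return math.sqrt(mse) / abs(truth)


def percentile_ci(bootstrap_estimates):
    """95 percent percentile interval: the 2.5th and 97.5th percentiles."""
    cuts = statistics.quantiles(bootstrap_estimates, n=40, method="inclusive")
    return cuts[0], cuts[-1]


def coverage_rate(intervals, truth):
    """Share of (lower, upper) intervals that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= truth <= upper)
    return hits / len(intervals)


# Hypothetical usage with made-up numbers.
estimates = [1.1, 0.9, 1.3]
print(relative_bias(estimates, 1.0))
print(nrmse([1.0, 3.0], 2.0))
print(coverage_rate([(0.0, 1.0), (2.0, 3.0)], 0.5))
```

In a simulation, `percentile_ci` would be applied to the bootstrap estimates within each replicate, and `coverage_rate` to the resulting intervals across replicates.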
Simulation Results
The simulation results for a continuous and binary mediator are summarized in Figures 1 and 2, respectively. In the figures, we present the relative bias (first row), nRMSE (second row), and 95 percent confidence interval coverage (third row) for disparity reduction and remaining. Each column represents a different estimator. The x-axis represents the ratio and the

Performance of disparity reduction (A) and disparity remaining (B) with a continuous mediator.

Performance of disparity reduction (A) and disparity remaining (B) with a binary mediator.
Figure 1 (continuous mediator) shows the performance of estimators 1, 2, 5, and 6. Estimators 3 and 4 are not considered because they are only available for a binary mediator. Estimators 1, 2, and 6 perform well with a medium or large sample size
In contrast, estimator 5 does not perform well in bias and coverage when the ratio is less than 1. The coverage rate of estimator 5 exceeds 0.98 with ratios less than 1 even with the sample size of 1,000, which implies that estimator 5 is inefficient, with inflated standard errors that appear here as overly wide confidence intervals. In addition, the bias for a small sample size
Figure 2 (binary mediator) shows that estimators 1, 2, and 6 for disparity reduction perform well in terms of bias, variance, and coverage with a medium or large sample size
For estimators 3 and 4, we observe a low coverage rate with ratios less than 0.5. For example, with a sample size of 1,000 and a ratio of 0.1, the coverage rates of estimators 3 and 4 are only 0.53 and 0.65, respectively. These low coverage rates are due to narrow confidence intervals. Estimator 3 is advantageous in terms of modeling perspective because it only requires modeling two mediator models. However, the simulation result suggests this modeling advantage comes at the cost of low coverage when the ratio is small. We also observe a large bias (22.6 percent of the true value) and low coverage (0.64) for estimator 3 with a sample size of 100 and a ratio of 0.9.
For estimator 5, we observe a high coverage rate with ratios less than 0.5. For example, with a sample size of 1,000 and a ratio of 0.1, the coverage rate of estimator 5 is 0.98. These high coverage rates are due to the wide confidence intervals produced by the estimator, and this pattern remains consistent for continuous and binary mediators.
The pattern is similar with disparity remaining estimators, but there are two notable differences. First, estimator 4 has a better coverage rate for disparity remaining, achieving the nominal level across all sample sizes and ratios. Second, estimator 3 shows a high coverage rate for ratios less than 0.5. For example, with a sample size of 1,000 and a ratio of 0.1, estimator 3 has a coverage rate of 0.99. These high coverage rates are due to the wide confidence intervals produced by the estimator.
In addition to these 15 scenarios, we present another set of simulation studies under the same model specification but with standardized variables in part D of the online supplement. Although the relative bias and nRMSEs vary slightly, the low and high coverage issues of the weighting (estimators 3 and 4) and single-mediator imputation (estimator 5) methods persist, even after standardizing the variables.
In summary, we find a trade-off between the modeling assumptions required and performance in terms of bias, variance, and coverage. Methods that require a restrictive assumption perform best if the assumption is met (e.g., estimators 1 and 2). The performance of methods that do not require any restrictive modeling assumption but rely on modeling the mediator (estimators 3, 4, and 5) is sensitive to the ratio of the
Application
Choosing between Methods
On the basis of our review of the methods and the simulation study, we provide recommendations for selecting an optimal method. We illustrate the practice of choosing an optimal method using the motivating example. Our research question is, to what extent would the CVH disparity be reduced if we increased the college completion rate of Black women to the level of White men among individuals with the same age and genetic vulnerability? The mediator is college completion status and the outcome is CVH, with higher values indicating better CVH (mean
Which method should be used among these multiple options? In our case, the sample size is 1,978, and the ratio is 0.319. Given the sample size and ratio, the simulation study suggests the product-of-coefficients and the multiple-mediator imputation methods should work well. If investigators are willing to assume no other nonlinear terms except for the group-mediator interaction, the product-of-coefficients method should be considered. If other nonlinear terms are modeled, the imputation method should be considered. The RMPW and IORW methods are also available options, but caution is required as the confidence interval for disparity reduction obtained from nonparametric bootstraps may be narrower than expected for ratios smaller than 0.5.
Summary of Findings From the Working Example
Table 3 shows estimates for disparity reduction and remaining obtained from different estimation methods. We begin by noting that the initial disparity for Black women compared with White men is
Estimates of the Disparity Reduction and Disparity Remaining for Black Women versus White Men
Note: Estimators are as follows: 2, product-of-coefficients estimator; 3, ratio of mediator probability weighting estimator; 4, inverse odds ratio weighting estimator; 5, single-mediator imputation estimator; 6, multiple-mediator imputation estimator. Baseline covariates are mean-centered. CI = confidence interval.
The estimand
In this example, the statistical uncertainty reflected in the 95 percent confidence interval is greater than the variability in the estimate across different methods. Moreover, the same conclusion is derived from different estimation methods. Yet it is important to note that a different conclusion could be derived depending on estimation methods, particularly when the sample size is small or when the ratio is even smaller than 0.319.
Discussion
Estimation of disparity reduction and remaining is challenging because of the added burden of modeling intermediate confounders. Therefore, it is crucial to use an estimation method that reduces the modeling burden while maintaining good performance. Using both simulation and real-data examples, this article investigated the performance of six methods for estimating disparity reduction and remaining that use different strategies to reduce modeling burdens. We found that the methods imposing a restrictive modeling assumption perform best as long as the assumption is satisfied. For instance, with a continuous mediator, the regression-based estimators provide a precise estimate with the 95 percent coverage rate reaching the nominal level.
The other estimators use the observed distribution of variables. Of these, the weighting (RMPW and IORW) and single-mediator imputation estimators rely on modeling a mediator. The weighting methods for binary mediators perform poorly when the group-mediator association is smaller than half the size of the mediator-outcome association. A low coverage rate of the weighting estimators obtained from nonparametric bootstraps with ratios less than 0.5 is particularly worrisome as it could inflate the type 1 error rate. The single-mediator imputation estimator provides a high coverage rate when the ratio is less than 1 (continuous mediator) or 0.5 (binary mediator), which could inflate the type 2 error rate. In contrast, the multiple-mediator imputation estimator relies on modeling intermediate confounders, and thus the performance does not depend on the ratio. However, the performance of this estimator could be affected by the ratio for
There are several limitations to our study that could drive future research. First, this study only addresses one way of defining disparity reduction and remaining. A different definition of disparity reduction and remaining exists (Jackson 2021; Jackson and VanderWeele 2018; Lundberg 2022), and the performance of estimation methods for different definitions is unknown. Therefore, the simulation study could be extended to an alternative definition of disparity reduction and remaining. Second, we used the ratio metric as an important condition in our simulation study, but the metric is useful only for continuous and binary mediators. Should categorical mediators with more than two discrete values be used, the metric must be redefined, and the performance of methods should be reexamined. Finally, the current study only addresses issues of estimating disparity reduction and remaining when the identification assumptions, such as no omitted confounding, are met. However, the assumptions are strong, and thus they may not be met in many empirical settings. Therefore, it is crucial to examine whether identification assumptions are met, as the bias due to violations of identification assumptions could be more extensive than that due to modeling assumptions.
Supplemental Material
Supplemental material, sj-pdf-1-smx-10.1177_00811750231183711 for Choosing an Optimal Method for Causal Decomposition Analysis with Continuous Outcomes: A Review and Simulation Study by Soojin Park, Suyeon Kang and Chioun Lee in Sociological Methodology
Footnotes
Acknowledgements
We are thankful to the editor and the three anonymous reviewers for their valuable feedback in enhancing the quality of our article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by a grant from the American Educational Research Association, which receives funds for its AERA Grants Program from the National Science Foundation under award NSF-DRL #1749275. Opinions reflect those of the authors and do not necessarily reflect those of the American Educational Research Association or the National Science Foundation.
Data Accessibility Statement
Supplemental Material
Supplemental material for this article is available online.
