
Abstract

Bayesian methods are increasingly in demand in clinical and public health comparative effectiveness research. Limited literature has explored parametric Bayesian causal approaches to handle time-dependent treatment and time-dependent covariates. In this article, building on the work on Bayesian g-computation, we propose a fully Bayesian causal approach, implemented using latent confounder classes which represent the patient’s disease and health status. Our setting is suitable when the latent class represents a true disease state that the physician is able to infer without misclassification based on manifest variables. We consider a causal effect that is confounded by the visit-specific latent class in a longitudinal setting and formulate the joint likelihood of the treatment, outcome and latent class models conditionally on the class indicators. The proposed causal structure with latent classes features dimension reduction of time-dependent confounders. We examine the performance of the proposed method using simulation studies and compare the proposed method to other causal methods for longitudinal data with time-dependent treatment and time-dependent confounding. Our approach is illustrated through a study of the effectiveness of intravenous immunoglobulin in treating newly diagnosed juvenile dermatomyositis.

Keywords

Bayesian estimation, causal inference, longitudinal data, latent class

1. Introduction

Many studies aim to estimate treatment effectiveness using longitudinal observational data obtained from disease registries and administrative databases. Such a design reflects real-world clinical practice and simplifies the data collection process without the expense, intensive planning and complex infrastructure required by randomized controlled trials.^1,2 Treatment assignment under this design typically follows a clinician-driven decision process, where the treating clinician determines the treatment option using available demographic and past and current medical records.

There is an increased demand for Bayesian causal methods in applied comparative effectiveness research of rare diseases.^3–5 The Bayesian method provides a probability summary of the treatment effectiveness which can be communicated more easily with patients and their families and allows investigators to assess treatment effectiveness under various clinical expert beliefs.

In recent years, several Bayesian approaches, such as Bayesian propensity score analysis (BPSA), Bayesian marginal structural models (BMSMs) and Bayesian g-computation, and nonparametric/semiparametric Bayesian approaches for dynamic treatment regimes have been proposed to handle time-dependent treatment.^6–13 In particular, BMSM and BPSA are not conventional full Bayesian approaches – the treatment assignment process under the ‘design’ stage is separated from the outcome process under the ‘analysis’ stage.^7,9 These methods implement a ‘feedback-cutting’ strategy to ensure parameters from the treatment and outcome models are not estimated simultaneously. Both BPSA and BMSM employ Bayesian posterior predictive inference for estimating the inverse probability of treatment weighting (IPTW) weights (or importance sampling weights in the case of BMSM). Bayesian g-computation,¹¹ unlike standard g-computation, features joint modelling of the outcome, treatment and covariate processes where all processes are parametrically specified. This can be computationally intractable with a large set of time-dependent confounders without the use of shrinkage priors and other Bayesian variable selection methods. We direct readers to Li et al.¹⁴ for a comprehensive review of Bayesian causal inference frameworks and methods.

Previously, methodology for handling latent confounders has been largely focused on methods to overcome violation of the no unmeasured confounding assumption.^15–26 The use of latent variables in Bayesian causal modelling has been predominantly explored in the context of Bayesian sensitivity analysis for unmeasured confounding in a point-treatment setting^19,22,24,27; here latent variables quantify the impact of the unmeasured confounder on the treatment and outcome. The goal of the Bayesian sensitivity analysis is to estimate the bias-corrected causal effects. Additionally, causal modelling with latent class analysis has been explored to quantify the effect of an exposure on the latent outcome class membership.^28,29 Our proposed Bayesian method differs from these works in three ways: first, we allow the latent confounder class to be visit-dependent; second, we use this visit-specific latent class as a means of modelling a large set of measured covariates in order to characterize time-dependent confounding; and third, we conduct our Bayesian inference using a joint likelihood for outcome, treatment and confounders.

The objective of this work is to propose a fully Bayesian causal model and estimation method that features dimension reduction of time-dependent confounders and at the same time enables the joint modelling of treatment, outcome and covariates. We propose a Bayesian latent class approach with a set of binary time-dependent covariates as class indicators (or manifestation variables) from a time-dependent latent confounder class. The proposed Bayesian latent class approach involves jointly modelling the treatment and outcome processes through the unobserved latent confounder classes and can be directly implemented in standard Markov chain Monte Carlo (MCMC) software (e.g. JAGS, Stan, etc).

In this article, we form our longitudinal causal structure with latent classes and formulate the proposed Bayesian approach in Section 2. Simulation studies are conducted in Section 3 to assess the estimation performance with varying sample sizes, numbers of indicators, the quality of indicators, as well as misspecified outcome and/or treatment assignment model. The motivating study is analysed in Section 4 and we conclude with a discussion in Section 5.

2. Bayesian estimation in a longitudinal causal framework via latent classes

2.1. Notation

Let $n$ be the total number of subjects enrolled in a longitudinal observational study, indexed by $i = 1, \dots, n$, and $J$ be the total number of visits, indexed by $j = 1, \dots, J$, at which clinical measurements are recorded and a treatment decision is made. Let $Y_{i}$ be the random variable representing an end-of-study outcome after the $J$th visit for individual $i$ and $Z_{i j}$ be the random variable representing the discrete treatment assigned to individual $i$ at visit $j$, where for ease of notation we define $Z_{i 0} = 0$. We define $U_{i j}$ as the unobserved, latent health class and $X_{i j}$ as the $1 \times p$ vector of observed binary health indicators, modelled as manifest variables of the latent class, for individual $i$ at visit $j$. We assume there are in total $K$ latent classes at each visit. We define the full indicator, latent class and treatment histories for subject $i$ as ${\bar{X}}_{i} = \{X_{i 1}, \dots, X_{i J}\}$, ${\bar{U}}_{i} = \{U_{i 1}, \dots, U_{i J}\}$ and ${\bar{Z}}_{i} = \{Z_{i 1}, \dots, Z_{i J}\}$, and the partial histories up to and including visit $j$ are denoted as ${\bar{X}}_{i j} = \{X_{i 1}, \dots, X_{i j}\}$, ${\bar{U}}_{i j} = \{U_{i 1}, \dots, U_{i j}\}$ and ${\bar{Z}}_{i j} = \{Z_{i 1}, \dots, Z_{i j}\}$.

2.2. Causal diagram with latent classes

In this article, we consider a new causal structure with time-dependent latent classes as displayed in Figure 1. We assume that at each visit $j$, the patient falls into one of $K$ latent health classes, which the treating clinician is able to recover through indicators $X_{i j}$. Treatment $Z_{i j}$ is then determined following the classified health classes along with consideration of treatment history and past health status. Note that when the number of latent classes is two, at any given point in time the model corresponds to a diagnostic classification model (DCM) with a single latent attribute,³⁰ and is thus conceptually similar to models used in item response theory (IRT). IRT, DCM and our model all assume that the latent class represents a true underlying yet unobserved construct. In DCM, this construct is mastery of an educational skill; in our framework, it represents the disease state. For example, the disease course in juvenile dermatomyositis (JDM) is characterized by periods of disease quiescence and flares; multiple sclerosis can manifest as relapsing–remitting, primary progressive or secondary progressive; Lyme disease has three stages (early localized, early disseminated and late disseminated); there are four stages of heart failure for which limitations on physical activity are used for classification (New York Heart Association functional classification³¹). Our model assumes that the physician recovers these latent classes with no misclassification; a directed acyclic graph (DAG) in the presence of misclassification is provided in the Appendix and we discuss inference under misclassification in the discussion.

Figure 1.

Longitudinal causal directed acyclic graph. $X_{i j}$ , $U_{i j}$ and $Z_{i j}$ represent class indicators, latent confounder classes and treatment for subject $i$ at visit $j$ ; $Y_{i}$ is the end-of-study outcome for subject $i$ .

Under this causal framework, the set of time-varying indicators does not directly affect treatment assignment (i.e. the indicators are conditionally independent of treatment given latent class) or the end-of-study outcome $Y_{i}$, such that $Z_{i j} ⊥ {\bar{X}}_{i j} ∣ ({\bar{U}}_{i j}, {\bar{Z}}_{i j - 1})$ and $Y_{i} ⊥ {\bar{X}}_{i j} ∣ ({\bar{U}}_{i j}, {\bar{Z}}_{i j})$, for $j = 1, \dots, J$. For simplicity, we omit time-independent confounders from Figure 1. They can be added to the DAG, above the class indicators, with only arrows pointing downward to the latent classes. Similar to class indicators, time-independent confounders do not directly affect treatment assignment or the end-of-study outcome. We provide an example with time-independent confounders in the simulation study in Section 3.

2.3. Potential outcome and causal estimand

Following Rubin’s potential outcome framework, for each unique treatment sequence $\bar{a} = (a_{1}, \dots, a_{J})$ in set $A$ , we have a corresponding potential outcome defined as $Y_{i}^{\bar{a}}$ , for subject $i$ . Given a binary treatment assignment process, there are a total of $2^{J}$ potential outcomes and $2^{J}$ unique sequential treatment combinations. In addition to the potential outcome, we also have potential latent classes and class indicators. Under the potential outcome framework, we make the following assumptions.

Stable unit treatment value assumption. We assume no competition for treatment resources between patients, that is, the treatment assignment for one patient does not affect the outcome of another, and $Y_{i}^{\bar{a}} ∣ (z_{i 1} = a_{1}, \dots, z_{i J} = a_{J}) = Y_{i} ∣ (z_{i 1} = a_{1}, \dots, z_{i J} = a_{J})$ (consistency).³²

Sequential latent unconfoundedness. Treatment assignment at each visit is independent of potential variables, given the history of treatment and latent class, such that, $Z_{i j} ⊥ ({\bar{U}}_{i j + 1}^{{\bar{a}}_{j}}, {\bar{X}}_{i j + 1}^{{\bar{a}}_{j}}, Y_{i}^{\bar{a}}) ∣ ({\bar{U}}_{i j}^{{\bar{a}}_{j - 1}}, {\bar{Z}}_{i j - 1})$ .

Conditional independence between class indicators. We assume that, conditioning on the latent class $U_{i j}$, the class indicators at visit $j$, $X_{i j} = (X_{i j 1}, X_{i j 2}, \dots, X_{i j p})$, are independent of each other, yielding $P (X_{i j} ∣ U_{i j}) = \prod_{l = 1}^{p} P (X_{i j l} ∣ U_{i j})$.
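To make the conditional independence assumption concrete, the following minimal Python sketch evaluates $P (X_{i j} ∣ U_{i j})$ as a product of Bernoulli terms and then applies Bayes' rule to obtain class probabilities from the indicators of a single visit. The function names, the class prior `prior` and the probability table `probs` are hypothetical and chosen for illustration only; they are not taken from the authors' implementation.

```python
def indicator_likelihood(x, probs_k):
    """P(X = x | U = k) under conditional independence: product over the p indicators."""
    like = 1.0
    for x_l, p_l in zip(x, probs_k):
        like *= p_l if x_l == 1 else 1.0 - p_l
    return like


def class_posterior(x, prior, probs):
    """P(U = k | X = x) by Bayes' rule, where probs[k][l] = P(X_l = 1 | U = k)."""
    joint = [prior[k] * indicator_likelihood(x, probs[k]) for k in range(len(prior))]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical two-class example with three indicators (illustrative values only).
prior = [0.5, 0.5]
probs = [[0.1, 0.1, 0.1],   # class 0: each indicator is rarely present
         [0.9, 0.9, 0.9]]   # class 1: each indicator is usually present
posterior = class_posterior([1, 1, 0], prior, probs)
```

With equal priors, two of three indicators being present shifts most of the posterior mass to the second class, which is the sense in which higher-quality indicators are more informative about the latent class.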

The causal parameter of interest is the average potential outcome $E [Y_{i}^{\bar{a}}]$. In the case of a binary, time-dependent treatment, researchers are often interested in assessing the counterfactual contrast on the average potential outcome of always versus never treated. With the introduction of the time-dependent latent class, $E [Y_{i}^{\bar{a}}]$ is formulated as

$$
\begin{aligned}
E [Y_{i}^{\bar{a}} ∣ θ, α] = {} & \sum_{U_{i J} = 1}^{K} \dots \sum_{U_{i 1} = 1}^{K} E (Y_{i} ∣ {\bar{Z}}_{i}, {\bar{U}}_{i} = {\bar{u}}_{J}, θ)^{I_{\{{\bar{Z}}_{i} = \bar{a}\}}} \\
& \times \left[\prod_{j = 1}^{J} P (U_{i j} = u_{j} ∣ {\bar{Z}}_{i j - 1} = {\bar{a}}_{j - 1}, {\bar{U}}_{i j - 1} = {\bar{u}}_{j - 1}, α_{j})\right]
\end{aligned}
\tag{1}
$$

where $θ$ represents the parameter vector of the outcome model and $α$ represents the parameter vector of the latent class model. Similar to the average treatment effect formulation in g-computation, here (1) is derived following the sequential latent unconfoundedness assumption and the consistency assumption (see Supplemental material for details).
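As a worked instance, consider $J = 2$ visits and read the indicator in (1) as fixing the treatment history at $\bar{a} = (a_{1}, a_{2})$. Using the convention $Z_{i 0} = 0$, so that the first latent class has no treatment history to condition on, the formula would expand to a double sum over the $K^{2}$ latent class paths. This expansion is our illustration of (1) under that reading and is not reproduced from the Supplemental material:

```latex
\begin{aligned}
E [Y_{i}^{(a_{1}, a_{2})} \mid \theta, \alpha]
  = \sum_{u_{2} = 1}^{K} \sum_{u_{1} = 1}^{K}
    & E (Y_{i} \mid \bar{Z}_{i} = (a_{1}, a_{2}), U_{i 1} = u_{1}, U_{i 2} = u_{2}, \theta) \\
    & \times P (U_{i 1} = u_{1} \mid \alpha_{1}) \,
      P (U_{i 2} = u_{2} \mid Z_{i 1} = a_{1}, U_{i 1} = u_{1}, \alpha_{2})
\end{aligned}
```

With $K = 3$ classes this is a sum of nine terms, each weighting a conditional outcome mean by the probability of one latent class path under the fixed treatment sequence.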

The challenge here is that ${\bar{U}}_{i}$ is unobserved. However, the class indicators are informative about ${\bar{U}}_{i}$ and so we propose to estimate the model parameters through the joint likelihood of the observed data $\prod_{i = 1}^{n} P (Y_{i}, {\bar{Z}}_{i}, {\bar{X}}_{i})$ . In the next section, we will define the joint likelihood and outline the posterior predictive inference to estimate the average potential outcome in (1).

2.4. Posterior predictive inference

Let $o_{i} = \{Y_{i}, {\bar{Z}}_{i}, {\bar{X}}_{i}\}$ be the observed variables for subject $i$. We assume $\{o_{1}, o_{2}, \dots\}$ is an exchangeable sequence of real-valued random quantities over the patient index with unknown parameters $α$, $β$, $γ$ and $θ$ characterizing the latent class model, the covariate model, the treatment assignment model and the outcome model, respectively.³³ Let $Λ = \{θ, α, β, γ\}$. The likelihood of the observed data given the parameters over $n$ subjects and $J$ visits is derived below following the DAG in Figure 1.

$$
\begin{aligned}
\prod_{i = 1}^{n} P (o_{i} ∣ Λ) & = \prod_{i = 1}^{n} \sum_{u_{i J} = 1}^{K} \dots \sum_{u_{i 1} = 1}^{K} P (o_{i}, {\bar{U}}_{i} ∣ Λ) \\
& = \prod_{i = 1}^{n} \sum_{u_{i J} = 1}^{K} \dots \sum_{u_{i 1} = 1}^{K} P (Y_{i} ∣ {\bar{Z}}_{i}, {\bar{U}}_{i} = {\bar{u}}_{J}, θ) \prod_{j = 1}^{J} [P (Z_{i j} ∣ {\bar{Z}}_{i j - 1}, {\bar{U}}_{i j} = {\bar{u}}_{j}, γ_{j}) \\
& \quad \times P (U_{i j} = u_{j} ∣ {\bar{U}}_{i j - 1} = {\bar{u}}_{j - 1}, {\bar{Z}}_{i j - 1}, α_{j}) P (X_{i j} ∣ U_{i j} = u_{j}, β_{j})]
\end{aligned}
\tag{2}
$$

where we assume conditional independence of the indicators $X$ and treatment assignment $Z$ given the latent classes $U$, and conditional independence of class indicators $X$ and outcome $Y$ given latent classes $U$. This yields a posterior distribution

$$
P (Λ ∣ o_{1}, \dots, o_{n}) \propto P_{0} (Λ) \prod_{i = 1}^{n} \sum_{u_{i J} = 1}^{K} \dots \sum_{u_{i 1} = 1}^{K} P (o_{i}, {\bar{U}}_{i} ∣ Λ)
$$

of the parameters, which can be estimated using MCMC. Here $P_{0} (Λ)$ represents the joint prior distribution of the parameters.
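The marginalization over latent class paths in (2) can be sketched for a single subject as follows. This is a minimal illustration for $J = 2$ visits, not the authors' JAGS model: the dictionary layout of `model`, the helper names and all numeric values are hypothetical, and we additionally assume that treatment at visit 2 depends only on the current class and the previous treatment.

```python
import math


def bern(value, p):
    """Bernoulli probability mass: p if value == 1, else 1 - p."""
    return p if value == 1 else 1.0 - p


def normal_pdf(y, mean, sd=1.0):
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def subject_likelihood(y, z, x, model, outcome_density):
    """Observed-data likelihood of one subject for J = 2, summing over (u1, u2)."""
    z1, z2 = z
    x1, x2 = x
    K = len(model["p_u1"])
    total = 0.0
    for u1 in range(K):
        for u2 in range(K):
            term = model["p_u1"][u1] * model["p_u2"][u1][z1][u2]   # latent class model
            term *= bern(z1, model["p_z1"][u1])                     # treatment, visit 1
            term *= bern(z2, model["p_z2"][u2][z1])                 # treatment, visit 2
            for x_visit, u in ((x1, u1), (x2, u2)):                 # indicator model
                for l, x_l in enumerate(x_visit):
                    term *= bern(x_l, model["p_x"][u][l])
            term *= outcome_density(y, z1, z2, u1, u2)              # outcome model
            total += term
    return total


# Hypothetical parameter values for K = 2 classes and p = 1 indicator per visit.
model = {
    "p_u1": [0.6, 0.4],                   # P(U1 = u1)
    "p_u2": [[[0.7, 0.3], [0.8, 0.2]],    # p_u2[u1][z1][u2] = P(U2 = u2 | u1, z1)
             [[0.4, 0.6], [0.5, 0.5]]],
    "p_z1": [0.3, 0.6],                   # P(Z1 = 1 | u1)
    "p_z2": [[0.3, 0.2], [0.6, 0.4]],     # p_z2[u2][z1] = P(Z2 = 1 | u2, z1)
    "p_x": [[0.1], [0.9]],                # p_x[u][l] = P(X_l = 1 | u)
}


def density(y, z1, z2, u1, u2):
    """Hypothetical normal outcome model with unit variance."""
    return normal_pdf(y, 0.1 + 0.5 * z1 + z2 - 0.5 * u1 - u2)


lik = subject_likelihood(0.8, (1, 0), ([1], [0]), model, density)
```

The enumeration visits $K^{J}$ paths per subject, so its cost grows quickly with the number of visits. A useful sanity check is that, with the outcome term set to one, the likelihood sums to one over all treatment and indicator patterns.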

We estimate the average potential outcome using posterior predictive inference where we predict the end-of-study outcome of a targeted, static treatment sequence $\bar{a}$ using the observed data. The posterior predictive average potential outcome is given by

$$
E [Y_{i}^{\bar{a}} ∣ o_{1}, \dots, o_{n}] = \int_{Λ} E [Y_{i}^{\bar{a}} ∣ Λ] P (Λ ∣ o_{1}, \dots, o_{n}) d Λ \tag{3}
$$

Following posterior predictive inference, a distribution of average potential outcomes can be easily obtained via Monte Carlo simulation. Given a set of posterior draws of all parameters, $Λ^{s}$, $s = 1, \dots, S$, the average potential outcome is computed in two steps for each $s$ as follows (suppressing covariates):

(1) Given $Λ^{s}$, calculate $Y_{i}^{\bar{a}, s}$, $i = 1, \dots, n$, with $\sum_{u_{i J} = 1}^{K} \dots \sum_{u_{i 1} = 1}^{K} P (Y_{i}^{\bar{a}, s} ∣ U_{i 1} = u_{i 1}, \dots, U_{i J} = u_{i J}, θ^{s}) P (U_{i 1} = u_{i 1} ∣ α_{1}^{s}) \dots P (U_{i J} = u_{i J} ∣ {\bar{U}}_{i J - 1} = {\bar{u}}_{i J - 1}, {\bar{Z}}_{i J - 1} = {\bar{a}}_{J - 1}, α_{J}^{s})$.

(2) With $Y_{i}^{\bar{a}, s}$, $i = 1, \dots, n$, calculate the mean over $i$.
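The two steps above can be sketched in Python for $J = 2$ visits as shown below. This is an illustrative sketch under stated assumptions rather than the authors' code: the parameter names in `draw`, the single scalar covariate `c` and the linear outcome mean are hypothetical, and the single "draw" uses made-up values in place of actual MCMC output.

```python
import math


def class_probs(logits):
    """Multinomial-logit probabilities; class 0 is the reference with logit 0."""
    weights = [1.0] + [math.exp(v) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def potential_outcome(draw, regime, c):
    """Step 1 for one subject with covariate c (J = 2): sum over latent class paths."""
    a1, a2 = regime
    K = len(draw["theta_u1"])
    base = [draw["alpha_0"][k] + draw["alpha_c"] * c for k in range(K - 1)]
    p_u1 = class_probs(base)
    value = 0.0
    for u1 in range(K):
        p_u2 = class_probs([base[k] + draw["alpha_u1"][u1][k] + draw["alpha_z1"] * a1
                            for k in range(K - 1)])
        for u2 in range(K):
            mean_y = (draw["theta_0"] + draw["theta_z"][0] * a1 + draw["theta_z"][1] * a2
                      + draw["theta_u1"][u1] + draw["theta_u2"][u2])
            value += mean_y * p_u1[u1] * p_u2[u2]
    return value


def average_potential_outcome(draws, regime, covariates):
    """Step 2 for each draw: mean over subjects. Returns one value per posterior draw."""
    return [sum(potential_outcome(d, regime, c) for c in covariates) / len(covariates)
            for d in draws]


# One hypothetical "draw" with K = 3; in practice `draws` would hold S MCMC samples.
draw = {
    "alpha_0": [-0.38, -0.38], "alpha_c": 0.0, "alpha_z1": -1.0,
    "alpha_u1": [[0.0, 0.0], [1.0, 0.5], [0.5, 1.0]],
    "theta_0": 0.1, "theta_z": [0.5, 1.0],
    "theta_u1": [0.0, -0.2, -0.5], "theta_u2": [0.0, -0.5, -1.0],
}
draws, covariates = [draw], [0.0, 0.0, 0.0]
always = average_potential_outcome(draws, (1, 1), covariates)
never = average_potential_outcome(draws, (0, 0), covariates)
contrasts = [a - n for a, n in zip(always, never)]   # one causal contrast per draw
posterior_mean = sum(contrasts) / len(contrasts)
```

Keeping one contrast per draw, rather than only their mean, gives the full posterior predictive distribution of the causal contrast, from which credible intervals can be read off directly.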

Under the full Bayesian estimation following the joint likelihood in (2), the unobserved latent classes are imputed (in other words, predicted) at each Monte Carlo iteration along with posterior draws of all model parameters. However, the imputed latent classes are not directly used to estimate the average potential outcome; instead, the posterior probability of being in a specific class, conditioning on past latent classes, past treatment and class indicators, is used.

For this work, we assume the number of latent classes at each visit is finite, fixed and known. This means the treating clinician classifies patients into a fixed number of latent classes at each visit and such information is available to the analyst. Standard statistical information criteria including the Bayesian information criterion and the deviance information criterion (DIC) can be used in practice to select the number of latent classes and determine the fit of the Bayesian latent class approach.^34,35

3. Simulation study

3.1. Simulation with latent confounder class

We consider a simulation study with 1000 replications of a simple three-visit longitudinal study of 125, 250, and 500 subjects. Figure 2 displays the causal diagram of the simulated data. The simulated data consist of a continuous and a binary time-independent confounder $C_{i} = \{C_{i 1}, C_{i 2}\}$, one set of class indicators for each visit, $X_{i 1} = \{X_{i 11}, \dots, X_{i 1 p}\}$ and $X_{i 2} = \{X_{i 21}, \dots, X_{i 2 p}\}$, a time-dependent binary treatment $\{Z_{i 1}, Z_{i 2}\}$, a time-dependent latent class $\{U_{i 1}, U_{i 2}\}$ taking one of three values and a continuous study outcome $Y_{i}$. We add two time-independent confounders in this simulation study to resemble clinical analysis where demographic confounders such as sex and age at study entry are often included in the causal analysis.

Figure 2.

Longitudinal causal diagram of the simulation dataset with latent confounder class. $X_{i j}$ , $U_{i j}$ and $Z_{i j}$ represent class indicators, latent confounder classes and treatment for subject $i$ at visit $j (j = 1, 2)$ ; $C_{i}$ represents time-independent confounders and $Y_{i}$ is the end-of-study outcome for subject $i$ .

The simulated datasets are generated as follows:

(1) We generate $C_{i 1}$ from $N (10, 3)$ and $C_{i 2}$ from a Bernoulli distribution with $P (C_{i 2} = 1) = 0.6$.

(2) $U_{i 1}$ and $U_{i 2}$ are generated from multinomial distributions with

$$
\begin{aligned}
\log \frac{P (U_{i 1} = 2)}{P (U_{i 1} = 1)} & = \log \frac{P (U_{i 1} = 3)}{P (U_{i 1} = 1)} = 0.5 - 0.1 C_{i 1} + 0.2 C_{i 2} \\
\log \frac{P (U_{i 2} = 2)}{P (U_{i 2} = 1)} & = 0.5 - 0.1 C_{i 1} + 0.2 C_{i 2} + I_{\{U_{i 1} = 2\}} + 0.5 I_{\{U_{i 1} = 3\}} - Z_{i 1} \\
\log \frac{P (U_{i 2} = 3)}{P (U_{i 2} = 1)} & = 0.5 - 0.1 C_{i 1} + 0.2 C_{i 2} + 0.5 I_{\{U_{i 1} = 2\}} + I_{\{U_{i 1} = 3\}} - Z_{i 1}
\end{aligned}
\tag{4}
$$

The simulated ratio between the three classes in each dataset is approximately $42 : 29 : 29$.

(3) The class indicators are Bernoulli random variables that are conditionally independent given $U_{i 1}$ and $U_{i 2}$. We consider the number of indicators for each visit as $p = 5$ or $10$. Letting $q = (q_{1}, \dots, q_{p})$ be a vector that describes which class level each indicator is related to, we set $q = (1, 2, 2, 3, 3)$ for $p = 5$, and $q = (1, 1, 1, 2, 2, 2, 3, 3, 3, 3)$ for $p = 10$. Conceptually, extending the concepts of DCM, for $p = 5$, $P (X_{i j l} = 1 ∣ U_{i j} \neq 1)$ for $l = 1$ can be thought of as the guessing parameter for the latent class being at level 1, while $P (X_{i j l} = 0 ∣ U_{i j} = 1)$ is the slipping parameter for latent class level 1; similarly, $P (X_{i j l} = 1 ∣ U_{i j} \neq 2)$ for $l = 2, 3$ can be thought of as the guessing parameter for the latent class being at level 2, while $P (X_{i j l} = 0 ∣ U_{i j} = 2)$ is the slipping parameter for the latent class being at level 2, and similarly for latent class 3. High-, medium- and low-quality indicators had slipping $=$ guessing $= 0.1$, $0.25$ and $0.4$, respectively, and were operationalized as follows:

(a) High-quality indicators: $\operatorname{logit} (P (X_{i j l} = 1 ∣ U_{i j})) = - 2.2 + 4.4 I_{\{U_{i j} = q_{l}\}}$.

(b) Medium-quality indicators: $\operatorname{logit} (P (X_{i j l} = 1 ∣ U_{i j})) = - 1.1 + 2.2 I_{\{U_{i j} = q_{l}\}}$.

(c) Low-quality indicators: $\operatorname{logit} (P (X_{i j l} = 1 ∣ U_{i j})) = - 0.4 + 0.8 I_{\{U_{i j} = q_{l}\}}$.

We consider four indicator quality combinations: all indicators are of high quality, all indicators are of medium quality, all indicators are of low quality and a mixed setting where one indicator for latent class 1 was high quality and the others were low quality (i.e. $\operatorname{logit} (P (X_{i j 1} = 1 ∣ U_{i j})) = - 2.2 + 4.4 I_{\{U_{i j} = 1\}}$ and $\operatorname{logit} (P (X_{i j l} = 1 ∣ U_{i j})) = - 0.4 + 0.8 I_{\{U_{i j} = q_{l}\}}$ for $l > 1$).

(4) $Z_{i 1}$ and $Z_{i 2}$ are generated from Bernoulli distributions with

$$
\operatorname{logit} (P (Z_{i 2} = 1 ∣ {\bar{U}}_{i 2}, {\bar{Z}}_{i 1})) = I_{\{U_{i 2} = 2\}} + 1.5 I_{\{U_{i 2} = 3\}} - Z_{i 1} - 1 \tag{5}
$$

$$
\operatorname{logit} (P (Z_{i 1} = 1 ∣ {\bar{U}}_{i 1})) = I_{\{U_{i 1} = 2\}} + 1.5 I_{\{U_{i 1} = 3\}} - 1 \tag{6}
$$

(5) The end-of-study continuous outcome $Y$ is generated from $N (μ_{y}, 1)$ with two simulation settings on the strength of the relationship between the latent class and the mean of $Y$.

(a) We define medium level latent class confounding as $μ_{y} = 0.1 + 0.5 Z_{i 1} + Z_{i 2} - 0.2 I_{\{U_{i 1} = 2\}} - 0.5 I_{\{U_{i 1} = 3\}} - 0.5 I_{\{U_{i 2} = 2\}} - I_{\{U_{i 2} = 3\}}$.

(b) We define high level latent class confounding as $μ_{y} = 0.1 + 0.5 Z_{i 1} + Z_{i 2} - 0.5 I_{\{U_{i 1} = 2\}} - I_{\{U_{i 1} = 3\}} - I_{\{U_{i 2} = 2\}} - 1.5 I_{\{U_{i 2} = 3\}}$.
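The data-generating steps above can be sketched in Python as follows, using the medium confounding setting. This is our reading of steps (1) to (5) and not the authors' simulation code (which was written for JAGS): the function names are hypothetical, and we assume that the 3 in $N (10, 3)$ is a standard deviation. Passing a fixed `regime` forces the treatment sequence, which gives a Monte Carlo approximation of the true causal contrast.

```python
import math
import random


def expit(v):
    return 1.0 / (1.0 + math.exp(-v))


def draw_class(rng, eta2, eta3):
    """Draw U in {1, 2, 3} from baseline-category logits (class 1 is the reference)."""
    return rng.choices([1, 2, 3], weights=[1.0, math.exp(eta2), math.exp(eta3)])[0]


# (intercept, slope) of logit P(X = 1 | U) for each indicator quality
QUALITY = {"high": (-2.2, 4.4), "medium": (-1.1, 2.2), "low": (-0.4, 0.8)}
Q5 = (1, 2, 2, 3, 3)  # class level tied to each of the p = 5 indicators


def simulate_subject(rng, quality="high", q=Q5, regime=None):
    """One subject under medium confounding; regime=(a1, a2) forces the treatment."""
    c1 = rng.gauss(10, 3)                      # assumption: 3 is the standard deviation
    c2 = int(rng.random() < 0.6)
    base = 0.5 - 0.1 * c1 + 0.2 * c2
    u1 = draw_class(rng, base, base)
    if regime is None:
        z1 = int(rng.random() < expit((u1 == 2) + 1.5 * (u1 == 3) - 1))
    else:
        z1 = regime[0]
    u2 = draw_class(rng,
                    base + (u1 == 2) + 0.5 * (u1 == 3) - z1,
                    base + 0.5 * (u1 == 2) + (u1 == 3) - z1)
    if regime is None:
        z2 = int(rng.random() < expit((u2 == 2) + 1.5 * (u2 == 3) - z1 - 1))
    else:
        z2 = regime[1]
    a, b = QUALITY[quality]
    x1 = [int(rng.random() < expit(a + b * (u1 == ql))) for ql in q]
    x2 = [int(rng.random() < expit(a + b * (u2 == ql))) for ql in q]
    mu = (0.1 + 0.5 * z1 + z2 - 0.2 * (u1 == 2) - 0.5 * (u1 == 3)
          - 0.5 * (u2 == 2) - 1.0 * (u2 == 3))
    return {"c": (c1, c2), "u": (u1, u2), "z": (z1, z2), "x": (x1, x2),
            "mu": mu, "y": rng.gauss(mu, 1)}


def monte_carlo_contrast(n=50_000, seed=2024):
    """Approximate E[Y^(1,1)] - E[Y^(0,0)] by simulating under each fixed regime."""
    rng = random.Random(seed)
    diff = 0.0
    for _ in range(n):
        diff += (simulate_subject(rng, regime=(1, 1))["mu"]
                 - simulate_subject(rng, regime=(0, 0))["mu"])
    return diff / n


rng = random.Random(1)
data = [simulate_subject(rng) for _ in range(500)]   # one observational dataset
```

The contrast exceeds the direct treatment coefficients ($0.5 + 1$) because treatment at the first visit also lowers the probability of the less favourable classes at the second visit. The helper `expit` applied to the intercepts in `QUALITY` gives the guessing probability implied by each indicator quality.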

For this simulation study, the causal parameter of interest is the causal contrast in the outcome between ‘always treated’ and ‘never treated’, $E [Y^{(1, 1)}] - E [Y^{(0, 0)}]$. Given our simulated settings, we aim to assess estimation performance with respect to indicator quality and quantity, sample size and confounding strength of the latent class on the end-of-study outcome. We compare the proposed Bayesian latent class approach with three approaches:

(1) Adjusted linear regression on the outcome,

$$
Y_{i} = θ_{0} + θ_{1} C_{i 1} + θ_{2} C_{i 2} + θ_{3} Z_{i 1} + θ_{4} Z_{i 2} + θ_{5} X_{i 1} + θ_{6} X_{i 2} + ϵ_{i} \tag{7}
$$

where $θ_{5} = (θ_{5, 1}, \dots, θ_{5, p})$ and $θ_{6} = (θ_{6, 1}, \dots, θ_{6, p})$. Once fitted, we obtain $E [Y^{(1, 1)}]$ and $E [Y^{(0, 0)}]$ by fixing the treatment sequence at (1,1) and (0,0) in (7);

(2) Standard marginal structural model (MSM) without latent classes (treating class indicators as time-dependent confounders), which uses the visit-specific treatment assignment model to obtain visit-specific IPTW weights, specifically

$$
\begin{aligned}
\operatorname{logit} (P (Z_{i j} = 1 ∣ {\bar{Z}}_{i j - 1}, {\bar{X}}_{i j}, C_{i})) & = γ_{0} + γ_{1} C_{i 1} + γ_{2} C_{i 2} + γ_{3} {\bar{X}}_{i j} + γ_{4} Z_{i j - 1} \\
E (Y_{i} ∣ {\bar{Z}}_{i 2}) & = θ_{0} + θ_{1} Z_{i 1} + θ_{2} Z_{i 2}
\end{aligned}
\tag{8}
$$

where $j = 1, 2$ and $γ_{3} = (γ_{3, 1, 1}, \dots, γ_{3, 1, p}, γ_{3, 2, 1}, \dots, γ_{3, 2, p})$;

(3) Longitudinal targeted maximum likelihood estimation (TMLE) without latent classes, where we fit the treatment assignment model in (8) and the outcome model in (7).^36–38
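To illustrate how visit-specific treatment probabilities turn into IPTW weights for the MSM comparator, the sketch below multiplies the inverse probabilities of the observed treatments over visits and then compares weighted outcome means for the ‘always treated’ and ‘never treated’ sequences. Note the simplification: we use unstabilized weights and weighted group means in place of the weighted linear regression in (8), and the `records` (outcomes, treatments and fitted probabilities) are hypothetical numbers rather than output from any fitted model.

```python
def iptw_weight(z, p):
    """Unstabilized weight: inverse of the product over visits of P(Z_j = z_j | history)."""
    w = 1.0
    for z_j, p_j in zip(z, p):
        w /= p_j if z_j == 1 else 1.0 - p_j
    return w


def weighted_regime_mean(records, regime):
    """Weighted mean of Y among subjects whose observed sequence equals the regime."""
    num = den = 0.0
    for y, z, p in records:
        if tuple(z) == tuple(regime):
            w = iptw_weight(z, p)
            num += w * y
            den += w
    return num / den


# Hypothetical records: (outcome, observed treatments, fitted P(Z_j = 1 | history)).
records = [
    (2.1, (1, 1), (0.6, 0.7)),
    (1.8, (1, 1), (0.3, 0.4)),
    (0.4, (0, 0), (0.5, 0.2)),
    (0.9, (0, 0), (0.2, 0.1)),
    (1.2, (1, 0), (0.5, 0.2)),
]
contrast = weighted_regime_mean(records, (1, 1)) - weighted_regime_mean(records, (0, 0))
```

Subjects who received a sequence that was unlikely given their history receive larger weights, which is how the weighting rebalances the treatment groups with respect to the measured time-dependent confounders.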

We use diffuse normal priors, $N (0, 10)$, for all parameters. In total, 4000 posterior draws were taken from 50,000 iterations with 30,000 burn-in iterations, with every fifth sample being collected to reduce autocorrelation between draws (‘thinning’). The convergence of the Markov chain is evaluated using Geweke’s convergence $Z$-score, and an absolute value <2 is accepted.³⁹ The posterior predictive mean of the causal contrast (mean), relative bias (RB), empirical standard error (ESE), average standard error (ASE) and the 95% coverage probability of the simulation study are reported and evaluated. All simulations were carried out using JAGS, and example code and results are available on the GitHub page of the first author https://github.com/Kuan-Liu/BayesianLatentCausalModel.
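The thinning arithmetic and the convergence check can be illustrated with a short Python sketch. The `geweke_z` function below is a simplified stand-in and should be read as an assumption: Geweke's diagnostic estimates the variances of the two segment means from the spectral density at frequency zero, whereas this sketch uses plain sample variances and therefore ignores autocorrelation. The 10% and 50% window fractions are the conventional defaults, not values stated in the text.

```python
import math
import statistics


def retained_draws(iterations, burn_in, thin, chains=1):
    """Number of posterior draws kept after burn-in and thinning."""
    return chains * ((iterations - burn_in) // thin)


def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke-style Z-score comparing early and late segment means."""
    n = len(chain)
    early = chain[: int(first * n)]
    late = chain[int((1.0 - last) * n):]
    se = math.sqrt(statistics.variance(early) / len(early)
                   + statistics.variance(late) / len(late))
    return (statistics.mean(early) - statistics.mean(late)) / se


kept = retained_draws(50_000, 30_000, 5)          # settings reported for the simulation
drifting = [float(i) for i in range(1000)]        # a chain that has clearly not converged
z_drift = geweke_z(drifting)
```

A chain whose early and late segments have different means, such as the drifting example, produces a score far outside the accepted range, whereas a stationary chain should typically give a score within it.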

From Table 1, the Bayesian latent class approach outperforms the other methods in the presence of medium- to high-quality class indicators. Specifically, the Bayesian latent class approach produces the least biased estimates when all indicators are of medium or high quality and when there is one high-quality indicator and all the others are low quality. However, when all indicators are of low quality, all methods return biased estimators and the Bayesian latent class approach produces the most biased estimates. Comparing the frequentist methods, the conventional MSM and TMLE without latent classes perform equally well with medium- to high-quality indicators. With a sample size of 500 and a medium level of latent confounding, as the number of indicators increases from 5 to 10, the estimation of the causal contrast $E [Y^{(1, 1)}] - E [Y^{(0, 0)}]$ improves in terms of bias and coverage across all methods. Similar findings are identified for the simulations with sample sizes of 125 and 250 (Supplemental Tables 1 and 3), with the distinction that, while the Bayesian approach continues to outperform the other approaches for high- and medium-quality indicators, in the setting where there is just one high-quality indicator and four low-quality indicators, both TMLE and MSM outperform the Bayesian approach at a sample size of 125. Similar results also hold under high confounding (Supplemental Tables 2, 4 and 5).

Table 1.

Simulation results for $E [Y^{(1, 1)}] - E [Y^{(0, 0)}]$ over 1000 replications with 500 samples and medium latent confounding.

Sample size = 500; medium confounding; true value = 1.67
		Number of indicators = 5					Number of indicators = 10
Indicator setting	Estimator	Mean	RB (%)	ESE	ASE	CP (%)	Mean	RB (%)	ESE	ASE	CP (%)
High	Adjust	1.44	−13.99	0.16	0.16	69.9	1.47	−12.02	0.16	0.16	77.1
	MSMs	1.56	−6.61	0.18	0.19	91.7	1.62	−3.17	0.19	0.20	94.2
	TMLE	1.56	−6.30	0.21	0.20	89.4	1.62	−2.93	0.21	0.20	91.8
	Bayes	1.67	−0.53	0.17	0.17	94.7	1.66	−0.85	0.17	0.16	94.1
Medium	Adjust	1.31	−21.37	0.16	0.17	41.5	1.38	−17.21	0.17	0.16	59.0
	MSMs	1.37	−18.01	0.18	0.18	60.7	1.48	−11.41	0.18	0.19	82.3
	TMLE	1.38	−17.61	0.20	0.19	63.7	1.48	−11.24	0.20	0.20	81.4
	Bayes	1.65	−1.57	0.21	0.21	95.1	1.64	−2.22	0.19	0.19	93.3
Low	Adjust	1.20	−27.88	0.17	0.17	20.4	1.24	−26.04	0.17	0.17	25.6
	MSMs	1.21	−27.32	0.17	0.17	23.9	1.26	−24.77	0.18	0.18	34.2
	TMLE	1.21	−27.38	0.20	0.19	33.0	1.26	−24.81	0.20	0.19	40.6
	Bayes	1.17	−30.08	0.17	0.17	16.4	1.21	−27.66	0.22	0.18	23.5
High and low	Adjust	1.34	−19.60	0.16	0.16	46.9	1.35	−19.06	0.17	0.16	48.7
	MSMs	1.39	−16.51	0.17	0.18	66.2	1.45	−13.14	0.18	0.19	79.0
	TMLE	1.40	−16.07	0.20	0.19	69.9	1.46	−12.63	0.21	0.20	78.5
	Bayes	1.72	2.75	0.21	0.21	94.1	1.66	−1.07	0.17	0.17	94.4

RB: relative bias; ESE: empirical standard error; ASE: average standard error; CP: coverage probability; Adjust: adjusted linear regression; MSM: marginal structural model; TMLE: targeted maximum likelihood estimation; Bayes: Bayesian latent class approach.

The columns are posterior predictive mean (mean), RB, ESE, ASE and 95% CP.
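For readers who wish to reproduce summaries of this kind, the sketch below computes the five reported quantities from per-replication results, assuming the conventional definitions: RB is the percentage difference between the mean estimate and the true value, ESE is the standard deviation of the estimates across replications, ASE is the mean of the estimated standard errors and CP is the percentage of 95% intervals that contain the true value. The function name and the input values are hypothetical.

```python
import statistics


def performance(estimates, std_errors, intervals, truth):
    """Summaries over replications: mean, RB (%), ESE, ASE and CP (%)."""
    mean_est = statistics.mean(estimates)
    covered = sum(1 for lo, hi in intervals if lo <= truth <= hi)
    return {
        "mean": mean_est,
        "RB": 100.0 * (mean_est - truth) / truth,
        "ESE": statistics.stdev(estimates),
        "ASE": statistics.mean(std_errors),
        "CP": 100.0 * covered / len(intervals),
    }


# Hypothetical results from three replications (illustrative numbers only).
summary = performance(
    estimates=[1.0, 2.0, 3.0],
    std_errors=[0.5, 0.5, 0.5],
    intervals=[(0.0, 1.5), (1.0, 3.0), (2.5, 4.0)],
    truth=2.0,
)
```

Comparing ESE with ASE indicates whether the estimated standard errors are well calibrated: an ASE well below the ESE usually goes together with coverage below the nominal 95%.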

3.2. Simulation without latent confounder class

To explore the performance of the Bayesian method under misspecification, we consider another two simulation studies with 1000 replications of a simple three-visit longitudinal study of 500 subjects where (i) the outcome variable is not directly affected by the latent confounder classes, and (ii) the latent confounder class is not featured in the causal framework. Figure 3(a) displays the causal diagram of the simulated data where the outcome model is free of the latent confounder class and Figure 3(b) displays the causal diagram of the simulated data where the treatment and outcome models are free of the latent confounder class. Simulation details are described in the online Supplemental material.

Figure 3.

Longitudinal causal diagram of the simulation dataset without latent confounder. $X_{i j}$ , $U_{i j}$ and $Z_{i j}$ represent class indicators, latent confounder classes and treatment for subject $i$ at visit $j$ , $j = 1, 2$ ; $C_{i}$ represents time-independent confounders and $Y_{i}$ is the end-of-study outcome for subject $i$ . (a) Outcome model free of latent confounder. (b) Treatment and outcome models free of latent confounders.

The simulation results with a misspecified outcome model are reported in Table 2 ($p = 10$) and Supplemental Table 6 ($p = 5$). The simulation results with misspecified outcome and treatment models are reported in Table 3. In the case of a misspecified outcome model, where the time-varying treatment models were generated with the latent confounding class but the outcome model was not, we observe simulation results similar to those in Table 1. The Bayesian latent class approach performs well with high-quality class indicators, including the case when there is only one high-quality indicator. It produces the most biased estimates with low-quality class indicators. In the case of misspecified outcome and treatment models, where the causal data are generated without the latent confounding class, the Bayesian latent class approach matches the performance of MSMs and TMLE with a nearly unbiased estimator and good coverage probability.

Table 2.

Simulation results for the estimated causal parameter $E [Y^{(1, 1)}] - E [Y^{(0, 0)}]$ over 1000 replications with 500 samples and the simulated data with outcome free of the latent confounder class.

Sample size = 500; number of indicators = 10
Indicator setting	True	Estimator	Mean	RB (%)	ESE	ASE	CP (%)
High	1.55	Adjust	1.50	−2.92	0.16	0.18	95.4
		MSMs	1.55	0.04	0.19	0.20	95.3
		TMLE	1.55	0.03	0.22	0.20	93.1
		Bayes	1.54	−0.21	0.18	0.18	94.5
Medium	1.44	Adjust	1.50	3.88	0.16	0.20	98.0
		MSMs	1.45	0.76	0.20	0.22	96.1
		TMLE	1.45	0.47	0.21	0.20	93.8
		Bayes	1.37	−4.89	0.26	0.24	92.4
Low	1.53	Adjust	1.50	−1.91	0.16	0.19	97.7
		MSMs	1.52	−0.33	0.20	0.21	95.7
		TMLE	1.52	−0.41	0.21	0.20	91.9
		Bayes	1.56	4.08	0.31	0.26	89.3
High and low	1.51	Adjust	1.50	−0.82	0.16	0.19	97.9
		MSMs	1.51	−0.21	0.19	0.21	96.8
		TMLE	1.51	−0.21	0.19	0.19	93.4
		Bayes	1.54	2.06	0.20	0.20	94.2

The columns are posterior predictive mean, RB, ESE, ASE and 95% CP.

Table 3.

Simulation results for the estimated causal parameter $E [Y^{(1, 1)}] - E [Y^{(0, 0)}]$ over 1000 replications with 500 samples and the simulated data free of the latent confounder class.

Estimator (sample size = 500)	Mean	RB (%)	ESE	ASE	CP (%)
Number of indicators = 5
True	1.72
Adjust	1.50	−12.48	0.17	0.16	72.8
MSMs	1.72	0.08	0.16	0.17	95.6
TMLE	1.72	0.14	0.24	0.23	92.5
Bayes	1.71	−0.06	0.17	0.17	94.1
Number of indicators = 10
True	1.93
Adjust	1.49	−22.61	0.16	0.17	24.8
MSMs	1.93	−0.25	0.16	0.18	96.8
TMLE	1.93	−0.17	0.22	0.21	93.2
Bayes	1.92	−0.30	0.16	0.17	96.0

The columns are posterior predictive mean, RB, ESE, ASE and 95% CP.

Overall, the performance of the Bayesian method was unchanged when the outcome is directly influenced by the indicators rather than the latent classes (under misspecified outcome model) (Table 2). Although slightly more biased than the frequentist methods, the Bayesian approach has little bias when all indicators are of medium or high quality and the bias difference reduces as the number of indicators increases. Having at least one high-quality indicator results in approximately unbiased estimators for all methods. However, when all indicators are of poor quality, the Bayesian method is biased while the frequentist methods return approximately unbiased estimators (Table 2). Lastly, the Bayesian latent class approach continues to do well when data are simulated without latent classes (Table 3).

4. Application to the JDM study

We applied the proposed Bayesian latent class approach to a retrospective cohort study of intravenous immunoglobulin (IVIg) in treating JDM using observational clinical data hosted at The Hospital for Sick Children (SickKids). JDM is a rare, chronic multisystem disease in children with incidence estimates of about 2–4 per million paediatric population in North America.⁴⁰ JDM is currently considered incurable, and current treatment goals for managing newly diagnosed moderate JDM are to achieve inactive disease and prevent functional limitations.⁵ An earlier version of the observational SickKids JDM data was analysed using marginal structural models, BMSMs and two-step BPSA to investigate the effect of IVIg adjunct therapy on successfully managing JDM disease activity, compared to the standard therapy without IVIg.^12,41

The SickKids JDM data were recorded whenever a patient completed a scheduled clinical visit. The baseline visit $(t = 0)$ was defined as the date the patient enrolled in the SickKids JDM study, and a follow-up window of 18 months after baseline was considered. We included follow-up times at 6 months $(t = 6)$, 12 months $(t = 12)$ and 18 months $(t = 18)$. Clinical measurements and treatment prescription information were collected at each follow-up time. The final study cohort included 121 patients who were enrolled between December 1992 and August 2017.

We defined the primary clinical outcome as the probability of achieving a score of zero on the Childhood Health Assessment Questionnaire (CHAQ) at 18 months. The CHAQ score, ranging between 0 and 3, is a functional assessment tool that measures the level of physical disability of children, and its use has been validated in JDM.⁴² A score of zero indicates that the patient can perform activities of daily living without any difficulty, and a score of three indicates that the patient can no longer complete activities of daily living. At baseline and each follow-up visit, patients were prescribed either the standard JDM therapy or IVIg in addition to the standard JDM therapy by the treating rheumatologist. Patients were defined as being exposed to IVIg at a clinical visit if a prescription of IVIg was recorded during the preceding six months. In total, 34 of the 121 patients (28.1%) were exposed to IVIg within the first 18 months post-diagnosis. We assumed that clinicians at the baseline and follow-up visits used the following five clinical indicators to assign treatment: present functional status (normal vs. abnormal), currently taking prednisone (yes vs. no), presence of Gottron’s papules, presence of heliotrope rash, and presence of abnormal nailfold capillaries.

Patients who were exposed to IVIg typically remained on IVIg for a few follow-up visits.⁴¹ Therefore, we are interested in the causal contrast in the probability of achieving daily living without any level of difficulty at 18 months between patients who were exposed to IVIg and remained on IVIg up to 18 months (‘always treated’) and patients who were IVIg free for 18 months (‘never treated’). In addition to the Bayesian latent class approach, all four frequentist methods included in the simulation study were applied to the JDM data. We fitted two Bayesian latent class models, one with two latent confounder classes at each visit and one with three latent confounder classes at each visit. The final Bayesian model was the one with the smaller DIC. We used diffuse normal priors (i.e. $N(0, 10)$ on the log-odds scale) and obtained 18,000 MCMC draws from three chains of 6000 retained draws each (80,000 iterations per chain, of which the first 50,000 were discarded as burn-in and every fifth of the remainder was kept). MCMC convergence and mixing were evaluated using Geweke’s convergence $Z$-score and graphically using traceplots.³⁹
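The sampling arithmetic and the convergence check described above can be illustrated with a short sketch. The helper names `retained_draws` and `geweke_z` are hypothetical. The Z-score below is a simplified version that compares the means of the first 10% and last 50% of a chain using naive sample variances; Geweke's original diagnostic uses spectral density estimates of the variance to account for autocorrelation, so this sketch should not be read as the implementation used in the article.

```python
import numpy as np

def retained_draws(n_iter, burn_in, thin, n_chains):
    """Draws kept per chain and in total after burn-in and thinning."""
    per_chain = (n_iter - burn_in) // thin
    return per_chain, per_chain * n_chains

def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke-type Z-score (naive variances, ignores autocorrelation)."""
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    a = chain[: int(first * n)]       # early segment of the chain
    b = chain[n - int(last * n):]     # late segment of the chain
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return (a.mean() - b.mean()) / se

# Settings reported in the text: 80,000 iterations, 50,000 burn-in, thinning of 5, 3 chains
per_chain, total = retained_draws(80_000, 50_000, 5, 3)

# Synthetic chains for illustration only (not output from the JDM analysis)
rng = np.random.default_rng(0)
z_stationary = geweke_z(rng.normal(size=6000))
z_drifting = geweke_z(np.linspace(0.0, 1.0, 1000) + rng.normal(scale=0.01, size=1000))
```

A Z-score well outside roughly ±2 suggests that the early and late parts of the chain differ, as in the drifting example.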

The baseline CHAQ score was higher for patients who were exposed to IVIg within the first 18 months than for patients who were IVIg free (median CHAQ score 1.63 versus 1.06). Over the study period, we observed an overall decrease in the CHAQ score for all patients (Figure 4). The majority of patients who started on IVIg remained on IVIg in subsequent visits. In total, 13 of the 34 IVIg-exposed patients (38%) received IVIg from baseline and remained on IVIg for up to 18 months.

Figure 4.

CHAQ score distribution over time by IVIg exposure within 18 months since diagnosis. CHAQ: Childhood Health Assessment Questionnaire; IVIg: intravenous immunoglobulin.

The DIC for the Bayesian model assuming two latent classes at each visit was 2905.2, while the DIC for the Bayesian model assuming three latent classes at each visit was 2924.9. Therefore, the Bayesian latent class approach with two latent classes was selected. A trace plot of the causal parameter of interest is included in the Supplemental material (Figure S2). Table 4 examines the estimation stability of the latent class model by showing the number and proportion of JDM patients whose visit-specific latent class was consistently predicted across MCMC iterations. Consistently predicted latent classes were determined using the subject-visit specific posterior probability of being in each of the two classes with at least 90% or 80% probability. At baseline, it was possible to determine the latent class of 75% of patients with at least 90% certainty and of 93% of patients with at least 80% certainty. Although the predicted probability of being in each class was reduced over follow-up visits, the majority of patients had their time-dependent latent classes identified with at least 79% certainty at follow-up visits.
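For readers unfamiliar with the criterion, the following sketch shows how a DIC of the standard form (posterior mean deviance plus effective number of parameters) can be computed from MCMC output, and how the model choice above follows from the two reported values. The function `dic` and its inputs are hypothetical; the article does not state which variant of the DIC was used for the latent class models, so the formula here is an assumption.

```python
import numpy as np

def dic(loglik_draws, loglik_at_posterior_mean):
    """DIC = D_bar + p_D, with p_D = D_bar - D(theta_bar) (assumed standard form)."""
    deviance_draws = -2.0 * np.asarray(loglik_draws, dtype=float)
    d_bar = deviance_draws.mean()             # posterior mean deviance
    d_hat = -2.0 * loglik_at_posterior_mean   # deviance at the posterior mean
    p_d = d_bar - d_hat                       # effective number of parameters
    return d_bar + p_d, p_d

# Toy log-likelihood values, for illustration only
toy_dic, toy_pd = dic([-10.0, -12.0], -10.5)

# DIC values reported in the text, keyed by the number of latent classes per visit
reported = {2: 2905.2, 3: 2924.9}
selected = min(reported, key=reported.get)    # smaller DIC is preferred
```

The two-class model is preferred by about 19.7 DIC units.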

Table 4.

Number of juvenile dermatomyositis subjects across the total 18,000 Markov chain Monte Carlo iterations with consistently predicted latent class memberships.

	Latent class A		Latent class B		Total
Visit	$n_{A}$	Proportion	$n_{B}$	Proportion	$n$	Proportion
At least 90% probability of being in class
baseline	25	0.21	66	0.55	91	0.75
6 months	13	0.11	70	0.58	83	0.69
12 months	3	0.02	78	0.64	81	0.67
At least 80% probability of being in class
baseline	33	0.27	80	0.66	113	0.93
6 months	15	0.12	85	0.70	100	0.83
12 months	10	0.08	88	0.73	98	0.81

$n_{A}$ : number of subjects labelled under class A; $n_{B}$ : number of subjects labelled under class B; $n$ : number of subjects labelled under class A or B; Proportion: proportion against the total number of study subjects.

Consistently predicted latent classes were determined using the subject-visit specific posterior probability of being in each of the two classes with at least 90% or 80% probability.
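The classification rule behind Table 4 can be sketched as follows for a single visit. The function `consistent_membership` and the 0/1 coding of classes A and B are assumptions made for illustration, and the sketch presumes that class labels are aligned across MCMC draws (no label switching).

```python
import numpy as np

def consistent_membership(class_draws, threshold):
    """Count subjects whose class is predicted with posterior probability >= threshold.

    class_draws: array of shape (n_draws, n_subjects), 0 = class A, 1 = class B (assumed coding).
    """
    class_draws = np.asarray(class_draws)
    p_b = class_draws.mean(axis=0)        # posterior probability of class B per subject
    in_b = p_b >= threshold
    in_a = (1.0 - p_b) >= threshold
    either = in_a | in_b
    return {
        "n_A": int(in_a.sum()),
        "n_B": int(in_b.sum()),
        "n": int(either.sum()),
        "proportion": float(either.mean()),
    }

# Toy draws for four subjects over ten iterations (not data from the JDM study)
toy = np.zeros((10, 4), dtype=int)
toy[:, 0] = 1      # subject 0: class B in every draw
toy[:8, 1] = 1     # subject 1: class B in 8 of 10 draws
toy[:5, 2] = 1     # subject 2: class B in 5 of 10 draws
                   # subject 3: class A in every draw
strict = consistent_membership(toy, 0.90)
loose = consistent_membership(toy, 0.75)
```

Applied to the baseline row of Table 4, the same rule yields 25 + 66 = 91 of 121 subjects (0.75) at the 90% threshold.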

The estimated causal contrast of interest, the difference in the probability of achieving no living difficulty at 18 months between always treated and never treated, is presented in Table 5. The G-computation estimate, the TMLE estimate and the mode of the posterior predictive distribution under the Bayesian latent class approach were similar. Since only 34 patients were exposed to IVIg, MSMs might be sensitive to the specification of the time-dependent treatment assignment model. As expected, the TMLE estimate had the largest standard error of all the methods. Although the point estimates of the causal contrast were all positive, the 95% confidence intervals for all four frequentist methods included zero, indicating that the difference in the outcome was not statistically significant.

Table 5.

The estimated absolute risk difference in achieving no living difficulty at 18 months between always treated and never treated.

	Always treated versus never treated
Methods	Point estimate	Standard error
Adjust	0.170	0.123
MSMs	0.151	0.137
G-comp	0.105	0.135
TMLE	0.108	0.165
Bayes (mean)	0.043	0.112
(median)	0.050	–

Adjust: generalized linear regression on the primary outcome adjusting for time-dependent treatment and clinical indicators; MSMs: marginal structural models; G-comp: g-computation; TMLE: targeted maximum likelihood estimation; Bayes: Bayesian latent class approach.
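As a rough check on the frequentist results in Table 5, normal-approximation (Wald) 95% intervals can be formed from the reported point estimates and standard errors. The article does not say how its confidence intervals were constructed, so the Wald form and the helper `wald_interval` are assumptions for illustration.

```python
# Point estimates and standard errors as reported in Table 5
estimates = {
    "Adjust": (0.170, 0.123),
    "MSMs": (0.151, 0.137),
    "G-comp": (0.105, 0.135),
    "TMLE": (0.108, 0.165),
}

def wald_interval(estimate, std_error, z=1.96):
    """Normal-approximation 95% interval (assumed form, hypothetical helper)."""
    return estimate - z * std_error, estimate + z * std_error

intervals = {method: wald_interval(est, se) for method, (est, se) in estimates.items()}

for method, (lower, upper) in intervals.items():
    print(f"{method}: ({lower:.3f}, {upper:.3f})")
```

Under this approximation every interval has a negative lower limit and a positive upper limit, which agrees with the statement that all four intervals include zero.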

In addition to point estimates, the Bayesian method gives a probability distribution for the risk difference (Figure 5). Under the Bayesian latent class approach, we can conclude that there is a 69% chance of observing a higher probability of achieving no living difficulty (a risk difference >0) at 18 months under always treated compared to never treated. We can also easily assess the likelihood of observing other clinically relevant risk differences.
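Probability statements of this kind are simple to obtain once posterior predictive draws of the risk difference are available: the probability is the share of draws above the threshold of interest. The sketch below uses a handful of made-up draws purely to show the calculation; they are not the 18,000 draws from the JDM analysis, and `prob_exceeds` is a hypothetical helper.

```python
import numpy as np

def prob_exceeds(draws, threshold=0.0):
    """Posterior probability that the quantity exceeds `threshold` (share of draws above it)."""
    draws = np.asarray(draws, dtype=float)
    return float(np.mean(draws > threshold))

# Made-up risk-difference draws, for illustration only
toy_draws = [-0.10, -0.02, 0.03, 0.05, 0.08, 0.12, 0.20, -0.05, 0.01, 0.15]

p_any_benefit = prob_exceeds(toy_draws, 0.0)     # share of draws with risk difference > 0
p_large_benefit = prob_exceeds(toy_draws, 0.10)  # share of draws with risk difference > 0.10
```

Changing the threshold gives the probability of any other clinically relevant risk difference from the same set of draws.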

Figure 5.

Posterior predictive distribution of the estimated probability difference on achieving no living difficulty at 18 months between always treated and never treated. AUC: area under the curve; CR: credible region.

5. Discussion

In this article, we presented and demonstrated a Bayesian latent class approach to conduct causal inference with time-dependent treatment and time-dependent confounding. The introduction of the visit-specific latent confounder class to the causal framework permits a full Bayesian estimation of the causal effects via posterior predictive inference and naturally achieves dimension reduction of the confounders. Based on the simulation study, the proposed Bayesian latent class approach performs well when the clinician-driven treatment assignment follows the hypothesized unobserved classification process with medium- to high-quality class indicators. Furthermore, the Bayesian latent class approach still achieves approximately unbiased estimation when the true outcome process is free of the latent confounder class as well as when both the true outcome and treatment assignment processes are free of the latent confounder class.

The proposed Bayesian latent class approach has a few advantages over existing Bayesian causal inference methods, namely, the parametric Bayesian g-computation and Bayesian MSMs. Compared to Bayesian g-computation, the Bayesian latent class approach makes it simpler to specify the joint distribution of a potentially large number of confounding indicators.¹¹ Additionally, unlike Bayesian MSMs, the Bayesian latent class approach is fully Bayesian.^9,12 Our approach has the potential to overcome unmeasured confounding (in the form of unmeasured class indicators), if the observed class indicators are sufficient to infer the distribution of the latent confounder class and the treatment assignment mechanism follows the clinician-driven classification process.

In this work, we have assumed that there is a true underlying latent class $U$ that the physician recovers without misclassification. However, in some cases, it may be reasonable to allow for misclassification, in which case the causal model needs to be expanded to include the misclassified $U$ ( $U^{'}$ ) (see Supplemental Figure S1). In this new DAG, we have $U^{'} \to Z$ and $U \to X$ ; while the indicators $X$ remain conditionally independent given $U$ , they are not conditionally independent given $U^{'}$ . We thus need to model both $U$ and $U^{'}$ . In cases where identifiability is an issue, external information on misclassification of $U$ may be necessary.

The unbiasedness of the Bayesian latent class approach depends on the quality of the class indicators and, to a lesser extent, on the number of class indicators. A post-hoc way to examine indicator quality, as used in the applied JDM analysis, is to investigate the posterior predicted latent class membership. If, for most patients under the Bayesian latent class model, at least 90% of the posterior draws assign the same latent class, we can argue that the latent class model is fitted with good-quality indicators.

In some cases, it is easy to verify that the clinician-driven treatment assignment follows a classification process (e.g. studies conducted at a single clinical centre with a small number of treating physicians), but in others less so (e.g. multicenter or multinational studies). It is reassuring that the Bayesian latent class approach achieved unbiased estimation in the two simulation studies with misspecified models. In the applied JDM analysis, without knowledge of the treatment assignment mechanism, the Bayesian latent class approach returned a point estimate and standard error similar to those of popular frequentist causal methods. In application, we recommend the use of standard information criteria such as DIC to assess the fit of the Bayesian latent class model, in particular when comparing latent class models with different numbers of latent classes.

The proposed latent class causal mechanism in this article assumes that there are no direct causal pathways connecting the time-dependent class indicators to the treatment and the outcome, and that only time-independent confounders (acting as latent class model covariates) are considered, as presented in the simulation study. A few alternative and more generalized causal mechanisms with latent confounder classes would be of interest in future work, including causal structures with arrows pointing from the class indicators to the treatment and the outcome, and structures with time-dependent covariates that have arrows pointing to the latent classes. The proposed latent class approach requires binary class indicators. In future work, the latent class model can be extended with a mixture of continuous and dichotomized class indicators and a time-dependent number of latent classes.

In summary, our proposed Bayesian latent class approach performed well with medium- to high-quality indicators, was robust to the studied forms of misspecification, and remains tractable in the presence of high-dimensional confounders. We thus suggest that analysts consider it in cases where it is plausible that a latent disease classification drives both disease outcomes and treatment decisions.

Supplemental Material

sj-pdf-1-smm-10.1177_09622802241298704 - Supplemental material for A Bayesian latent class approach to causal inference with longitudinal data

Supplemental material, sj-pdf-1-smm-10.1177_09622802241298704 for A Bayesian latent class approach to causal inference with longitudinal data by Kuan Liu, Olli Saarela, George Tomlinson, Brian M Feldman and Eleanor Pullenayegum in Statistical Methods in Medical Research

Footnotes

Acknowledgements

The authors thank Ingrid Goh for facilitating access to the Hospital for Sick Children JDM clinical data.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by funding from a Canadian Institutes of Health Research Doctoral Award (Frederick Banting and Charles Best Canada Graduate Scholarship) and the Connaught New Researcher Award to the first author.


Supplemental material

Supplemental material for this article is available online.

References

1. Feinstein. Epidemiologic analyses of causation: The unlearned scientific lessons of randomized trials. J Clin Epidemiol 1989; 42: 481–489.

2. Suissa, Garbe. Primer: Administrative health databases in observational studies of drug effects—advantages and disadvantages. Nat Rev Rheumatol 2007; 3: 725.

3. Kimura, Grevich, Beukelman, et al. Pilot study comparing the childhood arthritis & rheumatology research alliance (CARRA) systemic juvenile idiopathic arthritis consensus treatment plans. Pediatr Rheumatol 2017; 15: 23.

4. Nigrovic, Beukelman, Tomlinson, et al. Bayesian comparative effectiveness study of four consensus treatment plans for initial management of systemic juvenile idiopathic arthritis: First-line options for systemic juvenile idiopathic arthritis treatment (FROST). Clin Trials 2018; 15: 268–277.

5. Liu, Tomlinson, Reed, et al. Pilot study of the juvenile dermatomyositis consensus treatment plans: A CARRA registry study. J Rheumatol 2021; 48: 114–122.

6. McCandless, Gustafson, Austin. Bayesian propensity score analysis for observational data. Stat Med 2009; 28: 94–112.

7. McCandless, Douglas, Evans, et al. Cutting feedback in Bayesian regression adjustment for the propensity score. Int J Biostat 2010; 6(2): Article 16.

8. Zigler, Watts, Yeh, et al. Model feedback in Bayesian propensity score estimation. Biometrics 2013; 69: 263–273.

9. Saarela, Stephens, Moodie, et al. On Bayesian estimation of marginal structural models. Biometrics 2015; 71: 279–288.

10. Müller, Wahed, et al. Bayesian nonparametric estimation for dynamic treatment regimes with sequential transition times. J Am Stat Assoc 2016; 111: 921–950.

11. Keil, Daza, Engel, et al. A Bayesian approach to the g-formula. Stat Methods Med Res 2018; 27: 3183–3204.

12. Liu, Saarela, Feldman, et al. Estimation of causal effects with repeatedly measured outcomes in a Bayesian framework. Stat Methods Med Res 2020; 29: 2507–2519.

13. Oganisian, Getz, Alonzo, et al. Bayesian semiparametric model for sequential treatment decisions with informative timing. Biostatistics 2024; 25: 947–961.

14. Ding, Mealli. Bayesian causal inference: A critical review. Philos Trans R Soc A 2023; 381: 20220153.

15. Rosenbaum, Rubin. Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. J R Stat Soc Ser B (Methodol) 1983; 45: 212–218.

16. Angrist, Imbens, Rubin. Identification of causal effects using instrumental variables. J Am Stat Assoc 1996; 91: 444–455.

17. Rosenbaum. Observational studies. Springer series in statistics. New York, NY: Springer, 2002.

18. Hernán, Robins. Instruments for causal inference: An epidemiologist’s dream? Epidemiology 2006; 17: 360–372.

19. Greenland. Sensitivity analysis, Monte Carlo risk analysis, and Bayesian uncertainty assessment. Risk Anal 2001; 21: 579–584.

20. McCandless, Gustafson, Levy. Bayesian sensitivity analysis for unmeasured confounding in observational studies. Stat Med 2007; 26: 2331–2347.

21. Gustafson, McCandless, Levy, et al. Simplified Bayesian sensitivity analysis for mismeasured and unobserved confounders. Biometrics 2010; 66: 1129–1137.

22. McCandless, Gustafson, Levy, et al. Hierarchical priors for bias parameters in Bayesian sensitivity analysis for unmeasured confounding. Stat Med 2012; 31: 383–396.

23. McCandless, Richardson, Best. Adjustment for missing confounders using external validation data and propensity scores. J Am Stat Assoc 2012; 107: 40–51.

24. McCandless, Gustafson. A comparison of Bayesian and Monte Carlo sensitivity analysis for unmeasured confounding. Stat Med 2017; 36: 2887–2901.

25. Miao, Geng, Tchetgen Tchetgen. Identifying causal effects with proxy variables of an unmeasured confounder. Biometrika 2018; 105: 987–993.

26. Shardell, Ferrucci. Joint mixed-effects models for causal inference with longitudinal data. Stat Med 2018; 37: 829–846.

27. McCandless, Gustafson, Levy. A sensitivity analysis using information about measured confounders yielded improved uncertainty assessments for unmeasured confounding. J Clin Epidemiol 2008; 61: 247–255.

28. Lanza, Coffman. Causal inference in latent class analysis. Struct Equ Modelling 2013; 20: 361–383.

29. Lanza, Schuler, Bray. Latent class analysis with causal inference: the effect of adolescent depression on young adult substance use profile. In: von Eye A and Wiedermann W (eds) Statistics and causality: Methods for applied empirical research. Hoboken, New Jersey: John Wiley & Sons, 2016.

30. Rupp, Templin, Henson. Diagnostic measurement: Theory, methods, and applications. New York, NY: The Guilford Press, 2010.

31. Dolgin, Fox, Gorlin, et al. Nomenclature and criteria for diagnosis of diseases of the heart and great vessels, 9th ed. Boston, MA: Lippincott Williams and Wilkins, 1994.

32. Rubin. Randomization analysis of experimental data: the Fisher randomization test comment. J Am Stat Assoc 1980; 75: 591–593.

33. Bernardo, Smith. Bayesian theory, vol. 405. Chichester, England: John Wiley & Sons, 2000.

34. Schwarz, et al. Estimating the dimension of a model. Ann Stat 1978; 6: 461–464.

35. Spiegelhalter, Best, Carlin, et al. Bayesian measures of model complexity and fit. J R Stat Soc Ser B (Stat Methodol) 2002; 64: 583–639.

36. Petersen, Schwab, Gruber, et al. Targeted maximum likelihood estimation for dynamic and static longitudinal marginal structural working models. J Causal Inference 2014; 2: 147–185.

37. Schnitzer, Moodie, van der Laan, et al. Modeling the impact of hepatitis C viral clearance on end-stage liver disease in an HIV co-infected cohort with targeted maximum likelihood estimation. Biometrics 2014; 70: 144–152.

38. Lendle, Schwab, Petersen, et al. ltmle: An R package implementing targeted minimum loss-based estimation for longitudinal data. J Stat Softw 2017; 81: 1–21.

39. Geweke. Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. Staff Report 148, Federal Reserve Bank of Minneapolis, 1991. https://ideas.repec.org/p/fip/fedmsr/148.html.

40. Mendez, Lipton, Ramsey-Goldman, et al. US incidence of juvenile dermatomyositis, 1995–1998: Results from the National Institute of Arthritis and Musculoskeletal and Skin Diseases registry. Arthritis Care Res 2003; 49: 300–305.

41. Lam, Manlhiot, Pullenayegum, et al. Efficacy of intravenous Ig therapy in juvenile dermatomyositis. Ann Rheum Dis 2011; 70: 2089–2094.

42. Feldman, Ayling-Campos, Luy, et al. Measuring disability in juvenile dermatomyositis: Validity of the childhood health assessment questionnaire. J Rheumatol 1995; 22: 326–331.
