Sage Journals: Discover world-class research

Abstract

In clinical research, it is important to study whether certain clinical factors or exposures have causal effects on clinical and patient-reported outcomes such as toxicities, quality of life, and self-reported symptoms, which can help improve patient care. Usually, such outcomes are recorded as multiple variables with different distributions. Mendelian randomization (MR) is a commonly used technique for causal inference with the help of genetic instrumental variables to deal with observed and unobserved confounders. Nevertheless, the current methodology of MR for multiple outcomes only focuses on one outcome at a time, meaning that it does not consider the correlation structure of multiple outcomes, which may lead to a loss of statistical power. In situations with multiple outcomes of interest, especially when there are mixed correlated outcomes with different distributions, it is much more desirable to jointly analyze them with a multivariate approach. Some multivariate methods have been proposed to model mixed outcomes; however, they do not incorporate instrumental variables and cannot handle unmeasured confounders. To overcome the above challenges, we propose a two-stage multivariate Mendelian randomization method (MRMO) that can perform multivariate analysis of mixed outcomes using genetic instrumental variables. We demonstrate that our proposed MRMO algorithm can gain power over the existing univariate MR method through simulation studies and a clinical application on a randomized Phase III clinical trial study on colorectal cancer patients.

Keywords

Mendelian randomization, mixed correlated outcomes, multivariate analysis, instrumental variable, toxicity and quality of life, high-dimensional modeling

1. Introduction

With the increasing collection of data on various clinical and patient-reported outcomes (e.g. toxicities and quality of life measures) in clinical and observational studies, there is growing interest in detecting the causal relationship between an exposure variable (X) and multiple outcomes of interest (Y).^1–5 For example, some baseline clinical characteristics may affect a patient's risk of experiencing certain toxicities after receiving treatment. Finding and studying such causal effects may help clinicians predict the risk of adverse events and make better treatment decisions.^6,7 This can be illustrated through the analysis of data from the CO.20 trial conducted by the Canadian Cancer Trials Group, which was a phase III randomized, placebo-controlled study aiming to examine the effect of the addition of brivanib (BRI) to cetuximab on colorectal cancer patients.^8–10 The original analysis of the study showed that compared to cetuximab plus placebo treatment, cetuximab plus BRI was associated with increased toxicities and did not significantly improve the overall survival of patients.⁸ It is of interest to investigate whether certain baseline variables have any effect on the selected toxicities that a patient may experience when undergoing the cetuximab plus placebo treatment, with the mixed-type measurements for these toxicities.

There are two major challenges, however, in these analyses. One challenge is that, although patients may be randomized by treatment, they may not be balanced by the baseline variable X of interest, and therefore, directly modelling the relationship between X and Y without considering the confounders, whether measured or unmeasured, may lead to misleading results.¹¹ The other challenge is that sometimes the multiple outcomes of interest may be correlated and mixed with different types of distributions (e.g. binary and continuous).¹² Studying each outcome separately is a commonly used straightforward approach but does not use the correlation information between outcomes, while modelling mixed outcomes together may utilize more information but require more complex models.^13,14

From a clinical standpoint, there exists a major analytical gap in our ability to simultaneously study the influence of a causal effect on multiple outcomes, particularly in the context of toxicities. In the areas of rheumatology, immunology/transplantation, and oncology, it is common for patients and physicians to accept multiple moderate to severe toxicities in their treatments, as these treatments are considered life-saving. Most of the drugs used in these settings exhibit widely variable toxicities from person to person. Understanding the etiological factors that lead to significant toxicities in some patients but not others is fundamental to precision medicine. Innovative methods for analyzing individual etiological factors on multi-dimensional toxicities, particularly those that are graded differently (e.g. ordinal symptoms vs continuous laboratory-based toxicities) are greatly needed.

Facing the challenge of confounders, people have been applying the Mendelian randomization (MR) approach, which uses genetic variants as instrumental variables.¹⁵ Under certain instrumental variable assumptions, MR methods can efficiently estimate and make inferences on the causal effect of the exposure on a single outcome. Different MR methods have been proposed to address a variety of problems, such as how to handle invalid instruments and how to incorporate multiple exposures.^16–21 Nevertheless, we are not aware of any existing MR techniques that allow researchers to jointly model mixed outcome variables. When multiple outcomes need to be examined, people usually conduct univariate analysis, which analyzes each outcome separately. This may not be ideal for testing the overall hypothesis, whether the exposure has any effects on any outcomes, since the correlation structure between different outcomes is not considered, and Bonferroni correction is usually required, which is known to be conservative.²²

To address the above issues, it is desirable to develop a multivariate MR approach that can handle mixed outcomes. Different methods have been proposed to directly model mixed outcomes jointly,^12,23–26 but it is computationally challenging to implement most of them in the MR framework, especially when the full likelihood is considered. Bai et al.²⁷ proposed using a composite-likelihood method with the Newton–Raphson algorithm to alleviate the computational issue.²⁸ However, all these existing methods only model the relationship between X and Y without using instrumental variables, which means they cannot handle unmeasured confounders.

We have developed a two-stage multivariate MR method for mixed outcomes, denoted by MRMO, that combines the two-stage MR approach with the composite-likelihood-based mixed response model to assess the causal effects of the exposure on different outcomes. For instance, MRMO can be used to test whether a baseline variable (e.g. baseline magnesium) has a causal effect on potential disease outcomes such as rash and nausea, as its MR component can handle the effects of potential confounders such as age, gender, drinking status, etc. We propose using the gradient descent algorithm when jointly modeling mixed outcomes, which has been shown to have a number of advantages over the commonly used Newton–Raphson algorithm.^29–31 In terms of testing the overall hypothesis, it is natural to consider the Wald test for multivariate models. Besides that, we propose conducting the aSPU (adaptive sum of powered score) test for multiple outcomes,^32,33 which can combine a class of powered sum tests to gain power in certain scenarios. Through extensive simulations and an application on a randomized Phase III clinical trial study (CO.20) on colorectal cancer patients,^8–10 we demonstrate that compared to the commonly used univariate MR analysis, our proposed multivariate approach is able to improve the power of testing the overall hypothesis while controlling type I errors, which can help clinicians find potential associations and generate new hypotheses.

This manuscript is structured as follows. In section 2, we introduce the concept of univariate MR in the presence of multiple outcomes and then present our multivariate MR approach to model mixed outcomes jointly. We also propose different algorithms and tests for our multivariate analysis. In section 3, we show the settings and results of our simulation studies, as well as an application to the CO.20 study data to compare the performance of different methods. In section 4, we present discussions summarizing the advantages and limitations of our new approach and propose some potential future directions.

2. Methods

2.1 Univariate model for Mendelian randomization with multiple outcomes

In this section, we illustrate the basic framework of traditional MR analysis, depicted by Figure 1(A). Suppose we are interested in studying the causal relationship between an exposure (X) and a single outcome (Y₁). Directly modelling Y₁ versus X may give us very problematic results, since there may be confounders (U) that are unmeasured or not included in the model. MR analysis incorporates genetic variants G, typically SNPs (single-nucleotide polymorphisms), to serve as instrumental variables (IVs) for causal inference. Using IVs can help us avoid the confounding problem if certain IV assumptions are met (i.e. IVs should be associated with X; IVs should not affect or be affected by U; and IVs should not affect Y₁ through any pathways other than X). For example, in a colorectal cancer study,^8–10 clinicians may be interested in examining whether an exposure variable like body mass index (BMI) has a causal effect on a disease outcome like quality of life. With the causal estimand defined as the average change in the quality of life for a unit change in BMI, which can be denoted as $E (Y_{j} | X = x + 1) - E (Y_{j} | X = x)$ , it is hard to estimate or test this causal effect without the MR technique since there may be measured or unmeasured potential confounders (e.g. drinking status and physical activities) that affect the effect estimates. Using the MR technique properly with certain genetic instrumental variables, which are associated with BMI but independent of the potential confounders, the causal effects can be better estimated by alleviating the confounding effects.¹⁵

Figure 1.

Basic framework. (A) MR analysis with a single outcome. (B) MR with multiple outcomes and univariate analysis. (C) MR with multiple outcomes and multivariate analysis.

When there are multiple outcomes of interest (e.g. different types of toxicities such as rash, nausea, and vomiting in cancer trials), people normally conduct several univariate analyses separately, which applies MR to one outcome at a time, as depicted in Figure 1(B) for two outcomes Y₁ and Y₂. However, analyzing each outcome separately means ignoring the correlation information from the multiple outcomes, which may lead to loss of power, especially when we want to test the overall null hypothesis: the exposure does not affect any of the outcomes. Meanwhile, multivariate analysis, illustrated in Figure 1(C), models and analyzes multiple outcomes in a single model simultaneously. It may give us a better framework for testing the null hypothesis while accounting for the correlation structure of the different outcomes.

Before presenting our multivariate approach, we would like to give a brief overview of the two-stage univariate MR analyses using individual level data in a two-sample scenario, where we have two different datasets with different subjects for the exposure and outcomes. Suppose we have an exposure variable X, p genetic IVs and q outcomes, and the first sample set contains $n^{*}$ subjects with $G^{*} = (G_{i l}^{*})_{n^{*} \times p}$ and $X^{*} = (X_{i}^{*})_{n^{*} \times 1}$ , where $G_{i l}^{*}$ and $X_{i}^{*}$ are the l th genotype value and the exposure value, respectively, for the i th subject in the first sample. The second sample set contains n subjects with $G = (G_{i l})_{n \times p}$ and $Y = (Y_{i j})_{n \times q}$ , where $G_{i l}$ and $Y_{i j}$ are the l th genotype value and j th outcome value, respectively, for the i th subject in the second sample set. In the two-sample scenario, the $n^{*}$ subjects and n subjects in the two sample sets do not overlap.

The first-stage regression model for the univariate MR is

\begin{aligned} X_{i}^{*} = α_{0} + \sum_{l = 1}^{p} α_{l} G_{i l}^{*} + e_{X i}^{*} . \end{aligned}

(1)

The first-stage coefficient estimates, denoted by ${\hat{α}}_{0}$ and ${\hat{α}}_{l}$'s, are used to predict the exposure value in the second sample, with ${\hat{X}}_{i} = {\hat{α}}_{0} + \sum_{l = 1}^{p} {\hat{α}}_{l} G_{i l}$. Then the second-stage regression models are

\begin{aligned} \left\{ \begin{array}{ll} Y_{i j} = β_{j 0} + β_{j 1} {\hat{X}}_{i} + e_{Y, i j} & (\text{if outcome } j \text{ is continuous}) \\ \operatorname{logit} P (Y_{i j} = 1) = β_{j 0} + β_{j 1} {\hat{X}}_{i} & (\text{if outcome } j \text{ is binary}) \end{array} \right. . \end{aligned}

(2)

Note that for each outcome j, a single regression model is fitted separately, and the correlation structure between different outcomes is ignored. There are possible variations of the two-stage univariate MR analysis, including the adjusted two-stage method. The literature has shown that the two-stage method with residual inclusion, denoted by 2SRI, may be more suitable for binary outcomes than the standard two-stage method without residual inclusion.³⁴ A study by Xue and Pan³⁵ has also shown that the standard method may perform well and even better than 2SRI in certain scenarios, given that the effect sizes of the SNPs are usually small. For simplicity, in this article, we only focus on the standard method, which is more commonly used and can provide a valid test of the association between the exposure and each outcome.¹⁵ The p-value of testing the exposure effect on a single outcome j is simply the p-value of the corresponding coefficient $β_{j 1}$. For binary outcomes, we also consider replacing logistic regression with probit regression, as it also allows us to test the association and is more comparable to the mixed response model we will describe, which uses the probit link.
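As a rough illustration of the two-sample, two-stage procedure in equations (1) and (2), the following Python sketch handles continuous outcomes only, using ordinary least squares in both stages. The function names, the toy effect sizes, and the simulated data are assumptions made for this example and are not taken from the study; binary outcomes would need a logistic or probit fit in the second stage.

```python
import numpy as np

def ols(design, y):
    """Least-squares coefficients for y ~ design (design includes the intercept column)."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def two_stage_mr(G_star, X_star, G, Y):
    """Two-sample two-stage MR for continuous outcomes.

    G_star (n*, p), X_star (n*,): genotypes and exposure in the first sample.
    G (n, p), Y (n, q): genotypes and continuous outcomes in the second sample.
    Returns the q second-stage slopes beta_{j1}.
    """
    # Stage 1: regress the exposure on the IVs in the first sample.
    alpha_hat = ols(np.column_stack([np.ones(len(G_star)), G_star]), X_star)
    # Predict the exposure in the second sample from its genotypes.
    X_hat = np.column_stack([np.ones(len(G)), G]) @ alpha_hat
    # Stage 2: regress each outcome on the predicted exposure, one at a time.
    design = np.column_stack([np.ones(len(G)), X_hat])
    return np.array([ols(design, Y[:, j])[1] for j in range(Y.shape[1])])

# Toy data (hypothetical values): 5 SNPs, one unmeasured confounder U,
# true causal effects 0.5 (first outcome) and 0 (second outcome).
rng = np.random.default_rng(0)
n, p = 20000, 5

def simulate(n_subjects):
    G = rng.binomial(2, 0.3, size=(n_subjects, p)).astype(float)
    U = rng.normal(size=n_subjects)
    X = G @ np.full(p, 0.5) + U + rng.normal(size=n_subjects)
    Y = np.column_stack([0.5 * X + U + rng.normal(size=n_subjects),
                         U + rng.normal(size=n_subjects)])
    return G, X, Y

G_star, X_star, _ = simulate(n)   # first sample: only G* and X* are used
G, X, Y = simulate(n)             # second sample: only G and Y are used by MR
beta_mr = two_stage_mr(G_star, X_star, G, Y)
# Naive regression of Y_1 on the observed X, which is confounded by U.
beta_naive = ols(np.column_stack([np.ones(n), X]), Y[:, 0])[1]
```

In this toy setting the naive slope absorbs the confounder's contribution, while the two-stage slopes should stay close to the simulated effects.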

2.2 Multivariate mixed response model for Mendelian randomization

In this section, we propose our two-stage multivariate MR method for mixed outcomes, which is expected to be more efficient than the univariate method when testing the overall null hypothesis (the exposure does not affect any of the outcomes). The first-stage model is the same as what we have described in the univariate two-stage MR method, of which the main purpose is to model the association between the genetic information and the exposure for the second sample. In the second stage, instead of modeling each outcome separately, we propose to model different outcomes jointly using a multivariate mixed response model.^26,27 The relationship between each outcome and the fitted exposure can be modelled using a generalized linear model

\begin{aligned} g_{j} (E (Y_{i j})) = β_{j 0} + β_{j 1} {\hat{X}}_{i}, \end{aligned}

(3)

where the link function $g_{j}$ can be chosen as the identity function for continuous outcomes, and the logit or probit function for binary outcomes.

To model all the outcomes simultaneously while incorporating their correlation structure, applying a conventional likelihood-based approach is possible but computationally very challenging. Using the pairwise composite likelihood approach described by Bai et al.²⁷ may be a better option. For each pair of outcomes $(j, k)$, the pairwise likelihood function can be written as

\begin{aligned} L_{j k} (θ_{j k}) = \prod_{i = 1}^{n} f (Y_{i j}, Y_{i k}), \end{aligned}

(4)

where $θ_{j k}$ represents the relevant parameters for $L_{j k}$, and $f (Y_{i j}, Y_{i k})$ is the pairwise likelihood for a single subject i only considering the joint distribution of outcomes $j, k$, which we will discuss shortly. Then the pairwise composite likelihood should be

\begin{aligned} CL (θ) = \prod_{j = 1}^{q - 1} \prod_{k = j + 1}^{q} L_{j k} (θ_{j k}), \end{aligned}

(5)

where $θ$ represents all of the parameters.^36,37 The composite score function is

\begin{aligned} U (θ) = \frac{\partial}{\partial θ} \log CL (θ), \end{aligned}

(6)

which means we can estimate the parameters by solving $U (θ) = 0$. Now we only need to specify $f (Y_{i j}, Y_{i k})$ properly according to the data types.
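The structure of equations (4) and (5) can be sketched in a few lines of Python: on the log scale, the composite likelihood is a sum of pairwise log-densities over subjects and over outcome pairs with $j < k$. The callback `pair_logpdf` is a hypothetical placeholder for whichever pairwise density applies to a given pair; it is an assumption of this sketch, not part of the original method description.

```python
from itertools import combinations

def log_composite_likelihood(Y, pair_logpdf):
    """Sum of log f(Y_ij, Y_ik) over subjects i and outcome pairs j < k.

    Y: list of per-subject outcome rows, each of length q.
    pair_logpdf(j, k, y_j, y_k): hypothetical callback returning log f for pair (j, k).
    """
    q = len(Y[0])
    total = 0.0
    for row in Y:
        for j, k in combinations(range(q), 2):
            total += pair_logpdf(j, k, row[j], row[k])
    return total

# With q = 3 outcomes there are 3 pairs per subject, so a callback that
# always returns 1.0 yields 2 subjects x 3 pairs = 6.0.
toy_Y = [[0, 1, 2.5], [1, 0, -0.3]]
toy_value = log_composite_likelihood(toy_Y, lambda j, k, a, b: 1.0)
```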

If a pair of outcomes $(j, k)$ are both continuous, we assume that they follow

\begin{aligned} (\begin{matrix} Y_{i j} \\ Y_{i k} \end{matrix}) \sim MVN ((\begin{matrix} μ_{i j} \\ μ_{i k} \end{matrix}), (\begin{array}{cc} σ_{j}^{2} & ρ_{j k} σ_{j} σ_{k} \\ ρ_{j k} σ_{j} σ_{k} & σ_{k}^{2} \end{array})), \end{aligned}

(7)

where $μ_{i j} = β_{j 0} + β_{j 1} {\hat{X}}_{i}$ and $μ_{i k} = β_{k 0} + β_{k 1} {\hat{X}}_{i}$, which means $θ_{j k}$ consists of $β_{j 0}$, $β_{j 1}$, $β_{k 0}$, $β_{k 1}$, $σ_{j}$, $σ_{k}$ and $ρ_{j k}$. Based on the density of a bivariate normal distribution, we have

\begin{aligned} f (Y_{i j}, Y_{i k}) = \frac{1}{2 π σ_{j} σ_{k} \sqrt{1 - ρ_{j k}^{2}}} \exp \left\{ - \frac{1}{2 (1 - ρ_{j k}^{2})} \left[ \frac{{(Y_{i j} - μ_{i j})}^{2}}{σ_{j}^{2}} - \frac{2 ρ_{j k} (Y_{i j} - μ_{i j}) (Y_{i k} - μ_{i k})}{σ_{j} σ_{k}} + \frac{{(Y_{i k} - μ_{i k})}^{2}}{σ_{k}^{2}} \right] \right\} . \end{aligned}

(8)
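For the continuous-continuous case, a minimal Python sketch of the log of the standard bivariate normal density is given here. The function name and the toy inputs are assumptions for illustration; the quick check at the end uses the fact that with $ρ_{j k} = 0$ the joint density factorizes into two univariate normal densities.

```python
import math
from statistics import NormalDist

def log_f_continuous_pair(y_j, y_k, mu_j, mu_k, sigma_j, sigma_k, rho):
    """Log bivariate normal density for a pair of continuous outcomes."""
    z_j = (y_j - mu_j) / sigma_j
    z_k = (y_k - mu_k) / sigma_k
    one_minus_r2 = 1.0 - rho * rho
    # Quadratic form, including the cross term weighted by the correlation.
    quad = (z_j * z_j - 2.0 * rho * z_j * z_k + z_k * z_k) / one_minus_r2
    return (-math.log(2.0 * math.pi * sigma_j * sigma_k)
            - 0.5 * math.log(one_minus_r2)
            - 0.5 * quad)

# Toy check (hypothetical values): with rho = 0 the pair is independent.
joint = log_f_continuous_pair(1.2, -0.4, 1.0, 0.0, 2.0, 3.0, 0.0)
independent = (math.log(NormalDist(1.0, 2.0).pdf(1.2))
               + math.log(NormalDist(0.0, 3.0).pdf(-0.4)))
```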

If a pair of outcomes $(j, k)$ are both binary, following Bai et al.,²⁷ we set up a pair of latent variables $(Z_{i j}, Z_{i k})$ and assume

\begin{aligned} Y_{i j} & = \left\{ \begin{array}{ll} 1 & (\text{if } Z_{i j} \geq 0) \\ 0 & (\text{if } Z_{i j} < 0) \end{array} \right. , \end{aligned}

(9)

\begin{aligned} Y_{i k} & = \left\{ \begin{array}{ll} 1 & (\text{if } Z_{i k} \geq 0) \\ 0 & (\text{if } Z_{i k} < 0) \end{array} \right. , \end{aligned}

(10)

\begin{aligned} (\begin{matrix} Z_{i j} \\ Z_{i k} \end{matrix}) & \sim MVN ((\begin{matrix} μ_{i j} \\ μ_{i k} \end{matrix}), (\begin{array}{cc} 1 & ρ_{j k} \\ ρ_{j k} & 1 \end{array})), \end{aligned}

(11)

where $μ_{i j} = β_{j 0} + β_{j 1} {\hat{X}}_{i}$ and $μ_{i k} = β_{k 0} + β_{k 1} {\hat{X}}_{i}$. Note that modelling $Y_{i j}$ in this way is similar to using a probit regression model. Since $(Z_{i j} - μ_{i j})$ follows a standard normal distribution, we know $P (Y_{i j} = 1) = P (Z_{i j} \geq 0) = P (Z_{i j} - μ_{i j} \geq - μ_{i j}) = Φ (μ_{i j})$, with $Φ$ being the cumulative distribution function of the standard normal distribution. As a result, we have $\operatorname{probit} P (Y_{i j} = 1) = β_{j 0} + β_{j 1} {\hat{X}}_{i}$. Let $s_{i j} = 2 Y_{i j} - 1$ and $s_{i k} = 2 Y_{i k} - 1$, so that each takes the value 1 when the corresponding outcome is 1 and −1 when it is 0. According to Bai et al.,²⁷ we have

\begin{aligned} f (Y_{i j}, Y_{i k}) = P (Y_{i j}, Y_{i k}) = Φ_{2} (s_{i j} μ_{i j}, s_{i k} μ_{i k}, s_{i j} s_{i k} ρ_{j k}), \end{aligned}

(12)

where $Φ_{2} (\cdot, \cdot, ρ)$ is the cumulative distribution function of the standard bivariate normal distribution with correlation $ρ$.
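The binary-binary probability in equation (12) can be sketched in Python using only the standard library. This sketch assumes the sign variables are $s = 2Y - 1$ (the usual bivariate probit form) and approximates $Φ_{2}$ by one-dimensional numerical integration with Simpson's rule; the integration limits, step count, and function names are choices made for this example rather than anything prescribed by the method.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal

def phi2(a, b, rho, steps=2000, lower=-8.0):
    """Standard bivariate normal CDF P(Z1 <= a, Z2 <= b) with correlation rho.

    Uses Phi2(a, b, rho) = int_{-inf}^{a} phi(x) Phi((b - rho x) / sqrt(1 - rho^2)) dx,
    approximated by Simpson's rule on [lower, a] (steps must be even).
    """
    if a <= lower:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)
    h = (a - lower) / steps

    def integrand(x):
        return _STD.pdf(x) * _STD.cdf((b - rho * x) / scale)

    total = integrand(lower) + integrand(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * h)
    return total * h / 3.0

def pair_prob_binary(y_j, y_k, mu_j, mu_k, rho):
    """P(Y_ij = y_j, Y_ik = y_k) for two binary outcomes under the latent model."""
    s_j, s_k = 2 * y_j - 1, 2 * y_k - 1
    return phi2(s_j * mu_j, s_k * mu_k, s_j * s_k * rho)

# Toy values (hypothetical): the four cell probabilities should sum to one.
cells = [pair_prob_binary(a, b, 0.3, -0.5, 0.4) for a in (0, 1) for b in (0, 1)]
```

A known closed form offers a sanity check: $Φ_{2} (0, 0, ρ) = 1/4 + \arcsin (ρ) / (2 π)$, which equals 1/3 at $ρ = 0.5$.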

If one outcome j is binary, and another outcome k is continuous, we assume that

\begin{aligned} Y_{i j} & = \left\{ \begin{array}{ll} 1 & (\text{if } Z_{i j} \geq 0) \\ 0 & (\text{if } Z_{i j} < 0) \end{array} \right. , \end{aligned}

(13)

\begin{aligned} (\begin{matrix} Z_{i j} \\ Y_{i k} \end{matrix}) & \sim MVN ((\begin{matrix} μ_{i j} \\ μ_{i k} \end{matrix}), (\begin{array}{cc} 1 & ρ_{j k} σ_{k} \\ ρ_{j k} σ_{k} & σ_{k}^{2} \end{array})), \end{aligned}

(14)

which gives us

\begin{aligned} Z_{i j} | Y_{i k} \sim N (μ_{i j | k} = μ_{i j} + \frac{ρ_{j k}}{σ_{k}} (Y_{i k} - μ_{i k}), σ_{j | k}^{2} = 1 - ρ_{j k}^{2}) . \end{aligned}

(15)

It can be derived that

\begin{aligned} f (Y_{i j}, Y_{i k}) = P (Y_{i j} | Y_{i k}) f (Y_{i k}) = [1 (Y_{i j} = 0) Φ (- \frac{μ_{i j | k}}{σ_{j | k}}) + 1 (Y_{i j} = 1) Φ (\frac{μ_{i j | k}}{σ_{j | k}})] \frac{1}{σ_{k} \sqrt{2 π}} \exp (- \frac{{(Y_{i k} - μ_{i k})}^{2}}{2 σ_{k}^{2}}) . \end{aligned}

(16)
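A short Python sketch of the binary-continuous density in equations (15) and (16) follows. The function name and toy inputs are assumptions for illustration. It uses $Φ (- x) = 1 - Φ (x)$ to handle the two values of the binary outcome.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal

def f_mixed_pair(y_j, y_k, mu_j, mu_k, sigma_k, rho):
    """Joint density of a binary outcome y_j (0 or 1) and a continuous outcome y_k."""
    # Conditional mean and standard deviation of the latent Z_ij given Y_ik.
    mu_cond = mu_j + rho / sigma_k * (y_k - mu_k)
    sd_cond = math.sqrt(1.0 - rho * rho)
    p_one = _STD.cdf(mu_cond / sd_cond)          # P(Y_ij = 1 | Y_ik)
    p_yj = p_one if y_j == 1 else 1.0 - p_one    # P(Y_ij = y_j | Y_ik)
    return p_yj * NormalDist(mu_k, sigma_k).pdf(y_k)

# Toy values (hypothetical): summing over y_j recovers the marginal density of y_k.
marginal = f_mixed_pair(0, 1.1, 0.2, 0.5, 2.0, 0.6) + f_mixed_pair(1, 1.1, 0.2, 0.5, 2.0, 0.6)
```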

As a result, we can calculate $f (Y_{i j}, Y_{i k})$ for any pair of outcomes given $θ$, and we can find the best $θ$ by solving $U (θ) = 0$. To perform hypothesis testing, we also need to obtain the covariance matrix of the coefficient estimates $\hat{θ}$. Based on the asymptotic theories for the composite likelihood function,^27,38–40 the covariance matrix $Cov (\hat{θ})$ can be estimated as

\begin{aligned} \frac{1}{n} [H (\hat{θ}) J^{- 1} (\hat{θ}) H (\hat{θ})]^{- 1}, \end{aligned}

(17)

where

\begin{aligned} H (\hat{θ}) & = - \frac{1}{n} \sum_{i = 1}^{n} \nabla_{θ} U (θ; Y_{i}) |_{\hat{θ}}, \end{aligned}

(18)

\begin{aligned} J (\hat{θ}) & = \frac{1}{n} \sum_{i = 1}^{n} U (\hat{θ}; Y_{i}) U (\hat{θ}; Y_{i})^{'} . \end{aligned}

(19)

$U (θ; Y_{i})$ denotes the score for subject i, which can be written as

\begin{aligned} \frac{\partial}{\partial θ} \log \prod_{j = 1}^{q - 1} \prod_{k = j + 1}^{q} f (Y_{i j}, Y_{i k}) . \end{aligned}

(20)
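The sandwich form in equation (17) can be sketched with NumPy as follows. In this sketch $J$ is taken as the average outer product of the per-subject scores (a positive semi-definite matrix), and $H$ is supplied by the caller; the function name and the toy example (estimating a mean with a working variance of 1) are assumptions made for illustration.

```python
import numpy as np

def sandwich_cov(scores, hessian):
    """Estimate Cov(theta_hat) as (1/n) [H J^{-1} H]^{-1}.

    scores: (n, d) array whose i-th row is the score U(theta_hat; Y_i).
    hessian: (d, d) array H, minus the average derivative of the score.
    """
    n = scores.shape[0]
    J = scores.T @ scores / n                        # variability matrix
    godambe = hessian @ np.linalg.solve(J, hessian)  # H J^{-1} H
    return np.linalg.inv(godambe) / n

# Toy example (hypothetical data): theta is a mean, the per-subject score is
# Y_i - theta, and minus its derivative is 1, so H = [[1]].
y = np.array([0.2, 1.5, -0.7, 2.1, 0.9, -1.3, 0.4, 1.8])
theta_hat = y.mean()
scores = (y - theta_hat).reshape(-1, 1)
H = np.array([[1.0]])
cov = sandwich_cov(scores, H)
```

In this one-parameter toy case the sandwich estimate reduces to the sample variance of `y` divided by `n`, which is the familiar variance of a sample mean.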

To estimate $θ$, we can use the traditional Newton–Raphson method, sometimes called Newton's method, as suggested by Bai et al.²⁷ to solve $U (θ) = 0$. However, as explored by several researchers,^29–31 this algorithm relies heavily on the computation of the second derivative and may encounter problems like saddle points. If the initial estimates are far from the true parameters, Newton's method may not work well. Hence, we propose to apply the gradient descent approach, which is more widely used in machine learning.^41–43 Aiming to find $\hat{θ}$ that maximizes $\log CL (θ)$ (equivalently, minimizes $- \log CL (θ)$), our algorithm is described below:

1. Choose the initial values of the parameters, denoted by $θ^{(0)}$. A simple but effective option is to use the marginal model estimates based on the approach described in section 2.1.

2. Apply the gradient descent method to update the parameter estimates. For iteration $t = 0, 1, 2, \dots$, we have

\begin{aligned} θ^{(t + 1)} = θ^{(t)} - γ_{t} U (θ^{(t)}) . \end{aligned}

(21)

3. Calculate the new score function using the updated parameters.

4. Repeat the previous two steps until convergence. By default, the algorithm is stopped when $\| θ^{(t + 1)} - θ^{(t)} \|_{1} \leq 1 / n$.

Note that the choice of step size $γ_{t}$ is crucial to the gradient descent algorithm. We recommend using the Barzilai–Borwein (BB) method, which is relatively straightforward and performs well in most scenarios.^44,45 According to the BB method, the step size is chosen as

\begin{aligned} γ_{t} = \frac{{(θ^{(t)} - θ^{(t - 1)})}^{'} [U (θ^{(t)}) - U (θ^{(t - 1)})]}{[U (θ^{(t)}) - U (θ^{(t - 1)})]^{'} [U (θ^{(t)}) - U (θ^{(t - 1)})]} . \end{aligned}

(22)
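The update in equation (21) with the BB step size in equation (22) might be sketched as below. This is a generic illustration on a toy quadratic objective, not the composite-likelihood fit itself. Here `grad` is assumed to return the gradient of the function being minimized (for the model above, that would be the gradient of $- \log CL (θ)$), and the fixed first step `gamma0` is an assumption of this sketch, needed because equation (22) requires two iterates.

```python
import numpy as np

def bb_gradient_descent(grad, theta0, tol=1e-10, gamma0=1e-3, max_iter=1000):
    """Gradient descent with Barzilai-Borwein step sizes.

    grad: callable returning the gradient of the objective to be minimized.
    Stops when the L1 norm of the parameter change is at most tol.
    """
    theta_prev = np.asarray(theta0, dtype=float)
    g_prev = grad(theta_prev)
    theta = theta_prev - gamma0 * g_prev      # first step uses a small fixed size
    for _ in range(max_iter):
        g = grad(theta)
        d_theta = theta - theta_prev
        d_g = g - g_prev
        denom = d_g @ d_g
        if denom == 0.0:                      # gradient no longer changes
            break
        gamma = (d_theta @ d_g) / denom       # BB step size, as in equation (22)
        theta_prev, g_prev = theta, g
        theta = theta - gamma * g             # update, as in equation (21)
        if np.sum(np.abs(theta - theta_prev)) <= tol:
            break
    return theta

# Toy objective (hypothetical): 0.5 * theta' A theta - b' theta, minimized at A^{-1} b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
theta_hat = bb_gradient_descent(lambda th: A @ th - b, [0.0, 0.0])
```

The default tolerance here is arbitrary; the stopping rule in the text uses $1 / n$ instead.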

After obtaining $\hat{θ}$, we can estimate the covariance matrix $Cov (\hat{θ})$ based on the previously described formulas. Then we can carry out the Wald test to test different null hypotheses, including whether the exposure affects a certain outcome j $(H_{0 j} : β_{j 1} = 0)$ and whether the exposure affects any outcome $(H_{0} : β_{11} = β_{21} = \dots = β_{q 1} = 0)$.

2.3 Testing the overall hypothesis

In this section, we discuss more on the different choices for testing the overall hypothesis $H_{0} : β_{11} = β_{21} = \dots = β_{q 1} = 0$ , using univariate or multivariate MR methods. For the univariate analysis, the minP test with Bonferroni correction is usually applied. Suppose the p-values for the exposure effects on different outcomes are $p_{UVA, 1}$ , $p_{UVA, 2}$ , …, $p_{UVA, q}$ . The p-value for the overall test is $min (q p_{UVA, 1}, q p_{UVA, 2}, \dots, q p_{UVA, q}, 1)$ . This method does not take into account the correlation between different outcomes, and it tends to be conservative.
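The Bonferroni-corrected minP rule described above is simple enough to state as a short Python function; the function name and the example p-values are assumptions for illustration.

```python
def bonferroni_min_p(p_values):
    """Overall p-value: min(q * p_1, ..., q * p_q, 1) for q univariate tests."""
    q = len(p_values)
    return min(1.0, q * min(p_values))

# Hypothetical univariate p-values for q = 3 outcomes.
overall_p = bonferroni_min_p([0.01, 0.2, 0.5])
```

With these three p-values the smallest one, 0.01, is multiplied by 3; if every corrected value exceeds 1, the result is capped at 1.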

For the multivariate analysis, once we have $\hat{θ}$ and $Cov (\hat{θ})$ , we can conduct the Wald test. From $\hat{θ}$ and $Cov (\hat{θ})$ , we can easily extract ${\hat{β}}_{1} = ({\hat{β}}_{11}, \dots, {\hat{β}}_{q 1})^{'}$ and $Cov ({\hat{β}}_{1})$ . Then the test statistic is

\begin{aligned} T_{Wald} = {\hat{β}}_{1}^{'} [Cov ({\hat{β}}_{1})]^{- 1} {\hat{β}}_{1}, \end{aligned}

(23)

which approximately follows a chi-square distribution with q degrees of freedom under $H_{0}$.
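A minimal Python sketch of the Wald statistic in equation (23) and its chi-square p-value is given here. The function names are assumptions for illustration. To avoid extra dependencies, the chi-square upper tail is computed from the closed-form recurrence for the regularized incomplete gamma function at integer and half-integer arguments, rather than from a statistics library.

```python
import math
import numpy as np

def chi2_sf(x, df):
    """Upper tail P(chi2_df > x) for a positive integer df."""
    y = x / 2.0
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-y)               # df = 2
    else:
        a, tail = 0.5, math.erfc(math.sqrt(y))    # df = 1
    # Recurrence Q(a + 1, y) = Q(a, y) + y^a e^{-y} / Gamma(a + 1).
    while a < df / 2.0:
        tail += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return tail

def wald_test(beta1_hat, cov_beta1):
    """Wald statistic beta' Cov^{-1} beta and its chi-square p-value (df = q)."""
    beta = np.asarray(beta1_hat, dtype=float)
    cov = np.asarray(cov_beta1, dtype=float)
    stat = float(beta @ np.linalg.solve(cov, beta))
    return stat, chi2_sf(stat, len(beta))

# Toy inputs (hypothetical): two uncorrelated unit-variance estimates.
stat, p_value = wald_test([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
```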

Meanwhile, we propose to apply the aSPU test as an alternative method, which can combine a class of tests and may gain power when testing $H_{0}$ .^32,33 Denote $\tilde{Z} = ({\tilde{z}}^{(1)}, \dots, {\tilde{z}}^{(q)})^{'} = C^{- 1 / 2} {\hat{β}}_{1}$ , where $C$ is a diagonal matrix with the same diagonal elements as $Cov ({\hat{β}}_{1})$ . Under $H_{0}$ , $\tilde{Z}$ should asymptotically follow $MVN (0, \tilde{Σ})$ , where $\tilde{Σ} = C^{- 1 / 2} Cov ({\hat{β}}_{1}) C^{- 1 / 2}$ . The overall null hypothesis can be examined by testing whether all of the Z-scores in $\tilde{Z}$ are 0. Define

\begin{aligned} SPU (γ, \tilde{Z}) = T_{γ} = \left\{ \begin{array}{ll} \sum_{j} ({\tilde{z}}^{(j)})^{γ} & (0 < γ < \infty) \\ \max_{j} | {\tilde{z}}^{(j)} | & (γ = \infty) \end{array} \right. , \end{aligned}

(24)

where each different choice of $γ$ gives us a different test statistic by summing the powered Z-scores. Under the null, the Z-scores should have mean 0, and the sum of powered Z-scores should be relatively small. If the sum is too big, we should reject the null hypothesis. We consider a set of different $γ$'s, denoted by $Γ = \{γ_{1}, γ_{2}, \dots, γ_{r}\}$, which is usually chosen as $\{1, 2, \dots, 8, \infty\}$. To obtain the p-value of $SPU (γ_{t}, \tilde{Z})$, we sample ${\tilde{Z}}_{b}$ $(b = 1, 2, \dots, B)$ from the null distribution $MVN (0, \tilde{Σ})$, and then we have

\begin{aligned} P_{SPU (γ_{t}, \tilde{Z})} = \frac{1}{B} \sum_{b = 1}^{B} I (| SPU (γ_{t}, {\tilde{Z}}_{b}) | > | SPU (γ_{t}, \tilde{Z}) |) . \end{aligned}

(25)

As a result, we can obtain a set of p-values $P_{SPU (γ_{1}, \tilde{Z})}$, …, $P_{SPU (γ_{r}, \tilde{Z})}$. The aSPU test statistic is defined as $aSPU (\tilde{Z}) = \min_{t = 1, \dots, r} (P_{SPU (γ_{t}, \tilde{Z})})$. The general idea is that as long as one of the p-values is small enough, we should reject the null hypothesis, so we only need to examine whether $aSPU (\tilde{Z})$ is too small. To obtain the p-value of $aSPU (\tilde{Z})$, we need to use the empirical distribution of the aSPU test statistic under the null. For each ${\tilde{Z}}_{b}$, the aSPU test statistic can be calculated using

\begin{aligned} P_{SPU (γ_{t}, {\tilde{Z}}_{b})} & = \frac{1}{B - 1} \sum_{b^{'} = 1, \dots, B; b^{'} \neq b} I (| SPU (γ_{t}, {\tilde{Z}}_{b^{'}}) | > | SPU (γ_{t}, {\tilde{Z}}_{b}) |), \end{aligned}

(26)

\begin{aligned} aSPU ({\tilde{Z}}_{b}) & = \min_{t} (P_{SPU (γ_{t}, {\tilde{Z}}_{b})}) . \end{aligned}

(27)

Hence, we can obtain the p-value of the aSPU test statistic by comparing the observed value $aSPU (\tilde{Z})$ to the simulated values $aSPU ({\tilde{Z}}_{b})$'s under the null, which gives us $P_{aSPU (\tilde{Z})} = \sum_{b = 1}^{B} I (aSPU ({\tilde{Z}}_{b}) < aSPU (\tilde{Z})) / B$. The literature has shown that by combining a variety of tests, the aSPU test can perform well in various scenarios, whereas conventional tests are usually not robust.^32,33
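The Monte Carlo procedure in equations (24) to (27) can be sketched with NumPy as follows. The function names, the default number of draws `B`, and the seed are assumptions for this example. The sketch follows the formulas as written in this section, so the returned p-value can be exactly 0 when the observed statistic is more extreme than every null draw.

```python
import numpy as np

def spu(z, gamma):
    """SPU statistic for one Z-score vector (q,) or a batch (B, q)."""
    z = np.atleast_2d(z)
    if np.isinf(gamma):
        return np.max(np.abs(z), axis=1)
    return np.sum(z ** gamma, axis=1)

def aspu_pvalue(z_obs, sigma, gammas=(1, 2, 3, 4, 5, 6, 7, 8, np.inf), B=1000, seed=0):
    """Monte Carlo p-value of the aSPU statistic for standardized estimates z_obs.

    sigma: (q, q) correlation matrix of z_obs under the null.
    """
    rng = np.random.default_rng(seed)
    z_obs = np.asarray(z_obs, dtype=float)
    z_null = rng.multivariate_normal(np.zeros(len(z_obs)), sigma, size=B)
    p_obs, p_null = [], []
    for g in gammas:
        t_obs = np.abs(spu(z_obs, g))[0]
        t_null = np.abs(spu(z_null, g))
        # Equation (25): share of null draws with a larger |SPU| than observed.
        p_obs.append(np.mean(t_null > t_obs))
        # Equation (26): for each null draw b, share of the other B - 1 draws
        # with a larger |SPU| (the strict inequality excludes b itself).
        larger = (t_null[None, :] > t_null[:, None]).sum(axis=1)
        p_null.append(larger / (B - 1))
    aspu_obs = min(p_obs)                              # observed aSPU statistic
    aspu_null = np.min(np.vstack(p_null), axis=0)      # equation (27)
    return float(np.mean(aspu_null < aspu_obs))

# Toy inputs (hypothetical): three uncorrelated Z-scores.
p_no_signal = aspu_pvalue([0.0, 0.0, 0.0], np.eye(3), B=500)
p_strong_signal = aspu_pvalue([5.0, 5.0, 5.0], np.eye(3), B=500)
```

The pairwise comparison in the loop uses a `B` by `B` boolean array, which is convenient for a sketch but would be replaced by a rank-based computation for large `B`.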

2.4 General procedure

In this section, we present the general procedure for applying our new approach to study the causal relationship between an exposure and multiple outcomes, as shown in Figure 2. We would like to point out that the second step is crucial for MR analysis since including SNPs that violate the IV assumptions may lead to problematic results. Nevertheless, for this manuscript, we emphasize the fifth step, which mainly compares the multivariate approach with the univariate approach for modelling the relationship between the exposure and outcomes. We would also like to note that when conducting two-sample MR, it is more common to apply various MR techniques that use summary statistics (e.g. MR-Egger). We choose to focus on the two-stage MR method that is more comparable to our new multivariate approach, which requires individual-level data. Another reason why we would like to focus on the two-stage methods is that two-stage regression allows correlated instrumental variables in the first-stage model, whereas for summary-based MR methods, the chosen instrumental variables are required to be independent. In situations where there are not many available instrumental variables that are independent, it may be better to choose a looser correlation cutoff and apply the two-stage approach. We will discuss more on the strengths and drawbacks of our approach in the Discussion section. Besides, it is worth mentioning that the two-stage methods can also be applied to the one-sample situation, where we have a single dataset containing information on G, X and Y. However, we would like to focus on the two-sample scenario, since the literature has shown that only using one sample is less preferred, as it is more likely to result in biased estimates and inflated type I errors.^15,35 Some simulation results and discussion for the one-sample scenario are provided in Appendix A of the Supplementary materials.

Figure 2.

General procedure for applying MR to multiple outcomes using two samples with individual-level data.

3 Results

3.1 Simulation studies

We conduct simulation studies to compare the performance of our new approach and the univariate two-stage MR method. Among the common variants that show some marginal association (p < 1e-4) with baseline magnesium (MG) based on the CO.20 genotype data,^8–10 we randomly select 19 of them with weak or no correlations (pairwise correlations between −0.1 and 0.1). All of these SNPs have minor allele frequencies (MAFs) greater than 0.05, and 16 of them have MAFs greater than 0.1. In the simulations, we assume that these SNPs are the true association SNPs and simulate the genotypes for two independent samples, each with a sample size of 559, using multivariate binomial distributions and maintaining the MAF of each SNP. Since the correlations among the selected SNPs are very weak, for convenience, we simulate the SNPs as independent unless otherwise stated. Our experience shows that using weakly correlated SNPs or independent SNPs does not really affect the results. Suppose there are $q = q_{1} + q_{2}$ outcomes. The first $q_{1}$ outcomes are binary, while the other $q_{2}$ outcomes are continuous.

We simulate U, X and Y using the following models:

\begin{aligned} U_{i} & = \sum_{k = 1}^{p} β_{G U, k} G_{i k} + e_{U, i}, \end{aligned}

(28)

\begin{aligned} X_{i} & = \sum_{k = 1}^{p} β_{G X, k} G_{i k} + β_{U X} U_{i} + e_{X, i}, \end{aligned}

(29)

\begin{aligned} Z_{i j} & = β_{j 0} + β_{X Z, j} X_{i} + β_{U Z, j} U_{i} + e_{Z, i j}, \end{aligned}

(30)

\begin{aligned} Y_{i j} & = \left\{ \begin{array}{ll} 1 & (\text{if } Z_{i j} \geq 0, \ j = 1, 2, \dots, q_{1}) \\ 0 & (\text{if } Z_{i j} < 0, \ j = 1, 2, \dots, q_{1}) \\ Z_{i j} & (j = q_{1} + 1, q_{1} + 2, \dots, q) \end{array} \right. , \end{aligned}

(31)

where $G_{i k}$, $U_{i}$, $X_{i}$ and $Y_{i j}$ are the k th genotype, confounder, exposure and j th outcome for subject i. $Z_{i j}$ is the j th latent continuous outcome variable. For binary outcomes, $Y_{i j}$ and $Z_{i j}$ are connected by the probit link, where $Y_{i j}$ follows a Bernoulli distribution with mean $Φ (Z_{i j})$ ($Φ$ is the cumulative distribution function of the standard normal distribution). For continuous outcomes, $Y_{i j} = Z_{i j}$. $β_{G U, k}$ and $β_{G X, k}$ are the effects of the k th SNP on U and X, respectively. $β_{X Z, j}$ and $β_{U Z, j}$ are the effects of X and U on the j th latent variable, respectively. $β_{U X}$ is the effect of U on X. $e_{U, i}$ and $e_{X, i}$ are random errors that follow i.i.d. standard normal distributions. $e_{Z, i} = (e_{Z, i 1}, \dots, e_{Z, i q})^{'}$ follows an i.i.d. multivariate normal distribution with mean 0. The standard deviation of $e_{Z, i j}$ is $σ_{j}$, and the correlation between $e_{Z, i j}$ and $e_{Z, i l}$ is $ρ_{j l}$. Denote the correlation matrix $(ρ_{j l})_{q \times q}$ by $Ω$. Also denote $σ = (σ_{1}, \dots, σ_{q})^{'}$, $β_{G U} = (β_{G U, 1}, \dots, β_{G U, p})^{'}$, $β_{G X} = (β_{G X, 1}, \dots, β_{G X, p})^{'}$, $β_{X Z} = (β_{X Z, 1}, \dots, β_{X Z, q})^{'}$, and $β_{U Z} = (β_{U Z, 1}, \dots, β_{U Z, q})^{'}$.

Following Slob and Burgess,⁴⁶ we generate $β_{G X, k}$ 's from a normal distribution with mean zero and standard deviation 0.15 first, and subsequently select those with $β_{G X, k} > 0.08$ to avoid weak IVs. Then we scale $β_{G X, k}$ 's by a constant so that the proportion of X explained by IVs is about 20%. For simplicity, we assume $β_{G U, k}$ 's are 0. $β_{U X}$ and $β_{U Z, j}$ 's follow a standard uniform distribution.

First, we consider a mixture of two binary and two continuous outcomes. We set $σ = (1, 1, 2, 3)^{'}$. To ensure that the binary outcomes have different event rates, we select $β_{0} = (- 0.4, - 0.8, 0, 0)^{'}$. For the continuous outcomes, the $β_{j 0}$'s do not influence the results. In terms of the correlation structure, we let

\begin{aligned} Ω = \left( \begin{array}{cccc} 1 & ρ_{1} & ρ_{1} & ρ_{2} \\ ρ_{1} & 1 & ρ_{1} & ρ_{2} \\ ρ_{1} & ρ_{1} & 1 & ρ_{2} \\ ρ_{2} & ρ_{2} & ρ_{2} & 1 \end{array} \right), \end{aligned}

(32)

meaning that the first three outcomes have an exchangeable correlation structure with parameter $ρ_{1}$, while the last outcome's correlation with each of the first three outcomes is $ρ_{2}$. After simulating the two-sample data, we compare univariate and multivariate two-stage analyses in terms of type I errors and power. Next, we explore how univariate and multivariate methods perform when there are more mixed outcomes. We randomly select 7 outcomes (3 binary, 4 continuous) from the CO.20 data and use their correlation structure as our $Ω$. In this case, we select $σ = (1, 1, 2, 3, 4, 5)^{'}$ and $β_{0} = (- 0.2, - 0.5, - 0.9, 0, 0, 0, 0)^{'}$.
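To make the four-outcome setting concrete, the sketch below builds the correlation matrix of equation (32), converts it to an error covariance using $σ$, and draws the latent and observed outcomes. It is an illustration under assumptions rather than the authors' code: the function names, the sample size, and the placeholder standard normal draws for $X$ and $U$ are hypothetical.

```python
import numpy as np

def make_omega(rho1, rho2):
    """Correlation matrix of equation (32) for 4 outcomes."""
    omega = np.full((4, 4), rho1)
    omega[3, :] = rho2
    omega[:, 3] = rho2
    np.fill_diagonal(omega, 1.0)
    return omega

def simulate_outcomes(X, U, beta0, beta_xz, beta_uz, sigma, omega, q1, rng):
    """Draw latent Z and observed mixed outcomes Y (first q1 columns binary)."""
    # Error covariance: diag(sigma) @ omega @ diag(sigma).
    cov = np.outer(sigma, sigma) * omega
    e_z = rng.multivariate_normal(np.zeros(len(sigma)), cov, size=len(X))
    Z = beta0 + np.outer(X, beta_xz) + np.outer(U, beta_uz) + e_z
    Y = Z.copy()
    Y[:, :q1] = (Z[:, :q1] >= 0).astype(float)  # threshold the latent variable
    return Z, Y

rng = np.random.default_rng(1)
n = 5000
X, U = rng.normal(size=n), rng.normal(size=n)  # placeholder exposure and confounder
omega = make_omega(0.3, 0.5)
sigma = np.array([1.0, 1.0, 2.0, 3.0])
beta0 = np.array([-0.4, -0.8, 0.0, 0.0])
beta_xz = np.array([0.21, 0.21, 0.0, 0.0])     # Scenario 2 of Table 1
beta_uz = rng.uniform(size=4)
Z, Y = simulate_outcomes(X, U, beta0, beta_xz, beta_uz, sigma, omega, q1=2, rng=rng)
print("binary event rates:", Y[:, :2].mean(axis=0))
```

Checking that the smallest eigenvalue of `omega` is positive is a quick way to confirm that a chosen pair $(ρ_{1}, ρ_{2})$ gives a valid correlation matrix before simulating.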

As Figure 3 shows, both univariate and multivariate analyses are able to control type I errors when testing the overall hypothesis or each single outcome. In terms of power, we examine various scenarios, for which the different parameter settings are included in Table 1. Though we mainly focus on our proposed gradient descent algorithm for this manuscript, our experience shows that the results of gradient descent and Newton's method are very consistent under most circumstances, as shown in Figure 4, for example.

Figure 3.

Type I error comparison of univariate and multivariate methods for mixed outcomes. 5000 replications. (A) 2 binary and 2 continuous outcomes. $ρ_{1} = 0.3, ρ_{2} = 0.5$ . (B) 2 binary and 2 continuous outcomes. $ρ_{1} = 0.3, ρ_{2} = - 0.5$ . (C) 3 binary and 4 continuous outcomes. $β_{X Z, 1} = β_{X Z, 2} = \dots = β_{X Z, 7} = 0$ .

Table 1.

Different Scenario Settings Used for Power Comparison.

4 outcomes (2 binary, 2 continuous)
Scenario	$β_{X Z, 1}$	$β_{X Z, 2}$	$β_{X Z, 3}$	$β_{X Z, 4}$
1	0	0.28	0	0
2	0.21	0.21	0	0
3	0.21	0	0	0.3
4	0.21	0	0.3	0.3
5	0.18	0.18	0.25	0.25

7 outcomes (3 binary, 4 continuous)
Scenario	$β_{X Z, 1}$	$β_{X Z, 2}$	$β_{X Z, 3}$	$β_{X Z, 4}$	$β_{X Z, 5}$	$β_{X Z, 6}$	$β_{X Z, 7}$
1	0	0	0	0	0.55	0	0
2	0	0	0	0.4	0.4	0	0
3	0.14	0.21	0.28	0	0	0	0
4	0.14	0.14	0	0.3	0.3	0	0
5	0.18	0.18	0.18	0.2	0.2	0.2	0.2

Figure 4.

Power comparison of the Wald tests for mixed outcomes (2 binary, 2 continuous) using gradient descent and Newton's method. 1000 replications. $ρ_{1} = 0.3, ρ_{2} = 0.5$ .

Comparing the power of univariate and multivariate analysis with regard to the overall test (minP for univariate analysis; Wald and aSPU for multivariate analysis), we find that the Wald test usually has higher power than the minP test, as shown in Figure 5. The aSPU test has the highest power when the proportion of outcomes affected by the exposure is relatively high, though it may be disadvantaged when there is only one outcome affected by the exposure. Our finding is consistent with the literature about the aSPU test.³³ The minP test is close to the SPU( $\infty$ ) test except that it does not use the correlation information, and thus it works relatively well when the signal is sparse (e.g. the exposure only has an effect on one of the outcomes). The Wald test is similar to the SPU(2) test, which performs the best when the signal is not very sparse. This may partially explain why the overall tests using multivariate analysis tend to show a larger power improvement over the minP test when the exposure affects more than one outcome.
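The relationship among these tests can be illustrated with a small sketch of the SPU family. It assumes the summary-statistic form of the test, in which a vector of z-scores is multivariate normal with a known correlation matrix under the null, SPU($γ$) is the sum of the $γ$th powers of the z-scores, and SPU($\infty$) is the largest absolute z-score. The version used in this paper is built on the multivariate model fit, so the code below is only an approximation of the idea; the function names and the example inputs are hypothetical.

```python
import numpy as np

def spu_stats(z, gammas):
    """|SPU(gamma)| = |sum_j z_j^gamma|; gamma = inf gives max_j |z_j|."""
    z = np.atleast_2d(z)
    cols = [np.max(np.abs(z), axis=1) if np.isinf(g) else np.sum(z**g, axis=1)
            for g in gammas]
    return np.abs(np.column_stack(cols))

def aspu_test(z, R, gammas=(1, 2, 4, 8, np.inf), n_sim=20000, seed=0):
    """Monte Carlo p-values for each SPU(gamma) and for the adaptive aSPU test."""
    rng = np.random.default_rng(seed)
    z0 = rng.multivariate_normal(np.zeros(len(z)), R, size=n_sim)  # null draws
    t_obs = spu_stats(z, gammas)[0]
    t_null = spu_stats(z0, gammas)

    # p-value of each SPU(gamma) test.
    p_spu = (1 + np.sum(t_null >= t_obs, axis=0)) / (n_sim + 1)

    # p-value of every null draw within the set of null draws, per gamma, so
    # that the minimum p-value is calibrated with the same simulations.
    ranks = np.argsort(np.argsort(-t_null, axis=0), axis=0)  # 0 = largest
    p_null = (ranks + 1) / n_sim
    p_aspu = (1 + np.sum(p_null.min(axis=1) <= p_spu.min())) / (n_sim + 1)
    return dict(zip(gammas, p_spu)), p_aspu

z = np.array([0.4, 2.9, -0.3, 0.1])  # hypothetical z-scores with a sparse signal
R = np.eye(4)                        # hypothetical null correlation matrix
p_by_gamma, p_adaptive = aspu_test(z, R)
print(p_by_gamma, p_adaptive)
```

Larger values of $γ$ put more weight on the strongest single signal, which is why SPU($\infty$) behaves like a minimum p-value test and SPU(2) behaves like a sum-of-squares test.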

Note that for the above scenarios, we assume that the IVs are independent, which is usually the case for Mendelian randomization studies using summary statistics. However, for the two-stage methods, we can demonstrate that IVs with moderate correlations can also be used. We explore another situation with 4 mixed outcomes (2 binary, 2 continuous) and 10 correlated IVs. We select $ρ_{1} = 0.3, ρ_{2} = 0.5$ , and assume the correlated IVs have an AR(0.4) correlation structure, meaning that the correlation between the $k_{1}$ th SNP and the $k_{2}$ th SNP is defined as ${0.4}^{| k_{1} - k_{2} |}$ . Each SNP's MAF is randomly chosen using Unif(0.3, 0.5). The other settings are the same as before. According to Figure 6, when the IVs are moderately correlated, both univariate and multivariate two-stage methods are still able to control type I errors. We also observe similar power performances to those in the previous simulation settings. Again, the Wald and aSPU tests outperform the minP test when there is more than one outcome affected by the exposure, and the aSPU test is the most advantageous when all outcomes are affected. More simulation results with modified settings are included in Appendix A of the Supplementary materials.
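The AR(0.4) structure above can be written down directly, as in the sketch below. The matrix construction follows the stated definition. The genotype generator is an assumption made only for illustration: it thresholds two correlated Gaussian haplotypes per subject, so the realized genotype correlations are attenuated relative to ${0.4}^{| k_{1} - k_{2} |}$, and the text does not say how the correlated SNPs were actually generated.

```python
from statistics import NormalDist

import numpy as np

def ar1_corr(p, rho):
    """Correlation matrix with entries rho^{|k1 - k2|}."""
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def simulate_correlated_snps(n, corr, rng):
    """Genotypes (0/1/2) from two thresholded Gaussian haplotypes per subject."""
    p = corr.shape[0]
    maf = rng.uniform(0.3, 0.5, size=p)                    # MAF ~ Unif(0.3, 0.5)
    cut = np.array([NormalDist().inv_cdf(m) for m in maf]) # P(latent < cut) = MAF
    haps = rng.multivariate_normal(np.zeros(p), corr, size=(2, n))  # (2, n, p)
    G = (haps < cut).sum(axis=0)                           # minor allele count
    return G, maf

corr = ar1_corr(10, 0.4)
G, maf = simulate_correlated_snps(5000, corr, np.random.default_rng(0))
print(np.round(corr[0, :4], 3))
```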

Figure 5.

Power comparison of univariate and multivariate methods for mixed outcomes. 1000 replications. (A) 2 binary and 2 continuous outcomes. $ρ_{1} = 0.3, ρ_{2} = 0.5$ . (B) 2 binary and 2 continuous outcomes. $ρ_{1} = 0.3, ρ_{2} = - 0.5$ . (C) 3 binary and 4 continuous outcomes.

Figure 6.

Type I error (T1E) and power comparison of univariate and multivariate methods for mixed outcomes using 10 correlated IVs. $ρ_{1} = 0.3, ρ_{2} = 0.5$ . (A) T1E, 5000 replications. (B) Power, 1000 replications.

Figure 7.

Workflow for the CO.20 study application.

3.2 Real data application

To further demonstrate the difference between univariate and multivariate analyses in practice, we look at the data from the CO.20 trial mentioned in the introduction, in which patients were randomly assigned to two treatment groups (cetuximab + BRI; cetuximab + placebo). For convenience, we denote these as Group 1 and Group 2, respectively. We choose baseline magnesium level (scaled by 10) as our exposure of interest X, since it is associated with certain genetic variants, and we examine whether this variable affects any of the following outcome variables: 3 binary toxicity variables (rash, nausea, and vomiting) and 4 continuous lab variables (bilirubin [BIL], white blood count [WBC], alanine aminotransferase [ALT], and lactate dehydrogenase [LDH]). Each binary toxicity variable is coded as 0 or 1 (has not experienced or has experienced a toxicity event within 8 weeks), and each lab variable is defined as the worst (maximum) value within 8 weeks after allocation. We log-transform BIL, ALT, and LDH and remove 3 subjects with outlying values, defined as more than 4 standard deviations away from the sample mean. The sample correlation matrix of the 7 outcomes is illustrated in Appendix B of the Supplementary materials, which shows that most outcomes are weakly or moderately correlated.
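The preprocessing of the lab variables can be sketched as below. The text does not state whether the 4-standard-deviation rule is applied before or after the log-transformation, or per variable, so this sketch assumes it is applied per variable after the transformation. The function name and the toy data are hypothetical.

```python
import numpy as np

def preprocess_labs(labs, log_columns, n_sd=4.0):
    """Log-transform selected columns, then drop rows with any value beyond n_sd SDs."""
    out = labs.astype(float)
    out[:, log_columns] = np.log(out[:, log_columns])
    z = (out - out.mean(axis=0)) / out.std(axis=0, ddof=1)
    keep = np.all(np.abs(z) <= n_sd, axis=1)  # rows with no outlying value
    return out[keep], keep

labs = np.column_stack([np.linspace(1.0, 2.0, 100)] * 2)  # hypothetical lab values
labs[0, 0] = 1e6                                           # one gross outlier
clean, keep = preprocess_labs(labs, log_columns=[0])
print(clean.shape)
```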

Using the data from Group 1, we obtain the marginal associations of SNPs with X (adjusted for age and gender) and select SNPs with marginal p-values smaller than 1E-4 and MAFs greater than 0.05. After pruning the SNPs so that all pairwise correlations lie between −0.1 and 0.1, we end up with 6 SNPs as our IVs and build a first-stage model of X vs. IVs. Next, we use the first-stage model to obtain fitted X for Group 2, after which we conduct the second-stage analysis, applying the univariate and multivariate modeling approaches to explore whether X has any effects on any of the mixed outcomes. Figure 7 shows a summary of our workflow.
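The two-stage logic of this workflow can be sketched for a single continuous outcome as follows. The sketch uses ordinary least squares in both stages and simulated data, so it illustrates only the univariate idea: covariate adjustment, binary outcomes, and the multivariate second stage are omitted, and all names, sample sizes, and effect sizes are hypothetical.

```python
import numpy as np

def add_const(M):
    """Prepend an intercept column."""
    return np.column_stack([np.ones(len(M)), M])

def ols(design, y):
    """Least-squares coefficients for y ~ design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def two_stage_estimate(G1, X1, G2, Y2):
    """Stage 1 on sample 1, stage 2 on sample 2 using the fitted exposure."""
    stage1 = ols(add_const(G1), X1)      # exposure ~ IVs
    X2_hat = add_const(G2) @ stage1      # genetically predicted exposure
    stage2 = ols(add_const(X2_hat), Y2)  # outcome ~ fitted exposure
    return stage2[1]                     # slope on the fitted exposure

# Hypothetical data: 6 independent SNPs, a confounder U, true effect 0.3.
rng = np.random.default_rng(2)

def draw(n):
    G = rng.binomial(2, 0.3, size=(n, 6)).astype(float)
    U = rng.normal(size=n)
    X = G @ np.full(6, 0.4) + U + rng.normal(size=n)
    Y = 0.3 * X + U + rng.normal(size=n)
    return G, X, Y

G1, X1, _ = draw(5000)
G2, X2, Y2 = draw(5000)
beta_mr = two_stage_estimate(G1, X1, G2, Y2)
beta_naive = ols(add_const(X2), Y2)[1]   # regression on observed X, confounded by U
print(beta_mr, beta_naive)
```

In this toy setting the naive slope absorbs the confounder's contribution, while the two-stage slope should fall near the true effect up to sampling noise.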

The p-values for testing baseline magnesium's effect on each of the outcomes based on univariate analysis are shown in Table 2. We also include the results of univariate analysis without MR, where each outcome is regressed on observed X instead of fitted X. According to the table, based on univariate MR analysis, BIL and ALT show some significance with p-values less than 0.05. Meanwhile, possibly as a result of unhandled confounders, the p-values based on univariate analysis without MR are usually quite different. This may partially demonstrate the effect and importance of considering instrumental variables for causal inference.

Table 2.
P-Values for Testing Baseline Magnesium's Effect on Each of the Outcomes Based on Two-Stage Univariate MR Analysis and Univariate Analysis Without MR. Exposure: Baseline MG.

	Rash	Nausea	Vomiting	BIL	WBC	ALT	LDH
MR	0.078	0.172	0.210	0.010*	0.289	0.033*	0.238
No MR	0.855	0.013*	0.786	0.071	0.511	0.019*	0.477

Table 3 summarizes the results of univariate and multivariate analyses. For estimating and testing a single effect, univariate and multivariate analyses provide consistent results, though they are not exactly the same. However, the overall hypothesis test results are very different. The p-value of the minP test is not significant, but the p-values of the Wald and aSPU tests are. This agrees with our simulation results, demonstrating the potential power advantage of multivariate analysis.

Table 3.

Comparison Between Univariate and Multivariate Methods for Mixed Outcomes (Rash, Nausea, Vomiting, BIL, WBC, ALT, and LDH). Exposure: Baseline MG.

	UVA			MVA (MCLE)
	Effect size	SE	P-value	Effect size	SE	P-value
Rash	−0.266	0.151	0.078	−0.267	0.154	0.083
Nausea	−0.224	0.164	0.172	−0.224	0.164	0.171
Vomiting	0.248	0.198	0.21	0.259	0.191	0.174
ALT	−0.159	0.074	0.032*	−0.159	0.074	0.032*
BIL	−0.191	0.074	0.01*	−0.191	0.074	0.01*
LDH	−0.093	0.079	0.237	−0.093	0.072	0.193
WBC	−0.446	0.42	0.289	−0.446	0.351	0.204
		minP		Wald		aSPU
Overall		0.07		0.015*		0.021*
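As a rough check on the overall minP result, the smallest univariate MR p-value in Table 2 can be adjusted for the 7 outcomes. The exact calibration of the minP test is not restated in this section, so the Bonferroni and Šidák adjustments below are assumptions chosen because neither uses the correlation among outcomes.

```python
# Univariate MR p-values from Table 2 (Rash, Nausea, Vomiting, BIL, WBC, ALT, LDH).
p_values = [0.078, 0.172, 0.210, 0.010, 0.289, 0.033, 0.238]

q = len(p_values)
p_min = min(p_values)

# Two standard adjustments of the smallest p-value that ignore correlation.
p_bonferroni = min(1.0, q * p_min)
p_sidak = 1.0 - (1.0 - p_min) ** q

print(f"Bonferroni: {p_bonferroni:.3f}, Sidak: {p_sidak:.3f}")
# prints "Bonferroni: 0.070, Sidak: 0.068"
```

Both adjustments give about 0.07, in line with the minP p-value reported in Table 3.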

Thus, the proposed MRMO model identifies statistically significant associations between genetic predisposition to lower magnesium levels and hepatotoxicities (elevated ALT levels and elevated bilirubin levels) in cetuximab-treated patients.

4 Discussion

We have presented a novel approach to conduct two-stage Mendelian randomization with multivariate analysis on mixed outcomes. As shown in our simulations, our approach can increase the power of the overall test over the univariate approach in most scenarios, especially when more than one outcome is affected by the exposure. The two overall tests based on multivariate analysis, the Wald and aSPU tests, perform differently in different scenarios. The aSPU test shows better power when the proportion of the outcomes affected by the exposure is higher, while the Wald test performs the best when the exposure has causal effects on a small to medium proportion of the outcomes. We have also noticed that carrying out the tests using our proposed gradient descent algorithm tends to yield very similar results to those based on the Newton–Raphson method, which was used by Bai et al.²⁷ Nevertheless, given the possible drawbacks of the Newton–Raphson method in some other scenarios,^29–31 we recommend using the gradient descent approach, which may take more iterations to converge but can avoid some potential computational issues.

After applying the univariate and multivariate methods to the CO.20 data, we found that the parameter estimates for single outcomes were close. However, the minP test of the overall hypothesis did not give a significant result, whereas both the Wald test and the aSPU test in our new method did. The increased statistical power offered by multivariate analysis was therefore able to identify a previously latent relationship between predisposition to hypomagnesemia and hepatotoxicity in cetuximab-treated chemo-refractory colorectal cancer patients. This is relevant clinically because hypomagnesemia has been associated with improved outcomes in cetuximab-treated patients. Because we identified that predisposition to low magnesium may lead to increased liver toxicities, new avenues of research have been opened, where predisposition to low magnesium levels may lead either to increased levels of cetuximab (pharmacokinetic association) or increased efficacy of cetuximab on metastatic cancer (pharmacodynamic association). Confounding this association is the knowledge that the elevated bilirubin may be due either to drug therapy or bulky liver metastasis. Because we also found a separate relationship between predisposition to hypomagnesemia and elevated ALT, this result may suggest, but does not prove, that the relationship is related more to drug toxicities than to liver metastases, because ALT abnormalities are more commonly seen with drug toxicity than with liver metastases.

We would like to point out that our chosen screening criteria for the instrumental variables were relatively loose due to the limited sample size of the CO.20 data, which may raise concerns about the possible violation of instrumental variable assumptions. In order to alleviate this potential problem, one approach we can consider is to combine different datasets to get a larger sample size. This may require modifying our model framework to take the different structures of different datasets and potential correlations into account, which is a future direction worth exploring. Another possible approach is to use the publicly available genome-wide association study (GWAS) results, which are usually based on sufficiently large samples, to select instrumental variables. However, it will not be feasible when there are no available GWAS results on the exposure of interest.

In the future, we may explore the possibility of developing multivariate methods that only require summary statistics, as well as methods that can handle certain invalid instrumental variables. Besides, we have demonstrated that our current approach can handle a mixture of binary and continuous outcomes. It can be directly applied to situations where all outcomes are binary, or all outcomes are continuous as well. We may also look at other types of outcomes (e.g. survival outcomes) and develop new methods that incorporate multivariate Mendelian randomization.

5 Software

R code and simulation data are available at https://github.com/yangq001/MRMO.

Table 4.

List of Abbreviations.

Abbreviation	Meaning
MR	Mendelian randomization
MRMO	(Two-stage multivariate) Mendelian randomization with mixed outcomes
BRI	Brivanib
aSPU	Adaptive sum of the powered score test
IV	Instrumental variable
SNP	Single-nucleotide polymorphism
BMI	Body mass index
MR-Egger	Mendelian randomization-Egger
MG	Magnesium
MAF	Minor allele frequency
BIL	Bilirubin
WBC	White blood count
ALT	Alanine aminotransferase
LDH	Lactate dehydrogenase
minP	Minimum–p test
GWAS	Genome-wide association study

Supplemental Material

sj-docx-1-smm-10.1177_09622802231181220 - Supplemental material for Two-stage multivariate Mendelian randomization on multiple outcomes with mixed distributions

Supplemental material, sj-docx-1-smm-10.1177_09622802231181220 for Two-stage multivariate Mendelian randomization on multiple outcomes with mixed distributions by Yangqing Deng, Dongsheng Tu, Chris J O'Callaghan, Geoffrey Liu and Wei Xu in Statistical Methods in Medical Research

Footnotes

Abbreviation list

Table 4 shows a summary of the abbreviations used in this manuscript.

Acknowledgments

The authors would like to acknowledge the clinical contributions of Lillian Siu.

Declaration of conflicting interests

The author(s) declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the Lusi Wong Fund, Princess Margaret Cancer Foundation, Alan Brown Chair in Molecular Genomics (to GL). WX was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC Grant RGPIN-2017-06672).

ORCID iDs

Yangqing Deng

Dongsheng Tu

Supplemental material

Supplemental material for this article is available online.

References

1. Ringash, Siu, et al. Quality of life in patients with K-RAS wild-type colorectal cancer. Cancer 2013; 120: 181–189.
2. Friese, Harrison, Janz, et al. Treatment-associated toxicities reported by patients with early-stage invasive breast cancer. Cancer 2017; 123: 1925–1934.
3. Kolotkin, Andersen. A systematic review of reviews: exploring the relationship between obesity, weight loss and health-related quality of life. Clin Obes 2017; 7: 273–289.
4. Patrinely, Young, Quach, et al. Survivorship in immune therapy: assessing toxicities, body composition and health-related quality of life among long-term survivors treated with antibodies to programmed death-1 receptor and its ligand. Eur J Cancer 2020; 135: 211–220.
5. Martinsen, Rasmussen LMP, Wentzel-Larsen, et al. Change in quality of life and self-esteem in a randomized controlled CBT study for anxious and sad children: can targeting anxious and depressive symptoms improve functional domains in schoolchildren? BMC Psychol 2021; 9.
6. Sarntivijai, Lin, et al. OAE: the ontology of adverse events. J Biomed Semantics 2014; 5: 29.
7. Gowen, Giles, Simpson, et al. Baseline antibody profiles predict toxicity in melanoma patients treated with immune checkpoint inhibitors. J Transl Med 2018; 16.
8. Siu, Shapiro, Jonker, et al. Phase III randomized, placebo-controlled study of cetuximab plus brivanib alaninate versus cetuximab plus placebo in patients with metastatic, chemotherapy-refractory, wild-type K-RAS colorectal carcinoma: the NCIC clinical trials group and AGITG CO.20 trial. J Clin Oncol 2013; 31: 2477–2484.
9. Ringash, Siu, et al. Quality of life in patients with K-RAS wild-type colorectal cancer. Cancer 2013; 120: 181–189.
10. Shepshelovich, Townsend, Espin-Garcia, et al. Fc-gamma receptor polymorphisms, cetuximab therapy, and overall survival in the CCTG CO.20 trial of metastatic colorectal cancer. Cancer Med 2018; 7: 5478–5487.
11. Angrist, Imbens, Rubin. Identification of causal effects using instrumental variables. J Am Stat Assoc 1996; 91: 444–455.
12. Olkin, Tate. Multivariate correlation models with mixed discrete and continuous variables. The Annals of Mathematical Statistics 1961; 32: 448–465.
13. Sammel, Lin, Ryan. Multivariate linear mixed models for multiple outcomes. Stat Med 1999; 18: 2479–2492.
14. Teixeira-Pinto, Normand SLT. Correlated bivariate continuous and binary outcomes: issues and applications. Stat Med 2009; 28: 1753–1773.
15. Burgess, Small, Thompson. A review of instrumental variable estimators for Mendelian randomization. Stat Methods Med Res 2015; 26: 2333–2355.
16. Bowden, Davey Smith, Burgess. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol 2015; 44: 512–525.
17. Bowden, Davey Smith, Haycock, et al. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol 2016; 40: 304–314.
18. Burgess, Thompson. Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am J Epidemiol 2015; 181: 251–260.
19. Burgess, Zuber, Gkatzionis, et al. Modal-based estimation via heterogeneity-penalized weighting: model averaging for consistent and efficient estimation in Mendelian randomization when a plurality of candidate instruments are valid. Int J Epidemiol 2018; 47: 1242–1254.
20. Sanderson. Multivariable Mendelian randomization and mediation. Cold Spring Harbor Perspect Med 2020; 11: a038984.
21. Sanderson, Davey Smith, Windmeijer, et al. An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings. Int J Epidemiol 2018; 48: 713–727.
22. Moran. Arguments for rejecting the sequential Bonferroni in ecological studies. Oikos 2003; 100: 403–405.
23. Cox. The analysis of multivariate binary data. Appl Stat 1972; 21: 113.
24. Cox, Wermuth. Response models for mixed binary and quantitative variables. Biometrika 1992; 79: 441–461.
25. Gueorguieva, Agresti. A correlated probit model for joint modeling of clustered binary and continuous responses. J Am Stat Assoc 2001; 96: 1102–1112.
26. Zhang, Liu, Zhao, et al. Modeling hybrid traits for comorbidity and genetic studies of alcohol and nicotine co-dependence. Ann Appl Stat 2018; 12.
27. Bai, Zhong, Gao, et al. Multivariate mixed response model with pairwise composite-likelihood method. Stats 2020; 3: 203–220.
28. Atkinson. An Introduction to Numerical Analysis. Hoboken, NJ: John Wiley & Sons, Inc, 1989.
29. Press, Flannery, Teukolsky. Numerical Recipes in C: The Art of Scientific Computing. Cambridge: Cambridge University Press, 1988.
30. Strutz. Data Fitting and Uncertainty (A practical introduction to weighted least squares and beyond). Wiesbaden: Springer Fachmedien, 2010.
31. Pascanu, Dauphin, Ganguli, et al. On the saddle point problem for non-convex optimization. ArXiv:1405.4604 2014.
32. Pan, Kim, Zhang, et al. A powerful and adaptive association test for rare variants. Genetics 2014; 197: 1081–1095.
33. Kim, Bai, Pan. An adaptive association test for multiple phenotypes with GWAS summary statistics. Genet Epidemiol 2015; 39: 651–663.
34. Terza, Basu, Rathouz. Two-stage residual inclusion estimation: addressing endogeneity in health econometric modeling. J Health Econ 2008; 27: 531–543.
35. Xue, Pan. Some statistical consideration in transcriptome-wide association studies. Genet Epidemiol 2019; 44: 221–232.
36. Lindsay. Composite likelihood methods. Contemp Math 1988; 0: 221–239.
37. Cox, Reid. A note on pseudolikelihood constructed from marginal densities. Biometrika 2004; 91: 729–737.
38. Varin, Reid, Firth. An overview of composite likelihood methods. Stat Sin 2011; 21: 5–42.
39. Gao, Song. Composite likelihood Bayesian information criteria for model selection in high-dimensional data. J Am Stat Assoc 2010; 105: 1531–1540.
40. Godambe. An optimum property of regular maximum likelihood estimation. The Annals of Mathematical Statistics 1960; 31: 1208–1211.
41. Bottou. Large-scale machine learning with stochastic gradient descent. Proceedings of COMPSTAT’ 2010; 0: 177–186.
42. Amiri, Gunduz. Machine learning at the wireless edge: distributed stochastic gradient descent over-the-air. IEEE International Symposium on Information Theory (ISIT) 2019; 0.
43. Sun, Tang. Gradient descent learning with floats. IEEE Transactions on Cybernetics 2020; 0: 1–9.
44. Barzilai, Borwein. Two-point step size gradient methods. IMA J Numer Anal 1988; 8: 141–148.
45. Fletcher. On the Barzilai-Borwein method. Optimization and Control with Applications 2001; 0: 235–256.
46. Slob EAW, Burgess. A comparison of robust Mendelian randomization methods using summary data. Genet Epidemiol 2020; 44: 313–329.
