Abstract

Background

Parameter uncertainty in EQ-5D-5L value sets often exceeds the instrument’s minimum important difference, yet this is routinely ignored. Multiple imputation (MI) accounts for parameter uncertainty in the value set; however, no valuation study has implemented this methodology. Our objective was to create a Canadian MI value set for the EQ-5D-5L, thus enabling users to account for parameter uncertainty in the value set.

Methods

Using the Canadian EQ-5D-5L valuation study (N = 1,073), we first refit the original model followed by models with state-level misspecification. Models were compared based on the adequacy of 95% credible interval (CrI) coverage for out-of-sample predictions. Using the best-fitting model, we took 100 draws from the posterior distribution to create 100 imputed value sets. We examined how much the standard error of the estimated mean health utilities increased after accounting for parameter uncertainty in the value set by using the MI and original value sets to score 2 data sets: 1) a sample of 1,208 individuals from the Canadian general public and 2) a sample of 401 women with breast cancer.

Results

The selected model with state-level misspecification outperformed the original model (95% CrI coverage: 94.2% v. 11.6%). We observed wider standard errors for the estimated mean utilities on using the MI value set for both the Canadian general public (MI: 0.0091; original: 0.0035) and patients with breast cancer (MI: 0.0169; original: 0.0066).

Discussion and Conclusions

We provide 1) the first MI value sets for the EQ-5D-5L and 2) code to construct MI value sets while accounting for state-level model misspecification. Our study suggests that ignoring parameter uncertainty in value sets leads to falsely narrow SEs.

Highlights

Value sets for health state utility instruments are estimated subject to parameter uncertainty; this parameter uncertainty may exceed the minimum important difference of the instrument, yet it is not fully captured using current methods.

This study creates the first multiply imputed value set for a multiattribute utility instrument, the EQ-5D-5L, to fully capture this parameter uncertainty.

We apply the multiply imputed value set to 2 data sets from 1) the Canadian general public and 2) women with invasive breast cancer.

Scoring the EQ-5D-5L using a multiply imputed value set led to wider standard error estimates, suggesting that the current practice of ignoring parameter uncertainty in the value set leads to falsely low standard errors.

Our work will be of interest to methodologists and developers of the EQ-5D-5L and users of the EQ-5D-5L, such as health economists, researchers, and policy makers.

Keywords

EQ-5D-5L; value set; Canadian; parameter uncertainty; multiple imputation

Value sets enable the scoring of preference-based instruments to calculate quality-adjusted life-years (QALYs). Decisions regarding the public reimbursement of drugs rely heavily on the incremental cost-effectiveness ratio of an economic evaluation, the additional cost of a new intervention relative to QALYs gained. QALYs are health utilities multiplied by quantity of life, a key measure of effectiveness in economic evaluations. Multiattribute utility instruments (MAUIs) are preference-based instruments that measure health-related quality of life, anchored at 0 (dead) and 1 (full health). MAUIs consist of attributes and levels within each attribute.

Responses to MAUIs are scored using a value set, which assigns a utility to each health state captured by the instrument. According to welfarist economics, utilities used in economic evaluations should reflect the preferences of the general public.¹ Utilities are elicited using direct approaches such as time tradeoff (TTO), standard gamble (SG), or discrete choice experiments (DCE).¹ While value sets are vital, they are also estimated, rather than known.

Conventionally, the uncertainty in estimated value sets is ignored, yet health utilities are input parameters in cost utility analyses. Parameter uncertainty in value sets includes uncertainty in the functional form of the model and estimation of regression coefficients.² Pullenayegum et al.² found that uncertainty in the functional form explained 84% of the mean squared prediction error. When both sources were accounted for in the US EQ-5D-3L valuation study, the mean 95% credible interval (CrI) width for the mean utilities in the value set was 0.152, exceeding the minimum important difference (MID) for the EQ-5D-3L (range: 0.03 to 0.08).^3–5 Kharroubi et al.⁶ found that the uncertainty around SF-6D health states is comparably large, with posterior standard deviations (s) of ∼0.03, which would lead to 95% CrI widths of about 0.12. While this magnitude of uncertainty is known for the EQ-5D-3L and SF-6D, parameter uncertainty is not known for other preference instruments. When it comes to the severity of health states, mild health states have a smaller magnitude of uncertainty; however, this uncertainty is still nonnegligibly large relative to their disutility. This uncertainty in health utilities is not conventionally accounted for, and therefore, 95% CrIs that do not account for the true extent of uncertainty are falsely narrow. Ignoring the uncertainty in health utilities results in incremental cost-effectiveness ratios (ICERs) that use health utilities being reported with a false level of precision and consequently misrepresentation of the level of confidence provided by these economic analyses.

Since the true value of the mean health utility for each health state is unknown, it can be conceptualized as missing data. Chan et al.⁷ proposed multiple imputation to account for uncertainty in the value set. When using multiple imputation to handle missing data, one takes multiple draws of the missing data from its distribution given the observed data to create multiple imputed data sets. Each imputed data set is analyzed to obtain multiple estimates of the parameter of interest. This allows us to quantify uncertainty in the estimated parameter due to not knowing the true values of the missing data: when computing the standard error (SE) of an estimated parameter, one considers both the estimated SE from analyzing each imputed data set as well as the variance of the estimates across imputations (known as Rubin’s rule⁸).

The same principle applies to value sets when we consider the true value of the value set as missing data. We take multiple draws of the value set from its posterior distribution given the valuation data; we thus have multiple value sets. Now suppose a sample of respondents in a randomized controlled trial has filled out the MAUI and we wish to score their responses to estimate the incremental mean QALYs (ΔQALY). The researcher would score the MAUI responses using each imputed value set and estimate ΔQALY using each imputed value set. The uncertainty in ΔQALY due to both sampling variation in MAUI responses and due to parameter uncertainty in the value set is then quantified using both the estimated SEs in ΔQALY on using each value set and the variance in estimates of ΔQALY across value sets.

While this methodology has been shown to account for uncertainty in the value set,⁷ the methodology has never been implemented in practice. No valuation study has published multiple draws of the value set from its posterior distribution. The EQ-5D-5L is the most widely used MAUI.^9–13 The EQ-5D-5L has 5 dimensions of health (i.e., mobility, self-care, usual activities, pain/discomfort, anxiety/depression),^14,15 each with 5 levels of severity (i.e., no, slight, moderate, severe, and extreme problems),¹⁵ representing 3,125 (5⁵) possible health states. The Canadian EQ-5D-5L value set used individual-level TTO utilities to predict population mean utilities for each health state as a function of health state attributes. The overall objective of this study was to create a Canadian multiply imputed value set for the EQ-5D-5L that can be used to account for parameter uncertainty. Toward this objective, we compared 3 models: 1) the original model, 2) a main effects model with state-level misspecification terms, and 3) a main effects model with spatially correlated misspecification terms. We used the best performing model to create 100 imputed value sets. We then illustrated how these imputed value sets can be used to account for uncertainty in 2 studies: in the first, we estimated mean utility experienced by the Canadian general population, and in the second, we estimated mean health utility among women with breast cancer from the Breast Utility Instrument study.^16,17

Methods

Canadian EQ-5D-5L Data

We used EQ-VT TTO responses collected from members of the general public from the Canadian valuation study (N = 1,073).¹⁸ A previous multicountry pilot study determined that a sample size of 1,000 per country and valuing 86 health states would allow estimating 50 parameters with acceptable precision.¹⁹ In this study, participants were quota sampled to represent the Canadian general population from Ontario, British Columbia, Alberta, and Quebec.¹⁸ Participants first completed the EQ-5D-5L to indicate their own health state. Next, participants valued a block of 10 health states using composite TTO (cTTO) tasks presented in random order. The 86 health states valued were grouped into 10 blocks, with 10 health states per block.¹⁸ These 86 health states included 5 very mild health states with only 1 dimension at level 2 (e.g., 21111, 12111) and the worst state (55555). The remaining 80 health states were selected to cover a range of severity.¹⁸ Each participant also completed traditional TTO (tTTO) for 2 severe health states from among the 86 health states.

Models Used to Estimate Mean Health State Disutilities

First, we refit the original model, followed by fitting models with model misspecification terms. As in the original study, due to concerns about using cTTO to elicit utilities for health states worse than dead, cTTO values were censored at zero while the tTTO values were left uncensored.¹⁸

Notation

Let T_ik be participant i’s disutility (i.e., 1 − utility) for the kth health state that they value (thus k = 1, 2, . . ., 12, since each respondent valued 10 health states using cTTO and 2 health states using tTTO). Let j_(ik) be the index of the health state corresponding to the kth health state valued by individual i, with j_(ik) taking values in 1, . . ., 86. Let ν_i be the random intercept for the ith respondent, ε_ik a residual, X_j a row vector consisting of 12 health state descriptors, and β a vector of regression coefficients corresponding to the elements of X_j. As in the original valuation study,¹⁸ for j = 1, . . ., 86, we took X_j = (1, MO_j, SC_j, UA_j, PD_j, AD_j, MO45_j, SC45_j, UA45_j, PD45_j, AD45_j, Num45_j²), where MO_j, SC_j, UA_j, PD_j, and AD_j denote the level (from 1 to 5) of health state j for each of the 5 dimensions of the EQ-5D-5L (mobility, self-care, usual activities, pain/discomfort, anxiety/depression); MO45_j, SC45_j, UA45_j, PD45_j, and AD45_j are dummy variables (0 or 1) indicating whether the corresponding dimension is at level 4 or 5; and Num45_j is the number of dimensions at level 4 or 5 beyond the first, taking values of 0, 1, 2, 3, or 4.
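The descriptor vector defined above can be illustrated with a short sketch. The function name `design_row` is hypothetical, and the reading of "beyond the first" as max(count − 1, 0) is an assumption based on the stated range of 0 to 4; the sketch is not the authors' code.

```python
from typing import List


def design_row(state: str) -> List[int]:
    """Build the 12-element descriptor vector X_j for a state such as '21515'.

    Order: intercept, 5 dimension levels (MO, SC, UA, PD, AD),
    5 level-4-or-5 dummies, and Num45 squared.
    """
    if len(state) != 5 or any(ch not in "12345" for ch in state):
        raise ValueError("state must be 5 digits, each between 1 and 5")
    levels = [int(ch) for ch in state]
    # Dummy is 1 when the dimension is at level 4 or 5.
    any45 = [1 if level >= 4 else 0 for level in levels]
    # Assumed reading: dimensions at level 4/5 beyond the first one.
    num45 = max(sum(any45) - 1, 0)
    return [1] + levels + any45 + [num45 ** 2]


if __name__ == "__main__":
    for s in ("11111", "21515", "55555"):
        print(s, design_row(s))
```

Under this reading, the worst state 55555 has Num45 = 4, so its final descriptor is 16.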

Original Model

The original scoring model for EQ-5D-5L responses developed by Xie et al.¹⁸ is expressed as

T_{ik} = 1 - (X_{j (ik)} β + ν_{i} + ε_{ik}) = 1 - (μ_{j (ik)} + ν_{i} + ε_{ik})

(1)

where

μ_{j} = X_{j} β

ν_{i} ∼ N(0, σ_{b}^{2})

ε_{ik} ∼ N(0, σ_{ε}^{2})

where T_ik, j_(ik), X_j, β, ν_i, and ε_ik are as previously defined (see the “Notation” section) and μ_j is the mean health utility for health state j. The set of μ_j forms the value set; for the purposes of fitting the model, we consider only the 86 directly valued health states, that is, j = 1, . . ., 86. The β’s are the parameters of interest. We shall refer to this model as the original Xie model.¹⁸

Adding Misspecification

The original Xie model assumes that μ_j = X_j β, that is, that the mean utility is a perfect linear function of the health state descriptors. This strong and likely unrealistic assumption² can be relaxed by adding a misspecification term²:

\begin{matrix} μ_{j} = X_{j} β + δ_{j} \\ δ_{j} \overset{iid}{\sim} N(0, σ_{d}^{2}) \end{matrix}

(2)

where δ_j is the deviation between the modeled mean (X_j β) and the true mean disutility of health state j.² The original Xie model is a special case of this model where σ_d = 0; that is, it assumes no model misspecification. In the frequentist context, δ_j can be thought of as a random effect. In the Bayesian context we adopt here, δ_j is an unknown parameter. We refer to this model as the Chan model.⁷

Main Effects Model with Correlated Model Misspecification

Shams and Pullenayegum²⁰ extended equation 2 above by assuming that the δ_j are spatially correlated with a Gaussian correlation structure. These authors found that 1) states that are directly valued can be estimated with less uncertainty than states that were not valued, and 2) states that are closer together have more similar utilities than states further apart.²⁰ Therefore, modeling correlated misspecification terms (“spatial correlation”) allows information from directly valued health states to be borrowed by states that are not directly valued. Overall, models with spatial correlation have been found to estimate mean health state utilities more precisely than models with an independent correlation structure²⁰ and have been shown to improve predictive precision.²¹ Despite these advantages, no Canadian EQ-5D-5L model with spatial correlation has been tested or applied to date.

Mathematically, the model with correlated misspecification can be specified as

\begin{matrix} δ \sim MVN(0, Σ) \\ Σ_{mn} = σ_{d}^{2} \exp(- θ d_{mn}^{2}), \end{matrix}

(3)

where δ is the vector of all δ_j and Σ_mn is the element in the mth row and nth column of the variance-covariance matrix Σ, with m, n = 1, 2, . . ., 86. In this model, θ is the range parameter, governing how quickly the correlation decreases as distance increases, and d_mn is the Euclidean distance between health states m and n, defined by

\begin{matrix} d_{mn}^{2} = (MO_{m} - MO_{n})^{2} + (SC_{m} - SC_{n})^{2} \\ + (UA_{m} - UA_{n})^{2} + (PD_{m} - PD_{n})^{2} + (AD_{m} - AD_{n})^{2}. \end{matrix}

We refer to this model as the Shams model.²⁰
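The covariance structure in equation 3 can be sketched as follows. The function names and the example values of σ_d and θ are illustrative assumptions, not fitted estimates from the study.

```python
import math
from typing import List, Sequence


def squared_distance(state_m: str, state_n: str) -> int:
    """Squared Euclidean distance between two 5-digit EQ-5D-5L states."""
    if len(state_m) != 5 or len(state_n) != 5:
        raise ValueError("states must have 5 digits")
    return sum((int(a) - int(b)) ** 2 for a, b in zip(state_m, state_n))


def gaussian_covariance(states: Sequence[str], sigma_d: float,
                        theta: float) -> List[List[float]]:
    """Sigma_mn = sigma_d^2 * exp(-theta * d_mn^2) for every pair of states."""
    return [
        [sigma_d ** 2 * math.exp(-theta * squared_distance(m, n))
         for n in states]
        for m in states
    ]


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    cov = gaussian_covariance(["21515", "21415", "11111"],
                              sigma_d=0.05, theta=0.5)
    for row in cov:
        print([round(v, 6) for v in row])
```

Adjacent states such as 21515 and 21415 have a squared distance of 1, so their misspecification terms are strongly correlated, whereas distant states are nearly independent.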

Model Fitting

We fit the original model (Xie model)¹⁸ and misspecification models without spatial correlation (“Chan model”) and with spatial correlation (“Shams model”) under the Bayesian paradigm. We assigned vague priors, as follows: σ_ε ∼ uniform(0, 1), σ_d ∼ uniform(0, 1), and σ_b ∼ uniform(0, 1). In the Chan model, the priors for the 1st to 11th β were N(0, 1) and for the 12th β we used N(0, 0.1), where the second parameter in the normal distribution is the precision (1/variance). In the Shams model, the priors for the 1st through 12th β were N(0, 0.1), and the prior for θ was uniform(0.01, 1.25). Model convergence was assessed with the Geweke test²²; models were run until test statistic values were less than 2. All models were fitted using R v4.2.1, with packages rjags²³ and survival²⁴ and data visualization using ggplot2.²⁵ We used the model-generated values for health state 11111, as per the original model.¹⁸
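Because the normal priors above are parameterized by precision rather than variance, it can help to convert them to standard deviations. The helper name below is hypothetical; the conversion itself (SD = 1/√precision) follows directly from precision = 1/variance.

```python
import math


def precision_to_sd(precision: float) -> float:
    """Convert a normal precision (1/variance) to a standard deviation."""
    if precision <= 0:
        raise ValueError("precision must be positive")
    return 1.0 / math.sqrt(precision)


if __name__ == "__main__":
    for tau in (1.0, 0.1):
        print(f"precision {tau}: SD = {precision_to_sd(tau):.3f}")
```

A precision of 0.1 therefore corresponds to a prior variance of 10, which is wide relative to utilities on a scale anchored at 0 and 1.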

Model Diagnostics

The 3 fitted models were evaluated based on 3 criteria: 1) posterior predictive assessment of model fit,²⁶ 2) adequacy of 95% CrIs, and 3) width of 95% CrI for out-of-sample states. We describe these criteria in more detail below.

1. Posterior predictive assessment of model fit.

Gelman et al.²⁶ outlined an approach to assessing model fit via posterior predictive distributions, which we operationalized in our analysis as follows. We sampled 1,000 replicate data sets from the posterior predictive distribution. Specifically, we saved 1,000 samples of μ_j, σ_b, and σ_ε from the Gibbs sampler, and for each sample, we used these values to simulate hypothetical respondent-level data T_ik for 1,073 respondents. These were used to compute state-specific sample mean utilities by fitting, for each state, a Tobit model (with censoring at 0 for cTTO responses and no censoring for tTTO responses) using the survival package in R; we denote the sample mean utility for state j and sample s by $μ_{j}^{rep, s}$.

We also computed the sample mean utilities for the original data, which we denote $μ_{j}^{obs}$ .

For each replicate data set s, we formed the statistics

D_{s}^{rep} = \sum_{j = 1}^{86} (μ_{j}^{rep, s} - X_{j} β^{s})^{2}

and

D_{s}^{obs} = \sum_{j = 1}^{86} (μ_{j}^{obs} - X_{j} β^{s})^{2}

We counted the proportion of times that $D_{s}^{rep}$ exceeded $D_{s}^{obs}$, for s = 1, . . ., 1,000; this identifies how far in the tails of the posterior predictive distribution $D_{s}^{obs}$ lies. Goodness-of-fit exists on a scale, and thus there are no cutoffs for whether a tail probability is or is not acceptable; however, the closer a tail probability is to 0 or 1, the more we might question the fit of the model.
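The discrepancy statistics and the tail probability described above can be sketched in a few lines. The function names are hypothetical, and the inputs are assumed to be plain lists of state-level means and fitted values (X_j β^s) that have already been computed elsewhere.

```python
from typing import Sequence


def discrepancy(state_means: Sequence[float],
                fitted_means: Sequence[float]) -> float:
    """Sum over states of (state mean - X_j beta)^2 for one posterior sample."""
    if len(state_means) != len(fitted_means):
        raise ValueError("inputs must have one entry per health state")
    return sum((m - f) ** 2 for m, f in zip(state_means, fitted_means))


def tail_probability(d_rep: Sequence[float], d_obs: Sequence[float]) -> float:
    """Proportion of posterior samples s for which D_rep[s] exceeds D_obs[s]."""
    if len(d_rep) != len(d_obs) or not d_rep:
        raise ValueError("need matching, non-empty sequences")
    return sum(r > o for r, o in zip(d_rep, d_obs)) / len(d_rep)


if __name__ == "__main__":
    # Toy numbers only; the study used 1,000 samples and 86 states.
    print(tail_probability([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0]))
```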

2. Adequacy of 95% CrIs derived from posterior predictive distributions.

For all 3 models, we iteratively omitted 1 health state (state j, j = 1, . . ., 86), refit the model, and obtained the predicted mean utility $\hat{μ}_{(j)}$ for the omitted health state and its corresponding SE ($se_{(j)}$). We also obtained the observed mean utility ($μ_{j}^{obs}$) for the omitted health state and its SE ($se_{j}^{obs}$). The interval ($\hat{μ}_{(j)} - μ_{j}^{obs} - 1.96 \sqrt{se_{(j)}^{2} + (se_{j}^{obs})^{2}}$, $\hat{μ}_{(j)} - μ_{j}^{obs} + 1.96 \sqrt{se_{(j)}^{2} + (se_{j}^{obs})^{2}}$) should then contain zero 95% of the time; failure to do so indicates either biased estimates of the predicted mean utilities or corresponding SEs of the predictions that are too small.

With 86 health states, we had 86 such intervals. We computed the proportion of these intervals that contained zero. If the model is correct, we would expect this proportion to lie between 0.91 and 0.99 with probability 0.95 (these numbers were obtained by inverting the Clopper-Pearson exact binomial test for the probability of a proportion being equal to 0.95).
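As a minimal sketch of the coverage calculation, the interval check for one omitted state and the proportion across states could look like this. The function names and the toy inputs are assumptions for illustration; refitting the model for each omitted state is not shown.

```python
import math
from typing import Sequence, Tuple


def covers_zero(mu_pred: float, se_pred: float,
                mu_obs: float, se_obs: float, z: float = 1.96) -> bool:
    """True if the 95% interval for (predicted - observed) contains zero."""
    diff = mu_pred - mu_obs
    half_width = z * math.sqrt(se_pred ** 2 + se_obs ** 2)
    return diff - half_width <= 0.0 <= diff + half_width


def coverage(results: Sequence[Tuple[float, float, float, float]]) -> float:
    """Proportion of (mu_pred, se_pred, mu_obs, se_obs) tuples covering zero."""
    if not results:
        raise ValueError("need at least one omitted state")
    return sum(covers_zero(*r) for r in results) / len(results)


if __name__ == "__main__":
    toy = [
        (0.50, 0.020, 0.52, 0.010),   # small gap, wide interval
        (0.50, 0.005, 0.60, 0.005),   # large gap, narrow interval
    ]
    print(coverage(toy))
```

An SE that is too small shrinks the half-width, so even modest prediction errors fall outside the interval; this is the mechanism behind undercoverage when misspecification is ignored.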

3. Width of 95% CrIs for out-of-sample states.

The 95% CrI of the posterior predictive distribution of the mean utility for each omitted health state is $(\hat{μ}_{(j)} - 1.96\, se_{(j)}, \hat{μ}_{(j)} + 1.96\, se_{(j)})$. We computed the width of this interval for each health state (i.e., 3.92 se_(j)) and found its median.

We used the first 2 criteria to assess the adequacy of each model. Among models that were adequate, we used the third criterion to identify the model that gave the most precise predictions and that would therefore be preferred.

Creating a Multiply Imputed Value Set

Using these criteria, we selected a preferred model and took draws from its posterior distribution to create a multiply imputed value set for the EQ-5D-5L in Canada. Specifically, using the preferred model, we ran an additional 10,000 iterations and took draws at lags of 100 to create 100 imputed value sets, with each value set comprising a draw from the posterior distribution of the utilities for each of 3,125 health states.
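The thinning step can be illustrated as follows. This is only a sketch: the choice of which iteration to keep within each lag of 100 is an assumption, and the draws here are placeholder integers rather than posterior samples of utilities.

```python
from itertools import product
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def thin_chain(draws: Sequence[T], lag: int = 100) -> List[T]:
    """Keep every lag-th draw (the last draw of each block of `lag`)."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return list(draws[lag - 1::lag])


# All 5^5 = 3,125 EQ-5D-5L health states, from '11111' to '55555'.
ALL_STATES = ["".join(str(level) for level in levels)
              for levels in product(range(1, 6), repeat=5)]

if __name__ == "__main__":
    iterations = list(range(1, 10_001))   # stand-in for 10,000 MCMC draws
    kept = thin_chain(iterations, lag=100)
    print(len(kept), "value sets of", len(ALL_STATES), "states each")
```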

Applying the Multiply Imputed Value Set to 2 Data Sets

With the multiply imputed value set, we calculated pooled means and SE in the Canadian general public and in a sample of women with invasive breast cancer. We describe these 2 populations below.

Before completing their valuation tasks, participants in the Canadian EQ-5D-5L valuation study indicated their own health state using the EQ-5D-5L. There were 1,208 valid EQ-5D-5L responses.¹⁸ The target of inference in this data set is the mean utility of the adult Canadian general public.

The Breast Utility Instrument (BUI) study included a convenience sample of 401 women 18 y and older with invasive breast cancer.^16,17 As part of the larger study, participants completed the EQ-5D-5L. The target of inference in this study is the mean health utility among women with invasive breast cancer.

For the purposes of comparison, we began by ignoring parameter uncertainty in the value set and estimated the mean utility and its associated SE using the originally published value set.

We then used our imputed value sets. Specifically, for each data set and for each value set k, k = 1, . . ., 100, we computed the sample mean utility U_k and its associated SE S_k. The estimate of the population’s mean utility is then the mean of the U_k’s, that is, $\bar{U} = \frac{1}{100} \sum_{k = 1}^{100} U_{k} .$

Its associated SE is $\sqrt{T}$ , where T is given by Rubin’s rules^7,27:

Within imputation variance (W):

\frac{1}{100} \sum_{k = 1}^{100} S_{k}^{2}

Between imputation variance (B):

\frac{1}{100 - 1} \sum_{k = 1}^{100} (U_{k} - \bar{U})^{2}

Total variance (T):

W + (1 + \frac{1}{100}) B
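The pooling formulas above translate directly into code. The sketch below is written for any number of imputations m (the study used m = 100); the function name `pool_rubin` is hypothetical and this is not the authors' R package.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               std_errors: Sequence[float]) -> Tuple[float, float]:
    """Pool per-imputation means U_k and SEs S_k; return (pooled mean, pooled SE)."""
    m = len(estimates)
    if m < 2 or len(std_errors) != m:
        raise ValueError("need at least 2 imputations and matching lengths")
    u_bar = mean(estimates)
    within = mean([se ** 2 for se in std_errors])   # W
    between = variance(estimates)                   # B, denominator m - 1
    total = within + (1 + 1 / m) * between          # T
    return u_bar, math.sqrt(total)


if __name__ == "__main__":
    # Toy inputs for 3 imputations, not study data.
    pooled_mean, pooled_se = pool_rubin([0.80, 0.82, 0.84], [0.01, 0.01, 0.01])
    print(round(pooled_mean, 4), round(pooled_se, 4))
```

In the toy example each per-imputation SE is 0.01, yet the pooled SE is larger because the estimates themselves vary across imputed value sets.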

Research Ethics Board Approval

Research ethics approval was obtained from all recruitment sites for the Canadian EQ-5D-5L valuation study. Use of the BUI data was approved by the Sunnybrook Research Institute Research Ethics Board. All participants provided informed consent.

Results

The Shams model performed better than the other 2 models based on our model assessment criteria. None of the posterior predictive tail probabilities suggested that the assumed models were a poor fit to the data (Table 1). The Shams model achieved 94% coverage of the 95% CrIs, while the original model had 12% coverage. As seen in Figure 1, the original model has narrower CrIs and undercoverage compared with the Shams model. The Chan model had 90% coverage of the 95% CrIs (Table 1). In addition to achieving adequate CrI coverage, the median 95% CrI width for the Shams model was smaller than for the Chan model (0.0886 v. 0.0891; Table 1).

Table 1

Posterior Predictive Tail Probability, Percentage Coverage, and Credible Interval Width of the 3 Compared Models

Model Criterion	No Misspecification (Xie)	Independent Misspecification (Chan)	Correlated Misspecification (Shams)
Posterior predictive tail probability	0.80	0.70	0.53
95% CrI coverage on cross-validation	11.6%	89.5%	94.2%
Median 95% CrI width	0.0453	0.0891	0.0886

Figure 1

Coverage confidence intervals.

We therefore used the Shams model to create 100 imputed value sets. For both the Canadian general public and patients with breast cancer, the multiply imputed value set yielded a larger SE than the originally published value set did (Table 2): a 2.60-fold increase in the SE for the Canadian general public data set and a 2.56-fold increase for the breast cancer data set.

Table 2

Pooled Mean and Standard Errors

	Original Value Set	Multiply Imputed Value Sets
a) 1,208 respondents from the Canadian general public data
Mean	0.8640	0.8659
SE	0.0035	0.0091
b) 401 patients with breast cancer
Mean	0.8174	0.8157
SE	0.0066	0.0169

A table of the parameters of each of the models is found in Appendix 6.
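The fold increases reported for Table 2 can be checked from the tabulated SEs. The second function is a rough illustration only: it assumes that the within-imputation variance W is approximately the squared SE from the original value set, which the article does not report, so the resulting share is indicative rather than exact.

```python
def se_ratio(se_mi: float, se_original: float) -> float:
    """Fold increase in SE when using the multiply imputed value sets."""
    return se_mi / se_original


def approx_between_share(se_mi: float, se_original: float) -> float:
    """Approximate share of total variance not explained by W.

    Assumption: W is close to se_original squared, so the share is 1 - W / T.
    """
    return 1.0 - (se_original / se_mi) ** 2


if __name__ == "__main__":
    for label, se_mi, se_orig in (("general public", 0.0091, 0.0035),
                                  ("breast cancer", 0.0169, 0.0066)):
        print(label,
              round(se_ratio(se_mi, se_orig), 2),
              round(approx_between_share(se_mi, se_orig), 2))
```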

Discussion

Value sets are estimated subject to parameter uncertainty, which is typically ignored. We created the first multiply imputed value set for the EQ-5D-5L, allowing users of the instrument in Canada to account for parameter uncertainty in the value set. When applied to 2 data sets collecting the EQ-5D-5L, using multiple imputation led to larger SE estimates, illustrating that the current practice of ignoring parameter uncertainty in the value set underestimates SEs.

We demonstrated that incorporating spatially correlated model misspecification terms correctly captures parameter uncertainty, whereas omitting model misspecification terms results in poor CrI coverage. This explains why our findings are in contrast to the previous work of Gray et al.,²⁸ which suggested that parameter uncertainty in value sets was not important; this previous work assumed that there was no model misspecification.²⁸ Our study found that without the misspecification terms, SEs are indeed much smaller; however, this results in 95% CrIs with inadequate coverage (12% instead of 95%). Coverage becomes 94% when misspecification terms are added, and the 95% CrI width increases from a median of 0.0453 to 0.0886. Consequently, we suggest that researchers investigating parameter uncertainty in their value sets should assess the adequacy of their credible/confidence interval coverage on out-of-sample health states and also consider model misspecification terms if this is found to be inadequate.

As noted in previous work,^3–5 the width of the 95% CrIs for state-specific mean utilities in the value set exceeds the MID. The EQ-5D-5L valuation protocol includes direct valuation of 86 of 3,125 (2.8%) states captured by the instrument.¹⁸ Shams and Pullenayegum²⁰ found that health states that were directly valued could be estimated with a higher degree of precision than health states that were not. Future EQ-5D-5L valuation studies and emerging EQ-5D-Y (youth) instruments could consider valuing a larger proportion of health states as another strategy to reduce the SEs in estimated health state utility values.

There are several limitations to this study. The Canadian EQ-5D-5L valuation study was in the first wave of valuation studies conducted by the EuroQol Group. The functional form of the EQ-5D-5L value set may affect the predictive performance of the model, depending on the method of omitting health states. The Canadian value set has 11 parameters, which differs from both the common functional form and the newer cross-attribute-level effects (CALE) model. The common functional form is a 21-parameter model,²⁹ and the CALE model has 8 parameters.³⁰ Despite these different functional forms, Che et al.²¹ found that the mean absolute errors (MAEs) for the published model were similar among the 7 countries studied, ranging from 0.031 for Japan to 0.070 for the Netherlands, with Canada having an MAE of 0.040. Che et al. also found that modeling the spatial correlation reduced the MAE in all 7 countries studied. We thus believe that despite the different functional form used in Canada, we would find a similar need to account for uncertainty in the value set in other countries.

There were concerns about the validity of the methodology for negative cTTO data, and this was one reason why the original Canadian EQ-5D-5L valuation study censored cTTO values at 0,¹⁸ relying on tTTO values to estimate the distribution below zero. We applied the same modeling decision in our analysis. Valuation studies in other countries either did not censor at all or chose to censor at −1.³¹ Decisions to censor data can affect the parameter uncertainty of the scoring model; this question is beyond the scope of this study and warrants further investigation.

Logic dictates that for any 2 health states where 1 dominates the other, the utility of the dominated health state should be lower than that of the dominating health state. This is particularly relevant for adjacent health states, which differ in just 1 dimension by just 1 level. Given that incorporating state-level uncertainty results in wider 95% CrIs for mean health state utilities, it is reasonable to wonder whether the posterior distribution of incremental mean utilities for some pairs of adjacent health states includes logically inconsistent values. Of 756,250 dominant pairs, there were 2 pairs of health states for which the posterior means were logically inconsistent: 21515 and 21415, with posterior means of 0.4455 and 0.4443 respectively, and 22515 and 22415, with posterior means of 0.4089 and 0.4082, respectively. Since these logical inconsistencies occur in the third decimal place, we do not consider them to be practically important. Differences between utilities of adjacent health states are used to establish MIDs and minimal clinically important differences (MCIDs); given that these have larger uncertainties than previously believed, reevaluation of reported MIDs/MCIDs is warranted.
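The count of dominant pairs and the consistency check can be reproduced with a short sketch. The function names are hypothetical; the two posterior means in the usage example are the ones quoted above for states 21515 and 21415.

```python
from itertools import product
from typing import Dict


def dominates(better: str, worse: str) -> bool:
    """True if `better` is no worse on every dimension and the states differ."""
    return better != worse and all(b <= w for b, w in zip(better, worse))


def count_dominant_pairs(n_dims: int = 5, n_levels: int = 5) -> int:
    """Count ordered pairs (better, worse) without enumerating every pair."""
    total = 0
    for state in product(range(1, n_levels + 1), repeat=n_dims):
        worse_or_equal = 1
        for level in state:
            # Levels at or above this one on the same dimension.
            worse_or_equal *= n_levels - level + 1
        total += worse_or_equal - 1   # exclude the state itself
    return total


def is_inconsistent(values: Dict[str, float], better: str, worse: str) -> bool:
    """True if the dominated state has the higher utility."""
    return dominates(better, worse) and values[worse] > values[better]


if __name__ == "__main__":
    print(count_dominant_pairs())
    means = {"21515": 0.4455, "21415": 0.4443}
    print(is_inconsistent(means, better="21415", worse="21515"))
```

The count agrees with the closed form 15⁵ − 5⁵, since each dimension contributes 15 ordered level pairs with the first level no worse than the second, and the 3,125 identical pairs are excluded.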

Our methodology is suitable for value sets generated with TTO (cTTO or tTTO) or SG utilities, but the models would need further development to be used with DCE data. Given that many of the EQ-5D-5L and EQ-5D-3L-Y value sets have used a hybrid model, extension of the modeling approach to joint estimation using cTTO and DCE data is an important area for future research. Furthermore, with increasing interest in DCE with duration as a standalone approach to valuation,^32,33 methods that account for state-level model misspecification using DCE data alone are needed.

Although we have focused exclusively on parameter uncertainty in the value set, other sources of uncertainty exist, for example, uncertainty due to poor data quality, uncertainty due to valuation task (TTO, SG, and DCE typically give different answers), uncertainty around the correct population (e.g., general public v. patient v. experience-based value sets), or modeling a large number of health states worse than dead.³⁴ We argue that all of these aspects of uncertainty are important to consider. In some cases, these other sources can be addressed, for example, the EuroQol Group introduced a quality control program³⁵ that substantially reduced protocol violations and interviewer effects. We argue that data quality and parameter uncertainty should be considered to ensure methodological rigor of future valuation studies; the 2 are not mutually exclusive.

Future research could use our imputed value sets to assess the impact of accounting for parameter uncertainty in the value set on cost-utility analyses that use the EQ-5D-5L to elicit utilities. Of particular interest would be to assess whether parameter uncertainty in the value set could have implications on the sensitivity analyses of economic models that inform the approval of new drugs and reevaluation of existing drugs. By capturing the full range of uncertainty of health state utilities with our multiply imputed value set, sensitivity analyses will test whether the ICER is affected by these wider ranges. If the ICER is affected, quantifying the expected value of partial perfect information³⁶ on the value set vis-à-vis other parameters in the economic model would be helpful in determining whether investment in more precise value sets is worthwhile.

While we created an imputed value set for the EQ-5D-5L in Canada, imputed value sets are still required for other countries and other instruments. We have provided our code in Appendix 3 to assist other researchers in creating these.

To facilitate the use of our imputed value sets, we have provided an R package to score the EQ-5D-5L using each imputed value set. This function allows users to fit their data to a model of their choice to derive health utilities from the EQ-5D-5L and to pool the results using Rubin’s rules. We provide the code and an example of how to install and use the package in the Appendix. This function is only slightly more complex to use than the eq5d function that is currently available to score the EQ-5D (3L or 5L).³⁷ This code can be implemented with value sets that have different functional forms and allow for heteroscedasticity. Future research could include broader knowledge translation efforts and the development of user-friendly interfaces to facilitate the use of our imputed value sets beyond users of R.

In summary, we have shown that the current practice of ignoring parameter uncertainty in EQ-5D-5L value sets leads to an underestimation of uncertainty in analyses that use the EQ-5D-5L to elicit utilities. Our analysis showed that correctly quantifying the extent of uncertainty in the Canadian EQ-5D-5L value set requires explicit incorporation of state-level model misspecification terms when producing a multiply imputed value set. In creating the first multiply imputed value set, we have made it possible for users of the EQ-5D-5L in Canada to account for parameter uncertainty in the value set. Our contribution to the literature provides a practical example and code to enable researchers and decision makers to better account for this uncertainty.

We thus recommend that 1) users of the EQ-5D-5L in Canada use multiple imputation to account for parameter uncertainty in the value set and 2) multiply imputed value sets be developed for other countries and for other MAUIs.

Supplemental Material

Supplemental material, sj-docx-1-mdm-10.1177_0272989X241241328 for Creating a Multiply Imputed Value Set for the EQ-5D-5L in Canada: State-Level Misspecification Terms Are Needed to Characterize Parameter Uncertainty Correctly by Teresa C. O. Tsui, Kelvin K. W. Chan, Feng Xie and Eleanor M. Pullenayegum in Medical Decision Making

Footnotes

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Financial support for this study was provided in part by the Mitacs Accelerate Postdoctoral Fellowship, the Canadian Statistical Science Institute (Ontario) Top-up Award for Postdoctoral Fellows in Data Science, and the Canadian Centre for Applied Research in Cancer Control (ARCC). The funding agreement ensured the authors’ independence in designing the study, interpreting the data, writing, and publishing the report.

References

1. Drummond, Sculpher, Claxton, Stoddart, Torrance GW. Methods for the Economic Evaluation of Health Care Programmes. 4th ed. Oxford (UK): Oxford University Press; 2015.

2. Pullenayegum, Chan KKW, Xie. Quantifying parameter uncertainty in EQ-5D-3L value sets and its impact on studies that use the EQ-5D-3L to measure health utility: a Bayesian approach. Med Decis Making. 2016;36:223–33.

3. Doctor, Zoellner, Feeny NC. Minimal clinically important differences for the EQ-5D and QWB-SA in Post-traumatic Stress Disorder (PTSD): results from a Doubly Randomized Preference Trial (DRPT). Health Qual Life Outcomes. 2013;11:59. DOI: 10.1186/1477-7525-11-59

4. Walters, Brazier JE. Comparison of the minimally important difference for two health state utility measures: EQ-5D and SF-6D. Qual Life Res. 2005;14:1523–32.

5. Pickard, Neary, Cella. Estimation of minimally important differences in EQ-5D utility and VAS scores in cancer. Health Qual Life Outcomes. 2007;5:70. DOI: 10.1186/1477-7525-5-70

6. Kharroubi, O’Hagan, Brazier JE. Estimating utilities from individual health preference data: a nonparametric Bayesian method. J R Stat Soc Ser C Appl Stat. 2005;54:879–95.

7. Chan, Xie, Willan, Pullenayegum EM. Underestimation of variance of predicted health utilities derived from multiattribute utility instruments. Med Decis Making. 2017;37:262–72. DOI: 10.1177/0272989X16650181

8. Rubin DB. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley & Sons; 1987.

9. Howell, Anker, Walker, Dorth, Kharofa JR. Analysis of patient-reported outcome utilization within national clinical trials network cooperative group radiation oncology trials over the past 2 decades. Int J Radiat Oncol Biol Phys. 2021;109(5):1151–60. DOI: 10.1016/j.ijrobp.2020.12.007

10. Lidgren, Wilking, Jönsson, Rehnberg. Health related quality of life in different states of breast cancer. Qual Life Res. 2007;16:1073–81. DOI: 10.1007/s11136-007-9202-8

11. Lorgelly, Doble, Rowen, Brazier; Cancer 2015 Investigators. Condition-specific or generic preference-based measures in oncology? A comparison of the EORTC-8D and the EQ-5D-3L. Qual Life Res. 2017;26:1163–76. DOI: 10.1007/s11136-016-1443-y

12. Pickard, De Leon, Kohlmann, Cella, Rosenbloom. Psychometric comparison of the standard EQ-5D to a 5 level version in cancer patients. Med Care. 2007;45:259–63. DOI: 10.1097/01.mlr.0000254515.63841.81

13. Pickard, Wilke, Lin, Lloyd. Health utilities using the EQ-5D in studies of cancer. Pharmacoeconomics. 2007;25:365–84.

14. Dolan, Gudex, Kind, Williams. A Social Tariff for EuroQol: Results from a UK General Population Survey. York (UK): University of York; 1995.

15. Herdman, Gudex, Lloyd, et al. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Qual Life Res. 2011;20:1727–36. DOI: 10.1007/s11136-011-9903-x

16. Tsui TCO, Trudeau, Mitsakakis, et al. Developing the breast utility instrument, a preference-based instrument to measure health-related quality of life in women with breast cancer: confirmatory factor analysis of the EORTC QLQ-C30 and BR45 to establish dimensions. PLoS One. 2022;17:e0262635. DOI: 10.1371/journal.pone.0262635

17. Tsui TCO, Trudeau, Mitsakakis, Krahn, Davis. Developing the breast utility instrument to measure health-related quality-of-life preferences in patients with breast cancer: selecting the item for each dimension. MDM Policy Pract. 2022;7:23814683221142267. DOI: 10.1177/23814683221142267

18. Xie, Pullenayegum, Gaebel, et al. A time trade-off-derived value set of the EQ-5D-5L for Canada. Med Care. 2016;54:98–105.

19. Oppe, Devlin, van Hout, Krabbe, de Charro. A program of methodological research to arrive at the new international EQ-5D-5L valuation protocol. Value Health. 2014;17:445–53. DOI: 10.1016/j.jval.2014.04.002

20. Shams, Pullenayegum. Reducing uncertainty in EQ-5D value sets: the role of spatial correlation. Med Decis Making. 2019;39:91–99. DOI: 10.1177/0272989X18821368

21. Che, Xie, Thomas, Pullenayegum. Bayesian models with spatial correlation improve the precision of EQ-5D-5L value sets. Med Decis Making. 2023;43(5):587–94. DOI: 10.1177/0272989X231173699

22. Geweke. Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In: Bernardo, Berger, Dawid, Smith, eds. Bayesian Statistics. New York: Oxford University Press; 1992. p 169–93.

23. Plummer, Stukalov, Denwood. ‘rjags’: Bayesian graphical models using MCMC. R package 4-13 ed. 2022. Available from: https://cran.r-project.org/web/packages/rjags/rjags.pdf

24. Therneau. ‘survival’: survival analysis. 3.5-3 ed. 2023. Available from: https://cran.r-project.org/web/packages/survival/survival.pdf

25. Wickham. ggplot2: elegant graphics for data analysis. New York: Springer-Verlag; 2016. Available from: https://ggplot2.tidyverse.org

26. Gelman, Meng X-L, Stern. Posterior predictive assessment of model fitness via realized discrepancies. Stat Sin. 1996;6:733–807.

27. Marshall, Altman, Holder, Royston. Combining estimates of interest in prognostic modelling studies after multiple imputation: current practice and guidelines. BMC Med Res Methodol. 2009;9:57. DOI: 10.1186/1471-2288-9-57

28. Gray, Rivero Arias, Leal, Dakin, Ramos Goñi. How important is parameter uncertainty around the UK EQ-5D-3L value set when estimating treatment effects? Joint Meeting of the Health Economists’ Study Group and the Collège des économistes de la santé Aix-en-Provence. Aix-en-Provence (France): Collège des Economistes de la Santé, 2012.

29. Feng, Devlin, Shah, Mulhern, van Hout. New methods for modelling EQ-5D-5L value sets: an application to English data. Health Econ. 2018;27:23–38. DOI: 10.1002/hec.3560

30. Yang, Rand, Busschbach, Luo. Cross-attribute level effects models for modeling modified 5-level version of EQ-5D health state values: is less still more? Value Health. 2023;26(6):865–72. DOI: 10.1016/j.jval.2022.12.012

31. Versteegh, Vermeulen, Evers SMAA, de Wit, Prenger, Stolk. Dutch tariff for the five-level version of EQ-5D. Value Health. 2016;19:343–52. DOI: 10.1016/j.jval.2016.01.003

32. Devlin, Pan, Kreimeier, et al. Valuing EQ-5D-Y: the current state of play. Health Qual Life Outcomes. 2022;20:105. DOI: 10.1186/s12955-022-01998-8

33. Norman, Mulhern, Lancsar, et al. The use of a discrete choice experiment including both duration and dead for the development of an EQ-5D-5L value set for Australia. Pharmacoeconomics. 2023;41:427–38. DOI: 10.1007/s40273-023-01243-0

34. Devlin, Shah, Feng, Mulhern, van Hout. Valuing health-related quality of life: an EQ-5D-5L value set for England. Health Econ. 2018;27:7–22. DOI: 10.1002/hec.3564

35. Ramos-Goni, Oppe, Slaap, Busschbach, Stolk. Quality control process for EQ-5D-5L valuation studies. Value Health. 2017;20:466–73. DOI: 10.1016/j.jval.2016.10.012

36. Coyle, Oakley. Estimating the expected value of partial perfect information: a review of methods. Eur J Health Econ. 2008;9:251–9. DOI: 10.1007/s10198-007-0069-y

37. Morton, Nijjar. eq5d: methods for analysing ‘EQ-5D’ data and calculating ‘EQ-5D’ index scores. R package 0.11.0 ed. 2022. Available from: https://CRAN.R-project.org/package=eq5d
