Introduction
An explosion of medical technologies with varying levels of benefit, toxicity, and cost has complicated the evaluation of strategies to optimize care by health care providers and patients.1 Evaluating these medical technologies requires understanding their features and potential outcomes based on the best available evidence. But these evaluations also require understanding the judgments that patients make about the impact of such features and outcomes. Just as treatment response can vary across patients, so can the perceived impact of that response.
It is common to assume that clinical variation correlates with perceived treatment impact, so that patients who stand to see the greatest clinical improvements are also the ones who benefit most from those treatments.2–4 This may not be the case when clinical improvements are accompanied by tolerability issues or risks of serious adverse events. The patients who stand to see the greatest improvements in outcomes may also be the most averse to the potential problems with therapies.5–7 In such situations, the relative importance of treatment outcomes becomes crucial to optimizing care. Once the two layers—clinical outcomes and patient preferences—are decoupled, it is easy to see that both require rigorous evaluation at an individual level to provide patients with expectations about health outcomes that are realistic as well as meaningful and valuable.8
Recently, value frameworks have been developed as tools to systematically evaluate treatment strategies in an effort to optimize care. Value frameworks assess the value of therapies as a function of the value of their attributes.9 These frameworks incorporate clinical and nonclinical aspects of treatments, establishing the relevant attributes that meet the framework's objectives and a scoring rubric that tallies the importance of those attributes and ultimately produces a measure of treatment value.
The number of frameworks developed by organizations that seek to measure value in health services has grown in recent years. These organizations include the American Society of Clinical Oncology (ASCO), the National Comprehensive Cancer Network (NCCN), Memorial Sloan Kettering Cancer Center (MSKCC), the Institute for Clinical and Economic Review (ICER), and the European Society for Medical Oncology (ESMO).1,9–13 The ASCO, NCCN, and MSKCC frameworks focus exclusively on assessing the value of oncologic drugs and treatments. Despite important recent advances in the development of value frameworks, these tools still face multiple analytic challenges.14 There is no consensus about which dimensions should be considered in these frameworks or about the scoring approaches that adequately represent the value of treatments.14
While patient preferences would be expected to play a key role in a patient-centric value assessment tool, none of the three most notable frameworks focused on oncologic treatments explicitly considers patients' perspectives.13 So far, these frameworks have primarily focused on identifying relevant treatment outcomes and the relative weights those outcomes ought to have based on the supporting clinical evidence. The frameworks estimate the value of oncology treatments through clinical benefits, side effects, and improvements in patient symptoms or quality of life in the context of cost. Some, like the ASCO value framework, acknowledge that the context of treatment decisions can influence value and allow variations in the treatment scoring rubric based on disease stage (i.e., advanced disease v. adjuvant therapy). Nevertheless, all frameworks use formulaic or expert-driven scoring rubrics for specific treatment attributes that bear no relationship to the tradeoffs patients would be willing to make between these outcomes.14 To the extent that judgments in these frameworks correlate with patients' values, those values are assumed to be mostly constant across patients.
Our aim is to assess patient preferences for breast cancer treatment outcomes to improve our understanding of the relative importance of these outcomes, including whether and how that relative importance varies across breast cancer patients. This information can help refine and improve methods for assessing the comparative value of treatment options for patients with cancer and support the use of these assessments in decision making among clinicians, patients, payers, and other stakeholders.
It is important to note that our objective is not to collect information that could be used to directly expand value frameworks but rather to test the implicit assumption of homogeneity of patient values for treatment outcomes. With this in mind, our effort is only to test whether the assumption is adequate among a highly homogeneous group of respondents—in terms of disease stage and background. Failure to meet this assumption in our sample would suggest that further work is needed to obtain a valid (and representative) set of values for this population.
To accomplish our objective, we used a discrete-choice experiment (DCE) to collect quantitative evidence on patient preferences that would 1) identify the treatment attributes that patients with breast cancer consider most important to the value of cancer treatments; 2) assess whether patient preferences for aspects of breast cancer treatments are homogeneous; and 3) potentially help inform the definition of new scoring rubrics in future value frameworks.
Methods
Patients with breast cancer completed a DCE designed to assess the relative importance of breast cancer treatment attributes. DCEs have been increasingly used to inform health policies, including regulatory decisions on the benefits and risks of new medical technologies.15–17 A DCE is a survey-based method that asks respondents to choose between experimentally designed treatment options. The options are described in terms of categories (attributes), and each option takes a level under each attribute (attribute levels) that represents how the treatment would perform if taken. Table 1 lists the final attribute levels in the DCE survey.
Attributes and Attribute Levels
One can think of each DCE question as eliciting patients’ stated preference for treatments that would receive different scores under a value framework. Choices between options reveal the frequency with which patients think they would be willing to forgo specific positive and negative aspects of treatments—correlating the value framework score to a latent preference construct that signals the impact of each attribute on patient well-being. Figure 1 presents an example choice question.

Example choice question
The DCE survey instrument was developed and administered during spring 2017. The study team followed ISPOR good-practice guidance during the development of the instrument.18 Treatment characteristics under value frameworks provided the basis for the attributes considered in the study. These attributes were expanded based on a literature review and feedback from clinical experts and patients during in-person qualitative interviews. Seven stakeholders who met the eligibility criteria for the DCE survey participated in the qualitative interviews. The final attribute levels were selected to represent improvements over the current standard of care, so that every hypothetical treatment would be a feasible alternative to patients' current treatments.
After the development of the survey instrument, a D-efficient design was used to construct 36 choice questions, each including 3 hypothetical treatments. Some attributes in the design repeated attribute levels across alternatives (attribute overlap) to reduce the complexity of the choice questions.19 Each respondent was asked to answer only 12 questions, and the order of the questions presented to patients was randomized to avoid sequencing effects. Definitions for all attributes were provided to respondents before the start of the choice questions and remained available throughout the experiment (participants could hover over an attribute to trigger a pop-up display that would refresh their memory of the attribute definition). The full definitions as provided to respondents are available in Appendix A. The survey was administered online to eligible survey participants. The study protocol was reviewed and deemed exempt by the University of Southern California Institutional Review Board. (The study principal investigator was affiliated with the University of Southern California at the time the study was conducted.)
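The assignment of 12 of the 36 designed questions to each respondent, with randomized question order, can be sketched as follows. This is our own illustration under the assumption that questions were allocated in blocks; the actual allocation mechanism of the survey platform is not described in the text, and the function names are hypothetical.

```python
import random

def questions_for_respondent(design, n_blocks=3, block_size=12, seed=None):
    """Split the 36 designed choice questions into blocks of 12 and
    return one block in randomized order, so each respondent answers
    12 questions without systematic sequencing effects."""
    rng = random.Random(seed)
    blocks = [design[i * block_size:(i + 1) * block_size]
              for i in range(n_blocks)]
    block = list(rng.choice(blocks))  # copy so the master design is not mutated
    rng.shuffle(block)
    return block

# Usage: 36 question IDs, one randomized block of 12 per respondent.
design = list(range(1, 37))
qs = questions_for_respondent(design, seed=7)
```

Randomizing order within a block keeps the D-efficient design intact while spreading any learning or fatigue effects evenly across questions.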
Study Participants
Study participants were recruited through a professional survey panel managed by Survey Healthcare Universal. The recruitment effort was part of a broader study eliciting treatment preferences from patients (or caregivers of patients) with rheumatoid arthritis, pediatric asthma, Alzheimer's disease, and hypertension. Respondents who completed the DCE questions presented here were required to be female adults (i.e., 18 years of age or older) with a self-reported physician diagnosis of stage 3 or stage 4 breast cancer. In addition, the diagnosis had to have been received at least 1 month before the patient completed the survey, to avoid some of the emotional distress expected around a new diagnosis. Patients had to be currently under the care of a physician and receiving treatment.
The study exclusion criteria screened out patients who met the inclusion criteria for the other groups studied, so respondents could qualify for only one of the survey versions administered in the broader study. Thus, patients could not have a self-reported diagnosis of rheumatoid arthritis or Alzheimer's disease. Also, prior to the breast cancer diagnosis, patients could not have a history of other cancers, diseases of blood flow to the brain, chronic kidney disease, chronic obstructive pulmonary disease, coronary artery/heart disease, type 1 or type 2 diabetes, heart failure, hepatitis B, hepatitis C, HIV/AIDS, multiple sclerosis, or tuberculosis. It is important to highlight that these inclusion/exclusion criteria were not designed to obtain a representative set of preferences from the study sample, but rather to allow a test for preference heterogeneity in a sample with a relatively homogeneous clinical background.
Analysis
Choices from respondents were analyzed using logit-based regression models, following good-practice guidance for the analysis of such data.20 These models relate the choices made by each respondent to the tradeoffs implicitly accepted with each choice, given the experimental design used to construct the DCE questions. The resulting estimates are interpreted as preference weights for each attribute level. While the absolute value of a preference weight is meaningless, higher preference weights indicate greater intensity of preferences.20 Differences in preference weights within an attribute indicate the importance of changing that attribute between the levels contrasted.
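The core mechanics of a logit-based choice model can be sketched as follows: each alternative's utility is the sum of the preference weights for its attribute levels, and the logit formula converts utilities into choice probabilities. The utility values below are purely illustrative, not estimates from the study.

```python
import math

def choice_probabilities(utilities):
    """Multinomial logit: the probability of choosing alternative j is
    exp(V_j) divided by the sum of exp(V_k) over all alternatives
    in the choice question."""
    exps = [math.exp(v) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative utilities for the 3 alternatives in one choice question,
# each the sum of hypothetical preference weights for that alternative's
# attribute levels (e.g., survival, cost, route of administration).
V = [1.2, 0.4, -0.3]
probs = choice_probabilities(V)
```

Estimation runs this logic in reverse: the betas are chosen so that the predicted probabilities best match the choices respondents actually made.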
A random-parameters logit (RPL) model was used to estimate population-level preference weights (with their 95% confidence intervals) and the standard deviation of preferences based on individual-level choice patterns.21 The final model specification used dummy-coded variables for each attribute level in the experimental design—the omitted level in the model was set equal to zero—and looked as follows:

V = β1·LIFE + β2·FUNC + β3·SIDE + β4·REQ + β5·PCOST + β6·ICOST + β7·UNC

where V is the well-being (utility) associated with a specific treatment alternative, and the variables LIFE, FUNC, SIDE, REQ, PCOST, ICOST, and UNC represent the attribute levels presented in Table 1. The betas are estimated by the model to best fit the choice patterns observed for each respondent. All betas in the model specification were assumed to be normally distributed across respondents. Interaction terms were estimated between the UNC attribute and the efficacy attributes (i.e., LIFE and FUNC), but none were found to be statistically significant at the 95% confidence level.
A latent class (LC) logit model was also used to leverage the repeated choices recorded from each respondent to systematically identify subgroups sharing similar choice patterns. The LC logit used an expectation-maximization algorithm through which class-specific preference weights are estimated and individuals' class-membership probabilities are iteratively determined.22 The number of classes was determined based on model fit (i.e., AIC and BIC) and parsimony.20 Class assignment was probabilistic and was used to calculate individual-level preference weights following Greene and Hensher.23 The individual-level preference weights represent an average of the class-specific preference estimates, weighted by the probability that each individual belongs to each identified class. The individual class probabilities were determined using a Bayesian procedure based on the specific choice patterns exhibited by the respondent. Patient characteristics were also correlated with individuals' class-membership probabilities to identify statistically significant predictors of class membership.
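The two steps described above—updating class-membership probabilities from a respondent's observed choices and averaging class estimates into individual-level weights—can be sketched as follows. The class shares, likelihoods, and betas are hypothetical numbers for illustration only.

```python
def posterior_class_probs(class_shares, choice_likelihoods):
    """Bayesian update: posterior probability that a respondent belongs to
    each latent class, given the population class shares (priors) and the
    likelihood of her observed choice sequence under each class's weights."""
    joint = [s * l for s, l in zip(class_shares, choice_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

def individual_weight(posteriors, class_betas):
    """Individual-level preference weight: the posterior-weighted average
    of the class-specific estimates (following Greene and Hensher)."""
    return sum(p * b for p, b in zip(posteriors, class_betas))

# Hypothetical two-class example: population shares of 60%/40%, the
# likelihood of one respondent's 12 choices under each class, and
# class-specific weights for a single attribute level.
post = posterior_class_probs([0.6, 0.4], [0.02, 0.10])
beta_i = individual_weight(post, [1.5, 0.5])
```

A respondent whose choices are much more likely under Class 2's weights ends up with a posterior tilted toward Class 2, and her individual weight lands closer to the Class 2 estimate.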
The relative importance of changes in attributes can be used to understand how much one attribute needs to change to offset a prespecified change in another attribute (attribute equivalence). We used attribute equivalence to infer the out-of-pocket treatment cost that would, in patients' judgment, exactly offset the benefits of a treatment (monetized treatment value [MTV] measures). The MTV measures for specific treatment benefits were calculated for each individual based on the individual-level preference estimates obtained through the LC logit model.
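The attribute-equivalence calculation can be sketched as follows. Note that this sketch simplifies by using a per-dollar cost slope, whereas the study dummy-coded the cost attribute; the numbers are hypothetical and chosen only to illustrate the arithmetic.

```python
def monetized_treatment_value(delta_utility, cost_slope):
    """Out-of-pocket cost increase that exactly offsets a utility gain:
    MTV = (utility change from the benefit) / |marginal disutility of cost|."""
    return delta_utility / abs(cost_slope)

# Hypothetical individual-level estimates: a survival improvement worth
# 2.0 utility units and a cost disutility of 0.00025 per dollar.
mtv = monetized_treatment_value(2.0, -0.00025)
```

Because the calculation uses individual-level preference weights, each respondent gets her own MTV, which is what allows the distribution of values across respondents to be examined.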
Results
Potentially eligible participants were invited to complete the DCE and 100 did so. Most respondents were white (97%), at least 45 years old (72%), and reported having at least $75,000 in income (58%). Also, nearly 30% reported having Medicare. Finally, most (84%) respondents reported having distant metastases. Table 2 summarizes the characteristics of these respondents.
Respondent Characteristics (N = 100)
Percentages may not sum to 100 because of rounding or nonresponse.
Also includes those who graduated from technical or trade school.
Also includes those who earned a two-year associate’s degree.
Not mutually exclusive categories.
Includes employer-sponsored insurance or insurance purchased through exchanges or the private market.
Includes other forms of government insurance such as Tricare.
Figure 2 shows the estimated preference weights for each attribute level in the DCE and the 95% confidence interval. The full set of parameters from the RPL model with their standard deviations are presented in Appendix B. The absolute value of the estimates is meaningless, but higher preference weights indicate greater preference for treatments with an attribute level, ceteris paribus. As expected, better clinical outcomes are associated with higher preference weights. Life extensions were considered to be most important (i.e., greatest change in the preference weights given the levels in the attribute). Other important attributes were out-of-pocket cost of treatment, treatment route of administration, and the availability of reliable tests to help gauge treatment efficacy.

Population-level preference weights
In the latent-class analysis, a two-class model was found to be superior to a one-class model of preferences. Although a three-class model was also supported by the data, the two-class model was judged more appropriate for the purpose of the study given the limited sample size and the interpretability of its results. Figure 3 shows the preference weights associated with each of the two classes identified in the latent-class analysis. The model parameter estimates are presented in Appendix C. To facilitate comparisons across classes, we set the overall importance of life extension (i.e., the importance of the maximum extension in survival offered in the experiment) to lie between 0 and 10 for both sets of class-specific parameters. All other parameter estimates in each class were adjusted accordingly to preserve the relative importance of attributes. Differences in class-specific preferences were primarily associated with route of administration, out-of-pocket treatment cost, and the availability of a test to gauge treatment efficacy: Class 1 preferences show out-of-pocket treatment cost as the most important attribute, whereas Class 2 preferences show treatment efficacy as most important. When correlating individual characteristics with class-membership probabilities, only cancer stage was found to be correlated with class assignment (P = 0.035), with late-stage patients (stage 4) being more likely to be represented by the preferences in Class 2.
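The rescaling used to compare class results can be sketched as follows. The attribute labels and weight values below are hypothetical; the point is only that dividing by the survival-attribute spread and multiplying by 10 puts life extension on a 0-10 scale while leaving relative importance untouched.

```python
def rescale_class_weights(weights, life_levels, target=10.0):
    """Rescale one class's preference weights so the overall importance of
    life extension (the spread between its best and worst levels) equals
    `target`, preserving relative importance across all attributes."""
    life_span = (max(weights[k] for k in life_levels)
                 - min(weights[k] for k in life_levels))
    factor = target / life_span
    return {k: v * factor for k, v in weights.items()}

# Hypothetical class weights: the survival levels span 0 to 2.5, so every
# weight is multiplied by 4 to put life extension on the 0-10 scale.
w = {"LIFE_3mo": 0.0, "LIFE_24mo": 2.5, "PCOST_low": 1.8}
scaled = rescale_class_weights(w, ["LIFE_3mo", "LIFE_24mo"])
```

Because every weight in a class is multiplied by the same factor, ratios between attribute importances within the class are preserved exactly.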

Preference results from the two-class latent-class analysis
Individual-level MTV estimates were obtained for three treatment benefits: 1) increasing expected survival from 3 to 24 months (Figure 4), 2) changing the treatment route of administration from injections administered 12 days per month to 1-hour infusions every 3 weeks (Figure 5), and 3) having a test that can help gauge treatment efficacy (Figure 6). These figures summarize the proportion of respondents who are expected to have specific MTV values for each treatment improvement. For example, Figure 4 shows that more than 15% of respondents had an MTV below $8,000. Results from this analysis show that while the MTVs for improvements in the route of administration and for the availability of a test to help gauge treatment efficacy are likely bimodal, the MTV for survival benefits is less likely to be so, as individual MTV values appear to be more evenly distributed within the estimated range for the measure.

Proportion of respondents with specific values for improving expected survival from 3 to 24 months. We used individual-specific preference weights to calculate the out-of-pocket cost that would completely offset the treatment improvement for each respondent. Individual-specific preference weights were based on the results from the latent-class model.

Proportion of respondents with specific values for improvements in the route of administration (from injections 12 days per month to 1-hour infusions every 3 weeks). We used individual-specific preference weights to calculate the out-of-pocket cost that would completely offset the treatment improvement for each respondent. Individual-specific preference weights were based on the results from the latent-class model.

Proportion of respondents with specific values for a test that can help gauge treatment efficacy. We used individual-specific preference weights to calculate the out-of-pocket cost that would completely offset the benefit of having a test to gauge treatment efficacy for each respondent. Individual-specific preference weights were based on the results from the latent-class model.
Discussion
We find that the scoring rubrics for some of the value frameworks are consistent with some treatment preferences elicited from patients with breast cancer. First, the relatively large importance of treatment efficacy seems to be aligned with the views of the patients we surveyed. Second, the use of disease stage as the basis for variations in treatment values, as incorporated in the ASCO value framework, appears to be appropriate based on our results. This is, however, where our results start deviating from the frameworks’ scoring rubrics. Although correlated with disease stage, out-of-pocket treatment cost and route of administration are not included in the ASCO or NCCN value frameworks. These are particularly important among some respondents (mostly those without distant metastases), suggesting that early in the disease path patients tend to be more concerned about treatment burden in terms of frequency and duration, as well as financial toxicity.
A two-class latent-class model is consistent with the current grouping strategy in the ASCO value framework where only two patient types are considered. It is also true that the types of patients we identified in each class are similar to those considered in the ASCO framework. However, our results suggest that it is likely not appropriate to use a generic scoring rubric for patients with breast cancer, even after adjusting for disease stage. Although we find that preferences for breast cancer treatments vary systematically with disease stage, a bimodal distribution of preferences for treatment efficacy does not seem supported by our results. This is particularly relevant because treatment efficacy is the component of ASCO’s current scoring rubric with the largest weight. Hence, failing to capture the relative importance of this attribute appropriately can have the greatest impact on the assessment of treatment value.
Notably, our results show that some highly important aspects of treatment value for patients may still be missing from most value frameworks. These include the availability of reliable tests to help gauge treatment efficacy, suggesting that certainty around expectations of treatment efficacy, not just improvements in that expectation, is of great value to patients.
It is also worth noting that toxicity, at least as specified in our experiment, does not seem to be nearly as important as some of the value frameworks imply. A 10% increase in major side effects, for example, would induce a substantial score decrease under the ASCO or NCCN scoring rubrics, but equivalent changes framed as health gains from toxicity-free days or reductions in severe side effects appeared to barely draw respondents' attention in our application. These results may be more aligned with the framework proposed by MSKCC, where drug-value discounts can be quite limited even in the presence of treatment toxicity.
DCEs rely on stated choices between hypothetical treatments, which do not carry the same consequences as real-world decisions. We attempted to reduce the hypothetical nature of the questions by closely mimicking the real-world decision context in the choice questions and by including all the attribute information expected to be available to patients.
The number of attributes in the survey instrument was beyond the usual number found in DCEs.24 This could have increased the burden on respondents. As mentioned above, we attempted to minimize this issue by allowing some attributes to show the same level across the alternatives in a question (attribute overlap). This reduces the number of changes respondents are asked to consider across treatment options and, with it, the overall burden of the questions.19
We also rely on patient self-report of diagnosis and disease stage. There is some evidence that self-reported information from breast cancer patients is adequate for diagnosis and for determining the extent of axillary node involvement.25,26 Unfortunately, it is not possible to corroborate the self-reported information in this study with the data at hand. We attempted to minimize false self-reporting of breast cancer by asking respondents to select their condition from a list of medical problems, which avoided signaling to participants which disease they needed to report to be able to complete the survey. Moreover, we excluded respondents who selected all or most health problems in the provided list. Furthermore, recent evidence on the use of online consumer panels with self-reported physician diagnosis suggests that the approach is reliable, to the extent that estimated preferences do not seem to differ from those in samples with physician confirmation of diagnosis.27
Finally, our sample was relatively small and homogeneous, so it is not possible to establish the generalizability of the study results. In particular, our results for women with nonmetastatic disease were based on a small number of patients (n = 16), which makes it difficult to generalize findings from this group. While a larger sample could have included respondents with more diverse backgrounds, uncovering preference heterogeneity and identifying important attributes outside the current oncologic value frameworks does not require a representative sample. In that sense, our results fully support the concerns raised about current framework scoring rubrics.
Conclusions
Value frameworks are an important step in the systematic evaluation of medications in the context of a complex treatment landscape. However, the frameworks are still largely driven by expert judgment and fail to incorporate patients' perceptions of the impact of treatments in a transparent and rigorous way. This can lead to over- or underestimation of the value of benefits for specific patients. If patient preferences are heterogeneous or differ from the weights provided by the frameworks, the use of such tools may not be adequate or meaningful for a number of patients.
This study shows that DCEs may offer a way to inform a patient-centric scoring rubric for value frameworks. If, as expected, a revised version of the ASCO value framework operates in six different clinical scenarios to guide decision making,28 the approach followed in this study could prove helpful in accomplishing this objective in a patient-centric way. A similar study could be used to determine an appropriate number of clinical scenarios based on patient input. It could also be used to determine how the scoring rubric should change between scenarios, and even whether the tool can be simplified for some patients, since attributes that are unimportant under a given scenario need not be included.
Supplemental Material
Appendix_A_-_Details_attribute_definitions_online_supp – detailed attribute definitions
Appendix_B_-_Mixed_logit_results_online_supp – mixed logit results
Appendix_C_-_LC_logit_results_online_supp – latent class logit results
Supplemental material for "Do Patient Preferences Align With Value Frameworks? A Discrete-Choice Experiment of Patients With Breast Cancer" by Ilene L. Hollin, Juan Marcos González, Lisabeth Buelt, Michael Ciarametaro, and Robert W. Dubois in MDM Policy & Practice.
Footnotes
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: The following authors are employed by the sponsor: Lisabeth Buelt, Michael Ciarametaro, and Robert W. Dubois. Ilene L. Hollin was employed by the sponsor at the time the study was conducted.
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Financial support for this study was provided entirely by a grant from the National Pharmaceutical Council. The funding agreement ensured the authors’ independence in designing the study, interpreting the data, writing, and publishing the report.
References
