Abstract
Background
In economic evaluations of novel therapies, assessing lifetime effects based on trial data often necessitates survival extrapolation, with the choice of model affecting outcomes. The aim of this study was to assess accuracy and variability between alternative approaches to survival extrapolation.
Methods
Data on HER2-positive breast cancer patients from the Swedish National Breast Cancer Register were used to fit standard parametric distribution (SPD) models and excess hazard (EH) models adjusting the survival projections based on general population mortality (GPM). Models were fitted using 6-y data for stage I and II, 4-y data for stage III, and 2-y data for stage IV cancer reflecting an early data cutoff while maintaining sufficient events for comparison of model estimates with actual long-term outcomes. We compared model projections of 15-y survival and restricted mean survival time (RMST) to 15-y registry data and explored the variability between models in extrapolations of long-term survival.
Results
A total of 11,224 patients were included. Compared with the observed registry 15-y RMST estimates across the disease stages, EH cure models provided the most accurate estimates in patients with stage I to III cancer, whereas EH models without cure most closely matched survival in patients with stage IV cancer, for which a cure assumption was less plausible. The Akaike information criterion–averaged model projections varied as follows: −8.2% to +5.3% for SPD models, −4.9% to +5.2% for the EH model without a cure assumption, and −19.3% to −0.2% for the EH model with a cure assumption. EH models significantly reduced between-model variance in the predicted RMSTs over a 50-y time horizon compared with SPD models.
Conclusions
EH models may be considered as alternatives to SPD models to produce more accurate and plausible survival extrapolation that accounts for general population mortality.
Highlights
Excess hazard (EH) methods have been suggested as an approach to incorporate background mortality rates in economic evaluation using survival extrapolation.
We highlight that EH models with or without a cure assumption can produce more accurate survival projections and significantly reduce between-model variability in comparison with standard parametric distribution models across cancer stages.
EH models may be a preferred modeling method to reduce model uncertainty in health economic modeling since models that would otherwise have produced implausible extrapolations are constrained by the EH framework.
Reduced uncertainty in economic evaluations will enhance the application of evidence-based health care decision making.
Introduction
Health economic evaluation is crucial for informed decision making around the implementation of new health technologies, especially in areas of rapid innovation, such as oncological pharmacotherapy. Cost-effectiveness of a novel therapy is assessed to optimize resource allocation and promote transparency in pricing and reimbursement in health care policy.1,2 A cost-effectiveness analysis often relies on randomized controlled trials (RCTs) as the primary data source, but assessing lifetime effects of the novel therapy is challenging due to a limited follow-up duration. Immature data from RCTs necessitate survival extrapolation beyond the trial period, and the choice of extrapolation method can strongly affect the outcomes. In recent years, there has been a growing discussion with regard to uncertainty associated with different methodologies used for survival extrapolation in cost-effectiveness analyses of oncology medicines, including in breast cancer.3–7 Population cancer registries provide a rich source of long-term mortality follow-up, which can be used to help investigate the accuracy of extrapolations.
The National Institute for Health and Care Excellence (NICE) recently updated its recommendations to incorporate alternative modeling methods that are aimed at reducing uncertainty in survival extrapolation.8 The standard parametric distribution (SPD) models may yield inconsistent results when the distributions are derived solely from the trial data. During the trial period, the disease-related risk predominantly drives mortality in the study population, whereas mortality from other causes (driven by age-related risk factors) becomes more prominent during the extrapolation period. As the extrapolation is projected further, the non–disease-related risk is expected to rise and eventually account for a substantial proportion or even the entirety of the all-cause mortality rate in the study population. When an extrapolation derived solely from the trial data indicates a lower risk of mortality for the study population than for a matched general population, the projection is deemed implausible considering the long-term follow-up of oncology patients in clinical practice.
General population mortality (GPM) adjustment ensures that all-cause mortality risk in the study population is at least equal to, if not greater than, the expected mortality risk in the general population. This method is commonly referred to as excess hazard modeling, also known as the relative survival framework,8,9 and its ability to provide more reliable long-term all-cause survival estimates has been recently demonstrated.10,11 Postestimation truncation of hazard rates to prevent SPD models from falling below background mortality is an alternative approach that is sometimes applied by researchers once models have been fitted.12 This approach often relies on background rates based on the average age of the population in an ad hoc manner, whereas excess hazard modeling uses background mortality rates that are individually matched to trial participants by age, sex, and calendar year. Thus, excess hazard modeling is considered a more principled and preferred method, as discussed extensively in van Oostrum et al.13
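As a rough illustration of the postestimation truncation described above, the sketch below floors a fitted parametric hazard at a background hazard. Everything in it is an assumption made for illustration: the Weibull parameters, the Gompertz-style stand-in for a life table (`background_hazard`), and the function names do not come from the study.

```python
import math


def weibull_hazard(t, shape, scale):
    """Weibull hazard at time t > 0 (years since diagnosis)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def background_hazard(age, a=5e-5, b=0.09):
    """Hypothetical Gompertz-style stand-in for a life-table mortality rate."""
    return a * math.exp(b * age)


def truncated_hazard(t, age_at_diagnosis, shape, scale):
    """Floor the fitted hazard at the background hazard for the attained age."""
    fitted = weibull_hazard(t, shape, scale)
    expected = background_hazard(age_at_diagnosis + t)
    return max(fitted, expected)


if __name__ == "__main__":
    # Illustrative values only: a decreasing Weibull hazard and a single average age.
    for t in (1, 10, 20, 40):
        print(t, round(truncated_hazard(t, 60, shape=0.8, scale=30), 4))
```

Using one average age for the whole cohort, as here, mirrors the ad hoc character of truncation; excess hazard modeling instead matches background rates to each individual.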
Excess hazard modeling is an instrumental and practical method to implement GPM adjustment that partitions overall mortality rates into the expected rates (of the general population) and the excess rates (due to the disease). The expected rates are obtained from GPM rates, which are commonly available from life tables matched to the study population. The excess rates are estimated from the model fitted to the trial data, which indicate the additional mortality risk incurred within the study population.
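The partition described above can be written compactly. The notation is ours rather than the source's: h is the all-cause hazard, h* the expected (general population) hazard, λ the excess hazard, S* the expected survival, and R the relative survival.

```latex
h(t) = h^{*}(t) + \lambda(t), \qquad
S(t) = S^{*}(t)\, R(t), \qquad
R(t) = \exp\!\left(-\int_{0}^{t} \lambda(u)\, du\right)
```

Only λ(t) is estimated from the study data; h*(t) and S*(t) are taken as known from the matched life tables.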
The excess mortality risk observed during the trial period is expected to diminish over time. Excess hazard (EH) models may project excess mortality rates that reflect this decreasing risk, but this requires trial data with sufficient maturity to enable the models to capture the decreasing excess rate. If the all-cause mortality rate eventually converges with the expected rate (i.e., the excess rate approaches zero), then this indicates the absence of cancer-driven excess risk, implying that the patients have reached a state of “statistical cure.”9,14 EH cure models that enforce this behavior in the long term can be fitted. However, in a patient population with severe disease and a poor prognosis, cure may not be expected, resulting in a mortality risk that consistently exceeds the GPM risk. Some types of cancer, such as lymphoma and leukemia, have longer survival and require lifelong oncology surveillance, during which an excess mortality persists compared with the general population over an extended period.15,16
This study focuses on survival extrapolation in HER2-positive breast cancer patients, a subgroup of interest following the introduction of targeted therapies such as trastuzumab.17 HER2 overexpression not only serves as a prognostic marker but has also become crucial for determining the suitability of anti-HER2 therapies,18,19 making it essential to study the impact of extrapolation methods. By combining clinical and pathologic markers, breast cancer can be classified into subgroups, such as cases in which HER2-positive tumors may be considered low risk when treated with trastuzumab with or without endocrine therapy. Defining low risk in breast cancer necessitates factoring in patient-related elements such as comorbidities and age, as the risk of competing mortality may outweigh concerns about cancer recurrence from a patient’s perspective.20
In this study, we aim to assess accuracy and variability of survival extrapolations using SPD models and EH models, with and without assuming cure, using survival data on HER2-positive breast cancer patients from the National Breast Cancer Register of Sweden (NBCR).
Methods
Breast Cancer Data
The NBCR has collected data on primary invasive and in situ breast cancer since 2008. The register is known for high data quality for research with high completeness and coverage across regions and years.21 Considering the extensive follow-up and precise mortality data linked to the registry, this data source is highly appropriate for evaluating the predictive accuracy of modeling approaches. Data were extracted on HER2-positive breast cancers diagnosed between January 1, 2008, and December 31, 2020. All patients were followed until death or censored on March 19, 2022. Ethical permission to conduct the current study was granted by the Swedish Ethical Review Authority (Etikprövningsmyndigheten) in February 2023.
Survival Models
We fitted survival models, separately by cancer stage, using SPD models and EH models with or without a cure assumption based on a range of distributions including exponential, Weibull, Gompertz, gamma, log-normal, log-logistic, and generalized gamma. The framework of these models is explained in Supplementary Table 1 with further details elsewhere.10 EH models were fitted using Swedish life tables for the expected GPM rates matched on age, sex, and calendar year. EH cure models were fitted using a mixture-cure model, which considers the relative survival as a mixture of 2 latent subpopulations: one that is cured and never experiences mortality due to the disease and the other that is uncured with nonzero excess mortality.22 Over time, this model leads to excess rates in the population tending toward zero. If the probability of cure is not a function of covariates, then it can be interpreted as an overall cure fraction (the proportion of the population estimated to be eventually cured of their disease if other causes of mortality were not acting on the population). The cure fraction should be interpreted cautiously since its estimation is sensitive to the distribution choice for the uncured component of the model.23,24 When the probability of cure is zero, EH cure models collapse to EH no-cure models.
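To make the mixture-cure structure concrete, here is a minimal sketch with a Weibull distribution for the uncured component. The parameter values and function names are illustrative assumptions, not the fitted models from this study. It shows the two properties stated above: the population excess hazard tends toward zero over time, and a cure fraction of zero recovers the no-cure excess hazard.

```python
import math


def weibull_survival(t, shape, scale):
    """Survival of the uncured subpopulation, S_u(t)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """Excess hazard of the uncured subpopulation (t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def relative_survival(t, cure_fraction, shape, scale):
    """Mixture-cure relative survival: R(t) = pi + (1 - pi) * S_u(t)."""
    return cure_fraction + (1 - cure_fraction) * weibull_survival(t, shape, scale)


def excess_hazard(t, cure_fraction, shape, scale):
    """Population excess hazard implied by the mixture: (1 - pi) * f_u(t) / R(t)."""
    s_u = weibull_survival(t, shape, scale)
    density_u = weibull_hazard(t, shape, scale) * s_u
    return (1 - cure_fraction) * density_u / relative_survival(t, cure_fraction, shape, scale)


if __name__ == "__main__":
    # Illustrative values only: 60% cure fraction, Weibull(shape=1.5, scale=4).
    for t in (1, 5, 10, 20):
        print(t, relative_survival(t, 0.6, 1.5, 4), excess_hazard(t, 0.6, 1.5, 4))
```

Relative survival plateaus at the cure fraction, which is why the estimated cure fraction depends so heavily on how quickly the chosen uncured distribution decays.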
Analysis
Baseline patient characteristics were stratified by cancer stage and summarized using counts and percentages for categorical variables and mean and standard deviation for continuous variables. Chi-square tests and analysis of variance tests with standardized mean differences were used to compare characteristics across cancer stages. The Kaplan–Meier (KM) estimator was used to calculate survival probabilities from the date of primary diagnosis to date of death with patients administratively censored at the end of the follow-up.
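For readers less familiar with the KM estimator, a minimal pure-Python sketch follows. The function name and the toy data are ours; a real analysis would use an established survival package rather than this code.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival_probability)] for right-censored data.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group everyone whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Toy data: five patients, two of them censored.
    print(kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 0, 1]))
```

Patients censored at the same time as a death are counted as still at risk for that death, which is the usual convention.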
Models were fitted using trimmed analysis datasets at specific data cutoff points, reflecting the common practice in economic evaluations using immature trial data. Data cutoffs were selected to ensure sufficient statistical power, based on the number of events required for survival analysis (i.e., 6 y for stage I and II cancers, 4 y for stage III, and 2 y for stage IV). Model predictions were compared with the KM estimates up to the maximum follow-up in the registry. To assess the accuracy of the predictions between the alternative approaches, we compared the 15-y restricted mean survival time (RMST) from the registry with the individual model estimates and the Akaike information criterion (AIC)–weighted average estimate, with 95% confidence intervals (CIs) derived via bootstrapping (N = 100). To assess variability in long-term survival predictions due to the choice of survival model (between-model variance), the between-model variance in 50-y RMST estimates between the alternative approaches was calculated. A higher between-model variance indicates greater variability in projections. Within-model variances, which pertain to stochastic error within the predictions of a single model, were reported with 95% CI per model by stage.
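The quantities in this paragraph can be sketched in a few lines. The source does not give the exact formulas, so the following are assumptions: RMST is computed as the area under a survival curve by the trapezoid rule, the AIC-weighted average uses standard Akaike weights, and between-model variance is taken as the sample variance of the models' point estimates. All function names and numbers are illustrative.

```python
import math


def rmst(survival_fn, horizon, steps=10000):
    """Restricted mean survival time: area under S(t) from 0 to horizon."""
    dt = horizon / steps
    total = 0.5 * (survival_fn(0.0) + survival_fn(horizon))
    for k in range(1, steps):
        total += survival_fn(k * dt)
    return total * dt


def aic_weights(aics):
    """Akaike weights: w_i proportional to exp(-0.5 * (AIC_i - min AIC))."""
    best = min(aics)
    raw = [math.exp(-0.5 * (a - best)) for a in aics]
    norm = sum(raw)
    return [r / norm for r in raw]


def aic_weighted_average(estimates, aics):
    """Weighted average of model estimates using Akaike weights."""
    return sum(w * e for w, e in zip(aic_weights(aics), estimates))


def between_model_variance(estimates):
    """Sample variance of point estimates across candidate models (assumed definition)."""
    mean = sum(estimates) / len(estimates)
    return sum((e - mean) ** 2 for e in estimates) / (len(estimates) - 1)


if __name__ == "__main__":
    # Two hypothetical exponential models with made-up AIC values.
    models = [lambda t: math.exp(-0.05 * t), lambda t: math.exp(-0.08 * t)]
    estimates = [rmst(m, 15) for m in models]
    print(estimates, aic_weighted_average(estimates, [100.0, 102.0]))
    print(between_model_variance(estimates))
```

Subtracting the minimum AIC before exponentiating leaves the weights unchanged and avoids numerical underflow when AIC values are large.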
Results
Of 12,345 patients with invasive HER2-positive breast cancer, 1,027 patients were excluded for unknown status for mortality and cancer stage. After excluding 31 patients who received previous treatments for breast cancer or other cancers before their primary diagnosis and 63 male patients, 11,224 treatment-naïve patients were finally included. The cancer severity was defined as cancer stage I to IV, which was derived from the original records according to the TNM (tumor, node, metastasis) staging system.25 Cases of carcinoma in situ (stage 0) were not included in the analysis. Figure 1 illustrates the process of patient inclusion.

Figure 1. The CONSORT flow diagram.
Table 1 presents baseline patient characteristics by cancer stage. Most of the patients were diagnosed at an early stage (stage I: 41.8%, stage II: 47.7%, stage III: 6.8%, and stage IV: 3.7%), and there was a trend of increasing age with severity of the disease, with an overall mean age of 60.2 y (SD: 14.2 y). Most patients were postmenopausal (66.7%), estrogen receptor positive (67.4%), progesterone receptor negative (52.1%), fluorescence in situ hybridization amplified (88.4%), and had ductal carcinoma (87.7%) and Nottingham histological grade ≥2 (85.2%). Preoperation therapy including breast cancer conservation therapy, primary operation, and neoadjuvant or adjuvant antibody therapy (mostly trastuzumab) were commonly received except by stage IV patients. The median follow-up until death or censoring was 5.71 y (
Table 1. Baseline Patient Characteristics by Cancer Stage
FISH, fluorescence in situ hybridization; IHC, immunohistochemistry; SMD, standardized mean difference.
Survival Analysis Using SPD Models
To illustrate the implications of extrapolation without external information, survival data were fitted separately by cancer stage using SPD models and projected over a 50-y time horizon along with the expected survival of the matched general population across cancer stages. Figure 2 displays the survival projections using the Weibull distribution. Notably, the long-term survival projections of the early-stage cancer patients surpassed those of the matched general population, highlighting the issue with extrapolations based solely on short-term data. The Weibull models predicted a mortality rate (hazard) that plateaus for each cancer stage and drops below the GPM rate at 10, 20, 30, and 40 y for cancer stages I, II, III, and IV, respectively. Other distributions suffered from a similar issue, with predicted long-term hazard rates that were lower than GPM rates (Supplementary Figures 2 and 3). AIC-based goodness-of-fit comparisons across the full suite of SPD models showed that the exponential and Gompertz models gave relatively poor fits (Supplementary Table 2).

Figure 2. Survival by cancer stage using a standard parametric distribution (Weibull) projected over a 50-y time horizon, compared with the expected survival from the age-, sex-, and calendar year–matched Swedish life table.
Comparison between SPD Models and EH Models with or without a Cure Assumption
Survival extrapolations based on immature data with early data cutoff compared with the mid-term KM data are presented for stage II in Figure 3 and for other cancer stages in Supplementary Figures 4 to 6. For the early-stage cancers with large sample sizes, these models generally aligned well with the KM data. For the late-stage cancers with smaller sample sizes, the models' projections deviated substantially from one another. In stage II, both SPD and EH no-cure models underestimated survival by year 15, whereas EH cure models provided a closer match to the mid-term KM data. In stage I, the survival projection was similar across the models, yet EH cure models most closely reflected the KM data. A similar pattern was observed in stage III, in which the AIC-averaged projections from SPD models and EH no-cure models underestimated 15-y survival, whereas the AIC-averaged EH cure models provided more accurate predictions. In stage IV, in which the cure assumption is less plausible, the AIC-averaged EH no-cure models delivered the mid-term projections that most closely aligned with the KM data.

Figure 3. Survival extrapolation over a 15-y time horizon using standard parametric distribution models versus excess hazard models based on data cutoff at 6 y in stage II. (a) Standard parametric models. (b) Excess hazard (no-cure) models. (c) Excess hazard (cure) models. Gompertz was removed because of poor convergence in the excess hazard (no-cure) model. The vertical dashed lines represent the maximum follow-up of KM data, while the vertical solid lines represent data cutoff. AIC, Akaike information criterion; KM, Kaplan–Meier.
Based on these models, long-term survival was projected across the cancer stages. The variance in the model projections was pronounced for SPD models, which projected large deviations toward the end of the projection across all cancer stages, whereas EH models projected survival probabilities converging to zero. In stage II, although SPD models displayed small between-model variance up to midterm survival, the deviation continued to increase, resulting in large deviations by the end of the projection. Meanwhile, EH models effectively reduced between-model variance (Figure 4). This pattern was similarly observed for other cancer stages (Supplementary Figures 7–9).

Figure 4. Survival extrapolation over a 50-y time horizon using standard parametric distribution models versus excess hazard models based on data cutoff at 6 y in stage II. (a) Standard parametric models. (b) Excess hazard (no-cure) models. (c) Excess hazard (cure) models. Gompertz was removed because of poor convergence in the excess hazard (no-cure) model. The vertical solid lines represent data cutoff. AIC, Akaike information criterion.
Table 2 presents the RMST for each model across cancer stages, all of which were fitted to the maximum follow-up KM data from the registry. A significant reduction in between-model variance in RMST was observed when comparing SPD models to EH models with and without a cure assumption. Overall, EH models, regardless of assuming cure, tended to predict lower RMST compared with SPD models. Notably, in early-stage cancer, SPD models estimated higher RMST, such as 32.5 y for stage I, which exceeded the expected RMST of 26.0 y in the general population.
Table 2. Restricted Mean Survival Time (95% Confidence Interval) at 50 y by Cancer Stage
Gompertz was removed in stage I because of poor convergence in the excess hazard (no-cure) model. Excess hazard models using generalized gamma estimated the excess hazard to be nearly zero, in which case the predicted survival is just the expected survival and confidence intervals are not given.
The expected 50-y RMST in the Swedish life tables matched on age, sex, and calendar year of the cancer registry was 26.0 y.
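The expected RMST acts as a ceiling for any EH projection made for the same matched population. A short derivation in our own notation, assuming a nonnegative excess hazard so that the relative survival R(t) never exceeds 1, with S*(t) the expected survival and τ the time horizon:

```latex
S(t) = S^{*}(t)\, R(t) \le S^{*}(t)
\quad\Longrightarrow\quad
\mathrm{RMST}(\tau) = \int_{0}^{\tau} S(t)\, dt \;\le\; \int_{0}^{\tau} S^{*}(t)\, dt
```

SPD models carry no such constraint, which is how a projected RMST can exceed that of the general population.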
Discussion
EH cure models yielded the most precise estimates of mid-term survival, except for stage IV, in which a cure assumption was less plausible and EH models without cure most closely aligned with the KM data. The between-model variance in long-term extrapolation was especially large among patients with early-stage cancer, whose prognosis was better. This likely arises because extrapolation is more challenging when survival remains high at the end of follow-up, which leaves greater room for deviation in the later phase of the extrapolation.
The between-model variances in survival and RMST estimates were significantly reduced across cancer stages with EH models compared with SPD models. These estimates were derived from a large population registry cohort. However, deviations in mid- and long-term survival and RMST projections might be greater if the models are fitted to studies with smaller sample sizes, such as those from clinical trials. Although the significant reduction in between-model variances does not confirm that the survival projections from EH models are more accurate than those from SPD models given the absence of KM data for comparison over a 50-y horizon, the results suggest that EH models provide more consistent long-term survival projections. It is also difficult to trust the accuracy of SPD models given that their survival and RMST estimates for early-stage cancers are higher than those for the matched general population. Our findings align with a recent study that compared survival extrapolation with and without GPM adjustment via a relative survival framework using the Swedish cancer registry. That study found that, for predicting 40-y survival, extrapolations from a relative survival framework corresponded more closely with observed survival from the registry, although the outcome was not stratified by cancer stage.11
Survival extrapolation is the main source of uncertainty in an economic evaluation, and it often leads to inconclusive decisions. Among 22 health economic assessments of PD-(L)1 inhibitors published by the Swedish Dental and Pharmaceutical Benefits Agency, 96% were found to be either uncertain or very uncertain, and the main source of the uncertainty (59%) was survival extrapolation.26 Inconclusive health economic evaluations may lead to suboptimal practice of value-based pricing and delay in access to innovative oncology medicines, which are problems not only for patients and their caregivers who may benefit from the treatments but also for health care providers and health care decision makers, who bear responsibility for the substantial social and economic burden of cancer.27–30 Our findings suggest an alternative method to reduce uncertainty in economic evaluations and to improve the process of information-based decision making on novel medicines.
The findings from the survival analysis of the NBCR data were comparable with the trial follow-up data of HER2-positive early-stage breast cancer patients who were treated with trastuzumab.17 That study reported a 12-y overall survival (OS) rate of 73% (hormone receptor–positive cohort: 76%, hormone receptor–negative cohort: 70%), whereas we estimated a 10-y OS rate of 74.3% in all patients, 84.4% for stage I patients, and 71.5% for stage II patients. Our study extended the long-term follow-up outcomes to late-stage patients (53.8% 10-y OS rate for stage III patients and 20.4% for stage IV patients).
Our study has the following limitations. First, due to lack of data, we did not investigate projections for relapse-free survival (RFS) in early-stage breast cancer or progression-free survival (PFS) in late-stage breast cancer, which are other important outcome measures for novel oncology medicines. However, we may infer that RFS/PFS extrapolation can also be improved by EH models, since several studies have shown a correlation between OS and RFS/PFS.31–33 Second, we did not consider a spline-based method, which could be more flexible to simultaneously model the registry data and external data.34 Third, the impact on comparative outcomes of cost-effectiveness such as incremental life-years and the incremental cost-effectiveness ratio was not investigated, which could be an area for future research. This analysis was conducted using breast cancer registry data with relatively long follow-up, which enabled us to validate mid-term extrapolations from the models. However, registry data are more limited when exploring how different methods might affect estimates of relative treatment effects.
Conclusion
Survival extrapolation with EH models may be preferred to SPD models to reduce uncertainty in economic evaluations when the study population is adequately matched with the general population. Our findings suggest that the most plausible survival extrapolations are provided by EH models with or without a cure assumption. EH cure models may be considered for patients with a favorable prognosis, while EH models without a cure assumption may be considered for patients with a poor prognosis.
Supplemental Material
Supplemental material, sj-docx-1-mdm-10.1177_0272989X241275969 for General Population Mortality Adjustment in Survival Extrapolation of Cancer Trials: Exploring Plausibility and Implications for Cost-Effectiveness Analyses in HER2-Positive Breast Cancer in Sweden by Kun Kim, Michael Sweeting, Nils Wilking and Linus Jönsson in Medical Decision Making
