Abstract
Introduction:
With recent advances in breast cancer (BC) treatment, the disease-free survival (DFS) of patients is increasing and the risk factors for recurrence and metastasis are changing. However, a dynamic approach to assessing the risk of recurrent metastasis in BC is currently lacking. This study aimed to develop a dynamically changing prediction model for recurrent metastases based on conditional survival (CS) analysis.
Methods:
Clinical and pathological data from patients with BC who underwent surgery at the Affiliated Hospital of Qingdao University between August 2011 and August 2022 were retrospectively analysed. The risk of recurrence and metastasis in patients with different durations of prior disease-free survival was calculated using CS analysis, and a risk prediction model was constructed.
Results:
A total of 4244 patients were included in this study, with a median follow-up of 83.16 ± 31.59 months. Our findings suggested that the real-time DFS of patients increased over time, and the likelihood of DFS after surgery correlated with the number of years of prior survival. We explored different risk factors for recurrent metastasis in baseline patients, 3-year, and 5-year disease-free survivors, and found that low HER2 was a risk factor for subsequent recurrence in patients with 5-year DFS. Based on this, conditional nomograms were developed. The nomograms showed good predictive ability for recurrence and metastasis in patients with BC.
Conclusion:
Our study showed that the longer patients with BC remained disease-free, the greater their chances of remaining disease-free. Predictive models for recurrence and metastasis risk based on CS analysis can help improve the confidence of patients fighting cancer and help doctors personalise treatment and follow-up plans.
Plain language summary
With recent advances in breast cancer (BC) treatment, the disease-free survival of patients is increasing and the risk factors for recurrence and metastasis are changing. One of the key risk factors is the human epidermal growth factor receptor 2 (HER2). However, the recent advent of anti-HER2 antibody-drug conjugates (ADC) has challenged the traditional binary classification based on HER2. Patients in the traditional HER2-negative group can now be further classified as HER2-low (ISH-negative with IHC1 or IHC2) or HER2-0 (ISH-negative and IHC-0). Does this categorisation also have some value for the prognosis of BC?
To answer this question, we retrospectively analysed the clinical and pathological data of BC patients who underwent surgery at the Affiliated Hospital of Qingdao University between August 2011 and August 2022. The risk of recurrence and metastasis in patients with different durations of prior disease-free survival was calculated using conditional survival analysis, and a risk prediction model was constructed. Our findings suggested that the real-time disease-free survival (DFS) of patients increased over time, and the likelihood of DFS after surgery correlated with the number of years of prior survival. Conditional nomograms were developed for baseline patients, 3-year and 5-year disease-free survivors. The nomograms showed good predictive ability for recurrence and metastasis in patients with BC.
Introduction
The prognosis of patients with breast cancer (BC) has improved significantly in recent years. 1 However, 10–30% of patients with BC still experience recurrence and metastases after surgery.2–4 A number of clinicopathological factors, including age at diagnosis, tumour size, number of axillary lymph node metastases, lymphovascular invasion (LVI) status, and histological grading, have been shown to correlate with the prognosis of BC. 5
One of the key BC indicators is the human epidermal growth factor receptor 2 (HER2). The traditional classification of patients into HER2-positive and HER2-negative groups is based on immunohistochemistry (IHC) and in situ hybridisation (ISH). 6 Anti-HER2 drugs are used primarily in HER2-positive patients. However, the recent advent of anti-HER2 antibody–drug conjugates (ADC) has opened up new therapeutic options for BC and challenged the traditional binary classification based on HER2. Patients in the traditional HER2-negative group can now be further classified as HER2-low (ISH negative with IHC1 or IHC2) or HER2-0 (ISH negative and IHC-0). This classification also has some value for determining BC prognosis.7,8
Various risk assessment techniques have been developed to determine the risks of recurrence and metastasis in patients with BC, in light of these changes.9–11 However, traditional survival analyses can only take the patient’s status at the time of initial diagnosis as the starting point for assessment. While these data may be valuable at the time of initial diagnosis, such static assessments may be irrelevant for cases with better prognoses. The risk of BC recurrence should be assessed dynamically, as the risk factors can change over time with follow-up. 12 The risk factors for recurrence and metastasis in patients with BC vary with longer disease-free survival (DFS).
Conditional survival (CS) was initially proposed to define the probability of a patient surviving y years, after surviving x years following initial diagnosis and treatment. 13 It was gradually extended to conditional disease-free survival (CDFS). CDFS can provide a more dynamic risk of recurrence and help analyse the risk factors for recurrence accordingly. Currently, CDFS analysis is widely used for many malignancies, including colorectal, gastric, and hepatocellular cancers.14–16 However, no CS prognostic model has been developed for BC that includes the new HER2 classification statuses.
In this study, CS analysis was used to calculate patient CDFS, identify clinically and pathologically significant risk variables for recurrence and metastasis in patients with HER2-negative BC, and build useful predictive models.
Materials and methods
Patient selection
We retrospectively collected the clinical and pathological data of patients with BC who underwent surgery at the Affiliated Hospital of Qingdao University (Qingdao, China) between August 2011 and August 2022. The inclusion criteria were as follows: (1) patients with pathologically proven invasive BC and stage cT1-3, N0-3, and M0 disease based on the Eighth TNM classification of the American Joint Committee on Cancer (AJCC) 17 ; (2) patients who were treated via mastectomy or breast-conserving surgery combined with sentinel lymph node biopsy or axillary dissection (a small number of breast reconstruction surgeries were classified as mastectomy); (3) patients who had at least 6 months of follow-up data after their surgeries; and (4) patients who had detailed HER2 IHC and ISH results and were diagnosed with HER2-low (ISH negative with IHC1 or IHC2) or HER2-0 (ISH negative and IHC-0) BC.
The exclusion criteria were as follows: (1) patients with severe additional malignant tumours; (2) patients who had been diagnosed with non-primary BC; (3) patients whose medical records were lost during follow-up; and (4) patients who did not receive postoperative radiotherapy and chemotherapy according to appropriate guidelines.
In the end, the study comprised 4244 patients who met the inclusion and exclusion criteria. According to a 7:3 ratio, all patients were divided into a training (2983 patients) and a validation (1261 patients) group. To avoid any possible bias, we compared baseline characteristics between the two groups, where we found no significant differences. The final follow-up was conducted in August 2022. Follow-up data were collected through outpatient reviews, telephone calls, or text messages. The reporting of this study conforms to the Strengthening the Reporting of Observational Studies in Epidemiology statement. 18
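As an illustration of a 7:3 division, the following minimal sketch shuffles a list of patient identifiers with a fixed seed and cuts it at 70%. The identifiers, the seed, and the function name are hypothetical; the text does not describe the exact randomisation procedure. The reported cohort sizes (2983 and 1261) differ slightly from an exact 70% cut of 4244, so the procedure actually used probably differed in detail.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.7, seed=2022):
    """Shuffle patient IDs reproducibly and cut them into two cohorts."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # hypothetical fixed seed so the split is repeatable
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical IDs 1..4244 standing in for the study cohort.
train_ids, validation_ids = split_train_validation(range(1, 4245))
print(len(train_ids), len(validation_ids))
```

After any such split, baseline characteristics of the two cohorts would still need to be compared, as the authors did.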
Data collection
We retrospectively analysed the clinical and pathological data of the patients. Age at diagnosis, menopausal status, pathology type, T stage, N stage, histologic grade, LVI status, HER2 status, hormone receptor (HR) status, Ki67 expression, surgery type, and survival outcomes were selected as the research variables.
HR status was considered positive if >1% of the tumour cells showed nuclear staining for the oestrogen receptor (ER) or progesterone receptor (PR) by IHC, and HR and endocrine therapy (ET) were combined into HR/ET. This was further divided into three categories: HR+/ET+ were patients who were HR positive and received standardised ET; HR+/ET− were patients who were HR positive and did not receive standardised ET; and HR−/ET− were patients who were HR negative and did not receive ET. Ki-67 was divided into high- and low-expression groups using a cut-off point of 20% expression. 19 T and N stages were based on postoperative pathological results according to the AJCC’s Eighth TNM classification. HER2-low was defined as a HER2 score of 1 or 2 according to IHC with a negative ISH assay, and HER2-0 was defined as a HER2 score of 0 according to the same evaluation. The histological and IHC results for all tumour sections were assessed multiple times by two senior pathologists under 100× magnification using a light microscope.
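The variable definitions above can be sketched as simple coding rules. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the HER2-positive branch follows standard practice rather than anything stated here (such patients were outside the study population), and counting a Ki-67 value of exactly 20% as high is an assumption, since only the cut-off itself is given.

```python
def classify_her2(ihc_score, ish_positive):
    """Map an IHC score (0-3) and an ISH result to a HER2 category."""
    if ihc_score not in (0, 1, 2, 3):
        raise ValueError("IHC score must be 0, 1, 2 or 3")
    # Assumption (standard practice, not stated in the text): IHC 3+ or a
    # positive ISH assay is HER2-positive, a group this study did not include.
    if ihc_score == 3 or ish_positive:
        return "HER2-positive"
    return "HER2-0" if ihc_score == 0 else "HER2-low"


def classify_hr_et(hr_positive, received_et):
    """Combine hormone receptor status and endocrine therapy into HR/ET."""
    if hr_positive:
        return "HR+/ET+" if received_et else "HR+/ET-"
    if received_et:
        # The text defines only three categories; this combination is not one of them.
        raise ValueError("HR-negative with ET is not a defined category")
    return "HR-/ET-"


def classify_ki67(percent_positive, cutoff=20.0):
    """Split Ki-67 into high/low expression at the 20% cut-off."""
    # Assumption: a value exactly at the cut-off is counted as high.
    return "high" if percent_positive >= cutoff else "low"


print(classify_her2(2, False), classify_hr_et(True, False), classify_ki67(35.0))
```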
DFS was defined as the time between the original surgery and the first recurrence or metastasis. CDFS, such as 3-year-CDFS, was defined as the period for 3-year disease-free survivors from the third year of the postoperative period to the time of recurrence and/or metastasis. The diagnosis of recurrence and metastasis was based on histological evidence or corroborative ultrasound, brain magnetic resonance imaging, computed tomography (CT), radionuclide bone scan, or positron emission tomography/CT (PET/CT) evidence. For the first 5 years after surgery, every patient was followed up every 6 months; after 5 years, they were followed up annually. The final follow-up visit was conducted in February 2023.
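One way to read the CDFS definition is as a landmark analysis: keep only patients who are still disease-free and under follow-up at the landmark, and restart their clock there. The sketch below illustrates this under stated assumptions: the record layout, field names, and example patients are hypothetical, and patients whose follow-up ends exactly at the landmark are excluded.

```python
def landmark_cohort(patients, landmark_months):
    """Return patients still disease-free at the landmark, with time reset to zero there."""
    survivors = []
    for p in patients:
        # dfs_months is the time to recurrence/metastasis or to censoring.
        # Anyone with an event or censoring at or before the landmark drops out.
        if p["dfs_months"] > landmark_months:
            survivors.append({
                "id": p["id"],
                "dfs_months": p["dfs_months"] - landmark_months,
                "event": p["event"],
            })
    return survivors


# Hypothetical records: event=True means recurrence or metastasis was observed.
patients = [
    {"id": "A", "dfs_months": 20.0, "event": True},
    {"id": "B", "dfs_months": 30.0, "event": False},
    {"id": "C", "dfs_months": 50.0, "event": True},
    {"id": "D", "dfs_months": 90.0, "event": False},
]
three_year_survivors = landmark_cohort(patients, 36)
print([(p["id"], p["dfs_months"]) for p in three_year_survivors])
```

Risk factors can then be re-estimated within each landmark cohort, which is how the 3-year and 5-year models differ from the baseline model.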
Statistical analysis
Student’s t-tests and chi-squared tests were used to compare continuous and categorical data, respectively. Continuous data are shown as means and standard deviations, while categorical variables are shown as numbers and percentages.
The concept of CS analysis is based on the conditional probability proposed by Hansen and Rees. CS analysis is used to describe the survival of patients who have already survived for a specific number of years. CDFS analyses are used to determine the DFS of patients who have already remained disease-free for a specific number of years. For example, the X-year-CDFS is the DFS rate of a patient who has remained disease-free for X years. Therefore, the X-year-CDFS of Y years is calculated by dividing the DFS rate at X + Y years by the DFS rate at X years, that is, CDFS(Y | X) = DFS(X + Y) / DFS(X). Risk variables for recurrence and metastasis were determined using Cox regression modelling. Variables with p values of ⩽0.05 in the univariate Cox analysis were included in the multivariate Cox analysis. Multivariate Cox regression analysis was performed using stepwise backward selection. Finally, factors with prognostic significance in the multivariate Cox regression analyses were used to construct a nomogram to predict CDFS after the completion of surgery and at 3 and 5 years post-surgery.
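The ratio that defines CDFS can be illustrated with a small, self-contained sketch: a Kaplan–Meier estimate of DFS followed by the division of two survival probabilities. The follow-up times and events below are hypothetical and chosen only to keep the arithmetic easy to follow; the study itself used R.

```python
def kaplan_meier(times, events):
    """Return [(time, S(time))] at each distinct event time (product-limit estimate)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


def survival_at(curve, t):
    """Step-function lookup of S(t) on a Kaplan-Meier curve."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


def conditional_survival(curve, x, y):
    """P(disease-free at x + y | disease-free at x) = S(x + y) / S(x)."""
    s_x = survival_at(curve, x)
    if s_x == 0:
        raise ValueError("no patients remain disease-free at the conditioning time")
    return survival_at(curve, x + y) / s_x


# Hypothetical follow-up in years; 1 = recurrence/metastasis, 0 = censored.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
curve = kaplan_meier(times, events)
print(round(survival_at(curve, 5), 4))              # unconditional 5-year DFS
print(round(conditional_survival(curve, 3, 5), 4))  # 5 more years, given 3 disease-free
```

In this toy cohort the early events are already behind a 3-year survivor, so the conditional estimate exceeds the unconditional one, which is the pattern CS analysis is designed to expose.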
To evaluate the predictive performance of the nomogram, receiver operating characteristic (ROC) curves and time-dependent areas under the ROC curve (AUC) were constructed. To determine how well the predicted probabilities agreed with the observed outcomes, a calibration curve was constructed. A bootstrapping method with 500 re-samples was used to examine both calibration and discrimination. A decision curve analysis (DCA) plot was used to assess the clinical utility of the nomogram.
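The two quantities behind the ROC and DCA evaluations can be sketched in a few lines. This is a deliberately simplified illustration: it assumes a binary outcome at one fixed time horizon with no censoring, whereas the time-dependent AUC and survival DCA used in the study account for censored follow-up. The risks and outcomes are hypothetical.

```python
def auc(scores, outcomes):
    """Probability that a random event case scores higher than a random non-event case."""
    pos = [s for s, o in zip(scores, outcomes) if o]
    neg = [s for s, o in zip(scores, outcomes) if not o]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


def net_benefit(risks, outcomes, threshold):
    """Net benefit of acting on every patient whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(risks)
    true_pos = sum(1 for r, o in zip(risks, outcomes) if r >= threshold and o)
    false_pos = sum(1 for r, o in zip(risks, outcomes) if r >= threshold and not o)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


# Hypothetical predicted recurrence risks and observed outcomes (1 = recurrence).
risks = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80]
outcomes = [0, 0, 1, 0, 1, 1]
model_auc = auc(risks, outcomes)
model_nb = net_benefit(risks, outcomes, threshold=0.3)
print(round(model_auc, 3), round(model_nb, 3))
```

A decision curve is obtained by repeating the net-benefit calculation over a range of thresholds and comparing the model with the treat-all and treat-none strategies.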
All patients then were split into ‘low-risk profile’ and ‘high-risk profile’ groups based on the median scores, and the difference in survival was calculated using Kaplan–Meier curves. The log-rank test was used to compare the results. Statistical significance was defined as p < 0.05. R version 3.5.3 (R core development team, Vienna, Austria; https://r-project.org/) was used to perform all statistical analyses.
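The median-based stratification can be sketched as follows. The total nomogram points are hypothetical, and two details are assumptions because they are not specified here: patients whose score equals the median are placed in the low-risk group, and the median is computed within the cohort being stratified.

```python
from statistics import median


def stratify_by_median(total_points):
    """Split patients into risk groups at the median nomogram score.

    total_points maps a patient ID to that patient's summed nomogram points.
    """
    cutoff = median(total_points.values())
    groups = {
        # Assumption: a score exactly at the median counts as low risk.
        pid: "high-risk profile" if pts > cutoff else "low-risk profile"
        for pid, pts in total_points.items()
    }
    return cutoff, groups


# Hypothetical total points for four patients.
scores = {"P1": 40, "P2": 85, "P3": 120, "P4": 60}
cutoff, groups = stratify_by_median(scores)
print(cutoff, groups)
```

The two resulting groups are what the Kaplan–Meier curves and log-rank test then compare.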
Results
Patient characteristics and follow-up
Based on the inclusion criteria, 4244 patients were included in this study. The median follow-up time was 83.16 ± 31.59 months, and the mean age was 52.73 ± 11.36 years. By the end of the observation period, 322 patients had experienced cancer events, including 204 with local recurrence and 118 with distant metastases. Based on a 7:3 ratio, we placed 2983 patients in the training cohort and 1261 in the validation cohort (Figure 1). No significant differences were observed in the baseline characteristics and tumour features between the two groups (Table 1).

Flowchart for the study sample.
Conditional overall survival estimates.
DFS, disease-free survival.
Instantaneous DFS risk and survival curve
Real-time DFS increased over time, and the likelihood of DFS following surgery was associated with the number of disease-free years. For example, the 10-year disease-free survival rate increased from 92.70% immediately after surgery to 93.70%, 94.47%, 95.13%, 95.26%, 96.66%, 97.11%, 97.27%, 98.47%, and 99.37% for patients who had already remained disease-free for 1–9 years, respectively. The longer patients remained disease-free, the greater their chances of remaining disease-free. For example, the 5-year DFS rate for postoperative patients was 95.15%; it increased to 96.94% if the patients remained disease-free for 2 years, rose again to 98.14% after 4 disease-free years, and eventually levelled off. CDFS probabilities are shown in Table 2.
Comparison of demographics and tumour characteristics.
BMI, body mass index; IDC, infiltrating ductal carcinoma; ILC, infiltrating lobular carcinoma; ET, endocrine therapy; HER2, human epidermal growth factor receptor 2; HR, hormone receptor; LVI, lymphovascular invasion; SD, standard deviation.
Feature selection and nomogram construction
The results of Cox regression analysis suggested that at baseline, the independent risk variables for DFS were age, T stage, N stage, grade, LVI, and HR status. In patients with 36 months of event-free survival, the independent risk variables for DFS were age, menopause status, T stage, N stage, and LVI. However, in patients with 60 months of event-free survival, we found that the independent risk variables were pathology, T stage, N stage, and HER2 status (Tables 3–5).
Cox regression analysis for baseline patients.
CI, confidence interval; HER2, human epidermal growth factor receptor 2; HR, hormone receptor; LVI, lymphovascular invasion; OR, odds ratio; NA, not available.
Cox regression analysis for 3-year disease-free survivors.
CI, confidence interval; LVI, lymphovascular invasion; HER2, human epidermal growth factor receptor 2; HR, hormone receptor; OR, odds ratio; NA, not available.
Cox regression analysis for 5-year disease-free survivors.
CI, confidence interval; HER2, human epidermal growth factor receptor 2; LVI, lymphovascular invasion; HR, hormone receptor; OR, odds ratio; NA, not available.
Based on these risk factors, we constructed survival prediction models at baseline [Figure 2(a)], 36 months [Figure 2(c)], and 60 months [Figure 2(e)]. Each risk factor in a nomogram was assigned a corresponding score, and the scores of the different risk factors were summed to obtain a total score, from which the probability of recurrence and metastasis could be predicted.
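How a Cox-based nomogram turns risk factors into points and a probability can be sketched as follows. All numbers are hypothetical: the coefficients, the baseline survival, and the 0–100 scaling are illustrative and are not the fitted values from this study. The sketch also assumes every coefficient is non-negative relative to its reference level, which keeps the point scale simple.

```python
import math

# Hypothetical log hazard ratios relative to each reference level (coefficient 0).
COEFFICIENTS = {
    "t_stage": {"T1": 0.0, "T2": 0.45, "T3": 0.90},
    "n_stage": {"N0": 0.0, "N1": 0.60, "N2": 1.10, "N3": 1.50},
    "her2": {"HER2-0": 0.0, "HER2-low": 0.40},
}
# Hypothetical 5-year DFS for a patient at every reference level.
BASELINE_SURVIVAL_5Y = 0.98

# Scale so that the single largest effect is worth 100 points.
MAX_COEFFICIENT = max(v for levels in COEFFICIENTS.values() for v in levels.values())
POINTS_PER_UNIT = 100.0 / MAX_COEFFICIENT


def nomogram_points(patient):
    """Points contributed by each risk factor for one patient."""
    return {var: COEFFICIENTS[var][patient[var]] * POINTS_PER_UNIT for var in COEFFICIENTS}


def predicted_dfs(total_points, baseline_survival=BASELINE_SURVIVAL_5Y):
    """Convert total points back to the Cox linear predictor, then to a DFS probability."""
    linear_predictor = total_points / POINTS_PER_UNIT
    return baseline_survival ** math.exp(linear_predictor)


patient = {"t_stage": "T2", "n_stage": "N1", "her2": "HER2-low"}
points = nomogram_points(patient)
total = sum(points.values())
dfs_5y = predicted_dfs(total)
print(points, round(total, 1), round(dfs_5y, 3))
```

The predicted probability of recurrence or metastasis over the same horizon is one minus the predicted DFS.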

DFS predictive nomogram model and time-dependent AUC of the model. (a) Nomogram for baseline patients; (b) time-dependent AUC for baseline patients; (c) nomogram for 3-year disease-free survivors; (d) time-dependent AUC for 3-year disease-free survivors; (e) nomogram for 5-year disease-free survivors; and (f) time-dependent AUC for 5-year disease-free survivors.
Conditional nomogram construction and validation
In the CDFS prediction model for patients who had just completed surgery, the AUC was 0.726 in the training cohort and 0.719 in the validation cohort. The time-dependent AUC curves for CDFS were plotted according to the sensitivity and specificity of the dynamic changes, and the progression of the AUCs is shown in Figure 2(b,d,f). The agreement between nomogram predictions and actual observations was very good in both the training and validation cohorts (Figure 3). The DCA curves indicated that our nomograms were able to accurately predict the probability of recurrence and metastasis. Figure 4 suggests that the model yielded a high net clinical benefit.

Calibration curves for predicting 3-year and 5-year DFS rates in the training and validation cohorts. (a–d) 3-year and 5-year DFS rates in the training cohort and 3-year and 5-year DFS rates in the validation cohort for the baseline nomogram; (e–h) 3-year and 5-year DFS rates in the training cohort and 3-year and 5-year DFS rates in the validation cohort for the 3-year disease-free survivors nomogram; (i–l) 3-year and 5-year DFS rates in the training cohort and 3-year and 5-year DFS rates in the validation cohort for the 5-year disease-free survivors nomogram.

DCA curves of the nomogram. (a) DCA of 3-year DFS rates in baseline nomogram; (b) DCA of 5-year DFS rates in baseline nomogram; (c) DCA of 3-year DFS rates in 3-year disease-free survivors nomogram; (d) DCA of 5-year DFS rates in 3-year disease-free survivors nomogram; (e) DCA of 3-year DFS rates in 5-year disease-free survivors nomogram; and (f) DCA of 5-year DFS rates in 5-year disease-free survivors nomogram.
Meanwhile, in the 3-year-CDFS prediction model built for patients who had survived disease-free for 36 months, the AUC values for the training and validation cohorts were 0.715 and 0.697, respectively. In the 5-year-CDFS prediction model for patients who had survived disease-free for 60 months, the AUC values for the training and validation cohorts were 0.727 and 0.688, respectively. The 3- and 5-year-CDFS models showed good AUC curves, DCA curves, and clinical benefits.
These findings suggest that the generated competing risk nomogram model has good predictive power for recurrence and metastasis in BC.
Risk stratification revalidation
The sum of the risk scores was calculated for each patient in the constructed 0-year-CDFS nomogram, and the median score was used as the threshold for risk stratification. Patients in the training and validation cohorts were divided into two groups based on their risk levels, to form ‘low-risk profile’ and ‘high-risk profile’ groups. Kaplan–Meier curves were constructed concurrently for the two subgroups, and the log-rank test showed that the curves of the training (p < 0.001) and validation cohorts (p < 0.001) were significantly different.
The training (p = 0.001) and validation (p = 0.005) cohorts of the 3-year-CDFS model, and the training (p = 0.001) and validation (p = 0.042) cohorts of the 5-year-CDFS model, produced similar findings (Figure 5). These results suggest that the nomograms can clearly differentiate between the probabilities of recurrence and metastasis in patients with HER2-negative BC.

Kaplan–Meier curves of DFS rates for risk stratification: (a) in the training cohort of baseline nomogram; (b) in the validation cohort of baseline nomogram; (c) in the training cohort of 3-year disease-free survivors nomogram; (d) in the validation cohort of 3-year disease-free survivors nomogram; (e) in the training cohort of 5-year disease-free survivors nomogram; and (f) in the validation cohort of 5-year disease-free survivors nomogram.
Discussion
In this study, the CDFS probabilities of patients with HER2-negative BC were evaluated, and a CDFS probability plot was constructed. It confirms that the longer patients remain disease-free, the greater their chances of remaining disease-free. Our AUCs and calibration plots showed good discriminatory ability for predicting recurrence and metastasis in patients with BC.
HER2-negative BCs make up approximately 80% of BCs. 20 Traditionally, this patient group did not benefit from conventional anti-HER2 agents (e.g. trastuzumab and pertuzumab). 21 With the emergence of novel anti-HER2 ADCs such as T-DXd and the validation of their clinical benefits in patients with BC who have low levels of HER2 expression, a new classification of HER2 status has gained increasing attention in recent years.22,23 Currently, this type of analysis focuses on standard clinicopathological features or the expression of common biomarkers.24,25 Its significance in predicting prognosis and survival has not yet been well elucidated; thus, a better understanding of the characteristics of HER2-low BC is essential for the development of future therapies.
Most traditional survival estimates begin after surgery, providing the probability of a patient’s DFS from that moment until another point in time and allowing for the assessment of prognostic differences using the Kaplan–Meier method.26,27 However, as patient prognostic statuses improve with increasing survival time, staging systems or prognostic scores for survival estimates based on postoperative baseline information have become inaccurate, so many patients do not receive prognostic information based on their current statuses. 28 Follow-up should be a dynamic process; therefore, we developed a dynamic assessment tool to provide accurate and dynamic DFS probabilities during follow-up. Our study showed a 5-year DFS rate of 95.15% for patients at initial diagnosis. After 3 years of follow-up, if a patient remained event-free, their 5-year DFS probability increased to 97.66%. These types of predictions can greatly enhance patients’ confidence levels as they fight cancer, reduce their anxiety levels, and improve their adherence to follow-up. This will also allow clinicians to adjust each patient’s monitoring plan in real time, according to changes in risk. 29
The risk factors for recurrence and metastasis also vary with each patient’s DFS time, so we constructed conditional nomograms based on the corresponding pattern of change. Our results showed that, at baseline, the independent risk variables for DFS were age, T stage, N stage, grade, LVI, and HR status, with T and N stages being the most influential factors. In patients with 36 months of event-free survival, the independent risk variables for DFS were age, menopausal status, T stage, N stage, and LVI, all of which are consistent with previous reports.30–32 In patients with 60 months of event-free survival, the independent risk variables we identified were histologic subtype, T stage, N stage, and HER2 status. It is worth noting that when survival was analysed for patients at baseline, HR+ was one of the risk factors for DFS. However, when patients with 36 and 60 months of event-free survival were analysed using CS, HR+ was no longer a risk factor, which is inconsistent with previous studies. We hypothesise that this may be related to changes in treatment regimens in recent years, such as intensive treatment with CDK4/6 inhibitors for HR+ high-risk BC patients. 33 Some variables that were significant at baseline, such as histological grade and LVI, were not retained as predictive variables in the 60-month model, probably because, although these two variables are usually associated with a poorer prognosis, their impact on prognosis gradually diminishes as DFS increases.
Some previous studies have shown no significant difference in survival between the HER2-low and HER2-0 groups. 34 Some studies have also shown that patients in the HER2-low group had a better prognosis than those in the HER2-0 group after stratification according to HR status, suggesting that the prognostic impact of HER2-low status varied according to HR status.35,36 In the present study, HER2-low status did not show prognostic value for DFS in patients at baseline or in those with 36 months of event-free survival after surgery, similar to the results of previous studies. However, in patients who had remained disease-free for 60 months, HER2-low was a risk factor for subsequent recurrence, which differed from the results of previous studies. We hypothesise that this may be related to the higher proportion of HR− patients in the training cohort (20.55%). It may also be related to the diminished role of traditional risk factors for recurrence (e.g. Ki-67) in patients who have survived disease-free for 5 years. Moreover, previous studies have focused on DFS at 3 or 5 years after surgery, whereas for patients with 60 months of event-free survival, HER2-low may indeed be a risk factor that has been overlooked.
The innovation in this study is that the instantaneous risk of recurrence and metastasis was calculated for patients at different periods. This study included HER2-low and HER2-0 in conditional nomograms for the first time and developed a recurrence risk prediction model for different survival periods. We confirmed that the model has good accuracy and clinical validity. In addition, compared to previous studies, this study was based on 10-year follow-up data from more than 4000 HER2-negative cases of BC, which represents a large sample size and long follow-up period.
Because pathological assessment was more limited 10 years ago, we were unable to obtain data on certain indicators that can affect survival, such as tumour-infiltrating lymphocytes. Moreover, the time span of this study is long, and patients’ postoperative treatment regimens, such as adjuvant chemotherapy, may have varied, which can affect the discriminative performance of the model and may explain its relatively low AUC values (especially in the validation group). In addition, ‘local recurrence’ and ‘ipsilateral second primary BC’ were not distinguished in some of the early cases, and both were classified as recurrences, which may have led to an overestimation of recurrence events. Finally, this study lacked an external validation group, which will be remedied by a multicentre prospective study in the future.
Conclusion
We calculated the instantaneous risk of recurrence and metastasis at different time points, based on 10-year follow-up data from more than 4000 patients with HER2-negative BC, and confirmed that the longer patients with BC remain disease-free, the greater their chances of remaining disease-free. This adds to our understanding of the dynamic assessment of BC prognosis and the full treatment course. In addition, our CDFS results identified factors such as HER2-low as risk factors for patients with different survival periods. Based on these results, we further constructed conditional nomograms to predict the risk of recurrence and metastasis for patients over different periods, which may help raise confidence levels in such patients as they battle cancer, as well as facilitate the development of personalised treatment and follow-up plans by clinicians.
