Abstract
Background
Immune checkpoint inhibitors (ICIs) provide a significant survival benefit in non-small cell lung cancer (NSCLC) patients; however, accurately predicting which patients will benefit remains a challenge. As previously shown, the STOP model, a machine learning model based on serum tumor markers, is capable of identifying non-responders after 6 weeks of ICIs.
Objective
This study aims to externally validate this model and to assess the predictive value in combination with radiological response assessment using RECIST criteria.
Methods
In a cohort of 242 metastatic NSCLC patients, CYFRA, CEA, and NSE were measured before start and after 6 weeks of ICI treatment. The ability of the STOP model to predict no durable benefit (NDB; progressive disease, death within 6 months or disease control of less than 6 months) was assessed using specificity and positive predictive value (PPV). Moreover, a combination of the STOP model with RECIST after 6–8 weeks of ICIs was investigated.
Results
The STOP model achieved a specificity of 96% (95% CI 95%–97%) and a PPV for predicting NDB of 88.1% (95% CI 85.9%–90.3%). Combining the STOP model with RECIST improved specificity and PPV to 100% and predicted NDB a median of 11.6 weeks (IQR 1.8–18.0 weeks) before radiologically defined progression.
Conclusions
After 6 weeks of ICIs, the blood-based STOP model was capable of accurately predicting NDB in metastatic NSCLC patients, earlier than conventional radiological assessment. The combined serological and radiological response assessment creates an early opportunity to safely stop ICI treatment in patients who will not benefit, although the clinical utility of the assay is limited since the high specificity comes at the cost of a lower sensitivity.
Highlights
• Only a minority of NSCLC patients treated with immunotherapy achieve a durable clinical benefit.
• Our study successfully validated a tumor marker model to predict which patients will not develop a durable benefit.
• The STOP model (Serum Tumor marker-based Outcome Prediction model) based on CYFRA, CEA, and NSE measurements achieved 96% specificity.
• The combined model predicted no durable benefit a median of 11.6 weeks earlier than conventional radiological assessment.
• This model creates an early opportunity to stop immunotherapy in patients who will not benefit.
Introduction
Immune checkpoint inhibitors (ICIs) have provided a significant survival benefit in metastatic non-small cell lung cancer (NSCLC) patients.1–4 Nonetheless, the 5-year overall survival of 17–24% in patients treated with first-line ICIs shows that only a minority of patients experience a durable clinical benefit.5–7 Accurately predicting which patients will benefit remains a challenge. For NSCLC patients not harboring an actionable mutation, ICIs with or without concurrent chemotherapy have become the standard first-line therapeutic approach.8 In current clinical practice, an oncologist’s decision to continue or terminate ICI therapy is based on tumor response assessment including clinical symptoms and radiographic evaluation using Response Evaluation Criteria in Solid Tumors (RECIST).9 However, radiographic assessment may not correspond to clinical benefit since response to ICIs can be delayed and extrathoracic disease can be missed in routine follow-up with thoracic CT scans. Timely identification of no durable benefit (NDB) could enable early treatment discontinuation, preventing prolonged exposure to ineffective and potentially harmful therapy and providing a window of opportunity for alternative, potentially more beneficial treatment options.
Two biomarker strategies may be used to minimize the amount of ICIs administered to patients who eventually turn out not to benefit from this therapy. First, before the start of treatment, biomarkers can select those patients who are most likely to benefit from ICIs. Examples of upfront predictive biomarkers are PD-L1 and tumor mutational burden (TMB). However, several studies have shown that predictions by PD-L1 and TMB are not precise enough to withhold treatment since lung cancers across all PD-L1 expression levels and TMB values may respond to ICIs.10,11 Other limitations include the temporal and spatial heterogeneity of PD-L1 expression and the lack of consensus on which TMB assay and corresponding cut-off point to use.12,13 A second strategy is to use biomarkers at an early stage of treatment to identify those patients who will likely not respond to therapy. The advantage of this approach is that actual tumor response can be captured and therefore prediction may be more precise. Since the consequence of incorrectly stopped treatment includes the loss of potential treatment benefit, false positive results (i.e., NDB prediction in patients who do benefit) should be minimized. Therefore, high specificity and positive predictive value (PPV) of NDB predictions are necessary.
Previous studies have shown the potential of serial measurements of carcinoembryonic antigen (CEA), cytokeratin 19 fragment (CYFRA), neuron-specific enolase (NSE), and cancer antigen 125 (CA125) in monitoring therapy response and early detection of disease progression in NSCLC patients.14–18 This study investigates whether these markers are capable of predicting ICI benefit. Van Delft et al. compared multiple methods of model development, including logistic regression and machine learning techniques, to predict non-response using tumor markers in the first 6 weeks of ICI treatment.19 In a cohort of NSCLC patients treated with ICIs, an algorithm including CYFRA, CEA, and NSE provided the most robust performance and identified 61% of patients with a NDB (i.e., progressive disease, death within 6 months of treatment, or disease control of less than 6 months) at a specificity of 95%. Previous studies have also shown prognostic value of RECIST criteria since an increase in tumor size was associated with shorter OS.20,21
Implementation of biomarkers is often challenging and only few biomarkers make it to actual use in clinical practice, mainly due to a lack of external validation in the target population.22,23 Therefore, we aimed to (1) externally validate a previously developed tumor marker algorithm to predict NDB based on measurements obtained during the first 6 weeks of ICIs in a real-world cohort and (2) assess the added value of combining this serum tumor marker algorithm with radiological response evaluation using RECIST.
Methods
Study population
Clinical characteristics of the validation cohort (n = 242).
SD: standard deviation, ECOG: Eastern Cooperative Oncology Group, PD-L1: programmed death-ligand 1.
aMono immunotherapy consisted of pembrolizumab (68 patients), nivolumab (24 patients), atezolizumab (8 patients), avelumab (2 patients), or durvalumab (1 patient). Chemo-immunotherapy regimes consisted of carbo- or cisplatin, paclitaxel or pemetrexed with pembrolizumab (82 patients) or carboplatin, paclitaxel, bevacizumab with atezolizumab (34 patients).
bThe category “other” histology subtype includes large cell neuroendocrine carcinoma (10 patients), NSCLC-NOS (not otherwise specified; 10 patients), adenosquamous carcinoma (5 patients), and simultaneous adenocarcinoma and squamous cell carcinoma (1 patient).
cThe category “other” reasons for treatment discontinuation includes patient’s preference, patient’s condition, or complications.
dThe category “other” RECIST includes patients where no CT scan was performed after 6 months because of poor condition, patients who were already deceased or CT scan evaluation after 6 months was not applicable because patients developed progressive disease earlier and started a new treatment.
This study was conducted according to the guidelines of the Declaration of Helsinki. This study was approved by the local institutional review board of the Netherlands Cancer Institute (IRBd21-149) and the medical research ethics committee of the Radboud university medical center (2021–13207). This validation has been performed according to the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) statement. 25
Tumor marker measurements
Blood samples for tumor marker measurements were collected prior to treatment initiation and every 3 weeks thereafter as part of regular care. Directly after blood collection, the concentrations of CYFRA, CEA, CA-125, and NSE were assessed using a Roche Cobas 6000, 8000 or a Roche Cobas Pro analyzer system (Roche Diagnostics, Germany). Tumor marker results at two time points, baseline and 6 weeks after ICI initiation, were required. The baseline measurement was defined as the measurement between 31 days before and 7 days after treatment initiation. The 6-week time point aligned with the first CT scan to evaluate tumor response during treatment and was set between 35 and 49 days after start of treatment. If multiple measurements within the predefined time range were available, the tumor marker measurement closest to day 0 (i.e., start of treatment) or day 42 (i.e., week 6) was taken.
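The selection rule for the two time points can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function name `pick_measurement`, the tuple layout, and the example CYFRA values are hypothetical, and the handling of ties (the earlier measurement wins) is an assumption because the text does not specify it.

```python
from typing import Optional, Sequence, Tuple

# Windows in days relative to the start of treatment (day 0), as described in the text.
BASELINE_WINDOW = (-31, 7)   # 31 days before to 7 days after treatment initiation
WEEK6_WINDOW = (35, 49)      # 35 to 49 days after treatment initiation

# (day relative to treatment start, marker concentration)
Measurement = Tuple[int, float]


def pick_measurement(measurements: Sequence[Measurement],
                     window: Tuple[int, int],
                     target_day: int) -> Optional[Measurement]:
    """Return the measurement inside `window` that lies closest to `target_day`.

    Returns None when no measurement falls inside the window. On a tie the
    first measurement in the input order is kept (an assumption).
    """
    first_day, last_day = window
    in_window = [m for m in measurements if first_day <= m[0] <= last_day]
    if not in_window:
        return None
    return min(in_window, key=lambda m: abs(m[0] - target_day))


# Hypothetical CYFRA series for one patient: (day, ng/mL).
cyfra = [(-40, 3.1), (-12, 3.4), (2, 3.6), (21, 4.0), (38, 5.2), (44, 5.9), (63, 7.5)]

baseline = pick_measurement(cyfra, BASELINE_WINDOW, target_day=0)
week6 = pick_measurement(cyfra, WEEK6_WINDOW, target_day=42)
print(baseline, week6)
```

A patient for whom either call returns `None` would lack a required time point and could not be scored by the model.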
Tumor marker models for validation
In a previous study, tumor marker models based on combinations of CYFRA, CEA, CA-125, and NSE were developed.19,26 Model development aimed to predict non-response with high specificity in order to avoid false positive results, which could lead to undertreatment when ICI therapy is withdrawn. The five models which achieved high specificity (i.e., >90%) and reasonable sensitivity (i.e., >55%) in the validation set of that study were selected for external validation: (1) boosting with CYFRA and CEA; (2) boosting with CYFRA, CEA, and NSE; (3) boosting with CYFRA, CEA, NSE, and CA-125; (4) random forest with CYFRA, CEA, and NSE; and (5) recurrent neural network with CYFRA, CEA, and NSE.
To assess the predictive value of radiographic evaluation by RECIST, we calculated the concordance between progressive disease according to RECIST at 6–8 weeks after treatment initiation and NDB. Next, we investigated whether combining early RECIST evaluation with a serum tumor marker model would increase accuracy in predicting NDB compared to using RECIST or the serum tumor marker model alone. For this purpose, we calculated the specificity, sensitivity, PPV, and NPV of (1) the serum tumor marker model alone; (2) RECIST alone; and (3) the serum tumor marker model combined with RECIST (Figure 1(a)).
Figure 1. Study design. (a) A previously developed tumor marker model based on CYFRA, CEA, and NSE measurements at baseline (day 0) and week 6 (i.e., STOP model) was combined with radiographic evaluation by RECIST after 6 weeks to predict NDB after 24 weeks of ICI treatment. (b) Flowchart of inclusion of patients in this validation cohort.
Statistical analyses
Sample size calculations were performed to assess how many patients in the external validation cohort would be needed to obtain results with a predefined confidence interval. 27 Based on an observed/expected ratio, C-statistic and net benefit calculation which target a 95% confidence width of 0.3, 0.15 and 0.3, respectively, the largest sample size needed was 206 patients with a minimum of 93 events (i.e., NDB cases).
For each tumor marker model, only complete cases with measurement results of the included tumor markers were analyzed. For sensitivity, specificity, PPV, and NPV, 95% confidence intervals for proportions were calculated. Potential confounding by type of treatment, treatment line, histology subtype and smoking status was assessed by calculating the performance of the model in different subgroups. Kaplan–Meier analyses were conducted for PFS and OS and the logrank test was used for statistical comparison between groups. Hazard ratios were calculated using Cox proportional hazards regression analyses. A calibration plot was designed to test the agreement between the observed and predicted outcomes and the calibration slope was calculated. All analyses were executed in R version 4.0.4 using “ggplot,” “survival,” and “survminer” packages.28–30
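As an illustration of how the four performance measures and their confidence intervals follow from a 2 × 2 table, the sketch below treats an NDB prediction as the positive test result. Two points are assumptions rather than facts from the study: the text does not state which interval method was used, so the Wilson score interval is shown as one common choice, and the counts are illustrative values chosen so that the point estimates land near the reported ones (the actual tables are in Supplementary Table 5). The resulting intervals therefore need not match the published ones.

```python
import math
from typing import Dict, Tuple


def wilson_ci(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Tuple[float, float, float]]:
    """Return (estimate, ci_low, ci_high) per metric; 'positive' means an NDB prediction."""
    fractions = {
        "sensitivity": (tp, tp + fn),  # NDB patients correctly predicted as NDB
        "specificity": (tn, tn + fp),  # DCB patients correctly predicted as DCB
        "ppv": (tp, tp + fp),          # NDB predictions that were truly NDB
        "npv": (tn, tn + fn),          # DCB predictions that were truly DCB
    }
    return {name: (k / n, *wilson_ci(k, n)) for name, (k, n) in fractions.items()}


# Illustrative counts only (assumption), not taken from the supplementary tables.
results = diagnostic_metrics(tp=37, fp=5, fn=60, tn=117)
for name, (estimate, low, high) in results.items():
    print(f"{name}: {estimate:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Because a false positive here means stopping treatment in a patient who would have benefited, specificity and PPV are the two rows to read first.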
Results
Clinical characteristics of the validation cohort
In total, 242 patients were included (Figure 1(b), Supplementary Figure 1). Patients had a mean age of 64 years, were mainly current or former smokers (88.4%) and mostly diagnosed with an adenocarcinoma (74.4%) (Table 1). Approximately half of the cohort was treated with ICIs only (46.7%) and the other half was treated with ICIs in combination with chemotherapy (53.3%). Most patients received ICIs as a first-line treatment (46.7%). In total, 111 out of 242 patients developed a NDB (45.9%). The median PFS and OS of the whole cohort were 7.3 months (95% CI 6.4–8.9 months) and 19.0 months (95% CI 15.7–24.4 months), respectively (Supplementary Figure 2A-B). The median follow-up of our cohort is 16.0 months (range 1.4–90.6 months).
Performance of the tumor marker model in the external validation cohort
Predictive accuracy of STOP, RECIST, and the combination of these models.
aEarly PD; progressive disease at the first RECIST evaluation at 6–8 weeks after start of treatment, sensitivity; proportion of non-responders which are correctly predicted by the test, specificity; proportion of responders which are correctly predicted by the test, PPV, positive predictive value; proportion of predicted non-responders which are actually non-responders, NPV, negative predictive value; proportion of predicted responders which are actually responders.
bCombined model; NDB prediction by the STOP model in combination with early PD at CT scan evaluation by RECIST after 6–8 weeks of ICI treatment.
The detailed patient counts in 2 × 2 matrices are represented in Supplementary Table 5.
In five patients the STOP model incorrectly predicted a NDB outcome, resulting in the specificity of 96% (Supplementary Table 2; Supplementary Table 4A-B; Figure 2(a)). One of these patients was diagnosed with a membranous nephropathy secondary to lung cancer, resulting in elevated CEA throughout ICI treatment (range 296–1300 ng/mL) (Supplementary Figure 4). In two patients, an elevation in NSE was observed which likely resulted in the false positive prediction of the STOP model. In the other two patients, a small absolute elevation in tumor marker concentrations was present, but because the baseline concentrations were low this resulted in a relatively large percentage increase.
Figure 2. Swimmer plots including STOP model and combined model predictions. (a) The predictions by the STOP model are categorized as DCB prediction (green circle) and NDB prediction (red circle). Swimmer plot in which patients are categorized based on whether they developed DCB (left window, n = 131) or NDB (right window, n = 111). All individual patients are represented as bars on the y-axis and time in months is plotted on the x-axis. The duration of ICI treatment is shown in light blue and the duration of follow-up in light gray. If a patient died, this is indicated with a gray square at the end of the bar. The occurrence of progressive disease is indicated with the orange triangle. (b) The predictions by the combined model (imaging and STOP model) are categorized as DCB prediction (green circle) and NDB prediction (red circle).
The STOP model, based on tumor marker measurements before start and after the first 6 weeks of treatment, predicted NDB with a median of 11.7 weeks (IQR 2.1–18.1 weeks) prior to radiologically defined progression, indicating a clinically significant gain in time by the STOP model compared to regular radiological assessment (Figure 2).
NDB prediction by the STOP model was associated with a median PFS of 1.9 months (95% CI 1.5–3.4 months) compared to 9.3 months (95% CI 8.1–14.5 months) for DCB prediction (p < 0.0001). The hazard ratio (HR) to differentiate between patients with and without response to ICIs was 4.2 (95% CI 2.9–6.1) (Figure 3). The calibration plot of the STOP model showed that the model performs well in the validation setting and that the event rate of NDB approaches the risk of NDB predicted by the model (Supplementary Figure 5). These data show that after 6 weeks of ICI treatment, the blood-based STOP model was capable of accurately predicting non-response in metastatic NSCLC patients.
Figure 3. Progression-free survival analysis of the STOP model. Kaplan–Meier curve of progression-free survival categorized by the STOP model prediction at 6–8 weeks after ICI initiation.
Additional value of the STOP model next to RECIST evaluation
Progressive disease (PD) according to the first RECIST evaluation at 6–8 weeks after start of treatment corresponded with NDB after 6 months in 87% of patients (Table 2). Moreover, the proportion of patients with a DCB who were correctly predicted as non-PD by RECIST was 96%, which is comparable to the specificity of 96% of the STOP model. RECIST might be limited in its specificity and PPV for predicting NDB due to pseudoprogression. In our cohort, five patients in total (2.3%) were misclassified as having progressive disease at the first RECIST evaluation while they developed a durable benefit. For all of these five patients, the STOP model predicted a DCB, underscoring the added value of this model (Figure 4; Supplementary Figure 6; Supplementary Table 4A-B).
Figure 4. Predictions by the STOP model in relation to RECIST evaluations. Scatter plot with class probabilities of the STOP model predictions on the y-axis and RECIST evaluation 6–8 weeks after ICI initiation on the x-axis. Patients were stratified by outcome of NDB (green circle) versus DCB (red triangle). The horizontal line at Y = 0.75 indicates the threshold identified in the previously published cohort by van Delft et al. to discriminate between NDB and DCB predictions of the STOP model.
Combining RECIST with the STOP model resulted in a specificity of 100% and PPV of predicting NDB of 100%, compared to 95.9% and 87.2% for RECIST and 95.9% and 88.1% for the STOP model alone, highlighting the complementarity of these approaches. The gain in specificity and PPV of the combined model comes at the cost of a decrease in sensitivity to 21%. The combined model predicted NDB with a median of 11.6 weeks (IQR 1.8–18.0 weeks) prior to radiologically defined progression, indicating a similar gain in time by the combined model compared to the STOP model alone.
When using the combined model, the median PFS of patients with a NDB prediction was 1.3 months (95% CI 1.2–1.5 months) compared to 8.7 months (95% CI 7.4–10.9 months) in patients with a DCB prediction (p < 0.0001; Figure 5(a); Supplementary Figure 7A). Evaluation of PFS depending on the outcome of the combined model showed a HR of 23.7 (95% CI 12.6–44.6), confirming the ability of the model to stratify patients based on their response to ICI treatment. The OS analysis showed a significant difference as well, with 4.9 months (95% CI 2.9–12.4 months) versus 24.2 months (95% CI 18.8–32.5 months) for NDB prediction versus DCB prediction, respectively, and a HR of 4.3 (95% CI 2.6–7.1) (p < 0.0001; Figure 5(b), Supplementary Figure 7B).
Figure 5. Survival analyses of the STOP model in combination with RECIST. (a) Kaplan–Meier curve of progression-free survival categorized by prediction of the STOP model combined with RECIST evaluation at 6–8 weeks after ICI initiation. (b) Kaplan–Meier curve of overall survival categorized by prediction of the STOP model combined with RECIST evaluation at 6–8 weeks after ICI initiation. * PR/SD; partial response or stable disease, PD; progressive disease.
How to use the combined model in clinical practice
In our cohort, 20 patients (9%) were categorized as non-responders by the combined model and indeed no patient achieved durable benefit (Figure 6). In these patients, ICIs could have been safely discontinued after a median of 6.1 weeks after the start of treatment. In 41 patients (19%), the predictions were contradictory (i.e., PD combined with negative STOP prediction or CR/PR/SD with a positive STOP prediction) and out of these 41, 31 patients (76%) developed NDB. Since the chance of NDB is relatively high, we would advise closely monitoring these patients and awaiting the next radiological evaluation at 3 months after start of treatment. Lastly, 158 patients (72%) were categorized as responders by the combined model. In these patients we would continue ICIs after the first 6 weeks of treatment.
Figure 6. Potential application of the combined model to personalize treatment. Patients start with (chemo)-immunotherapy according to the national guidelines and after 6–8 weeks, radiological evaluation by RECIST and blood-based evaluation of serum tumor markers by the STOP model will be performed. When the CT scan indicates PD in combination with a positive prediction by the STOP model, our advice would be to stop ICIs and switch to another treatment. If the results are contradictory (i.e., PD combined with negative STOP prediction or PR/SD with positive STOP prediction), we would advise awaiting the next radiological evaluation. In case PR or SD is seen and the STOP model is negative, we would continue ICIs.
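The three-way rule described above can be written as a small decision function. This is a hedged sketch for illustration only: the names `triage` and `Advice` are hypothetical, and it assumes that the STOP model outputs a class probability for NDB that counts as a positive prediction at or above the 0.75 threshold mentioned in the Figure 4 caption. Whether the published model uses "greater than" or "greater than or equal to" is not stated in the text.

```python
from enum import Enum


class Advice(Enum):
    STOP = "stop ICIs and consider switching treatment"
    MONITOR = "monitor closely and await the next radiological evaluation"
    CONTINUE = "continue ICIs"


# Threshold taken from the Figure 4 caption; treating it as inclusive is an assumption.
NDB_THRESHOLD = 0.75
RECIST_CATEGORIES = {"CR", "PR", "SD", "PD"}


def triage(stop_probability: float, recist: str) -> Advice:
    """Combine the STOP prediction with the first RECIST evaluation (6-8 weeks)."""
    category = recist.upper()
    if category not in RECIST_CATEGORIES:
        raise ValueError(f"unknown RECIST category: {recist!r}")
    stop_positive = stop_probability >= NDB_THRESHOLD  # STOP model predicts NDB
    early_pd = category == "PD"                        # progression at first scan
    if stop_positive and early_pd:
        return Advice.STOP       # both tests agree on no durable benefit
    if stop_positive or early_pd:
        return Advice.MONITOR    # contradictory results
    return Advice.CONTINUE       # both tests agree on benefit


for probability, recist in [(0.90, "PD"), (0.90, "SD"), (0.20, "PD"), (0.20, "PR")]:
    print(probability, recist, "->", triage(probability, recist).value)
```

Requiring both tests to be positive before advising discontinuation is what trades sensitivity for specificity in the combined model.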
Discussion
Here we present the external validation of the previously published STOP model, which includes CYFRA, CEA and NSE measurements in the first 6 weeks of ICI treatment. Our results show that the STOP model was able to achieve 96% specificity in an external cohort. Furthermore, no confounding by type of therapy, line of treatment and histology subtype was observed. By combining the STOP model with readily available RECIST evaluations at 6–8 weeks after initiation of therapy, a specificity and positive predictive value of 100% were reached. Furthermore, in cases misclassified as PD by RECIST (i.e., patients with pseudoprogression), the STOP model predicted DCB, which highlights the added value of this model. To the best of our knowledge, this is the first study that successfully validated a serum tumor marker model in order to discontinue ICI treatment in NSCLC patients without clinical benefit at an early stage.
Nowadays, the majority of metastatic NSCLC patients start with a treatment regimen which includes ICIs. Since the effect of ICI treatment may not be directly noticeable and progressive disease on a CT scan may also be the result of pseudoprogression, the decision to stop ICIs is often postponed.31 As a result, patients remain on ICIs for prolonged periods of time, which can have several negative effects. A higher number of pembrolizumab cycles has been associated with an increased risk of immune-related adverse events, implying that early discontinuation could reduce the risk of adverse events.32 Moreover, grade 3 or 4 immune-related adverse events have been reported in up to 20% of patients with single-agent ICI treatment, indicating the potential reduction in burden of immune-related adverse events.33 Furthermore, the prolonged treatment in these patients also has a significant financial impact on the healthcare system since ICI treatment costs up to 132,535 dollars per patient per year.34
In our external validation cohort, the majority of patients were treated with ICI as first-line treatment (46.7%), whereas in the cohort of van Delft et al., in which the STOP model was developed, almost all patients received ICI as part of a second or later line treatment (98.0%, Supplementary Table 6).26 In our cohort, the combination of ICIs with chemotherapy was more prominent (53.3% vs 0%) and a lower percentage of patients with NDB (45.9%) was observed compared to the cohort of van Delft et al. (68.4%). Despite these differences between the cohorts, the STOP model achieved 96% specificity when predicting DCB, which is similar to the 95% specificity in the cohort in which the model was developed. The sensitivity, however, was significantly lower compared to the van Delft et al. cohort (38% vs 61%, respectively). Although this reduces the number of patients eligible to stop treatment early, it does not impact the validity of these decisions since the specificity remained high.
To implement a predictive biomarker model in clinical care, certain requirements should be considered, including the diagnostic performance, external validation and practical feasibility. The diagnostic performance of a test can be expressed in terms of specificity, sensitivity, positive and negative predictive value. These metrics show the actual effect of a predictive biomarker in a certain patient population. Nonetheless, previous studies investigating the predictive performance of tumor markers in the context of ICIs mainly focused on other outcome parameters such as a stratification in PFS or area under the curve (AUC).16,17,35,36 Although these studies do show a clear difference between high and low risk groups, clinical utility of the biomarker cannot be judged based on these results. In our study, we investigated a tumor marker model specifically developed to withhold non-beneficial treatment. We focused on a high specificity and PPV for the prediction of NDB, since this is needed to minimize false positive predictions of NDB and thereby avoid treatment discontinuation in patients who are expected to benefit.
Assessing the performance of a predictive model in a cohort that differs from the training cohort and is similar to the population of intended usage is deemed necessary to assess the generalizability.23 Adding more biomarkers to a single prediction model makes it more prone to overfitting, and validation is warranted to avoid this pitfall.37 Nonetheless, only a minority of the published studies about biomarker models validate their results in a separate validation cohort. Positive examples include Donker et al., who presented a naïve Bayes classifier based on smoking status, histology, therapy line and 7 genes tracked in ctDNA that could predict durable benefit with 84% sensitivity and 55% specificity in a separate test cohort.38 Furthermore, Nabet et al. developed a Bayesian model based on TMB, CD8 fractions, and ctDNA fold change within the first 4 weeks of treatment which achieved a specificity of 94% in their validation cohort.39 Although the performance of the biomarker model of Nabet et al. is in line with our tumor marker model, their validation cohort consisted of only 38 patients and therefore warrants replication in a larger cohort. In contrast, we successfully included the number of patients required for sufficient power, based on sample size calculations performed before the start of the study.
Lastly, a biomarker model should be easy to implement, for which serum tumor markers have significant advantages over other biomarkers. Serum tumor markers have been (pre-)analytically well characterized and are generally available.40 Other advantages are their quick turnaround time, the robust and quality-controlled automated instruments, and the cost-efficient measurements since they are relatively cheap, allowing testing in a serial manner. Upfront biomarkers which are only measured before treatment, such as PD-L1, cannot predict NDB precisely enough to withhold treatment (Supplementary Table 7). Circulating tumor DNA (ctDNA) is another biomarker capable of discriminating between ICI responders and non-responders.41–43 However, a major disadvantage of ctDNA is that it cannot be detected in 7–27% of metastatic NSCLC patients at baseline when using panel sequencing.41–43 Larger gene panels and increased depth of coverage are needed to increase utility, but will also significantly increase costs, hampering implementation of ctDNA as a monitoring tool in clinical practice.
When we started with the inclusion of patients, tumor marker measurements were not yet part of regular clinical practice. During our inclusion period, tumor marker measurements became common practice in our hospitals. As a result, tumor marker data were missing in 70% of patients who were theoretically eligible based on stage and treatment with ICIs.
Although this may limit the generalizability of our results, the main advantage of our study is that it comprises a real-world cohort which was treated with conventional treatment following the clinical guidelines. Therefore, the results of our cohort may still be more generalizable than those of a prospective study, for which strict inclusion criteria are often used. An intrinsic limitation of tumor markers is the fact that these markers are not specific for a certain cancer entity and various (patho)physiological conditions may influence tumor marker concentrations.44,45 In the validation cohort, only one patient was incorrectly classified as having a NDB because of elevated tumor marker concentrations due to comorbidity. This implies that the effect of this limitation is minimal when using the STOP model.
Both clinicians and patients may be reluctant to stop ICI therapy, even if only a small chance of DCB is present. Therefore, the specificity and PPV of our STOP model by itself (96% and 88%, respectively) may not be high enough for treatment discontinuation. When our model is used in combination with RECIST, however, a 100% specificity and PPV were obtained. This paves the way for a prospective clinical trial using the combination of the STOP model with RECIST evaluation. In patients with NDB prediction by STOP and early PD, our recommendation would be to stop ICIs. If results are contradictory (i.e., PD with DCB prediction by STOP or PR/SD with NDB prediction by STOP), we would advise close monitoring of these patients and awaiting the next radiological evaluation. In patients with a DCB prediction and CR/PR/SD, ICIs can be continued. A prospective clinical trial should ultimately prove clinical utility, by confirming the early window of opportunity to switch to another treatment, the reduction of immune-related adverse events, and cost-effectiveness of this approach. Future treatment options may include antibody-drug conjugates (ADCs) or combinations of targeted therapies with immunotherapy. ADCs represent a promising therapeutic strategy which combines the specificity of monoclonal antibodies with the cytotoxic potential of chemotherapeutic agents, targeting specific antigens expressed on cancer cells. ADCs have shown significant efficacy in various solid tumors, including NSCLC.46,47 Furthermore, the integration of targeted therapies with immune checkpoint inhibitors is being explored to potentially overcome resistance mechanisms and enhance anti-tumor responses.48
Conclusion
In conclusion, the STOP model based on CYFRA, CEA, and NSE is capable of predicting which metastatic NSCLC patients treated with ICIs will experience no durable benefit in a real-world cohort in line with the population of intended usage. When increasing specificity and PPV by combining the STOP model with RECIST evaluations, 1 in 5 non-responders could be identified with a specificity and PPV of 100%. A clinician will be reluctant to stop ICI treatment when there is a small chance of response to ICI treatment (i.e., incorrectly discontinuing ICI treatment). Therefore, we believe that although the sensitivity is relatively low, this model is of added value because in 1 out of 5 non-responders treatment can be safely discontinued. Furthermore, our model is of added value for those patients suspected of pseudoprogression. Although this is a relatively rare phenomenon, in many patients with early progression, ICI treatment is continued because of the small chance of pseudoprogression. Since our model is capable of predicting no durable benefit in the long term, our model can guide clinicians to stop treatment early and confirm the absence of pseudoprogression.
Our study successfully validated a model that repurposed well-known serum tumor markers into a decision support tool using state-of-the-art machine learning techniques, which can guide decisions of patients and clinicians as early as 6 weeks after treatment initiation. Our findings provide a ready-to-use and easy-to-implement decision tool based on low-priced, robust and generally available serum tumor markers.
Supplemental Material
Supplemental Material - External validation of a serum tumor marker algorithm for early prediction of no durable benefit to immunotherapy in metastastic non-small cell lung carcinoma
Supplemental Material for External validation of a serum tumor marker algorithm for early prediction of no durable benefit to immunotherapy in metastastic non-small cell lung carcinoma by Milou M. F. Schuurbiers, Freek A. van Delft, Hendrik Koffijberg, Maarten J. IJzerman, Kim Monkhorst, Marjolijn J. L. Ligtenberg, Daan van den Broek, Huub H. van Rossum, and Michel M. van den Heuvel in Tumor Biology.
Statements and declarations
Footnotes
Acknowledgements
We thank the Department of Laboratory Medicine of the Radboud university medical center, and in particular Teun van Herwaarden for his help in retrieving the serum tumor marker results of the RUMC patients.
Authors’ contributions
MMFS: conceptualization, methodology, software, data curation, formal analysis, and writing—original draft. FAD: conceptualization, methodology, software, and writing—review and editing. HK: writing—review and editing. KM: writing—review and editing. MJLL: writing—review and editing. DB: writing—review and editing. HHR: conceptualization, supervision, and writing—review and editing. MMH: conceptualization, supervision, and writing—review and editing.
Declaration of conflicting interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: MJLL received consulting fees from AstraZeneca, GlaxoSmithKline, Illumina, Janssen Pharmaceuticals, Lilly, and Merck Sharp & Dohme. None of these relations were related to this study, and all fees were paid to the institution. KM received a research grant from AstraZeneca; non-financial support from Roche, Takeda, Pfizer, PGDx, and Delfi; speaker's fees from MSD, Roche, AstraZeneca, and Benecke; and consultant fees from Pfizer, BMS, Roche, MSD, Abbvie, AstraZeneca, Diaceutics, Lilly, Bayer, Boehringer Ingelheim, Merck, and Amgen. DB received research funding from Delfi. HvR is owner and director of Huvaros B.V. HvR has stocks in SelfSafeSure Blood Collections B.V. MMH received an unrestricted grant from Roche Diagnostics. All remaining authors have declared no conflicts of interest.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical considerations
This study was approved by the local institutional review board of the Netherlands Cancer Institute (IRBd21-149) and the medical research ethics committee of the Radboud university medical center (2021–13207).
Consent for publication
All authors approved the final version of this manuscript for publication.
Supplemental Material
Supplemental material for this article is available online.