Abstract
Numerous clinical trials investigating neoadjuvant immune checkpoint inhibitors (ICI) have been performed over the last 5 years. As the number of neoadjuvant trials increases, attention must be paid to identifying informative trial endpoints. Complete pathologic response has been shown to be an appropriate surrogate endpoint for clinical outcomes, such as event-free survival or overall survival, in breast cancer and bladder cancer, but it is less established for non-small-cell lung cancer (NSCLC). The simultaneous advances reported with adjuvant ICI make the optimal strategy for early-stage disease debatable. Considering the long time required to conduct trials, it is important to identify optimal endpoints and discover surrogate endpoints for survival that can help guide ongoing clinical research. Endpoints can be grouped into two categories: medical and surgical. Medical endpoints are measures of survival and drug activity; surgical endpoints describe the feasibility of neoadjuvant approaches at a surgical level as well as perioperative attrition and complications. There are also several exploratory endpoints, including circulating tumor DNA clearance and radiomics. In this review, we outline the advantages and disadvantages of commonly reported endpoints for clinical trials of neoadjuvant regimens in NSCLC.
Introduction
Lung cancer has the highest incidence and mortality in the United States with an estimated 236,740 cases in 2022 and 130,180 deaths 1 and remains the leading cause of cancer death worldwide. 2 Outcomes have improved over the last 30 years through the development of immunotherapies and targeted therapies against specific driver mutations. Better screening has also improved the ability to identify cancers in earlier stages that might be amenable to resection, thereby improving patient outcomes through earlier intervention.3–6 As more early-stage cancers are identified, trials are needed to develop regimens and therapeutics that decrease recurrence and improve cure in these patients.
Neoadjuvant immune checkpoint inhibitor (ICI) therapy is a recent change in the treatment landscape of early-stage non-small-cell lung cancer (NSCLC). Numerous clinical trials investigating the potential of neoadjuvant ICI have been performed over the last 5 years (Table 1). As the number of these neoadjuvant trials increases, attention must be paid to identifying informative and useful endpoints. The primary endpoint of a phase III trial should measure and identify a statistical difference between treatments within the time and recruitment constraints of that trial. It should also have a biological and clinical rationale that adequately answers the question posed by a given study. Finally, the endpoint should be readily standardized to facilitate applicability and access for patients beyond the trial itself. Here we review the advantages and disadvantages of currently available endpoints for clinical trials of neoadjuvant regimens in NSCLC.7–11
Table 1. Single-arm studies and randomized controlled trials.
DFS, disease-free survival; EFS, event-free survival; ICI, immune checkpoint inhibitors; MPR, major pathologic response; OS, overall survival; PFS, progression-free survival; pCR, pathologic complete response.
Potential primary endpoints
Overall survival: Improvement in OS remains among the primary goals of most novel anticancer therapies. OS is defined as the time from randomization or treatment start to death from any cause. Advantages of using OS as the primary endpoint in clinical trials include the following: (1) Relevance: OS is a clinically meaningful endpoint that is directly related to the patient’s outcome. It is considered the gold standard endpoint for evaluating the efficacy of cancer treatments because it measures the ability of a treatment to prolong a patient’s life; (2) Objectivity: OS is not influenced by subjective factors, such as patient’s symptoms or low-grade side effects of the treatment; (3) Easy to measure: OS is logistically feasible to measure and can be determined through electronic medical records and death certificates; (4) Power: OS is a powerful endpoint that can demonstrate a treatment’s efficacy even when the treatment has a small effect size; (5) Long-term endpoint: OS can demonstrate if a treatment is effective in the long term.
In many countries, OS is an accepted endpoint for the regulatory approval of new cancer treatments, so using it as a primary endpoint increases the chances of approval. However, in early-stage disease, an OS benefit can be difficult to determine in a timely manner. For those with early-stage disease receiving neoadjuvant treatment and surgery, OS is favorable, with the NADIM trial showing an OS of 78.9% at 42 months for those receiving neoadjuvant treatment. 20 While OS is a commendable and necessary endpoint, the time required to accrue the necessary number of events is beyond the scope of many clinical trials. In addition, OS can be affected by competing risks of death, as well as cross-over therapy. 32 OS is therefore often used as a secondary endpoint.
Event-free survival: EFS is defined as the time after primary cancer treatment until a complication from cancer or treatment occurs, such as disease progression, recurrence, or treatment discontinuation, with events being specified for each trial. Similar to OS, EFS is an important endpoint, as cancer recurrence is associated with worse physical and psychological symptoms.33,34 Some advantages of using EFS endpoints in cancer clinical trials include the following: (1) Relevance: EFS is a clinically meaningful endpoint that is directly related to the patient’s outcome. It measures the ability of a treatment to delay the occurrence of a specific event, such as disease progression or relapse; (2) Easy to measure: EFS is relatively easy to measure, as it is based on the occurrence of a specific event that can be easily determined through patient monitoring and imaging studies; (3) Power: EFS is a powerful endpoint that can demonstrate a treatment’s efficacy even when the treatment has a small effect size; (4) Early endpoint: EFS is an early endpoint that can provide information about a treatment’s efficacy prior to death; (5) Prognostic value: EFS can also be a good prognostic factor, as it can give an idea of the long-term outcome of the disease; (6) Regulatory acceptability: EFS is often used as a regulatory endpoint in the context of pediatric and hematologic malignancies, so using it as a primary endpoint can increase the chances of approval.
In NSCLC, the surrogacy of EFS for OS remains controversial, although it has been demonstrated in multiple other tumor types, including gastroesophageal cancer, breast cancer, and hematologic malignancies.35–38 Despite the lack of proven surrogacy in NSCLC, EFS can provide useful information regarding treatment tolerability and toxicity that is not captured by OS. Given its ease of measurement and earlier maturity compared to OS, we advocate for the use of EFS in neoadjuvant clinical trials. As more neoadjuvant trials reach data maturity, it is possible that the surrogacy of EFS for OS will be better demonstrated.
Identification of drivers of cancer recurrence represents an important field of discovery in basic and translational science, and identifying risk factors for recurrence is an important exploratory endpoint for clinical trials. Nevertheless, as with OS, the time needed to accrue enough EFS events in patients receiving neoadjuvant therapy can exceed what is feasible within the constraints of a clinical trial. Even in early neoadjuvant trials, EFS was 73% at 18 months with nivolumab monotherapy, 12 and more recent trials have had 12-month EFS between 71% and 92%.23,28 The situation can be even more complicated given the recent advent of ICI in the adjuvant space.16,39 Due to the time needed for EFS data to mature, we advocate for EFS to be used as a co-primary endpoint.
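Landmark figures such as "EFS of 73% at 18 months" are typically Kaplan-Meier estimates, which account for patients who are censored before an event is observed. The following is a minimal sketch of that calculation in Python. The function name `kaplan_meier_at` and the follow-up data are illustrative assumptions and do not come from any trial discussed in this review.

```python
def kaplan_meier_at(times, events, landmark):
    """Kaplan-Meier estimate of the event-free probability at `landmark`.

    times  -- follow-up time for each patient (months)
    events -- 1 if the event occurred at that time, 0 if the patient was censored
    """
    # Distinct times at which an event occurred, up to and including the landmark.
    event_times = sorted(
        {t for t, e in zip(times, events) if e == 1 and t <= landmark}
    )
    survival = 1.0
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        # Patients who had the event exactly at time t.
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= 1.0 - n_events / at_risk
    return survival


# Hypothetical cohort of 10 patients (NOT trial data).
times = [3, 6, 6, 9, 12, 14, 18, 20, 24, 24]
events = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]

efs_18 = kaplan_meier_at(times, events, landmark=18)
print(f"Estimated 18-month EFS: {efs_18:.1%}")
```

Because censored patients leave the risk set without counting as events, short follow-up yields few events and wide uncertainty around such estimates, which is one reason EFS and OS data take years to mature.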
Pathologic response: For those undergoing neoadjuvant treatment, surgical resection is one of the key goals. While the resection score addresses the adequacy of the removal of the primary tumor, the pathologic response addresses the adequacy of its response to the neoadjuvant treatment. The degree of pathologic response is based on the percentage of residual viable tumor (%RVT) following neoadjuvant treatment. Since the type of cell damage differs based on the modality of treatment administered, the International Association for the Study of Lung Cancer has set forth guidelines for measuring the degree of pathologic response at the time of surgical resection. Major pathologic response (MPR) is defined as a %RVT ⩽ 10% in the primary tumor bed after neoadjuvant treatment, while pathologic complete response (pCR) is defined as no remaining viable tumor cells in the primary tumor or lymph nodes. While most trials use these thresholds of %RVT as endpoints, some trials have also reported improvements in %RVT as a continuous measure. In CheckMate 816, the median %RVT was 10% in those treated with ICI compared to 74% in the control arm, and the decrease in %RVT was observed regardless of the initial stage. On subsequent analysis, the depth of pathologic regression was associated with EFS, and this association was most pronounced in those with %RVT 0–5%. 40 In the NEOSTAR study (NCT03158129), the investigators likewise observed that a substantial proportion of those achieving MPR had %RVT 0–5%. Among patients with stage IIIA disease, there was a greater degree of tumor regression in the ipilimumab + nivolumab + chemotherapy group, suggesting that more aggressive treatment may be warranted for those with more advanced disease at diagnosis. By reporting %RVT as an endpoint, future meta-analyses may be performed to better define the relationship between different thresholds of pathologic response and survival endpoints, as well as other factors that may contribute to the degree of pathologic response.
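The thresholds described above can be expressed as a small classification rule. The sketch below is illustrative only: the function name, the argument names, and the choice to return a single label (with pCR taking precedence over MPR) are assumptions made for this example, not part of any guideline.

```python
def classify_pathologic_response(rvt_primary_pct, viable_tumor_in_nodes):
    """Classify pathologic response from percent residual viable tumor (%RVT).

    rvt_primary_pct       -- %RVT in the primary tumor bed (0 to 100)
    viable_tumor_in_nodes -- True if any sampled lymph node contains viable tumor
    """
    if not 0 <= rvt_primary_pct <= 100:
        raise ValueError("%RVT must be between 0 and 100")
    if rvt_primary_pct == 0 and not viable_tumor_in_nodes:
        return "pCR"  # no viable tumor in the primary tumor or lymph nodes
    if rvt_primary_pct <= 10:
        return "MPR"  # %RVT <= 10% in the primary tumor bed
    return "no MPR"


# Hypothetical specimens (NOT trial data).
for rvt, nodes in [(0, False), (0, True), (8, False), (74, True)]:
    print(rvt, nodes, classify_pathologic_response(rvt, nodes))
```

Recording the underlying %RVT value alongside the label preserves the continuous information that later pooled analyses of alternative thresholds would need.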
Another component of the pathologic response is nodal downstaging, that is, the clearance of cancer cells from previously positive lymph nodes. This endpoint can provide valuable information regarding the response of local metastases, which can be further extrapolated to the response of more distant metastases. Interestingly, in a pooled analysis, 41 there was a greater likelihood of obtaining a complete response in the lymph nodes than in the primary tumor bed, and there are data suggesting that tumor clearance in lymph nodes does not necessarily occur in patients whose primary tumor undergoes a pCR.12,42 The SAKK 16/14 study of neoadjuvant durvalumab stratified patients based on MPR, pCR, and nodal downstaging, demonstrating that all three metrics were associated with improved EFS. 17 Unfortunately, not all studies include nodal downstaging, and patients with node-positive disease are generally only a subgroup of neoadjuvant trials. Furthermore, identification of node positivity and pathologic downstaging requires obtaining sufficient and appropriate tissue during the initial biopsy and at the time of surgery to ensure accurate pre-treatment pathologic staging. Despite the utility of this endpoint, it is not yet sufficiently generalizable for more widespread use.
The main disadvantage of pathologic endpoints is that patients must undergo resection for the primary tumor to be assessed for pathologic response. Some neoadjuvant trials have only included those patients who undergo surgical resection for the final analysis; this strategy can skew the true treatment effect, as those who do not undergo resection may have progression of their tumor. As such, we advocate for performing a true intention-to-treat (ITT) analysis in any clinical trial evaluating neoadjuvant ICI. Various studies have also used different sample handling, pathologic sampling approaches, and microscopic determinations of the pathologic response to treatment. These discrepancies may decrease the generalizability of these studies and have led to an International Association for the Study of Lung Cancer (IASLC) guideline on the pathologic handling of samples.
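The effect of the analysis population on a reported response rate can be shown with simple arithmetic. In the sketch below, the counts are hypothetical, the function name `pcr_rates` is illustrative, and the convention that patients who never reach surgery are counted as non-responders in the ITT analysis is an assumption of this example.

```python
def pcr_rates(n_enrolled, n_resected, n_pcr):
    """Return (ITT pCR rate, resected-only pCR rate).

    ITT rate: all enrolled patients form the denominator, so patients who
    never reach surgery are counted as non-responders (assumed convention).
    Resected-only rate: only patients who underwent resection are counted.
    """
    if n_resected == 0 or not 0 <= n_pcr <= n_resected <= n_enrolled:
        raise ValueError("expected 0 <= n_pcr <= n_resected <= n_enrolled, n_resected > 0")
    return n_pcr / n_enrolled, n_pcr / n_resected


# Hypothetical treatment arm (NOT trial data): 100 enrolled, 80 resected, 20 with pCR.
itt_rate, resected_only_rate = pcr_rates(n_enrolled=100, n_resected=80, n_pcr=20)
print(f"ITT pCR rate: {itt_rate:.0%}; resected-only pCR rate: {resected_only_rate:.0%}")
```

The gap between the two rates widens as preoperative attrition increases, so a resected-only analysis flatters a regimen most when many patients fail to reach surgery.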
An additional drawback of using pathologic response is the heterogeneity of pathologic assessment. Expert training that is generally available only at large academic centers is needed to adequately assess responses. In addition, each institution may have different practices for preparing samples and determining responses. Standardized procedures, such as those proposed by the IASLC, are needed. 43
Despite these considerations, advantages of pathologic endpoints include assessing the response of the tumor at the microscopic level, which can then be extrapolated to a presumed response, or lack thereof, among any possible micro-metastases. This information can be used to inform the need for and choice of adjuvant therapy. Of note, these endpoints have been shown to correlate with improved survival among patients who have a pathologic response in multiple tumor types, including breast cancer and bladder cancer.44,45 This benefit is observed regardless of treatment received, suggesting that pathologic response is a valid correlated endpoint for DFS and OS in these tumor types. Nevertheless, to be considered reliable, a potential surrogate endpoint must be causally linked to the clinically relevant endpoint and capture the whole effect of treatment. Methodological consensus is growing for meta-analytic approaches, which combine data from randomized clinical trials and allow assessment of the strength of the correlation between the treatment effects on the surrogate and survival endpoints.46,47 The correlation between treatment effects on the surrogate and clinically relevant endpoints should be demonstrated both at the patient and trial levels.48,49 Indeed, a strong association at the patient level indicates that the surrogate and clinically relevant endpoints are strongly associated with each other, while a strong association at the trial level indicates that a large proportion of the treatment effect on the clinically relevant endpoint is captured by the surrogate.
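One way to make the trial-level idea concrete is a weighted regression across trials of the treatment effect on survival (for example, the log hazard ratio for OS) against the treatment effect on the surrogate (for example, the log odds ratio for pCR), with the resulting R² summarizing how much of the survival effect tracks the surrogate effect. The sketch below is a simplified illustration under stated assumptions: the trial values are invented, the weighting by sample size is an arbitrary choice for this example, and it is not the method of the meta-analyses cited above.

```python
import math


def weighted_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared), where r_squared is the squared
    weighted correlation between x and y.
    """
    total_w = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical trials (NOT real data):
# (odds ratio for pCR, hazard ratio for OS, number of patients)
trials = [(2.0, 0.80, 300), (3.5, 0.70, 450), (1.2, 0.95, 200), (5.0, 0.60, 350)]

log_or = [math.log(t[0]) for t in trials]  # treatment effect on the surrogate
log_hr = [math.log(t[1]) for t in trials]  # treatment effect on survival
weights = [t[2] for t in trials]           # assumed weighting by sample size

slope, intercept, r_squared = weighted_fit(log_or, log_hr, weights)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, trial-level R^2={r_squared:.3f}")
```

A trial-level R² close to 1 would indicate that trials with larger effects on the surrogate also show proportionally larger effects on survival; patient-level association has to be assessed separately from individual patient data.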
Multiple recent trials of neoadjuvant therapy in NSCLC have used MPR or pCR as their primary endpoint. There are not yet data to confirm that this truly represents a surrogate endpoint for DFS and OS in the setting of neoadjuvant therapy for NSCLC; nonetheless, the data from these trials appear to demonstrate improved survival among patients who undergo a pathologic response even in control groups. There are also not yet data to distinguish whether MPR or pCR has a better correlation with DFS and OS to recommend using one endpoint over the other.
Potential relevant secondary endpoints
Radiologic response rate: Patients receiving cancer therapy undergo regular imaging as part of their care. Unfortunately, current techniques are not able to distinguish residual fibrotic or necrotic tissue from viable tumor and are sometimes unable to distinguish tumor infiltration of a lymph node from a robust inflammatory response. Multiple trials of neoadjuvant ICI for NSCLC have failed to show a correlation between tumor response based on size and/or Response Evaluation Criteria In Solid Tumours (RECIST) and pathologic response.16,18,19 There are promising novel metrics, such as a metabolic change in the tumor on positron emission tomography (PET), but these are not yet adequately validated to serve as a primary endpoint. Some treatments, particularly ICIs, can also cause an influx of inflammatory cells that makes a tumor appear artificially larger on imaging (so-called ‘pseudoprogression’). 50 In addition to the tumor itself, some studies have also noted a nodal immune flare. 25 This phenomenon mimics nodal progression on imaging but is characterized by non-caseating granulomas microscopically. These limitations of radiographic assessment have been shown to cause a discrepancy between pathologic and radiographic evaluation of neoadjuvant treatment response, in that progressive or stable disease may be observed on imaging while a response is noted on pathology. 51 Therefore, radiologic response in itself is not sufficiently specific to serve as a primary endpoint.
In addition to this lack of specificity, current radiographic techniques lack adequate sensitivity to identify microscopic disease. Therefore, radiographic response or disease stability preoperatively may not represent control of distant metastases that could recur postoperatively. In a cohort study investigating circulating tumor DNA (ctDNA), a form of cell-free DNA, as a biomarker for NSCLC, a rise in ctDNA preceded radiologic relapse by 6.83 months. 52 While ctDNA remains an exploratory endpoint (as discussed below), these data suggest limitations of current radiologic techniques in identifying microscopic disease. Although radiographic imaging provides a valuable and noninvasive way to measure disease burden, it lacks the sensitivity and specificity needed to act as an adequate surrogate of OS.
Surgical endpoints: By definition, neoadjuvant treatment is given prior to surgery, so a regimen that negatively impacts surgical complications, the complexity of resection, or the time to surgery is suboptimal. These metrics should therefore be considered as secondary endpoints to obtain a more holistic view of the benefits and risks of a proposed neoadjuvant regimen.
Time to surgery: Surgical resection is a mainstay of the management of localized disease. There are theoretical concerns that the delay inherent to screening for and participating in a clinical trial may lead to a portion of patients having disease progression prior to surgery and developing incurable metastatic disease. Given the risk of adverse events from neoadjuvant treatment, including immune-related adverse events (irAEs) and poor wound healing, there is generally a delay between the completion of neoadjuvant treatment and the surgery itself. During this period, there is likewise a theoretical risk of disease progression that may preclude patients from undergoing resection. In addition, due to the adverse effects of treatment on pulmonary function, wound healing, and other surgical considerations, some patients may not be surgical candidates after neoadjuvant treatment. Some single-arm neoadjuvant ICI trials did not report the median time to surgery,17,22,42 while others have had a range from no delay12,13 to a delay of more than 42 days in 22% of patients. 25 However, in the recent CheckMate 816 trial, the percentage of patients with delayed surgery was similar in those receiving neoadjuvant chemotherapy alone or chemotherapy + nivolumab, suggesting that the addition of ICI did not affect time to surgery. 29 As novel regimens are developed, time to surgery should be included to ensure they do not delay a curative intervention, specifically in stages IB and II.
Extent of surgery: One of the major theoretical benefits of neoadjuvant therapy in other tumor types is the ability to decrease the extent of surgery. In breast cancer, multiple trials, including the recent BrighTNess trial, have demonstrated that neoadjuvant therapy decreases the rate of mastectomy, even in patients who were deemed ineligible for breast-conserving surgery at diagnosis. 53 In rectal cancer, the OPRA trial demonstrated that surgery can be safely deferred in favor of observation in patients who receive neoadjuvant chemotherapy and radiation therapy, 54 and the FOxTROT study showed that in colorectal cancer, neoadjuvant treatment improved the rate of complete resection. 55 In CheckMate 816, a trend toward decreased pneumonectomy rate was observed in patients who received neoadjuvant ICI as opposed to the chemotherapy group; however, the study was not powered to evaluate the extent of resection. 29 As more phase III clinical trials of neoadjuvant ICI are performed, the reporting of surgical outcomes will be vital to establish whether the extent of surgery can safely be decreased with neoadjuvant ICI.
Preoperative attrition and resection score: Many neoadjuvant trials include resectability and surgical attrition as secondary endpoints. Both LCMC3 and CheckMate 159 12,56 identified patients who had unresectable disease at the time of surgery (5% in both studies), while other studies have reported reasons for which patients did not undergo surgery. For instance, the TOP1201 trial of ipilimumab + chemotherapy 57,58 had a 46% preoperative attrition rate with reasons including progressive disease, tumor location, irAE, and inadequate pulmonary function. While most trials do not have this rate of attrition, it underscores the multifactorial nature of attrition in neoadjuvant trials, which is impacted by surgeon experience, tumor stage, and patient comorbidity. For instance, patients with a higher baseline stage are more likely to be deemed unresectable following neoadjuvant treatment, particularly in the setting of stable disease. Different trials can also have differences in inclusion criteria that impact resection rates; for instance, the LCMC3 study excluded patients with T4 disease due to invasion of surrounding structures, while trials like CheckMate 816 and NADIM II included T4 patients regardless of invasion. These differences in the baseline stage can also impact the feasibility of adequate resection. In addition, while most trials indicate that neoadjuvant ICI has minimal impact on surgical complexity, some surgeons may be less comfortable with the resection of a given patient based on their tumor location, size, and any potential treatment-related effects. Attention should be paid to the resection rates by stage in both control and experimental groups to assess potential confounders in surgical outcomes. Ultimately, an upfront multidisciplinary evaluation is required to identify appropriate candidates for surgery. Due to the importance of surgical resection in providing a meaningful chance at a cure, attrition rates (and the reasons for which they occur) should be reported as a secondary endpoint in neoadjuvant trials.
Another frequently reported endpoint is the completeness of resection. An incomplete (non-R0) resection has a negative impact on survival due to the presence of known residual disease at the time of surgery and impacts mortality regardless of the stage, 59 but achieving an R0 resection has no bearing on the presence or absence of distant micro-metastases, thereby limiting its surrogacy for DFS and OS.
Surgical complexity and complications
In considering the safety and feasibility of neoadjuvant regimens, both surgical complexity and surgical morbidity require consideration. This is of particular importance as more trials are employing combinations of ICI with chemotherapy and/or radiation therapy. Previous trials of neoadjuvant chemotherapy and chemotherapy + radiation have shown increased fibrosis, which can make surgical resection more technically challenging.60,61 Some authors have previously raised concerns that this may generate a perception that neoadjuvant ICI might increase surgical complexity. 62 The NEOSTAR study reported subjective measures of surgical complexity, blood loss, and operative time. 63 Standardized complexity scales have also been suggested for incorporation in future trials. 62 Similar to surgical complexity, surgical morbidity is an important marker for the safety of neoadjuvant regimens. In the CheckMate 816 trial, surgical complications occurred in 41.6% of the nivolumab + chemotherapy group and in 46.7% of the chemotherapy-alone group, 29 suggesting that morbidity was not increased by the addition of ICI; nevertheless, further trials will be needed to better answer this question and optimize the safety of neoadjuvant regimens.
These surgical outcomes are important considerations and should be reported as part of any clinical trial investigating neoadjuvant treatment. Similar to safety and toxicity outcomes, these are useful secondary endpoints to assess the tolerability of a regimen and to better quantify treatment-associated risks. These are particularly important in the dissemination of novel treatments, as more complex surgeries may necessitate regionalization to larger medical centers and therefore decrease access for some patients. Nevertheless, as previously discussed, these do not represent adequate surrogate endpoints for DFS and OS and cannot be used as primary endpoints for neoadjuvant clinical trials investigating novel treatment regimens.
Exploratory endpoints
In addition to the primary and secondary endpoints listed above, there is interest in identifying better biomarkers to identify responders to treatment in real time rather than relying on pathologic response. Although these endpoints are promising, there are not sufficient data to recommend their broad uptake as primary or even secondary endpoints at this time.
One endpoint is the use of metabolic response on PET to identify responders to treatment. By measuring the standardized uptake value (SUV) on FDG-PET, investigators can obtain a surrogate for tissue metabolism and hypoxia. The utility of this metric has been tested in a phase Ib study of neoadjuvant sintilimab, which demonstrated an association between patients with an MPR or pCR and a reduction in tumor SUVmax. 64 These results were reproduced in another study using pembrolizumab in the neoadjuvant setting. 16 Finally, the NeoTAP01 trial showed that while radiologic response and PD-L1 status were not correlated with pathologic response, PET activity was moderately predictive of an MPR or pCR. 18
Another exploratory endpoint is ctDNA, which can be measured by a so-called liquid biopsy to monitor response or genetic changes and has been evaluated in several cancer types. In breast cancer, clearance of ctDNA has been associated with a decreased rate of recurrence in patients who received neoadjuvant chemotherapy, 65 even in those without a pCR. 66 Multiple studies in colorectal cancer have demonstrated the clinical relevance of ctDNA as a prognostic marker for recurrence, and ctDNA is currently incorporated in deciding which patients benefit from adjuvant therapy.67–69 Similar predictive and prognostic benefit is starting to be observed in urothelial cancer. 70 In NSCLC, a previous retrospective study demonstrated that ctDNA clearance was concordant with pathologic response with 91.67% accuracy and was likewise able to predict relapse following surgery. 52 The NADIM study similarly showed that ctDNA was better able to predict survival compared to radiologic response, 20 and the CheckMate 816 trial showed that pCR was higher in those with ctDNA clearance regardless of treatment arm. 29 Other studies have demonstrated that early decreases in variant allele fraction in patients with metastatic NSCLC undergoing treatment with first-line chemoimmunotherapy 71 or subsequent monotherapy with a PD-L1 inhibitor 72 were associated with improved PFS and OS. In addition, a recent meta-analysis evaluating the use of ctDNA in patients with NSCLC undergoing treatment with targeted therapies also demonstrated a trend toward improved PFS in baseline ctDNA-negative patients, as well as improved PFS with early reduction in ctDNA. A strong association with OS was not shown, and findings were affected by a high degree of heterogeneity in trials. 73 Nevertheless, as outlined in the IASLC consensus statement, the lack of standardized methods for measuring ctDNA and defining a ctDNA response limits its application as a primary endpoint at this time.74,75
Conclusion
As screening for lung cancer improves and more early-stage NSCLC is identified, there is a growing need for high-quality clinical trials investigating neoadjuvant treatment. In particular, neoadjuvant ICI for NSCLC is an area of active investigation with multiple ongoing clinical trials. While once considered the gold standard for determining the therapeutic benefit, OS cannot be appropriately or practically measured in all trials. In designing these studies, care should be taken to identify a feasible primary endpoint that also acts as a surrogate for OS. As previously mentioned, there are multiple potential endpoints to use for neoadjuvant therapy, each with its own advantages and pitfalls. Pathologic response has been used as a surrogate endpoint in breast and bladder cancer, but its surrogacy in NSCLC remains unproven. Thus, pathologic response cannot yet be used as the sole primary endpoint and requires a survival-based co-primary endpoint. Nevertheless, we believe that this can be a useful endpoint based on its feasibility in the setting of a neoadjuvant trial and the data from multiple trials; still, these data have yet to fully mature, and surrogacy needs to be proven. As newer neoadjuvant regimens for NSCLC are developed, other novel biomarkers should be developed to pair individual patients with the most beneficial neoadjuvant regimen.
As its implementation in NSCLC expands, it is possible that ctDNA will be a useful primary endpoint in the coming years. This has already been demonstrated in other solid tumor types and represents a promising strategy to risk-stratify patients who are at greater risk for recurrence and therefore have a greater need for adjuvant therapy. Currently, there is insufficient evidence to recommend ctDNA as a primary or secondary endpoint in NSCLC, and there is substantial heterogeneity in the methods of its measurement across trials. Therefore, we recommend that it be used as an exploratory endpoint to provide the needed validation for its use as a prognostic biomarker in NSCLC. Likewise, clinical trials investigating ctDNA-guided treatment are needed to validate its use as a predictive biomarker in NSCLC. Such studies would enable more widespread implementation of this endpoint in clinical practice in both academic and community practice settings to guide treatment decisions and minimize toxicity (both clinical and financial) to patients.
Future studies should address whether pCR is a surrogate endpoint for survival measures and clarify the role of MPR. The identification of biomarkers will allow the design of both treatment de-escalation and treatment escalation trials. These may include adjuvant ICI in those at high risk of relapse and new combinations in either the neoadjuvant or adjuvant setting. Longer follow-up is required to address the long-term results of these treatments.
