Abstract
Background
Histologic grading of lung adenocarcinoma (LUAD) is predictive of outcome but is only possible after surgical resection. A radiomic biomarker predictive of grade has the potential to improve preoperative management of early-stage LUAD.
Objective
Validate a prognostic radiomic score indicative of lung cancer aggression (SILA) in surgically resected stage I LUAD (n = 161) histologically graded as indolent low malignant potential (LMP), intermediate, or aggressive vascular invasive (VI) subtypes.
Methods
The SILA scores were generated from preoperative CT-scans using the previously validated Computer-Aided Nodule Assessment and Risk Yield (CANARY) software.
Results
Cox proportional hazards regression showed a significant association between the SILA and 7-year recurrence-free survival (RFS) in both a univariate model (P < 0.05) and a multivariate model (P < 0.05) incorporating age, gender, smoking status, pack years, and extent of resection. The SILA was positively correlated with invasive size (Spearman r = 0.54, P = 8.0 × 10⁻¹⁴) and negatively correlated with percentage of lepidic histology (Spearman
Conclusions
The SILA scoring of preoperative CT scans was prognostic and predictive of resected pathologic grade.
Introduction
Lung cancer (LC) is the deadliest cancer in the United States (U.S.) with an estimated 238,340 new cases and 127,070 deaths in 2023 [1]. However, LC mortality has begun to decline, in part due to declining rates of cigarette smoking and, more recently, the widespread implementation of low-dose computed tomography (CT) screening programs that have led to detection at earlier stages where curative surgery is possible [1]. Despite the mortality reduction associated with LC screening, CT screening increases overdiagnosis, leading to higher morbidity, financial burden, and stress among patients [2,3]. Accurate pre-surgical prognostic markers are needed to personalize the management of early-stage LC. Indolent tumors may be treatable with non-surgical approaches such as stereotactic body radiation therapy (SBRT), cryoablation, microwave ablation, or radiofrequency ablation (RFA) [4]. Recent clinical trials also indicate that a subset of early-stage LC can be adequately managed with sublobar resection rather than standard-of-care lobectomy [5,6]. On the other hand, patients with aggressive disease and high risk of recurrence may benefit from adjuvant or neoadjuvant systemic therapy, which is not standard of care for stage I disease [7,8]. Tumor histopathology is highly prognostic, but it requires comprehensive histologic examination that is only possible after complete surgical excision [9]. Small biopsies, such as those obtained via bronchoscopy or CT-guided biopsy, can establish a diagnosis of LC and distinguish between LC subtypes, but cannot reliably provide the same level of prognostic information as resected specimens due to limited sampling, tumor heterogeneity, and crush artifact [10]. Widespread validation and clinical implementation of machine learning approaches that can predict prognostic histologic patterns and features from CT scans would be an important step toward improving clinical management of early-stage tumors.
Lung adenocarcinoma (LUAD) is the most common subtype of LC overall, accounts for virtually all cases among light and never smokers, and is heterogeneous in its histologic patterns, features and prognosis [9]. In the National Lung Screening Trial (NLST), overdiagnosis was high (79%) among a subset of LUAD historically termed “bronchoalveolar carcinoma” (BAC), which comprised 27% of all LUAD detected by CT-screening [3,11,12]. Since the NLST, BAC has been discontinued as a diagnostic entity and replaced with adenocarcinoma in situ (AIS) and minimally invasive adenocarcinoma (MIA) which exhibit 100% disease-free survival (DFS) after excision, but together make up only ∼5% of stage I LUAD, substantially less than BAC in the NLST [13].
Recently, a proposed histopathologic category of stage I LUAD, low malignant potential (LMP), which includes AIS and MIA and exhibits 100% DFS, accounted for 23% of stage I LUAD, a proportion similar to that reported for overdiagnosed stage I BAC in the NLST [14]. In contrast to LMP, some invasive tumor characteristics are associated with poor prognosis. Vascular invasion (VI) is a pathological hallmark of cancer pre-metastasis and a strong predictor of recurrence and of cancer-specific and overall mortality in patients with early-stage LUAD, even among tumors < 2 cm in invasive size, and it has been shown to be more prognostic than the highest World Health Organization (WHO) grade [15,16,17,18,19].
We sought to evaluate the ability of a previously published CT scan-based method to distinguish between stage I LUAD classified as indolent (AIS/MIA/LMP), aggressive (VI), and intermediate grade (NST, no special type) at the time of resection. Computer-Aided Nodule Assessment and Risk Yield (CANARY) is software for automated risk assessment of adenocarcinoma based on the clustering of voxel density histograms into nine clusters, or exemplars, named after colors [20]. Multidimensional scaling showed that these nine exemplars clustered into three groups that visually corresponded to ground-glass appearance, solid appearance, and intermediate density. CANARY was originally designed and validated to distinguish invasive adenocarcinomas from AIS/MIA [20,21]. Subsequently, three CANARY risk groups were defined and their association with patient outcomes was validated, independent of histology, in two retrospective surgical lung adenocarcinoma cohorts, including the NLST [22,23]. The good risk group among pathologic stage I adenocarcinoma was associated with 100% disease-specific survival (DSS) in both cohorts. Interestingly, the good risk group represented 17% and 18% of pathologic stage I tumors in these cohorts, far exceeding the expected rate of AIS/MIA (∼5% combined). The latter finding implies that CANARY can predict a proportion of invasive lung adenocarcinomas beyond AIS/MIA that behave in an indolent fashion. Subsequent studies transformed the output of CANARY into a score indicative of lung cancer aggression (SILA) based on the prediction of invasive size and outcome [24,25]. Here, we further validate the association of CANARY and the corresponding SILA with prognosis in a retrospective cohort of pathologic stage I LUAD treated by surgical excision in an urban safety-net hospital setting.
We also show that CANARY/SILA is predictive of WHO-2021 grade and our novel histopathologic grade, indicating that it detects histopathologic characteristics of LUAD invasion beyond invasive size.
Materials and methods
Clinical samples and pathology review
A retrospective cohort of 161 patients who were treated with surgery between 2005 and 2014 for pathologic stage I/0 LUAD was included in this study, representing a subset of a previously reported cohort [14,18]. Tumors measuring > 4 cm total size were not included, as subsets of these patients were given adjuvant therapy within this historic cohort. Cases were reviewed from Boston Medical Center (BMC), an urban safety-net hospital, after IRB approval (BU/BMC IRB H-37859 12/11/2018) in which patient consent was waived as this retrospective study posed no more than minimal risk of harm to subjects and involved no procedures for which written consent is normally required. The study was performed in accordance with the Declaration of Helsinki. Preoperative CT scans were obtained for all patients between December 2004 and November 2015. The median time from preoperative CT scan acquisition to surgery was 30 days. All matching pathology cases were reviewed by an experienced board-certified thoracic pathologist (EJB). Vascular invasion (VI) was defined as luminal invasion of a muscular artery or vein either within or adjacent to the tumor. Tumors were assessed for proportion of lepidic, acinar, papillary, micropapillary, and solid patterns in 5% increments with distinction of simple tubular acinar from complex and cribriform acinar patterns. Adenocarcinoma in situ (AIS) was rendered for purely lepidic tumors
CANARY analysis
CANARY Plus software version 1.0 was licensed from Mayo Clinic. CANARY has previously been demonstrated to have low inter-observer variability for segmenting and analyzing LUAD CT scans [28]. All CT scans were reviewed by an experienced board-certified thoracic radiologist at the time of clinical diagnosis. CT scans were acquired using a variety of scanners, with the majority (96.3%) acquired on one scanner. As part of this retrospective study, we collaborated with an experienced board-certified thoracic surgeon (KS) who confirmed that the nodule location on the CT scan matched the resected nodule on the original clinical report, and that adequate masking was performed by the CANARY nodule detection algorithm. The SILA and associated exemplars were generated by CANARY and exported for further analysis. The nine exemplars were previously named based on nine arbitrary colors: blue (B), cyan (C), green (G), yellow (Y), pink (P), violet (V), indigo (I), red (R), and orange (O) [20].
Statistical analysis
All statistical analyses were performed with R version 4.2.1. Tables were created with the tableone package. Comparisons of distributions of count data were tested with chisq.test. Correlations were computed as Spearman correlations with stat_cor or cor.test. Comparisons of distributions of continuous data were tested with wilcox.test or t.test, as specified. P-values were converted to false-discovery rate (FDR) values by p.adjust using the Bonferroni method. Survival analysis used recurrence-free survival (RFS) as an endpoint, which was defined as the time from surgery to recurrence or last follow-up. Univariate and multivariate Cox regression was performed using the survival package version 3.5.3. Kaplan-Meier plots were created using the survminer package version 0.4.9 and groups were compared using the log-rank test. Area under the curve (AUC) calculations and receiver operating characteristic (ROC) plots were created using the pROC package version 2.3.0 [29]. All statistical tests were two-tailed and P-values < 0.05 were considered significant.
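As an illustration of two of the procedures named above, the following Python sketch (not the R code used in this study) reimplements a Bonferroni adjustment, which multiplies each P-value by the number of tests and caps the result at 1, and a Spearman correlation computed as the Pearson correlation of average ranks. The function names and the toy values are hypothetical and are not study data.

```python
from math import sqrt


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each P-value by the number of tests, cap at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Toy values for illustration only (hypothetical, not study data).
toy_score = [0.20, 0.35, 0.50, 0.65, 0.80]
toy_size_cm = [0.3, 0.9, 0.8, 1.7, 2.4]
print(spearman(toy_score, toy_size_cm))
print(bonferroni([0.01, 0.04, 0.50]))
```

Because the correlation is computed on ranks, it measures monotonic rather than strictly linear association, which suits a bounded score compared against a size measurement.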
Clinical and pathologic characteristics of 161 patients with resected stage I LUAD included in the study.
Note: The data are shown as the number and (%) unless otherwise indicated. Abbreviations: AIS, adenocarcinoma in situ; MIA, minimally invasive adenocarcinoma; G1, grade 1; G2, grade 2; G3, grade 3; M, mucinous; LMP, low malignant potential; NST, no special type; VI, vascular invasion.
Patient and tumor characteristics
Table 1 shows the clinical and pathologic characteristics of 161 patients with resected stage I LUAD included in the study. The mean age was 67.3 years. Most patients were female (60%), self-identified as white (53%), were former smokers (47%), and were treated with lobectomy (64%). The patients in the study had an overall 7-year RFS of 88% with a mean follow-up time of 5.95 years. Kaplan-Meier estimation showed a significant difference in both RFS and DSS among grades from both the WHO 2021 grading (P < 0.05) and the novel grading classifications (P < 0.001) (Fig. S1A-B). AIS/MIA, WHO G1, WHO G2, and WHO G3 had 7-year RFS of 100%, 96%, 95%, and 81%, respectively. LMP, NST, and VI grades had 7-year RFS of 96%, 95%, and 65%, respectively. A single LMP tumor recurred after wedge resection with a positive surgical margin. The tumor recurred at the staple line and was treated with SBRT, with prolonged survival (> 10 years) without further recurrence or metastasis. VI grade was associated with patients who identified as male (P < 0.01), identified as Black or African American (P < 0.05), and were current smokers (P < 0.05), as previously reported (Table S1) [18]. No patients received adjuvant therapy.

The SILA is associated with recurrence-free survival in a cohort of resected stage I LUAD. (A) Distribution of the SILA by prognostic subgroup using previously established cutoffs (Varghese et al., 2019). (B) Kaplan-Meier curve of the SILA prognostic subgroups with 7-year RFS. (C) Univariate Cox proportional hazards model of the SILA predicting 7-year RFS. (D) Multivariate Cox proportional hazards model of the SILA predicting 7-year RFS, with pack years, smoking status, gender, age, and surgical procedure as covariates.
Clinical and pathologic characteristics of resected stage I LUAD classified by the SILA prognostic subgroups.
Note: The data are shown as the number and (%) unless otherwise indicated. Abbreviations: SILA, score indicative of lung cancer aggression; AIS, adenocarcinoma in situ; MIA, minimally invasive adenocarcinoma; G1, grade 1; G2, grade 2; G3, grade 3; M, metachronous; LMP, low malignant potential; NST, no special type; VI, vascular invasion.
The SILA scores were binned into good (n = 12), intermediate (n = 94), and poor (n = 55) subgroups using the cutoffs established in the original manuscript (Fig. 1A) [24]. The mean SILA in each subgroup was 0.26, 0.54, and 0.75, respectively. Detailed results are shown in Table 2. Kaplan-Meier estimation revealed a significant difference in outcome among the three subgroups, with the good, intermediate, and poor subgroups having 7-year RFS of 100%, 91%, and 73%, respectively (P < 0.05) (Fig. 1B). The SILA was significantly predictive of RFS in univariate analysis (hazard ratio (HR) = 2.07, P < 0.05) (Fig. 1C). In a multivariate analysis including pack years, smoking status, gender, age, and surgical procedure, the SILA remained significant for RFS (HR = 1.84, P < 0.05) (Fig. 1D).
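The subgroup RFS percentages above are Kaplan-Meier estimates, which account for patients whose follow-up ended before a recurrence was observed. A minimal Python sketch of the estimator is shown here for readers unfamiliar with it; it is not the R code used in this study, and the function names and toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each patient (e.g. years from surgery)
    events : 1 if recurrence was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        recurrences = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if recurrences > 0:
            survival *= 1 - recurrences / at_risk
            curve.append((t, survival))
    return curve


def survival_at(curve, horizon):
    """Read the step function: survival probability at a given time horizon."""
    prob = 1.0
    for t, s in curve:
        if t <= horizon:
            prob = s
    return prob


# Toy follow-up data for illustration only (hypothetical, not study data).
toy_times = [1.0, 2.5, 3.0, 6.0, 7.5, 8.0]
toy_events = [1, 0, 1, 0, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
print(survival_at(toy_curve, 7.0))  # estimated 7-year RFS for the toy data
```

In the toy data, the patient censored at 2.5 years leaves the risk set without counting as a recurrence, so the second recurrence is weighed against four rather than five patients at risk.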

The SILA is associated with pathologic grade at resection. (A) The SILA correlation with invasive size at resection. (B) The SILA correlation with percentage of lepidic growth pattern, measured at resection. (C) The SILA association with WHO 2021 grading criteria. (D) The SILA association with novel pathology grading criteria.

The SILA performance for predicting LMP and VI stage I LUAD tumors. (A) ROC curve of the SILA predicting cases of LMP (including AIS/MIA) (Wilcoxon P = 8.0e−05). (B) ROC curve of the SILA predicting cases of VI (Wilcoxon P = 4.2e−05).
Given that the SILA has previously been reported as linearly increasing with invasive size (non-lepidic tumor size) at resection [24], we sought to validate this in our cohort and examine associations with other pathology features observable in the resected tumor. The SILA was positively correlated with invasive size at resection (R = 0.54, P = 8.0 × 10⁻¹⁴) (Fig. 2A) and negatively correlated with the percentage of lepidic growth pattern (

The CANARY red exemplar is associated with VI at resection. (A) Correlation matrix of CANARY exemplars with percentages of different LUAD histologic growth patterns (*indicates FDR < 0.05). (B) ROC curves of individual CANARY exemplars predicting VI cases. (C) ROC curves of individual CANARY exemplars predicting LMP cases.
Given that the SILA was weakly associated with WHO grade 3 tumors containing aggressive histologic patterns, we sought to determine whether any of the nine CANARY exemplars were associated with percentages of different growth patterns. Correlation analysis followed by unsupervised clustering revealed that non-lepidic patterns clustered separately from the exemplars (Fig. 4A), suggesting they are not major drivers of the SILA. The red exemplar had the highest performance for predicting VI (AUC of 0.69) (Fig. 4B). When all exemplars were included in a logistic regression model for predicting VI, only the red exemplar was significant (P < 0.05). Furthermore, after performing stepdown Akaike information criterion (AIC) analysis, the lowest AIC was obtained for a model that included only the red exemplar, suggesting that the red exemplar is primarily responsible for SILA’s ability to predict VI. Finally, LMP was classified equivalently by multiple CANARY exemplars (Fig. 4C). The lowest AIC was obtained for a model that included the indigo (P < 0.01), blue (P < 0.01), and green (P < 0.10) exemplars, suggesting that there are multiple radiologic aspects of the nodule that contribute to the prediction of LMP.
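The AUC values reported for the SILA and the individual exemplars can be read as the probability that a randomly chosen positive case (for example, a VI tumor) receives a higher score than a randomly chosen negative case, with ties counted as one half. This equals the Mann-Whitney U statistic divided by the number of case pairs, which links the ROC curves to the Wilcoxon rank-sum P-values given in the figure legends. The Python sketch below illustrates the calculation; it is not the code used in this study, and the function name and toy scores are hypothetical.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as P(score_pos > score_neg) + 0.5 * P(tie), over all case pairs."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy scores for illustration only (hypothetical, not study data).
toy_vi = [0.4, 0.7, 0.9]
toy_non_vi = [0.2, 0.3, 0.5, 0.7]
print(auc_from_scores(toy_vi, toy_non_vi))
```

An AUC of 0.5 corresponds to no discrimination between the two groups and 1.0 to perfect separation.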
Discussion
This study evaluated, for the first time in an urban safety-net hospital, the association between CANARY, a well-described algorithm for preoperative prediction of indolent and aggressive LUAD [20,22,24,25,28], and histologic grade. In this cohort, the good, intermediate, and poor CANARY SILA prognostic groups were associated with 100%, 91%, and 73% 7-year RFS, respectively, and the SILA was significantly associated with RFS even after correction for other clinical factors. The SILA prognostic subgroups were originally identified by association with linear extent of histologic invasion and showed prognostic stratification in both an internal and external cohort of predominantly (83%) clinical stage I LUAD, exhibiting 100%, 79%, and 58% 5-year DSS [24]. Our improved outcomes among intermediate and poor SILA risk groups likely reflect the restriction of our analysis to pathologic stage I LUAD. As the SILA has been previously validated in a cohort derived from a subset of the NLST containing 94% white patients, the validation of CANARY and the SILA for predicting prognosis in a cohort containing patients of diverse racial and ethnic identity (47% non-white) is encouraging given that both LUAD incidence and LUAD aggressiveness at diagnosis are higher for non-Hispanic Black patients [18,30,31]. Additionally, our cohort captures the diverse etiology of LUAD, with 14% of patients being never-smokers.
There remains no clinically accepted approach to preoperatively predict tumor aggressiveness among surgically operable LCs, which are managed uniformly by clinical stage, potentially resulting in over-treatment of indolent lesions. While the SILA has been shown to accurately predict AIS and MIA stage I LUAD, we have previously shown that tumors designated as LMP more closely match the proportion of overdiagnosed cases in the NLST ([14]; Burks et al., in this edition). In our cohort, the SILA “Good” group (n = 12) had 100% RFS, and the SILA achieved an AUC of 0.74 for classifying the larger group of LMP tumors (n = 27), which had 100% DFS. Future studies may therefore seek to improve the SILA’s identification of indolent stage I LUAD by incorporating other features such as serum proteomics, biopsy pathology, liquid biopsy, and mutational or transcriptomic profiling into multimodal predictive models [32,33]. Although data from large clinical trials including JCOG0802/WJOG4607L and CALGB140503 suggest that lobectomy does not offer a survival benefit over limited resection, additional data are needed to determine whether patients identified with indolent disease may in the future have similar outcomes when treated with non-surgical approaches such as SBRT and RFA [4,5,6].
Tumor grading by microvascular invasion, the histologic representation of tumor intravasation, has been shown to be more strongly associated with post-surgical outcome in stage I LUAD than grading that takes into consideration the proportion of the aggressive LUAD histologic patterns (solid and micropapillary) [18]. Retrospective analysis shows that patients with VI who undergo sublobar resection have poorer outcomes [34]. This underscores the growing importance of identifying individuals with more aggressive disease prior to surgery. Using the SILA to predict VI preoperatively, potentially in combination with other biomarker modalities, may therefore offer opportunities to guide precision surgery, but prospective studies are needed. VI may also identify candidates who would benefit from adjuvant or neoadjuvant therapy. In this study, the SILA predicted VI with an AUC of 0.71 and was associated with VI independently of invasive size at resection. A previous study of stage IA LUAD nodules found that the ratio of the length of nodule consolidation to nodule diameter in preoperative CT scans predicted combined lymphatic and/or blood vessel invasion [35]. While lymph vessel invasion is often reported interchangeably with angioinvasion, it may not be as strong an independent prognostic factor [18,19]. Examination of the CANARY exemplars showed that the red exemplar, which corresponds to the most visually solid tumor areas on CT [20], was most responsible for the performance of the SILA for predicting VI. This is the first analysis showing CANARY to be predictive of a specific type of pathologic invasion; prior studies assessed CANARY and/or the SILA for predicting the size of any type of invasion.
Others have identified the violet, indigo, red, and orange CANARY exemplars, visually corresponding to varying degrees of solid tumor CT appearance, as being associated with a lower likelihood of EGFR-mutated LUAD, which might be expected since these tumors frequently are rich in lepidic histology and ground glass CT-appearance [36]. Additionally, other studies have shown a lower prevalence of EGFR-mutated cases among both LUAD and NSCLC with VI [37,38].
Future efforts to improve the preoperative prediction of VI from CT images may take advantage of convolutional neural network-based extraction of perinodular features to add additional context from the surrounding lung microvascular architecture [39]. A study seeking to preoperatively predict VI positive hepatocellular carcinoma from CT scans achieved an AUC of 0.89 in a validation set using a radiomic model incorporating peritumoral features [40]. In contrast to our findings with VI, the SILA and the CANARY exemplars were not associated with aggressive LUAD histologic patterns, which may explain the poor predictive performance we reported for WHO grade 3 tumors. Other studies have demonstrated the feasibility of building radiomic classifiers that may predict solid and micropapillary histology, suggesting that other nodule features than those extracted by the SILA may be more representative of these high-grade patterns [41,42].
This study has several limitations. Despite the robust validation of the SILA in a new cohort, the cohort was collected over 11 years and there may be variability due to changes in standard of care and practice patterns. Additionally, the low number of AIS/MIA cases included (n = 4) did not allow for a robust validation of the SILA's ability to distinguish between indolent and invasive LUAD as defined by the WHO 2021 grading scheme [25]. Finally, because most of the CT scans used in our study were acquired with the same scanner manufacturer, we cannot rule out variability in the SILA due to scanner type. However, CANARY and the SILA were both derived on and have since been validated across a variety of scanner types.
Conclusion
The SILA derived from preoperative CT scans was prognostic and predictive of resected pathologic grade in a diverse cohort of patients with stage I LUAD. New strategies are necessary to minimize overdiagnosis in this clinical setting and identify aggressive tumors that may benefit from precision surgery, adjuvant and/or neoadjuvant treatment. Ultimately, the SILA should be prospectively validated and benchmarked against pathology review of biopsies to identify both LMP and VI tumors preoperatively.
Supplemental Material
Supplemental material, sj-pdf-1-10.3233_cbm-230456 for A computed tomography-based score indicative of lung cancer aggression (SILA) predicts lung adenocarcinomas with low malignant potential or vascular invasion by Dylan Steiner, Ju Ae Park, Sarah Singh, Austin Potter, Jonathan Scalera, Jennifer Beane, Kei Suzuki, Marc E. Lenburg and Eric J. Burks in Cancer Biomarkers
Footnotes
Acknowledgments
This work was supported by 1R01CA275015-01A1 and the Ellison Foundation.
Author contributions
Conception: Dylan Steiner, Eric Burks, Marc Lenburg.
Interpretation or analysis of data: Dylan Steiner, Ju Ae Park, Sarah Singh, Austin Potter, Jonathan Scalera.
Preparation of the manuscript: Dylan Steiner, Eric Burks.
Revision for important intellectual content: Jennifer Beane, Kei Suzuki, Marc Lenburg.
Supervision: Jennifer Beane, Kei Suzuki, Eric Burks, Marc Lenburg.
Supplementary data
The supplementary files are available to download from http://dx.doi.org/10.3233/CBM-230456.
