Abstract
Objective
Lung cancer is primarily categorized into small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), each characterized by distinct therapeutic approaches and prognostic outcomes, particularly in stage III peripheral cases. This study aimed to develop predictive models utilizing clinical and radiomic data to preoperatively differentiate stage III peripheral SCLC from NSCLC.
Method
We conducted a retrospective analysis of 33 stage III peripheral SCLC cases and 99 stage III peripheral NSCLC cases treated at our hospital between January 2016 and July 2024. A total of 1037 radiomic features were extracted from contrast-enhanced CT scans. The cohort was divided into a training set (n = 92) and a test set (n = 40). Radiomic feature selection was performed using the LASSO algorithm, and nine machine learning models were evaluated. The optimal model was employed to compute the radiomics score (Rad-score) and construct a clinical model. A combined model, integrating clinical factors and radiomic features, was assessed for clinical utility through receiver operating characteristic (ROC) curve analysis (area under the curve, AUC), KS statistics and decision curve analysis (DCA). We externally validated the combined model in a group of 84 patients from another hospital.
Results
The logistic regression-based combined model exhibited superior performance: the AUC values of the combined, clinical, and radiomics models were 0.956, 0.775, and 0.841, respectively, in the training cohort, and 0.905, 0.864, and 0.732 in the test cohort. The AUC of the combined model was 0.843 in the external validation cohort. The KS statistics and DCA indicated the clinical utility of the combined model, and its Brier score was 0.115.
Conclusion
The integration of clinical parameters and radiomics features within the combined model may hold significant potential for the preoperative differentiation of stage III peripheral SCLC from NSCLC.
Keywords
Critical Relevance Statement
This research effectively established a model capable of differentiating between stage III peripheral SCLC and NSCLC by integrating preoperative clinicopathological, radiomic, and combined clinical-radiomic features. The model demonstrates significant potential to improve the accuracy of treatment planning for patients with advanced-stage cancers or those who may require neoadjuvant therapy in the future.
Key Points
The study aimed to develop predictive models to distinguish stage III peripheral SCLC from NSCLC.
The model was successfully constructed by incorporating preoperative clinical and radiomic features, thereby holding promise for enhancing the precision of lung cancer treatment.
Introduction
According to the data presented in the Cancer Report by the International Agency for Research on Cancer (IARC), lung cancer remained the leading cause of cancer-related mortality worldwide in 2022. 1 Lung cancer is primarily classified into two major subtypes: small cell lung cancer (SCLC), accounting for 10%–15% of cases, and non-small cell lung cancer (NSCLC), comprising 85% of cases.2,3 SCLC is characterized by its aggressive nature and rapid proliferation, typically requiring a treatment regimen that combines chemotherapy and radiation therapy.4,5 The role of surgical intervention in the management of early-stage SCLC continues to be a topic of ongoing debate. 6 Current guidelines do not universally recommend surgical resection, largely due to the lack of robust, high-level evidence supporting its efficacy. Nevertheless, several retrospective analyses and smaller single-arm trials suggest that surgical treatment may offer benefits for certain patients with early-stage disease. 7 A population-based analysis revealed that patients diagnosed with stage I or II SCLC who had T1-2N0 disease and underwent surgical treatment demonstrated significantly enhanced overall survival (OS) and lung cancer-specific survival (LCSS) rates compared to those who received radiotherapy as their sole treatment modality. 8 This finding underscores the potential advantages of surgical intervention in a carefully selected cohort of patients. For individuals with resectable stage IIIA-N2 SCLC, concurrent chemoradiotherapy remains the standard treatment approach. 9 This contrasts with the management of stage IIIA-N2 NSCLC, where surgical interventions, such as lobectomy or pneumonectomy, may be advantageous. The choice of treatment is influenced by the specific stage and histological type of the tumor and may include adjuvant therapies. 
Although SCLC typically presents as a central lesion with mediastinal lymph node metastasis, some cases present as peripheral lesions, which are frequently misdiagnosed clinically as NSCLC. In cases where a patient is diagnosed with resectable stage IIIA-N2 lung cancer, surgical intervention may be considered the primary treatment option.
Within the context of lung cancer diagnosis, computed tomography (CT) scans are the most commonly utilized imaging modality. Numerous CT imaging characteristics have been identified in lung nodules that may assist in predicting malignancy. However, the imaging features specific to lung cancer are limited, and conventional CT imaging analysis predominantly relies on the visual assessment by radiologists, which may introduce interobserver variability. 10 Furthermore, traditional CT analysis faces challenges in distinguishing between SCLC and NSCLC due to overlapping features, complicating visual differentiation in clinical practice. When lung cancer is suspected, a biopsy is performed to confirm the diagnosis and supplement CT imaging. Techniques such as bronchial brushing and CT-guided biopsy carry risks, including post-procedural infection, bleeding, and pneumothorax. Furthermore, pathological diagnosis through invasive biopsy generally assesses a localized area of the tumor rather than the entire neoplasm, thereby complicating comprehensive characterization. Additionally, the results of biopsies are not always promptly accessible, highlighting the imperative need to develop complementary non-invasive techniques for differentiating subtypes of primary lung cancer via radiomic analysis.
Radiomics is an emerging clinical technology designed to extract high-throughput features from medical images of lesions, thereby enhancing clinical decision-making processes.11,12 In clinical practice, radiomics has been utilized in the diagnosis of lung nodules, facilitating the differentiation between benign and malignant forms, preoperative prediction of nodule types, classification of various NSCLC subtypes and SCLC, prognostic evaluations, prediction of surgical outcomes, and assessment of tumor gene expression patterns and microenvironmental characteristics.13-15 The widespread implementation of imaging examinations in clinical diagnostics has increased the accessibility of radiomics research. Building on previous studies, this research aims to conduct a preliminary evaluation of the complex clinical parameters, radiomic features, and their integration for the preoperative differentiation of stage III peripheral SCLC from NSCLC.
Materials and Methods
This study was conducted in accordance with the Declaration of Helsinki and was approved by the ethics committees of both participating hospitals. Because of the retrospective design, the requirement for informed consent was waived.
Patient Selection
The reporting of this study adheres to the TRIPOD guidelines. 16 From January 2016 to July 2024, we retrospectively reviewed all patients diagnosed with stage III peripheral SCLC or NSCLC at our hospital. The inclusion criteria were as follows: (1) a diagnosis of SCLC or NSCLC confirmed by surgical pathology; (2) a solitary, solid nodule on imaging, indicative of peripheral lung cancer; (3) completion of adequate staging procedures, including whole-body PET-CT or brain MRI, and contrast-enhanced CT of the chest and upper abdomen, including the adrenals, with a pathological tumor stage of III; and (4) the availability of comprehensive clinical and pathological data, including analyzable plain and enhanced thin-slice CT images with a slice thickness of 1 mm. The exclusion criteria were as follows: (1) anti-tumor therapy, such as radiotherapy, chemotherapy, chemoradiotherapy, or molecular targeted therapy, prior to CT examination and pathological diagnosis; (2) a diagnosis of another cancer type; and (3) CT images obtained more than two weeks before the pathological diagnosis. The Tumor, Node, Metastasis (TNM) stage was determined in accordance with the ninth edition of the TNM staging system established by the American Joint Committee on Cancer. Based on these criteria, this study included 33 patients diagnosed with stage III peripheral SCLC. A control group comprising 99 patients with stage III peripheral NSCLC was randomly selected. Additionally, we collected a dataset (n = 84) from another hospital, covering January 2022 to November 2024 (21 patients with SCLC and 63 with NSCLC), to validate the combined model externally. The inclusion and exclusion criteria were the same as for the development cohort.
All patient information was anonymized to ensure confidentiality.
CT Image Acquisition
Contrast-enhanced chest CT scans were conducted on all patients using dual-source CT technology (SOMATOM Definition Flash or SOMATOM Force, Siemens Healthineers, Germany). Prior to imaging, patients participated in breathing exercises and maintained breath-hold during inspiration to enhance image quality. Scans were performed with patients in the supine position to minimize artifacts. The imaging range, defined using the sternoclavicular joint and thoracic inlet as reference points, was acquired at a tube voltage of 120 kV and a tube current of 100 mAs. Reconstruction parameters included a slice thickness of 1 mm, a matrix size of 512 × 512, and a pitch of 1.2. Conventional algorithms were employed for the reconstruction process to optimize soft tissue and pulmonary imaging. Following the initial scan, 70-90 mL of the nonionic contrast agent iohexol (300 mg/mL) was administered via the ulnar vein using a high-pressure syringe at a flow rate of 3 mL/s. Dual-phase enhanced imaging was subsequently performed, capturing the arterial phase at 30 s and the venous phase at 90 s after contrast administration. In addition to the standard imaging parameters, the raw data were transmitted to the post-processing terminal for multiplanar reconstruction (MPR). Image feature analysis was conducted independently by two thoracic radiologists, each possessing certification from their respective professional organizations and having 7 and 13 years of experience in chest CT imaging, respectively. The window settings were calibrated to a mediastinal window with a width of 400 HU and a level of 40 HU, as well as a lung window with a width of 1500 HU and a level of −600 HU. 
The study examined the following characteristics of the primary lung tumors: anatomical location (left or right lung; upper, middle, or lower lobe), size (maximum diameter), and morphological features, such as shape and margin, which were categorized as lobular, spiculated, or vermiform/branching. Internal features, including cavities or vacuoles and swamp-like reinforcement, external features such as the pleural indentation sign, and associated signs such as emphysema were also evaluated. The CT image characteristics were assessed independently by two radiologists, with any discrepancies resolved through a standardized protocol. Clinical data for each patient were collected by the radiologists from the Electronic Medical Record System, including demographics (gender and age) and clinical history, such as smoking status. Tumor biomarkers were also documented, specifically progastrin-releasing peptide (proGRP), squamous cell carcinoma antigen (SCC-Ag), carcinoembryonic antigen (CEA), and neuron-specific enolase (NSE). Further details on segmentation, feature extraction, feature selection, and model development are provided in the following sections.
Segmentation, Feature Extraction, and Selection
CT images were imported into 3D-Slicer (version 5.0.2, accessible at http://www.slicer.org) and analyzed using lung (1500/-600 Hounsfield Units) and mediastinal (400/40 Hounsfield Units) window settings. A radiologist with 13 years of professional experience, blinded to clinical data, delineated regions of interest (ROIs) slice by slice. Tumor ROIs were defined to encompass all areas within a nodule, including any cavities or vacuoles, while explicitly excluding bronchi, blood vessels, and normal lung tissue. To evaluate intra-observer agreement, tumor segmentation was repeated independently for all patients one month later. A second radiologist, with 7 years of experience, independently repeated the segmentation for these patients to assess inter-observer agreement. Intraclass correlation coefficients (ICCs) were used to determine the reproducibility of feature extraction with respect to intra-observer and inter-observer variability.
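The reproducibility check described above can be illustrated with a short sketch. The text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is an assumption here, and the feature values are hypothetical rather than study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: list of rows, one row per lesion, one column per reader.
    The choice of ICC(2,1) is an assumption; the source does not name the form.
    """
    n = len(ratings)          # number of lesions
    k = len(ratings[0])       # number of readers (or repeated segmentations)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-lesion mean square
    msc = ss_cols / (k - 1)                # between-reader mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical values of one radiomic feature for five lesions,
# each segmented by two readers (columns). Not data from the study.
feature_values = [[10.1, 10.4], [12.3, 12.0], [8.7, 9.1], [15.2, 14.8], [11.0, 11.3]]
icc = icc_2_1(feature_values)
keep_feature = icc > 0.75  # 0.75 is the threshold reported in the text
```

In this sketch a feature would be carried forward only when `keep_feature` is true; the same function applies to intra-observer data by treating the two segmentation sessions as the two columns.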
Radiomic features were extracted using Pyradiomics within the 3D-Slicer software. A total of 1037 radiomic features were derived from the original images and from their log-sigma-4-0-mm-3D and wavelet transformations. The features extracted from the original image comprised 14 shape features, 18 first-order histogram features, 24 gray level co-occurrence matrix (GLCM) features, 16 gray level run length matrix (GLRLM) features, 16 gray level size zone matrix (GLSZM) features, 5 neighboring gray tone difference matrix (NGTDM) features, and 14 gray level dependence matrix (GLDM) features. Except for the shape features, the type and number of features extracted from the other image types were the same as those extracted from the original image.
To address the challenge of dimensionality reduction of radiomic features in relation to the number of events, a three-step sequential methodology was employed. Initially, the interobserver agreement of radiomic features was evaluated, and features with an ICC exceeding 0.75 were selected. Subsequently, features that demonstrated statistical significance in differentiating between SCLC and NSCLC groups were identified. Finally, the LASSO logistic regression was utilized to identify the most informative radiomic features for distinguishing between SCLC and NSCLC within the training cohort. This process incorporated fivefold cross-validation repeated 100 times to minimize the risk of overfitting.
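How an L1 (LASSO) penalty performs feature selection in a logistic model can be shown with a minimal sketch. It uses proximal gradient descent on synthetic data; the solver, the step size, the λ values, and the data are all illustrative assumptions and do not reproduce the software or the repeated fivefold cross-validation used in the study.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink towards zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def lasso_logistic(X, y, lam, step=0.3, n_iter=300):
    """Minimize mean log-loss + lam * ||w||_1 (intercept unpenalized)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic data (assumption): feature 0 separates the classes, features 1-2 are noise.
rng = random.Random(0)
y = [i % 2 for i in range(120)]
X = [[rng.gauss(1.5 if yi else -1.5, 1.0), rng.gauss(0, 1), rng.gauss(0, 1)] for yi in y]

w_moderate, _ = lasso_logistic(X, y, lam=0.2)  # informative feature keeps a non-zero weight
w_strong, _ = lasso_logistic(X, y, lam=5.0)    # penalty large enough to zero every weight
```

With a moderate λ the informative feature survives while uninformative ones tend to be shrunk to exactly zero; a cross-validated search over λ, as described in the text, decides where on this path to stop.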
Model Development
Machine learning models were subsequently developed using variables that demonstrated statistical significance in the multivariate analysis. The study followed these procedural steps: (1) Patients were randomly assigned to training and testing cohorts in a 7:3 ratio; (2) Nine machine learning models were constructed using statistically significant factors identified in the training dataset, specifically: XGBoost (Extreme Gradient Boosting), LightGBM (Light Gradient Boosting Machine), RF (Random Forest), AdaBoost (Adaptive Boosting), GNB (Gaussian Naive Bayes), LR (Logistic Regression), MLP (Multilayer Perceptron), SVM (Polynomial Support Vector Machine), and KNN (k-Nearest Neighbor). Optimal parameters for these nine machine learning models were identified using a 5-fold cross-validation approach. To address the risks of data imbalance and overfitting, the Synthetic Minority Over-sampling Technique (SMOTE) was employed, along with class-weight adjustments. A 5-fold cross-validation procedure was utilized to evaluate the performance of these models and determine the most efficient one. The primary evaluation metrics comprised AUC, accuracy, sensitivity, and specificity. Models exhibiting overfitting among the nine machine learning algorithms were excluded, and the most predictive classifier was selected based on ROC analysis. A predictive model was developed by integrating both clinical and CT data.
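The oversampling step can be sketched in a few lines. This is a simplified, standard-library version of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours); the implementation and settings actually used in the study are not stated, and the feature vectors below are hypothetical. Only the class sizes (26 SCLC and 66 NSCLC in the training cohort) come from the text.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base sample (itself excluded).
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature vectors for the 26 SCLC training cases (not study data).
data_rng = random.Random(1)
sclc_train = [[data_rng.gauss(0.0, 1.0), data_rng.gauss(2.0, 0.5)] for _ in range(26)]

# 66 NSCLC vs 26 SCLC in the training cohort: 40 synthetic SCLC samples balance the classes.
n_needed = 66 - 26
sclc_synthetic = smote(sclc_train, n_needed)
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the range of the observed data. Restricting oversampling to the training folds is a common precaution and an assumption here, not something the text specifies.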
Subsequently, a statistical comparison was performed among three models: the clinical model, the radiomics model, and the combined model, which incorporates both clinical factors and radiomics features, to determine the model with the highest predictive accuracy.
Statistical Analysis
Statistical analyses were conducted using Python version 3.7, with patients randomly allocated to training and test groups in a 7:3 ratio. Radiomic features were normalized through Z-score transformation, and baseline data were analyzed using univariate methods via Python's statsmodels version 0.11.1. Categorical variables were evaluated using chi-square tests, while continuous variables were assessed using either t-tests or Mann-Whitney U tests. Variables exhibiting statistically significant differences (P < .05) were subsequently included in the multivariate logistic regression analysis. The multivariate analysis identified clinical and CT features with statistically significant differences (P < .05), which were then used to develop a clinical prediction model. The model's performance was assessed using key metrics, including AUC, accuracy, sensitivity, specificity, positive predictive value, and negative predictive value. To evaluate the clinical utility of the three models, DCA and Kolmogorov-Smirnov (KS) statistical plots were employed. The software versions and analysis steps are reported here so that other researchers can reproduce the methodology.
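The threshold-based metrics listed above all derive from the same confusion matrix. The following sketch shows the definitions, assuming labels coded as 1 = SCLC and 0 = NSCLC; the function name and the example labels are illustrative and are not taken from the study.

```python
def diagnostic_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = SCLC, 0 = NSCLC)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
    }


# Illustrative labels only: 3 SCLC and 5 NSCLC cases with one error in each class.
example = diagnostic_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1, 0])
```

Unlike the AUC, these values depend on the probability cut-off used to turn model output into a predicted class, and on class prevalence in the case of the predictive values.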
Results
Patient Characteristics
This retrospective study included a total of 132 patients, comprising 33 individuals diagnosed with stage III peripheral SCLC and 99 individuals diagnosed with stage III peripheral NSCLC. Participants were randomly assigned to two cohorts in a 7:3 ratio, resulting in 92 patients in the training cohort (26 with SCLC and 66 with NSCLC) and 40 patients in the test cohort (7 with SCLC and 33 with NSCLC). Statistical analyses indicated no significant age differences between the two groups, with both groups having a median age of 62 years. The proportion of male patients was slightly higher in the SCLC cohort than in the NSCLC cohort (70.00% vs 66.00%), but the difference was not statistically significant (P = .670). Similarly, no statistically significant difference was observed in the prevalence of smoking history between the SCLC and NSCLC groups (55.00% vs 51.00%; P = .688). However, emphysema was significantly more prevalent in the SCLC group than in the NSCLC group (33.00% vs 17.00%; P = .049).
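As a worked illustration of the chi-square comparison, the emphysema result can be approximated from a 2 × 2 table. The counts used here (11 of 33 SCLC and 17 of 99 NSCLC patients) are assumptions back-calculated from the rounded percentages, and the Pearson test without continuity correction is likewise an assumption about the variant used.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = 0.0
    for observed, row_total, col_total in (
        (a, row1, col1), (b, row1, col2), (c, row2, col1), (d, row2, col2)
    ):
        expected = row_total * col_total / n
        stat += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Assumed counts: emphysema present / absent in SCLC (11 / 22) and NSCLC (17 / 82).
stat, p_value = chi_square_2x2(11, 22, 17, 82)
```

Under these assumed counts the statistic is about 3.87 and the P value falls just below .05, in line with the reported P = .049.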
According to CT findings, the spiculated sign was significantly more frequent in patients with NSCLC than in those with SCLC (52.5% vs 3.03%; P < .001). Pleural indentation on CT lung window images was likewise more frequent in the NSCLC group than in the SCLC group (64.65% vs 6.06%; P < .001). Conversely, vermiform/branching signs were more prevalent in SCLC patients than in those with NSCLC (24.24% vs 0.10%). Additionally, swamp-like reinforcement during the venous phase differed significantly between SCLC and NSCLC (15.15% vs 0.20%; P = .004).
In relation to tumor marker levels, the prevalence of abnormal proGRP and NSE was significantly higher in the SCLC cohort compared to the NSCLC cohort (P < .001 and P = .006, respectively). In contrast, the prevalence of abnormal CEA and SCC-Ag was significantly lower in the SCLC cohort than in the NSCLC cohort (P < .001 and P = .002, respectively).
The patient characteristics for both the training and testing cohorts are comprehensively presented in Table 1.
Clinical Characteristics of the Patients.
Abbreviations: NSCLC, non-small cell lung cancer; SCLC, small cell lung cancer; Lobul, lobulation; Spicul, spiculation; Cavity, cavities or vacuoles; Swamp, swamp-like reinforcement; Ple, pleural indentation sign; Vermiform, vermiform/branching; Emp, emphysema; proGRP, progastrin-releasing peptide; NSE, neuron-specific enolase; CEA, carcinoembryonic antigen; SCC, squamous cell carcinoma antigen.
Radiomics Feature Selection and Model Construction
Initially, a total of 1037 radiomic features were extracted. During the feature elimination process, 425 features that did not demonstrate significant differences between SCLC and NSCLC were removed, along with 255 features that exhibited high correlation and ICC values below 0.75. Subsequent LASSO screening, utilizing a λ value at the minimum standard with a standard error of 0.027 and 0.119, respectively, resulted in the retention of four robust radiomic features: energy, complexity, T90Percentile, and correlation, at λ = 0.119, as depicted in Figures 1A and 1B.

Radiomics and Clinical Features Selection with LASSO. 1A and 1C: the x-axis represents log (λ), and the numbers above the plot represent the average number of predictive variables. Each red dot represents the mean binomial deviance of the model at a given λ, and the vertical bar through it represents the upper and lower limits of the deviance. The vertical dotted line marks the log (λ) value corresponding to the best λ, selected by the minimum criterion. By tuning λ, the binomial deviance of the model is minimized and the best-performing feature set is selected. 1B and 1D: coefficient profiles plotted against log (λ). The dotted line marks the selected λ value; features with non-zero coefficients at this λ were retained. After redundant features were screened out by LASSO, the four most robust radiomics features (energy, complexity, T90Percentile, correlation) were retained, with λ = 0.119.
In the training dataset, nine machine learning radiomics prediction models were developed, including XGB, LGBM, RF, AdaBoost Classifier, GNB, LR, MLP, SVC, and K-Neighbors Classifier, using four radiomics features. Among these models, the Logistic Regression Classifier demonstrated superior performance, achieving AUC values of 0.841 on the training dataset and 0.847 on the validation dataset, as presented in Tables 2 and 3, and Figure 2. Furthermore, a variable number-model rating analysis was employed to optimize the model, resulting in the selection of complexity, T90Percentile, and correlation as key features. This optimized model achieved AUC values of 0.840 on the training dataset and 0.903 on the validation dataset, as illustrated in Figure 3. The KS statistical chart further validated the model's efficacy, with a KS value of 0.541 achieved under the optimal prediction probability threshold (Figure 4).
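For readers less familiar with the AUC values quoted above, the statistic can be computed directly as the probability that a randomly chosen SCLC case receives a higher score than a randomly chosen NSCLC case. The sketch below uses this pairwise (Mann-Whitney) formulation; the scores are illustrative, not study data.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Illustrative model scores: 3 SCLC (label 1) and 4 NSCLC (label 0) cases.
labels = [0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
auc = roc_auc(labels, scores)  # 11 of the 12 pairs are ordered correctly
```

The pairwise form is quadratic in the number of cases, which is unproblematic for cohorts of this size.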

Receiver Operating Characteristic (ROC) Curves of the Nine Machine Learning Models in the Training Dataset.

Variable Number-Model Rating Analysis Used to Optimize the Model. Complexity, T90Percentile, and correlation were finally selected into the model, which achieved AUC values of 0.840 on the training dataset and 0.903 on the validation dataset. 3A represents the weight of each feature in the model. 3B represents the mean AUC of the model with different features.

The KS Statistical Chart for Radiomics Model (4A) and Clinical Model (4B).
Performance Metrics for Nine Models in the Training Dataset.
Abbreviations: AUC, area under the curve; XGBoost, Extreme Gradient Boosting; SVM, polynomial support vector machine; LightGBM, Light Gradient Boosting Machine; AdaBoost, Adaptive Boosting; GNB, Gaussian Naive Bayes; MLP, Multilayer Perceptron; KNN, k-Nearest Neighbor.
Performance Metrics for Nine Models in the Validation Dataset.
Abbreviations: AUC, area under the curve; XGBoost, Extreme Gradient Boosting; SVM, polynomial support vector machine; LightGBM, Light Gradient Boosting Machine; AdaBoost, Adaptive Boosting; GNB, Gaussian Naive Bayes; MLP, Multilayer Perceptron; KNN, k-Nearest Neighbor.
Feature Selection and Clinical Model Construction
In the training dataset, LASSO screening analysis identified five clinical features—namely, the spiculated sign, vermiform/branching sign, swamp-like reinforcement, pleural indentation sign, and proGRP—as key discriminators between NSCLC and SCLC at λ = 0.095, as depicted in Figures 1C and 1D. Utilizing these five features, a clinical prediction model was developed using LR machine learning. The AUC values for the clinical models in distinguishing SCLC from NSCLC were 0.775 in the training cohort and 0.864 in the test cohort. Additional details are presented in Figure 5. The model's efficacy was further corroborated by the KS statistical chart, which demonstrated a KS value of 0.697 at the optimal prediction probability threshold (Figure 4B).
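The KS value reported for a classifier is the largest gap between the true positive rate and the false positive rate over all probability thresholds. A minimal sketch follows, assuming that a case is called positive when its score is at or above the threshold; the scores are illustrative only.

```python
def ks_statistic(y_true, scores):
    """Return (KS value, threshold) maximizing |TPR - FPR|."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    best_ks, best_threshold = 0.0, None
    for threshold in sorted(set(scores)):
        tpr = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold) / n_pos
        fpr = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold) / n_neg
        gap = abs(tpr - fpr)
        if gap > best_ks:
            best_ks, best_threshold = gap, threshold
    return best_ks, best_threshold


# Illustrative scores: 3 SCLC (label 1) and 4 NSCLC (label 0) cases.
labels = [0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
ks_value, ks_threshold = ks_statistic(labels, scores)
```

In this toy example the largest separation (0.75) occurs at a threshold of 0.4, where all positive cases but only one of four negative cases are flagged.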

Comparison of Receiver Operating Characteristic (ROC) Curves Among the Clinical Model, Radiomics Model and Combined Model in the Training Cohorts (A) and Testing Cohorts (B). The AUC values in the combined model were better than those in the clinical model and radiomics model for the prediction of SCLC.
Combined Model Construction and Validation of Performance
In this study, a composite model was developed using an LR classifier, which integrated five clinical characteristics with three radiomic features. The performance of this composite model exceeded that of the individual clinical and radiomic models, as indicated by ROC-AUC values of 0.956 and 0.905 for the training and test cohorts, respectively (see Figure 5). The sensitivity and specificity were 0.961 versus 0.887 and 0.862 versus 0.831 in the training and test groups, respectively. Moreover, after SMOTE was applied to address potential issues of data imbalance and overfitting, the composite model exhibited high ROC-AUC values of 0.977 and 0.922 for the training and test cohorts, respectively (Figure 6). The sensitivity and specificity were 1.00 versus 0.972 and 0.894 versus 0.878 in the training and test groups, respectively. The model's robustness was further corroborated by the KS statistical chart, which demonstrated a KS value of 0.848 at the optimal prediction probability threshold (Figure 7). The results of the DCA indicated that the integrated model for preoperative differentiation between stage III peripheral SCLC and NSCLC significantly outperformed the clinical and radiomics models in terms of net benefits in both the training and test cohorts, as depicted in Figure 8. The Brier score for the entire cohort was 0.115, with the corresponding calibration plots provided. Finally, a nomogram was established to predict the risk of SCLC for each patient (Figure 9). The nomogram points and the risk were calculated based on the nomogram, and the best cut-off value of the risk to predict SCLC was 0.772.
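To put the reported Brier score in context, it helps to compare it with the score of an uninformative model. The sketch below defines the Brier score and evaluates a constant prediction equal to the SCLC prevalence in the development cohort (33 of 132 patients); the constant-prediction baseline is an illustrative reference and is not an analysis reported in the study.

```python
def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


# Development cohort composition from the text: 33 SCLC (1) and 99 NSCLC (0).
outcomes = [1] * 33 + [0] * 99
prevalence = sum(outcomes) / len(outcomes)

# Reference model (assumption for illustration): predict the prevalence for everyone.
baseline = brier_score(outcomes, [prevalence] * len(outcomes))
```

The constant model scores 0.1875 here, so a Brier score of 0.115 indicates predictions that are closer to the observed outcomes than prevalence alone; lower values are better, with 0 for perfect predictions.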

Receiver Operating Characteristic (ROC) Curves of the Combined Model in the Training Cohort (A) and Testing Cohort (B) After the Resampling Technique (SMOTE) was Used to Mitigate the Risks of Data Imbalance or Overfitting.

The KS Statistical Chart for Combined Model.

Decision Curve Analyses for the Radiomics-Clinical Model Compared with the Radiomics Model and Clinical Model in the Training Cohort (A) and the Testing Cohort (B). Decision curve analysis showed that the net benefits of the combined model for the prediction of SCLC were higher than those of the clinical model and radiomics model.

The Nomogram of the Combined Model.
The external validation cohort comprised 84 patients (21 patients with SCLC and 63 patients with NSCLC). Table 4 presents the baseline characteristics of this external group. The AUC of the combined model for distinguishing SCLC from NSCLC was 0.843 in the external validation cohort. The DCA revealed that when the threshold probability exceeded 0%, the net benefit of the combined model for the prediction of SCLC was high (Figure 10A and 10B).
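The net benefit plotted in a decision curve can be written out explicitly. The sketch below uses the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), together with the "treat all" reference strategy; the small example data are illustrative, and only the cohort composition (21 SCLC, 63 NSCLC) is taken from the text.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Illustrative predictions for five patients (not study data).
nb_model = net_benefit([1, 1, 0, 0, 0], [0.9, 0.6, 0.7, 0.2, 0.1], threshold=0.5)
nb_all = net_benefit_treat_all([1, 1, 0, 0, 0], threshold=0.5)

# External cohort composition from the text: 21 SCLC and 63 NSCLC (prevalence 0.25).
external = [1] * 21 + [0] * 63
nb_all_external = net_benefit_treat_all(external, threshold=0.25)
```

The "treat all" curve crosses zero where the threshold equals the prevalence (0.25 in the external cohort), which is why a model is usually judged by how far its curve lies above both the "treat all" and "treat none" references across the clinically relevant thresholds.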

Receiver Operating Characteristic (ROC) Curves (10A) and Decision Curve Analysis (10B) for Combined Model in the External Validation Cohort.
Clinical Characteristics of the Patients in External Validation Cohort.
Abbreviations: NSCLC, non-small cell lung cancer; SCLC, small cell lung cancer; Lobul, lobulation; Spicul, spiculation; Cavity, cavities or vacuoles; Swamp, swamp-like reinforcement; Ple, pleural indentation sign; Vermiform, vermiform/branching; Emp, emphysema; proGRP, progastrin-releasing peptide; NSE, neuron-specific enolase; CEA, carcinoembryonic antigen; SCC, squamous cell carcinoma antigen.
Discussion
This retrospective study identified significant differences in both qualitative and quantitative clinical characteristics between patients with stage III peripheral SCLC and NSCLC. To distinguish stage III peripheral SCLC from NSCLC, five clinical variables were selected: spiculated sign, vermiform/branching sign, swamp-like reinforcement, pleural indentation sign, and proGRP. For the development of a radiomic prediction model, three radiomic features were chosen: complexity, T90Percentile, and correlation. The integrated clinical-radiomics model, which incorporated five clinical features alongside three radiomic parameters, demonstrated robust predictive capabilities in both the training and validation phases. Moreover, statistically significant distinctions were observed among the clinical model, the radiomics model, and the integrated clinical-radiomics model. Notably, the combined clinical-radiomics model exhibited superior performance compared to each individual model.
CT imaging of peripheral SCLC tumors consistently demonstrates homogeneous density, swamp-like enhancement, and a vermiform or branching pattern characterized by smooth margins, minimal spiculation, and pleural indentation. The vermiform or branching configuration, identified by a spindle-like morphology with its major axis oriented towards the hilum, represents an aggregate of two or more coalesced nodules and serves as a diagnostic marker for peripheral SCLC. Previous studies have described this vermiform or branching morphology using terms such as multinodular shape and fusiform beaded appearance.17,18 The presence of the vermiform or branching sign may be associated with tumors that proliferate along bronchial or vascular walls with a short doubling time, leading to asymmetric tumor growth due to compression by surrounding tissues. As the tumor expands around the bronchioles, it forms multiple contiguous nodules, which may account for the multinodular nature of peripheral SCLC.
The swamp-like reinforcement pattern is characterized by a lack of substantial unenhanced regions. Macroscopically, this corresponds to minimal necrotic and stromal tissues interspersed among the tumor nests, despite the absence of extensive necrotic areas within the tumors. This pathological observation has been previously documented, with similar findings reported in alignment with the enhancement pattern described by Kazawa and Taiga.17,19 A notable distinction exists between this pathological finding and that observed in NSCLC, which is characterized by larger necrotic tissue, and SCLC, which lacks necrotic tissue entirely. Consequently, the swamp-like reinforcement pattern of the nodule may serve as a critical diagnostic indicator for differentiating various types of malignant nodules.
In comparison to peripheral NSCLC, peripheral SCLC demonstrates a lower incidence of spiculation and pleural indentation signs, corroborating previous research findings. 20 In peripheral NSCLC, tumor cells possess the ability to invade the surrounding pulmonary parenchyma and thicken the interstitial space. This phenomenon is less pronounced in SCLC, likely attributable to its pathological characteristics, particularly the paucity of fibrous tissue. As a result, the impact on surrounding structures is minimal, leading to the infrequent occurrence of pleural indentation signs in SCLC. These observations are consistent with prior studies. 21 Regarding tumor markers, the levels of pro-gastrin-releasing peptide (proGRP) in peripheral SCLC groups were significantly elevated compared to those in peripheral NSCLC groups, aligning with findings reported in previous research. 21 Therefore, in instances where imaging signs are insufficient, tumor markers may prove beneficial for differentiation.
Previous studies have demonstrated that radiomic features derived from CT scans can preoperatively differentiate peripheral SCLC from NSCLC. Linning et al developed four radiomic classification models using these extracted features to evaluate phenotypic differences between SCLC and NSCLC, as well as among NSCLC subtypes, achieving an AUC of 0.82. 22 Similarly, Chen et al constructed a CT radiomic model employing a neural network classifier to distinguish peripheral SCLC from NSCLC, resulting in an AUC of 0.93. 15 Li et al developed a CT radiomic model using a feedforward neural network classifier to differentiate central SCLC from NSCLC, achieving an AUC of 0.78. 23 In our study, we employed a radiomics approach to enhance the differentiation of stage III peripheral SCLC from NSCLC. The images were transformed into higher-dimensional data to extract the relevant features. This method facilitated the high-throughput extraction of quantitative features from standard medical images, followed by automated analysis to support clinical decision-making. In this study, three independent radiomic features were incorporated into the final model: T90Percentile, complexity, and correlation. These features are classified under the Histogram Parameter, Neighboring Gray Tone Difference Matrix (NGTDM), and Texture Parameters, respectively. The histogram-derived parameter, T90Percentile, indicates that 90% of the observed values within a dataset fall below this specific threshold. The complexity feature, associated with the Neighboring Gray Tone Difference Matrix, relates to the complexity of an image and encapsulates the intricacy of the information it contains. Conversely, the correlation feature, categorized under Texture Parameters, represents the degree of similarity in gray levels between adjacent pixels. 
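As a minimal illustration of two of the three features described above, the sketch below computes a 90th-percentile intensity and a co-occurrence correlation on small toy gray-level patches. It is not the feature-extraction software used in this study: the function names `percentile_90` and `glcm_correlation`, the horizontal one-pixel neighbourhood, and the toy patches are all assumptions made for illustration, and the NGTDM complexity feature is not reproduced here.

```python
from collections import Counter


def percentile_90(values):
    """90th percentile using linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = 0.9 * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def glcm_correlation(image):
    """Correlation of gray levels between horizontally adjacent pixels.

    Builds a symmetric co-occurrence count (each pair is counted in both
    directions), so both marginals share the same mean and variance.
    """
    pairs = Counter()
    for row in image:
        for a, b in zip(row, row[1:]):
            pairs[(a, b)] += 1
            pairs[(b, a)] += 1
    total = sum(pairs.values())
    mean = sum(i * n for (i, _), n in pairs.items()) / total
    var = sum((i - mean) ** 2 * n for (i, _), n in pairs.items()) / total
    if var == 0:
        return 1.0  # assumed convention for a perfectly uniform patch
    cov = sum((i - mean) * (j - mean) * n for (i, j), n in pairs.items()) / total
    return cov / var


# Toy patches (hypothetical gray levels, not patient data).
smooth_patch = [[1, 1, 2, 2], [1, 1, 2, 2]]      # slowly varying intensities
checker_patch = [[0, 9, 0, 9], [9, 0, 9, 0]]     # abrupt alternation

print(percentile_90([v for row in smooth_patch for v in row]))
print(glcm_correlation(smooth_patch))    # positive: neighbours are similar
print(glcm_correlation(checker_patch))   # negative: neighbours differ sharply
```

In this toy setting, the slowly varying patch yields a higher correlation than the alternating patch, which mirrors the interpretation that more homogeneous lesions tend towards higher correlation values.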
The differences in these three radiomic features between SCLC and NSCLC can be interpreted as follows: SCLC typically presents with uniform density and rapid growth, corresponding to a lower complexity value and a higher correlation value, metrics that reflect density and uniformity. In contrast, NSCLC is characterized by pronounced heterogeneity, often accompanied by necrotic vacuoles (areas of cell death). Specifically, this results in a notable increase in T90Percentile and complexity, while the correlation diminishes. Utilizing the radiomic features, we developed a radiomic model that exhibits robust classification performance, achieving AUC values of 0.840 and 0.960 in the training and test cohorts, respectively. The integrated clinical-radiomics model demonstrated statistically significant superiority over both the individual clinical and radiomics models. It also performed well in the external validation cohort, with an AUC of 0.843. Furthermore, KS statistics and DCA substantiated the enhanced predictive capability of the combined model in comparison to the separate clinical and radiomics models. KS statistics and DCA offer valuable insights beyond traditional performance metrics such as discrimination and calibration, providing an evaluation of the clinical impact of applying the model across a range of decision thresholds.
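The evaluation metrics named above can be sketched as follows. This is an illustrative calculation on made-up labels and predicted probabilities, not a reproduction of the analysis in this study; the function names `ks_statistic`, `brier_score`, and `net_benefit`, the toy data, and the 0.5 threshold are assumptions chosen only for demonstration.

```python
def ks_statistic(y_true, y_prob):
    """Largest gap between the cumulative score distributions of the two classes."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    best = 0.0
    for t in sorted(set(y_prob)):
        cdf_pos = sum(p <= t for p in pos) / len(pos)
        cdf_neg = sum(p <= t for p in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit of treating cases with probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical labels (1 = SCLC, 0 = NSCLC) and model probabilities.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.1]

print(ks_statistic(labels, probs))
print(brier_score(labels, probs))
# Compare the model with the "treat all" strategy at one threshold.
print(net_benefit(labels, probs, 0.5))
print(net_benefit(labels, [1.0] * len(labels), 0.5))
```

A decision curve is obtained by repeating the net benefit calculation over a range of thresholds and comparing the model against the "treat all" and "treat none" (net benefit of zero) strategies.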
The present study is subject to several limitations that warrant consideration. Firstly, due to the rarity of stage III peripheral SCLC, only two institutions were included in the study, and no power calculations were performed prior to the study. Secondly, the focus on patients with postoperative pathological results may introduce selection bias. Thirdly, the study exclusively assessed the primary tumor lesion without evaluating lymph node status. Another key limitation of this study is the absence of direct comparison or integration with clinical outcomes or prospective patient data. Future validation of this model will be conducted through a multicenter prospective study, followed by subsequent optimization.
Conclusion
Our research effectively established a model capable of differentiating between stage III peripheral SCLC and NSCLC by integrating preoperative clinicopathological, radiomic, and combined clinical-radiomic features. Moreover, the clinical-radiomic model exhibited robust predictive capabilities, indicating its potential utility in clinical settings. Within the context of contemporary pathological diagnostics, our methodologies represent the most precise approach for diagnosing stage III peripheral SCLC and NSCLC through postoperative tumor analysis. This model holds promise for substantially improving the accuracy of treatment planning for patients with advanced-stage cancers or those who may require neoadjuvant therapy in the future.
Footnotes
Abbreviations
Ethics Approval and Consent to Participate
The Xing Tai People's Hospital ethical review board approved this retrospective analysis and waived informed consent requirements (Ethics Committee of Xing Tai People's Hospital, reference number: 2023 (124), dated 2023.11.15). The study was also approved by the Ethics Committee of the Fourth Hospital of Hebei Medical University (reference number: 2022KS017, dated 2022.6.27).
Consent for Publication
We confirm that there has been no publication, submission, or acceptance elsewhere of the manuscript other than this journal. All potentially identifiable images or data in this article were published with the written consent of the individuals involved.
Authors’ Contributions
JJ Z and LG H performed the experiments and wrote the manuscript. QX Z, LN Z, Q X and FX G were responsible for designing the experiments. All authors read and approved the final version of this submitted manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Key development plan of Xingtai (2023ZC049).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Availability of Data and Materials
The datasets produced and examined in the present investigation are not accessible to the public at this time due to ongoing analysis for future publications, although they can be obtained from the corresponding author on reasonable request.
