Individualized Prediction of Radiation Pneumonitis Using RP-GAN: Leveraging Global Lung Features and Explainable Artificial Intelligence

Abstract

Introduction

This study aims to develop an individualized risk prediction model for radiation pneumonitis (RP) based on unsupervised image feature learning. A deep convolutional generative adversarial network (DCGAN) was utilized to automatically extract features from computed tomography (CT) images.

Methods

A retrospective analysis was conducted on 180 lung cancer patients treated with volumetric modulated arc therapy (VMAT) at Kaohsiung Veterans General Hospital between 2015 and 2022. To mitigate clinical sample size limitations, rotation-based augmentation was employed to expand the training dataset. The pretreatment CT images were processed into three input configurations: whole-lung, V_5Gy dose regions, and V_20Gy dose regions. An unsupervised feature extraction model, designated RP-GAN, was constructed to capture latent representations associated with RP risk. High-dimensional features were refined via least absolute shrinkage and selection operator (LASSO) and integrated into a stacking ensemble learning framework (including RF, SVM, KNN, XGBoost, and LR). Model stability and generalization were validated through 10-fold cross-validation alongside an independent test set, while clinical interpretability was ensured using Grad-CAM and LIME.

Results

The whole-lung input model demonstrated superior performance, achieving an AUC of 0.856 and an accuracy of 0.861, with a recall of 0.778. In contrast, models restricted to V_20Gy dose regions showed a marked decline in sensitivity, with the recall decreasing to 0.273. XAI visualization confirmed that the model focused not only on the tumor bed but also on the peritumoral parenchyma and contralateral lung.

Conclusion

The proposed RP-GAN architecture effectively captures subtle textural changes across the whole lung microenvironment without requiring manual annotations. This framework provides a robust tool for individualized RP risk assessment, facilitating the optimization of radiation therapy plans.

Keywords

radiation pneumonitis (RP); deep convolutional generative adversarial network (DCGAN); stacking ensemble learning; explainable artificial intelligence (XAI); risk prediction model

1. Introduction

Radiation therapy (RT) is a cornerstone modality in the clinical management of lung cancer and is capable of achieving substantial tumor control and improving survival outcomes. However, radiation pneumonitis (RP) caused by inadvertent irradiation of normal lung parenchyma remains a serious complication, with clinical manifestations ranging from dry cough and dyspnea to irreversible pulmonary fibrosis, thereby posing a significant threat to patient prognosis and quality of life. Owing to the combined influence of dose distribution, age, and host-related physiological factors, the development of RP exhibits marked inter-individual variability. Although commonly used dose–volume parameters (such as V_5Gy and V_20Gy) have been shown to correlate with the occurrence of RP, single or limited indices are insufficient to capture its inherent complexity, thus constraining the precision of individualized risk prediction.^1-3

Most existing RP risk prediction studies rely on supervised learning and heavily depend on manually contoured clinical structures delineated by medical physicists. However, the pathogenesis of RP involves widespread and complex lung responses that may not be fully encompassed by pre-defined annotated regions. In addition, the clinical implementation of deep learning faces two major challenges: first, the substantial time and subjective variability associated with high-quality manual annotation; second, as highlighted by Guo et al (2024), the heterogeneity of data across different imaging scanners often leads to performance degradation and undermines the robustness of model deployment in real-world practice.^4,5

Motivated by the need for a more individualized and clinically scalable approach to radiation pneumonitis (RP) risk assessment, we sought to move beyond conventional dose–volume metrics and annotation-dependent supervised models. Although parameters such as V_5Gy and V_20Gy remain clinically useful, they provide only limited summaries of radiation exposure and may not adequately capture the complex pulmonary background that modulates individual RP susceptibility. At the same time, most existing deep learning approaches depend on manually defined structures, which are labor-intensive to generate and may miss latent risk-related signals distributed outside pre-specified regions.

To address these gaps, this study proposes an unsupervised feature learning strategy based on a deep convolutional generative adversarial network (DCGAN) and develops a model termed RP-GAN. The motivation behind this framework is to determine whether whole-lung imaging contains clinically meaningful latent information that can improve RP prediction without requiring manual annotations. By automatically learning texture, structural, and spatial distribution patterns from CT images, RP-GAN is designed to capture lung-wide features potentially associated with subclinical inflammatory responses and inter-individual susceptibility.

In recent years, explainable artificial intelligence (XAI) has become essential for improving transparency and trust in cancer detection systems. Prior studies emphasize that medical AI should move beyond accuracy-driven models toward clinically traceable decision support, particularly in high-stakes oncological settings.^6,7 To improve decision transparency and clinical trustworthiness, XAI techniques, including gradient-weighted class activation mapping (Grad-CAM) and local interpretable model-agnostic explanations (LIME), were incorporated to visualize model attention regions and quantify their contributions to predictions. Furthermore, because no single classifier may be sufficient to model the heterogeneity of RP risk, a stacking ensemble learning framework was adopted to integrate the complementary strengths of multiple base learners, such as random forest (RF) and support vector machine (SVM), with the aim of improving generalizability and stability in RP risk prediction.^8-10

The central hypothesis of this study is that RP risk arises from the interaction between the global pulmonary microenvironment and the three-dimensional dose distribution rather than being driven solely by specific high-dose subregions. The scientific problem addressed in this study is whether RP risk can be adequately characterized by conventional dose-restricted regions alone, or whether clinically meaningful risk signals are distributed across the whole lung and embedded in the global pulmonary microenvironment. Because RP is biologically heterogeneous and spatially diffuse, models based only on predefined local regions may fail to capture latent susceptibility patterns relevant to individualized prediction. Accordingly, three image input configurations—whole-lung, V_5Gy, and V_20Gy regions—are designed to evaluate the necessity of incorporating full-lung information for enhancing predictive performance.^11-13 In summary, by integrating the RP-GAN, stacking ensemble learning, and XAI techniques, this work establishes a clinically oriented risk prediction framework intended to assist clinicians in precise individualized RP risk assessment, thereby facilitating treatment plan optimization and reducing the incidence of RP.

2. Materials and Methods

2.1. Study Framework

This study was developed by optimizing a DCGAN architecture, hereafter referred to as the RP-GAN. The overall workflow, illustrated in Figure 1, comprises four major stages: data preprocessing, RP-GAN model training, feature engineering, and the construction and evaluation of the RP risk prediction model. First, image format conversion, window setting, centering, and normalization were performed. The images were then divided into whole-lung regions and dose-restricted regions (V_5Gy and V_20Gy) as separate inputs to investigate the impact of different regions on feature extraction and model performance. RP-GAN was subsequently used to perform unsupervised feature learning on unlabeled images, and the resulting features were processed with the least absolute shrinkage and selection operator (LASSO) for feature selection and dimensionality reduction.^14,15 The selected features were fed into multiple base classifiers for prediction, and a stacking ensemble was used to integrate their outputs and build the RP risk prediction model. Finally, gradient-weighted class activation mapping (Grad-CAM; Selvaraju et al, USA) and local interpretable model-agnostic explanations (LIME; Ribeiro et al, USA) were applied to visualize model attention regions, and multiple evaluation metrics were used to assess model performance. This study was reported in accordance with the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidelines.^16

Figure 1.

Workflow of the proposed RP risk prediction and explainability framework. The overall framework is divided into four core stages: (1) Data preprocessing – including CT image resampling and extraction of whole-lung and dose-restricted regions; (2) RP-GAN feature extraction – unsupervised representation learning with a deep convolutional generative adversarial network to automatically derive high-dimensional lung features; (3) Feature engineering – selection of key predictors using the LASSO operator; and (4) Model training and evaluation – construction of a stacking ensemble classifier and application of XAI visualization techniques to enable clinically interpretable RP risk assessment and transparent decision-making. Abbreviation: CT: Computed Tomography, RP-GAN: Radiation Pneumonitis Generative Adversarial Network, Grad-CAM: Gradient-weighted Class Activation Mapping, LIME: Local Interpretable Model-agnostic Explanations, LASSO: Least Absolute Shrinkage and Selection Operator, SVM: Support Vector Machine, KNN: K Nearest Neighbor, RF: Random Forest, XGB: Extreme Gradient Boosting, AUC: Area Under the ROC Curve, ROC: Receiver Operating Characteristic, ACC: Accuracy, PPV: Positive Predictive Value, NPV: Negative Predictive Value

2.2. Data Collection

This retrospective study was approved by the Institutional Review Board of Kaohsiung Veterans General Hospital (approval number: KSVGH25-CT1-06, approval date: December 17, 2024). The requirement for informed consent was waived by the Institutional Review Board because all data were retrospectively collected and fully de-identified prior to analysis. All procedures complied with relevant ethical guidelines, and all patient data were de-identified and re-encoded to ensure privacy and data security. A total of 180 lung cancer patients who received volumetric modulated arc therapy (VMAT) between 2015 and 2022 were included, and their pre-treatment thoracic CT images were collected for analysis. Tumor staging was performed according to the American Joint Committee on Cancer (AJCC) TNM staging system. Because this work focused primarily on developing an image-based risk prediction model, clinical variables were not used as model inputs; however, commonly used clinical parameters are summarized in Table 1 to provide an overview of the study cohort.

Table 1.

Baseline Clinical Characteristics and Radiotherapy Parameters of the Study Cohort

	Total	Without RP	With RP
	N = 180	N = 133	N = 47
Age
Range	39 - 96	39 - 96	43 - 90
(Mean ± SD)	68.61 ± 10.79	68.44 ± 10.87	69.09 ± 10.65
Gender
Male	132 (73.3%)	95 (52.8%)	37 (20.5%)
Female	48 (26.7%)	38 (21.1%)	10 (5.6%)
BMI
Range	9.59 - 40.31	13.66 - 40.31	9.59 - 34.42
(Mean ± SD)	23.42 ± 4.38	23.01 ± 4.21	24.58 ± 4.68
T Stage
T1	32 (17.8%)	24 (13.4%)	8 (4.4%)
T2	54 (30.0%)	37 (20.6%)	17 (9.4%)
T3	41 (22.8%)	30 (16.7%)	11 (6.1%)
T4	53 (29.4%)	42 (23.3%)	11 (6.1%)
N Stage
N0	52 (28.9%)	39 (21.7%)	13 (7.2%)
N1	25 (13.9%)	19 (10.6%)	6 (3.3%)
N2	65 (36.1%)	51 (28.3%)	14 (7.8%)
N3	38 (21.1%)	24 (13.3%)	14 (7.8%)
Chemotherapy
No	18 (10.0%)	12 (6.7%)	6 (3.3%)
Yes	162 (90.0%)	121 (67.2%)	41 (22.8%)

This table summarizes the baseline characteristics of 180 patients with lung cancer, including demographic variables (age, sex, and BMI), tumor stage, and chemotherapy. By comparing the distributions between the RP and non-RP groups, this table provides the clinical context for subsequent individualized risk assessment. Abbreviations: RP: Radiation Pneumonitis, BMI: Body Mass Index.

Samples for the RP prediction model were obtained from the treatment planning system (TPS), including 180 lung cancer patients treated with VMAT between 2015 and 2022, together with their pre-treatment simulation CT scans. Data were retrieved from electronic medical records and TPS archives and anonymized prior to analysis. Simulation CT images were acquired via a GE Discovery CT590 RT scanner (GE Healthcare, Waukesha, WI, USA; 512 × 512 matrix, 2.5 mm slice thickness). Treatment planning was performed via Pinnacle³ v9.14 (Philips Healthcare, Fitchburg, WI, USA) or Eclipse v13 (Varian Medical Systems, Palo Alto, CA, USA), and radiation was delivered via Elekta Synergy or Versa HD linear accelerators (Elekta AB, Stockholm, Sweden). The materials used in this study included DICOM CT images, RT dose files, and clinical parameters, and the occurrence of RP was determined from clinical follow-up records and used as the reference label for model development. All the cases were split according to the number of image slices into training and test sets at an 8:2 ratio, with the detailed distributions summarized in Table 2.

Table 2.

Distribution of the Study Cohort and Imaging Data Across Datasets

Number of patients
Class	Train set	Test set	Total
Non-RP	106	27	133
RP	38	9	47
Number of images
Class	Train set	Test set	Total
Non-RP	6400	1600	8000
RP	2141	536	2677

This table details the distribution of 180 patients enrolled at Kaohsiung Veterans General Hospital (KSVGH), stratified at both the patient level and the image-slice level. Cases with and without radiation pneumonitis (RP) were allocated to the training and test sets at an 8:2 ratio. This large-scale imaging dataset provides a sufficient foundation for feature evolution during the unsupervised training of RP-GAN. Abbreviation: RP: Radiation Pneumonitis.
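The patient-level allocation in Table 2 can be illustrated with a minimal sketch. It assumes the 8:2 split is stratified by RP status and made per patient, so that all slices from one patient fall in the same set; the function name `stratified_patient_split`, the random seed, and the rounding rule are illustrative assumptions, not the study's actual procedure.

```python
import numpy as np

def stratified_patient_split(labels, test_fraction=0.2, seed=0):
    """Split patient indices into train/test sets, preserving the class ratio."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        # Shuffle the patients of this class, then reserve a fraction for testing.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = int(round(test_fraction * len(idx)))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.sort(train_idx), np.sort(test_idx)

# Cohort sizes from Table 2: 133 non-RP (label 0) and 47 RP (label 1) patients.
labels = np.array([0] * 133 + [1] * 47)
train_idx, test_idx = stratified_patient_split(labels)
```

With these cohort sizes, rounding 20% per class gives 27 non-RP and 9 RP test patients, matching the patient counts in Table 2. Slice-level sets would then be formed by collecting every slice belonging to the patients in each index list.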

2.3. Assessment of Radiation Pneumonitis

RP is a common complication in lung cancer patients undergoing radiotherapy and is triggered by radiation-induced damage to normal lung tissue. In the early phase, such radiation injury leads to damage to alveolar cells and acute inflammatory responses, which may progress over time to irreversible fibrotic and other structural changes. The development of RP is driven by multiple factors, including dose distribution, treatment volume, and the patient’s pre-existing pulmonary conditions. Although modern techniques such as VMAT have substantially reduced the incidence of severe toxicity, a comprehensive capture of early and mild pulmonary reactions remains clinically important. In this study, RP was defined as any case with RP of Grade 1 or higher, according to the Radiation Therapy Oncology Group (RTOG) grading criteria, to ensure the inclusion of subtle and early manifestations. The typical imaging features of RP, as highlighted by the red circle in Figure 2A, were contrasted with the normal lung appearance shown in Figure 2B.

Figure 2.

Representative CT appearances of radiation pneumonitis (RP). (A) RP image: The red circle highlights the irradiated lung region showing ground-glass opacities and interstitial thickening, consistent with acute inflammatory changes. (B) Normal lung image: The lung parenchyma demonstrates preserved vascular markings and normal aerated spaces. In this study, cases with RP of grade 1 or higher according to the RTOG criteria were classified into the RP group, enabling the model to capture early, patient-specific subclinical abnormalities that may already be visually appreciable on CT.

RP grading in this cohort was performed according to the RTOG criteria (Table 3). Two radiation oncologists independently reviewed patients’ clinical symptoms and imaging findings to diagnose and grade RP, and discrepancies were resolved by consensus.

Table 3.

RTOG Radiation Pneumonitis Grading Scale

Grade	Clinical manifestations
Grade 0	Asymptomatic, with no abnormal findings on imaging
Grade 1	Mild dry cough or mild dyspnea; imaging may show focal ground-glass opacities
Grade 2	Persistent dry cough or dyspnea; imaging demonstrates localized pulmonary consolidation or mild fibrosis
Grade 3	Severe dyspnea; imaging reveals extensive pulmonary consolidation or fibrosis
Grade 4	Life-threatening dyspnea; imaging shows widespread pulmonary fibrosis or consolidation
Grade 5	Death; imaging demonstrates diffuse pulmonary lesions with extensive fibrosis and consolidation

This table summarizes the clinical symptoms and imaging features associated with RP from grade 0 to grade 5 according to the Radiation Therapy Oncology Group (RTOG) criteria (e.g., ground-glass opacities and radiation-induced pulmonary fibrosis). In this study, cases with RP of grade 1 or higher were classified into the RP group in order to capture subtle, early subclinical changes and thereby improve the clinical sensitivity of individualized risk prediction. Abbreviation: RTOG: Radiation Therapy Oncology Group.

2.4. Image Preprocessing

Pretreatment thoracic CT images were standardized through a multi-stage preprocessing pipeline to ensure computational consistency and data traceability. Original DICOM data were batch-processed and converted into PNG format using a systematic indexing convention to facilitate precise cross-referencing between model predictions and anatomical locations. Lung regions of interest were extracted using radiotherapy structure files to restrict analysis to the pulmonary parenchyma. To match the output range of the Tanh activation layers in RP-GAN, all images were centered and pixel intensities were normalized to the range [-1, 1].
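The windowing and normalization step can be sketched as follows. The window center and width used here (-600 HU and 1500 HU, a common lung window) are assumptions for illustration; the window settings used in the study are not stated, and the function name `window_and_normalize` is hypothetical.

```python
import numpy as np

def window_and_normalize(hu, center=-600.0, width=1500.0):
    """Clip a CT slice (in HU) to a display window and rescale it to [-1, 1]."""
    lo, hi = center - width / 2.0, center + width / 2.0
    clipped = np.clip(hu.astype(np.float32), lo, hi)
    # Map [lo, hi] linearly onto [-1, 1], the output range of a Tanh layer.
    return 2.0 * (clipped - lo) / (hi - lo) - 1.0

# Toy 2 x 2 "slice" with air, lung, soft-tissue and bone-like values (assumed).
slice_hu = np.array([[-1000.0, -600.0], [0.0, 400.0]])
normalized = window_and_normalize(slice_hu)
```

Values above or below the window saturate at +1 or -1, so the generator output and the real inputs share the same intensity range.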

Three input configurations were constructed: whole-lung images, V_5Gy dose regions, and V_20Gy dose regions. The latter two were based on clinical lung dose distributions to evaluate dose-relevant features. Systematic reviews by Keffer et al reported significant associations between RP risk and the proportion of lung volume exposed to both low- and high-dose ranges, with V_20Gy being a widely used clinical dose constraint in radiotherapy planning and V_5Gy also demonstrating a significant correlation with RP risk. Guided by this clinical evidence, the V_5Gy and V_20Gy regions were selected as dose-based high-risk controls to evaluate the performance of the proposed unlabeled feature learning framework in the absence of explicit clinical structure information.^17 All other preprocessing steps were kept identical to those used for models trained with the whole-lung input.
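One way to derive the dose-restricted inputs is to threshold the planning dose inside the lung contour. The sketch below assumes the dose grid has already been resampled onto the CT pixel grid and that pixels outside the region are set to the minimum normalized intensity; both are assumptions, and `dose_region` is a hypothetical helper rather than the study's code.

```python
import numpy as np

def dose_region(image, dose_gy, lung_mask, threshold_gy, background=-1.0):
    """Keep lung pixels receiving at least threshold_gy; blank out the rest."""
    keep = lung_mask & (dose_gy >= threshold_gy)
    return np.where(keep, image, background)

# Toy 2 x 2 example: normalized image, dose in Gy, and a boolean lung mask.
image = np.array([[0.2, 0.4], [0.6, 0.8]])
dose_gy = np.array([[2.0, 6.0], [21.0, 30.0]])
lung_mask = np.array([[True, True], [True, False]])

v5_input = dose_region(image, dose_gy, lung_mask, threshold_gy=5.0)
v20_input = dose_region(image, dose_gy, lung_mask, threshold_gy=20.0)
```

The whole-lung configuration corresponds to applying the lung mask alone, without a dose threshold.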

2.5. Development of the RP-GAN and Data Augmentation

The workflow of the RP-GAN is illustrated in Figure 3. To mitigate clinical sample size limitations and alleviate the impact of class imbalance, strategic data augmentation was applied to the training set. Specifically, we employed rotation-based transformation within a range of ±10° exclusively for the minority class (RP-positive samples). Through adversarial training, the generator and discriminator were iteratively optimized until reaching equilibrium. The model was implemented in Python 3.10 (Python Software Foundation, Wilmington, DE, USA) via TensorFlow 2.0 (Google LLC, Mountain View, CA, USA), with training settings including a batch size of 32 and 100 epochs optimized via the Adam optimizer. The trained model was subsequently employed for RP risk prediction.
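The minority-class rotation augmentation can be sketched in NumPy as below. Nearest-neighbor resampling, the fill value of -1, and the number of rotated copies per image (`n_copies`) are assumptions made for illustration; the interpolation scheme and augmentation factor used in the study are not specified.

```python
import numpy as np

def rotate_nearest(image, angle_deg, fill=-1.0):
    """Rotate a 2D image about its center using nearest-neighbor sampling."""
    h, w = image.shape
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    dy, dx = ys - cy, xs - cx
    # Inverse mapping: for every output pixel, look up its source coordinate.
    src_y = np.rint(cy + dy * np.cos(theta) - dx * np.sin(theta)).astype(int)
    src_x = np.rint(cx + dy * np.sin(theta) + dx * np.cos(theta)).astype(int)
    inside = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    out = np.full_like(image, fill)
    out[inside] = image[src_y[inside], src_x[inside]]
    return out

def augment_minority(images, labels, n_copies=2, max_angle=10.0, seed=0):
    """Append rotated copies (within +/- max_angle degrees) of RP-positive images only."""
    rng = np.random.default_rng(seed)
    new_images, new_labels = list(images), list(labels)
    for img, lab in zip(images, labels):
        if lab == 1:
            for angle in rng.uniform(-max_angle, max_angle, size=n_copies):
                new_images.append(rotate_nearest(img, angle))
                new_labels.append(1)
    return np.stack(new_images), np.array(new_labels)

# Toy data: three 16 x 16 images, of which one is RP-positive.
rng = np.random.default_rng(0)
images = rng.uniform(-1.0, 1.0, size=(3, 16, 16))
labels = np.array([0, 1, 0])
aug_images, aug_labels = augment_minority(images, labels)
```

Augmentation should be applied to the training set only, after the train/test split, so that rotated copies of a test patient never reach the training data.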

Figure 3.

Architecture of the RP-GAN model and schematic of unsupervised feature learning. The proposed framework is optimized from a DCGAN architecture and consists of a generator–discriminator adversarial training pipeline. The convolutional feature maps derived from the discriminator are used as latent representations, enabling the model to automatically learn lung-wide texture patterns and spatial structural features associated with radiation-induced injury from whole-lung images without relying on manual annotations.

2.5.1. Generator and Discriminator Architecture

The generator adopted a multi-layer transposed convolutional architecture, in which a 100-dimensional noise vector was progressively upsampled to produce 512 × 512 single-channel grayscale images. Starting from a 4 × 4 feature map, the spatial resolution was gradually increased to 128 × 128 through several intermediate layers, while the number of channels was reduced stepwise to refine the image details. Each layer utilized a 4 × 4 kernel to perform feature learning and spatial expansion, enabling the generated images to exhibit a coherent structure and fine-grained texture suitable for downstream model training. The detailed layer configuration, including feature map sizes and channel numbers, is shown in Figure 4A.

Figure 4.

RP-GAN deep learning architecture and multi-scale feature extraction scheme. (A) Generator evolution: A 100-dimensional random noise vector is progressively transformed through multiple transposed convolution (deconvolution) layers, reconstructing image details stepwise and expanding the feature maps to a final spatial resolution of 512 × 512. This pathway illustrates how latent noise is mapped into realistic lung-like images. (B) Discriminator convolutional pathway: Seven convolutional layers interleaved with pooling operations compress the input image into 28,672 latent features (multi-feature layer), which serve as the raw representations for subsequent unsupervised learning. This architecture enables high-dimensional characterization of lung texture and structure for RP risk modeling. Abbreviation: deconv: deconvolution, conv: convolution

The discriminator consists of seven convolutional layers and two max-pooling layers, which are designed to progressively extract deep features from the input images and perform real–fake classification. The input was a 512 × 512 single-channel grayscale image. Rather than employing automated convergence criteria or early stopping, the model was trained for a fixed number of epochs using binary cross-entropy (BCE) loss. To stabilize feature learning, a feature matching loss was integrated into the generator’s objective, minimizing the $L_{2}$ distance between real and synthetic intermediate feature maps to capture biologically representative pulmonary structures. Optimization was performed via the Adam optimizer (learning rate = $2 \times 10^{-4}$, $\beta_{1} = 0.5$). As the data propagate through the network, the spatial dimensions of the feature maps decrease while the number of channels increases, thereby enhancing the representation of local and high-level features. Max-pooling layers were inserted between specific convolutional layers to improve computational efficiency and feature condensation. The final feature maps were flattened and passed through fully connected layers to generate the classification output, indicating whether the input was real or generated. The architecture and feature map evolution are detailed in Figure 4B. In the Keras implementation, these convolutional layers were sequentially named “conv2d” through “conv2d_6” to facilitate subsequent feature extraction and layer-wise Grad-CAM analysis.^8,18 Importantly, the unsupervised feature extraction in this study utilized the intermediate representations (feature maps) from the discriminator’s convolutional layers.
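The two loss terms described above can be written compactly in NumPy. This is a sketch under stated assumptions: the feature matching term is taken as the squared L2 distance between batch-mean discriminator features, which is one common formulation, and the exact form and weighting used in RP-GAN are not given in the text. The function names are illustrative.

```python
import numpy as np

def bce_loss(y_true, y_prob, eps=1e-7):
    """Binary cross-entropy between real/fake targets and discriminator outputs."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_prob)
                          + (1.0 - y_true) * np.log(1.0 - y_prob)))

def feature_matching_loss(real_feats, fake_feats):
    """Squared L2 distance between the batch-mean features of real and generated images."""
    diff = real_feats.mean(axis=0) - fake_feats.mean(axis=0)
    return float(np.sum(diff ** 2))

# Toy intermediate features: rows are images in a batch, columns are features.
real_feats = np.array([[1.0, 2.0], [3.0, 4.0]])
fake_feats = np.array([[0.0, 0.0], [2.0, 2.0]])
fm = feature_matching_loss(real_feats, fake_feats)

# Discriminator outputs for two real (target 1) and two generated (target 0) images.
d_loss = bce_loss(np.array([1.0, 1.0, 0.0, 0.0]), np.array([0.9, 0.8, 0.2, 0.1]))
```

In training, the generator would minimize its adversarial BCE term plus the feature matching term, with both networks updated by Adam using the learning rate and $\beta_{1}$ reported above.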

2.6. Feature Engineering

2.6.1. Aggregation of Image Slices

The model inputs were constructed from features extracted from multiple CT slices per patient. To integrate slice-level information and build patient-level representations, mean pooling was applied to aggregate features across all slices from the same patient, yielding a single feature vector per case. This aggregation strategy reduces slice-to-slice variability and improves the consistency of patient-level predictions.
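The slice-to-patient aggregation amounts to averaging the feature rows that share a patient identifier. A minimal sketch follows; the array layout (one row per slice) and the helper name `aggregate_by_patient` are assumptions.

```python
import numpy as np

def aggregate_by_patient(slice_features, patient_ids):
    """Mean-pool slice-level feature rows into one feature vector per patient."""
    patient_ids = np.asarray(patient_ids)
    unique_ids = np.unique(patient_ids)
    pooled = np.stack([slice_features[patient_ids == pid].mean(axis=0)
                       for pid in unique_ids])
    return unique_ids, pooled

# Toy example: patient "A" has two slices, patient "B" has one.
slice_features = np.array([[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]])
patient_ids = ["A", "A", "B"]
ids, patient_features = aggregate_by_patient(slice_features, patient_ids)
```

Each patient then contributes exactly one row to the feature matrix passed to feature selection, regardless of how many slices were acquired.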

2.6.2. LASSO

To reduce feature dimensionality and model complexity, LASSO regression with L1 regularization was employed, driving the coefficients of noninformative features toward zero and retaining only features with meaningful predictive contributions. All features were standardized prior to training, and the regularization parameter λ was selected via cross-validation to mitigate overfitting and enhance model generalizability. The optimal λ was determined to be 0.0295, resulting in the retention of 66 features from the original 32,512 extracted features for subsequent model development.
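To make the selection mechanism concrete, the sketch below solves the LASSO objective $\frac{1}{2n}\lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_1$ by cyclic coordinate descent on synthetic data. It is illustrative only: the data, the fixed value of λ, and the iteration count are assumptions, and the study selected λ by cross-validation rather than fixing it.

```python
import numpy as np

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n)||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    residual = y.astype(float).copy()      # equals y - X @ w for w = 0
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            residual += X[:, j] * w[j]     # remove feature j's current contribution
            rho = X[:, j] @ residual / n
            w[j] = soft_threshold(rho, lam) / col_sq[j]
            residual -= X[:, j] * w[j]     # add the updated contribution back
    return w

# Synthetic data: only features 0 and 3 carry signal (assumed for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize, as described above
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=100)
y = y - y.mean()

w = lasso_coordinate_descent(X, y, lam=0.1)
selected = np.flatnonzero(w)               # indices of retained features
```

The soft-thresholding step is what sets weak coefficients exactly to zero, which is why LASSO performs feature selection rather than only shrinkage.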

2.7. Ensemble Learning: Stacking Classifier

The dataset was split into training and test sets at an 8:2 ratio. The test set was kept completely independent from model training and was used only for the final performance evaluation. A 10-fold cross-validation was applied to the training set. In each iteration, nine folds trained the model while one fold validated it. This process was repeated ten times to build the meta-feature matrix. The base learners (RF, SVM, KNN, and XGBoost) were trained sequentially, incorporating class weighting to address potential label imbalance, and their out-of-fold predictions were collected as inputs for the meta-learner, which was implemented via logistic regression (LR). The hyperparameters of the base learners were optimized via a randomized search combined with cross-validation, with the area under the ROC curve (AUC) serving as the primary evaluation metric. The overall stacking procedure is depicted in Figure 5.

Figure 5.

Stacking ensemble architecture and meta-learner fusion workflow. In the meta-feature generation stage, the training set is processed using 10-fold cross-validation to train four base learners—random forest (RF), support vector machine (SVM), k-nearest neighbors (KNN), and extreme gradient boosting (XGB)—and to construct a meta-feature matrix that captures their complementary prediction patterns. In the decision fusion stage, logistic regression (LR) is used as the meta-learner to perform a linear weighted combination of the predicted probabilities from all base models, aiming to reduce overfitting risk associated with any single classifier and to yield a stable final prediction of RP risk. Abbreviation: RF: Random Forest, SVM: Support Vector Machine, KNN: K Nearest Neighbor, XGB: Extreme Gradient Boosting, LR: Logistic Regression
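The meta-feature generation stage can be sketched as follows. Each column of the meta-feature matrix holds one base learner's predictions, and every prediction is made on a fold that the learner did not see during fitting. The two base learners here (a nearest-centroid scorer and a simple k-nearest-neighbor vote) are lightweight stand-ins chosen so the example is self-contained; they are not the RF, SVM, KNN, and XGBoost models used in the study, and the toy data are synthetic.

```python
import numpy as np

def out_of_fold_meta_features(X, y, base_learners, n_splits=10, seed=0):
    """Build a (n_samples, n_learners) matrix of held-out predictions."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_splits)
    meta = np.zeros((len(X), len(base_learners)))
    for val_idx in folds:
        train_mask = np.ones(len(X), dtype=bool)
        train_mask[val_idx] = False        # nine folds train, one fold validates
        for k, (fit, predict) in enumerate(base_learners):
            model = fit(X[train_mask], y[train_mask])
            meta[val_idx, k] = predict(model, X[val_idx])
    return meta

# Stand-in base learner 1: score by relative distance to the class centroids.
def fit_centroids(X, y):
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def predict_centroids(model, X):
    c0, c1 = model
    d0 = np.linalg.norm(X - c0, axis=1)
    d1 = np.linalg.norm(X - c1, axis=1)
    return d0 / (d0 + d1 + 1e-12)          # near 1 when closer to the class-1 centroid

# Stand-in base learner 2: fraction of positive labels among the 5 nearest neighbors.
def fit_knn(X, y):
    return X, y

def predict_knn(model, X, k=5):
    X_train, y_train = model
    dists = np.linalg.norm(X[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

# Synthetic two-class data (assumed for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(30, 2)),
               rng.normal(3.0, 1.0, size=(30, 2))])
y = np.array([0] * 30 + [1] * 30)

base_learners = [(fit_centroids, predict_centroids), (fit_knn, predict_knn)]
meta = out_of_fold_meta_features(X, y, base_learners)
```

A logistic regression meta-learner would then be fitted on `meta` and `y`. Because each row of `meta` comes from models that never saw that sample, the meta-learner is not trained on leaked predictions.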

2.8. Model Evaluation Methods

2.8.1. Evaluation of Predictive Performance (Confusion Matrix)

To comprehensively assess the performance of the proposed radiation pneumonitis (RP) prediction and diagnostic models, multiple evaluation metrics, including the area under the receiver operating characteristic curve (AUC), accuracy, positive predictive value (PPV), negative predictive value (NPV), specificity, recall (sensitivity), and F1-score, were employed. These metrics quantify both global and detailed aspects of model behavior in clinical classification tasks. All indices were derived from the binary confusion matrix, which is based on four fundamental quantities: true positives (TPs), false positives (FPs), false negatives (FNs), and true negatives (TNs).
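The threshold-based metrics follow directly from the four confusion-matrix counts, as the sketch below shows. The counts and probabilities in the example are hypothetical and do not correspond to any result in this study. The Brier score reported in Table 4 is included using its standard definition, the mean squared difference between predicted probability and observed outcome. AUC is omitted because it is computed from ranked prediction scores rather than from a single confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive threshold-based metrics from binary confusion-matrix counts."""
    recall = tp / (tp + fn)                # sensitivity
    ppv = tp / (tp + fp)                   # precision
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": ppv,
        "npv": tn / (tn + fn),
        "specificity": tn / (tn + fp),
        "recall": recall,
        "f1": 2 * ppv * recall / (ppv + recall),
    }

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Hypothetical counts and probabilities, for illustration only.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=28)
brier = brier_score([1, 0], [0.8, 0.3])
```

Note that specificity and PPV share the false-positive count but use different denominators (all actual negatives versus all predicted positives), so the two generally differ.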

2.8.2. Visualization With Explainable AI (XAI) Models

Grad-CAM was adopted as one of the explainable AI techniques to generate heatmaps by combining convolutional feature maps with class score information, thereby visualizing the image regions to which the model attends during decision-making. To further enhance interpretability at the output level, local interpretable model-agnostic explanations (LIME) were applied for localized analysis. LIME segments an image into superpixel regions and randomly occludes subsets of these regions to create perturbed samples; then, it evaluates changes in model outputs to estimate the contribution of each region to the prediction. In contrast to Grad-CAM, which emphasizes feature-level explanations, LIME focuses on local interpretability at the output layer; together, the two methods provide complementary insights into model behavior.^19,20
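The core Grad-CAM computation can be sketched in NumPy, assuming the activations of one convolutional layer and the gradients of the class score with respect to those activations have already been obtained from the network. The array shapes and the function name `grad_cam` are assumptions; upsampling the map to the input resolution and overlaying it on the CT image are omitted.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Combine feature maps (H, W, K) with class-score gradients (H, W, K)."""
    # Channel weights: gradients averaged over the spatial dimensions.
    weights = gradients.mean(axis=(0, 1))
    # Weighted sum of the feature maps, followed by ReLU to keep positive evidence.
    cam = np.maximum((activations * weights).sum(axis=-1), 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy layer with two 2 x 2 feature maps (assumed values).
activations = np.zeros((2, 2, 2))
activations[0, 0, 0] = 1.0                 # channel 0 responds at the top-left pixel
activations[1, 1, 1] = 1.0                 # channel 1 responds at the bottom-right pixel
gradients = np.zeros((2, 2, 2))
gradients[..., 0] = 2.0                    # channel 0 raises the class score
gradients[..., 1] = -1.0                   # channel 1 lowers the class score

heatmap = grad_cam(activations, gradients)
```

Only the location driven by the positively weighted channel remains in the heatmap; the negatively weighted channel is removed by the ReLU. Repeating the computation for each named convolutional layer yields a layer-wise view of model attention.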

3. Results

3.1. Predictive Performance of RP Risk Models and Pathophysiological Interpretation

As summarized in Table 4, using whole-lung images as input yielded the best predictive performance across all experimental settings. The stacking ensemble model achieved the highest area under the ROC curve (AUC = 0.856) and accuracy (0.861), with a recall of 0.778 and an F1-score of 0.737, clearly outperforming all single classifiers. Statistical significance testing (Supplementary Table S1) showed that the whole-lung configuration had numerically superior performance compared to the V_5Gy and V_20Gy configurations in key metrics including AUC and recall, although the differences did not reach statistical significance (p > 0.05). Its calibration curves are shown in Supplementary Figure S1.

Table 4.

Summary of Classification Performance Under Different Image Input Configurations

Whole-lung input
Classifier	AUC	Accuracy	PPV	NPV	Specificity	Recall	F1-score	Brier score
SVM	0.846	0.806	0.625	0.857	0.625	0.556	0.588	0.154
RF	0.819	0.778	0.571	0.828	0.571	0.444	0.500	0.152
KNN	0.819	0.778	0.600	0.807	0.600	0.333	0.429	0.138
XGB	0.679	0.722	0.429	0.793	0.429	0.333	0.375	0.197
Stacking	0.856	0.861	0.700	0.923	0.700	0.778	0.737	0.142

V_5Gy region input
Classifier	AUC	Accuracy	PPV	NPV	Specificity	Recall	F1-score	Brier score
SVM	0.669	0.769	0.667	0.778	0.667	0.200	0.308	0.206
RF	0.566	0.692	0.333	0.758	0.333	0.200	0.250	0.223
KNN	0.676	0.769	0.667	0.778	0.667	0.200	0.308	0.187
XGB	0.679	0.692	0.400	0.793	0.400	0.400	0.400	0.204
Stacking	0.700	0.769	0.600	0.794	0.600	0.300	0.400	0.198

V_20Gy region input
Classifier	AUC	Accuracy	PPV	NPV	Specificity	Recall	F1-score	Brier score
SVM	0.653	0.692	0.333	0.722	0.333	0.091	0.143	0.254
RF	0.520	0.667	0.333	0.727	0.333	0.182	0.235	0.222
KNN	0.609	0.641	0.200	0.706	0.091	0.200	0.125	0.228
XGB	0.601	0.590	0.308	0.731	0.308	0.364	0.333	0.250
Stacking	0.653	0.667	0.375	0.742	0.375	0.273	0.316	0.265

This table presents the core outcome metrics of the study, summarizing AUC, accuracy, recall, and F1-score for comprehensive performance evaluation. The results show that the stacking architecture achieves the best performance when using whole-lung images as input and quantitatively demonstrate that, when image information is restricted to the V_20Gy high-dose region, the model’s sensitivity (recall) for identifying high-risk cases declines markedly, underscoring the pivotal role of whole-lung information in RP risk prediction. Abbreviation: AUC: Area Under the ROC Curve, ROC: Receiver Operating Characteristic, PPV: Positive Predictive Value, NPV: Negative Predictive Value, SVM: Support Vector Machine, RF: Random Forest, KNN: K Nearest Neighbor, XGB: Extreme Gradient Boosting, RP: Radiation Pneumonitis.

In contrast, when the input was restricted to the V_5Gy or V_20Gy regions, the model performance deteriorated markedly; under the V_20Gy configuration, the AUC of the stacking model decreased to 0.653, and its recall plummeted to 0.273. These findings highlight the limitations of models trained solely on local dose subregions and demonstrate that neglecting global lung anatomy severely compromises the ability to capture subclinical injury signals, thereby reducing the precision of individualized RP risk stratification.

This performance gap is strongly supported by radiobiological mechanisms. RP is increasingly recognized as a complex, lung-wide inflammatory process rather than a purely focal tissue injury. Although the V_20Gy region receives the highest radiation dose, previous studies have shown that radiation can trigger the release of pro-inflammatory cytokines—such as transforming growth factor-β (TGF-β), interleukin-1 (IL-1), and interleukin-6 (IL-6)—through bystander and abscopal effects, thereby altering the microenvironment even in non-irradiated lung regions. The proposed RP-GAN framework is capable of capturing subtle textural and structural alterations distributed throughout the lungs, which likely correspond to subclinical inflammation that is not readily discernible by visual inspection. When analysis is confined only to high-dose regions, the model effectively loses access to the global “inflammatory background,” resulting in a markedly reduced ability to identify high-risk patients.

Moreover, individual tolerance to radiation is profoundly influenced by the baseline lung microenvironment, including pre-existing pulmonary conditions such as chronic lung disease, emphysema, or structural remodeling, which are spatially distributed across the entire lung. Whole-lung images preserve this baseline status and thus encode susceptibility factors that shape RP risk at the patient level. Consistent with this concept, explainability analyses via Grad-CAM and LIME revealed that model attention was not confined to the tumor region but also extended to the surrounding and contralateral lung parenchyma, indicating that RP risk emerges from the interaction between the three-dimensional dose distribution and the global pulmonary microenvironment. These results further validate the proposed unlabeled feature learning strategy, demonstrating its ability to capture latent features closely related to individual physiological responses and to deliver more accurate personalized risk prediction than traditional dose-volume metrics alone.
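The metrics discussed in this section can be related to one another through a confusion matrix. The following is a minimal sketch in Python. The counts are illustrative assumptions, not values taken from Table 4; they were chosen only so that the resulting recall, accuracy, and F1 score round to the whole-lung stacking values quoted in the text. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute standard binary classification metrics from confusion counts."""
    recall = tp / (tp + fn)        # sensitivity: share of true RP cases detected
    precision = tp / (tp + fp)     # positive predictive value (PPV)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision,
            "accuracy": accuracy, "f1": f1}


# Illustrative counts only (assumed, not reported in the paper).
metrics = classification_metrics(tp=7, fn=2, fp=3, tn=24)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because recall depends only on the true positives and false negatives, a model can keep a reasonable accuracy while missing most high-risk patients, which is the pattern described for the V_20Gy configuration.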

3.2. Convolutional Attention Patterns and Interpretability Analysis of the RP Risk Model

Figure 6A shows the Grad-CAM visualizations of the RP risk prediction model across different convolutional layers. In the early convolutional layers (first to fourth layers), feature extraction focuses primarily on the tumor and its immediately adjacent regions, showing substantial spatial overlap with the clinically contoured tumor volume. As the convolutional depth increases, the model’s attention gradually expands from the local lesion to the surrounding lung parenchyma and eventually extends into the contralateral lung, indicating that the model integrates spatial information from the entire lung rather than relying solely on focal tumor features during decision-making.
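The layer-wise heatmaps described above follow the general Grad-CAM recipe: each feature map of a chosen convolutional layer is weighted by the spatial average of the gradient of the class score with respect to that map, the weighted maps are summed, and a ReLU keeps only positive evidence. The sketch below illustrates that computation in plain Python on toy values. The function name `grad_cam` and the toy arrays are assumptions for illustration; in practice the activations and gradients would be read from the trained network, and this is not the authors' implementation.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from K feature maps and their gradients.

    activations, gradients: lists of K maps, each an H x W nested list.
    Returns an H x W heatmap scaled to [0, 1].
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = global average of the gradient over spatial positions.
    alphas = [sum(sum(row) for row in grad) / (height * width)
              for grad in gradients]
    heatmap = [[0.0] * width for _ in range(height)]
    for alpha, fmap in zip(alphas, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += alpha * fmap[i][j]
    # ReLU keeps only regions that push the prediction toward the class.
    heatmap = [[max(0.0, value) for value in row] for row in heatmap]
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap


# Toy example (assumed values): channel 0 supports the class, channel 1 opposes it.
toy_activations = [[[1.0, 0.0], [0.0, 0.0]],
                   [[0.0, 0.0], [0.0, 2.0]]]
toy_gradients = [[[0.5, 0.5], [0.5, 0.5]],
                 [[-1.0, -1.0], [-1.0, -1.0]]]
toy_heatmap = grad_cam(toy_activations, toy_gradients)
```

Applying the same computation to successively deeper layers yields the sequence of maps compared in Figure 6A.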

Figure 6.

XAI-based visualization of feature evolution from local lesions to whole-lung patterns for a patient with radiation pneumonitis (RP=1). (A) Grad-CAM feature-level visualization: heatmaps of model attention are shown across different convolutional layers. The early layers (Layers 1–4) primarily focus on the local tumor region, whereas the deeper layers (Layers 5–7) progressively integrate spatial features from the surrounding and contralateral lung, indicating that the model is capable of deriving global lung microenvironment features from an initially lesion-centered representation. (B) LIME output-level visualization: superpixel-based maps quantify the contribution of individual regions to the model prediction. The results demonstrate that the model’s decisions are informed not only by the left-sided tumor region (yellow dashed contour in the original image) but also by signals from the right, non-irradiated lung, thereby supporting the hypothesis that RP risk is influenced by background whole-lung features. Abbreviation: RP: Radiation Pneumonitis, Grad-CAM: Gradient-weighted Class Activation Mapping, LIME: Local Interpretable Model-agnostic Explanations

LIME-based visualizations further complement this analysis by quantifying the local contributions of different image regions to the model’s predictions. As shown in Figure 6B, high-weight superpixels are predominantly located in the tumor-bearing left lung, but notable attention regions are also observed in the contralateral, non-irradiated right lung. This pattern is highly consistent with the “global lung microenvironment interaction” hypothesis described in Section 3.1 and confirms that the proposed RP-GAN can capture subclinical risk signals distributed throughout the lungs without requiring manual annotations.
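The superpixel weights in a LIME map come from a local surrogate model: superpixels are switched on and off, the black-box prediction is recorded for each perturbed image, and a proximity-weighted linear regression assigns each superpixel a contribution. The sketch below illustrates this idea in plain Python under simplifying assumptions: all on/off patterns are enumerated rather than randomly sampled, the distance and kernel are simplified, and `toy_model` is a hypothetical stand-in for the RP risk model. It is not the LIME library API or the authors' code.

```python
import itertools
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def lime_superpixel_weights(predict, n_superpixels, kernel_width=0.25):
    """Fit a proximity-weighted linear surrogate over superpixel on/off masks."""
    rows, targets, weights = [], [], []
    for bits in itertools.product([0, 1], repeat=n_superpixels):
        # Distance to the unperturbed image = fraction of superpixels switched off.
        distance = 1.0 - sum(bits) / n_superpixels
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))
        rows.append([1.0] + [float(b) for b in bits])  # leading 1.0 = intercept
        targets.append(predict(list(bits)))
    p = n_superpixels + 1
    # Weighted normal equations: (X^T W X) beta = X^T W y.
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
             for j in range(p)] for i in range(p)]
    xtwy = [sum(w * r[i] * y for w, r, y in zip(weights, rows, targets))
            for i in range(p)]
    coef = solve_linear_system(xtwx, xtwy)
    return coef[0], coef[1:]


def toy_model(mask):
    """Hypothetical risk score driven by superpixels 0 and 2 only."""
    return 0.1 + 0.6 * mask[0] + 0.3 * mask[2]


intercept, contributions = lime_superpixel_weights(toy_model, n_superpixels=4)
```

In this toy setting the surrogate recovers the two contributing superpixels and assigns near-zero weight to the others, which mirrors how high-weight regions are read off a map such as Figure 6B.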

These findings indicate that the model assesses radiation pneumonitis (RP) risk by treating the entire lung as an interconnected microenvironment, rather than focusing solely on peritumoral high-dose regions. Irradiated cells within high-dose volumes release soluble factors, including cytokines, reactive oxygen species (ROS), and exosomes, that mediate the radiation-induced bystander effect (RIBE). These signaling molecules propagate via systemic circulation or local diffusion to non-irradiated lung parenchyma, including the contralateral lung, thereby inducing cytotoxic and genotoxic damage. Together with radiation-induced immune activation, this process triggers systemic sterile inflammation and reconfigures the pulmonary immune landscape.^21-23 Collectively, these interactions manifest as cross-lung spatial effects, which the RP-GAN effectively captures as predictive subclinical features. The concordant attention distributions revealed by Grad-CAM and LIME across distinct interpretability frameworks not only enhance the transparency of model decisions but also provide radiobiologically plausible explanations for the predicted RP risk.

4. Discussion

4.1. RP-GAN Feature Extraction Model

Beyond model development, the key scientific finding of this study is that whole-lung imaging features provide more informative RP risk signals than dose-restricted regions alone. This finding suggests that RP should not be viewed solely as a focal toxicity confined to high-dose areas, but rather as a lung-wide process influenced by the interaction between radiation exposure and the pre-existing pulmonary microenvironment. The superior performance of the whole-lung model therefore supports a broader biological interpretation of RP susceptibility and highlights the importance of preserving global pulmonary information in risk modeling. The RP-GAN model developed in this study employs an unsupervised feature learning strategy that enables automatic extraction of imaging features without relying on annotated data, and these learned representations are subsequently used for radiation pneumonitis (RP) risk prediction. Through adversarial training, the RP-GAN learns texture and structural patterns directly from CT images under label-free conditions and is able to capture latent signals associated with disease risk that may not be explicitly encoded in conventional contours or dose parameters.
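For orientation, the adversarial training mentioned above can be written as the standard GAN minimax objective. This is a sketch under the assumption that RP-GAN follows the usual DCGAN formulation; the exact loss used in this study is the one defined in the Methods section. Here x denotes a real CT image drawn from the data distribution, z a random noise vector, G the generator, and D the discriminator.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

No RP labels appear in this objective, which is why the learned representations can be obtained under label-free conditions.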

Most RP-related studies to date have focused on supervised risk prediction models that depend heavily on clinically contoured structures as inputs, such as physician- or physicist-delineated lung and tumor volumes. Although such contours are routinely available from radiotherapy planning, the formation of RP involves complex and widespread pulmonary responses that may extend beyond predefined anatomical or high-dose regions, meaning that excessive reliance on manual annotations can constrain the model’s capacity to learn subtle risk-relevant features. The RP-GAN framework was therefore designed to reduce dependence on clinical annotations and to extract RP-associated features directly from imaging data via unlabeled feature learning.^24,25

With only a standard image preprocessing pipeline, the RP-GAN can be retrained and adapted to local datasets from different institutions, making it inherently data-driven and flexible with respect to variations in scanners, acquisition parameters, and processing protocols. Prior work has shown that even visually similar medical images acquired on different scanners or with different parameter settings can lead to substantial performance degradation in deep learning models, underscoring the impact of cross-device heterogeneity on model robustness and generalizability. By enabling site-specific adaptation without the burden of manual re-annotation, the RP-GAN has the potential to improve deployment efficiency in clinical practice and to facilitate integration into real-world radiotherapy workflows.

4.2. Advantages of Whole-Lung Imaging Features and Pathophysiological Mechanisms

Given the 13%–37% clinical incidence of radiation pneumonitis (RP) and its associated mortality risk, predictive models must prioritize sensitivity to limit false negatives.²⁶ Related meta-analyses report that effective models achieve sensitivities exceeding 0.74 with a combined AUC of 0.93, and the whole-lung RP-GAN model in this study reached a comparable sensitivity (recall 0.778). The results of this study show that models using whole-lung images as input (AUC 0.856) markedly outperform those restricted to high-dose regions such as V_20Gy (AUC 0.653), thereby challenging traditional risk assessment strategies that focus primarily on high-dose irradiated volumes. This observation is well supported by radiobiological principles. RP is increasingly recognized as a complex, lung-wide immune-inflammatory process rather than a purely focal tissue injury confined to high-dose regions. Although the V_20Gy volume receives the bulk of the radiation dose, prior studies have reported that localized lung irradiation can induce the release of pro-inflammatory cytokines, including transforming growth factor-β (TGF-β), interleukin-1 (IL-1), and interleukin-6 (IL-6), through bystander and abscopal effects, with these mediators disseminating via the bloodstream and interstitium to non-irradiated regions and altering the global pulmonary microenvironment. RP-GAN appears able to capture subtle whole-lung textural changes that likely reflect such subclinical inflammatory processes, which are difficult to perceive via routine visual inspection.^17,27

Low-dose regions, such as those encompassed by V_5Gy, also play a non-negligible role in pulmonary injury. When the entire lung receives low-dose radiation, the dose may be below the threshold for overt cell death but sufficient to activate alveolar macrophages and alter vascular endothelial permeability. In this study, incorporating whole-lung information increased the recall from 0.273 to 0.778, indicating that many latent high-risk features reside in low-dose background lung tissue and would be overlooked if analysis were confined to the V_20Gy region alone. Under such restricted conditions, the model effectively loses its ability to evaluate the global “inflammatory background,” resulting in a high rate of missed high-risk cases.

From a radiobiological standpoint, the baseline lung microenvironment is a key determinant of individual tolerance to radiation. Whole-lung images inherently encode pre-existing pulmonary conditions such as emphysema, interstitial changes, or microvascular abnormalities, which are spatially distributed throughout the lungs and collectively form a susceptibility background for RP. Through unsupervised learning, the RP-GAN successfully extracts latent features that are independent of dose distribution yet strongly related to individual physiological responses, thereby enabling more accurate personalized risk prediction than models relying solely on traditional dose–volume metrics.

4.3. Contribution of XAI Techniques to Model Interpretability and Clinical Applicability

The Grad-CAM and LIME visualizations of the risk prediction model show that the primary attention regions are located over the tumor and the surrounding lung parenchyma, with a strong spatial correspondence to clinically defined high-dose regions. This finding indicates that, even without relying on manual structural contours or explicit dose information, the model can autonomously learn and localize image regions that are closely related to RP risk, reflecting its ability to capture treatment-related lung responses. Notably, the attention maps are not confined to the high-dose core but extend into adjacent lung tissue, suggesting that RP risk may arise from the interaction between high-dose exposure and the neighboring pulmonary microenvironment rather than being driven by a single dose band alone. This observation is consistent with the superior performance of the whole-lung input model compared with models restricted to V_5Gy and V_20Gy inputs, further supporting the importance of global lung information for RP risk assessment. Previous studies have shown that Grad-CAM provides feature-level spatial heatmaps, whereas LIME offers instance-level explanations²⁸; their integration enables a more comprehensive understanding of model behavior and enhances the reliability of clinical decision-making.⁶ In addition, the broadly concordant attention patterns produced by Grad-CAM and LIME across different interpretability frameworks enhance confidence in the model’s decision-making process and demonstrate that the proposed unlabeled feature learning strategy exhibits good stability and clinical plausibility in the RP risk prediction setting.

4.4. Advantages of Ensemble Learning

To improve robustness and classification stability in the presence of heterogeneous data, this study employed a stacking ensemble architecture for RP risk prediction. The approach integrated the outputs of four base classifiers, namely Random Forest (RF), Support Vector Machine (SVM), k-Nearest Neighbors (KNN), and XGBoost, using logistic regression (LR) as the metaclassifier for final decision-making. Overall, the stacking framework outperformed all individual base classifiers on this task: it achieved a higher AUC (0.856), accuracy (0.861), and F1 score (0.737) than any single base learner (Table 4). Although the AUC improvement over the best-performing single model (SVM: 0.846) appears modest, the clinical gain in sensitivity was substantial. Single models such as RF and XGBoost showed limited ability to identify high-risk cases, with recall values of only 0.444 and 0.333, respectively. In contrast, the stacking ensemble markedly increased recall to 0.778. In the clinical context of radiation pneumonitis (RP), where failing to identify a high-risk patient (false negative) may lead to severe pulmonary complications, this substantial gain in sensitivity strongly supports the adoption of the ensemble approach despite its greater complexity. This observation aligns with previous studies demonstrating that stacking can exploit complementary strengths and compensate for individual model weaknesses.²⁹

To balance performance and complexity, logistic regression was chosen as the metaclassifier. By employing a simple linear fusion model in the second layer, we effectively combined the diverse predictive capabilities of the base learners while minimizing the risk of overfitting and preserving computational efficiency. This design is consistent with prior work in cardiovascular risk prediction, which has shown that logistic regression as a fusion classifier can strike an appropriate balance between predictive performance and model complexity in stacked ensemble frameworks.³⁰
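The second-level fusion described above can be illustrated with a small sketch: a logistic regression is fitted on the probabilities produced by the base learners and outputs a single fused risk. The example below uses plain Python with batch gradient descent. The probability values, labels, learning rate, and function names are hypothetical and chosen only for illustration; in a real stacking pipeline the meta-level inputs would be out-of-fold predictions from the trained RF, SVM, KNN, and XGBoost models, and this is not the authors' implementation.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def fit_meta_logistic(base_probs, labels, learning_rate=0.5, epochs=2000):
    """Fit logistic regression weights on base-learner probabilities."""
    n_models = len(base_probs[0])
    n_samples = len(labels)
    weights, bias = [0.0] * n_models, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_models, 0.0
        for probs, label in zip(base_probs, labels):
            error = fuse(weights, bias, probs) - label
            grad_b += error
            for i, p in enumerate(probs):
                grad_w[i] += error * p
        # Batch gradient descent step on the mean log-loss gradient.
        bias -= learning_rate * grad_b / n_samples
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
    return weights, bias


def fuse(weights, bias, probs):
    """Fused RP probability from one patient's base-learner probabilities."""
    return sigmoid(bias + sum(w * p for w, p in zip(weights, probs)))


# Hypothetical base-learner probabilities, ordered as (RF, SVM, KNN, XGBoost).
meta_inputs = [
    [0.2, 0.1, 0.3, 0.2], [0.3, 0.2, 0.2, 0.1], [0.4, 0.3, 0.4, 0.3],
    [0.6, 0.8, 0.7, 0.5], [0.5, 0.9, 0.6, 0.6], [0.7, 0.7, 0.8, 0.7],
]
meta_labels = [0, 0, 0, 1, 1, 1]  # 1 = RP, 0 = no RP (toy labels)

weights, bias = fit_meta_logistic(meta_inputs, meta_labels)
high_risk = fuse(weights, bias, [0.7, 0.7, 0.8, 0.7])
low_risk = fuse(weights, bias, [0.2, 0.1, 0.3, 0.2])
```

Because the fusion layer has only one weight per base learner plus a bias, it adds little capacity of its own, which is the rationale given above for limiting overfitting at the second level.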

4.5. Limitations and Future Directions

The current model successfully applies the RP-GAN for unsupervised feature extraction, thereby substantially reducing the dependence on manual annotations. Future work could incorporate self-supervised learning (SSL) strategies, such as masked image modeling and contrastive learning, while integrating singular value decomposition (SVD)-based spectral pooling techniques (e.g., singular pooling as proposed by Zhu et al.) to guide the network in learning more representative anatomical features and spatial relationships from large collections of unlabeled CT images.³¹ In addition, inspired by recent work on the CR-SCAD framework,³² future studies may explore whether RP-GAN-derived latent features can be further modeled using a strategy that combines collaborative representation with sparse variable selection. Such a design may be particularly relevant in RP prediction, where patient-level samples are relatively limited but the learned image features are high-dimensional. In this context, a CR-SCAD-inspired downstream module may help preserve relationships among patient representations while simultaneously selecting the most discriminative latent variables associated with RP risk. This direction could improve robustness and feature interpretability, especially when integrating imaging, dosimetric, and clinical variables in a multimodal prediction setting. Such approaches are expected to enhance the model’s ability to capture subtle lung texture alterations and improve sensitivity for detecting early subclinical inflammatory changes.

The stacking ensemble framework used in this study involves multiple classifiers and non-trivial hyperparameter tuning. A future direction is to introduce automated machine learning (AutoML) techniques, including neural architecture search (NAS) and automated hyperparameter optimization (HPO), to systematically identify the optimal combination of base learners and meta-classifier weights. This would not only improve development efficiency and reduce manual tuning effort but also help ensure that the model maintains optimal generalization performance and stability when confronted with heterogeneous data from different institutions.³³

Furthermore, as a prerequisite for large-scale clinical adoption, future iterations must address data privacy and security through mechanisms such as federated learning or blockchain.^34,35 Given the progressive nature of RP, future studies will also explore the integration of time-series imaging and multidimensional clinical information to construct a dynamic risk prediction system capable of quantifying both the severity and temporal evolution of RP. In addition, external validation using multicenter datasets will be conducted to test the model’s applicability under varying scan protocols and population backgrounds. Prospective validation and real-world workflow integration studies will also be necessary before the framework can be considered for routine clinical implementation in radiotherapy planning.

5. Conclusion

This study successfully developed a radiation pneumonitis (RP) risk prediction framework that integrates RP-GAN-based unsupervised feature extraction, stacking ensemble learning, and explainable AI (XAI) techniques. The experimental results demonstrate that the whole-lung image model significantly outperforms models that use only V_20Gy or V_5Gy dose-restricted regions, achieving an AUC of 0.856 and increasing the recall for high-risk cases to 0.778. These findings indicate that imaging features associated with RP risk are not confined to high-dose subvolumes but are strongly linked to global lung texture, structural alterations, and spatial distribution patterns.

The proposed RP-GAN model highlights the advantages of unsupervised feature learning, enabling the automatic discovery of key risk-related signals from CT images without labor-intensive manual annotation and thereby addressing both the high cost of clinical labeling and the challenge of cross-device imaging heterogeneity. By integrating multiple classifiers through a stacking strategy, the framework further improves the stability and accuracy of risk prediction. Moreover, XAI analyses via Grad-CAM and LIME show that model attention is mainly concentrated in the lung parenchyma surrounding the tumor, with good correspondence to high-dose regions, supporting the clinical plausibility and transparency of the model’s decision process.

Taken together, the findings of this work underscore the lung-wide inflammatory nature of RP, the risk of which is shaped by the interaction between the global pulmonary microenvironment and subclinical radiation-induced damage. The proposed prediction framework has substantial potential for clinical application as an adjunctive tool for optimizing radiotherapy planning. Before routine clinical implementation can be considered, future research should expand the sample size and include multicenter external validation, prospective evaluation, and workflow integration studies. These steps would support the development of a dynamic risk prediction system covering different RP severity grades and, ultimately, individualized precision radiotherapy for lung cancer patients.

Supplemental Material

Supplemental Material for Individualized Prediction of Radiation Pneumonitis Using RP-GAN: Leveraging Global Lung Features and Explainable Artificial Intelligence by Yang-Wei Hsieh, Pei-Ju Chao, Yi-Lun Liao, Wen-Ping Yun, Ling-Chuan Chang-Chien, Cheng-Shie Wuu, Yu-Wei Lin and Tsair-Fwu Lee in Technology in Cancer Research & Treatment.

Footnotes

Acknowledgements

We acknowledge the use of artificial intelligence tools solely for linguistic refinement and improving the readability of this manuscript. All scientific concepts, methodological developments, and clinical interpretations presented in this study are the sole work of the authors.

ORCID iD

Tsair-Fwu Lee

Consent to Participate

Informed consent was waived due to the retrospective design and anonymization of patient data, as approved by the Institutional Review Board (IRB) of Kaohsiung Veterans General Hospital (approval number: KSVGH25-CT1-06, approval date: December 17, 2024).

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was partially supported by grants from the National Science and Technology Council (NSTC), Executive Yuan, Taiwan, Republic of China (113-2221-E-992-011-MY2, 114-2637-8-992-002).

Institutional Review Board Statement

This study involving human participants was approved by the Institutional Review Board (IRB) of Kaohsiung Veterans General Hospital (approval number: KSVGH25-CT1-06, approval date: December 17, 2024), in compliance with ethical standards and regulatory requirements. The requirement for informed consent was waived.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The datasets generated and/or analyzed during the current study are not publicly available due to institutional ethical restrictions and patient privacy regulations but are available from the corresponding authors (Yang-Wei Hsieh, wewe750422@gmail.com; Pei-Ju Chao, pjchao99@gmail.com; Tsair-Fwu Lee, tflee@nkust.edu.tw) upon reasonable request and with appropriate ethical approval from the Kaohsiung Veterans General Hospital Institutional Review Board (IRB No. KSVGH25-CT1-06). Supporting data, including feature extraction scripts and model performance metrics, are provided in the supplementary materials. The source code for the RP-GAN model, including preprocessing, feature extraction, and stacking ensemble learning implementation, will be made publicly available on GitHub upon acceptance of this manuscript. Due to institutional review board (IRB) regulations and patient privacy protection, the raw clinical imaging data cannot be publicly released. However, de-identified processed data (e.g., extracted feature sets and representative samples) can be provided upon reasonable request to the corresponding author for research purposes. The Zenodo repository link is provided below and will be activated upon publication: .

Use of Artificial Intelligence Statement

During the preparation of this manuscript, we used artificial intelligence (AI) tools solely for language improvement and polishing to enhance clarity and readability. All content was carefully reviewed and edited by the authors, who take full responsibility for the final manuscript. The core scientific contributions of this study—including the development of the RP-GAN unsupervised feature extraction model, the design of the stacking ensemble learning framework, and the clinical interpretation of global lung microenvironment signals—were conceived and completed entirely by the authors without the use of generative AI.

Supplemental Material

Supplemental material for this article is available online.

Appendix

References

1. Bradley, Paulus, Komaki, et al. Standard-dose versus high-dose conformal radiotherapy with concurrent and consolidation carboplatin plus paclitaxel with or without cetuximab for patients with stage IIIA or IIIB non-small-cell lung cancer (RTOG 0617): a randomised, two-by-two factorial phase 3 study. The Lancet Oncology. 2015;16(2):187-199.

2. Marks, Bentzen, Deasy, et al. Radiation dose–volume effects in the lung. International Journal of Radiation Oncology*Biology*Physics. 2010;76(3):S70-S76.

3. Palma, Senan, Tsujino, et al. Predicting radiation pneumonitis after chemoradiation therapy for lung cancer: an international individual patient data meta-analysis. International Journal of Radiation Oncology*Biology*Physics. 2013;85(2):444-450.

4. Deasy, Blanco, Clark. CERR: a computational environment for radiotherapy research. Medical Physics. 2003;30(5):979-985.

5. Luo, Chen, Valdes. Machine learning for radiation outcome modeling and prediction. Medical Physics. 2020;47(5):e178-e184.

6. Toumaj, Heidari, Navimipour. Leveraging explainable artificial intelligence for transparent and trustworthy cancer detection systems. Artificial Intelligence in Medicine. 2025;169:103243.

7. Farhoudian, Heidari, Shahhosseini. A new era in colorectal cancer: Artificial Intelligence at the forefront. Computers in Biology and Medicine. 2025;196:110926.

8. Radford, Metz, Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434. 2015.

9. Selvaraju, Cogswell, Das, Vedantam, Parikh, Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision; October 22-29, 2017; Venice, Italy.

10. Zhou Z, Rahman Siddiquee MM, Tajbakhsh N, Liang J. UNet++: A nested U-Net architecture for medical image segmentation. In: International Workshop on Deep Learning in Medical Image Analysis. Springer; 2018.

11. Arroyo-Hernández, Maldonado, Lozano-Ruiz, Muñoz-Montaño, Nuñez-Baez, Arrieta. Radiation-induced lung injury: current evidence. BMC Pulmonary Medicine. 2021;21(1):9.

12. Groves, Misra, Clair, et al. Influence of the irradiated pulmonary microenvironment on macrophage and T cell dynamics. Radiotherapy and Oncology. 2023;183:109543.

13. Mothersill, Seymour. Radiation-induced bystander effects: past history and future directions. Radiation Research. 2001;155(6):759-767.

14. et al. Integrating deep learning and multi-omics features in radiation pneumonitis prediction for lung cancer patients using PET/CT. BMC Medical Imaging. 2025;25(1):426.

15. Yadav, Menon, Ravi, Vishvanathan. Lung-GANs: unsupervised representation learning for lung disease classification using chest CT and X-ray images. IEEE Transactions on Engineering Management. 2021;70(8):2774-2786.

16. Collins, Reitsma, Altman, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Journal of British Surgery. 2015;102(3):148-158.

17. Keffer, Guy, Weiss. Fatal radiation pneumonitis: literature review and case series. Advances in Radiation Oncology. 2020;5(2):238-249.

18. Gurusubramani, Latha. Deep Convolutional Generative Adversarial Network for Improved Cardiac Image Classification in Heart Disease Diagnosis. Journal of Imaging Informatics in Medicine. 2024;38(4):2146-2169.

19. Ihongbe, Fouad, F Mahmoud, Rajasekaran, Bhatia. Evaluating Explainable Artificial Intelligence (XAI) techniques in chest radiology imaging through a human-centered Lens. PLoS One. 2024;19(10):e0308758.

20. Vittorio, Lunghini, Morerio, et al. Addressing docking pose selection with structure-based deep learning: recent advances, challenges and opportunities. Computational and Structural Biotechnology Journal. 2024;23:2141-2151.

21. Wang, Xiao, Liu, Yuan. Radiation-induced lung injury: from mechanism to prognosis and drug therapy. Radiation Oncology. 2025;20(1):39.

22. Smolarz, Skoczylas, Gawin, Krzyżowska, Pietrowska, Widłak. Radiation-induced bystander effect mediated by exosomes involves the replication stress in recipient cells. International Journal of Molecular Sciences. 2022;23(8):4169.

23. Guo, Zhou, Jiang, Zhang, Zhu. STING signaling activation modulates macrophage polarization via CCL2 in radiation-induced lung injury. Journal of Translational Medicine. 2023;21(1):590.

24. Barragan-Montero, Bibal, Dastarac, et al. Towards a safe and efficient clinical implementation of machine learning in radiation oncology by exploring model interpretability, explainability and data-model dependency. Physics in Medicine & Biology. 2022;67(11):11TR01.

25. Gutiérrez. Applications of artificial intelligence for toxicity prediction in radiotherapy. Université de Rennes; Universidad Carlos III; 2025.

26. Chen, et al. Predicting radiation pneumonitis in lung cancer using machine learning and multimodal features: a systematic review and meta-analysis of diagnostic accuracy. BMC Cancer. 2024;24(1):1355.

27. Najafi, Fardid, Hadadi, Fardid. The mechanisms of radiation-induced bystander effect. Journal of Biomedical Physics & Engineering. 2014;4(4):163-172.

28. Wang, Toumaj, Heidari, Souri, Jafari, Jiang. Neurodegenerative disorders: a holistic study of the explainable artificial intelligence applications. Engineering Applications of Artificial Intelligence. 2025;153:110752.

29. Lee T-F, Tsai, Tseng, et al. Using Radiomics and Explainable Ensemble Learning to Predict Radiation Pneumonitis and Survival in NSCLC Patients Post-VMAT. Life. 2025;15(11):1753.

30. Fathima, Raja, Jayanthi, Hariharan. OptiStack classifier: optimized stacking framework with ensemble feature engineering for enhanced cardiovascular risk prediction. Inflammation Research. 2025;74(1):88.

31. Zhu, Cai, Xiong, Zheng, et al. Singular pooling: a spectral pooling paradigm for second-trimester prenatal level II ultrasound standard fetal plane identification. IEEE Transactions on Circuits and Systems for Video Technology. 2025.

32. Cai, Liu, Wang, Chen, Zhu, Chen. Developing a CR-SCAD algorithm for fibrosis and inflammatory activity analysis of chronic hepatitis C. International Journal of Machine Learning and Cybernetics. 2026;17(1):3.

33. Imrie, Denner, Brunschwig, Maier-Hein, Van Der Schaar, et al. Automated ensemble multimodal machine learning for healthcare. IEEE Journal of Biomedical and Health Informatics. 2025;26(6):4213-4226.

34. Toumaj, Heidari, Shahhosseini, Jafari Navimipour. Applications of deep learning in Alzheimer’s disease: a systematic literature review of current trends, methodologies, challenges, innovations, and future directions. Artificial Intelligence Review. 2024;58(2):44.

35. Heidari, Javaheri, Toumaj, Navimipour, Rezaei, Unal. A new lung cancer detection method based on the chest CT images using Federated Learning and blockchain systems. Artificial Intelligence in Medicine. 2023;141:102572.
