Abstract
Background:
Adrenal lesions, often incidentally detected, present diagnostic challenges in distinguishing benign from malignant or hormonally active lesions. Conventional imaging (computed tomography/magnetic resonance imaging (CT/MRI)) has limitations, driving interest in artificial intelligence (AI) and radiomics for enhanced accuracy.
Objectives:
To systematically evaluate AI and radiomics applications in adrenal lesion characterization, focusing on diagnostic performance, methodologies, and clinical utility.
Design:
PRISMA-guided systematic review of studies published up to June 2024.
Data sources and methods:
PubMed, Scopus, Web of Science, and Google Scholar were searched using the keywords: adrenal lesions, AI, radiomics, and machine learning. Inclusion followed PICO criteria: patients with indeterminate lesions, AI/radiomics interventions, comparisons to standard diagnostics, and diagnostic accuracy. Two reviewers screened studies, resolving discrepancies via consensus. Eleven retrospective studies (996 patients) met eligibility.
Results:
CT-based radiomics (eight studies) achieved a mean AUC of 0.88 (range: 0.84–0.94) in differentiating benign/malignant or functional/non-functional lesions. Top-performing models identified aldosterone-producing adenomas (AUC: 0.99). MRI-based radiomics (three studies) yielded mean AUC: 0.82 (0.72–0.92), with test-set performance declines (e.g., AUC: 0.72) suggesting overfitting. Nuclear medicine (four studies) demonstrated that hybrid 18F-FDG PET/CT models (SUVmax + texture features) achieved an AUC of 0.97 for metastatic versus benign lesions. AI applications extended to intraoperative navigation (AUC: 0.93) and prognostic prediction.
Conclusion:
CT-based radiomics outperformed MRI, aligning with guidelines favoring CT for adrenal assessment. AI-enhanced models show promise in refining diagnostics and reducing invasive procedures. However, retrospective designs, small cohorts, and protocol variability limit generalizability. Future work requires multicenter collaboration, standardized protocols, and prospective validation to translate AI/radiomics into clinical practice.
Introduction
Adrenal masses are most commonly identified incidentally. It is estimated that incidental adrenal lesions are detected in approximately 4%–6% of computed tomography (CT) scans performed for unrelated indications. 1 The majority of these asymptomatic lesions are benign, with adrenal adenomas representing the most prevalent type. Adrenal adenomas have a high prevalence, affecting up to 9% of the general population, with incidence increasing with age.2–5 For clinicians and radiologists, characterizing small adrenal masses—particularly in distinguishing benign from malignant or hormonally active from inactive lesions—can be challenging.6,7 Standard imaging modalities, such as CT and magnetic resonance imaging (MRI), are validated tools for adrenal lesion characterization 8 ; however, their diagnostic accuracy may be limited in complex cases, such as lesions with low lipid content, 9 or large, heterogeneous masses. 10 Furthermore, adrenal biopsy has demonstrated a limited role in the diagnostic assessment of indeterminate adrenal lesions. 11 Artificial intelligence (AI), along with its subfields of machine learning (ML) and deep learning (DL), has emerged as a promising approach to address these diagnostic limitations. ML and DL algorithms can autonomously identify patterns in CT or MRI images and extract quantitative radiomic features, thereby supporting diagnostic workflows through the development of predictive statistical models.12–15 These techniques utilize mathematical models and texture analysis to objectively quantify variations in grayscale intensity and spatial distribution within the image voxels. 16 Quantitative texture analysis can be classified into first-order statistics, such as uniformity, kurtosis, energy, mean, minimum, maximum density, and standard deviation of grayscale levels, 17 and second-order statistics. 
The latter, originally developed for two-dimensional (2D) image analysis, includes metrics derived from the gray-level co-occurrence matrix (GLCM), neighborhood gray-level difference matrix (NGLDM), and gray-level run-length matrix (GLRLM), which quantify spatial relationships among pixel intensities. While three-dimensional (3D) extensions (e.g., volumetric GLCM) have been introduced in contemporary radiomic applications, 2D methods remain widely used due to their computational efficiency and robust historical validation.18,19 This approach to extracting quantitative data from medical imaging is known as radiomics. 20 Radiomics has been applied, for example, to differentiate subclinical pheochromocytomas from lipid-poor adenomas. 21
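To make the distinction between first-order and second-order texture statistics concrete, the following is a minimal sketch in plain Python. The 4 × 4 toy ROI, the function names, and the choice of a symmetric horizontal offset are illustrative assumptions; they are not taken from any of the reviewed studies, which typically relied on dedicated radiomics software.

```python
from collections import Counter
from statistics import mean, pstdev


def first_order_stats(pixels):
    """First-order statistics: depend only on the gray-level histogram."""
    n = len(pixels)
    probs = [count / n for count in Counter(pixels).values()]
    mu = mean(pixels)
    sigma = pstdev(pixels)
    # Non-excess kurtosis: fourth central moment over squared variance.
    kurt = (sum((p - mu) ** 4 for p in pixels) / n) / sigma ** 4 if sigma else 0.0
    return {
        "mean": mu,
        "min": min(pixels),
        "max": max(pixels),
        "std": sigma,
        "uniformity": sum(p * p for p in probs),  # sum of squared probabilities
        "kurtosis": kurt,
    }


def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for one offset."""
    matrix = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                matrix[i][j] += 1  # count the pair in both directions
                matrix[j][i] += 1
    total = sum(map(sum, matrix))
    return [[value / total for value in row] for row in matrix]


def glcm_contrast(p):
    """Second-order feature: weights co-occurrences by squared gray-level gap."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# Hypothetical 4-level ROI (values are quantized gray levels, not real HU).
roi = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
stats = first_order_stats([v for row in roi for v in row])
cooccurrence = glcm(roi, levels=4)
contrast = glcm_contrast(cooccurrence)
print(stats["mean"], stats["uniformity"], contrast)
```

Shuffling the pixels of the toy ROI would leave every first-order value unchanged but alter the co-occurrence matrix, which is why second-order features are said to capture spatial relationships.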
The objective of this systematic review is to provide a comprehensive and up-to-date overview of the application of AI and radiomics in the diagnostic evaluation of adrenal lesions.
Materials and methods
Search strategy
This review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Figure 1 and Supplemental Material). 22 The objective of this research was to identify studies focused on the detection and structural and functional characterization of adrenal lesions through the application of radiomics and AI algorithms. The selection of studies, which began in February 2024, was guided by the population–intervention–comparison–outcome (PICO) framework 23 :
- Population: patients with indeterminate adrenal lesions;
- Intervention: radiomic analysis and ML algorithms applied to CT imaging (or, less frequently, MRI);
- Comparison: current standard diagnostic approaches, such as endocrinological tests (e.g., dexamethasone suppression test) and contrast-enhanced CT with washout analysis;
- Outcome: diagnostic accuracy in determining the nature of adrenal lesions, specifically in distinguishing between benign and malignant or secreting and non-secreting lesions.

Literature search and study selection flowchart.
A detailed data extraction form is provided in Table 1.
Data charting form for this review.
PET/CT, positron emission tomography/computed tomography.
The literature search was conducted using major electronic databases, including PubMed, Scopus, Google Scholar, and Web of Science. The following keywords were combined with Boolean operators (AND/OR) to refine the search: [Adrenal] AND [Lesion] AND [Artificial Intelligence] OR [AI] AND [Radiomics] AND [Machine Learning] AND [Deep Learning] AND [Neural Networks]. The search was performed up to June 2024. Due to the limited and heterogeneous nature of the studies in this field, a formal meta-analysis was not feasible.
Inclusion/exclusion criteria
Initially, duplicate articles were removed. Subsequently, studies were screened based on title and abstract to assess compliance with the PICO criteria. The following types of articles were excluded: reviews, abstracts, case reports, editorials, and publications not written in English. No restrictions were imposed regarding study design or sample size. Study selection was carried out independently by two authors (R.G. and O.S.T.), and disagreements were resolved through discussion with the senior author (M.F.).
Data collections, variables, and outcomes definition
The following data were extracted from the included studies: title, first author, year of publication, study design, data source, total number of patients, median age at diagnosis, clinical characteristics of the adrenal lesion, AI methodology used for lesion characterization, and the imaging modality applied. The primary outcome was the diagnostic accuracy of each method, reported as the area under the receiver operating characteristic curve (AUC). Radiomic features collected included first-order statistics (e.g., mean, minimum/maximum density, standard deviation, uniformity, kurtosis) and second-order texture features (e.g., GLCM, GLRLM, NGLDM). These features were handcrafted in most studies. Feature selection was generally conducted using dimensionality reduction techniques, though specific retained features were often not consistently reported.
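Because the AUC is the primary outcome throughout this review, a short sketch of how it can be computed may be useful. The example below uses the rank-based (Mann-Whitney) interpretation: the AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The labels and scores are invented for illustration and do not come from any included study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model outputs: 1 = malignant, 0 = benign.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(auc)
```

In this toy case 13 of the 16 positive/negative pairs are ordered correctly, so the AUC is 13/16. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.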
Results
Literature search
The PRISMA flowchart illustrating the study selection process is shown in Figure 1.
The initial database search yielded 374 records. After removing four duplicate entries, 370 studies remained for screening. The application of inclusion and exclusion criteria resulted in the selection of 11 eligible articles for inclusion in this review. All included studies were retrospective in design.3,24–33 The total patient population across studies was 996, with a median sample size per study of 55 patients (range: 19–289).
CT was the imaging modality in 8 of 11 studies (73%),3,25,27–31,33 one of which used unenhanced CT. MRI was used in 3 of 11 studies (27%),24,26,32 and in one of these, 32 CT was additionally performed.
Radiomic feature extraction was conducted after segmentation of the adrenal lesion and definition of a region of interest (ROI) on the images. In 9 of 11 studies,3,24,26,28–33 the ROI was manually segmented, while in one study 27 segmentation was automated using Amira software. One study 25 sought to streamline the process by extracting two-dimensional radiomic features from the largest lesion section visible on CT and deriving three-dimensional features from a cuboid encompassing the entire lesion. In three studies,3,32,33 radiomic features primarily characterized lesion texture. In one study, 31 radiomic features were combined with basic clinical variables to create two ML models: the Clinic-Radscore ML (radiomics + clinical variables, AUC = 0.994) and the Radscore ML (radiomics only, AUC = 0.869).
Across all included studies, radiomic features were used to develop ML models aimed at classifying adrenal lesions or distinguishing between benign and malignant types. Five studies24,27,30,32 focused on differentiating benign versus malignant lesions using histopathological analysis as the reference standard. Two studies3,29 aimed to classify lesion subtypes (e.g., lipid-poor adenomas, pheochromocytomas, metastases, or primary adrenal tumors). Four studies25,26,28,31 investigated the differentiation between hormonally active (secreting) and inactive (non-secreting) adrenal incidentalomas—a task traditionally performed using invasive methods such as adrenal venous sampling or hormonal stimulation testing. Specifically, one study 26 aimed to distinguish non-functioning incidentalomas from those with autonomous cortisol secretion, while another 31 focused on separating non-functioning adrenal adenomas from aldosterone-producing adenomas.
All studies successfully developed ML models with diagnostic performance exceeding an AUC of 0.80. Although limited by small sample sizes, these models showed promising accuracy and, in several cases, outperformed conventional radiological assessment methods.
Radiomics from CT images
As emphasized in the European endocrinology guidelines, 34 CT remains the imaging modality of choice for the initial diagnostic assessment of adrenal lesions. The primary diagnostic challenge lies in distinguishing between secreting and non-secreting lesions, as well as between benign and malignant ones. 35 Unenhanced CT is particularly useful in identifying lipid-rich adenomas, which typically demonstrate attenuation values below 10 Hounsfield Units (HU). However, evaluating lipid-poor or indeterminate lesions is more complex and requires specific imaging protocols, including contrast-enhanced CT (CECT) with washout assessment. 36 Among the studies included in this review, unenhanced CT was used in a minority of cases, whereas the majority employed CECT protocols encompassing arterial, portal venous, and delayed phases. 3 To overcome the limitations and interpretative complexity of conventional imaging, radiomics applied to CT imaging has gained increasing relevance. In this context, either manual or automated segmentation of the region (ROI) or volume (VOI) of interest is performed, from which radiomic features are extracted to enable quantitative analysis. By extracting quantitative features from both unenhanced and contrast-enhanced CT images, radiomics has the potential to enhance diagnostic precision. Across the eight CT-based studies included in this review, the mean AUC for differentiating benign from malignant lesions, or functional from non-functional masses, was 0.88 (range: 0.84–0.94). Notable applications included adenoma subtyping (mean AUC: 0.88) and the identification of aldosterone-producing adenomas, with AUC values ranging from 0.88 to 0.99.
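The conventional CT criteria that radiomics aims to complement can be written down in a few lines. The sketch below encodes the 10 HU threshold stated above together with the standard absolute and relative percentage washout formulas. The 60% and 40% cutoffs are commonly cited values and are an assumption here, since this review does not specify them; the function names and example attenuation values are hypothetical.

```python
def absolute_washout(unenhanced_hu, enhanced_hu, delayed_hu):
    """Absolute percentage washout: (E - D) / (E - U) * 100."""
    denominator = enhanced_hu - unenhanced_hu
    if denominator <= 0:
        return None  # no measurable enhancement, so the ratio is undefined
    return (enhanced_hu - delayed_hu) / denominator * 100.0


def relative_washout(enhanced_hu, delayed_hu):
    """Relative percentage washout: (E - D) / E * 100 (no unenhanced scan needed)."""
    if enhanced_hu <= 0:
        return None
    return (enhanced_hu - delayed_hu) / enhanced_hu * 100.0


def ct_pattern(unenhanced_hu, enhanced_hu=None, delayed_hu=None,
               apw_cutoff=60.0, rpw_cutoff=40.0):
    """Rule-of-thumb labeling; cutoffs are assumed, not taken from the review."""
    if unenhanced_hu < 10:
        return "lipid-rich adenoma pattern"
    if enhanced_hu is None or delayed_hu is None:
        return "indeterminate: washout CT needed"
    apw = absolute_washout(unenhanced_hu, enhanced_hu, delayed_hu)
    rpw = relative_washout(enhanced_hu, delayed_hu)
    if (apw is not None and apw >= apw_cutoff) or (rpw is not None and rpw >= rpw_cutoff):
        return "washout consistent with adenoma"
    return "indeterminate"


# Hypothetical lesion: 20 HU unenhanced, 100 HU enhanced, 40 HU delayed.
apw = absolute_washout(20, 100, 40)
rpw = relative_washout(100, 40)
print(apw, rpw, ct_pattern(20, 100, 40))
```

Lesions that fall into the "indeterminate" branch are the ones for which the radiomic models discussed in this section are intended to add information.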
Radiomics from MRI
MRI also plays a significant role in the characterization of adrenal lesions, particularly due to the ability to detect a fat signal drop on T1-weighted chemical shift sequences. This imaging feature assists in identifying lipid-rich adenomas but is less effective in evaluating lesions that do not demonstrate signal drop. 24 Consequently, the differentiation between benign and malignant lesions using MRI may rely more heavily on radiomic analysis, which involves segmentation and texture-based gray-level feature extraction.24,26,32 All three MRI-based studies in this review employed pre-defined radiomic features to build models for adrenal lesion characterization. The mean AUC across these studies was 0.82, with individual study values ranging from 0.72 to 0.92. One of these studies addressed the distinction between non-functioning adrenal incidentalomas (NFAI) and those with autonomous cortisol secretion (ACSAI): unenhanced MRI sequences yielded high diagnostic performance in the training cohort (AUC: 0.92), but this declined to an AUC of 0.72 in the test set, suggesting potential overfitting. 26 Although MRI radiomics shows promise, comparative analyses continue to favor CT-based models in terms of diagnostic accuracy. 32 It is important to note, however, that these comparisons were made without incorporating T2-weighted sequences, which may provide additional value in texture analysis and lesion characterization. 24
In one study, 14 an ML algorithm trained on MRI data from 60 patients aimed to differentiate among adrenocortical adenomas, lipid-poor adenomas, and non-adenomatous adrenal lesions. Its performance was compared with that of experienced radiologists, yielding comparable results (73% vs 80%, p = 0.39), highlighting the potential utility of ML-based MRI analysis.
MRI remains a valuable imaging modality in the evaluation of adrenal lesions, particularly for lipid-rich tumors. With the integration of emerging radiomic techniques—especially those incorporating T2-weighted sequences—MRI may offer complementary diagnostic insights through enhanced texture analysis. A summary of the key findings from CT and MRI-based studies is presented in Table 2.
Overview of CT (unenhanced), CECT, and MRI studies included in the review.
Mean AUC for CT-based studies: 0.88 (range: 0.84–0.94); MRI-based studies: 0.82 (range: 0.72–0.92).
AI, artificial intelligence; AUC, area under the curve; CECT, contrast-enhanced computed tomography; CT, computed tomography; MRI, magnetic resonance imaging.
Nuclear medicine and radiotracers
In addition to conventional anatomical imaging modalities, various nuclear medicine techniques play an important role in the functional assessment and differentiation of benign versus malignant adrenal lesions. Among these, 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) is the most widely used. Other radiotracers explored for adrenal imaging include 131I-6β-iodomethyl-19-norcholesterol (noriodocholesterol) used in adrenocortical scintigraphy, 11C-metomidate (MTO) PET/CT, 123I-MTO single photon emission computed tomography (SPECT), 123I-meta-iodobenzylguanidine (123I-MIBG), 18F-FDOPA, and 68Ga-DOTA-somatostatin analogs.37–46 Under normal conditions, adrenal glands on 18F-FDG PET/CT demonstrate radiotracer uptake and tissue density similar to that of the liver and spleen. They typically exhibit homogeneous enhancement with a mean standardized uptake value (SUV) of 2.3 ± 0.7, and an adrenal-to-liver maximum SUV ratio—which represents the most discriminatory parameter—of 1.0 ± 0.3. 47 Despite the established diagnostic value of these nuclear imaging modalities, the application of AI to nuclear medicine for adrenal lesion characterization is still in its early stages. To date, only a limited number of studies have investigated AI-based approaches utilizing 18F-FDG PET/CT data.48–51 These investigations were all retrospective in design, involved small cohorts ranging from 10 to 52 adrenal lesions, and are summarized in Table 3.
Studies evaluating AI methods applied to nuclear medicine imaging (i.e., 18F-FDG PET/CT) for the detection and evaluation of adrenal lesions.
18F-FDG PET/CT, 18F-fluorodeoxyglucose positron emission tomography/computed tomography; AI, artificial intelligence; AUC, area under the curve.
Ansquer et al. 48 investigated the potential added value of radiomic features extracted from 18F-FDG PET/CT for the characterization of pheochromocytomas, with a particular focus on biological, histological, and genetic profiling. The study evaluated conventional PET parameters (e.g., SUVmax, metabolic tumor volume, total lesion glycolysis) and textural features alongside biochemical and genetic data. A predictive model incorporating three selected features achieved an AUC of 0.95 in differentiating between sporadic and germline mutations, suggesting promising utility for precision phenotyping.
Nakajo et al. 49 aimed to assess the diagnostic performance of both SUV-related and texture features—such as entropy, homogeneity, and intensity variability—for differentiating benign versus metastatic FDG-avid adrenal lesions. While individual textural parameters did not outperform SUVmax, the combination of radiomics and SUVmax yielded enhanced diagnostic performance, with a sensitivity and specificity of 100% and 85%, respectively, and an AUC of 0.97. This supports the added value of combining quantitative texture features with conventional PET metrics. Beyond diagnostic characterization, two studies explored the prognostic applications of 18F-FDG PET/CT radiomics in adrenal malignancies. Wang et al. 50 evaluated adrenal lymphoma, demonstrating that patients with low lesion uniformity—measured via gray-level run-length matrix—had significantly shorter overall survival (10 vs 25 months, p = 0.046), supporting a role for textural analysis in prognostication. By contrast, Werner et al., 51 investigating adrenocortical carcinoma, found no significant correlation between PET-derived features (SUV ratios and textural metrics) and oncologic outcomes (PFS or OS), thus limiting the predictive utility of radiomics in this context. Overall, while the reviewed nuclear medicine studies were limited by small sample sizes and retrospective design, they underscore the emerging interest in radiomics and AI applications in PET/CT-based adrenal imaging.
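Sensitivity and specificity, as reported for the combined SUVmax and texture model above, follow directly from a confusion matrix. The sketch below shows the calculation. The case counts (5 metastatic and 20 benign lesions) are invented solely so that the toy output reproduces the reported percentages; they are not the counts from the cited study.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions.

    Label 1 is the positive class (here: metastatic), 0 the negative class.
    """
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort: every metastatic lesion is flagged (no false negatives),
# while 3 of 20 benign lesions are flagged incorrectly (false positives).
labels = [1] * 5 + [0] * 20
predictions = [1] * 5 + [1] * 3 + [0] * 17
sensitivity, specificity = sensitivity_specificity(labels, predictions)
print(sensitivity, specificity)
```

Unlike the AUC, both values depend on the decision threshold chosen, which is one reason diagnostic cutoffs need to be reported alongside model performance.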
Other applications of AI in adrenal imaging
Beyond lesion characterization, AI applications in adrenal imaging are expanding into perioperative and predictive domains. 52 In the surgical field, AI has been explored for enhancing intraoperative navigation. One feasibility study using over 2000 annotated images developed a DL algorithm capable of accurately predicting critical anatomical landmarks, such as the left adrenal vein, during adrenal surgery. 53 Artificial intelligence has also shown potential in predicting post-treatment outcomes in adrenal metastases treated with surgical or ablative modalities. A model utilizing ML and texture features, implemented via a linear support vector machine, achieved an AUC of 0.93 (p = 0.024) in predicting local progression or cancer-specific survival. 54 In the domain of endocrine diagnostics, the integration of plasma steroidomics with AI has been studied for primary aldosteronism diagnosis. 55 Furthermore, a multi-omics (MOmics) framework incorporating adrenal biomarkers was developed for the differential diagnosis of primary versus secondary hypertension. This classifier demonstrated superior diagnostic performance compared to mono-omics approaches, 56 reinforcing the potential of AI in complex endocrine-metabolic disorders.
Discussion
Recent evidence highlights the potential of AI-powered radiomics in significantly improving lesion detection and characterization across various urologic malignancies, including prostate, kidney, and bladder cancers, thereby enhancing diagnostic precision and facilitating personalized treatment planning.57–62 While these applications are well documented in the urologic field, the present review focuses specifically on the diagnosis and characterization of adrenal masses. The emergence of AI, alongside ML and DL techniques, has opened new and promising avenues for improving the detection, classification, and risk stratification of adrenal lesions.6,37 AI refers to the simulation of human cognitive functions by machines, enabling them to execute tasks typically requiring human intelligence. This interdisciplinary field integrates concepts from computer science, mathematics, and cognitive science to develop systems capable of environmental perception, reasoning, and decision-making, often through methodologies such as pattern recognition, problem-solving, and adaptive learning.63,64
ML, a core subset of AI, focuses on the development of algorithms that learn from data through iterative exposure, enhancing their ability to identify patterns, generalize findings, and generate accurate predictions.65,66 Within ML, DL employs artificial neural networks with multiple interconnected layers, forming a hierarchical structure that enables the model to learn complex, non-linear relationships in high-dimensional data. This architecture allows for robust modeling of intricate clinical problems, particularly in the imaging domain67–69 (Figure 2).

Graphical representation of AI, ML, and DL.
Radiomics, a term coined around 2012, refers to the high-throughput extraction of quantitative features from medical images. While texture analysis has been used for decades, radiomics expands this concept by integrating advanced statistical and ML methods. 70 Importantly, radiomics does not involve image acquisition but relies on post-processing analysis of existing imaging data. Key steps include ROI segmentation, feature extraction (e.g., texture, intensity, shape), and model development to link features to clinical outcomes71,72 (Figure 3).

Radiomics workflow for adrenal tumor identification.
Quantitative feature extraction can be performed manually by delineating an ROI or using DL technology for automated ROI delineation.73,74 Once the features from the ROI are extracted and normalized, ML methods are applied to analyze imaging features associated with specific outcomes. 75 Both DL and ML, as subfields of AI, enable data-driven predictions and decisions.76,77 However, ML typically relies on manually engineered features and structured data, requiring smaller datasets and offering easier interpretability. DL, on the other hand, uses layered neural networks that automatically extract hierarchical features from unstructured data but demands larger datasets and presents interpretability challenges due to its complex architecture.67,68,78–82 (Figure 4).

Differences between ML and DL.
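The classical (handcrafted-feature) ML workflow described above, in which an ROI is delineated, features are extracted and normalized, and a model is fitted, can be sketched end to end in a few lines. Everything in this example is an illustrative assumption: the tiny image and mask, the two features (mean and standard deviation), the training values, and the nearest-centroid classifier, which stands in for whichever ML method a given study actually used.

```python
from statistics import mean, pstdev


def extract_features(image, mask):
    """Step 1-2: keep pixels inside the binary ROI mask, then summarize them."""
    values = [v for img_row, mask_row in zip(image, mask)
              for v, m in zip(img_row, mask_row) if m]
    return [mean(values), pstdev(values)]


def zscore_fit(rows):
    """Step 3: learn per-feature mean and standard deviation on training data."""
    columns = list(zip(*rows))
    return [mean(c) for c in columns], [pstdev(c) or 1.0 for c in columns]


def zscore_apply(row, mus, sigmas):
    return [(x - mu) / sigma for x, mu, sigma in zip(row, mus, sigmas)]


def fit_centroids(rows, labels):
    """Step 4: a minimal classifier that stores one mean vector per class."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [mean(c) for c in zip(*members)]
    return centroids


def predict(row, centroids):
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(centroids, key=squared_distance)


# Hypothetical image and manually drawn ROI (1 = inside the lesion).
image = [[10, 12, 50], [11, 13, 52], [40, 41, 60]]
mask = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
roi_features = extract_features(image, mask)

# Hypothetical training set of [mean, std] features with reference labels.
train_features = [[5, 8], [8, 10], [30, 20], [35, 25]]
train_labels = ["benign", "benign", "malignant", "malignant"]
mus, sigmas = zscore_fit(train_features)
centroids = fit_centroids(
    [zscore_apply(r, mus, sigmas) for r in train_features], train_labels)

prediction = predict(zscore_apply([28, 19], mus, sigmas), centroids)
print(roi_features, prediction)
```

Note that the normalization parameters are learned on the training set only and then reused for new cases; fitting them on test data would leak information and inflate apparent performance. A DL approach would replace the handcrafted `extract_features` step with features learned directly from the image.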
CT-based radiomics demonstrated superior diagnostic accuracy compared to MRI, with a mean AUC of 0.88 versus 0.82. This aligns with current guidelines recommending CT as the first-line imaging modality for adrenal lesions due to its higher reproducibility and standardized protocols. We did not identify any studies focused on the application of radiomics to ultrasound imaging, despite the potential of AI tools to improve diagnostic accuracy by extracting features or recognizing patterns.
The most promising image classifiers have been built on unenhanced CT scans, which are the primary and perhaps the most straightforward test for detecting intracellular lipids, allowing for the diagnosis of adrenal adenoma in over 70% of cases. 29 Radiomics models based on ML showed good accuracy in differentiating between aldosterone-producing adenoma and non-functioning adrenal adenoma based on contrast-enhanced computed tomography scans. 13 These models have also been effective in subtyping adrenal adenomas using CT radiomics features. 25 While similar results have been achieved using radiomics on T1- and T2-weighted MRI images,24,26,32 evidence for MRI-based models remains limited. The development of ML models using MRI is also challenging due to the variability in image acquisition protocols. Promising results have also been shown with 18F-fluorodeoxyglucose (FDG) PET/CT and other radiotracers in differentiating benign and malignant adrenal lesions. The incorporation of AI methods in this context suggests that combining SUVmax values with textural features could substantially improve the differentiation between FDG-avid benign and malignant lesions.
Future research directions in this area include the development of interpretable ML models that integrate both deep neural network-derived biomarkers and expert-driven radiomics, which can enhance the performance in malignancy classification while maintaining inherent interpretability. 83 Collaborative efforts between radiologists and AI systems are crucial in optimizing adrenal lesion identification, where AI can assist radiologists to enhance diagnostic accuracy and efficiency.
Limitations and the future of AI in detecting adrenal lesions
The use of AI in medical imaging is advancing rapidly with the advent of more powerful computing technologies and DL algorithms, which are becoming a methodology of choice. For adrenal lesions, AI-based detection and diagnosis have produced encouraging results, yet clinical use remains restricted by technological constraints, the retrospective nature of the available studies, and feedback from practitioners. 84
All AI models rely on the quality of the data used for their training. Developing a DL model with biased or limited data will negatively impact its performance and results. For instance, in adrenal lesion detection, parameters such as relative washout, absolute washout, and lesion density play a crucial role in diagnosing malignancy. 85 However, models trained on specific datasets may underperform when applied to diverse populations, ethnicities, or imaging equipment, a limitation known as poor generalization. 86 This challenge is evident in studies like Piskin et al., 26 where MRI-based models showed significant disparities between training and test performance (training AUC: 0.92 vs test AUC: 0.72), highlighting overfitting risks. By contrast, CT-based models demonstrated more consistent generalization, with test AUCs ranging from 0.84 to 0.94, validated against histopathology or follow-up imaging. Similarly, Nakajo et al. 49 achieved superior test performance (AUC: 0.97) by combining SUVmax with texture features, underscoring the value of hybrid models to mitigate dataset-specific biases. To address generalization, AI models must prioritize external validation across multi-institutional cohorts and diverse imaging protocols. For example, variations in CT slice thickness or MRI field strength can alter radiomic feature reproducibility, necessitating standardized acquisition protocols. 72 Furthermore, federated learning frameworks could enable collaborative model training without centralized data sharing, preserving patient privacy while improving generalizability. 87 Future research should also focus on dynamic learning systems that iteratively refine predictions when exposed to new data, ensuring adaptability across clinical settings.
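As an illustration of how federated learning avoids centralized data sharing, the sketch below shows only the aggregation step of one widely used scheme, federated averaging: each site trains locally and sends its model parameters, and a coordinator combines them weighted by local sample size. The choice of this scheme, the function name, and the parameter values are assumptions made for illustration; the text above does not specify a particular framework.

```python
def federated_average(site_parameters, site_sizes):
    """Combine per-site parameter vectors, weighting each by its sample count.

    Only parameters and counts cross institutional boundaries; the images
    and patient records used for local training never leave each site.
    """
    if len(site_parameters) != len(site_sizes) or not site_parameters:
        raise ValueError("need one sample count per site")
    total = sum(site_sizes)
    n_params = len(site_parameters[0])
    return [
        sum(params[k] * size for params, size in zip(site_parameters, site_sizes)) / total
        for k in range(n_params)
    ]


# Hypothetical coefficients of the same two-parameter model trained at two
# centers, one with 100 patients and one with 300.
site_parameters = [[1.0, 2.0], [3.0, 4.0]]
site_sizes = [100, 300]
global_parameters = federated_average(site_parameters, site_sizes)
print(global_parameters)
```

In a full training loop this averaged vector would be sent back to the sites as the starting point for the next round of local training, and the cycle would repeat until the shared model converges.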
Despite promising developments, the implementation of AI in medical practice remains met with skepticism, primarily due to regulatory and ethical concerns. A central ethical issue involves data security and patient privacy; however, the use of robust encryption methods and effective anonymization protocols may help mitigate these risks. 88 Another critical limitation of current AI applications in adrenal imaging lies in the heterogeneous quality of CT images, especially in centers where internationally validated adrenal imaging protocols are not uniformly followed. AI models trained on low-quality or biased datasets may underperform when applied to higher-quality images, limiting their generalizability and clinical utility.
Furthermore, small sample sizes represent a major challenge for training and validating ML and DL algorithms—particularly in the context of adrenal lesions, which are relatively rare. This limitation restricts model robustness and reliability. Nevertheless, collaborative, multicenter efforts could enable the collection of larger, standardized datasets, addressing this issue in future research. While several models demonstrated promising results (with AUCs around 0.8), there remains substantial room for improvement, as this performance is comparable to conventional radiologic assessments. Integrating AI tools into the clinical workflow has the potential to provide real-time support for radiologists and surgeons. Currently, CT imaging plays a pivotal role in the diagnosis of adrenal malignancies. Several studies suggest that tumor size and radiologic signs of malignancy may guide therapeutic decisions, though a definitive threshold for malignancy risk remains undefined. Given the high diagnostic accuracy of modern imaging, the need for histological biopsies and unnecessary surgeries has decreased. 89 Emerging evidence also points to the added value of T2-weighted (T2w) sequences in MRI radiomics, which may become crucial in the differential diagnosis and management of adrenal masses. To support widespread implementation, user-friendly interfaces, interoperability across imaging systems, and standardized AI integration into radiologic platforms are essential. 87 In addition, clear regulatory frameworks and clinical guidelines must be established, including the definition of diagnostic cutoff values and validation protocols, before clinical adoption. 28 Looking ahead, prospective studies, well-designed algorithms, and diverse, high-quality datasets are necessary to build robust, generalizable models. 
Moreover, improvements in image acquisition quality and AI-based image preprocessing techniques will likely enhance the performance of radiomics models and optimize patient management.
Conclusion
Given the limitations of current imaging modalities in accurately characterizing adrenal lesions, the integration of artificial intelligence and radiomics presents a significant opportunity to improve diagnostic precision. In this review, CT-based radiomics achieved a mean AUC of 0.88, outperforming MRI-based radiomics (mean AUC: 0.82), thus confirming CT as the preferred modality for adrenal lesion characterization. This systematic assessment evaluated the diagnostic performance of imaging-based radiomics in distinguishing benign from malignant tumors and functional from non-functional adrenal masses. Despite current limitations—including small cohort sizes, lack of prospective validation, and insufficient standardization—the potential of AI to augment diagnostic workflows is evident. Future studies, driven by advances in AI, radiomics, and collaborative research, are essential to validate these tools, ensure clinical applicability, and ultimately improve patient outcomes in the management of adrenal tumors.
Supplemental Material
sj-docx-1-tau-10.1177_17562872251352553 – Supplemental material for Artificial intelligence and radiomics applications in adrenal lesions: a systematic review
Supplemental material, sj-docx-1-tau-10.1177_17562872251352553 for Artificial intelligence and radiomics applications in adrenal lesions: a systematic review by Matteo Ferro, Octavian Sabin Tataru, Giuseppe Carrieri, Gian Maria Busetto, Ugo Giovanni Falagario, Martina Maggi, Felice Crocetto, Biagio Barone, Francesco Del Giudice, Michele Marchioni, Daniela Terracciano, Giuseppe Lucarelli, Pasquale Ditonno, Raul Gherasim, Ciprian Todea-Moga, Giuseppe Fallara, Marco Tozzi, Antonio Cioffi, Roberto Bianchi, Alessio Digiacomo, Alessandro Veccia, Alessandro Antonelli, Maria Chiara Sighinolfi, Luigi Schips and Bernardo Rocco in Therapeutic Advances in Urology
Acknowledgements
The authors express their gratitude to “Fondazione Muto Onlus,” from Naples, for the support related to the publication of this manuscript.
Declarations
Supplemental material
Supplemental material for this article is available online.
References
