Abstract
Endometrial cancer (EC), a growing malignancy among women, underscores an urgent need for early detection and intervention, critical for enhancing patient outcomes and survival rates. Traditional diagnostic approaches, including ultrasound (US), magnetic resonance imaging (MRI), hysteroscopy, and histopathology, have been essential in establishing robust diagnostic and prognostic frameworks for EC. These methods offer detailed insights into tumor morphology, vital for clinical decision-making. However, their analysis relies heavily on the expertise of radiologists and pathologists, a process that is not only time-consuming and labor-intensive but also prone to human error. The emergence of deep learning (DL) in computer vision has significantly transformed medical image analysis, presenting substantial potential for EC diagnosis. DL models, capable of autonomously learning and extracting complex features from imaging and histopathological data, have demonstrated remarkable accuracy in discriminating EC and stratifying patient prognoses. This review comprehensively examines and synthesizes the current literature on DL-based imaging techniques for EC diagnosis and management. It also aims to identify challenges faced by DL in this context and to explore avenues for its future development. Through these detailed analyses, our objective is to inform future research directions and promote the integration of DL into EC diagnostic and treatment strategies, thereby enhancing the precision and efficiency of clinical practice.
Highlights
This research delves into the utility of diverse imaging techniques for EC, encompassing US, MRI, hysteroscopic imaging, and histopathological imaging.
The study provides a critical assessment of the DL frameworks and datasets related to the various imaging modalities for EC as reported in current literature, along with a synthesis of the findings.
It offers an integrated analysis of the application of existing DL algorithms in diagnosing EC, identifying prevailing challenges and charting a course for future research endeavors in this domain.
Introduction
Endometrial cancer (EC) is increasingly recognized as a leading malignancy, particularly affecting women in the perimenopausal and postmenopausal phases. Driven by socioeconomic and lifestyle factors, EC's rising incidence and mortality rates pose a significant global health challenge. In developed countries, it accounts for nearly 7% of all female cancers, with a concerning trend of increasing mortality rates. 1 Timely detection and accurate diagnosis are crucial for optimizing patient outcomes; delayed diagnosis can lead to advanced-stage disease with limited treatment options.
Current diagnostic methods for EC, including ultrasound (US), magnetic resonance imaging (MRI), hysteroscopy, and histopathology, provide vital insights for clinical decision-making (Figure 1). However, the analysis of these methods heavily relies on the expertise of radiologists and pathologists, a process that is not only time-consuming and labor-intensive but also prone to human error. This reliance can compromise diagnostic accuracy and delay the initiation of necessary treatment, negatively impacting patient outcomes.

Figure 1. Medical images of EC: (a) US, (b) hysteroscopy, (c) MRI, (d) histopathology.
Advancements in deep learning (DL) have shown significant potential in automating and enhancing medical image analysis. DL models can autonomously learn and extract complex features from imaging and histopathological data, achieving high diagnostic accuracy across various cancers. 2 Despite these promising capabilities, the integration of DL into EC diagnosis remains in its early stages, with limited research focusing on its incorporation into standard diagnostic practices.
This study conducts a comprehensive review of the current applications of DL in EC diagnosis, treatment, and prognosis. By examining the published research on DL's role in various imaging modalities for EC (Figure 2), this review aims to identify knowledge gaps and suggest directions for future investigation. It assesses the diagnostic performance, benefits, and limitations of diverse DL methodologies, intending to contribute to the development of more precise and efficient diagnostic tools for EC.

Figure 2. Application of different imaging techniques combined with DL algorithms in EC.
Overall, this review aims to address the research gap concerning DL in EC and to serve as a resource for clinicians, researchers, and policymakers dedicated to improving outcomes for EC patients.
Deep learning
DL, a critical component of the machine learning (ML) field, is renowned for its ability to autonomously discern complex visual and molecular disease signatures. Unlike traditional ML techniques, DL eliminates the need for manual feature engineering, enabling a continuous learning process from raw data to actionable insights. 3 The medical sector has increasingly adopted DL, particularly in radiomics and pathomics, where its impact has been transformative.4,5
DL models are commonly divided into supervised and unsupervised learning paradigms. Supervised learning includes models such as artificial neural networks (ANNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs). ANNs, inspired by the human brain's neural structure, consist of an input layer, one or more hidden layers, and an output layer. Data enter through the input layer and are transformed by the hidden layers, and the output layer produces the final diagnostic prediction. During training, the network's weights are adjusted by gradient descent to minimize prediction error.
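The training loop described above can be illustrated with a minimal sketch in plain Python. It is not taken from any of the reviewed studies: the two-feature toy samples, the starting weights, the single hidden layer of two units, and the learning rate are all assumptions chosen only to keep the example small.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Input layer -> one hidden layer -> single output unit."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, output

def mean_squared_error(data, W1, b1, W2, b2):
    return sum((forward(x, W1, b1, W2, b2)[1] - y) ** 2 for x, y in data) / len(data)

def train(data, W1, b1, W2, b2, lr=0.5, epochs=500):
    """Full-batch gradient descent on the mean squared error."""
    n = len(data)
    for _ in range(epochs):
        gW1 = [[0.0] * len(row) for row in W1]
        gb1 = [0.0] * len(b1)
        gW2 = [0.0] * len(W2)
        gb2 = 0.0
        for x, y in data:
            hidden, out = forward(x, W1, b1, W2, b2)
            # Gradient of the loss with respect to the output pre-activation.
            d_out = 2.0 * (out - y) * out * (1.0 - out) / n
            gb2 += d_out
            for j, h in enumerate(hidden):
                gW2[j] += d_out * h
                # Backpropagate through hidden unit j.
                d_hid = d_out * W2[j] * h * (1.0 - h)
                gb1[j] += d_hid
                for i, xi in enumerate(x):
                    gW1[j][i] += d_hid * xi
        # Move every weight a small step against its gradient.
        for j in range(len(W1)):
            for i in range(len(W1[j])):
                W1[j][i] -= lr * gW1[j][i]
            b1[j] -= lr * gb1[j]
            W2[j] -= lr * gW2[j]
        b2 -= lr * gb2
    return W1, b1, W2, b2

# Hypothetical toy data: two features per sample, label 0 or 1.
data = [([0.1, 0.2], 0), ([0.9, 0.8], 1), ([0.2, 0.1], 0), ([0.8, 0.9], 1)]
params = ([[0.5, -0.3], [0.2, 0.4]], [0.0, 0.0], [0.3, -0.2], 0.0)

loss_before = mean_squared_error(data, *params)
params = train(data, *params)
loss_after = mean_squared_error(data, *params)
print(loss_before, loss_after)
```

Clinical models differ mainly in scale: many more layers and weights, image inputs, and optimizers that are variants of the same gradient step.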
CNNs, a quintessential DL model, have become the preferred architecture for computer vision tasks. 6 These networks apply convolution operations to input images using learnable filters, followed by bias addition, to create feature maps. These maps are then passed through nonlinear activation functions such as the sigmoid to yield higher-order feature representations, leading to the network’s predictive output. CNNs’ weight-sharing mechanism and their structure, akin to biological neural networks, simplify model complexity and reduce parameter count, positioning them at the forefront of speech and image recognition research. Iconic CNN architectures such as AlexNet, VGGNet, ResNet, and GoogLeNet have achieved remarkable success in prestigious computer vision contests. 7 However, CNNs struggle to capture long-range dependencies and global context, and they require substantial annotated data and computational resources to perform well and avoid overfitting, which remains a significant developmental hurdle.
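As a rough illustration of the convolution, bias, and activation steps just described, the sketch below slides one filter over a tiny image in plain Python. The 4x4 image, the edge-detecting filter, and the bias value are made-up assumptions. As in most DL frameworks, the "convolution" is computed as a cross-correlation (the filter is not flipped).

```python
import math

def conv2d_valid(image, kernel, bias=0.0):
    """Slide one filter over the image (no padding, stride 1) and add a bias."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw)) + bias
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def sigmoid_map(feature_map):
    """Apply the sigmoid activation element-wise."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in feature_map]

# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1]] * 4
# The same 2x2 filter weights are reused at every position (weight sharing).
kernel = [[-1, 1],
          [-1, 1]]

pre_activation = conv2d_valid(image, kernel, bias=-1.0)
feature_map = sigmoid_map(pre_activation)
print(pre_activation)  # the middle column responds to the edge
```

Only four filter weights and one bias are learned here regardless of image size, which is the parameter saving that weight sharing provides.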
RNNs are a specialized class of ANNs designed for sequential data. Their recurrent hidden-layer connections carry information from one step of a sequence to the next, and the output layer produces a prediction from the resulting hidden state. While adept at capturing sequence context, RNNs can struggle to retain information over lengthy sequences and may suffer from vanishing or exploding gradients. To counteract these problems, more sophisticated architectures such as long short-term memory (LSTM) networks and gated recurrent units (GRUs) have been developed.
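A minimal sketch of the recurrence, assuming a plain (Elman-style) RNN with a tanh activation, is given below. The weights and the three-step input sequence are hypothetical and chosen so that the fading influence of an early input is easy to see.

```python
import math

def rnn_forward(sequence, W_x, W_h, b):
    """Return the hidden state after each time step of a simple RNN."""
    hidden = [0.0] * len(b)
    states = []
    for x_t in sequence:
        # The new state depends on the current input and the previous state.
        hidden = [
            math.tanh(
                sum(W_x[j][i] * x_t[i] for i in range(len(x_t)))
                + sum(W_h[j][k] * hidden[k] for k in range(len(hidden)))
                + b[j]
            )
            for j in range(len(b))
        ]
        states.append(hidden)
    return states

# Hypothetical weights: one input feature, two hidden units.
W_x = [[0.5], [-0.5]]
W_h = [[0.9, 0.0], [0.0, 0.9]]
b = [0.0, 0.0]

# A single non-zero input followed by zeros.
states = rnn_forward([[1.0], [0.0], [0.0]], W_x, W_h, b)
for t, state in enumerate(states):
    print(t, state)
```

The trace of the first input shrinks at every step. This gradual loss of early information is the kind of limited memory that gated architectures such as LSTM and GRU were designed to mitigate.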
Unsupervised learning, in contrast, operates on unlabeled data, seeking to uncover the underlying data structure without external guidance. Employing techniques such as clustering and dimensionality reduction, it groups and condenses data representations. Notable algorithms in this domain include K-means clustering and principal component analysis (PCA). Although unsupervised learning can reveal structure without labeled examples, its accuracy on a given diagnostic task may not match that of supervised learning.
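To make the clustering idea concrete, here is a small K-means sketch in plain Python. The six two-dimensional points and the starting centroids are hypothetical. Real implementations usually choose starting centroids more carefully and stop when assignments no longer change.

```python
def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    labels = []
    for _ in range(iterations):
        labels = [nearest(pt, centroids) for pt in points]
        updated = []
        for c in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(dim) / len(members) for dim in zip(*members)])
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(centroids[c])
        centroids = updated
    return labels, centroids

# Hypothetical unlabeled feature vectors forming two loose groups.
points = [[1.0, 1.0], [1.5, 2.0], [1.0, 0.5],
          [8.0, 8.0], [9.0, 9.5], [8.5, 9.0]]
labels, centroids = kmeans(points, centroids=[[1.0, 1.0], [8.0, 8.0]])
print(labels)
print(centroids)
```

No labels are used at any point: the grouping emerges only from distances between samples.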
This review examines the application of DL models across various imaging modalities in EC, encompassing screening, diagnosis, molecular classification, typing, prognosis, and treatment.
DL for MRI
Research has shown that preoperative MRI-radiomics analysis is a reliable predictor of tumor grading, deep myometrial invasion (MI), lymphovascular space invasion, and nodal metastasis in EC patients. 8 DL models have become essential tools in EC evaluation using MRI, marking a significant advancement in recent research (Table 1). Leveraging the advanced image recognition capabilities of CNNs, these models have notably improved the diagnostic accuracy of EC, especially in quantifying the depth of MI. By streamlining the segmentation of lesion areas and assessing MI depth, these algorithms efficiently filter through irrelevant image data, extracting features of greater diagnostic value, while eliminating the need for manual tumor margin delineation.
Table 1. Studies conducted for EC based on MRI using DL techniques.
A collaborative study 9 that combined the YOLOv3 algorithm with ResNet stands out for its successful automation of EC lesion localization and MI depth determination, achieving an accuracy rate of 84.78%. This result indicates a diagnostic proficiency comparable to that of expert radiologists. In another study, Mao et al. 10 developed three MI depth assessment algorithms tailored to the diverse morphologies of the uterus and tumor. By selecting the most suitable algorithm for each case, they achieved an accuracy rate of 93.34%, demonstrating superior accuracy in assessing the progression of early-stage EC.
In a comparative analysis of DL models against radiologists’ interpretations of T2-weighted imaging (T2WI), both single and composite image sets were examined. The findings revealed that DL models outperformed radiologists in certain diagnostic scenarios, particularly with axial apparent diffusion coefficient (ADC) maps and axial contrast-enhanced T1-weighted imaging (CE-T1WI). 11 Enhancing the training datasets with diverse imaging modalities further improved the diagnostic capabilities of these models. Zheng et al. 12 constructed an eight-layer CNN model, which, through data augmentation techniques, improved the model's ability to distinguish between EC-negative and stage I EC-positive patients on MRI scans, thus enhancing the precision of early-stage EC diagnosis.
Wang et al. 13 made a significant contribution by using CNN to extract features from diffusion-weighted imaging (DWI), combined with clinical parameters, radiomic features, and ADC values. They developed a comprehensive predictive model for assessing microsatellite instability (MSI) in EC patients. This model achieved an impressive area under the curve (AUC) value of 0.885 in the test set, indicating a significant step towards predicting molecular states from imaging data. This approach offers an alternative to traditional pathological evaluations, providing a more accessible, cost-effective, and non-invasive diagnostic method for MSI assessment, ultimately enriching patient care.
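For readers less familiar with the AUC metric reported above, the sketch below computes it from its rank-based definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are invented toy values, not data from any cited study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical example: 1 = MSI, 0 = microsatellite stable.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the metric does not depend on any single decision threshold.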
In summary, DL models, especially CNNs, have significantly improved the diagnostic precision of EC when applied to MRI. These models excel at automating the segmentation of lesion regions and evaluating MI depth, reducing the workload for radiologists and increasing diagnostic efficiency. However, these models have limitations, primarily their reliance on proprietary datasets, which calls for validation across larger, multi-center datasets. Additionally, the current focus on T2WI data, with less exploration of other MRI sequences like dynamic contrast-enhanced imaging, may limit the diagnostic potential of these models. Future research should integrate multisequence MRI data to enhance the versatility and accuracy of these models.
DL for hysteroscopy
Traditional hysteroscopy, while beneficial, has limitations. It often relies on physicians’ subjective assessments, potentially failing to identify minor lesions and risking misdiagnosis and oversight. 23 To address these challenges, the integration of DL has brought transformative improvements to EC diagnosis. DL models autonomously learn and extract intricate features from a vast array of hysteroscopic images, achieving precise EC recognition. During model training, these algorithms have shown an exceptional ability to discern minute differences between healthy and neoplastic tissues across various dimensions, including color, texture, and morphology. Crucially, DL can identify early-stage lesions that may elude human perception, substantially reducing the likelihood of diagnostic errors. This technology not only enhances diagnostic precision but also speeds up the transition from lesion identification to treatment initiation, providing patients with more timely and effective therapeutic strategies. This underscores the growing significance of artificial intelligence in the early detection and management of EC.
Recent scholarly efforts have utilized a range of DL models, such as Xception, MobileNetV2, and EfficientNetB0, demonstrating remarkable efficiency and accuracy in image recognition tasks (Table 2). Zhao et al. 24 introduced a computer-aided diagnostic system based on the EfficientNet architecture, incorporating the ParNet attention mechanism. This mechanism, which combines spatial and channel feature learning with an advanced Squeeze-and-Excitation module known as Skip-Squeeze-Excitation, has sharpened the model's focus on salient regions and features within the data. The system's impressive AUC value of 0.941 and accuracy rate of 89.4% in the test set highlight its superior ability to differentiate benign from malignant endometrial lesions, marking a critical advancement for the early identification and treatment of these conditions. Furthermore, a study incorporating continuity analysis methods has raised the model's accuracy in malignancy detection to over 90%. 25 This outcome underscores that even with a modest sample size, DL models can achieve high diagnostic accuracy, a key factor in enhancing the efficacy and precision of clinical diagnostics.
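The channel-attention idea behind Squeeze-and-Excitation can be sketched as follows. This is the standard Squeeze-and-Excitation pattern (global average pooling, a small bottleneck, and sigmoid gates), not the Skip-Squeeze-Excitation variant or the system of Zhao et al. The feature maps and weights are hypothetical.

```python
import math

def squeeze_excite(feature_maps, W_reduce, W_expand):
    """Reweight each channel by a gate computed from global channel statistics."""
    # Squeeze: one average value per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation: bottleneck layer with ReLU, then one sigmoid gate per channel.
    reduced = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
               for row in W_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(w * r for w, r in zip(row, reduced))))
             for row in W_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    recalibrated = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]
    return recalibrated, gates

# Hypothetical input: two channels of size 2x2.
feature_maps = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 2.0], [2.0, 4.0]],
]
W_reduce = [[0.5, 0.5]]      # 2 channels -> 1 bottleneck unit
W_expand = [[1.0], [-1.0]]   # 1 bottleneck unit -> 2 channel gates

recalibrated, gates = squeeze_excite(feature_maps, W_reduce, W_expand)
print(gates)
```

In a trained network the two weight matrices are learned, so the gates emphasize channels that carry diagnostically useful patterns and suppress the rest.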
Table 2. Studies conducted for EC based on hysteroscopy using DL techniques.
In conclusion, DL models have shown an extraordinary aptitude for discerning subtle differences between normal and malignant tissues in hysteroscopic images, significantly increasing the sensitivity of early lesion detection. However, obtaining high-fidelity hysteroscopic image data remains a challenge, as image quality can be compromised by factors such as lighting conditions and equipment variability, impacting model performance. Moreover, given the invasive nature of hysteroscopy, its clinical utility is inherently limited, emphasizing the need for ongoing research into the seamless integration of DL technology within non-invasive diagnostic frameworks.
DL for US
US is a crucial tool for the initial detection of lesions, offering significant advantages in depicting the thickness, uniformity, and presence of endometrial abnormalities. This makes it pivotal in the early screening and diagnosis of EC. In the context of breast cancer, DL-based US technology has been repeatedly validated and proven to have high clinical value.29–31 This not only confirms the effectiveness of DL technology in US diagnostics but also provides new insights and methods for diagnosing EC.
DL has recently been applied to analyze US images for assessing EC, focusing on the degree of MI and lesion characteristics (Table 3). Models such as EfficientNet-B6 and Dense-Pyramid-Attention U-Net (DPA-UNet) have demonstrated their ability to identify tumor features, such as edges, size, and morphology, from extensive datasets of US images. This capability helps clinicians make more nuanced staging and treatment decisions. Notably, Liu et al. showed that the EfficientNet-B6 model outperformed radiologists in predicting the depth of MI in EC patients, with higher accuracy, sensitivity, specificity, and AUC. 28
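The threshold-based metrics used in such model-versus-radiologist comparisons can be computed from the four cells of a confusion matrix, as in the sketch below. The ten true and predicted labels are invented for illustration and do not come from the cited work.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for labels coded as 1 (positive) or 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # share of true positives that are detected
        "specificity": tn / (tn + fp),  # share of true negatives correctly ruled out
    }

# Hypothetical example: 1 = deep MI, 0 = superficial MI.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity together matters because accuracy alone can look high when one class dominates the dataset.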
Table 3. Studies conducted for EC based on US using DL techniques.
However, applying DL to US imaging of EC faces several fundamental challenges. The limited availability of high-quality US images hampers the broad implementation of DL models in clinical settings. Additionally, the quality of US images is influenced by various factors, including the operator's skill level, the quality of the equipment, and the patient's body habitus. These factors introduce variability in the data and complicate model training.
Therefore, researchers are encouraged to construct large-scale, high-quality US datasets, to investigate strategies that mitigate the impact of these sources of variability on model performance, and to develop DL models with greater clinical value, so as to fully harness the potential of US in EC screening and diagnosis. Through these efforts, it is anticipated that more accurate and efficient early diagnosis of EC will be achieved, thereby providing patients with more timely and effective treatment strategies.
DL for histopathology
DL has unlocked vast potential in the realm of histopathological analysis, particularly for EC. A suite of DL models has been tailored for scrutinizing histological sections, heralding a new era in precision diagnostics (Table 4). The EndoNet model, developed by Manu Goyal and colleagues, exemplifies this progress. It uses CNNs to distill histological features and employs vision transformers to synthesize these attributes, subsequently categorizing EC slides into low-grade and high-grade tiers based on visual signatures. This model achieved an impressive AUC of 0.86 in external validation testing. 32 The HIENet model, another notable development, leverages CNNs and attention mechanisms to discern and classify EC, demonstrating superior performance over conventional machine learning approaches with an accuracy rate of 76.91% and an AUC of 0.9579 in binary classification tasks. 33 The G2LNet model, proposed by Zhao et al., further refines diagnostic precision by amalgamating global and local features with multi-scale learning strategies. 34
Table 4. Studies conducted for EC based on histopathology using DL techniques.
The frontier of histological analysis is expanding to include multimodal data integration, such as immunohistochemistry (IHC) images and clinical data, which is anticipated to enrich pathological information and aid physicians in rendering more accurate diagnoses.35,36 To bolster model interpretability, techniques like Class Activation Mapping (CAM) have been employed, enabling the visualization of image regions pivotal to classification decisions. 33 This aids in demystifying model decision-making for clinicians and guides model refinement.
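In its original formulation, a class activation map is a weighted sum of the final convolutional feature maps, with the weights taken from the classifier layer for the class of interest. The sketch below shows that computation on made-up 2x2 feature maps. The weights are hypothetical, and the min-max rescaling step is a common display convention rather than part of the definition.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps for one class, one weight per channel."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [
            sum(w * fmap[y][x] for w, fmap in zip(class_weights, feature_maps))
            for x in range(width)
        ]
        for y in range(height)
    ]

def normalize(cam):
    """Rescale the map to [0, 1] so it can be overlaid on the image as a heatmap."""
    values = [v for row in cam for v in row]
    low, high = min(values), max(values)
    if high == low:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - low) / (high - low) for v in row] for row in cam]

# Hypothetical final-layer feature maps (two channels) and class weights.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
weights_for_class = [2.0, 0.5]

cam = class_activation_map(feature_maps, weights_for_class)
heatmap = normalize(cam)
print(heatmap)
```

Locations with values near 1 are those that contributed most to the class score. In practice the map is upsampled to the input resolution before being overlaid on the slide or image.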
In the context of precision medicine, traditional histological typing has shown limitations, prompting the emergence of molecular typing as a novel classification paradigm. This method, encompassing four molecular subtypes, is intricately linked to EC etiology, tumor behavior, and patient prognosis. Incorporating molecular subtyping into risk stratification mitigates the risks of both overtreatment and undertreatment, optimizing clinical management. 37 The latest European ESGO/ESTRO/ESP guidelines and FIGO 2023 staging have officially integrated molecular typing into the tumor staging system, underscoring its clinical relevance.38,39
DL has opened new avenues for EC molecular typing, particularly in automated feature extraction and image analysis. Vincent Wagner et al. developed a DL algorithm that predicts EC molecular subtypes from H&E-stained histological slides with ResNet-18, achieving accuracy comparable to current clinical molecular typing methods. 40 Runyu Hong et al. introduced Panoptes, a multi-resolution deep CNN capable of predicting histological and molecular subtypes, as well as 18 common gene mutations in EC, showcasing its potential in clinical settings. 41
The predictive power of DL is also evident in mismatch repair protein status and MSI, 42 with models like those proposed by Zhang et al. 43 and Mina Umemoto 44 demonstrating high accuracy and AUC values. These advancements reduce reliance on pathologists’ subjective judgments, enhancing diagnostic consistency and efficiency.
Wang et al. 45 developed a comprehensive DL framework that includes foreground region localization, iterative patch sampling, patch attention scoring, and weighted softmax ensemble decision-making. This framework, leveraging a pre-trained Modified Fully Convolutional Network (MFCN) and an InceptionV3 classifier, has shown significant improvements in accuracy, precision, and sensitivity compared to benchmark DL methods.
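One plausible way to read the last two steps, patch attention scoring followed by a weighted softmax ensemble, is sketched below. This is an illustrative assumption and not the authors' implementation: the function names, the two-class logits, and the attention scores are all hypothetical.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_softmax_ensemble(patch_logits, attention_scores):
    """Combine per-patch class probabilities, weighting each patch by its attention score."""
    total_score = sum(attention_scores)
    weights = [s / total_score for s in attention_scores]
    patch_probs = [softmax(logits) for logits in patch_logits]
    n_classes = len(patch_logits[0])
    slide_probs = [
        sum(w * probs[c] for w, probs in zip(weights, patch_probs))
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: slide_probs[c])
    return slide_probs, predicted

# Hypothetical classifier outputs for three patches from one slide.
patch_logits = [[2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
# Hypothetical attention scores: the third patch is judged most informative.
attention_scores = [1.0, 1.0, 2.0]

slide_probs, predicted = weighted_softmax_ensemble(patch_logits, attention_scores)
print(slide_probs, predicted)
```

Weighting by attention lets a few highly informative patches outweigh many uninformative ones. Slide-level labels often depend on small regions of tissue.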
Sarah Fremond's im4MEC model 46 not only predicts the four molecular classes but also identifies morphological features associated with each class, shedding light on the genetic characteristics of tumors and supporting more personalized treatment plans for patients. Subsequently, the same research team developed the HECTOR model to predict postoperative distant recurrence risk using H&E-stained slides from the same cohort. Their findings demonstrated that the proposed method surpassed the current gold standard, which integrates pathological and molecular analyses, in predicting distant recurrence risk. The model was also found to predict the benefit of adjuvant chemotherapy. Both capabilities contribute to personalized treatment in EC. 47
In summary, DL models have markedly enhanced the accuracy of EC classification and molecular subtyping by extracting rich pathological information from histopathological images. However, challenges remain, including the need for improved model interpretability and the ability to capture the heterogeneity of pathological samples. Addressing these challenges is crucial for future research. Moreover, the validation of molecular feature prediction models with real-world data is essential, necessitating multicenter, large-scale studies to assess their practical efficacy.
Discussion
The global incidence of EC is increasing, affecting not only developed nations but also showing a rising prevalence in developing countries. Early screening and diagnosis are crucial for EC patients, as they are key to ensuring prompt treatment and enhancing survival rates. Despite the critical need, effective screening methods for EC are lacking, with definitive diagnoses often delayed until the onset of overt symptoms. 52 In this context, the integration of DL models into clinical practice promises to enhance diagnostic precision and alleviate the workload of healthcare professionals, significantly boosting the efficiency of medical operations.
This review comprehensively synthesizes the existing literature on the application of DL models in predicting EC classification, molecular typing, and MI. DL models have demonstrated remarkable potential in MRI-based EC assessments, garnering considerable research interest. However, current studies predominantly employ T2WI data, with limited exploration of other MRI sequences, such as dynamic contrast-enhanced imaging. Future research should investigate the integration of diverse MRI sequences to enhance the model's diagnostic accuracy and generalizability.
Hysteroscopy also faces challenges in procuring high-quality image data, with image processing potentially compromised by factors such as lighting variations and equipment disparities. Yet, intraoperative hysteroscopic images offer a more direct visualization of lesions and endometrial features, which is invaluable for surgical decision-making. Thus, the advancement of DL technologies tailored to hysteroscopic images holds significant clinical relevance.
A notable gap exists in the development of DL models utilizing US imaging, despite its widespread use for EC assessment. This scarcity may stem from the challenges associated with acquiring high-quality US datasets, as image quality is contingent upon various factors, including operator expertise, equipment quality, and patient anatomy. While US is a staple in gynecological examinations, its diagnostic sensitivity and specificity for EC may be surpassed by other imaging modalities, such as MRI. Nonetheless, as a vital tool for early EC detection, bolstering research in DL models based on US imaging is crucial for the early identification of patients with high-risk factors, facilitating timely intervention. We advocate for an intensified focus on US-based DL research to bridge this gap.
Histopathology-based DL models represent a burgeoning area within medical AI, harnessing extensive pathological data to inform clinical decisions with precision. By integrating pathomics and radiomics approaches, these models can elucidate critical information such as lymph node status, MI depth, and histological classification, thereby optimizing medical efficiency and reducing costs.
In contrast to DL’s application in other oncological contexts, molecular typing in EC has garnered significant attention, with researchers employing four-class models to categorize EC into distinct molecular subtypes. This stratification facilitates personalized, precision medicine. However, current predictive models for molecular features are primarily confined to public datasets, necessitating validation with real-world clinical data. The extraction of molecular pathological features from biopsy specimens represents a timely avenue for achieving personalized treatment goals, marking a pivotal area for future research endeavors.
Future developments and challenges
The current landscape of DL models for EC is filled with challenges that need to be addressed to fully realize their potential. While progress has been made, most models are still trained and validated on limited datasets, which can reduce their effectiveness when applied to diverse real-world data. To overcome this, future studies should focus on multicenter and multisource data validation to ensure robust model performance across varied patient populations. Establishing open-access data-sharing platforms is crucial for maximizing data utility and fostering scientific collaboration, essential for the clinical effectiveness of DL models. However, creating extensive, high-quality datasets is a complex task with its own set of challenges.
The data preparation phase is inherently complex, requiring meticulous preprocessing, feature annotation, and accurate labeling, all of which are prone to human error. Discrepancies in scanner parameters and imaging device settings can introduce variability in image acquisition, highlighting the need for larger sample sizes to enhance model adaptability and robustness. Additionally, class imbalance poses a significant challenge for DL models, as they may exhibit bias towards more prevalent categories, potentially compromising their generalizability. International collaboration and data sharing are vital for tailoring models to diverse ethnicities and genetic backgrounds, thereby enhancing their applicability. Moreover, integrating DL models with varied architectures could synergistically improve diagnostic precision.
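One common mitigation for class imbalance is to weight each class inversely to its frequency when computing the training loss. The sketch below uses the formula N / (K * n_c), where N is the number of samples, K the number of classes, and n_c the number of samples in class c. The label counts are hypothetical, and this is only one of several possible remedies (others include resampling and augmentation of the minority class).

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class loss weights so that rare classes count more during training."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical imbalanced dataset: 8 benign and 2 malignant samples.
labels = ["benign"] * 8 + ["malignant"] * 2
class_weights = inverse_frequency_weights(labels)
print(class_weights)
```

With these weights, an error on a malignant sample costs four times as much as an error on a benign one, which offsets the four-to-one difference in class frequency.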
In anticipation of future advancements, research should explore the integration of image data, electronic medical records, and genomic information to develop multimodal DL models. Such an approach could enhance the models’ predictive capabilities, enabling personalized prognoses for individual patients. Furthermore, bolstering public education and fostering trust in artificial intelligence among patients and healthcare professionals are pivotal for the clinical adoption of DL models. In essence, by transcending dataset limitations, enhancing model generalizability, integrating multimodal data, and fostering public trust, DL models are poised to play an increasingly pivotal role in the diagnostic and therapeutic management of EC.
Conclusion
The rapid development of DL technology has made it possible to improve the diagnosis and precision treatment of EC by automatically extracting relevant features and fully leveraging large datasets. In this study, we reviewed research on the application of DL to the medical imaging modalities used for EC diagnosis, namely MRI, hysteroscopy, US, and histopathology, covering diagnostic classification, MI assessment, and molecular characterization of EC. We also outlined the challenges and potential future research directions for DL methods in EC.
Footnotes
Contributorship
JXJ, FCL, and SWY were involved in methodology, software, data curation, and writing (original draft). FLL, LQQ, and HYP contributed to conceptualization, supervision, project administration, and writing (review and editing). CBX was involved in writing (review and editing).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical approval statement
Review and approval by an ethics committee was not needed for this study because this study was an analysis based on the literature and did not involve human or animal studies.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Guarantor
Baoxia Cui
