A Preliminary Study on the Auto-Segmentation of Targets and Organs at Risk in Pediatric Total Marrow and Lymphoid Irradiation

Abstract

Introduction

Leukemia is one of the most prevalent cancers in children. Total marrow and lymphoid irradiation (TMLI) delivered by helical tomotherapy (TOMO) has been widely adopted in clinical practice as a conditioning regimen prior to bone marrow transplant (BMT). Accurate and efficient segmentation of target volumes and organs at risk (OARs) is a prerequisite for precise TMLI. The purpose of this study was to investigate the feasibility of deep learning-based auto-segmentation (using 2D U-net and 3D V-net models) of target volumes (bone marrow and lymphatic drainage regions) and OARs in pediatric TMLI.

Methods

This was a retrospective study. Thirty-six pediatric patients treated with TMLI between 2018 and 2024 were included. Target volumes and OARs were manually segmented and refined. The CT images and corresponding contours were imported into the AccuLearning workstation (Manteia Company, Xiamen, China) to train, validate, and test 2D U-net and 3D V-net deep learning models. Auto-segmentation performance was evaluated on 6 test cases using the Dice Similarity Coefficient (DSC), 95% Hausdorff Distance (HD95), and Average Surface Distance (ASD).

Results

DSC values exceeded 0.7 for all OARs except the lenses segmented by the 3D V-net model. Among the target volumes, bone structures achieved high segmentation accuracy.

Conclusion

Overall, the 3D V-net model demonstrated slightly superior performance compared to the 2D U-net model. Auto-segmented contours generated by the 2D U-net and 3D V-net models, with minor manual adjustments, are clinically applicable for TMLI radiotherapy planning.

Keywords

children, radiotherapy, total marrow and lymphoid irradiation, auto-segmentation

Introduction

Radiotherapy is a common treatment modality for malignant tumors, with over 70% of cancer patients requiring radiation therapy.¹ Accurate and efficient segmentation of target volumes and Organs at Risk (OARs) is a prerequisite for precise radiotherapy.² Currently, manual segmentation by physicians remains the gold standard, though it is a time-consuming process. Studies^3,4 indicate that contouring for a single total marrow and lymphoid irradiation (TMLI) patient typically requires 12–16 h. Even with established contouring guidelines, inter-observer variability exists due to differing physician preferences, and intra-observer variations may occur when the same physician contours at different times.⁵

Compared to manual segmentation, auto-segmentation has demonstrated significant advantages since its inception, including reduced workload for physicians,⁶ shorter patient waiting times, and improved therapeutic ratios for tumors.^7–11 Currently, the most widely used auto-segmentation technology relies on Deep Learning (DL)-based algorithms, with the workflow illustrated in Figure 1.

Figure 1.

Flow chart of the auto-segmentation.

Leukemia is one of the most prevalent cancers in children. The use of TMLI via helical tomotherapy (TOMO) as a conditioning regimen prior to bone marrow transplant (BMT) has been widely adopted in clinical practice.^3,12–16 Several recent studies have investigated auto-segmentation for TMLI.^3,4,17–19

The primary objective of this study is to explore the feasibility of auto-segmentation for target volumes and OARs in pediatric TMLI using deep learning-based techniques [2D U-net (Figure 2) and 3D V-net (Figure 3) models]. This approach aims to address the limitations of conventional auto-segmentation software in handling pediatric-specific anatomical variations,^20,21 along with overcoming the time-consuming nature and inter-observer variability inherent in traditional manual segmentation, thereby enhancing radiotherapy workflow efficiency.

Figure 2.

2D U-net structure.

Figure 3.

3D V-net structure.

Materials and Methods

Dataset Acquisition and Preprocessing

This retrospective study enrolled 36 consecutive pediatric patients (25 males and 11 females; age distribution is presented in Figure 4) who underwent TMLI at the Department of Radiotherapy, The Seventh Medical Center of Chinese PLA General Hospital between 2018 and 2024. All patient data were de-identified. The study protocol was approved by the Ethics Committee of the same institution (No. S 2025-080-01), and the requirement for written informed consent was waived due to the retrospective nature of the study and the use of fully anonymized data.

Figure 4.

Patient age distribution.

CT simulation was performed using a Philips Brilliance Big Bore CT scanner (Philips Healthcare, Best, the Netherlands). Patients were positioned supine with a head-first orientation, immobilized using a head-neck-shoulder thermoplastic mask for the upper thorax, a thermoplastic mask for the abdomen/pelvis, and a vacuum cushion for the lower extremities. Due to variations in pediatric patient height, full-body scans (from head to feet) were acquired for shorter children, while taller children underwent two-phase scanning: initial head-first supine scans up to the knees, followed by feet-first supine scans to complete the procedure. Notably, the feet-first supine (FFS) scans only covered the leg bone marrow target regions. To standardize the study scope, all defined target volumes were limited to regions above the knees. Acquired CT images had a resolution of 512 × 512, with slice thickness and spacing of 5 mm, and a tube voltage of 120 kV. These images were then imported into the Pinnacle³ treatment planning system (Philips Radiation Oncology Systems, Madison, WI, USA) for manual segmentation.

The OARs defined in this study include: brain, brainstem, heart, kidneys, liver, lungs, oral cavity, parotid glands, stomach, bladder, lenses, eyeballs, thyroid, esophagus, small bowel, colon, bowel bag and rectum. The clinical target volume (CTV) is subdivided into three components (Figure 5): CTV1 (femoral heads, humeral heads, and bone marrow excluding the appendicular bones), CTV2 (bone marrow above the knees excluding CTV1 regions), and CTV3 (lymphatic drainage regions).

Figure 5.

Clinical target volume (CTV1 in red, CTV2 in green, CTV3 in purple).

The planning target volume (PTV) was generated by merging all three CTV components with appropriate margin expansions for subsequent treatment planning. Following initial segmentation, manual refinements were performed based on the patients’ clinical data and unified contouring criteria. The final contours were reviewed and validated by three experienced radiation oncologists, serving as the Ground Truth (GT).

No specific bladder preparation protocol was required during simulation, as the degree of bladder filling showed no significant impact on treatment delivery or OAR protection. While a filled bladder could be delineated with clearer boundaries on CT images, the borders between an unfilled bladder and adjacent immature pediatric pelvic structures (such as the uterus or prostate) were often indistinct due to their underdeveloped anatomical differentiation. To date, no internationally established guidelines exist for TMLI target volume definition; the criteria adopted in this study were based on our institution's clinical experience. The reporting of this study conforms to the STROBE guidelines.²²

Environment Configuration

The training and validation of the auto-segmentation models were conducted using AccuLearning, a Deep Learning (DL)-based medical image auto-segmentation training platform developed by Manteia Technologies Co., Ltd (Xiamen, China). The platform operates on a Windows 10 system with an Intel® Core™ i7-10700 CPU @ 2.90 GHz processor. AccuLearning supports small-sample training for auto-segmentation algorithms, where high-precision models can be generated even with limited datasets, with a minimum recommended training cohort size of 30 cases. To date, few studies have reported the feasibility of small-sample algorithms. In this work, we trained and tested models using data from 36 pediatric leukemia patients on the AccuLearning platform to evaluate the platform's applicability. During training, the platform enables data-driven parameter updates and automated feature extraction, outperforming traditional image processing algorithms.²³

In recent years, Convolutional Neural Networks (CNNs), particularly 2D U-net and 3D V-net architectures and their variants, have been widely adopted in medical image auto-segmentation with promising results.^6,24–34 The AccuLearning platform utilizes two network architectures: 2D U-net and 3D V-net. Generally, 3D networks yield superior performance when processing large datasets with abundant z-axis slices in three-dimensional data, whereas 2D networks may achieve better results for smaller datasets or those with limited z-axis slices.

Model Training

The manually segmented patient data were transferred to the AccuLearning platform. Within AccuLearning, each dataset consisted of a CT image series and its corresponding RT Structure file. To optimize data processing and ensure model accuracy, regions of interest (ROIs) were divided into six training datasets. OARs Group 1 comprised the brain, brainstem, oral cavity, lungs (bilateral), heart, liver, stomach, and bladder (8 OARs). OARs Group 2 comprised the left/right eyeballs, left/right lenses, left/right parotid glands, and left/right kidneys (8 OARs). OARs Group 3 comprised the thyroid, esophagus, small bowel, colon, bowel bag, and rectum (6 OARs). CTV1, CTV2, and CTV3 were each assigned to a separate training set, giving three CTV groups.
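The six training datasets described above can be written down as a simple mapping, which makes the structure counts easy to check. This is only an illustrative sketch: the dictionary layout, the structure labels, and the helper name `summarize` are assumptions and do not reflect how AccuLearning stores its configuration.

```python
# Grouping of ROIs into six training datasets, as described in the text.
# The dictionary layout and label spellings are illustrative assumptions.
ROI_GROUPS = {
    "OARs Group 1": ["Brain", "Brainstem", "Oral cavity", "Lungs",
                     "Heart", "Liver", "Stomach", "Bladder"],
    "OARs Group 2": ["Eye_L", "Eye_R", "Lens_L", "Lens_R",
                     "Parotid_L", "Parotid_R", "Kidney_L", "Kidney_R"],
    "OARs Group 3": ["Thyroid", "Esophagus", "Small bowel",
                     "Colon", "Bowel bag", "Rectum"],
    "CTV1": ["CTV1"],
    "CTV2": ["CTV2"],
    "CTV3": ["CTV3"],
}


def summarize(groups):
    """Return (number of training datasets, number of OARs, number of CTVs)."""
    n_oars = sum(len(v) for k, v in groups.items() if k.startswith("OARs"))
    n_ctvs = sum(len(v) for k, v in groups.items() if k.startswith("CTV"))
    return len(groups), n_oars, n_ctvs


print(summarize(ROI_GROUPS))
```

The 22 OAR labels correspond to the 22 rows reported in each of the OAR result tables.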

The 36 patient cases were randomly split into training, validation, and test sets at a ratio of 26:4:6. The training set was used for model development, the validation set for hyperparameter tuning and training progress monitoring, and the test set for final model evaluation. Training parameters were standardized between the 2D U-net and 3D V-net frameworks to ensure comparability. After model training, auto-segmentation was performed on the 6 test cases, generating corresponding RT Structure files.³⁵
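A random 26:4:6 partition of this kind can be sketched as follows. The function name `split_cases`, the case identifiers, and the fixed seed are hypothetical; the text does not describe how the AccuLearning platform or the authors performed the randomization.

```python
import random


def split_cases(case_ids, n_train=26, n_val=4, n_test=6, seed=0):
    """Randomly partition case identifiers into training, validation, and test sets."""
    if n_train + n_val + n_test != len(case_ids):
        raise ValueError("split sizes must sum to the number of cases")
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# 36 hypothetical case identifiers (placeholders, not real patient IDs)
case_ids = [f"case_{i:02d}" for i in range(1, 37)]
train, val, test = split_cases(case_ids)
print(len(train), len(val), len(test))
```

Keeping the test cases strictly separate from the training and validation cases is what allows the test-set metrics to be read as an estimate of performance on unseen patients.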

Evaluation Indicators

The target volumes and OARs manually segmented by physicians served as the Ground Truth (GT). Quantitative evaluation of the auto-segmentation models' performance was conducted using three metrics: Dice Similarity Coefficient (DSC), 95% Hausdorff Distance (HD95), and Average Surface Distance (ASD). Studies^36,37 have indicated that a DSC value greater than 0.70 suggests good reproducibility between structures, and the auto-segmentation results are considered clinically acceptable. While DSC provides a straightforward and intuitive measure, it alone cannot fully characterize all aspects of segmentation quality. Therefore, we supplemented the evaluation with two distance-sensitive metrics: the HD95 and ASD.
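For reference, the DSC between an auto-segmented volume A and the ground truth B is conventionally defined as 2|A ∩ B| / (|A| + |B|). The sketch below computes it on sets of voxel indices; the function name `dice`, the toy masks, and the convention for two empty masks are assumptions made for illustration, not the platform's implementation.

```python
def dice(auto_voxels, gt_voxels):
    """Dice similarity coefficient between two sets of voxel indices."""
    auto_voxels, gt_voxels = set(auto_voxels), set(gt_voxels)
    if not auto_voxels and not gt_voxels:
        return 1.0  # assumed convention: two empty masks agree perfectly
    overlap = len(auto_voxels & gt_voxels)
    return 2.0 * overlap / (len(auto_voxels) + len(gt_voxels))


# Toy example: two 4-voxel masks that share 3 voxels
gt = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
auto = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (0, 2, 1)}
print(dice(auto, gt))  # 2 * 3 / (4 + 4) = 0.75
```

Because the denominator is the total voxel count of both masks, a one-voxel disagreement costs far more DSC for a very small structure than for a large one, which is one reason small organs tend to score lower.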

The Hausdorff Distance (HD) evaluates the surface distance in three-dimensional space between automatically and manually segmented structures. Because the maximum distance is sensitive to outlier points, the 95th percentile of the surface distances between the two point sets (HD95) is typically reported instead. A smaller HD95 value indicates closer agreement between the automatic and manual contours and therefore higher segmentation accuracy. HD95 is more tolerant of isolated outliers than the maximum HD, yet it remains highly sensitive to positional errors.³⁸ When the contours agree well, HD95 values remain very small; however, even minor deviations can cause HD95 values to increase abruptly to dozens or even hundreds of millimeters. Generally, larger anatomical regions correspond to higher HD95 values. Therefore, there is no definitive standard for a “good” HD95 value; as long as the HD95 value is not abnormally large, the agreement can generally be considered acceptable. The ASD quantifies the mean distance between the two surface point sets by dividing the sum of the mutual nearest-surface distances by the total number of surface points, serving as a metric of the overall contour deviation between automatically and manually segmented structures. An ASD value approaching zero indicates minimal shape discrepancy between the auto-segmented and manual reference contours.
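The two distance metrics can be sketched on small point sets as follows. This is a brute-force illustration under stated assumptions: surfaces are given as lists of (x, y, z) points in millimeters, HD95 is taken as the larger of the two directed 95th percentiles (other variants pool the distances first), and ASD is the symmetric mean of nearest-surface distances. The function names are hypothetical and the code is not the evaluation routine used by the platform.

```python
import math


def directed_distances(src, dst):
    """For each point in src, the distance (mm) to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def percentile(values, pct):
    """Percentile with linear interpolation between sorted values."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def hd95(surface_a, surface_b):
    """95th-percentile Hausdorff distance (max of the two directed percentiles)."""
    d_ab = directed_distances(surface_a, surface_b)
    d_ba = directed_distances(surface_b, surface_a)
    return max(percentile(d_ab, 95), percentile(d_ba, 95))


def asd(surface_a, surface_b):
    """Symmetric average surface distance over both point sets."""
    d_ab = directed_distances(surface_a, surface_b)
    d_ba = directed_distances(surface_b, surface_a)
    return (sum(d_ab) + sum(d_ba)) / (len(d_ab) + len(d_ba))


# Toy surfaces: two parallel rows of points 1 mm apart, plus one stray point
surface_a = [(float(i), 0.0, 0.0) for i in range(100)]
surface_b = [(float(i), 1.0, 0.0) for i in range(100)] + [(0.0, 60.0, 0.0)]

max_hd = max(directed_distances(surface_a, surface_b)
             + directed_distances(surface_b, surface_a))
print(f"max HD = {max_hd:.1f} mm, HD95 = {hd95(surface_a, surface_b):.1f} mm, "
      f"ASD = {asd(surface_a, surface_b):.2f} mm")
```

In this toy case the single stray point drives the maximum Hausdorff distance to 60 mm, whereas HD95 stays at 1 mm, which illustrates why the percentile form is preferred when isolated mislabeled voxels are present.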

Results

Tables 1–3 present the three evaluation metrics for the auto-segmentation performance of all OARs using the 2D U-net and 3D V-net models, as well as the differences between the two models for the same metrics.
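The mean ± s entries in the tables can be reproduced from per-case metric values along the lines of the following sketch. It assumes that s denotes the sample standard deviation over the 6 test cases and that the Difference column is the mean of the per-case absolute differences between the two models; neither detail is stated explicitly in the text. The per-case values below are hypothetical placeholders, not study data.

```python
import statistics


def mean_sd(values):
    """Return (mean, sample standard deviation) of per-case metric values."""
    return statistics.mean(values), statistics.stdev(values)


def summarize_pair(unet_values, vnet_values):
    """Summarize both models and their per-case absolute difference (assumed definition)."""
    diffs = [abs(u - v) for u, v in zip(unet_values, vnet_values)]
    return {
        "2D U-net": mean_sd(unet_values),
        "3D V-net": mean_sd(vnet_values),
        "Difference": mean_sd(diffs),
    }


# Hypothetical per-case DSC values for 6 test cases (NOT data from the study)
unet_dsc = [0.80, 0.85, 0.90, 0.82, 0.88, 0.85]
vnet_dsc = [0.84, 0.83, 0.91, 0.86, 0.85, 0.87]

for name, (m, s) in summarize_pair(unet_dsc, vnet_dsc).items():
    print(f"{name}: {m:.2f} ± {s:.2f}")
```

Under this assumed definition the Difference column need not equal the difference between the two column means, because per-case differences of opposite sign do not cancel. That would be consistent with rows such as Rectum in Table 1, where the two means differ by 0.01 but the reported difference is 0.05 ± 0.03.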

Table 1.

DSC Values for OARs Auto-Segmented by 2D U-net and 3D V-net Models ($\bar{x} \pm s$).

OARs	2D U-net	3D V-net	Difference
Bladder	0.86 ± 0.10	0.90 ± 0.04	0.05 ± 0.05
Brain	0.98 ± 0.00	0.98 ± 0.00	0.01 ± 0.01
Brain Stem	0.87 ± 0.06	0.87 ± 0.06	0.01 ± 0.01
Heart	0.93 ± 0.02	0.93 ± 0.04	0.01 ± 0.01
Liver	0.96 ± 0.02	0.97 ± 0.01	0.01 ± 0.01
Lung_All	0.96 ± 0.03	0.96 ± 0.03	0
Oral Cavity	0.85 ± 0.05	0.86 ± 0.11	0.05 ± 0.04
Stomach	0.85 ± 0.05	0.89 ± 0.04	0.04 ± 0.03
Eye_L	0.90 ± 0.01	0.90 ± 0.02	0.01 ± 0.01
Eye_R	0.88 ± 0.03	0.90 ± 0.02	0.02 ± 0.01
Kidney_L	0.94 ± 0.01	0.95 ± 0.02	0.01 ± 0.01
Kidney_R	0.92 ± 0.03	0.95 ± 0.02	0.02 ± 0.02
Parotid_L	0.85 ± 0.02	0.86 ± 0.03	0.01 ± 0.01
Parotid_R	0.82 ± 0.07	0.83 ± 0.08	0.02 ± 0.01
Lens_L	0.75 ± 0.07	0.49 ± 0.10	0.24 ± 0.11
Lens_R	0.72 ± 0.09	0.52 ± 0.08	0.24 ± 0.10
Thyroid	0.83 ± 0.07	0.84 ± 0.06	0.02 ± 0.01
Esophagus	0.79 ± 0.09	0.81 ± 0.08	0.03 ± 0.02
Small bowel	0.80 ± 0.05	0.82 ± 0.06	0.02 ± 0.02
Colon	0.82 ± 0.06	0.84 ± 0.07	0.03 ± 0.01
Bowel bag	0.93 ± 0.02	0.93 ± 0.02	0.01 ± 0.01
Rectum	0.83 ± 0.05	0.82 ± 0.03	0.05 ± 0.03

Note: Boldface indicates that 3D V-net outperforms 2D U-net.

Table 2.

HD95 (mm) for OARs Auto-Segmented by 2D U-net and 3D V-net Models ($\bar{x} \pm s$).

OARs	2D U-net	3D V-net	Difference
Bladder	11.44 ± 9.30	9.06 ± 8.85	3.64 ± 3.75
Brain	1.60 ± 0.23	1.70 ± 0.26	0.10 ± 0.23
Brain Stem	3.12 ± 1.26	2.97 ± 1.30	0.57 ± 0.79
Heart	7.22 ± 2.58	5.93 ± 2.94	1.88 ± 1.89
Liver	11.26 ± 10.02	6.63 ± 7.29	3.87 ± 4.81
Lung_All	2.14 ± 1.64	2.05 ± 1.71	0.12 ± 0.19
Oral Cavity	7.50 ± 4.18	6.05 ± 5.50	4.36 ± 4.01
Stomach	8.00 ± 4.47	13.00 ± 8.37	10.85 ± 9.63
Eye_L	1.95 ± 0.65	2.16 ± 1.63	1.00 ± 1.47
Eye_R	4.19 ± 1.37	3.59 ± 1.93	0.66 ± 1.03
Kidney_L	5.42 ± 4.99	2.10 ± 0.79	5.30 ± 5.92
Kidney_R	15.23 ± 11.72	1.92 ± 1.12	11.7 ± 11.5
Parotid_L	8.04 ± 3.73	5.56 ± 3.05	4.13 ± 2.77
Parotid_R	8.64 ± 5.12	7.30 ± 6.10	2.45 ± 4.04
Lens_L	2.41 ± 1.56	59.49 ± 5.93	56.8 ± 5.59
Lens_R	1.78 ± 0.69	59.48 ± 5.32	57.2 ± 5.05
Thyroid	3.91 ± 3.73	1.69 ± 0.50	4.30 ± 5.26
Esophagus	3.31 ± 1.94	3.39 ± 3.01	1.03 ± 0.98
Small bowel	10.93 ± 2.95	12.74 ± 9.16	4.62 ± 4.79
Colon	16.79 ± 7.74	14.76 ± 8.05	4.01 ± 2.59
Bowel bag	7.07 ± 2.73	5.82 ± 1.72	2.19 ± 1.70
Rectum	7.28 ± 2.55	17.70 ± 13.09	8.90 ± 12.35

Note: Boldface indicates that 3D V-net outperforms 2D U-net.

Table 3.

ASD (mm) for OARs Auto-Segmented by 2D U-net and 3D V-net Models ($\bar{x} \pm s$).

OARs	2D U-net	3D V-net	Difference
Bladder	2.92 ± 2.95	1.82 ± 1.88	1.04 ± 1.36
Brain	0.45 ± 0.09	0.40 ± 0.11	0.06 ± 0.05
Brain Stem	0.84 ± 0.58	0.82 ± 0.53	0.16 ± 0.11
Heart	1.67 ± 0.61	1.42 ± 0.83	0.06 ± 0.05
Liver	2.31 ± 2.16	2.06 ± 3.26	0.85 ± 0.83
Lung_All	0.62 ± 0.51	0.70 ± 0.56	0.19 ± 0.25
Oral Cavity	1.83 ± 0.82	1.72 ± 1.81	1.08 ± 0.89
Stomach	1.91 ± 0.89	2.56 ± 1.68	2.20 ± 2.14
Eye_L	0.46 ± 0.10	0.50 ± 0.19	0.14 ± 0.14
Eye_R	2.96 ± 5.18	0.53 ± 0.18	2.06 ± 4.80
Kidney_L	0.97 ± 0.64	0.90 ± 1.08	1.14 ± 0.94
Kidney_R	2.50 ± 1.34	0.43 ± 0.22	1.96 ± 1.23
Parotid_L	2.84 ± 3.25	0.99 ± 0.38	1.59 ± 3.26
Parotid_R	1.61 ± 0.82	1.60 ± 1.37	0.50 ± 0.55
Lens_L	0.44 ± 0.22	14.13 ± 1.73	13.61 ± 1.71
Lens_R	0.46 ± 0.25	14.57 ± 1.36	13.12 ± 2.72
Thyroid	1.22 ± 1.07	0.36 ± 0.09	0.91 ± 0.99
Esophagus	0.64 ± 0.28	0.55 ± 0.36	0.18 ± 0.11
Small bowel	1.92 ± 0.66	1.93 ± 1.02	0.47 ± 0.33
Colon	2.03 ± 0.64	1.86 ± 0.73	0.29 ± 0.25
Bowel bag	1.37 ± 0.55	1.17 ± 0.39	0.35 ± 0.25
Rectum	1.3 ± 0.38	2.10 ± 0.88	0.83 ± 0.89

Note: Boldface indicates that 3D V-net outperforms 2D U-net.

The degree of bladder filling showed no significant impact on auto-segmentation outcomes; thus, no specific bladder preparation protocol was required during simulation. While partially filled bladders could be segmented on CT images, the boundaries of unfilled bladders were often indistinct from adjacent immature pediatric pelvic structures (uterus/prostate) due to their underdeveloped anatomical differentiation.

The 2D U-net model demonstrated clinically acceptable performance for lens contouring, with mean DSC values of 0.75 and 0.72 for the left and right lenses, slightly above the 0.7 threshold. The 3D V-net model failed to reach the DSC threshold of 0.7 for lens segmentation (mean DSC of 0.49 and 0.52), with significant deviations from ground truth contours and occasional segmentation errors.
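The 0.70 acceptance threshold can be applied to the mean DSC values in Table 1 as in this short sketch. Only four rows are copied from the table for brevity; the dictionary layout and the function name `below_threshold` are illustrative assumptions.

```python
DSC_THRESHOLD = 0.70

# structure: (2D U-net mean DSC, 3D V-net mean DSC), copied from Table 1
table1_mean_dsc = {
    "Lens_L": (0.75, 0.49),
    "Lens_R": (0.72, 0.52),
    "Esophagus": (0.79, 0.81),
    "Parotid_R": (0.82, 0.83),
}


def below_threshold(table, model_index, threshold=DSC_THRESHOLD):
    """Names of structures whose mean DSC is not greater than the threshold."""
    return sorted(name for name, means in table.items()
                  if means[model_index] <= threshold)


print("2D U-net:", below_threshold(table1_mean_dsc, 0))
print("3D V-net:", below_threshold(table1_mean_dsc, 1))
```

For these rows the check flags no structure for the 2D U-net and both lenses for the 3D V-net. Note that it uses the means only; with standard deviations of 0.07 to 0.09, individual 2D U-net lens cases may still fall below the threshold.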

Table 4 lists the evaluation metrics for CTV auto-segmentation with the 2D U-net and 3D V-net models. For CTV2 (structurally simpler regions), DSC values were 0.87 ± 0.02 and 0.89 ± 0.04, with HD95 of 12.16 ± 12.14 mm and 8.61 ± 7.72 mm for the 2D U-net and 3D V-net, respectively, indicating better performance of the 3D V-net. CTV3, representing the lymphatic drainage regions, remains the most complex and labor-intensive component of the entire target volume. Nevertheless, its auto-segmentation achieved favorable results. For clinical implementation, only minor refinements to auto-segmented details are required to meet treatment planning specifications. Although CTV1 metrics appeared acceptable, the 2D U-net failed to adequately contour the annular skull region above the pituitary gland, capturing only the outer bone margins while missing the inner contours (Figure 6), which likely lowered the DSC.

Figure 6.

Comparison of CTV1 segmentation results using the 2D U-net model at the head (manual segmentation in green, auto-segmentation in red).

Table 4.

Evaluation Metrics for CTVs Auto-Segmented by 2D U-net and 3D V-net Models ($\bar{x} \pm s$).

CTV	Metrics	2D U-net	3D V-net	Difference
CTV1	DSC	0.90 ± 0.06	0.90 ± 0.07	0.01 ± 0.01
	HD95/mm	5.06 ± 6.02	5.33 ± 6.06	0.47 ± 0.76
	ASD/mm	0.99 ± 0.88	1.23 ± 1.15	0.28 ± 0.58
CTV2	DSC	0.87 ± 0.02	0.89 ± 0.04	0.02 ± 0.03
	HD95/mm	12.16 ± 12.14	8.61 ± 7.72	3.56 ± 5.99
	ASD/mm	1.95 ± 1.22	1.56 ± 1.07	0.53 ± 0.57
CTV3	DSC	0.85 ± 0.04	0.84 ± 0.03	0.01 ± 0.01
	HD95/mm	5.44 ± 0.96	5.61 ± 1.02	0.17 ± 0.18
	ASD/mm	1.25 ± 0.40	1.37 ± 0.43	0.11 ± 0.06

Note: Boldface indicates that 3D V-net outperforms 2D U-net.

Manual contouring for a pediatric TMLI patient typically requires 5–8 h. By utilizing the auto-segmentation model trained in this study, the entire workflow—from data import and auto-segmentation to contour review and modifications meeting clinical planning requirements—can be reduced to approximately 1.5 h, thereby enhancing workflow efficiency and reducing pre-treatment waiting periods for patients.
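The relative time saving implied by these figures can be worked out directly. The sketch below uses only the 5–8 h and 1.5 h values quoted above; the function name `relative_reduction` is a hypothetical helper.

```python
def relative_reduction(manual_h, assisted_h):
    """Fraction of contouring time saved relative to fully manual work."""
    return (manual_h - assisted_h) / manual_h


ASSISTED_H = 1.5  # approximate assisted workflow time quoted in the text

for manual_h in (5.0, 8.0):
    saved = relative_reduction(manual_h, ASSISTED_H)
    print(f"{manual_h:.0f} h -> {ASSISTED_H} h: {saved:.0%} less time")
```

This corresponds to a reduction of roughly 70% to 81% in contouring time per patient, depending on where in the 5–8 h range the manual workload falls.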

Discussion

This study focused on pediatric TMLI contour segmentation, training and validating 2D U-net and 3D V-net models using imaging data from 30 pediatric patients and testing the auto-segmentation models on 6 additional pediatric cases. Analysis of the test set results demonstrated the feasibility of CNN-based auto-segmentation for pediatric TMLI. This approach enhances workflow efficiency in radiotherapy centers and provides technical support for implementing pediatric TMLI. While the 3D V-net slightly outperformed the 2D U-net overall, specific challenges persisted, such as its suboptimal lens segmentation and its lack of advantage over the 2D U-net for CTV1 and CTV3.

The suboptimal lens segmentation performance of the 3D V-net model may be attributed to the inherently small volume of the lenses, which typically span only 2 CT slices. This resulted in insufficient z-axis data for robust 3D model training, thereby compromising the network's ability to learn effective spatial features. Regarding the failure of the 2D U-net model to adequately contour the inner boundaries of annular bony structures, repeated testing confirmed this persistent limitation, suggesting potential inherent limitations at the algorithmic level for parsing such complex anatomical configurations. In contrast, the 3D V-net model did not exhibit this specific issue.

Watkins et al¹⁸ trained an auto-segmentation model using 100 clinical TMLI patient datasets based on the U-net framework within the Medical Mind AI-software, and applied the trained model to 21 clinical cases. The results showed that 18 out of 21 OARs achieved a DSC >0.8, aligning closely with the findings of the current study. Notably, the DSC values for the oral cavity and stomach exceeded 0.9, surpassing the corresponding results in this study. Although the target definition criteria in their study differed slightly from ours (eg, lymphatic drainage coverage and bone marrow subdivision), the auto-segmentation outcomes exhibited remarkable similarity, indicating model robustness to anatomical variations. For lens auto-segmentation, Watkins et al¹⁸ also encountered suboptimal performance for small organs (eg, lenses and optic chiasm), mirroring the challenges observed in our study. Literature^39,40 further confirms that lens auto-segmentation generally underperforms; however, since manual lens segmentation is quick (typically requiring only a few minutes) and does not significantly increase clinicians’ workload, excessive focus on optimizing its auto-segmentation may be unwarranted. In previous studies, the cranial bones were consistently defined as a separate target volume, achieving DSC values of 0.814 ± 0.99 and 0.893 ± 0.005, both indicating poorer auto-segmentation performance compared to other bony targets in the respective studies.¹⁸ Combined with the suboptimal cranial bone contouring observed with the 2D U-net model in this study, this limitation may stem from inherent algorithmic constraints, specifically the U-net neural network's potential inability to recognize annular anatomical configurations (eg, the pituitary-adjacent skull region). However, this hypothesis requires further targeted studies for validation.

CNNs for auto-segmentation are not limited to U-net and V-net architectures; alternative networks can be explored to enhance performance. For instance, Chen et al⁴¹ employed the WB-net artificial intelligence algorithm, achieving modest improvements in auto-segmentation accuracy. Shi et al¹⁷ proposed a dual-encoder hybrid architecture (DE-net) and applied it to TMLI patient auto-segmentation, demonstrating superior performance compared to conventional algorithms and highlighting the potential of hybrid neural networks in medical image analysis. Additionally, studies have introduced decision tree-based approaches that combine atlas-based models with CNN frameworks to tailor auto-segmentation for OARs with distinct anatomical features, thereby improving precision.

This study utilized a limited cohort of 36 pediatric cases, which may constrain model generalizability. Furthermore, the test set contained only 6 cases, a limited sample size that precluded a formal statistical comparison of the performance differences between the two models. The current auto-segmentation results cannot be directly applied in clinical practice without manual verification and refinement. The model also demonstrated limited capability in handling unusual anatomical variations. Future efforts should focus on expanding datasets with multicenter imaging from diverse CT scanners, refining algorithms—including exploring alternative neural network architectures—to enhance pediatric TMLI-specific adaptability and robustness, and ultimately advancing clinical radiotherapy workflows through robust auto-segmentation support.

Conclusion

Based on the study findings, training deep learning-based auto-segmentation models for target volumes and OARs in pediatric TMLI appears feasible. This approach may address the limitations of traditional manual contouring, including its time-consuming nature and inter-observer variability, potentially enhancing radiotherapy efficiency. Overall, the 3D V-net model demonstrated slightly superior performance compared to the 2D U-net. However, in practical applications, the optimal network architecture should be selected based on the anatomical characteristics of specific structures to optimize results. The auto-segmented contours require manual verification and refinement to ensure clinical accuracy before application in radiotherapy planning. In practice, this refinement step is efficient: clinicians typically need only 5–10 min to perform minor manual adjustments to structures such as the lens, maxilla, mandible, testes, and the junctions of target regions.

Footnotes

ORCID iDs

Zhihua Xie

Fuli Zhang

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. Barton, Frommer, Shafiq. Role of radiotherapy in cancer control in low-income and middle-income countries. Lancet Oncol. 2006;7(7):584–595. doi:10.1016/S1470-2045(06)70759-8

2. Lee, Jun, Cho, et al. Deep learning in medical imaging: General overview. Korean J Radiol. 2017;18(4):570. doi:10.3348/kjr.2017.18.4.570

3. Schultheiss, Wong, Liu, Olivera, Somlo. Image-guided total marrow and total lymphatic irradiation using helical tomotherapy. Int J Radiat Oncol Biol Phys. 2007;67(4):1259–1267. doi:10.1016/j.ijrobp.2006.10.047

4. Dei, Lambri, Crespi, et al. Deep learning and atlas-based models to streamline the segmentation workflow of total marrow and lymphoid irradiation. Radiol Med. 2024;129(3):515–523. doi:10.1007/s11547-024-01760-8

5. Liu, Xiao, et al. Segmentation of organs-at-risk in cervical cancer CT images with a convolutional neural network. Phys Med. 2020;69:184–191. doi:10.1016/j.ejmp.2019.12.008

6. Lustberg, Van Soest, Gooding, et al. Clinical evaluation of atlas and deep learning based automatic contouring for lung cancer. Radiother Oncol. 2018;126(2):312–317. doi:10.1016/j.radonc.2017.11.012

7. Ronneberger, Fischer, Brox. U-Net: convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015: Part III. 2015:234–241.

8. Rigaud, Anderson, et al. Automatic segmentation using deep learning to enable online dose optimization during adaptive radiation therapy of cervical cancer. Int J Radiat Oncol Biol Phys. 2021;109(4):1096–1110. doi:10.1016/j.ijrobp.2020.10.038

9. Vinod, Min, Jameson, Holloway. A review of interventions to reduce inter-observer variability in volume delineation in radiation oncology. J Med Imag Rad Onc. 2016;60(3):393–406. doi:10.1111/1754-9485.12462

10. Casati, Piffer, Calusi, et al. Clinical validation of an automatic atlas-based segmentation tool for male pelvis CT images. J Applied Clin Med Phys. 2022;23(3):e13507. doi:10.1002/acm2.13507

11. Wong, Fong, McVicar, et al. Comparing deep learning-based auto-segmentation of organs at risk and clinical target volumes to expert inter-observer variability in radiotherapy planning. Radiother Oncol. 2020;144:152–158. doi:10.1016/j.radonc.2019.10.019

12. Liveringhouse, Robinson, Garcia, Peters, Kim, Latifi. Dosimetric comparison of volumetric modulated arc therapy with tomotherapy based total body irradiation for patients undergoing conditioning prior to hematopoietic stem cell transplantation for acute lymphocytic leukemia. Int J Radiat Oncol Biol Phys. 2021;111(3):303. doi:10.1016/J.IJROBP.2021.07.950

13. Köksal, Baumert, Jazmati, et al. Whole body irradiation with intensity-modulated helical tomotherapy prior to haematopoietic stem cell transplantation: Analysis of organs at risk by dose and its effect on blood kinetics. J Cancer Res Clin Oncol. 2023;149(10):7007–7015. doi:10.1007/s00432-023-04657-7

14. Hui, Kapatoes, Fowler, et al. Feasibility study of helical tomotherapy for total body or total marrow irradiation. Med Phys. 2005;32(10):3214–3224. doi:10.1118/1.2044428

15. Wong JYC, Liu, Schultheiss, et al. Targeted total marrow irradiation using three-dimensional image-guided tomographic intensity-modulated radiation therapy: An alternative to standard total body irradiation. Biol Blood Marrow Transplant. 2006;12(3):306–315. doi:10.1016/j.bbmt.2005.10.026

16. Wong JYC, Filippi, Scorsetti, Hui, Muren, Mancosu. Total marrow and total lymphoid irradiation in bone marrow transplantation for acute leukaemia. Lancet Oncol. 2020;21(10):e477–e487. doi:10.1016/S1470-2045(20)30342-9

17. Shi, Wang, Kan, et al. Automatic segmentation of target structures for total marrow and lymphoid irradiation in bone marrow transplantation. In: 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). IEEE; 2022:5025–5029. doi:10.1109/EMBC48229.2022.9871824

18. Watkins, Qing, Han, Hui, Liu. Auto-segmentation for total marrow irradiation. Front Oncol. 2022;12:970425. doi:10.3389/fonc.2022.970425

19. Xue, Shi, Zeng, et al. Deep learning promoted target volumes delineation of total marrow and total lymphoid irradiation for accelerated radiotherapy: A multi-institutional study. Phys Med. 2024;123:103393. doi:10.1016/j.ejmp.2024.103393

20. La Barbera, Rouet, Boussaid, et al. Tubular structures segmentation of pediatric abdominal-visceral ceCT images with renal tumors: Assessment, comparison and improvement. Med Image Anal. 2023;90:102986. doi:10.1016/j.media.2023.102986

21. Wilson, Bryce-Atkinson, Green, et al. Image-based data mining applies to data collected from children. Phys Med. 2022;99:31–43. doi:10.1016/j.ejmp.2022.05.003

22. Von Elm, Altman, Egger, Pocock, Gøtzsche, Vandenbroucke. The strengthening the reporting of observational studies in epidemiology (STROBE) statement: Guidelines for reporting observational studies. J Clin Epidemiol. 2008;61(4):344–349. doi:10.1016/j.jclinepi.2007.11.008

23. Zhang, Chen, Liang, Zhou. Acculearning: A user-friendly deep learning auto-segmentation platform for radiotherapy. Int J Radiat Oncol Biol Phys. 2021;111(3):122. doi:10.1016/J.IJROBP.2021.07.542

24. Isensee, Jaeger, Kohl SAA, Petersen, Maier-Hein. nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation. Nat Methods. 2021;18(2):203–211. doi:10.1038/s41592-020-01008-z

25. Zunair, Ben Hamza. Sharp U-net: Depthwise convolutional network for biomedical image segmentation. Comput Biol Med. 2021;136:104699. doi:10.1016/j.compbiomed.2021.104699

26. Zhou, Rahman Siddiquee, Tajbakhsh, Liang. UNet++: a nested U-net architecture for medical image segmentation. In: Stoyanov, Taylor, Carneiro, et al., eds. Deep learning in medical image analysis and multimodal learning for clinical decision support. Vol 11045. Lecture Notes in Computer Science. Springer International Publishing; 2018:3–11. doi:10.1007/978-3-030-00889-5_1

27. Mohammed, Hassanien, Afify. A 3D image segmentation for lung cancer using V.Net architecture based deep convolutional networks. J Med Eng Technol. 2021;45(5):337–343. doi:10.1080/03091902.2021.1905895

28. Wang, Jin, et al. Application value of a deep learning method based on a 3D V-Net convolutional neural network in the recognition and segmentation of the auditory ossicles. Front Neuroinform. 2022;16:937891. doi:10.3389/fninf.2022.937891

29. Hague, McPartlin, Lee, et al. An evaluation of MR based deep learning auto-contouring for planning head and neck radiotherapy. Radiother Oncol. 2021;158:112–117. doi:10.1016/j.radonc.2021.02.018

30. Cardenas, Beadle, Garden, et al. Generating high-quality lymph node clinical target volumes for head and neck cancer radiation therapy using a fully automated deep learning-based approach. Int J Radiat Oncol Biol Phys. 2021;109(3):801–812. doi:10.1016/j.ijrobp.2020.10.005

31. Rhee, Jhingran, Rigaud, et al. Automatic contouring system for cervical cancer using convolutional neural networks. Med Phys. 2020;47(11):5648–5658. doi:10.1002/mp.14467

32. Chin, Finnegan, Chlap, et al. Validation of a fully automated hybrid deep learning cardiac substructure segmentation tool for contouring and dose evaluation in lung cancer radiotherapy. Clin Oncol. 2023;35(6):370–381. doi:10.1016/j.clon.2023.03.005

33. Chen, Wang, Zhan, et al. Author correction: A comparative study of auto-contouring softwares in delineation of organs at risk in lung cancer and rectal cancer. Sci Rep. 2024;14(1):3569. doi:10.1038/s41598-024-54103-y

34. Zabel, Conway, Gladwish, et al. Clinical evaluation of deep learning and atlas-based auto-contouring of bladder and rectum for prostate radiation therapy. Pract Radiat Oncol. 2021;11(1):e80–e89. doi:10.1016/j.prro.2020.05.013

35. Hoeben BAW, Saldi, Aristei, et al. Rationale, implementation considerations, delineation and planning target objective recommendations for volumetric modulated arc therapy and helical tomotherapy total body irradiation, total marrow irradiation, total marrow and lymphoid irradiation and total lymphoid irradiation. Radiother Oncol. 2025;206:110822. doi:10.1016/j.radonc.2025.110822

36. Artaechevarria, Munoz-Barrutia, Ortiz-de-Solorzano. Combination strategies in multi-atlas image segmentation: Application to brain MR data. IEEE Trans Med Imaging. 2009;28(8):1266–1277. doi:10.1109/TMI.2009.2014372

37. Zijdenbos, Dawant, Margolin, Palmer. Morphometric analysis of white matter lesions in MR images: Method and validation. IEEE Trans Med Imaging. 1994;13(4):716–724. doi:10.1109/42.363096

38. Choi, Kang, Chie, Shin, Chang, Jang. Assessment of lymph node area coverage with total marrow irradiation and implementation of total marrow and lymphoid irradiation using automated deep learning-based segmentation. PLoS ONE. 2024;19(3):e0299448. doi:10.1371/journal.pone.0299448

39. Zhu, Huang, Zeng, et al. Anatomynet: Deep learning for fast and fully automated whole-volume segmentation of head and neck anatomy. Med Phys. 2019;46(2):576–589. doi:10.1002/mp.13300

40. Ibragimov, Xing. Segmentation of organs-at-risks in head and neck CT images using convolutional neural networks. Med Phys. 2017;44(2):547–557. doi:10.1002/mp.12045

41. Chen, Sun, Bai, et al. A deep learning-based auto-segmentation system for organs-at-risk on whole-body computed tomography images for radiation therapy. Radiother Oncol. 2021;160:175–184. doi:10.1016/j.radonc.2021.04.019