Sage Journals: Discover world-class research

Abstract

Introduction

This study aims to evaluate auto-segmentation results using deep learning-based auto-segmentation models on different online CT imaging modalities in image-guided radiotherapy.

Methods

Phantom studies were first performed to benchmark image quality. Daily CT images for sixty patients were retrospectively retrieved from fan-beam kilovoltage CT (kVCT), kV cone-beam CT (kV-CBCT), and megavoltage CT (MVCT) scans. For each imaging modality, half of the patients received CT scans in the pelvic region, while the other half in the thoracic region. Deep learning auto-segmentation models using a convolutional neural network algorithm were used to generate organs-at-risk contours. Quantitative metrics were calculated to compare auto-segmentation results with manual contours.

Results

The auto-segmentation contours on kVCT images showed statistically significant differences in Dice similarity coefficient (DSC), Jaccard similarity coefficient, sensitivity index, inclusiveness index, and the 95th percentile Hausdorff distance compared to those on kV-CBCT and MVCT images for most major organs. In the pelvic region, the largest difference in DSC was observed for the bowel volume with an average DSC of 0.84 ± 0.05, 0.35 ± 0.23, and 0.48 ± 0.27 for kVCT, kV-CBCT, and MVCT images, respectively (p-value < 0.05); in the thoracic region, the largest difference in DSC was found for the esophagus with an average DSC of 0.63 ± 0.16, 0.18 ± 0.13, and 0.22 ± 0.08 for kVCT, kV-CBCT, and MVCT images, respectively (p-value < 0.05).

Conclusion

Deep learning-based auto-segmentation models showed better agreement with manual contouring when using kVCT images compared to kV-CBCT or MVCT images. However, manual correction remains necessary after auto-segmentation with all imaging modalities, particularly for organs with limited contrast from surrounding tissues. These findings underscore the potential and limits in applying deep learning-based auto-segmentation models for adaptive radiotherapy.

Keywords

artificial intelligence, auto-segmentation, kVCT, kV-CBCT, MVCT, deep learning

Introduction

Radiation therapy plays an essential role in managing many cancers and other diseases. Over 50% of patients with the most common cancer types will receive radiation therapy, for example from linear accelerators (linacs), as part of their cancer management.¹ With modern external-beam radiotherapy treatment planning and delivery techniques, image-guided radiotherapy (IGRT) has become routine before intensity-modulated radiotherapy (IMRT) to position the target area more precisely and accurately. As a result, IGRT/IMRT has improved the therapeutic ratio by delivering highly conformal radiation to target volumes while sparing adjacent normal organs at risk (OARs). On the other hand, IGRT imaging has yet to be used to its full potential. Despite modern immobilization and setup techniques, daily anatomical variations, such as bladder/rectum filling and small bowel loop movement, occur frequently and lead to variations in the dose delivered to target volumes and normal organs, especially in the pelvic and thoracic regions, where significant drifts due to motion and changes in anatomical structures could result in suboptimal treatment. To address this clinical challenge, adaptive radiotherapy (ART) techniques have been proposed to adapt radiotherapy treatment to anatomical changes.² For ART, the quality of IGRT images needs further study to determine whether the images are suitable for adaptive planning rather than for positioning alone.

ART has the potential to further reduce treatment-related organ toxicities and to allow dose escalation to target volumes. However, as multiple adaptive treatment plans are generated during the radiotherapy course, sometimes even for each treatment fraction, there is an urgent need for efficient target and normal organ delineation and rapid plan generation to alleviate the extra time and resources required in the ART workflow. One critical step in the ART workflow is the timely delineation of the target and nearby OARs. The current workflow requires clinicians to contour those structures manually, which is time-consuming and prone to human error. Different techniques have been proposed to efficiently generate structures in ART planning, including contour propagation through deformable image registration and auto-segmentation using artificial intelligence (AI) algorithms or atlas-based models. These methods rely on good image quality to generate structures efficiently and accurately. Compared to other automated contour delineation techniques, AI auto-segmentation algorithms are independent of image registration results and could provide even higher accuracy. As a subset of AI auto-segmentation techniques, deep learning models are trained on historical segmentation data and are widely used clinically for auto-segmentation. However, most deep learning auto-segmentation models are trained with high-quality CT images. In contrast, images used for ART treatment planning may be daily pre-treatment images from the treatment machine with inferior image quality compared to diagnostic or simulation CT images. The benefits of using deep learning auto-segmentation tools on daily images from treatment machines should therefore be evaluated before ART treatment planning is implemented clinically.

In 2020, a new linac model with integrated multi-modality imaging capabilities (X1, RefleXion Medical, Inc., Hayward, California) became commercially available in the US. It integrates a compact 6-MV photon beam linac with positron emission tomography (PET) and fan-beam kilovoltage CT (kVCT) imaging subsystems on the same ring gantry. The CT imaging system includes a 16-row kVCT scanner for daily pre-treatment image-guided target alignment. Research has shown that the kVCT images from the X1 machine had image characteristics comparable to those of CT simulation images and could be used for ART purposes.^3,4 We hypothesized that daily kVCT images from the X1 machine were more suitable for efficient structure delineation with clinical deep learning auto-segmentation models than images from other online imaging modalities. In this study, we evaluated the performance of deep learning auto-segmentation models on kVCT images from the X1 machine and compared it to their performance on existing online CT imaging modalities from other radiotherapy delivery systems.

Methods

Ethics Approval Statement

This research project was carried out according to our institution's guidelines and was approved in January 2023. This is an IRB-approved retrospective study, and all patient information was de-identified, so patient consent was not required. Patient data will not be shared with third parties.

Demographics

The reporting of this study conforms to STROBE guidelines.⁵ We retrospectively retrieved daily IGRT images for a total of sixty patients who were previously treated at our institution. Patients were selected consecutively in a reverse chronological order starting with the most recent patients treated on the radiotherapy machines. All patients were scanned in a head-first supine position. Twenty patients were treated on helical tomotherapy (HT) (Hi-ART II, Accuray Inc., Sunnyvale, CA) with daily megavoltage CT (MVCT) scans, another twenty patients were treated on conventional C-arm linacs (TrueBeam linac, Varian Medical Systems, Inc., Palo Alto, CA) with daily kilovoltage cone-beam CT (kV-CBCT) scans, and the remaining twenty patients were treated on the X1 machine with daily fan-beam kVCT scans. For each imaging modality, ten patients had daily CT scans in the pelvic region, while the other ten had daily CT scans in the thoracic region. Details about patient demographics, dose prescription, and imaging protocols are listed in Table 1. The numbers of patients included in this study are based on the approximate numbers of patients treated on the X1 machine for each anatomical site at the time of study initiation.

Table 1.

Characteristics of Patients and Protocols of the Three Daily CT Imaging Modalities.

	Age (yrs)	Gender (M/F)	Fractions	Plan Dose (Gy)	Beam Energy	Current (mA)	Pitch
PELVIS
kVCT	71.4 ± 7.21	9/1	29 ± 9	60.1 ± 14.0	120 kV	400	1.333
kV-CBCT	71.8 ± 7.42	10/0	28 ± 14	67.0 ± 17.9	125 kV	60, 80, 100	-
MVCT	74.1 ± 9.19	10/0	30 ± 9	62.6 ± 12.7	1.5 MV	-	12 mm
THORAX
kVCT	71.3 ± 13.0	5/5	26 ± 8	54.0 ± 10.7	120 kV	400	1.333
kV-CBCT	58.0 ± 13.7	4/6	8 ± 5	37.1 ± 8.86	125 kV	60, 80, 100	-
MVCT	50.2 ± 15.9	10/0	9 ± 1	16.8 ± 3.68	1.5 MV	-	12 mm

Protocols of Daily CT Imaging

The X-ray tube peak potential was 120 kV and 125 kV in the X1 kVCT and TrueBeam kV-CBCT scans, respectively, while the beam energy was 1.5 MV in HT MVCT scans. The X-ray tube current was 400 mA in X1 kVCT scans and 60–100 mA in kV-CBCT scans. The pitch values were 1.333 in X1 kVCT scans and 12 mm per rotation in HT MVCT scans.

Image Segmentation

Convolutional neural network (CNN)-based deep learning methods have been shown to provide excellent auto-segmentation results with improved accuracy compared to traditional segmentation approaches for structures including, but not limited to, head and neck (H&N) OARs, thoracic OARs, and pelvic OARs.^6-8 In this work, we used MedMinds AI (MedicalMind Co., Ltd, Beijing, China),⁹ an auto-segmentation software system deployed at our institution. The system is based on a CNN algorithm and was trained with diagnostic and treatment planning CT images and contouring data (> 1000 training datasets) to generate separate auto-segmentation models for the pelvic and thoracic regions. During model training, the entire network used the Adaptive Moment Estimation (Adam) optimizer with an initial learning rate of 0.0001, which decayed exponentially with gamma = 0.9 at every epoch.¹⁰ In pelvic cases, the normal organs included the bladder, rectum, bowel, and left and right femoral heads; in thoracic cases, they included the left and right lungs, esophagus, heart, and spinal cord. The U-Net was used as the basic architecture for the deep learning model.⁷ It contained an encoder and a decoder, each of which included five context aggregation blocks in place of the convolutional layers, with the feature maps in the encoder concatenated to those in the decoder. Previous results have demonstrated that this optimized U-Net algorithm outperformed the benchmark U-Net methods, providing high-quality, clinically acceptable organ segmentation that can be used in radiation therapy planning.^6,7 Furthermore, the trained auto-segmentation models already incorporate data normalization and several artifact corrections (eg, gridding artifacts), as described by Liu et al.¹¹ Therefore, no additional pre-processing steps were performed in this work before feeding the daily images into the models.
For comparison, organs were also manually delineated by an experienced planner with over ten years of structure contouring experience as the ground truth. The manual contours were then visually inspected by one of the authors to ensure clinically acceptable quality. Figure 1 shows the study workflow chart.
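The learning-rate schedule described for model training (initial rate 0.0001, multiplied by gamma = 0.9 at every epoch) can be illustrated with a minimal sketch. The function name `exponential_lr` and the choice of epochs to print are illustrative assumptions; the vendor's actual training code is not available in this work.

```python
def exponential_lr(initial_lr: float, gamma: float, epoch: int) -> float:
    """Learning rate after `epoch` completed epochs of exponential decay."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return initial_lr * gamma ** epoch


INITIAL_LR = 1e-4  # initial learning rate stated in the text
GAMMA = 0.9        # per-epoch decay factor stated in the text

# The epochs below are arbitrary examples, not values from the study.
for epoch in (0, 1, 10, 50):
    lr = exponential_lr(INITIAL_LR, GAMMA, epoch)
    print(f"epoch {epoch:2d}: lr = {lr:.3e}")
```

In a framework such as PyTorch, the same schedule is what `torch.optim.lr_scheduler.ExponentialLR` with `gamma=0.9` produces when stepped once per epoch; whether the vendor used that framework is not stated.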

Figure 1.

Workflow Chart for the Study. DSC: Dice Similarity Coefficient; JSC: Jaccard Similarity Coefficient; SI: Sensitivity Index; II: Inclusiveness Index; HD95: 95th Percentile Hausdorff Distance.

Phantom Study

The CATPHAN 504 phantom (model: CTP 504, The Phantom Laboratory, Salem, NY) was used in this work to evaluate and compare image quality among kVCT, kV-CBCT, and MVCT images using daily scanning protocols. The CTP 404 module inside the CATPHAN 504 phantom, consisting of seven cylindrical inserts with different densities, was used for CT number validation and contrast-to-noise ratio (CNR) calculation. The CTP 486 slice, which is a uniform density module, was utilized for noise and uniformity analysis. The CTP 528 slice with 1 through 21 line-pairs per centimeter (lp/cm) and the CTP 515 slice with 2 through 15 mm supra-slice and sub-slice targets were used for high- and low-contrast spatial resolution evaluation, respectively.

Image Analysis

For phantom studies, the following quantitative metrics were analyzed and compared based on methods described by Kamath et al:¹² (a) The difference in CT numbers: Seven regions of interest (ROIs) were drawn on the tested materials in the CTP 404 images, with the corresponding nominal CT numbers (unit: HU): Air (−1000), PMP (−200), LDPE (−100), polystyrene (−35), Acrylic (120), Delrin (340), and Teflon (990). The absolute difference values were then calculated between the image from each imaging modality and the planning CT image. (b) Noise: The standard deviation (SD) of pixel intensities over an ROI (eg, 0.5 cm diameter circle) of the CTP 486 images was calculated as the image noise. (c) CNR: The polystyrene tube was used in this work for the CNR measurement based on the following equation:

\mathrm{CNR} = \frac{\mathrm{Mean}(\mathrm{ROI}) - \mathrm{Mean}(\mathrm{Background})}{\mathrm{SD}(\mathrm{Background})}

(1)

where the background was chosen in any region of CTP 404 images with uniform density. (d) Uniformity: the image uniformity was measured using the equation below:

\mathrm{Uniformity} = \mathrm{Mean}(\mathrm{ROI}_{P}) - \mathrm{Mean}(\mathrm{ROI}_{C})

(2)

where \mathrm{ROI}_{P} and \mathrm{ROI}_{C} denote the ROIs at the periphery and center of the CTP 404 images, respectively, over which the mean CT values are taken. (e) High contrast resolution: The highest number of line pairs per centimeter that could be visually resolved in the CTP 528 images was taken as the high contrast resolution, which reflects the ability to distinguish two closely spaced objects. (f) Low contrast resolution: The number of visible supra-slice targets in the CTP 515 images was taken as the low contrast resolution, which reflects the ability to distinguish materials of similar contrast.
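As an illustration of the noise, CNR (Equation 1), and uniformity (Equation 2) definitions, the following is a minimal sketch that operates on flat lists of ROI pixel values. The function names and the sample pixel values are hypothetical, and the use of the sample standard deviation (N − 1 denominator) is an assumption, since the text does not specify which SD estimator was used.

```python
from statistics import mean, stdev


def noise(roi):
    """Image noise: SD of the pixel values in a uniform ROI (sample SD assumed)."""
    return stdev(roi)


def cnr(roi, background):
    """Contrast-to-noise ratio, following Equation (1)."""
    return (mean(roi) - mean(background)) / stdev(background)


def uniformity(peripheral_roi, central_roi):
    """Uniformity, following Equation (2): peripheral mean minus central mean."""
    return mean(peripheral_roi) - mean(central_roi)


# Hypothetical pixel values in HU, chosen only to exercise the functions.
insert_pixels = [-30, -34, -36, -40]
background_pixels = [95, 105, 100, 100]
print("noise:", noise(background_pixels))
print("CNR:", cnr(insert_pixels, background_pixels))
print("uniformity:", uniformity([101, 99, 102], [100, 100, 100]))
```

Equation (1) is signed, so an insert that is darker than the background gives a negative value; reporting the magnitude is a common convention but is not stated in the text.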

For patient studies, a set of widely used geometric metrics,¹³ namely the Dice similarity coefficient (DSC), Jaccard similarity coefficient (JSC), sensitivity index (SI), inclusiveness index (II), and 95th percentile Hausdorff distance (HD95), was used to evaluate the quality of the auto-segmentation results compared to manual delineations. The DSC is defined as shown in the following equation:

\mathrm{DSC}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|}

(3)

where A represents the auto-segmented contours generated by the deep learning models, B represents the manual contours delineated by the planner, and A \cap B denotes the intersection of A and B. DSC = 0 means no intersection between A and B, while DSC = 1 means perfect overlap.

The JSC is defined as the size of the intersection divided by the size of the union of the sample sets, as shown below:

\mathrm{JSC}(A, B) = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}

(4)

where JSC = 0 means no intersection between A and B, and JSC = 1 means perfect overlap.

The sensitivity index is calculated as below:

\mathrm{SI}(A, B) = \frac{|A \cap B|}{|B|}

(5)

where SI = 0 means no intersection between A and B, and SI = 1 means that A fully covers B. Similarly, the inclusiveness index (II) is defined as below:

\mathrm{II}(A, B) = \frac{|A \cap B|}{|A|}

(6)

where II = 0 means no intersection between A and B, and II = 1 means that B fully covers A.
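The four overlap metrics in Equations (3) to (6) can be sketched by representing each contour as a set of voxel indices. This representation, the function name `overlap_metrics`, and the two square test regions are illustrative assumptions; the study itself computed the metrics in MATLAB.

```python
def overlap_metrics(auto: set, manual: set) -> dict:
    """DSC, JSC, SI, and II for an auto contour A and a manual contour B,
    each given as a non-empty set of voxel indices."""
    if not auto or not manual:
        raise ValueError("both contours must contain at least one voxel")
    inter = len(auto & manual)
    return {
        "DSC": 2 * inter / (len(auto) + len(manual)),
        "JSC": inter / (len(auto) + len(manual) - inter),
        "SI": inter / len(manual),  # fraction of the manual contour covered
        "II": inter / len(auto),    # fraction of the auto contour inside the manual one
    }


# Hypothetical example: two 4 x 4 pixel squares offset by one pixel in each axis.
auto = {(i, j) for i in range(0, 4) for j in range(0, 4)}
manual = {(i, j) for i in range(1, 5) for j in range(1, 5)}
metrics = overlap_metrics(auto, manual)
for name, value in metrics.items():
    print(f"{name} = {value:.4f}")
```

Because both DSC and JSC depend only on the intersection and the two set sizes, they are related by DSC = 2·JSC / (1 + JSC), so the two metrics always rank a pair of segmentations in the same order.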

The 95th percentile Hausdorff distance is defined by the following equation:

\mathrm{HD95}(A, B) = \max\left(h(A, B),\, h(B, A),\, 95\mathrm{th}\right)

(7)

where:

h(A, B) = \max_{a \in A} \min_{b \in B} \| a - b \|

(8)

and

h(B, A) = \max_{b \in B} \min_{a \in A} \| b - a \|

(9)

where \| \cdot \| denotes the Euclidean norm of the difference between points of A and B, and HD95 increases as the overlap between A and B decreases.
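One common way to evaluate a 95th percentile Hausdorff distance is sketched below. It replaces the outer maximum in the directed distances of Equations (8) and (9) with the 95th percentile of the nearest-neighbor distances and then takes the larger of the two directions. This convention, the nearest-rank percentile definition, and the function names are assumptions; the text does not specify how the MATLAB implementation handled these details.

```python
import math


def directed_distances(src, dst):
    """For every point in src, the Euclidean distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def percentile_nearest_rank(values, pct: int):
    """Nearest-rank percentile (an assumed convention; others interpolate)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def hd95(a, b):
    """Larger of the two directed 95th percentile nearest-neighbor distances."""
    return max(
        percentile_nearest_rank(directed_distances(a, b), 95),
        percentile_nearest_rank(directed_distances(b, a), 95),
    )


# Hypothetical contours: identical lines of 20 points, plus one outlier in b.
a = [(float(i), 0.0) for i in range(20)]
b = a + [(0.0, 10.0)]
full_hausdorff = max(max(directed_distances(a, b)), max(directed_distances(b, a)))
print("Hausdorff:", full_hausdorff)
print("HD95:", hd95(a, b))
```

In this example the single outlier dominates the full Hausdorff distance but falls above the 95th percentile, which illustrates why the percentile form is less sensitive to isolated contouring errors.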

All the above metrics were calculated with code written in MATLAB (Mathworks, Natick, MA).

Statistical Analysis

Given the limited sample size and the use of evaluation metrics with a non-normal distribution in this study, data comparison across the three imaging modalities was conducted using the non-parametric Wilcoxon rank-sum test, where the difference was considered statistically significant if the p-value was less than 0.05.
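To make the comparison procedure concrete, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test using the large-sample normal approximation, with average ranks for ties and no continuity or tie correction of the variance. These simplifications, the function names, and the sample DSC values are assumptions for illustration; statistical packages may instead use the exact null distribution for small samples, so p-values can differ from those reported in this study.

```python
import math


def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (W, z, two-sided p) under the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2
    var_w = n1 * n2 * (n1 + n2 + 1) / 12     # no tie correction (assumption)
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))     # equals 2 * (1 - Phi(|z|))
    return w, z, p


# Hypothetical per-patient DSC values for two modalities (not study data).
dsc_modality_1 = [0.84, 0.80, 0.88, 0.79, 0.90, 0.85, 0.83, 0.86, 0.81, 0.87]
dsc_modality_2 = [0.35, 0.10, 0.60, 0.42, 0.25, 0.55, 0.30, 0.15, 0.48, 0.70]
w, z, p = rank_sum_test(dsc_modality_1, dsc_modality_2)
print(f"W = {w}, z = {z:.2f}, p = {p:.2g}, significant = {p < 0.05}")
```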

Results

Table 2 shows the quantitative image quality analysis results on phantom images from kVCT, kV-CBCT, and MVCT. Overall, the image quality, including factors such as CT number difference and image noise, was best in kVCT, intermediate in kV-CBCT, and worst in MVCT. Specifically, the CT number difference ranged from 0.5 to 21 HU in kVCT images, from −23 to 57 HU in kV-CBCT images, and from −138 to 102 HU in MVCT images. The measured image noise was 5.6, 6.3, and 23.1 HU in the kVCT, kV-CBCT, and MVCT images, respectively. The CNR was 22.1, 18.9, and 4.8 in the kVCT, kV-CBCT, and MVCT images, respectively. Measured image uniformity was −1.6, −4.8, and 21.1 HU in the kVCT, kV-CBCT, and MVCT images, respectively. In high contrast resolution tests, the highest discernible resolution was 6, 5, and 3 lp/cm with the kVCT, kV-CBCT, and MVCT images, respectively. In low contrast resolution tests, the number of discernible low-contrast targets was 3, 2, and 1 in the kVCT, kV-CBCT, and MVCT images, respectively.

Table 2.

Quantitative Image Quality Analysis on Phantom Images from kVCT, kV-CBCT, and MVCT.

	CT Number Difference* (unit: HU)							Noise (HU)	CNR	Uniformity (HU)	High Contrast Resolution	Low Contrast Resolution
Modality	Air	PMP	LDPE	Polystyrene	Acrylic	Delrin	Teflon					
kVCT	21	3	4	1	0.5	1	1	5.6	22.1	−1.6	6 lp/cm	3 discs
kV-CBCT	−23	−5	−1	−2	5.5	16	57	6.3	18.9	−4.8	5 lp/cm	2 discs
MVCT	102	60	64	48	−19.5	−5	−138	23.1	4.8	21.1	3 lp/cm	1 disc

* CT number difference between each daily CT image and the planning CT images.

Figure 2 shows representative axial images with manual-segmentation and auto-segmentation results for kVCT (top), kV-CBCT (middle), and MVCT (bottom) in both the pelvic (left) and thoracic (right) regions. The window levels were adjusted as [−350, 200] and [−300, 350] for all images in the pelvis and thorax, respectively.

Figure 2.

Representative Axial Images with Manual-Segmentation and Auto-Segmentation Results for the Three Daily CT Imaging Modalities in Both the Pelvic and Thorax Regions. Note that the Manual-Segmentation Results Were Delineated by an Experienced Planner for Each Organ and Imaging Modality, While the Auto-Segmentations Were from the Deep Learning Output.

Figures 3 and 4 show boxplots of DSC, JSC, SI, II, and HD95 for the pelvic and thoracic cases, respectively. The symbols ^, *, and # denote statistically significant differences (p-value < 0.05) for each organ between kVCT and kV-CBCT, kVCT and MVCT, and kV-CBCT and MVCT, respectively. The auto-segmentation contours on the kVCT images showed the highest average DSC compared to those on the kV-CBCT and MVCT images for all the major organs in both the pelvic and thoracic regions. With the kVCT images, the average DSC ranged from 0.58 ± 0.16 to 0.99 ± 0.01. In the pelvic region, the largest absolute difference in DSC was observed for the bowel volume with an average DSC of 0.84 ± 0.05, 0.35 ± 0.23, and 0.48 ± 0.27 for the kVCT, kV-CBCT, and MVCT images, respectively (p-value < 0.05). In the thoracic region, the largest absolute difference in DSC was observed for the esophagus, with an average DSC of 0.63 ± 0.16, 0.18 ± 0.13, and 0.22 ± 0.08 for the kVCT, kV-CBCT, and MVCT images, respectively (p-value < 0.05). Similarly, the auto-segmentation contours on the kVCT images showed the highest average values of SI, II, and JSC compared to those on the kV-CBCT and MVCT images for all major organs except the rectum, for which the kV-CBCT images yielded the best values of these metrics. The auto-segmentation contours on the kVCT images showed the lowest average values of HD95 compared to those on the kV-CBCT and MVCT images for all major organs except the rectum, for which the MVCT images yielded the lowest value. For the bowel, however, the auto-segmentation results from all modalities exhibited large HD95 values relative to the manual contours. In addition, the auto-segmentation models failed to contour certain organs (eg, rectum, heart, and esophagus) on kV-CBCT images in some patients, where the image quality was inferior.

Figure 3.

Distributions of DSC, JSC, SI, II, and HD95 for Pelvic Organs in the kVCT (Left), kV-CBCT (Middle), and MVCT (Right) Images. The First Quartile, Median, and Third Quartile are Shown as the Lower, Middle, and Upper Bars of the Boxes. The Minimum and Maximum Values are Shown as the Extent of the Vertical Lines. The ^, *, and # Represent the Statistically Significant Differences (p-Value < 0.05) Between kVCT and kV-CBCT, kVCT and MVCT, and kV-CBCT and MVCT, Respectively.

Figure 4.

Distributions of DSC, JSC, SI, II, and HD95 for Thoracic Organs in the kVCT (Left), kV-CBCT (Middle), and MVCT (Right) Images. The First Quartile, Median, and Third Quartile are Shown as the Lower, Middle, and Upper Bars of the Boxes. The Minimum and Maximum Values are Shown as the Extent of the Vertical Lines. The ^, *, and # Represent the Statistically Significant Differences (p-Value < 0.05) Between kVCT and kV-CBCT, kVCT and MVCT, and kV-CBCT and MVCT, Respectively.

Discussion

In this work, we compared image quality among images from the RefleXion X1 (kVCT), TrueBeam (kV-CBCT), and HT (MVCT) systems and evaluated the performance of CNN-based auto-segmentation models on the daily CT images. Both CATPHAN and in vivo results show that kVCT yielded better image quality than kV-CBCT and MVCT in terms of image contrast, artifacts, and noise, primarily for two reasons: 1) Compared to kVCT, which used narrow fan-beam x-rays, kV-CBCT used broad cone-shaped beams, which reduced the spatial resolution, especially the through-plane resolution. Furthermore, scattering without the anti-scatter grid, under-sampling (especially at the periphery of the field of view), and beam hardening effects were also increased, resulting in more streaking artifacts and reduced image contrast; 2) Compared to kVCT, which used kilovoltage x-rays (120 kVp in this work), MVCT used much higher-energy MV photon beams, which exhibited almost no photoelectric effect and significantly less beam attenuation, resulting in worse image contrast, low SNR, and image blurring. These differences were also reflected in auto-segmentation performance, with kVCT showing the highest values of DSC, SI, II, and JSC and the lowest values of HD95 for most of the main organs in the pelvic and thoracic regions among the three modalities. However, significant auto-segmentation errors occurred for organs with limited image contrast from surrounding tissues, as can be seen in Figures 2–4 for the bowel, bladder, rectum, and esophagus, so manual correction of the auto-segmentation results was required. It is worth noting that although the kV-CBCT images did not show obvious artifacts in phantom studies, streaking artifacts significantly degraded the auto-segmentation accuracy in patient studies, especially in the thorax.

Thanks to technological advances in AI algorithms, auto-segmentation systems based on CNN models are now widely used in the clinical radiotherapy workflow. At our institution, implementation of the auto-segmentation system has led to an average reduction of 70%–80% in structure delineation time on planning CT images compared to manual contouring, as reported by the dosimetrist team. It has been used for almost all clinical radiotherapy cases to delineate a comprehensive set of major organs throughout the body on planning CT images.⁸ As modern radiotherapy delivery machines typically provide integrated three-dimensional (3D) imaging modalities, 3D imaging is commonly used for daily image-guided patient positioning. The availability of 3D images throughout the treatment course also allows monitoring of anatomical and physiological changes. Studies have evaluated the feasibility of using daily CT images as imaging biomarkers^4,14 and for evaluating dosimetric variations due to anatomical and positioning changes.^15,16 In addition, studies have been conducted to directly use daily 3D images for ART.^17,18 For all these applications of daily CT images, efficient segmentation methods are needed so that they can be used routinely in clinical settings. This study highlights the differences in AI-generated auto-segmentation results among different online CT imaging modalities. Of note, the performance of U-Net deep learning models is affected by differences in image contrast, artifacts, and noise. For example, adding noise to input images can significantly lower the accuracy of object segmentation by U-Net models, because noise makes it harder for the model to accurately identify the boundaries of target objects.¹⁹ Similarly, a low CNR can significantly impair the U-Net model's ability to detect boundaries and features.²⁰

Note that in this work, the vendor-provided auto-segmentation model was trained on diagnostic and planning kVCT images only, and we did not implement models trained specifically for kV-CBCT and MVCT images. In fact, most commercial auto-segmentation systems are trained with high-quality diagnostic CT image data, as feeding low-quality image data into model training could significantly lower the U-Net model's ability to correctly segment objects.¹⁹ Research is ongoing to improve the performance of U-Net models in the presence of noise, which could potentially improve the performance of auto-segmentation systems on kV-CBCT and MVCT images. Some pre-processing techniques have been evaluated to denoise images before input to the U-Net model.²¹ Specialized U-Net architectures have also been proposed for improved noise robustness.²² Data augmentation, which introduces variability into the training datasets, has been shown to benefit CT-based auto-segmentation by improving model generalization, reducing overfitting, and enhancing robustness, among other effects.^23-25 Therefore, data augmentation is a promising means of building modality-specific auto-contouring models. On the other hand, we also observed that although the same daily imaging protocol was used for patients treated on a specific machine, variation in image quality, such as image artifacts (eg, metal artifacts) and SNR, degraded the auto-segmentation accuracy, posing challenges to the stability and reproducibility of the deep learning model.

This study has several limitations. First, we evaluated only the one deep learning-based auto-segmentation system deployed at our institution; we recommend evaluating multiple AI algorithms^26-32 to assess the accuracy of auto-segmentation against manual segmentation. For example, a U-Net-generative adversarial network (U-Net-GAN) has shown improved auto-segmentation accuracy on thoracic planning CT images over conventional U-Net methods.²⁶ Chen et al used a transfer learning method to fine-tune the neural network weights, outperforming a common population network and a clinical registration-based method, with higher accuracy in H&N and pelvic cancers.³² Second, the sample size of this retrospective study is relatively small, consisting of only sixty patients. Therefore, the auto-segmentation results may be biased by the specific cases chosen: the reported performance may be overestimated and not fully representative of the wider patient population, and the sensitivity to outliers may be low. A larger patient study is needed in future work. Third, the lack of modality-specific models for MVCT and kV-CBCT could degrade auto-segmentation performance because of differences in image quality (eg, spatial resolution, contrast, and artifacts) between the model training datasets (the planning kVCT images) and the testing datasets (the daily patient-specific kV-CBCT or MVCT images). Fine-tuning or re-training the CNN with data augmentation or on MVCT³³ and kV-CBCT³⁴ training images could potentially improve auto-segmentation performance; however, this is beyond the scope of this paper, and the model generation and optimization implemented by the vendor have not yet been made publicly available to users. Last, it is impractical to acquire daily CT images with all three imaging modalities for the same patient; thus, a robust paired comparison is not possible. A CATPHAN study can be performed with kVCT, kV-CBCT, and MVCT, yet its simple structures cannot fully represent human organs, for which motion and CT streaking artifacts are non-negligible.

Conclusion

Results from deep learning-based auto-segmentation models showed improved agreement with gold-standard manual contouring when kVCT images from the X1 machine were used, compared to kV-CBCT or MVCT images acquired on other radiotherapy machines. More efficient structure delineation is expected when daily kVCT images are used with clinical deep learning-based auto-segmentation models in the ART workflow. However, manual correction is necessary for auto-segmentation results from all imaging modalities, especially for organs with limited contrast from surrounding tissues, such as bowel, bladder, rectum, and esophagus.

Footnotes

Author’s Note

Seyi M Oderinde, Advanced Molecular Imaging in Radiotherapy (AdMIRe) Research Laboratory, School of Health Sciences, Purdue University, West Lafayette, IN, USA and Department of Radiation Oncology, Indiana University School of Medicine, Indianapolis, IN, USA.

ORCID iDs

Bo Liu

An Liu

Chunhui Han

Author Contributions

The authors contributed to the paper as follows: study design: SO and CH; data collection: ZW, CW, and CH; data analysis: ZW, CS, and CH; manuscript preparation: ZW, CS, and CH; manuscript review: ZW, CS, CW, SO, WW, KQ, BL, TW, AL, and CH. All authors reviewed the results and approved the final version of the manuscript.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by RefleXion Medical, Inc., (grant number Exhibit 9).

Declaration of Conflicting Interests

The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: CH and AL received research funding support from RefleXion Medical, Inc.

Data Availability Statement

Data in this study are stored in the authors’ institutional storage space and are available upon request.


Comparison of Deep Learning-Based Auto-Segmentation Results on Daily Kilovoltage, Megavoltage, and Cone Beam CT Images in Image-Guided Radiotherapy
