Abstract

This study aimed to develop an automated classification framework for distinguishing cervical cancer tumors from normal uterine tissue, using CT images for radiomics feature extraction. We retrospectively analyzed CT images from 117 cervical cancer patients. To distinguish between cancerous and healthy tissue, we manually segmented the gross tumor volume and normal uterine tissue as distinct regions of interest (ROIs). Key radiomic parameters were extracted from these ROIs. To assess the model's generalization, the data were stratified into train data (70%) and test data (30%). During the feature selection phase, we applied the Least Absolute Shrinkage and Selection Operator (Lasso) regression algorithm to identify the most relevant features. Subsequently, we built classification models using five machine learning algorithms: Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), Extreme Gradient Boosting (XGBoost), and Decision Tree (DT). Finally, the performance of each model was evaluated. Through this feature selection process, we identified 18 radiomic features for the classification of cervical cancer and normal uterine tissue. When applied to the test data, all five models performed well, with area under the curve (AUC) values ranging from 0.8866 to 0.9190 (SVM: 0.9144, RF: 0.9078, KNN: 0.9051, DT: 0.8866, XGBoost: 0.9190), all surpassing the threshold of 0.8. On the test data, all five models had high sensitivity; the accuracy of the SVM, RF, and XGBoost models was comparable; and the specificity of the five models was similar. The XGBoost model achieved the highest test AUC, with an AUC of 0.8737 (95% CI: 0.8198-0.9277) for the train data and 0.9190 (95% CI: 0.8525-0.9854) for the test data. 
Our findings underscore the potential of CT radiomics combined with machine learning algorithms for accurately classifying cervical cancer tumors and normal uterine tissue with high recognition capabilities. This approach holds significant promise for clinical diagnostics.

Keywords

CT radiomics, cervical cancer, machine learning, automatic classification model, gross tumor volume (GTV)

Introduction

Cervical cancer, predominantly caused by the Human Papillomavirus (HPV), poses a significant global threat to women's health.^1,2,3,4 Ranking fourth among female malignancies after breast, colorectal, and lung cancers, it is a leading cause of mortality among women. According to the 2020 global statistics, 604 000 new cases were diagnosed, resulting in 342 000 deaths.⁵ In China, this trend is even more acute, with 109 000 new cases and 59 000 deaths, marking a continual escalation in both incidence and mortality rates over the past two decades.⁶

Radiotherapy stands as a pivotal treatment modality for cervical cancer, and its precision plays a critical role in minimizing radiation-induced side effects and optimizing therapeutic outcomes.^7,8,9 However, the complexity of cervical cancer target volumes, often encompassing critical organs, necessitates meticulous delineation of tumors and their surroundings before treatment.¹⁰ Current methods for GTV segmentation primarily rely on the expertise and subjective judgment of radiation oncologists, which can introduce limitations on the design of the optimal radiotherapy plan for patients.^11,12 Recent studies have further underscored the significance of early detection and precise diagnosis in mitigating mortality rates linked to cervical cancer.^13–14

The emergence of radiomics technology offers a novel solution to this pressing challenge.¹⁵ By meticulously extracting a comprehensive set of features from medical images, radiomics allows for the quantification of disease complexity, providing a quantitative tool for assessing tumor heterogeneity.¹⁶ Its application has already demonstrated remarkable potential in diagnosing and predicting the prognosis of various cancers, paving the way for a new era of precision oncology.¹⁷ Recent advancements pertaining to machine learning algorithms have further augmented the applicability of radiomics for diagnosing and predicting the prognosis of cervical cancer.^18–19

Addressing the challenges inherent in cervical cancer management, this study introduces an innovative radiomics approach. By analyzing the features of cervical cancer GTV and normal uterine regions on CT images and leveraging a diverse array of machine learning algorithms, we aimed to automate the classification of cervical cancer GTV from normal tissues. This research hinged on identifying critical radiomic features that can effectively substitute for raw pixel values in CT images. By leveraging these features, we aimed to mitigate human bias inherent in manual tumor delineation. This approach has the potential to improve the precision of cervical cancer treatments, ultimately enhancing treatment efficacy and patient quality of life.

Materials and methods

Data collection

The reporting of this study conforms to TRIPOD guidelines (https://www.equator-network.org/reporting-guidelines/tripod-statement/).²⁰ We conducted a retrospective study, collecting clinical data and CT imaging data from 117 patients diagnosed with cervical cancer who underwent biopsy surgery between January 2016 and September 2022. To ensure the generalization ability of the model, we randomly divided the patient dataset into 70% train data and 30% test data for model development and evaluation. The inclusion criteria were strictly set, including (1) patients who underwent dual-phase enhanced CT scans before surgery; (2) no history of radiotherapy and chemotherapy; (3) histologically-confirmed malignant cervical tumors as determined by postoperative pathology, with complete clinical records; (4) age range of 18–80 years old, without contraindications for radiotherapy, and an expected survival period exceeding three months.

The exclusion criteria included: (1) CT images with severe motion artifacts or obvious noise interference; (2) maximum tumor diameter >1 cm; (3) presence of concurrent malignancies; (4) pregnant or lactating women, or individuals who declined appropriate contraception.

CT Scanning parameters

This study used a Philips GEMINI TF 16 PET/CT system for image acquisition. Patients fasted for at least 6 h before the scan, and their blood sugar levels were maintained within the physiological range (4.1–7.1 mmol/L). They then received an intravenous injection of 18F-FDG at a dosage of 0.10 to 0.15 mCi/kg. After a 45–60-min rest period and bladder voiding, patients underwent a PET/CT scan covering the region from the auditory meatus to the upper thighs. The scan used a 3D acquisition mode with 1.5 min per bed position. The low-dose CT scan parameters were a tube voltage of 120 kV, a tube current of 100 mAs, and a slice thickness of 5 mm. After the scan, the images were transferred to the Eclipse 16.0 radiation therapy planning system, where they were fused with PET/MRI.

Delineation of Regions of Interest

The gross tumor volume (GTV) and the normal uterus were defined as the two regions of interest (ROIs). To facilitate image feature extraction, all images and ROIs were batch processed in DICOM format. All images were then manually segmented on the CT scans by two senior radiation therapists, each with 10 years of experience, with PET/MRI as a fusion reference. In case of disagreements, a third radiation therapist with 15 years of experience made the final decision. The primary tumor lesion GTV was delineated first on the images. Subsequently, the normal uterus tissue was derived by subtracting the GTV from the entire uterus, resulting in the final ROIs.

Image Preprocessing and Feature Extraction

The radiomics features relevant for IMRT planning were extracted from the GTV using the 3D Slicer platform. The PyRadiomics package, available at http://PyRadiomics.readthedocs.io/en/latest, was used to perform the feature extraction. First, all images were resampled to a uniform voxel size of 1 × 1 × 1 mm³. Image intensities were discretized using a bin width of 25. Subsequently, the GTV and normal uterine tissue were taken as the regions of interest (ROIs), and the open-source radiomics plugin SlicerRadiomics in 3D Slicer, combined with wavelet transform filtering, was used to extract radiomics features from the ROIs of the CT images, including first-order statistics (firstorder), gray-level co-occurrence matrix (GLCM), gray-level dependence matrix (GLDM), gray-level run-length matrix (GLRLM), gray-level size zone matrix (GLSZM), and neighbouring gray tone difference matrix (NGTDM) features. The algorithms used for feature extraction mainly followed the Image Biomarker Standardization Initiative.²¹ This study focused on texture analysis and did not include shape features. All extracted features were normalized by Z-score to achieve intensity standardization, ensuring that features from different ROIs could be compared on the same scale.
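As an illustration of the Z-score step only (not the authors' code), the following minimal sketch standardizes each feature column to zero mean and unit standard deviation. The function name `zscore_columns`, the toy feature matrix, and the use of the sample (n − 1) standard deviation are assumptions; the source does not state which denominator was used.

```python
import statistics

def zscore_columns(rows):
    """Standardize each feature column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Sample standard deviation (n - 1 denominator); this choice is an
    # assumption, since the source does not specify it.
    stds = [statistics.stdev(col) for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Hypothetical feature matrix: one row per ROI, one column per feature.
features = [
    [120.0, 0.31],
    [135.0, 0.27],
    [150.0, 0.35],
    [165.0, 0.23],
]
scaled = zscore_columns(features)
```

A constant feature column would have zero standard deviation and would need to be dropped before this step.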

Radiomics feature screening

Lasso regression offers several advantages for feature selection, especially in high-dimensional datasets. It combines feature selection and regularization by shrinking less important feature coefficients to zero, which helps prevent overfitting and enhances model interpretability.^22–23 Lasso is particularly effective in handling multicollinearity by selecting one variable from among correlated features, making it superior to traditional methods such as Pearson correlation or t-tests.²⁴ Therefore, Lasso regression was used as the main method for feature dimension reduction and selection. In the R-4.3.2 software environment, Lasso regression was performed using the “glmnet” package, compressing non-important feature coefficients to zero by introducing an absolute value function as a penalty term, while minimizing the mean square error. In addition, the variance inflation factor (VIF) was calculated to evaluate the multicollinearity problem of the selected features in the multiple regression model, and features with VIF values >5 were removed to reduce the impact of multicollinearity. The specific screening process included: feature variable standardization, generating a sequence of candidate values for the regularization parameter λ, executing five-fold cross-validation to determine the optimal λ value, screening non-zero coefficients and features with VIF ≤ 5, and finally using the optimal λ value to predict the original feature matrix to determine the key radiomics features for Radscore calculation.
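For reference, the penalized least-squares objective and the VIF criterion described above can be written as follows. The notation is assumed here rather than taken from the source: $N$ observations, $p$ features, response $y_i$, feature vector $x_i$, and intercept $\beta_0$. The $1/(2N)$ scaling follows the convention documented for glmnet with Gaussian responses.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^{2}
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert ,
\qquad
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```

Here $R_j^2$ is the coefficient of determination obtained by regressing feature $j$ on the remaining selected features, so the cut-off $\mathrm{VIF}_j > 5$ corresponds to $R_j^2 > 0.8$.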

Radscore calculation

Radscore is a comprehensive assessment index that incorporates multiple radiomics features into a single value. This value reflects the biological characteristics, prognosis, and potential treatment response of tumors or lesions. The Radscore calculation is based on the following formula:

$\text{Radscore} = \sum_{i=1}^{n} C_i \times X_i$,

where $n$ represents the number of features with non-zero coefficients, $C_i$ represents the coefficient value of the $i$-th feature selected by the Lasso regression model, and $X_i$ represents the actual measured value of the corresponding feature.
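The weighted sum above can be sketched in a few lines. The function name `radscore` and the coefficient and feature values are hypothetical, chosen only to make the arithmetic easy to follow; the sketch omits any intercept, matching the formula as written.

```python
def radscore(coefficients, feature_values):
    """Weighted sum of feature values using the Lasso coefficients."""
    if len(coefficients) != len(feature_values):
        raise ValueError("coefficients and feature values must align")
    return sum(c * x for c, x in zip(coefficients, feature_values))

# Hypothetical coefficients and standardized feature values for one ROI.
coefs = [0.5, -0.25, 1.0]
values = [2.0, 4.0, -1.5]
score = radscore(coefs, values)  # 0.5*2.0 - 0.25*4.0 + 1.0*(-1.5) = -1.5
```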

Model Construction and Validation

Based on the calculated Radscore values, we randomly divided the samples into 70% train data and 30% test data. Utilizing R-4.3.2 software, we built binary classification models employing five distinct machine learning techniques: Support Vector Machine (SVM), Random Forest (RF), Extreme Gradient Boosting (XGBoost), K-Nearest Neighbors (KNN), and Decision Tree (DT). The efficacy of these predictive models was meticulously assessed and juxtaposed through metrics such as the Area Under the Curve (AUC), Sensitivity, Accuracy, Specificity, F1 Score, and Matthews Correlation Coefficient (MCC).
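The threshold-based metrics listed above all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the counts are hypothetical (36 tumor and 36 normal ROIs are assumed for illustration and are not reported in the source).

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }

# Hypothetical counts: 36 tumor ROIs (33 detected) and 36 normal ROIs
# (25 correctly rejected).
m = binary_metrics(tp=33, fn=3, tn=25, fp=11)
```

With these counts the sketch gives a sensitivity of about 0.917, a specificity of about 0.694, an accuracy of about 0.806, an F1 score of 0.825, and an MCC of about 0.627.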

The specific training parameters for the five machine learning models are detailed below:

The Support Vector Machine (SVM) model employed a Radial Basis Function (RBF) kernel with hyperparameter optimization through grid search and Bayesian optimization. The final parameters included a cost value of 0.0136 and an RBF sigma value of 0.0426. The dataset was divided into 70% train data and 30% test data, with stratified sampling used to maintain balanced representation. The model was validated using 5-fold cross-validation.

The Random Forest model was implemented by tuning key hyperparameters, including the number of trees (ranging from 200 to 500), the number of variables sampled at each split, and the terminal node size. The final model used 200 trees and a minimum node size of 50. The dataset was split using a 70/30 train-to-test ratio, with stratified sampling employed to maintain balanced representation. The model was then validated using 5-fold cross-validation.

The XGBoost model's architecture included key hyperparameters such as mtry (number of features at each split), trees (boosting rounds), min_n (minimum samples required to split), tree_depth, learn_rate, loss_reduction, and sample_size. These parameters were tuned through grid search and Latin hypercube sampling, resulting in optimal values of mtry = 3, min_n = 8, tree_depth = 2, and learn_rate = 0.00212. The data was split 70% train and 30% test data with stratified sampling, and validated using 5-fold cross-validation.

The K-Nearest Neighbors (KNN) model was optimized by tuning hyperparameters, including the number of neighbors (k), which was varied between 3 and 11, and various weighting functions such as biweight, cosine, and epanechnikov. The distance metric used was Euclidean, and tuning was conducted using 5-fold cross-validation, followed by grid search and Bayesian optimization. The best-performing model used 15 neighbors along with the epanechnikov kernel.

The Decision Tree model was optimized by tuning parameters such as tree depth, minimum samples per split, and cost complexity for pruning. The optimal parameters were tree depth = 7, minimum samples = 16, and cost complexity = 0.00224.
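The stratified 70/30 split and 5-fold cross-validation used for the models above can be sketched as follows. This is an illustration in Python rather than the R workflow the study used; the function names, the seed, and the toy labels are assumptions, and the folds here are shuffled but not stratified.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split indices so each class keeps roughly the same train/test ratio."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return sorted(train_idx), sorted(test_idx)

def k_fold_indices(indices, k=5, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield training, validation

# Hypothetical labels: 1 = tumor ROI, 0 = normal ROI.
labels = [1] * 10 + [0] * 10
train_idx, test_idx = stratified_split(labels)
folds = list(k_fold_indices(train_idx))
```

Hyperparameters would be tuned on the cross-validation folds of the train data only, with the test data held back for the final evaluation.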

To further substantiate the classification performance of the models developed with the aforementioned algorithms, calibration curves and Decision Curve Analysis (DCA) were plotted. To determine whether the AUC values differed statistically between the train data and test data, this study employed the DeLong test for each of the five classification models. The DeLong test outputs a P value, which indicates whether the observed difference between two AUCs could be attributed to random error alone. If the P value is < .05, the null hypothesis (stating that there is no difference in performance between the two classifiers) can be rejected, and the performance difference between the two classifiers is considered significant.
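A minimal sketch of the idea behind this comparison is given below, assuming the train and test sets are independent samples so that the two AUC variances simply add. The AUC variance is estimated from DeLong's placement values; the function names and the toy scores are hypothetical, and this is not the R routine used in the study.

```python
import math

def placement_values(pos, neg):
    """DeLong structural components for positive and negative scores."""
    def psi(x, y):
        return 1.0 if x > y else 0.5 if x == y else 0.0
    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01

def auc_and_variance(pos, neg):
    """AUC (Mann-Whitney form) and its DeLong variance estimate."""
    v10, v01 = placement_values(pos, neg)
    auc = sum(v10) / len(v10)
    s10 = sum((v - auc) ** 2 for v in v10) / (len(v10) - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (len(v01) - 1)
    return auc, s10 / len(v10) + s01 / len(v01)

def unpaired_auc_test(auc_a, var_a, auc_b, var_b):
    """Two-sided z test for two AUCs from independent samples."""
    z = (auc_a - auc_b) / math.sqrt(var_a + var_b)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Hypothetical model scores (higher = more tumor-like).
auc_train, var_train = auc_and_variance(
    pos=[0.9, 0.8, 0.7, 0.6, 0.4], neg=[0.5, 0.3, 0.2, 0.1]
)
auc_test, var_test = auc_and_variance(
    pos=[0.85, 0.75, 0.45, 0.35], neg=[0.55, 0.4, 0.25]
)
z, p = unpaired_auc_test(auc_train, var_train, auc_test, var_test)
```

When two models are scored on the same cases, the paired form of the test also needs the covariance between the two AUC estimates, which this sketch leaves out.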

Statistical Analyses

In this study, R-4.3.2 software was used to construct binary classification models with five machine learning algorithms: SVM, XGBoost, KNN, RF, and DT. To evaluate model performance comprehensively, AUC, sensitivity, accuracy, specificity, F1 score, and MCC were selected as the evaluation metrics. In addition, the chi-square test was used to compare the pathological data among the clinical baseline indicators of patients in the train data and the test data, and the t test was used to compare the remaining continuous variables (P < .05 was considered statistically significant). The DeLong test was used to compare the AUC of the same model between the train data and the test data; a P value below .05 indicates a significant difference in the model's AUC between the two datasets.

Results

Baseline Characteristics

The baseline characteristics of patients are shown in Table 1. We randomly divided the 117-patient dataset into a 70:30 train-to-test ratio for model development and evaluation. The average age was similar in both sets (53.5 ± 8.1 years in the train data and 52.4 ± 8.4 years in the test data). The proportion of patients with pathological grade II was 59.3% in the train data and 61.1% in the test data. The Radscore values were −0.9 ± 1.3 and −1.2 ± 1.2 in the train and test data, respectively. There was no significant difference in baseline characteristics between the two sets.

Table 1.

Baseline Characteristics ($\bar{x} \pm s$).

Variable	Train data	Test data	P value
Age (Mean ± SD)	53.5 ± 8.1	52.4 ± 8.4	0.512
Pathological grade, n (%)			0.659
I	12.0 (14.8)	7.0 (19.4)
II	48.0 (59.3)	22.0 (61.1)	0.221
III	7.0 (8.6)	1.0 (2.8)
IV	14.0 (17.3)	6.0 (16.7)
Radscore (Mean ± SD)	−0.9 ± 1.3	−1.2 ± 1.2

Feature Screening

Radiomics features were extracted from GTV and normal cervical tissue (Figure 1) with a total of 930 features successfully extracted. Through Lasso regression analysis, we screened these features and ultimately retained 30 features with non-zero coefficients. The model's bias was minimized when the minimum value of λ was 0.013 (Figure 2).

Figure 1.

ROI of cervical cancer (the red part is GTV, and the green part is normal cervical tissue).

Figure 2.

Panel A shows the five-fold cross-validation curve used to select the Lasso tuning parameter (λ) by the minimum criterion. The vertical line on the left marks the optimal λ value, and the dashed line on the right marks the largest λ value within one standard error of the minimum error. Panel B shows the Lasso coefficient profiles plotted against log(λ).

After screening, the Variance Inflation Factor (VIF) of the independent variables in the Lasso regression model was calculated. After Lasso logistic regression analysis, variables with a VIF > 5 were removed to eliminate features that could lead to overfitting due to high multicollinearity, until the logistic model converged. A total of 18 imaging features were retained, all with a VIF < 5, indicating that there is no multicollinearity among the 18 imaging features (Table 2). The Radscore was calculated based on the 18 selected features after screening.

Table 2.

18 Selected Radiomics Features and Their VIF Values.

selected radiomics features	VIF
original.firstorder.Maximum	1.48429
original.gldm.DependenceVariance	1.7647
log.sigma.1.0.mm.3D.glszm.GrayLevelNonUniformityNormalized	2.8616
wavelet.LHH.glszm.SizeZoneNonUniformity	3.0163
wavelet.HLH.glszm.SizeZoneNonUniformity	3.9113
original.glszm.SmallAreaLowGrayLevelEmphasis	2.2188
wavelet.LLH.glszm.SizeZoneNonUniformityNormalized	1.8692
wavelet.HHH.glszm.SmallAreaEmphasis	3.9676
wavelet.HHH.glszm.SmallAreaLowGrayLevelEmphasis	3.9822
wavelet.HHL.glszm.SizeZoneNonUniformityNormalized	2.0131
original.ngtdm.Coarseness	2.2860
wavelet.LHL.firstorder.Maximum	4.1582
wavelet.LHL.ngtdm.Busyness	3.2304
wavelet.LHH.firstorder.Skewness	2.5440
wavelet.HLL.firstorder.Maximum	4.7140
wavelet.HHL.firstorder.Entropy	4.3355
wavelet.HHL.glrlm.LongRunLowGrayLevelEmphasis	2.1100
wavelet.LLL.glcm.MCC	2.1503

Model Construction

The Radscore was used to construct binary classification models based on SVM, RF, XGBoost, KNN, and DT algorithms. The ROC curves of the models are shown in Figure 3. The calibration curves and decision curves for the test data are presented in Figures 4 and 5, respectively. The AUC values, sensitivity, accuracy, specificity, F1 scores, and MCC of the classification models for the train and test data of the five models are shown in Table 3. The results of the DeLong test are shown in Table 4. For the test data, the AUC of all five models ranged from 0.8866 to 0.9190, and all values exceeded 0.8. Notably, the XGBoost model achieved an AUC of 0.8737 (95% CI: 0.8198-0.9277) in the train data and 0.9190 (95% CI: 0.8525-0.9854) in the test data. The results show that the P values for all five classification models are > 0.05, indicating no significant difference in classification performance between the train and test data, thereby verifying the stability and reliability of these models.

Figure 3.

ROC curves of the five classification models. Panels A-E show SVM, RF, KNN, XGBoost, and DT, respectively.

Figure 4.

Calibration curves of the five classification models. Panels A-E show SVM, RF, KNN, XGBoost, and DT, respectively.

Figure 5.

Decision curves of the five classification models. Panels A-E show SVM, RF, KNN, XGBoost, and DT, respectively.

Table 3.

AUC, Sensitivity, Accuracy, Specificity, F1 Score, and MCC for Train Data and Test Data in SVM, RF, KNN, DT, and XGBoost Models.

models	AUC	Sensitivity	Accuracy	Specificity	F1 Score	MCC
SVM(train data)	0.8802	0.8770	0.8090	0.7410	0.8208	0.6230
SVM(test data)	0.9144	0.9170	0.8060	0.6940	0.8250	0.6270
RF(train data)	0.8916	0.8770	0.8090	0.7410	0.8208	0.6230
RF(test data)	0.9078	0.9170	0.8060	0.6940	0.8250	0.6270
KNN(train data)	0.9003	0.8770	0.8270	0.7780	0.8353	0.6580
KNN(test data)	0.9051	0.8890	0.7920	0.6940	0.8101	0.5950
DT(train data)	0.8917	0.9010	0.8520	0.8020	0.8588	0.7070
DT(test data)	0.8866	0.8890	0.7780	0.6670	0.8000	0.5700
XGBoost(train data)	0.8737	0.8890	0.8020	0.7160	0.8182	0.6140
XGBoost(test data)	0.9190	0.9170	0.8060	0.6940	0.8250	0.6270

Table 4.

95% CI of AUC Values and DeLong Test Results for Train Data and Test Data in SVM, RF, KNN, DT, and XGBoost Models.

models	95% CI of AUC	P value
SVM(train data)	0.8284-0.9320	0.4502
SVM(test data)	0.8427-0.9860	0.4502
RF(train data)	0.8437-0.9396	0.7237
RF(test data)	0.8323-0.9833	0.7237
KNN(train data)	0.8538-0.9468	0.9084
KNN(test data)	0.8385-0.9717	0.9084
DT(train data)	0.8404-0.9430	0.9131
DT(test data)	0.8102-0.9630	0.9131
XGBoost(train data)	0.8198-0.9277	0.3015
XGBoost(test data)	0.8525-0.9854	0.3015
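As a rough consistency check on Table 4, the P values can be approximated from the reported AUCs and confidence intervals alone. The sketch below assumes that the intervals are symmetric normal-approximation 95% intervals and that the train and test sets are independent; the function names are illustrative, and this is not the DeLong implementation used in the study.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal

def se_from_ci(lower, upper):
    """Back out a standard error from a symmetric 95% interval."""
    return (upper - lower) / (2 * Z_95)

def two_sided_p(auc_a, ci_a, auc_b, ci_b):
    """Approximate P value for the difference of two independent AUCs."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    z = (auc_a - auc_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return math.erfc(abs(z) / math.sqrt(2))

# AUCs from Table 3 and intervals from Table 4 (train vs test).
p_xgb = two_sided_p(0.8737, (0.8198, 0.9277), 0.9190, (0.8525, 0.9854))
p_svm = two_sided_p(0.8802, (0.8284, 0.9320), 0.9144, (0.8427, 0.9860))
```

Under these assumptions the sketch gives P values of roughly 0.30 for XGBoost and 0.45 for SVM, close to the 0.3015 and 0.4502 listed in Table 4.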

Discussion

Radiotherapy has long been the standard treatment for cervical cancer. In recent years, most studies on artificial intelligence in cervical cancer have focused on early diagnosis and efficacy prediction (Table 5). However, accurate classification and recognition of tumor targets and organs at risk in cervical cancer remains a significant challenge in radiotherapy. This study constructed models for classifying cervical cancer tumor tissue and normal tissue using CT radiomics combined with machine learning algorithms. The models performed well on the test data, with AUC values ranging from 0.8866 to 0.9190 (all exceeding 0.8). The sensitivity of all models fell within the range of 0.8770 to 0.9170, indicating a high ability to correctly identify cancer cases. The accuracy of all models was stable between 0.7920 and 0.8060. The specificity of all models ranged from 0.6670 to 0.6940. The F1 scores of all models were between 0.8000 and 0.8250. Taken together, these metrics indicate good overall performance. In addition, the MCC values of the five models were between 0.5700 and 0.7070, which indicates a strong positive correlation between the predicted and actual results of the five classification models. Therefore, it is feasible to apply CT-based imaging methods to the classification of cervical cancer GTV and normal uterus.

Table 5.

Research Projects on Artificial Intelligence Applications in Cervical Cancer (2022–2024).

Year	Title	Key Findings/Contributions
2024	Machine Learning-Based Radiomics for Predicting Outcomes in Cervical Cancer Patients Undergoing Concurrent Chemoradiotherapy¹⁸	Demonstrated the use of machine learning and radiomics to predict treatment outcomes in cervical cancer patients undergoing concurrent chemoradiotherapy.
2024	Artificial intelligence for cervical cancer screening: Scoping review, 2009-2022²⁵	Compared the performance of machine learning algorithms in early diagnosis of cervical cancer.
2024	Enhancing Cervical Cancer Detection and Robust Classification through a Fusion of Deep Learning Models²⁶	Introduced a machine learning algorithm, highlighting the potential of intelligent automation in medical diagnostics.
2023	Cervical Cancer Screening: A Review²⁷	Provided a comprehensive review of current cervical cancer screening methods.
2023	An Interpretable Clinical Ultrasound-Radiomics Combined Model for Diagnosis of Stage I Cervical Cancer²⁸	Introduced an interpretable model combining clinical ultrasound and radiomics features for the diagnosis of stage I cervical cancer.
2022	Using Radiomics and Machine Learning Applied to MRI to Predict Response to Neoadjuvant Chemotherapy in Locally Advanced Cervical Cancer²⁹	Evaluated the effects of automatic segmentation algorithms on ultrasound radiomics models for predicting lymph node metastasis in cervical cancer.

First, the application of radiomics technology has provided a new perspective for the diagnosis of cervical cancer. Traditional diagnostic methods rely on the experience and subjective judgment of physicians, whereas radiomics can extract a multitude of quantitative features from CT images, offering a new means of assessing tumor heterogeneity.³⁰ The use of this technology not only reduces human bias but also enhances diagnostic precision.²⁸

Second, the introduction of machine learning algorithms has further enhanced the model's classification capabilities.^31,32,33 In this study, we employed five distinct machine-learning techniques, including SVM, RF, KNN, XGBoost, and DT. KNN is relatively simple and intuitive, relying on proximity for classification. However, it can struggle with feature selection in high-dimensional spaces and is computationally intensive during the prediction phase.³⁴ DT models offer a straightforward approach with inherent interpretability, making feature selection easy to visualize; however, they are prone to overfitting and may not perform well on small or noisy datasets without pruning techniques.³⁵ In contrast, SVM excels in high-dimensional spaces and is effective at finding the optimal hyperplane for classification tasks, making it a reliable choice for radiomics.³⁶ Nonetheless, it can be computationally expensive, particularly with large datasets.³⁷ RF provides robust feature selection through its ensemble of DTs, offering resistance to overfitting and better interpretability of the selected features. However, its complexity can lead to longer training times.³⁸ XGBoost is known for its efficiency and high predictive power,³⁹ especially in handling sparse and unbalanced datasets; however, it often requires careful tuning of hyperparameters to avoid overfitting and ensure optimal performance.⁴⁰ XGBoost leverages a gradient boosting framework that sequentially builds models, typically DTs, each model improving on the errors of the previous ones. 
This approach is highly effective for minimizing loss functions, making it suitable for handling imbalanced datasets and enhancing predictive accuracy.^40–41 A significant advantage of XGBoost is its use of L1 (Lasso) and L2 (Ridge) regularization, which helps prevent overfitting and balances simplicity with generalization.^21–22 XGBoost also handles sparse data well, which is common in real-world applications such as medical diagnostics, as it manages missing values and improves robustness and efficiency.⁴² Its scalability and parallel processing capabilities allow for the rapid processing of large datasets, such as those in medical imaging, making it highly efficient for large-scale data analyses.⁴⁰ The XGBoost model was selected as the optimal model due to its high AUC values in both the train and test data. The AUC value of the XGBoost model in the test data reached 0.9190, which is particularly important in medical image analysis.^43,44,45,26

Moreover, feature selection using Lasso regression was a crucial step in this study. Lasso regression reduced the 930 extracted features to 30 with non-zero coefficients, and subsequent VIF screening retained 18 key features, indicating their relevance to the classification results. This process not only improved the model's interpretability but also reduced its complexity, facilitating rapid and accurate classification in practical applications.

However, this study also had some limitations. The relatively small sample size might affect the model's generalization. Future studies should validate the model on a larger sample size to confirm its stability and reliability. Additionally, this study focused solely on CT images for analysis, excluding other potentially valuable clinical information, such as the patient's age and medical history. Future studies could consider incorporating these factors into the model to enhance the comprehensiveness of the diagnosis.

Furthermore, despite the excellent performance of the XGBoost model in this study, the interpretability of machine learning models remains a challenge. In clinical applications, physicians and patients may need to understand the model's decision-making process to facilitate trust in the diagnostic results. Therefore, future research should explore methods to improve the transparency and interpretability of models.

Lastly, the conclusions of this study need to be verified in a broader clinical setting. Although the model showed good performance in both the train and test data, it may face different challenges in practical application, such as differences in image quality and the impact of different devices.^46,47 Therefore, future research should test and optimize the model in a multicenter, multi-device environment.

Conclusion

This study demonstrates the potential of machine learning models based on CT radiomics in the diagnosis of cervical cancer. With further research and validation, this approach is expected to become an important tool in the diagnosis and treatment planning of cervical cancer. Our future work will focus on expanding the sample size, integrating multi-source data, improving the model's interpretability, and conducting model validation in a broader clinical context.

Footnotes

Author Contribution

Huai-wen Zhang, Jinghong Pei and Hao-wen Pang conceived of the presented idea. Huai-wen Zhang and Hao-wen Pang collected the planning data of all patients in this study. Huai-wen Zhang, Jinghong Pei and Hao-wen Pang took the lead in writing the manuscript. All authors provided critical feedback and helped shape the research, analysis, and manuscript.

Availability of Data and Materials

All data generated and analyzed during this study are included in this published article.

Consent for Publication

Consent for publication is not applicable to this study because it contains no individual person's data.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethics Approval and Consent to Participate

This retrospective study was approved by the ethics committees of Jiangxi Cancer Hospital ([2024-04-26] 2024ky057) and the Second People's Hospital of Jingdezhen ([2024-1-10] 2024-LLLW-06). Due to the retrospective nature of this study, the ethics committees of the two hospitals waived the requirement for informed consent and confirmed compliance with the Declaration of Helsinki and the confidentiality of the patient data.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Science and technology plan of Jingdezhen Health Commission, The Xuyong County People's Hospital- Southwest Medical University Science and Technology Strategic Cooperation Project, The Open Fund for Scientific Research of Jiangxi Cancer Hospital, Science and technology plan of Jiangxi Provincial Health Commission, the Gulin County People's Hospital-The Affiliated Hospital of Southwest Medical University Science and Technology Strategic Cooperation Project, (grant number 20231SFZC076, 2024XYXNYD05, 2021J15, 202310876, 2022GLXNYDFY05).

ORCID iD

Huaiwen Zhang

