Abstract
Objective
This study aimed to develop a robust framework for breast cancer diagnosis by integrating advanced segmentation and classification approaches. Transformer-based and U-Net segmentation models were combined with radiomic feature extraction and machine learning classifiers to improve segmentation precision and classification accuracy in mammographic images.
Materials and Methods
A multi-center dataset of 8000 mammograms (4200 normal, 3800 abnormal) was used. Segmentation was performed using Transformer-based and U-Net models, evaluated through Dice Coefficient (DSC), Intersection over Union (IoU), Hausdorff Distance (HD95), and Pixel-Wise Accuracy. Radiomic features were extracted from segmented masks, with Recursive Feature Elimination (RFE) and Analysis of Variance (ANOVA) employed to select significant features. Classifiers including Logistic Regression, XGBoost, CatBoost, and a Stacking Ensemble model were applied to classify tumors into benign or malignant. Classification performance was assessed using accuracy, sensitivity, F1 score, and AUC-ROC. SHAP analysis validated feature importance, and Q-value heatmaps evaluated statistical significance.
Results
The Transformer-based model achieved superior segmentation results with DSC (0.94 ± 0.01 training, 0.92 ± 0.02 test), IoU (0.91 ± 0.01 training, 0.89 ± 0.02 test), HD95 (3.0 ± 0.3 mm training, 3.3 ± 0.4 mm test), and Pixel-Wise Accuracy (0.96 ± 0.01 training, 0.94 ± 0.02 test), consistently outperforming U-Net across all metrics. For classification, Transformer-segmented features with the Stacking Ensemble achieved the highest test results: 93% accuracy, 92% sensitivity, 93% F1 score, and 95% AUC. U-Net-segmented features achieved lower metrics, with the best test accuracy at 84%. SHAP analysis confirmed the importance of features like Gray-Level Non-Uniformity and Zone Entropy.
Conclusion
This study demonstrates the superiority of Transformer-based segmentation integrated with radiomic feature selection and robust classification models. The framework provides a precise and interpretable solution for breast cancer diagnosis, with potential for scalability to 3D imaging and multimodal datasets.
Keywords
Get full access to this article
View all access options for this article.
