Abstract
(1) Medical image segmentation is crucial for disease diagnosis, surgical planning, and therapeutic monitoring, but existing methods struggle with the complex structure of human organs: substantial size variation, indistinct boundaries, and low inter-tissue contrast. (2) To address these challenges, we propose SDM-UNet, a hybrid network that integrates CNN and Transformer modules. The architecture has two main components. A Multi-Attention Feature Refinement (MAFR) block replaces the Swin-UNet bottleneck and combines adaptive kernel convolution, enhanced convolution, and channel attention to improve local feature extraction. Multi-Fusion Dense Skip Connections fuse multi-scale features between the encoder and decoder to mitigate the spatial information lost during downsampling. (3) We trained SDM-UNet in PyTorch with ImageNet-pretrained weights and evaluated it on the Synapse multi-organ CT and ACDC cardiac MRI datasets using the Dice Similarity Coefficient (DSC) and the 95th percentile Hausdorff Distance (HD95). (4) SDM-UNet achieves an average DSC of 80.51% and HD95 of 22.09 mm on Synapse, and an average DSC of 90.58% and HD95 of 1.12 mm on ACDC, outperforming state-of-the-art methods such as Swin-UNet and SCUNet++. These results indicate that it balances global context understanding with local detail preservation.
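For readers unfamiliar with the DSC metric reported above, the following is a minimal sketch of how a per-class and mean Dice score can be computed over label maps. It is illustrative only and is not the evaluation code used for SDM-UNet: the function names `dice_coefficient` and `mean_dice`, the flattened-list input format, and the convention of scoring 1.0 when a class is absent from both maps are all assumptions made here. Evaluation pipelines typically operate on array or tensor volumes rather than Python lists.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int], label: int) -> float:
    """Dice Similarity Coefficient for one class over flattened label maps.

    DSC = 2 * |P ∩ T| / (|P| + |T|), where P and T are the sets of voxels
    assigned to `label` in the prediction and the ground truth.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of voxels")

    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)

    denom = pred_count + target_count
    if denom == 0:
        # Assumed convention: a class absent from both maps counts as a perfect match.
        return 1.0
    return 2.0 * overlap / denom


def mean_dice(pred: Sequence[int], target: Sequence[int], labels: Sequence[int]) -> float:
    """Average the per-class DSC over the given foreground labels."""
    return sum(dice_coefficient(pred, target, c) for c in labels) / len(labels)


# Toy example: label 0 is background, labels 1 and 2 are two hypothetical organs.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]

dsc_organ_1 = dice_coefficient(pred, target, 1)  # 2 * 1 / (2 + 1)
dsc_organ_2 = dice_coefficient(pred, target, 2)  # 2 * 2 / (2 + 3)
average_dsc = mean_dice(pred, target, [1, 2])
```

A reported "average DSC" is usually this mean over foreground classes, multiplied by 100 to give a percentage. HD95 is a separate, boundary-based metric and is not covered by this sketch.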
