Abstract
Accurate medical image segmentation models can efficiently assist healthcare professionals in diagnosis. Segmentation methods based on Convolutional Neural Networks (CNNs) are effective at extracting local features, but their inherently limited receptive fields hinder the integration of global dependencies and the extraction of multi-scale features. This limitation has prompted researchers to explore Transformer-based approaches, whose self-attention mechanism effectively models global dependencies and facilitates multi-scale feature extraction. In this study, we design FRISFormer, a medical image segmentation model based on a U-shaped architecture. FRISFormer is built entirely on Transformers and can be trained effectively from scratch, without relying on pre-trained weights. Its innovation lies in two key aspects: (1) FRISFormer refines the features extracted by the Efficient Self-Attention (ESA) module through a Feature Refinement Feed-forward Network (FRFN), achieving deeper deconstruction and enhancement of features; (2) FRISFormer replaces classic skip connections with a ReMixed Transformer Context Bridge, effectively strengthening the correlation between global dependencies and local context. We evaluated FRISFormer on a multi-organ segmentation dataset (Synapse) and a skin lesion segmentation dataset (ISIC 2018). FRISFormer improved the evaluation metric by 0.50 on the Synapse dataset and by 0.23 on the ISIC 2018 dataset. These results demonstrate the effectiveness and superiority of FRISFormer in feature representation and segmentation accuracy.
