Abstract
Existing saliency detection methods have made great progress in extracting multi-level features; however, capturing accurate long-range dependencies that strengthen semantic information remains a challenging problem. To address this, a Transformer-based multi-scale attention and boundary enhancement with long-range dependency (MSBE) network is proposed in this paper. A multi-scale attention enhancement module (MSAEM) is designed to suppress redundant or noisy features and generate a high-quality feature representation by integrating multiple attentional features from diverse perspectives. The high-quality features are then fed into a triple Transformer encoder embedding module (TEM), which enhances high-level semantic features by learning long-range dependencies across layers. In the decoder, a cross-layer feature fusion module (CLFFM) and a boundary enhancement module (BEM) are designed to improve feature fusion and produce accurate predictions. Extensive experiments on six challenging public datasets demonstrate that the proposed method achieves competitive performance.
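The abstract does not give the exact formulation of the MSAEM, so the following is only a minimal conceptual sketch of the general idea it names: combining feature representations from several scales through softmax-normalized attention weights. All function names, the scalar per-scale scores, and the plain-list representation are illustrative assumptions, not the authors' implementation (which operates on convolutional feature maps).

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of raw attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_multiscale(features, scores):
    """Attention-weighted fusion of multi-scale feature vectors.

    features: list of equal-length feature vectors, one per scale
              (stand-ins for the per-scale feature maps in the paper).
    scores:   one raw attention score per scale; in the real model these
              would come from a learned scoring branch, not be fixed.
    Returns the fused vector and the normalized attention weights.
    """
    weights = softmax(scores)
    fused = [0.0] * len(features[0])
    for w, feat in zip(weights, features):
        for i, v in enumerate(feat):
            fused[i] += w * v
    return fused, weights
```

With equal scores the fusion reduces to a plain average of the scale features; unequal scores let the attention emphasize the more informative scale, which is the effect the MSAEM is described as achieving.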
