Abstract
Existing saliency detection methods have made great progress in extracting multi-level features; however, capturing accurate long-range dependencies that strengthen semantic information remains a challenging problem. To address this, a Transformer-based multi-scale attention and boundary enhancement with long-range dependency (MSBE) network is proposed in this paper. A multi-scale attention enhancement module (MSAEM) is designed to suppress redundant or noisy features and generate a high-quality feature representation by integrating multiple attentional features from diverse perspectives. The high-quality features are then fed into a triple Transformer encoder embedding module (TEM), which enhances high-level semantic features by learning long-range dependencies across layers. In the decoder, a cross-layer feature fusion module (CLFFM) and a boundary enhancement module (BEM) are designed to improve feature fusion and produce accurate predictions. Extensive experiments on six challenging public datasets demonstrate that the proposed method achieves competitive performance.
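The abstract does not give the exact formulation of the MSAEM, so the following is only a minimal conceptual sketch of the general idea it names: combining feature representations from several scales through softmax-normalized attention weights. All function names, the scalar per-scale scores, and the plain-list representation are illustrative assumptions, not the authors' implementation (which operates on convolutional feature maps).

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of raw attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_multiscale(features, scores):
    """Attention-weighted fusion of multi-scale feature vectors.

    features: list of equal-length feature vectors, one per scale
              (stand-ins for the per-scale feature maps in the paper).
    scores:   one raw attention score per scale; in the real model these
              would come from a learned scoring branch, not be fixed.
    Returns the fused vector and the normalized attention weights.
    """
    weights = softmax(scores)
    fused = [0.0] * len(features[0])
    for w, feat in zip(weights, features):
        for i, v in enumerate(feat):
            fused[i] += w * v
    return fused, weights
```

With equal scores the fusion reduces to a plain average of the scale features; unequal scores let the attention emphasize the more informative scale, which is the effect the MSAEM is described as achieving.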
