Abstract
Accurately locating product target information is crucial for improving competitiveness and brand image. However, traditional methods are often inefficient and lack robustness in complex visual environments. This study proposes an improved product target information localization model (I-PFPN), which uses a dense connection module as its backbone to extract multi-scale feature information. A dynamic convolution module then adaptively fuses the responses of different convolution kernels, while an attention mechanism enhances key regions. Finally, a multi-stage feature refinement module progressively optimizes edge and structural information, producing high-quality saliency maps and improving localization accuracy and model robustness. Compared with the baseline model without refinement, introducing three feature refinement modules increases the F-measure by 0.026, while dynamic convolution achieves an optimal F-measure of 0.951. Moreover, combining two feature refinement modules with dynamic convolution reduces the MAE by 0.018. Compared with four state-of-the-art models (Capsal, PiCANet, PoolNet, and DGRL), I-PFPN consistently outperforms them in F-measure and PR curve evaluations. In practice, the model completes product target localization within approximately 0.5 s per image, making it a fast and effective tool for enterprise-level applications in dynamic market environments.
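The dynamic convolution step described above (adaptively fusing responses from several candidate kernels) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the shapes, the single 1×1 convolution, and the gating weights are illustrative assumptions: a global-average-pooled descriptor of the input drives a softmax over K candidate kernels, and the attention-weighted mixture of kernels is applied as one convolution.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D vector
    e = np.exp(x - x.max())
    return e / e.sum()

def dynamic_conv1x1(x, kernels, gate_w):
    # x: input feature map, shape (C_in, H, W)          -- illustrative shapes
    # kernels: K candidate 1x1 kernels, (K, C_out, C_in)
    # gate_w: gating weights for the attention branch, (K, C_in)
    pooled = x.mean(axis=(1, 2))                   # global average pooling -> (C_in,)
    attn = softmax(gate_w @ pooled)                # per-kernel attention -> (K,)
    fused = np.tensordot(attn, kernels, axes=1)    # attention-weighted kernel (C_out, C_in)
    return np.tensordot(fused, x, axes=([1], [0])) # apply fused 1x1 conv -> (C_out, H, W)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
kernels = rng.standard_normal((3, 16, 8))
gate_w = rng.standard_normal((3, 8))
y = dynamic_conv1x1(x, kernels, gate_w)
print(y.shape)  # -> (16, 4, 4)
```

Because the kernels are mixed before the convolution is applied, the cost per input is close to a single static convolution even though K kernels are learned.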
