Abstract
Steel surface defect detection is a critical component of quality control in intelligent manufacturing, as its effectiveness directly influences product qualification rates and production costs. This study proposes a precise defect detection model, LDSE-YOLO. Conventional spatial attention mechanisms focus solely on spatial features and do not resolve the limitations imposed by the parameter sharing of convolutional kernels. In addition, traditional feature pyramid networks lack effective multi-scale contextual modeling, and existing attention mechanisms are often restricted to a single domain, which makes robust object representation and background suppression difficult under complex conditions. To address these limitations, a Local Dynamic Convolution module (LDConv) is first introduced. Unlike static convolutions with fixed patterns, LDConv employs a dynamic weight allocation mechanism to enhance the representation of fine-grained defects. Next, a Spatial-Context Attention Module (SCAM) is proposed, which integrates dilated convolution and adaptive spatial attention to construct a feature pyramid with improved multi-scale perception. This design combines large-receptive-field feature extraction with dual spatial-channel attention to decouple defect features from background noise in texture-rich environments. Furthermore, an Enhanced Occlusion Attention Module (EOAM) is incorporated to strengthen the representation of occluded areas, suppress background interference, and reinforce spatial-channel attention, thereby improving the detection of small and partially occluded defects. Experimental results demonstrate that the proposed LDSE-YOLO model achieves superior overall detection performance on the NEU-DET and GC10-DET benchmark datasets, with mAP@0.5 improvements of 4.3% and 2.1%, respectively, compared to mainstream baseline models.
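As a brief illustration of why dilated convolution yields a large receptive field, the following minimal sketch computes the effective kernel size of a dilated convolution and the receptive field of a stride-1 stack. The function names, the 3×3 kernel size, and the dilation rates (1, 2, 3) are assumptions chosen for illustration only; the abstract does not specify the configuration used in SCAM.

```python
from typing import Iterable, Tuple


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel along a single axis.

    A kernel with k taps and dilation d places (d - 1) gaps between
    adjacent taps, so it spans k + (k - 1) * (d - 1) input positions.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers: Iterable[Tuple[int, int]]) -> int:
    """Receptive field of a stack of stride-1 convolutions.

    `layers` is a sequence of (kernel_size, dilation) pairs. With stride 1,
    each layer widens the receptive field by (effective kernel size - 1).
    """
    receptive_field = 1
    for kernel_size, dilation in layers:
        receptive_field += effective_kernel_size(kernel_size, dilation) - 1
    return receptive_field


# Hypothetical configurations, not taken from the paper.
plain_stack = [(3, 1), (3, 1), (3, 1)]      # three ordinary 3x3 convolutions
dilated_stack = [(3, 1), (3, 2), (3, 3)]    # same parameter count, growing dilation

plain_rf = stacked_receptive_field(plain_stack)
dilated_rf = stacked_receptive_field(dilated_stack)
print(plain_rf, dilated_rf)
```

Under these assumed settings, the dilated stack covers a wider input region than the plain stack with the same number of kernel weights, which is the property the abstract refers to as large-receptive-field feature extraction.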
