Abstract: Small-scale surface defects on steel have low contrast and occupy only a small proportion of the image, which makes it difficult to extract rich defect features. To address this problem, this paper proposes a small-target defect detection method that builds on the relationship between contextual information integration and enhanced feature fusion. First, we incorporate the Swin Transformer, whose shifted-window mechanism integrates feature information from different blocks hierarchically and within local windows, enhancing the contrast of defect features while reducing the density of convolutional operations. Second, the model employs Coordinate Attention to capture more positional information, which increases the diversity of features related to small-target defects. Building on these components, we propose SFNet, a steel surface small-target defect detection model based on self-attention feature fusion, which uses the CSP-FCN feature fusion module to integrate features with richer semantic information across different scales. Experimental results show that SFNet outperforms current classic object detection models on the NEU-DET and GC10-DET public datasets, improving average precision by 3% and 3.7%, respectively, while reducing the parameter count to half of its original size.
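
As an illustration of the local-window idea mentioned above, the following minimal sketch splits a feature map into non-overlapping square windows, the regions within which a Swin-style model computes self-attention. It is not the SFNet implementation: the function name `window_partition`, the 4x4 toy map, and the window size of 2 are hypothetical choices made for illustration, and plain Python lists stand in for the tensors a real model would use.

```python
def window_partition(feature_map, window_size):
    """Split a 2-D grid (a list of rows) into non-overlapping square windows.

    Returns the windows in row-major order; each window is a list of rows.
    Illustrative only: real models apply this to batched multi-channel tensors.
    """
    height = len(feature_map)
    width = len(feature_map[0]) if height else 0
    if height % window_size or width % window_size:
        raise ValueError("height and width must be divisible by window_size")

    windows = []
    for top in range(0, height, window_size):
        for left in range(0, width, window_size):
            # Take window_size rows, then window_size columns from each row.
            window = [row[left:left + window_size]
                      for row in feature_map[top:top + window_size]]
            windows.append(window)
    return windows


# Hypothetical 4x4 toy feature map with values 0..15, split into 2x2 windows.
toy_map = [[r * 4 + c for c in range(4)] for r in range(4)]
toy_windows = window_partition(toy_map, 2)
```

Restricting attention to each window keeps the cost proportional to the number of windows rather than to all pairs of positions in the full map. Shifting the window grid between successive layers, which this sketch omits, is what lets information flow across window boundaries.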