Abstract: To address the low accuracy and insufficient generalization of current transformer appearance defect detection models in complex scenarios, this paper leverages the merits of the residual structure to improve YOLOv11n with three modules. Firstly, an inverted residual attention mechanism, iEMA, is designed; it effectively exploits long-range dependencies to improve the accuracy of transformer defect detection. Secondly, drawing on the advantages of depthwise separable convolution for multi-scale feature extraction and the characteristics of the residual structure, an MSCB structure is designed to strengthen the model's feature extraction and fusion capabilities. Thirdly, to address missed detections caused by insufficient use of contextual information in YOLOv11's detection head, an MR-Detect head is designed; it integrates grouped convolution with the residual concept to provide richer feature representations for classification. Finally, the non-maximum suppression algorithm is combined with the Inner_MPDIoU loss to overcome the limitations of traditional loss functions when detecting irregular objects and objects with large size variations. Experimental results show that, compared with YOLOv11n, the improved algorithm increases mAP@0.5 by 5.9% and recall by 2.8% while maintaining real-time detection. It achieves higher detection accuracy in complex transformer operating-condition scenarios and detects various types of defects more effectively.