Abstract: Wall-climbing robots used for bridge inspection struggle to identify concrete defects that are densely clustered, visually complex, and widely varying in scale, which leads to false positives and false negatives. To address this, we propose CDM-YOLO, a lightweight algorithm based on YOLOv12n. First, to handle multi-scale defects, we introduce a multi-scale feature fusion network into the backbone, which strengthens its ability to extract diverse, fine-grained features and lets the model adapt to defects of varying scales. Second, to reduce confusion between different defect types, a dynamic tanh mechanism is employed in the neck to sharpen feature focus, so that defects are distinguished more clearly and false positives and false negatives are reduced. Finally, for densely clustered defects, the CARAFE algorithm is applied in the neck to strengthen the flow of deep semantic information, improving the model's ability to identify dense defects. Together, these changes improve detection accuracy while preserving real-time performance and a lightweight design. Compared with YOLOv12n, CDM-YOLO improves mAP (IoU=0.5) by 2.43% and recall by 3.25%. These results indicate better handling of multi-scale and dense defects with fewer false positives and false negatives, making the model suitable for wall-climbing robots and field equipment with limited computational resources.
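The abstract does not define the dynamic tanh mechanism, so the following is only a minimal sketch under an assumption: that it takes the commonly used element-wise form y = gamma * tanh(alpha * x) + beta, where alpha, gamma, and beta are learnable. The function name `dynamic_tanh`, the scalar parameters, and the sample values are hypothetical and are not taken from CDM-YOLO itself.

```python
import math
from typing import List


def dynamic_tanh(x: List[float], alpha: float = 0.5,
                 gamma: float = 1.0, beta: float = 0.0) -> List[float]:
    """Element-wise dynamic tanh: gamma * tanh(alpha * x) + beta.

    Assumption: alpha, gamma and beta would be learned during training;
    here they are fixed scalars purely for illustration.
    """
    return [gamma * math.tanh(alpha * v) + beta for v in x]


# Hypothetical feature activations, including two large-magnitude outliers.
features = [-8.0, -1.0, 0.0, 1.0, 8.0]
squashed = dynamic_tanh(features)

for raw, out in zip(features, squashed):
    print(f"{raw:+.1f} -> {out:+.4f}")
```

With these illustrative parameters, extreme activations are squashed into the open interval (-1, 1) while values near zero pass through almost linearly, which is one plausible reading of how such a mechanism could sharpen feature focus in the neck.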