Abstract: To improve multi-scale object detection, particularly the accuracy of small object detection, and thereby help reduce the probability of traffic accidents, this study proposes an enhanced YOLO11 model with a multi-scale context-enhanced attention mechanism for vehicle detection. First, the RPCSPELAN5 structure is designed and introduced into the backbone network to replace the C3k2 module, strengthening feature extraction and information aggregation. Second, a DSM module is designed and added to the neck network; it combines a dynamic upsampling mechanism with a simple, parameter-free attention mechanism to improve feature fusion for small objects. Third, the neck network is further improved by adopting a Haar wavelet-based downsampling module, which enhances semantic segmentation performance and contextual continuity. Experiments on the VOC2012 and COCO datasets show improvements across all reported evaluation metrics. On VOC2012, precision (P), recall (R), mAP50, and mAP50-95 improved by 0.2%, 5.3%, 3.4%, and 4.2%, respectively. On COCO, the corresponding improvements were 7.7%, 6.0%, 8.7%, and 6.5%. These results indicate that the proposed algorithm performs better in multi-scale object detection, particularly in small object detection accuracy, which increases vehicle detection precision and can contribute to reducing traffic accidents.
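To illustrate the idea behind Haar wavelet-based downsampling, the following is a minimal sketch in plain Python, not the module used in this study. It applies a single-level 2D Haar transform to one feature map, halving the spatial resolution while moving the discarded detail into three extra sub-bands, so the original map can be recovered exactly. The function names `haar_downsample` and `haar_reconstruct` and the sub-band names are hypothetical, and the step of stacking the sub-bands as channels and passing them through a learned convolution is an assumption about how such a module would typically be completed; it is omitted here.

```python
def haar_downsample(feature_map):
    """Single-level 2D Haar transform of a 2D list with even height and width.

    Returns four sub-bands, each half the height and width of the input:
    approx (scaled local sum), col_diff (left minus right columns),
    row_diff (top minus bottom rows), and diag (diagonal difference).
    """
    h, w = len(feature_map), len(feature_map[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")
    approx, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, h, 2):
        r_ap, r_cd, r_rd, r_dg = [], [], [], []
        for j in range(0, w, 2):
            # The four values of one non-overlapping 2x2 block.
            a, b = feature_map[i][j], feature_map[i][j + 1]
            c, d = feature_map[i + 1][j], feature_map[i + 1][j + 1]
            r_ap.append((a + b + c + d) / 2)
            r_cd.append((a - b + c - d) / 2)
            r_rd.append((a + b - c - d) / 2)
            r_dg.append((a - b - c + d) / 2)
        approx.append(r_ap)
        col_diff.append(r_cd)
        row_diff.append(r_rd)
        diag.append(r_dg)
    return approx, col_diff, row_diff, diag


def haar_reconstruct(approx, col_diff, row_diff, diag):
    """Invert haar_downsample, showing that no information was discarded."""
    out = []
    for i in range(len(approx)):
        top, bottom = [], []
        for j in range(len(approx[0])):
            s, cd = approx[i][j], col_diff[i][j]
            rd, dg = row_diff[i][j], diag[i][j]
            top += [(s + cd + rd + dg) / 2, (s - cd + rd - dg) / 2]
            bottom += [(s + cd - rd - dg) / 2, (s - cd - rd + dg) / 2]
        out += [top, bottom]
    return out


if __name__ == "__main__":
    fmap = [[1, 2, 3, 4],
            [5, 6, 7, 8],
            [9, 10, 11, 12],
            [13, 14, 15, 16]]
    bands = haar_downsample(fmap)
    print(len(bands[0]), len(bands[0][0]))   # spatial size of each sub-band
    print(haar_reconstruct(*bands) == fmap)  # exact round trip
```

Unlike strided convolution or pooling, which drop three of every four samples or average them away, this transform keeps fine detail in the difference sub-bands, which is the property that makes it attractive for small objects.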