Abstract: Vehicle target detection is crucial for intelligent driving, intelligent transportation, and public safety. However, background interference, small targets, and vehicle occlusion in dense traffic all reduce detection accuracy. To address these problems, we propose EM-YOLO, which improves YOLOv8 by fusing boundary features with multi-scale features. First, we design a boundary-guided multi-scale feature block that combines boundary and multi-scale features to improve the backbone network and strengthen its ability to suppress background interference. Second, features lose detail information as they flow through the network, and this loss is more severe for small vehicles, from which fewer effective features can be extracted. We therefore propose a feature enhancement block that combines features from different layers to reduce detail loss and improve small target detection. Third, we analyze the performance drop caused by occlusion among densely packed vehicles and propose a detection head to address it. Finally, we construct WPFIoU by combining PIoU, Focaler-IoU, and WIoU to optimize the training process and improve detection performance. Experimental results show that, compared with the original model, the improved model increases precision by 1.9% and recall by 4.1%, and improves mAP50 and mAP50:95 by 4.4% and 3.3%, respectively. The proposed method also outperforms other advanced methods on all reported performance metrics and has significant practical application value.
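To make one ingredient of the loss concrete, the sketch below illustrates the Focaler-IoU idea as it is commonly described: the plain IoU is linearly remapped onto [0, 1] over an interval [d, u], so that training can focus on a chosen range of sample difficulty. This is not the WPFIoU loss itself, since the abstract does not state how PIoU, Focaler-IoU, and WIoU are combined. The function names and the default values of `d` and `u` are illustrative assumptions.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def focaler_iou(iou_value, d=0.0, u=0.95):
    """Linearly remap IoU from the interval [d, u] onto [0, 1], clipping outside it.

    d and u are hyperparameters; the defaults here are assumptions for illustration.
    """
    if u <= d:
        raise ValueError("u must be greater than d")
    return min(1.0, max(0.0, (iou_value - d) / (u - d)))


def focaler_iou_loss(box_pred, box_gt, d=0.0, u=0.95):
    """Loss term 1 - Focaler-IoU for a single predicted / ground-truth box pair."""
    return 1.0 - focaler_iou(iou(box_pred, box_gt), d, u)
```

With `d = 0.25` and `u = 0.75`, for example, any pair with IoU below 0.25 receives the maximum loss of 1, any pair above 0.75 receives a loss of 0, and the values in between are scaled linearly.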