Abstract: To address the difficulty and low accuracy of small target detection in aerial images, an improved small target detection method based on YOLOv8s, named YOLOv8-ERD, is proposed. First, the Neck network of YOLOv8s is optimized using the Efficient Neck feature fusion method so that high-level semantic information is merged efficiently with low-level spatial information. Second, the receptive-field attention convolution RFAConv is introduced to weight the importance of the different features within each receptive-field sliding window and to strengthen feature extraction. Third, the C2f module is replaced with an improved DyC2f module built on the dynamic convolution DynamicConv, which reduces computational overhead while improving model performance. Finally, a small target detection layer is added to improve recognition accuracy for small targets. Experimental results on the public VisDrone2019 dataset show that, compared with the baseline model YOLOv8s, YOLOv8-ERD increases mAP@0.5 by 5.0% and precision (P) by 4.0%, and that it compares favorably with other mainstream target detection methods.