Abstract: A lightweight detection model, YOLO-DAS, is proposed to address small target size, complex background interference, and inefficient multi-scale feature fusion in UAV aerial images. First, a dynamic multi-scale sensing convolution module, DMSConv, is constructed to enhance feature capture. Second, a context-aware feature-recombination upsampling operator, ADEPT, is designed to optimize feature map reconstruction and integrate context information more accurately. Third, the neck network is rebuilt with the bidirectional global-local spatial attention module SCOPE, whose bidirectional feature interaction overcomes the limitation of single-path fusion. Finally, a shallow small-target detection layer is added to strengthen the extraction of localization information from low-level features. On the VisDrone2019 dataset, the model achieves 39.8% mAP0.5 and 23.7% mAP0.5:0.95, improvements of 8.4% and 5.1% over the YOLOv8n baseline. Precision and recall increase by 8.1% and 7%, respectively, while the number of parameters decreases by 0.49 M. YOLO-DAS thus provides an effective solution for small and medium-sized target detection in UAV aerial images.
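
The reported gains can be read either as absolute percentage points or as relative improvements, and the abstract does not say which. The short sketch below assumes they are absolute percentage-point differences, which is the common convention in detection papers, and derives the YOLOv8n baseline scores that this reading would imply, together with the corresponding relative gains. The helper names `implied_baseline` and `relative_gain` are hypothetical and introduced only for illustration. The derived baseline values are not stated in the abstract and hold only under this assumption.

```python
# Reported YOLO-DAS results on VisDrone2019 (from the abstract), in percent.
reported = {"mAP0.5": 39.8, "mAP0.5:0.95": 23.7}

# Reported gains over YOLOv8n.
# ASSUMPTION: these are absolute percentage-point differences, not relative gains.
gains = {"mAP0.5": 8.4, "mAP0.5:0.95": 5.1}


def implied_baseline(score: float, gain: float) -> float:
    """Baseline score implied by an absolute (percentage-point) gain."""
    return round(score - gain, 1)


def relative_gain(score: float, gain: float) -> float:
    """Relative improvement, in percent, over the implied baseline."""
    return round(100.0 * gain / (score - gain), 1)


baseline = {k: implied_baseline(reported[k], gains[k]) for k in reported}
relative = {k: relative_gain(reported[k], gains[k]) for k in reported}

for metric in reported:
    print(f"{metric}: implied baseline {baseline[metric]}%, "
          f"relative gain {relative[metric]}%")
```

If the gains were instead relative, the implied baselines would be considerably higher, so stating the convention explicitly avoids misreading the comparison.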