Abstract: To address the missed detections and low accuracy caused by heavy occlusion and large scale variation in dense pedestrian detection, this paper proposes RSH-RTDETR, an efficient improved RT-DETR algorithm for dense pedestrian detection in complex scenes. First, the Regocn module is proposed to improve the backbone: it extracts features with a limited number of ordinary convolutions followed by a linear transformation, and applies RepConv on the gradient-flow branch to compensate for the performance loss caused by discarding residual blocks. This strengthens feature extraction and gradient flow, improving the detection of targets at different scales while reducing the computational load and parameter count. Second, a 160×160 S2 detection layer is introduced in the neck to strengthen the detection of small-scale pedestrians during feature fusion. Finally, the Haar wavelet downsampling (HWD) module is adopted to expand the receptive field, reduce model complexity, and improve detection accuracy for occluded pedestrians. Ablation and comparison experiments on the CrowdHuman dataset yield an mAP50 of 86.6% and an mAP50:95 of 57.8%. Compared with the original RTDETR-R18 model, mAP50 improves by 1.2 percentage points and mAP50:95 by 1.9 percentage points, with 40% fewer parameters. RSH-RTDETR also outperforms RT-DETR on the WiderPerson dataset and outperforms the other compared algorithms. The improved algorithm is therefore lightweight while maintaining high accuracy, and it performs well on dense pedestrian detection in complex scenes.
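To make the Haar wavelet downsampling step concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single-level orthonormal Haar transform applied to a (C, H, W) feature map with even H and W: each 2×2 block is mapped to four sub-band coefficients, so spatial resolution halves while the channel count quadruples and no information is discarded. The function names `haar_downsample` and `pointwise_mix` are hypothetical, and the random weight matrix only stands in for the learned 1×1 convolution that would normally compress the channels afterwards.

```python
import numpy as np


def haar_downsample(x: np.ndarray) -> np.ndarray:
    """Single-level 2D Haar transform used as a downsampling step.

    x: feature map of shape (C, H, W) with even H and W.
    Returns an array of shape (4C, H/2, W/2) holding the
    LL, LH, HL, HH sub-bands stacked along the channel axis.
    """
    _, h, w = x.shape
    if h % 2 or w % 2:
        raise ValueError("H and W must be even")

    # The four pixels of every non-overlapping 2x2 block.
    a = x[:, 0::2, 0::2]  # top-left
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right

    # Orthonormal Haar basis (scaling by 1/2 preserves energy).
    ll = (a + b + c + d) / 2.0  # low-pass average
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference

    return np.concatenate([ll, lh, hl, hh], axis=0)


def pointwise_mix(y: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Stand-in for a 1x1 convolution: weight has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 160, 160))   # assumed example sizes
    sub = haar_downsample(feat)                 # shape (32, 80, 80)
    w = rng.standard_normal((16, sub.shape[0])) # illustrative weights only
    out = pointwise_mix(sub, w)                 # shape (16, 80, 80)
    print(sub.shape, out.shape)
```

Because the transform is orthonormal, the sum of squared values is the same before and after downsampling, which is the sense in which this step loses less information than strided convolution or pooling.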