Abstract: To address the low accuracy of anomalous behavior detection caused by blurry surveillance footage and complex road conditions, this paper proposes an optimized YOLOv11 model built from three cooperating modules. First, the Dynamic Sample module replaces conventional upsampling in the neck network to improve target localization and recognition precision. Second, a redesigned Multi-Window Attention module is integrated into the final layer of the backbone network, improving the capture of anomaly features in blurry videos while suppressing noise interference. Finally, the lightweight ShuffleNetV2 is adopted as the backbone, significantly reducing model parameters while preserving feature representation capability. Experimental results on the UCF101 and UCF Crime datasets show that, with the Dynamic Sample and Multi-Window Attention modules, the model improves mAP50 and mAP50-95 by 8.5% and 13.1%, respectively, over the original YOLOv11, effectively reducing false negatives and false positives. With ShuffleNetV2 as the backbone, the model's parameter count drops from 2.58 M to 0.82 M, a reduction of roughly 68%. Overall, the optimized YOLOv11 model better meets the demands of real-time scenarios such as traffic surveillance, balancing detection efficiency and accuracy, and has broad application potential.
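To illustrate the kind of operation that keeps a ShuffleNetV2-style backbone lightweight, the channel shuffle step used by ShuffleNet-family blocks can be sketched in plain Python. This is a minimal sketch and not the implementation used in this paper: the function name `channel_shuffle` and the list-based representation of channels are assumptions made for illustration, and a real model would apply the same reshape, transpose, and flatten pattern to a tensor's channel axis.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (reshape -> transpose -> flatten).

    `channels` is a flat list standing in for the channel axis of a
    feature map (hypothetical representation for illustration only).
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # View the flat channel list as a (groups, per_group) grid.
    grid = [channels[g * per_group:(g + 1) * per_group] for g in range(groups)]
    # Read the grid column by column, i.e. flatten its transpose.
    return [grid[g][i] for i in range(per_group) for g in range(groups)]


if __name__ == "__main__":
    # Six channels split into two groups: [0, 1, 2] and [3, 4, 5].
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))
```

After the shuffle, each group of neighboring channels contains channels from every original group, so the following grouped or split convolutions can exchange information without a more expensive dense convolution.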