Roadside object detection algorithm with multi-scale feature fusion and interaction
Affiliation:

1. School of Computer Science and Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China; 2. Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804, China; 3. School of Internet of Things Engineering, Wuxi University, Wuxi 214105, China

CLC Number: TP391; TN791

Abstract:

To address the challenges of dense small objects, multi-scale variation, and interference from complex weather and backgrounds in roadside-view object detection, a multi-scale feature fusion and interaction-based detection algorithm, MF-YOLO, is proposed. A C2f-CAST module is designed to interact with and transform features from different subspaces through star operations, and MLCA is introduced to capture local, global, channel, and spatial features between distant pixels; this multi-scale information aggregation strengthens attention to the salient semantic information of occluded objects and suppresses background interference. To address the inefficient fusion of contextual information in the neck, the lightweight convolution GSConv is introduced in place of standard convolution, and a cross-stage partial network module, VoV-GSCSP, is designed to reduce model complexity and parameter count. A cross-level fusion module, SDFM, is constructed to self-calibrate shallow feature maps and fuse semantic information from deep feature maps, alleviating missed detections of small objects. Finally, the WPIoU loss function is improved with an adaptive penalty factor, a gradient-adjustment function for anchor-box quality, and a dynamic clustering mechanism, enhancing bounding-box regression performance and detection robustness. Experimental results show that MF-YOLO achieves mAP@0.5 of 85.1% and 92.3% on the DAIR-V2X-I and UA-DETRAC datasets, respectively, 4.4% and 1.8% higher than the original YOLOv8s, while reducing computational cost by 19.8% and parameter count by 8.18%. The detection speed reaches 152 fps, meeting real-time requirements.
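The neck modification described above builds on the published GSConv idea: half of the output channels come from a standard convolution, half from a cheap depthwise convolution, and the two halves are concatenated and channel-shuffled. As a rough illustration of that idea only, the following is a minimal PyTorch sketch; the class names, kernel sizes, and test shapes are assumptions for illustration and are not taken from the paper's implementation.

import torch
import torch.nn as nn

class ConvBNAct(nn.Module):
    # Standard convolution + BatchNorm + SiLU, as commonly used in YOLO-style necks.
    def __init__(self, c_in, c_out, k=1, s=1, groups=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, groups=groups, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class GSConv(nn.Module):
    # GSConv-style lightweight convolution (sketch): a dense branch produces half the
    # output channels, a depthwise branch produces the other half, and the result is
    # channel-shuffled so the two halves exchange information.
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        c_half = c_out // 2
        self.dense = ConvBNAct(c_in, c_half, k, s)                    # standard convolution branch
        self.cheap = ConvBNAct(c_half, c_half, 5, 1, groups=c_half)   # depthwise convolution branch

    def forward(self, x):
        x1 = self.dense(x)
        x2 = self.cheap(x1)
        y = torch.cat((x1, x2), dim=1)
        # channel shuffle: interleave channels from the dense and depthwise halves
        b, c, h, w = y.shape
        return y.view(b, 2, c // 2, h, w).transpose(1, 2).reshape(b, c, h, w)

if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)               # hypothetical neck feature map
    print(GSConv(64, 128)(x).shape)              # torch.Size([1, 128, 80, 80])

Compared with a plain 3x3 convolution producing the same number of channels, this construction spends full (dense) computation on only half of the output, which is the source of the complexity and parameter savings the abstract reports for the neck.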

History
  • Online: January 22, 2025