Target Detection Network Based on Multi-Level Information Fusion of Radar and Vision
Affiliation:

Key Laboratory of Radar Imaging and Microwave Photonic Technology, Ministry of Education, Nanjing University of Aeronautics and Astronautics, Nanjing 211100, China


CLC Number:

TP391; TN919.8

Fund Project:

Supported by the National Natural Science Foundation of China (61502228)




    Abstract:

In autonomous driving perception, complex road environments and insufficient fusion of onboard radar and camera data lead to poor detection of some high-risk moving targets. To address this, this paper builds on Centerfusion to design MLFusionNet, an object detection network that fuses radar and visual information at multiple levels. First, data-level fusion is added at the input layer: the radar echo features are concatenated with the image in the form of pixel values and then fed into the encoder-decoder network through a two-stage residual fusion module, enriching the network's input information. Second, a bottleneck-structured context module is designed between the encoder and decoder of the backbone network; its multi-branch convolutional structure captures broader contextual information from the feature map, while channel compression keeps the parameter count low. Finally, a parallel attention fusion module is designed to resolve insufficient feature-level modal fusion. Experimental results on the nuScenes dataset show that MLFusionNet reaches an NDS of 46.6% and improves the mAP for cars, trucks, and pedestrians by 1.4, 3.0, and 1.5 percentage points respectively compared with the multimodal Centerfusion network, indicating that the network pays more attention to high-risk dynamic targets in the driving environment.
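The abstract names two fusion ideas without giving their internals: data-level fusion that appends radar echo features to the image as extra pixel values, and a parallel attention fusion module at the feature level. The minimal NumPy sketch below illustrates both under stated assumptions; the function names, the single-channel radar map, and the particular channel/spatial attention layout are illustrative guesses, not the paper's actual design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def data_level_fusion(image, radar_map):
    """Data-level fusion (sketch): append radar echo features as an
    extra pixel-value channel to the image. Layout assumed (C, H, W),
    radar projected to a single (H, W) map."""
    return np.concatenate([image, radar_map[None, ...]], axis=0)

def parallel_attention_fusion(img_feat, rad_feat):
    """Hypothetical parallel attention fusion: a channel-attention gate
    on the image branch and a spatial-attention gate on the radar
    branch, computed in parallel and summed. Inputs are (C, H, W)."""
    # Channel attention: global average pool -> per-channel sigmoid gate.
    ch_gate = sigmoid(img_feat.mean(axis=(1, 2)))      # shape (C,)
    img_att = img_feat * ch_gate[:, None, None]
    # Spatial attention: channel-wise mean map -> per-pixel sigmoid gate.
    sp_gate = sigmoid(rad_feat.mean(axis=0))           # shape (H, W)
    rad_att = rad_feat * sp_gate[None, :, :]
    return img_att + rad_att

rgb = np.random.rand(3, 8, 8)        # toy RGB image
radar = np.random.rand(8, 8)         # toy radar echo map
fused_input = data_level_fusion(rgb, radar)
print(fused_input.shape)             # (4, 8, 8)

fused_feat = parallel_attention_fusion(np.random.rand(16, 8, 8),
                                       np.random.rand(16, 8, 8))
print(fused_feat.shape)              # (16, 8, 8)
```

In a real network the concatenation would feed a learned residual fusion block and the attention gates would be learned layers; here the gates are parameter-free reductions purely to show the data flow.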

Cite this article:

Zhou Zhiwei, Zhou Jianjiang, Wang Jiabin, Deng Kai. Target detection network based on multi-level information fusion of radar and vision[J]. Electronic Measurement Technology, 2024, 47(24): 110-117.

History
  • Online publication date: 2025-01-24