3D object detection based on attention residual network and mixed pooling

Authors: Wang Tao, Xue Qingshui, Wang Dong, Zhang Xu

Affiliation: School of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai 201418, China

CLC number: TN958.98; TP391.4

Fund project: Supported by the National Natural Science Foundation of China (61672350, 61170227), the Ministry of Education fund (39120K178038, 14YJA880033), and the National Social Science Fund of China (16BGL003)

    Abstract:

    To address the low detection accuracy for pedestrians and cyclists in 3D object detection, this paper improves on the Voxel-RCNN baseline and proposes a 3D object detection algorithm based on an attention residual network and mixed pooling. First, a new 2D backbone network that combines a residual network with an attention mechanism is designed: the residual structure improves the model's adaptability to objects of different sizes, while the attention mechanism focuses on key regions and strengthens the feature representation. Second, a new MLP pooling method is proposed, together with an attention pooling method that incorporates convolution. The two pooling methods retain the local geometric details of small objects and strengthen global semantic features, which further improves the model's ability to capture diverse objects in complex scenes. Experiments on the public KITTI dataset show that the mean average precision (mAP3D) for the Pedestrian and Cyclist categories reaches 54.06% and 76.85%, respectively, which is 3.43% and 3.03% higher than the baseline. These results demonstrate the effectiveness of the proposed method.
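    The abstract names two ingredients, a residual connection combined with attention and an attention-based pooling step, without giving their layer definitions. The following is a minimal, generic sketch of those two ideas on toy data, not the authors' architecture. The function names, the `query` vector, and the `gate_logits` values are hypothetical stand-ins for learned parameters, and plain max pooling is included only as a familiar point of comparison.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def max_pool(features):
    """Channel-wise max over a set of feature vectors."""
    return [max(col) for col in zip(*features)]

def attention_pool(features, query):
    """Weight each feature vector by softmax(query . feature), then sum.

    `query` is a hypothetical stand-in for learned attention parameters.
    """
    scores = [sum(q * x for q, x in zip(query, f)) for f in features]
    weights = softmax(scores)
    channels = len(features[0])
    return [sum(w * f[c] for w, f in zip(weights, features))
            for c in range(channels)]

def residual_attention(x, gate_logits):
    """Residual connection around a sigmoid-gated branch: y = x + sigmoid(g) * x.

    `gate_logits` is a hypothetical stand-in for the output of an attention module.
    """
    gates = [1.0 / (1.0 + math.exp(-g)) for g in gate_logits]
    return [xi + gi * xi for xi, gi in zip(x, gates)]

# Three toy point features (2 channels each) gathered around one grid point.
points = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]

print(max_pool(points))                            # keeps only the per-channel maximum
print(attention_pool(points, [0.0, 0.0]))          # zero query: uniform weights, i.e. the mean
print(attention_pool(points, [2.0, 0.0]))          # query favouring channel 0 shifts weight to the third point
print(residual_attention([2.0, 4.0], [0.0, 0.0]))  # gate of 0.5 on each channel
```

    Unlike max pooling, the attention-weighted sum lets every point contribute in proportion to its score, which is one plausible reading of how such pooling could retain local detail for small objects; the paper's actual MLP and convolutional attention pooling designs may differ.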

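Cite this article

Wang Tao, Xue Qingshui, Wang Dong, Zhang Xu. 3D object detection based on attention residual network and mixed pooling[J]. Electronic Measurement Technology (电子测量技术), 2025, 48(17): 44-53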

History
  • Published online: 2025-11-04