Real-time semantic segmentation algorithm for complex urban street scenes

Affiliation:

College of Computer Science and Control Engineering, Northeast Forestry University, Harbin 150040, China

CLC number:

TP391.41;TN0

    Abstract:

    Real-time, high-precision segmentation of complex urban street scenes is crucial for autonomous driving. Existing real-time semantic segmentation networks capture too little spatial information and detail in the high-resolution branch, and they fuse high- and low-resolution features inefficiently, which loses information and limits segmentation accuracy. To address these problems, this paper proposes MPDANet, a real-time semantic segmentation network based on multi-scale partial dilated convolution and boundary collaborative dual attention guided fusion. First, a Multi-Scale Partial Dilated Convolution Module (MSPDC) is designed: partial dilated convolutions with parallel, ladder-type (stepped) dilation rates efficiently capture the detail features and spatial information of the high-resolution branch at different scales, addressing its insufficient information capture. Second, an Attention-Guided Feature Pyramid Module (AFPM) is constructed: asymmetric pooling layers extract multi-scale semantic information from the low-resolution branch, and a pixel attention mechanism further enhances that information. Finally, a Boundary Collaborative Dual Attention Fusion Module (BCDAF) is proposed: parallel channel and spatial attention select the key semantic and spatial information and suppress the information loss caused by cross-resolution feature fusion, while boundary attention improves segmentation at object boundaries. On the Cityscapes validation set, the proposed network achieves 78.6% mIoU at 295 fps; on the CamVid test set, it achieves 77.4% mIoU at 454 fps. The experimental results show that the proposed network achieves high-precision segmentation of complex urban street scenes while maintaining real-time speed.
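
The accuracy figures in the abstract are reported as mIoU (mean intersection over union). As a reading aid, the following is a minimal sketch of how that metric is commonly computed from a confusion matrix. It is not the authors' evaluation code: the function names, the toy labels, and the use of 255 as the "ignore" label are assumptions made for illustration.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Accumulate a num_classes x num_classes confusion matrix.

    pred and target are flat sequences of integer class ids. Rows index the
    ground-truth class, columns the predicted class. ignore_index=255 is an
    assumed convention for unlabeled pixels, not taken from the paper.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:  # unlabeled pixels do not count
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average the per-class IoU = TP / (TP + FP + FN) over occurring classes."""
    num_classes = len(matrix)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp  # ground truth c, predicted as another class
        fp = sum(matrix[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (hypothetical data): 3 classes, 8 pixels, one unlabeled pixel.
target = [0, 0, 1, 1, 2, 2, 255, 0]
pred = [0, 1, 1, 1, 2, 0, 1, 0]
cm = confusion_matrix(pred, target, num_classes=3)
miou = mean_iou(cm)
print(f"mIoU = {miou:.4f}")
```

In practice the confusion matrix is accumulated over every image in the evaluation split before the per-class IoU values are averaged, so that classes with few pixels in a single image do not skew the result.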

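Cite this article

Zhao Zhixing (赵志兴), Hu Junfeng (胡峻峰). Real-time semantic segmentation algorithm for complex urban street scenes[J]. Electronic Measurement Technology (电子测量技术), 2025, 48(20): 144-153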

History
  • Published online: 2025-12-19