Semantic segmentation method for urban street scenes based on pixel attention feature fusion
Author:

Li Lirong, Ding Jiang, Mei Bing, Dai Junwei, Gong Pengcheng

Affiliation:

1.School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan 430068, China; 2.Hubei Power Grid Intelligent Control and Equipment Engineering Technology Research Center, Hubei University of Technology, Wuhan 430068, China; 3.School of Computer Science and Engineering, Wuhan Engineering University, Wuhan 430205, China

CLC number:

TP391.4

Fund Project:

Supported by the National Natural Science Foundation of China (62071172, 62202148)

    Abstract:

    Urban street scene datasets contain small targets and many long, strip-shaped objects, which makes them difficult to segment. Although current encoder-decoder networks can refine segmentation results, most of them do not make full use of spatial and contextual information. This paper therefore proposes a semantic segmentation algorithm based on pixel attention feature fusion. First, with ResNet50 as the backbone network, atrous spatial pyramid pooling (ASPP) and strip pooling are used for initial feature fusion, obtaining multi-scale features while avoiding useless information. Next, a pixel fusion attention module aggregates contextual information and recovers spatial information. Finally, an attention feature refinement module eliminates redundant information. Experiments on the CamVid dataset show that the algorithm achieves 75.22% mIoU on the validation set and 67.21% on the test set, improvements of 2.51% and 2.86%, respectively, over the DeepLabv3+ network.
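    The mIoU figures quoted in the abstract are conventionally obtained by computing, for each class, the ratio TP / (TP + FP + FN) and averaging over classes. The sketch below illustrates that standard definition in plain Python. It is not the authors' evaluation code: the function name `mean_iou`, the `ignore_index` parameter, and the choice to skip classes that appear in neither the prediction nor the ground truth are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean intersection-over-union over flat sequences of class labels.

    Hypothetical helper: the paper does not specify its evaluation code.
    """
    # confusion[t][p] counts pixels with true class t predicted as class p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # assumption: unlabeled (void) pixels are excluded
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        union = tp + fp + fn
        if union > 0:  # assumption: classes absent everywhere are skipped
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 classes and 6 pixels
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

    In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5. On a real dataset the confusion matrix would normally be accumulated over all images before the per-class ratios are taken.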

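Cite this article

Li Lirong, Ding Jiang, Mei Bing, Dai Junwei, Gong Pengcheng. Semantic segmentation method for urban street scenes based on pixel attention feature fusion[J]. Electronic Measurement Technology, 2023, 46(20): 184-190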

History
  • Published online: 2024-01-23