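Multi-view stereo reconstruction algorithm based on attention mechanism
Affiliation:

1. College of Communication and Information Engineering, Xi'an University of Science and Technology; 2. College of Electrical and Control Engineering, Xi'an University of Science and Technology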

CLC number:

TP391.4; TN911.73

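Fund Project:

Supported by the Key Research and Development Program of Shaanxi Province (2021GY-338) and the 2023 Applied Technology R&D Reserve Project of Beilin District, Xi'an (GX2333)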


    Abstract:

    Multi-view stereo reconstruction suffers from poor completeness and limited generalization in complex scenes with uneven illumination, weak texture, or non-Lambertian surfaces. To address this, this paper proposes a multi-view stereo reconstruction algorithm based on the attention mechanism. In the feature extraction stage, the algorithm uses a multi-scale feature extraction module built on depthwise separable convolution and self-attention, which enlarges the receptive field while strengthening the spatial feature relationships among views, improving the network's feature representation in complex scenes and enabling more accurate feature matching. In the cost volume regularization stage, a channel attention mechanism adaptively adjusts the weights of different channels, reducing the interference of irrelevant information and filtering background noise to improve generalization. On the DTU dataset, the algorithm achieves completeness and overall scores of 0.286 and 0.334, improvements of 25.71% and 5.92% over the baseline CasMVSNet, and its reconstructed point clouds are structurally more complete in complex scenes than those of other state-of-the-art (SOTA) algorithms. On the Tanks and Temples intermediate dataset, the reconstructed point clouds reach an overall F-score of 61.49, indicating better robustness and generalization.
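
The channel attention step in the cost volume regularization stage can be illustrated with a small sketch. The abstract does not say which channel attention variant the paper uses, so this assumes a squeeze-and-excitation style gate; the function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy cost volume are all hypothetical and chosen only for illustration (in a real network the weights would be learned).

```python
import math

def channel_attention(volume, w1, w2):
    """Reweight the channels of a cost volume (assumed SE-style gate).

    volume: list of C channels, each a flat list of cost values
            (depth x height x width flattened into one list).
    w1:     hidden x C weight matrix of the bottleneck layer (hypothetical).
    w2:     C x hidden weight matrix of the expansion layer (hypothetical).
    """
    # Squeeze: one global average per channel.
    squeezed = [sum(ch) / len(ch) for ch in volume]
    # Excite: bottleneck layer with ReLU, then expansion layer with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, volume)]
    return gates, scaled

# Toy cost volume with two channels; the second carries no signal.
toy_volume = [[1.0, 3.0], [0.0, 0.0]]
toy_w1 = [[1.0, 0.0]]        # hidden size 1, reads channel 0 only
toy_w2 = [[1.0], [-1.0]]     # pushes channel 0 up and channel 1 down
gates, scaled = channel_attention(toy_volume, toy_w1, toy_w2)
```

Each gate lies strictly between 0 and 1, so channels judged less informative are attenuated rather than removed. This is one plausible reading of "adaptively adjusting the weights of different channels", not a reproduction of the paper's module.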

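The reported DTU improvements can be sanity-checked with a little arithmetic. The sketch below assumes that lower completeness and overall values are better and that the improvement is the relative reduction with respect to the baseline, that is, (baseline - value) / baseline. Neither assumption is stated in the abstract, and the helper names are hypothetical. Under these assumptions, the baseline values implied by the abstract can be back-computed from the reported scores and percentages.

```python
def relative_improvement(baseline, value):
    """Percentage reduction of an error-style metric relative to a baseline."""
    return (baseline - value) / baseline * 100.0

def implied_baseline(value, improvement_pct):
    """Invert relative_improvement: the baseline implied by a reported gain."""
    return value / (1.0 - improvement_pct / 100.0)

# Values taken from the abstract.
completeness, overall = 0.286, 0.334
completeness_gain, overall_gain = 25.71, 5.92

# Baseline values implied by the assumptions stated above.
baseline_completeness = implied_baseline(completeness, completeness_gain)
baseline_overall = implied_baseline(overall, overall_gain)

print(f"implied baseline completeness: {baseline_completeness:.3f}")
print(f"implied baseline overall:      {baseline_overall:.3f}")
```

If the implied baselines match the CasMVSNet figures reported in the full paper, the percentages in the abstract are relative reductions as assumed here.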
History
  • Received: 2024-07-04
  • Last revised: 2024-08-27
  • Accepted: 2024-08-28