Investigation into an enhanced CVT-based algorithm for fine-grained image recognition

Affiliations:

1. School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo 454000, China; 2. Henan Key Laboratory of Intelligent Detection and Control of Coal Mine Equipment, Jiaozuo 454003, China

CLC classification:

TP391.4; TN791

Funding:

Supported by the Henan Province Science and Technology Research Project (222102210230) and the Henan Polytechnic University Doctoral Fund (B2018-33)

    Abstract:

    To address two problems in fine-grained image recognition, interference from background information and the difficulty of identifying the most discriminative regions of the target, this paper proposes a fine-grained image recognition algorithm based on an improved CVT. First, a target region localization module is introduced into the CVT model. The module extracts target-region features through multi-level feature aggregation, determines the target region by thresholding, and then crops the original image proportionally to reduce background interference. Second, the MDCSAIA (Multi-Dimensional Channel Spatial-Aware Interaction) mechanism is proposed. It uses dimension transformation to promote effective interaction between spatial information at adjacent channel positions and channel information at adjacent spatial positions, which strengthens the network's perception of local detail regions of the target. Experiments show that, compared with the baseline algorithm, the method improves recognition accuracy by 2.1%, 1.7%, and 1.5% on the CUB-200-2011, Stanford Cars, and Stanford Dogs datasets, respectively, which supports the effectiveness of the proposed method.
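The localization step described in the abstract (aggregate multi-level features, threshold, crop) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the aggregation rule, the threshold, or the cropping details, so the channel-sum activation maps, the mean-based threshold, the `scale` parameter, and the function names `activation_map` and `locate_and_crop` are all assumptions made for illustration.

```python
import numpy as np


def activation_map(feature: np.ndarray, out_size: int) -> np.ndarray:
    """Collapse a (C, H, H) feature map into an (out_size, out_size) map in [0, 1].

    Assumption: square feature maps whose side divides out_size.
    """
    act = feature.sum(axis=0)                      # sum over channels -> (H, H)
    factor = out_size // act.shape[0]
    act = np.kron(act, np.ones((factor, factor)))  # nearest-neighbour upsampling
    return (act - act.min()) / (act.max() - act.min() + 1e-8)


def locate_and_crop(image: np.ndarray, features: list, scale: float = 1.0):
    """Return the cropped image and its (y0, x0, y1, x1) box in pixel coordinates.

    `features` holds (C, H, H) maps from several network stages (hypothetical
    inputs); `scale` multiplies the mean activation to form the threshold.
    """
    out_size = max(f.shape[1] for f in features)
    # Multi-level aggregation: average the normalised maps of all stages.
    agg = sum(activation_map(f, out_size) for f in features) / len(features)
    # Threshold decision: keep grid cells above a multiple of the mean response.
    mask = agg > scale * agg.mean()
    ys, xs = np.nonzero(mask)
    h, w = image.shape[:2]
    if ys.size == 0:                               # nothing passed the threshold
        return image, (0, 0, h, w)
    # Map the bounding box from the feature grid back to image pixels.
    sy, sx = h / out_size, w / out_size
    y0, y1 = int(ys.min() * sy), int((ys.max() + 1) * sy)
    x0, x1 = int(xs.min() * sx), int((xs.max() + 1) * sx)
    return image[y0:y1, x0:x1], (y0, x0, y1, x1)
```

In a pipeline of this kind, the cropped region would typically be resized to the network input size and passed through the model a second time, so that the second pass sees less background.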

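Cite this article

冀得魁, 李冰锋, 杨艺. Investigation into an enhanced CVT-based algorithm for fine-grained image recognition [J]. 电子测量技术 (Electronic Measurement Technology), 2025, 48(18): 122-129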

History
  • Published online: 2025-11-13