Multi-frame pupil detection algorithm based on deep learning

Affiliation:

Information Engineering College, Hangzhou Dianzi University, Hangzhou 310000, China

CLC number:

TP391;TN91

Fund Project:

Supported by the Scientific Research Cultivation Fund of the Information Engineering College, Hangzhou Dianzi University (KYP0323015) and the Scientific Research Project of the Zhejiang Provincial Department of Education (Y202147108)

    Abstract:

    Pupil localization plays a crucial role in human-computer interaction and biomedical computing applications. Many current pupil localization algorithms, however sophisticated, detect and locate the pupil from a single image. Pupil movement, however, is a continuous process: when the pupil cannot be accurately located in the current frame, its position can be inferred by combining information from the preceding frames. This makes it possible to handle challenging cases more effectively, such as reflections, occlusion of the pupil by eyelashes or blinks, off-center pupil positions, and motion blur, and can therefore significantly improve the accuracy and robustness of pupil detection and reduce localization error. On this basis, we propose a multi-frame pupil detection algorithm based on deep learning. The algorithm builds on the Unet encoder-decoder structure and incorporates multi-frame information from continuous eye-gaze scenes for pupil detection. By combining a convolutional neural network (CNN) with a convolutional long short-term memory network (ConvLSTM) and the convolutional block attention module (CBAM), we obtain a hybrid semantic segmentation network. Experiments on a large-scale pupil dataset show that the proposed algorithm outperforms other pupil detection algorithms, achieving a mean intersection over union (MIoU) of 96.78% and a root mean square error (RMSE) of 3.83, and that it performs particularly well in challenging situations.
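
The abstract reports two metrics, MIoU and RMSE, without stating how they are computed. The following is a minimal sketch of one common way to evaluate a pupil segmentation result, not the authors' evaluation code. It assumes that MIoU is averaged over the background and pupil classes, and that RMSE is taken over the pixel distance between predicted and ground-truth pupil centers. The function names `mean_iou`, `mask_center`, and `center_rmse` are hypothetical, and masks are plain nested lists of integer labels.

```python
import math


def mean_iou(pred, target, num_classes=2):
    """Mean IoU over the classes that occur in either mask.

    Assumption: classes are 0 (background) and 1 (pupil); a class absent
    from both masks is skipped rather than counted as IoU 0 or 1.
    """
    ious = []
    for c in range(num_classes):
        inter = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred, in_target = (p == c), (t == c)
                inter += int(in_pred and in_target)
                union += int(in_pred or in_target)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)


def mask_center(mask, label=1):
    """Centroid (x, y) of all pixels equal to `label`, or None if there are none."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value == label:
                sum_x += x
                sum_y += y
                count += 1
    return (sum_x / count, sum_y / count) if count else None


def center_rmse(pred_centers, true_centers):
    """Root mean square of the Euclidean distances between paired centers.

    Assumption: the error is measured in pixels on pupil-center coordinates.
    """
    squared = [
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty) in zip(pred_centers, true_centers)
    ]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    pred = [[0, 1, 1],
            [0, 1, 0]]
    target = [[0, 1, 0],
              [0, 1, 0]]
    print("MIoU:", mean_iou(pred, target))
    print("RMSE:", center_rmse([mask_center(pred)], [mask_center(target)]))
```

In a multi-frame setting these functions would be applied to the mask predicted for the last frame of each input sequence and the results averaged over the test set; whether the paper does exactly this is not stated in the abstract.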

Cite this article

张国静,李承家,韩敬伟,黄曼. Multi-frame pupil detection algorithm based on deep learning[J]. 电子测量技术 (Electronic Measurement Technology), 2025, 48(9): 168-176

复制
相关视频

分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
History
  • Published online: 2025-05-23