Positive and negative learning with prototype for distant supervision relation extraction
Affiliation:

1.School of Computer Science & School of Cyberspace Security, Nanjing University of Information Science & Technology, Nanjing 210044, China; 2.School of IoT Engineering, Wuxi University, Wuxi 214105, China; 3.School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China

CLC number:

TP391.1; TN911.4

    Abstract:

    Most distant supervision relation extraction methods built on the multi-instance learning framework rely on heuristically generated, contaminated labels and focus on predicting relations at the bag level. However, they perform poorly on sentence-level prediction, which is better suited to comprehension tasks such as question answering and knowledge graph completion. To address these problems, this paper proposes a novel distant supervision relation extraction method that trains the model at the sentence level via positive and negative learning to separate noisy data and speed up convergence. In addition, a constraint graph is constructed to encode the restrictions between relations and entity types, and it is optimized toward relation prototypes through an auxiliary loss; this allows information to propagate among different relations, so that the model learns more essential and interpretable sentence representations. The method not only identifies noisy data but also iteratively revises their labels, improving the quality of the distantly supervised data and further enhancing model performance. On the sentence-level relation extraction task of the NYT dataset, the method achieves an accuracy of 77.69%, 6.47% higher than the current best baseline model, and an F1 score of 85.88% on the noisily annotated test set, which verifies its denoising ability. Ablation experiments show that the constraint graph contributes 11.02% to the optimization of the relation prototypes. The experimental results show that the method significantly outperforms existing methods on sentence-level relation extraction, reducing the impact of noise while improving model performance, and provides an efficient solution for distant supervision relation extraction.
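
    The abstract does not give the loss formulas, so the following is only a minimal sketch of how sentence-level positive and negative learning, noise detection, and label revision might fit together. It assumes the commonly used formulation in which positive learning is the cross-entropy on the given label and negative learning penalises probability mass on a complementary label (a relation the sentence is assumed not to express). All function names and thresholds (`noise_threshold`, `confident_threshold`) are hypothetical and are not taken from the paper; the constraint graph and relation prototypes are not modelled here.

```python
import math

EPS = 1e-12  # guards against log(0); an implementation detail of this sketch


def softmax(logits):
    """Turn raw relation scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def positive_loss(probs, label):
    """Positive learning: push the probability of the given label toward 1."""
    return -math.log(max(probs[label], EPS))


def negative_loss(probs, complementary_label):
    """Negative learning: push the probability of a label the sentence
    is assumed NOT to have toward 0."""
    return -math.log(max(1.0 - probs[complementary_label], EPS))


def revise_label(probs, distant_label,
                 noise_threshold=0.1, confident_threshold=0.7):
    """Hypothetical relabelling rule: if the model gives the distant label
    very low probability and is confident about another relation,
    replace the distant label with the model's prediction."""
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[distant_label] < noise_threshold and probs[best] >= confident_threshold:
        return best
    return distant_label


# Example: three candidate relations, the model strongly prefers relation 0.
probs = softmax([2.0, 0.5, -1.0])
loss_pl = positive_loss(probs, label=0)
loss_nl = negative_loss(probs, complementary_label=2)
new_label = revise_label(probs, distant_label=2)
```

    In this sketch a sentence whose distant label receives very low probability is treated as noisy and relabelled only when another relation is predicted with high confidence; otherwise the distant label is kept, so that repeated passes change labels conservatively.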

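Cite this article

Xu Guoliang (徐国梁), Chen Qidong (陈祺东), Xu Yuxuan (徐宇璇). Positive and negative learning with prototype for distant supervision relation extraction[J]. Electronic Measurement Technology (电子测量技术), 2025, 48(15): 91-100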

History
  • Published online: 2025-09-29