Research on malicious URL detection based on multi-scale attention feature fusion
Authors:

Ma Donglin, Chen Weijie, Zhao Hong, Song Jiajia

Affiliation:

School of Computer Science and Communication, Lanzhou University of Technology, Lanzhou 730050, China

CLC number:

TN 391

Funding:

Supported by the National Natural Science Foundation of China (62166025)

    Abstract:

    Current malicious URL detection models extract only a narrow range of features and lose accuracy on URLs with complex structures and diverse character combinations. To address this, this paper proposes a malicious URL detection model based on multi-scale attention feature fusion. First, Character Embeddings and DistilBERT encode characters and words respectively, capturing character-level and word-level feature representations of the URL string. Next, an improved convolutional neural network (CNN) extracts character structural features and word-level semantic features at different scales, and a bidirectional long short-term memory (BiLSTM) network further extracts deep sequence features. In addition, an attention feature fusion (AFF) module is introduced to dynamically fuse the character-level and word-level multi-scale features, reducing information redundancy and improving the extraction of long-range sequence features. Experimental results show that, compared with the baseline models, the proposed model improves accuracy by 0.32% to 4.7% and F1 score by 0.46% to 5.5%, and it also achieves good detection performance on datasets such as ISCX-URL2016.
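The two ideas the abstract combines, viewing a URL at both character and word granularity and then fusing the two feature streams with attention weights, can be illustrated with a minimal sketch. It is not the authors' implementation: the delimiter-based `word_tokens` stands in for DistilBERT tokenization, and `attention_fuse` uses one common gated form of attentional fusion (a per-dimension sigmoid gate over the summed features), which is an assumption because the abstract does not specify the AFF module's internals. All function and parameter names are hypothetical.

```python
import math
import re


def char_tokens(url):
    """Character-level view: every character of the URL is one token."""
    return list(url)


def word_tokens(url):
    """Word-level view: split on non-alphanumeric delimiters.

    Assumption: a simple stand-in for a subword tokenizer such as
    DistilBERT's, which is not reproduced here.
    """
    return [t for t in re.split(r"[^A-Za-z0-9]+", url) if t]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def attention_fuse(char_feat, word_feat, gate_weights, gate_bias):
    """Fuse two equal-length feature vectors with a learned gate.

    For each dimension i, a weight m_i in (0, 1) is computed from the
    summed features, and the output is m_i * char + (1 - m_i) * word.
    The gate parameters are hypothetical; in a trained model they
    would be learned.
    """
    lengths = {len(char_feat), len(word_feat), len(gate_weights), len(gate_bias)}
    if len(lengths) != 1:
        raise ValueError("all inputs must have the same length")
    fused = []
    for c, w, gw, gb in zip(char_feat, word_feat, gate_weights, gate_bias):
        m = sigmoid(gw * (c + w) + gb)
        fused.append(m * c + (1.0 - m) * w)
    return fused


if __name__ == "__main__":
    url = "http://example.com/login?id=1"
    print(len(char_tokens(url)), word_tokens(url))
    print(attention_fuse([1.0, 0.2], [3.0, 0.8], [0.0, 0.5], [0.0, -0.1]))
```

Because each output is a convex combination of the two inputs, every fused value lies between its character-level and word-level counterparts; the gate only decides which stream dominates in each dimension.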

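Cite this article

Ma Donglin, Chen Weijie, Zhao Hong, Song Jiajia. Research on malicious URL detection based on multi-scale attention feature fusion[J]. Electronic Measurement Technology (电子测量技术), 2024, 47(20): 15-23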

History
  • Published online: 2025-01-06