Research on malicious URL detection based on multi-scale attention feature fusion
Affiliation:

School of Computer Science and Communication, Lanzhou University of Technology, Lanzhou 730050, China

CLC Number:

TN 391

    Abstract:

    Current malicious URL detection models tend to extract only a single type of feature and lose accuracy on URLs with complex structures and diverse character combinations. To address this, this paper proposes a malicious URL detection model based on multi-scale attention feature fusion. First, character embeddings and DistilBERT encode characters and words separately, capturing both character-level and word-level representations of the URL string. Next, an improved convolutional neural network (CNN) extracts multi-scale character structural features and word-level semantic features, and a bidirectional long short-term memory (BiLSTM) network extracts deeper sequence features. Finally, an attention feature fusion (AFF) module dynamically fuses the multi-scale features at the character and word levels, reducing information redundancy and improving the extraction of long-range sequence features. Experimental results show that the proposed model outperforms the other baseline models, improving accuracy by 0.32% to 4.7% and F1 score by 0.46% to 5.5%, and that it achieves excellent detection performance on datasets such as ISCX-URL2016.
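
The fusion idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the paper's implementation: the abstract does not give the AFF equations, so the code assumes a common gated form in which an element-wise attention weight m decides how much of the character-level feature versus the word-level feature is kept. The names `char_ngrams`, `attention_fuse`, `gate_weights`, and `gate_bias` are hypothetical, and character n-grams of several widths stand in for the multi-scale convolution kernels.

```python
import math


def char_ngrams(url, sizes=(2, 3, 4)):
    """Return character n-grams of several widths.

    Illustrative stand-in for multi-scale convolution kernels:
    each width looks at the URL string at a different scale.
    """
    return {n: [url[i:i + n] for i in range(len(url) - n + 1)] for n in sizes}


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_fuse(char_feat, word_feat, gate_weights, gate_bias=0.0):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed form (not taken from the paper):
        m     = sigmoid(g * (c + w) + b)
        fused = m * c + (1 - m) * w
    In a trained model the gate parameters would be learned;
    here they are passed in by hand.
    """
    if not (len(char_feat) == len(word_feat) == len(gate_weights)):
        raise ValueError("all input vectors must have the same length")
    fused = []
    for c, w, g in zip(char_feat, word_feat, gate_weights):
        m = sigmoid(g * (c + w) + gate_bias)
        fused.append(m * c + (1.0 - m) * w)
    return fused


if __name__ == "__main__":
    # Multi-scale view of a toy URL string.
    print(char_ngrams("a.com/x", sizes=(2, 3)))
    # Toy character-level and word-level features for the same position.
    print(attention_fuse([0.9, 0.1], [0.2, 0.8], gate_weights=[1.0, -1.0]))
```

Because the gate lies strictly between 0 and 1, each fused value is a convex combination of the two inputs, so the module reweights the character-level and word-level features rather than simply concatenating them.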

History
  • Online: January 06, 2025