Speaker verification method based on dilated convolution and multi-scale attention mechanism
Affiliation:

1. School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China; 2. Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education, Guilin University of Electronic Technology, Guilin 541004, China

CLC Number:

TN912.34

    Abstract:

    To address the limitations of the CAM++ model in feature extraction and recognition performance under complex acoustic conditions, this paper proposes TF-DCAM, a speaker verification model that integrates dilated convolution with a time-frequency multi-scale attention mechanism. The model enhances feature representation through dilated residual convolution and uses a time-frequency adaptive refocusing unit to suppress redundant information. A time-frequency multi-scale attention module improves sensitivity to key information through channel attention and cross-dimensional interaction. An adaptive masking temporal convolution module is further incorporated to model long-term dependencies. Finally, a combination of contrastive loss functions jointly optimizes the speaker embedding space. Experiments on the CN-Celeb dataset show that TF-DCAM reduces the equal error rate (EER) and the minimum detection cost function (minDCF) by 14.98% and 10.98%, respectively, compared with the baseline. The model also demonstrates strong cross-lingual generalization on the VoxCeleb1 dataset. The results indicate that the proposed method significantly improves speaker verification performance and robustness while maintaining model efficiency.
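
For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how an EER can be approximated from trial scores and how a percentage reduction over a baseline can be computed. It is not the authors' evaluation code. The function names, the toy scores, and the reading of the reported reductions as relative (rather than absolute) are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by using every observed score as a threshold.

    A trial is accepted when its score is >= the threshold.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: non-target trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline, proposed):
    """Percentage reduction of a metric relative to its baseline value."""
    return 100.0 * (baseline - proposed) / baseline


# Hypothetical toy scores, not data from the paper.
toy_targets = [0.9, 0.8, 0.7, 0.4]
toy_nontargets = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(toy_targets, toy_nontargets)
```

Under this reading, a baseline metric of 10.0 that drops to 8.0 corresponds to a 20% relative reduction. Production toolkits usually interpolate between thresholds instead of picking the closest observed one, so their EER values can differ slightly from this sketch.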

History
  • Online: January 09, 2026