CONFERENCE PAPER

Local Information Modeling with Self-Attention for Speaker Verification

Bing Han, Zhengyang Chen, Yanmin Qian

Year: 2022
Venue: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Pages: 6727-6731

Abstract

The Transformer, built on the self-attention mechanism, has achieved state-of-the-art performance on most natural language processing (NLP) tasks, but in previous work it has not been very competitive when applied to speaker verification. Speaker identity is largely reflected in the relationships between adjacent tokens, so extracting it depends mainly on local modeling ability. However, the self-attention module, the key component of the Transformer, helps the model make full use of global information but is insufficient for capturing local information. To address this limitation, in this paper we strengthen local information modeling from two directions: restricting the attention context to a local window and introducing convolution operations into the Transformer. Experiments conducted on VoxCeleb show that the proposed methods notably improve system performance, confirming the importance of local information for the speaker verification task.
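The first of the two directions above, restricting the attention context to a local window, can be illustrated with a minimal sketch. This is a hypothetical NumPy implementation for illustration only, not the authors' code; the function name `local_attention` and the `window` parameter are assumptions:

```python
import numpy as np

def local_attention(q, k, v, window=2):
    """Self-attention where each query position t only attends to
    keys within [t - window, t + window] (a local context window)."""
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)          # (T, T) attention logits
    # Mask out key positions outside the local window of each query.
    idx = np.arange(T)
    mask = np.abs(idx[:, None] - idx[None, :]) > window
    scores[mask] = -np.inf
    # Row-wise softmax over the unmasked keys only.
    scores -= scores.max(axis=1, keepdims=True)
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)
    return w @ v                           # (T, d) attended output

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))            # 6 frames, 4-dim features
out = local_attention(x, x, x, window=1)
print(out.shape)                           # (6, 4)
```

With `window=1`, each frame aggregates information only from itself and its two neighbors, which is the kind of adjacent-token relationship the abstract argues carries speaker identity; global self-attention is recovered as `window >= T - 1`.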

Keywords:
Transformer, Self-attention, Speaker verification, Language model, Context model, Feature extraction, Natural language processing, Speech recognition

Metrics

Cited By: 20
FWCI (Field Weighted Citation Impact): 2.35
Refs: 34
Citation Normalized Percentile: 0.89

Topics

Speech Recognition and Synthesis (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
