JOURNAL ARTICLE

Global–Local Self-Attention Based Transformer for Speaker Verification

Fei Xie, Dalong Zhang, Chengming Liu

Year: 2022   Journal: Applied Sciences   Vol: 12 (19)   Pages: 10154-10154   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Transformer models are now widely used for speech processing tasks because of their powerful sequence modeling capabilities. Previous work showed that speaker embeddings can be modeled efficiently by combining Transformers with convolutional networks. However, traditional global self-attention mechanisms lack the ability to capture local information. To alleviate this problem, we proposed a novel global–local self-attention mechanism. Instead of using local or global multi-head attention alone, this method performs local and global attention in two parallel groups, which enhances local modeling and reduces computational cost. To better handle local location information, we introduced locally enhanced location encoding into the speaker verification task. Experimental results on the VoxCeleb1 test set and the VoxCeleb2 dev set demonstrated the improvement brought by the proposed global–local self-attention mechanism. Compared with the Transformer-based Robust Embedding Extractor Baseline System, the proposed speaker Transformer network performed better on the speaker verification task.
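
The parallel global and local attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `global_local_attention`, the even split of heads into two groups, the `window` parameter, and the band-shaped local mask are all assumptions made for illustration, since the abstract gives none of these details. The sketch also uses a dense mask for clarity, so it does not show the computational saving the abstract mentions; an efficient version would compute only the scores inside the local window.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; masked entries (-inf) become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def global_local_attention(q, k, v, window=2):
    """Hypothetical sketch of attention with a global and a local head group.

    q, k, v: arrays of shape (heads, frames, dim), with an even number of heads.
    The first half of the heads attends to every frame (global group); the
    second half attends only to frames within `window` steps (local group).
    Returns the concatenated head outputs and the attention weights.
    """
    heads, frames, dim = q.shape
    assert heads % 2 == 0, "assumed: heads split evenly into two groups"

    # Scaled dot-product scores, shape (heads, frames, frames).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dim)

    # Band mask: True where |i - j| <= window.
    idx = np.arange(frames)
    band = np.abs(idx[:, None] - idx[None, :]) <= window

    # Global heads see all frames; local heads see only the band.
    mask = np.ones((heads, frames, frames), dtype=bool)
    mask[heads // 2:] = band
    scores = np.where(mask, scores, -np.inf)

    weights = softmax(scores, axis=-1)
    out = weights @ v  # (heads, frames, dim)

    # Concatenate the two groups' heads along the feature axis.
    merged = out.transpose(1, 0, 2).reshape(frames, heads * dim)
    return merged, weights
```

With 4 heads, 6 frames, and `window=1`, heads 0 and 1 would attend to all 6 frames, while heads 2 and 3 would attend to at most 3 neighboring frames each.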

Keywords:
Computer science; Transformer; Embedding; Artificial intelligence; Speech recognition; Engineering; Voltage

Metrics

Cited By: 9
FWCI (Field Weighted Citation Impact): 1.76
Refs: 32
Citation Normalized Percentile: 0.83

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

JOURNAL ARTICLE

Local-Global Self-Attention for Transformer-Based Object Tracking

Langkun Chen, Long Gao, Yan Jiang, Yunsong Li, Gang He, Jifeng Ning

Journal: IEEE Transactions on Circuits and Systems for Video Technology   Year: 2024   Vol: 34 (12)   Pages: 12316-12329

JOURNAL ARTICLE

Local Information Modeling with Self-Attention for Speaker Verification

Bing Han, Zhengyang Chen, Yanmin Qian

Journal: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)   Year: 2022   Pages: 6727-6731

BOOK-CHAPTER

Speaker Verification with Disentangled Self-attention

Junjie Guo, Zhiyuan Ma, Haodong Zhao, Gongshen Liu, Xiaoyong Li

Lecture Notes in Computer Science   Year: 2021   Pages: 27-39

JOURNAL ARTICLE

Multi-View Self-Attention Based Transformer for Speaker Recognition

Rui Wang, Junyi Ao, Long Zhou, Shujie Liu, Zhihua Wei, Tom Ko, Qing Li, Yu Zhang

Journal: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)   Year: 2022   Pages: 6732-6736