JOURNAL ARTICLE

Boundary Discriminative Large Margin Cosine Loss for Text-independent Speaker Verification

Abstract

Deep neural network based speaker embeddings have attracted much attention in the text-independent speaker verification task. In addition to the network architecture, an appropriate design of the loss function is crucial for a deep discriminative embedding extractor. Inspired by the success of Large Margin Cosine Loss (LMCL) in face recognition, we propose an enhanced LMCL, named boundary discriminative LMCL (BD-LMCL), to emphasize the discriminative information inherent in the speaker boundaries. Unlike LMCL, where all training samples contribute equally to the objective function, BD-LMCL considers only the samples around the speaker boundaries during network training. Specifically, the samples close to the boundaries are dynamically selected using a top-k zero-one loss. Experimental results on a short-duration corpus, Android Cellphone, and on NIST SRE 2012 demonstrate better performance than LMCL and other popular loss functions.
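
To make the idea in the abstract concrete, the following is a minimal sketch in plain Python. The per-sample LMCL follows the standard form from the face recognition literature: a margin m is subtracted from the target-class cosine, and all cosines are scaled by s before softmax cross-entropy. The boundary selection rule shown here is an assumption for illustration only. It keeps the k samples whose margin-adjusted target cosine is closest to (or below) the best competing cosine, and it may differ from the top-k zero-one loss formulation used in the article. All function names and the values s = 30 and m = 0.35 are hypothetical choices, not taken from the article.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_scores(embedding, class_weights):
    """Cosine similarity between one embedding and every class weight vector."""
    e = normalize(embedding)
    return [sum(a * b for a, b in zip(e, normalize(w))) for w in class_weights]


def lmcl_sample_loss(cosines, label, s=30.0, m=0.35):
    """Standard LMCL for one sample: margin on the target cosine, then scaled softmax."""
    logits = [s * (c - m) if j == label else s * c for j, c in enumerate(cosines)]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


def boundary_margin(cosines, label, m=0.35):
    """Margin-adjusted target cosine minus the best rival cosine.

    Small or negative values mean the sample lies near or beyond the boundary.
    """
    rival = max(c for j, c in enumerate(cosines) if j != label)
    return cosines[label] - m - rival


def boundary_selected_loss(embeddings, labels, class_weights, k, s=30.0, m=0.35):
    """Average LMCL over only the k samples nearest the boundary.

    Assumption: this selection rule is an illustrative stand-in, not the
    article's exact top-k zero-one loss.
    """
    scored = []
    for emb, y in zip(embeddings, labels):
        cos = cosine_scores(emb, class_weights)
        scored.append((boundary_margin(cos, y, m), lmcl_sample_loss(cos, y, s, m)))
    scored.sort(key=lambda pair: pair[0])  # smallest margin first
    chosen = scored[:k]
    return sum(loss for _, loss in chosen) / len(chosen)


# Toy data: two speakers, three embeddings; the middle one sits on the boundary.
weights = [[1.0, 0.0], [0.0, 1.0]]
embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
labels = [0, 0, 1]

all_samples = boundary_selected_loss(embeddings, labels, weights, k=3)
boundary_only = boundary_selected_loss(embeddings, labels, weights, k=1)
```

In this toy setup, averaging over all samples dilutes the contribution of the single ambiguous embedding, whereas restricting the average to the sample nearest the boundary concentrates the loss on it. That is the intuition the abstract describes, under the stated assumptions.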

Keywords:
Discriminative model; Speech recognition; Computer science; NIST; Margin (machine learning); Pattern recognition (psychology); Speaker recognition; Artificial intelligence; Embedding; Boundary (topology); Machine learning; Mathematics

Metrics

Cited By: 24
FWCI (Field Weighted Citation Impact): 3.53
Refs: 27
Citation Normalized Percentile: 0.94

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing