JOURNAL ARTICLE

Orthogonal Training for Text-Independent Speaker Verification

Abstract

In this paper, we propose orthogonal training schemes to improve the effectiveness of cosine similarity measurements in text-independent speaker verification (SV) tasks. Compared to the PLDA backend, cosine similarity is simple to compute, and it does not require extra data or time to build a separate model. The use of cosine similarity measurement is also highly desirable for building end-to-end SV systems. However, cosine similarity rests on the assumption that the dimensions of the speaker embeddings are orthogonal, which is usually not satisfied in current SV systems. The training scheme applies singular value decomposition (SVD) to the weight matrix of the speaker embedding extraction layer in our time delay neural network (TDNN)-based SV system, and replaces the original weight matrix with the matrix constructed from the left unitary matrix and the singular value matrix. The reconstructed matrix in the extraction layer is then held constant and the remaining network is fine-tuned with an orthogonality regularizer. We further investigate orthogonal training from scratch, with orthogonality regularization incorporated throughout the network training. Experimental results show that our orthogonal training methods can significantly improve system performance with a cosine similarity backend.
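
The SVD step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the toy matrix shape, and in particular the form of the orthogonality penalty (squared Frobenius norm of W^T W - I, a common soft constraint) are assumptions made here for illustration, since the abstract does not specify them.

```python
import numpy as np


def svd_reconstruct(W):
    """Replace W = U S V^T by U S (left unitary matrix times singular values).

    Hypothetical helper: the abstract only states that the new matrix is
    built from the left unitary matrix and the singular value matrix.
    """
    U, s, _ = np.linalg.svd(W, full_matrices=False)
    # Broadcasting scales column j of U by s[j], which equals U @ diag(s).
    return U * s


def orthogonality_penalty(W):
    """Assumed regularizer: squared Frobenius norm of (W^T W - I)."""
    gram = W.T @ W
    return float(np.sum((gram - np.eye(W.shape[1])) ** 2))


def cosine_score(a, b):
    """Cosine similarity between two speaker embeddings."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((6, 4))  # toy stand-in for the embedding-layer weights
    W_new = svd_reconstruct(W)

    # The columns of U S are mutually orthogonal: (U S)^T (U S) = S^2 is diagonal.
    gram = W_new.T @ W_new
    print("max off-diagonal:", np.abs(gram - np.diag(np.diag(gram))).max())

    # The penalty that would be added to the training loss for other layers.
    print("penalty of raw W:", orthogonality_penalty(W))
```

Because (U S)(U S)^T = W W^T, the replacement preserves the Gram matrix of the original weights while making the columns of the new matrix mutually orthogonal, which is the property the cosine backend relies on.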

Keywords:
Orthogonality; Cosine similarity; Singular value decomposition; Computer science; Artificial neural network; Trigonometric functions; Matrix (chemical analysis); Similarity (geometry); Singular value; Orthogonal matrix; Algorithm; Discrete cosine transform; Speaker recognition; Pattern recognition (psychology); Speech recognition; Artificial intelligence; Mathematics

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.44
Refs: 25
Citation Normalized Percentile: 0.67

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing