JOURNAL ARTICLE

Fast speaker change detection for broadcast news transcription and indexing

Abstract

In this paper, we describe a new speaker change detection algorithm designed for fast transcription and audio indexing of spoken broadcast news. We have designed a two-stage algorithm that begins with a gender-independent phone-class recognition pass. We collapse the phoneme inventory to only 4 broad classes and include 4 different models for non-speech, resulting in a small fast decoder that runs in less than 0.1 times real-time. The second stage of the SCD algorithm hypothesizes a speaker change boundary between every phone in the labeled input. The phone level time resolution in our approach permits the algorithm to run quickly while maintaining the same accuracy as a frame level approach. Applying the new algorithms to a large sample of broadcast news programs resulted in improvements in speaker change detection accuracy, speech recognition accuracy, and speed.

Keywords:
Computer science Search engine indexing Phone Speech recognition Speaker diarisation Transcription (linguistics) Voice activity detection Frame (networking) Change detection Speaker recognition Artificial intelligence Speech processing Telecommunications

Metrics

92
Cited By
7.86
FWCI (Field Weighted Citation Impact)
4
Refs
0.97
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
© 2026 ScienceGate Book Chapters — All rights reserved.