Speaker change detection with privacy-preserving audio cues

Sree Hari Krishnan Parthasarathi; Mathew Magimai.-Doss; Daniel Gática-Pérez; Hervé Bourlard

doi:10.1145/1647314.1647385


JOURNAL ARTICLE


Year: 2009 Pages: 343-346


Abstract

In this paper we investigate a set of privacy-sensitive audio features for speaker change detection (SCD) in multiparty conversations. These features are based on three different principles: characterizing the excitation source information using the linear prediction residual, characterizing subband spectral information shown to contain speaker information, and characterizing the general shape of the spectrum. Experiments show that the performance of the privacy-sensitive features is comparable to or better than that of the state-of-the-art full-band spectral features, namely mel frequency cepstral coefficients, which suggests that socially acceptable ways of recording conversations in real life are feasible.
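As an illustration of the first principle, the linear prediction (LP) residual of a frame can be obtained by fitting an all-pole predictor and subtracting its prediction from the signal. The sketch below is a minimal pure-Python version using the autocorrelation method with the Levinson-Durbin recursion. It is not the authors' implementation: the abstract does not give the predictor order, the frame length, or how features are derived from the residual, so `order=2`, the function names, and the synthetic test signal `synthetic_ar2` are all assumptions made for this example.

```python
import random


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations.

    Returns coefficients a[1..order] such that x[t] is predicted by
    sum_k a[k] * x[t - k].
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        err *= 1.0 - k_i * k_i
    return a[1:]


def lp_residual(frame, order=2):
    """Return (LP coefficients, residual) for one frame.

    Samples before the start of the frame are treated as zero.
    """
    r = autocorrelation(frame, order)
    coeffs = levinson_durbin(r, order)
    residual = []
    for t in range(len(frame)):
        pred = sum(coeffs[k] * frame[t - 1 - k]
                   for k in range(order) if t - 1 - k >= 0)
        residual.append(frame[t] - pred)
    return coeffs, residual


def synthetic_ar2(n, seed=0):
    """Hypothetical test signal: x[t] = 1.3 x[t-1] - 0.6 x[t-2] + noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(1.3 * x[-1] - 0.6 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]
```

On the synthetic signal, `lp_residual(synthetic_ar2(2000), order=2)` should recover coefficients close to 1.3 and -0.6, and the residual should carry less energy than the signal itself. For real speech, a higher order and short windowed frames would normally be used; those choices are left open here.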

Keywords:

Computer science; Mel-frequency cepstrum; Speech recognition; Set (abstract data type); Cepstrum; Linear prediction; Residual; Speaker recognition; Artificial intelligence; Feature extraction; Algorithm

Metrics

FWCI (Field Weighted Citation Impact): 1.74

Citation Normalized Percentile: 0.85

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Target Active Speaker Detection with Audio-visual Cues

Wordless Sounds: Robust Speaker Diarization Using Privacy-Preserving Audio Representations

SafeEar: Content Privacy-Preserving Audio Deepfake Detection

Privacy-Preserving Speaker Authentication

An Efficient Speaker Diarization using Privacy Preserving Audio Features Based of Speech/Non Speech Detection