JOURNAL ARTICLE

Clustering Driven Multi-Hop Graph Attention Network for Speaker Diarization

Abstract

Recently, segmented utterance-level modeling based on the Graph Attention Network (GAT) has proven effective in Clustering-based Speaker Diarization (CSD). However, existing methods rely only on message passing from a single neighbor per layer, ignoring sub-region and global information. In this paper, we propose the clustering-driven multi-hop Graph Attention Network (CD-MGAT), which combines a multi-hop neighbor module and a clustering-oriented prototype module to explore sub-region and global information for each segmented utterance. Specifically, the two modules interact adaptively through a clustering-consistency loss, which keeps the learned prototypes and the speaker embeddings consistent with each other. Extensive experiments on the AMI datasets demonstrate the effectiveness of our solution.
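
The abstract does not give the formulation of the multi-hop neighbor module, so the following is only a minimal sketch of the general idea of letting each utterance node attend to neighbors up to several hops away. It is not the CD-MGAT implementation. The function names `multi_hop_adjacency` and `attention_aggregate` are hypothetical, and scaled dot-product attention is used in place of the additive attention of a standard GAT purely for brevity.

```python
import numpy as np

def multi_hop_adjacency(adj, hops):
    """Boolean mask of nodes reachable within `hops` steps (each node includes itself)."""
    n = adj.shape[0]
    a = (adj != 0).astype(int)
    reach = np.eye(n, dtype=bool)   # hop 0: every node reaches itself
    step = np.eye(n, dtype=int)
    for _ in range(hops):
        # Nodes reachable in exactly one more step than the previous frontier.
        step = ((step @ a) > 0).astype(int)
        reach |= step.astype(bool)
    return reach

def attention_aggregate(x, mask):
    """Masked scaled dot-product attention over node features x (n_nodes x dim).

    Assumption: this stands in for GAT-style attention; it is not the paper's scoring function.
    """
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores = np.where(mask, scores, -np.inf)       # block nodes outside the multi-hop neighborhood
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability; self-loop keeps the max finite
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1
    return weights @ x, weights
```

With a chain graph of four utterance nodes (0-1-2-3), calling `multi_hop_adjacency(adj, 2)` lets node 0 attend to nodes 0, 1 and 2 but not node 3, whereas a one-hop mask would restrict it to nodes 0 and 1.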

Keywords:
Cluster analysis; Computer science; Graph; Consistency (knowledge bases); Embedding; Utterance; Speaker diarisation; Data mining; Artificial intelligence; Theoretical computer science; Speaker recognition

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.26
Refs: 15
Citation Normalized Percentile: 0.57

Topics

Speech Recognition and Synthesis (Physical Sciences → Computer Science → Artificial Intelligence)
Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)