JOURNAL ARTICLE

Adaptive Hierarchical Pooling for Weakly-supervised Sound Event Detection

Lijian Gao, Ling Zhou, Qirong Mao, Ming Dong

Year: 2022
Journal: Proceedings of the 30th ACM International Conference on Multimedia
Pages: 1779-1787

Abstract

In Weakly-supervised Sound Event Detection (WSED), the training labels indicate the presence or absence of each sound event only at the clip level (i.e., there are no frame-level annotations). Recently, WSED has been formulated under the multi-instance learning framework, and a critical component of this formulation is the design of the temporal pooling function. In this paper, we propose an adaptive hierarchical pooling method (HiPool) for WSED. Through a novel hierarchical structure, HiPool combines the advantages of max pooling in audio tagging with those of weighted average pooling in audio localization, and it learns event-wise optimal pooling functions through continuous relaxation-based joint optimization. Extensive experiments on benchmark datasets show that HiPool outperforms current pooling methods and greatly improves the performance of WSED. HiPool is also general: it can be plugged into any WSED model.
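To make the role of the temporal pooling function concrete, the following is a minimal NumPy sketch of how frame-level probabilities can be aggregated into clip-level predictions. It shows max pooling, one common form of weighted average pooling (linear softmax, where each frame is weighted by its own probability), and a simple two-level combination of the two. The function names, the fixed `segment_len` parameter, and the toy input are assumptions made for illustration; the two-level function is not the HiPool formulation from the paper, which learns event-wise pooling functions rather than using a hand-set segment length.

```python
import numpy as np

def max_pool(p):
    """Clip-level score = most confident frame. p has shape (frames, events)."""
    return p.max(axis=0)

def linear_softmax_pool(p, eps=1e-8):
    """Weighted average over frames, each frame weighted by its own probability."""
    return (p * p).sum(axis=0) / (p.sum(axis=0) + eps)

def two_level_pool(p, segment_len, eps=1e-8):
    """Illustrative only (hypothetical helper, not the paper's method):
    weighted average inside fixed-length segments, then max across segments."""
    n_frames, n_events = p.shape
    n_segments = n_frames // segment_len
    # Drop trailing frames that do not fill a whole segment.
    segments = p[: n_segments * segment_len].reshape(n_segments, segment_len, n_events)
    seg_scores = (segments ** 2).sum(axis=1) / (segments.sum(axis=1) + eps)
    return seg_scores.max(axis=0)

# Toy frame-level probabilities: 4 frames, 2 event classes (made-up values).
frame_probs = np.array([
    [0.9, 0.1],
    [0.8, 0.1],
    [0.1, 0.1],
    [0.1, 0.2],
])

clip_max = max_pool(frame_probs)
clip_lin = linear_softmax_pool(frame_probs)
clip_two = two_level_pool(frame_probs, segment_len=2)
```

On this toy input, the clip-level score for the first event under the two-level scheme falls between the weighted average and the max. In this sketch `segment_len` acts as the knob that trades one behavior for the other: a segment length of 1 reduces to max pooling, and a segment spanning the whole clip reduces to the weighted average.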

Keywords:
Pooling, Benchmark (surveying), Computer science, Generality, Event (particle physics), Artificial intelligence, Ground truth, Machine learning, Pattern recognition (psychology)

Metrics

Cited By: 6
FWCI (Field Weighted Citation Impact): 0.70
Refs: 29
Citation Normalized Percentile: 0.67

Topics

Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Music Technology and Sound Studies
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
