JOURNAL ARTICLE

Power Pooling: An Adaptive Pooling Function for Weakly Labelled Sound Event Detection

Abstract

Obtaining large corpora of strongly labelled sound events is expensive and difficult in engineering applications. Many researchers have therefore turned to the problem of detecting both the types and the timestamps of sound events from weak labels that specify only the types. This task can be treated as a multiple instance learning (MIL) problem, and in sound event detection (SED) a key design choice is the pooling function. The linear softmax pooling function achieves state-of-the-art performance because it can vary both the signs and the magnitudes of the gradients. However, linear softmax pooling cannot flexibly handle sound events of different time scales. In this paper, we propose a power pooling function that automatically adapts to various sound events. By adding a trainable parameter for each event, power pooling can provide more accurate gradients for the frames in a clip than other pooling functions. On both weakly supervised and semi-supervised SED datasets, the proposed power pooling function outperforms linear softmax pooling on both coarse-grained and fine-grained metrics. Specifically, it improves the event-based F1 score by a relative 11.4% and 10.2% on the two datasets, respectively. While this paper focuses on SED applications, the proposed method can be applied to MIL tasks in other domains.
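
The abstract does not state the pooling formula, so the following is only a minimal sketch under an assumption: power pooling is taken to weight each frame-level probability by itself raised to a per-event exponent n, so that n = 0 reduces to mean pooling, n = 1 to linear softmax pooling, and large n approaches max pooling. The function names, the `eps` constant, and the toy data are hypothetical and are not taken from the paper; in a real model n would be a trainable parameter rather than a fixed array.

```python
import numpy as np


def power_pooling(frame_probs, n, eps=1e-8):
    """Pool frame-level probabilities into clip-level probabilities.

    Assumed form (not quoted from the paper):
        y_clip = sum_i(y_i * y_i**n) / sum_i(y_i**n)

    frame_probs: array of shape (frames, events), values in [0, 1].
    n: array of shape (events,), one exponent per event class.
    """
    frame_probs = np.asarray(frame_probs, dtype=float)
    weights = np.power(frame_probs, n)  # n broadcasts across frames
    return (frame_probs * weights).sum(axis=0) / (weights.sum(axis=0) + eps)


def linear_softmax_pooling(frame_probs, eps=1e-8):
    """Linear softmax pooling: sum_i(y_i**2) / sum_i(y_i)."""
    frame_probs = np.asarray(frame_probs, dtype=float)
    return (frame_probs ** 2).sum(axis=0) / (frame_probs.sum(axis=0) + eps)


# Toy clip: 4 frames, 2 event classes (illustrative numbers only).
frame_probs = np.array([
    [0.9, 0.1],
    [0.8, 0.1],
    [0.1, 0.7],
    [0.1, 0.1],
])

clip_mean = power_pooling(frame_probs, n=np.array([0.0, 0.0]))    # mean pooling
clip_linear = power_pooling(frame_probs, n=np.array([1.0, 1.0]))  # linear softmax
clip_sharp = power_pooling(frame_probs, n=np.array([8.0, 8.0]))   # close to max
```

Under this assumed form, a larger exponent concentrates the weight on the most confident frames, which is one way to read the abstract's claim that a per-event parameter lets the pooling adapt to events of different durations.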

Keywords:
Pooling; Softmax function; Computer science; Event (particle physics); Artificial intelligence; Function (biology); Machine learning; Task (project management); Pattern recognition (psychology); Deep learning; Engineering

Metrics

Cited By: 10
FWCI (Field Weighted Citation Impact): 1.15
Refs: 26
Citation Normalized Percentile: 0.79
Is in top 1%
Is in top 10%

Topics

Music and Audio Processing
Physical Sciences → Computer Science → Signal Processing
Speech and Audio Processing
Physical Sciences → Computer Science → Signal Processing
Music Technology and Sound Studies
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Adaptive Pooling Operators for Weakly Labeled Sound Event Detection

Brian McFee, Justin Salamon, Juan Pablo Bello

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing; Year: 2018; Vol: 26 (11); Pages: 2180-2193
JOURNAL ARTICLE

Adaptive Hierarchical Pooling for Weakly-supervised Sound Event Detection

Lijian Gao, Ling Zhou, Qirong Mao, Ming Dong

Journal: Proceedings of the 30th ACM International Conference on Multimedia; Year: 2022; Pages: 1779-1787
JOURNAL ARTICLE

Frequency-dependent auto-pooling function for weakly supervised sound event detection

Sichen Liu, Feiran Yang, Yin Cao, Jun Yang

Journal: EURASIP Journal on Audio, Speech, and Music Processing; Year: 2021; Vol: 2021 (1)
DISSERTATION

Sound event detection with weakly labelled data

Qiuqiang Kong

University: Surrey Research Insight Open Access (The University of Surrey); Year: 2020