Duration Robust Weakly Supervised Sound Event Detection

Heinrich Dinkel; Kai Yu

doi:10.1109/icassp40776.2020.9053459


JOURNAL ARTICLE


Year: 2020 Pages: 311-315



Abstract

Task 4 of the DCASE2018 challenge demonstrated that substantially more research is needed before sound event detection can be applied in real-world settings. An analysis of the challenge results shows that the most successful models are biased towards predicting long (e.g., over 5 s) clips. This work investigates the performance impact of median filter post-processing with a fixed-size window and advocates double thresholding as a more robust and predictable post-processing method. Further, four temporal subsampling methods within the CRNN framework are proposed: mean-max, alpha-mean-max, Lp-norm and convolutional. We show that, for this task, subsampling the temporal resolution by a neural network improves the F1 score as well as robustness towards short, sporadic sound events. Our best single model achieves 30.1% F1 on the evaluation set and the best fusion model 32.5%, while being robust to event length variations.
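To illustrate the contrast the abstract draws between the two post-processing methods, the following is a minimal sketch, not the authors' implementation. It assumes the common hysteresis form of double thresholding: a contiguous run of frames above a low threshold is kept as an event only if at least one frame in the run also exceeds a high threshold. The function names, the threshold values (0.75, 0.2 and 0.5) and the window size of 5 frames are illustrative assumptions, not values taken from the paper.

```python
from typing import List, Sequence, Tuple


def double_threshold(probs: Sequence[float],
                     high: float = 0.75,
                     low: float = 0.2) -> List[Tuple[int, int]]:
    """Return (onset, offset) frame pairs, offset exclusive.

    Assumed hysteresis rule: a run of frames above `low` is an event
    only if some frame in that run is also above `high`.
    """
    events = []
    i, n = 0, len(probs)
    while i < n:
        if probs[i] > low:
            start, has_peak = i, False
            # Walk to the end of the contiguous run above `low`.
            while i < n and probs[i] > low:
                has_peak = has_peak or probs[i] > high
                i += 1
            if has_peak:
                events.append((start, i))
        else:
            i += 1
    return events


def median_filter(seq: Sequence[int], window: int) -> List[int]:
    """Fixed-size median filter; the window is truncated at the edges."""
    half = window // 2
    out = []
    for i in range(len(seq)):
        win = sorted(seq[max(0, i - half): i + half + 1])
        out.append(win[len(win) // 2])
    return out


# Hypothetical frame-level probabilities with one short, confident event.
probs = [0.05, 0.1, 0.3, 0.9, 0.8, 0.3, 0.1, 0.05, 0.25, 0.3, 0.1]

# Single threshold (0.5, an assumption) followed by a 5-frame median filter.
binary = [1 if p > 0.5 else 0 for p in probs]
print(median_filter(binary, 5))   # the two-frame event is smoothed away
print(double_threshold(probs))    # the short event survives with its low-probability edges
```

In this toy input the median filter's window is longer than the event, so the event is erased, whereas double thresholding has no window length to tune and keeps it. This shows only one way the fixed window can hurt short events; the paper's actual experiments are not reproduced here.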

Keywords: Computer science; Robustness (evolution); Artificial intelligence; Thresholding; Pattern recognition (psychology); Convolutional neural network; Speech recognition; Image (mathematics)

Metrics

FWCI (Field Weighted Citation Impact): 3.69

Citation Normalized Percentile: 0.94

Topics

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Music Technology and Sound Studies

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition


Related Documents

Towards Duration Robust Weakly Supervised Sound Event Detection

Weakly-Supervised Sound Event Detection with Self-Attention

Adaptive Hierarchical Pooling for Weakly-supervised Sound Event Detection

Improving Weakly Supervised Sound Event Detection with Causal Intervention

Background-aware Modeling for Weakly Supervised Sound Event Detection