Ying Hu, Xiujuan Zhu, Yunlong Li, Hao Huang, Liang He
Sound event detection (SED) is an interesting but challenging task due to the scarcity of data and the diversity of sound events in real life. This paper presents a multi-grained based attention network (MGA-Net) for semi-supervised sound event detection. To obtain feature representations related to sound events, a residual hybrid convolution (RH-Conv) block is designed to boost the vanilla convolution's ability to extract time-frequency features. Moreover, a multi-grained attention (MGA) module is designed to learn temporal-resolution features from coarse level to fine level. With the MGA module, the network can capture the characteristics of target events of short or long duration, and thus determine the onset and offset of sound events more accurately. Furthermore, to effectively boost the performance of the Mean Teacher (MT) method, a spatial shift (SS) module is introduced as a data perturbation mechanism to increase the diversity of the data. Experimental results show that MGA-Net outperforms the published state-of-the-art competitors, achieving event-based macro F1 (EB-F1) scores of 53.27% and 56.96%, and polyphonic sound detection scores (PSDS) of 0.709 and 0.739, on the validation and public sets respectively.
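As a brief aside, the Mean Teacher (MT) method mentioned in the abstract maintains a teacher model whose weights are an exponential moving average (EMA) of the student's weights; the student is trained to be consistent with the teacher's predictions on perturbed inputs. A minimal sketch of the EMA weight update follows; the function name, the flat-list weight representation, and the smoothing factor `alpha` are illustrative assumptions, not details from this paper.

```python
def ema_update(teacher_weights, student_weights, alpha=0.999):
    """Move each teacher weight toward the corresponding student weight.

    teacher <- alpha * teacher + (1 - alpha) * student
    Weights are represented here as flat lists of floats for simplicity;
    a real implementation would iterate over model parameter tensors.
    """
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]

# One update step with a small alpha to make the movement visible:
teacher = ema_update([0.0, 1.0], [1.0, 1.0], alpha=0.9)
```

Because `alpha` is close to 1 in practice, the teacher changes slowly, which is what makes its predictions a stable consistency target for the student.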
Maolin Tang, Qijun Zhao, Zhengxi Liu
Yadong Guan, Jiabin Xue, Guibin Zheng, Jiqing Han
Chia-Chuan Liu, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan, Yu-Han Cheng, Hsiang-Feng Chuang, Wei-Yu Chen
SHEN Yaxin, GAO Lijian, MAO Qirong