JOURNAL ARTICLE

Weakly Labeled Semi-Supervised Sound Event Detection with Multi-Scale Residual Attention

Abstract

Different sound events have different time-frequency scale characteristics, which are useful for sound event detection (SED) but have not yet been effectively exploited. In this paper, we aim to adaptively select the multi-scale feature information that is conducive to classifying sound events. We propose a novel module, multi-scale residual attention (MSRA), composed of a multi-scale residual convolution block and a selective multi-scale attention block. The multi-scale residual convolution block extracts features at multiple scales, from which the selective multi-scale attention block adaptively selects those that help event classification. Experimental results show that our method outperforms the state-of-the-art model by 3.7% on the Task 4 dataset of the DCASE 2018 Challenge.
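The abstract does not specify the internals of MSRA, so the following is only a minimal NumPy sketch of the general idea: several branches process the same feature map at different temporal scales, and a softmax over branches decides, per channel, how much each scale contributes. Everything here is an assumption made for illustration: the function names, the moving-average stand-in for the convolution branches, the squeeze-and-softmax form of the attention, the placement of the residual shortcut, and the random weights. It should not be read as the authors' architecture.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def smooth_time(x, width):
    """Hypothetical stand-in for one convolution branch: a moving average
    of odd `width` along the time axis of x, shaped (channels, time, freq)."""
    pad = width // 2
    padded = np.pad(x, ((0, 0), (pad, pad), (0, 0)), mode="edge")
    shifted = [padded[:, i:i + x.shape[1], :] for i in range(width)]
    return np.stack(shifted).mean(axis=0)

def selective_multiscale_attention(branches, w_squeeze, w_branch):
    """Weight each scale per channel (assumed form, not from the paper).

    branches:  (S, C, T, F) features from S scales
    w_squeeze: (C, D) projection of the pooled descriptor
    w_branch:  (S, D, C) one projection per scale
    """
    fused = branches.sum(axis=0)                       # (C, T, F)
    descriptor = fused.mean(axis=(1, 2))               # global average pool -> (C,)
    hidden = np.maximum(descriptor @ w_squeeze, 0.0)   # ReLU -> (D,)
    logits = np.einsum("d,sdc->sc", hidden, w_branch)  # (S, C)
    weights = softmax(logits, axis=0)                  # sums to 1 over scales
    selected = (weights[:, :, None, None] * branches).sum(axis=0)
    return selected, weights

def msra_sketch(x, widths, w_squeeze, w_branch):
    """Multi-scale branches, selective attention, then a residual shortcut."""
    branches = np.stack([smooth_time(x, w) for w in widths])
    selected, weights = selective_multiscale_attention(branches, w_squeeze, w_branch)
    return x + selected, weights

# Toy input: 8 channels, 50 time frames, 64 frequency bins (arbitrary sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 50, 64))
widths = (1, 3, 5)
w_squeeze = rng.standard_normal((8, 4)) * 0.1
w_branch = rng.standard_normal((3, 4, 8)) * 0.1
out, weights = msra_sketch(x, widths, w_squeeze, w_branch)
```

In this sketch `weights` has one row per scale and one column per channel, and each column sums to one, so every channel receives a convex combination of the scales. In a trained model the two projection matrices would be learned rather than drawn at random.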

Keywords:
Residual; Block (permutation group theory); Computer science; Scale (ratio); Convolution (computer science); Pattern recognition (psychology); Event (particle physics); Artificial intelligence; Feature (linguistics); Feature extraction; Speech recognition; Convolutional neural network; Algorithm; Mathematics; Artificial neural network; Geography

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.14
Refs: 40
Citation Normalized Percentile: 0.45

Topics

Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Music Technology and Sound Studies
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition