SHEN Yaxin, GAO Lijian, MAO Qirong
Existing semi-supervised sound event detection methods train directly on strongly labeled synthetic samples, weakly labeled real samples, and unlabeled real samples to alleviate the shortage of labeled data. However, there is an inevitable distribution gap between the synthetic and real domains, which can interfere with the direction of gradient optimization and thereby restrict the generalization ability of these models. To address this challenge, a novel semi-supervised learning paradigm for sound event detection, meta mean teacher (MMT), is proposed based on meta-learning. Specifically, each batch of training data is divided into a meta-training set consisting of synthetic samples and a meta-test set consisting of real samples. The meta-gradient calculated on the meta-training set serves as guidance for updating the meta-test gradient, allowing the model to perceive and learn more generalized knowledge. Experimental results on the DCASE2021 Task 4 dataset show that, compared with the official baseline, MMT achieves relative improvements of 8.9%, 6.6%, and 1.1% in the F1, PSDS1, and PSDS2 metrics, respectively. Compared with current state-of-the-art methods in the field, MMT still demonstrates a significant performance advantage.
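The batch split and gradient guidance described above can be illustrated with a toy sketch. The abstract does not give the exact update rule, so the code below assumes a first-order, MLDG-style scheme: the gradient on the synthetic meta-training set produces a virtual update, and the gradient on the real meta-test set is evaluated at the virtually updated weights. The scalar model, the names `split_batch`, `meta_step`, and `ema_update`, and the hyperparameters `inner_lr`, `outer_lr`, `beta`, and `alpha` are all hypothetical. For simplicity the real samples carry full labels here, and the mean teacher consistency loss is reduced to the standard exponential-moving-average teacher update.

```python
def mse_grad(w, samples):
    """Gradient of mean squared error for the scalar model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in samples) / len(samples)


def split_batch(batch):
    """Split a mixed batch into meta-train (synthetic) and meta-test (real)."""
    meta_train = [(x, y) for x, y, domain in batch if domain == "synthetic"]
    meta_test = [(x, y) for x, y, domain in batch if domain == "real"]
    return meta_train, meta_test


def meta_step(w, batch, inner_lr=0.1, outer_lr=0.1, beta=1.0):
    """One assumed meta-update (first-order approximation, not the paper's exact rule)."""
    meta_train, meta_test = split_batch(batch)
    # Gradient on the synthetic meta-training set.
    g_train = mse_grad(w, meta_train)
    # Virtual update: where the weights would go if only synthetic data were used.
    w_virtual = w - inner_lr * g_train
    # Gradient on the real meta-test set, evaluated at the virtual weights.
    g_test = mse_grad(w_virtual, meta_test)
    # Combine both gradients for the actual student update.
    return w - outer_lr * (g_train + beta * g_test)


def ema_update(teacher_w, student_w, alpha=0.99):
    """Standard mean teacher update: teacher is an EMA of the student."""
    return alpha * teacher_w + (1.0 - alpha) * student_w


# Toy data: both domains follow y = 2x in this illustration.
batch = [
    (1.0, 2.0, "synthetic"),
    (2.0, 4.0, "synthetic"),
    (1.5, 3.0, "real"),
    (0.5, 1.0, "real"),
]

student, teacher = 0.0, 0.0
for _ in range(50):
    student = meta_step(student, batch)
    teacher = ema_update(teacher, student)
```

In this sketch the student weight approaches 2 and the teacher follows it more slowly. A full second-order variant would also differentiate through the virtual update, which requires an automatic differentiation framework.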