Zezheng Xu, Ting Jiang, Chao Li, Jiacheng Yu
Convolutional neural networks (CNNs) have achieved remarkable results in speech enhancement. However, because convolution is a local operation, it struggles to capture the global context of the feature map. To address this problem, we propose an attention-augmented fully convolutional neural network for monaural speech enhancement. Specifically, we integrate a new two-dimensional relative self-attention mechanism into a fully convolutional network. In addition, we adopt the Huber loss as the loss function, which is more robust to outliers. Experimental results show that, compared with the optimally modified log-spectral amplitude (OM-LSA) estimator and other CNN-based models, the proposed network performs better on five metrics and achieves a good balance between noise suppression and speech distortion. Moreover, we embed the proposed attention mechanism into other convolutional networks and obtain satisfactory results, indicating that the mechanism generalizes well.
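The Huber loss mentioned above is quadratic for small errors and linear for large ones, which is what makes it more robust to outliers than the mean squared error. A minimal sketch (the threshold parameter `delta` is an assumption, as the abstract does not specify it):

```python
import numpy as np

def huber_loss(pred, target, delta=1.0):
    """Huber loss: quadratic when |error| <= delta, linear beyond it.

    Large residuals contribute linearly rather than quadratically,
    so outliers pull the loss less strongly than under MSE.
    """
    err = pred - target
    abs_err = np.abs(err)
    quadratic = 0.5 * err ** 2                  # MSE-like region
    linear = delta * (abs_err - 0.5 * delta)    # L1-like region
    return np.mean(np.where(abs_err <= delta, quadratic, linear))

# Small error (0.5 <= delta): quadratic branch, 0.5 * 0.5**2 = 0.125
print(huber_loss(np.array([0.5]), np.array([0.0])))
# Large error (3.0 > delta): linear branch, 1.0 * (3.0 - 0.5) = 2.5
print(huber_loss(np.array([3.0]), np.array([0.0])))
```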