JOURNAL ARTICLE

Shortcut-Based Fully Convolutional Network for Speech Enhancement

Abstract

The goal of speech enhancement is to improve the quality and intelligibility of noisy speech. One solution for this task is the Fully Convolutional Network (FCN), which can effectively model temporal structure with fewer parameters. However, a deep FCN is hard to train and may lose detailed information because of consecutive pooling operations. To solve this problem, we introduce two different shortcut mechanisms that better preserve information from shallow layers. In addition, we discard the pooling layers of the FCN and their corresponding upsampling layers to avoid information compression. Experimental results show that the proposed model outperforms the baseline models in both seen and unseen noise conditions.
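The architectural idea in the abstract can be illustrated with a small sketch. The abstract does not name its two shortcut mechanisms, so the additive (residual-style) and concatenation (dense-style) shortcuts below are assumptions chosen for illustration, not the authors' design. The function names, layer sizes, and random weights are likewise hypothetical. The sketch only shows that, with stride-1 convolutions and no pooling or upsampling, the time resolution of the waveform is unchanged from input to output while shallow features are carried forward.

```python
import random

def conv1d_same(x, kernels):
    """Stride-1, zero-padded 1-D convolution with no pooling.
    x: list of channels, each a list of T samples.
    kernels: one filter per output channel; each filter holds one
    odd-length list of taps per input channel.
    Returns a list of output channels, each of length T."""
    T = len(x[0])
    out = []
    for filt in kernels:
        half = len(filt[0]) // 2
        y = []
        for t in range(T):
            acc = 0.0
            for c, taps in enumerate(filt):
                for j, w in enumerate(taps):
                    idx = t + j - half
                    if 0 <= idx < T:  # samples outside the signal count as zero
                        acc += w * x[c][idx]
            y.append(acc)
        out.append(y)
    return out

def relu(x):
    return [[max(0.0, v) for v in ch] for ch in x]

def additive_shortcut_block(x, kernels):
    """Assumed shortcut type 1: y = x + relu(conv(x)).
    Needs as many output channels as input channels."""
    h = relu(conv1d_same(x, kernels))
    return [[a + b for a, b in zip(cx, ch)] for cx, ch in zip(x, h)]

def concat_shortcut_block(x, kernels):
    """Assumed shortcut type 2: stack the input channels with the new features."""
    return x + relu(conv1d_same(x, kernels))

def random_kernels(n_out, n_in, width, rng):
    # Hypothetical untrained weights, only to make the sketch executable.
    return [[[rng.uniform(-0.1, 0.1) for _ in range(width)]
             for _ in range(n_in)] for _ in range(n_out)]

rng = random.Random(0)
noisy = [[rng.gauss(0.0, 1.0) for _ in range(16)]]              # 1 channel, 16 samples
h = concat_shortcut_block(noisy, random_kernels(3, 1, 5, rng))  # 1 -> 4 channels
h = additive_shortcut_block(h, random_kernels(4, 4, 5, rng))    # 4 -> 4 channels
enhanced = conv1d_same(h, random_kernels(1, 4, 5, rng))         # 4 -> 1 channel
```

Because no layer downsamples, `enhanced[0]` has the same number of samples as `noisy[0]`, and the concatenation block passes the raw input through unchanged as its first channel. A trained model would learn the kernel weights; the random values here carry no meaning.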

Keywords:
Upsampling, Pooling, Computer science, Speech enhancement, Intelligibility (philosophy), Speech recognition, Task (project management), Artificial intelligence, Information loss, Noise reduction, Image (mathematics)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 30
Citation Normalized Percentile: 0.16

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Advanced Adaptive Filtering Techniques
Physical Sciences →  Engineering →  Computational Mechanics
Blind Source Separation Techniques
Physical Sciences →  Computer Science →  Signal Processing