JOURNAL ARTICLE

Deep Convolutional Neural Network with Transfer Learning for Environmental Sound Classification

Abstract

Environmental sound classification (ESC) is an important problem, but the lack of datasets has long made high-accuracy ESC challenging. In this paper, we propose a new convolutional neural network (CNN) model that uses transfer learning for the ESC task. First, we represent each sound as an RGB image, in which the red channel corresponds to the Log-Mel spectrogram, the green channel to the scalogram, and the blue channel to the Mel-frequency cepstral coefficients (MFCCs). Second, we train a CNN architecture based on the Xception model, which achieves better performance on the JFT dataset. Test results show that the proposed approach achieves better ESC accuracy.
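
The channel layout described in the abstract can be illustrated with a minimal sketch. It assumes the three feature maps (Log-Mel spectrogram, scalogram, MFCC) have already been computed by a signal-processing library and are available as 2-D lists of floats. The abstract does not say how the maps are brought to a common size or value range, so the nearest-neighbour resize and the min-max scaling to [0, 255] used here are assumptions, and the function names `scale_to_uint8`, `resize_nearest`, and `stack_rgb` are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def scale_to_uint8(feature: Sequence[Sequence[float]]) -> List[List[int]]:
    """Min-max scale a 2-D feature map to integers in [0, 255] (assumed scaling)."""
    flat = [value for row in feature for value in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no contrast; map it to all zeros.
        return [[0 for _ in row] for row in feature]
    return [[round(255 * (value - lo) / span) for value in row] for row in feature]


def resize_nearest(
    feature: Sequence[Sequence[float]], height: int, width: int
) -> List[List[float]]:
    """Nearest-neighbour resize so all three maps share one shape (assumed method)."""
    src_h, src_w = len(feature), len(feature[0])
    return [
        [feature[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


def stack_rgb(
    log_mel: Sequence[Sequence[float]],
    scalogram: Sequence[Sequence[float]],
    mfcc: Sequence[Sequence[float]],
    height: int,
    width: int,
) -> List[List[Pixel]]:
    """Build an RGB image: red = Log-Mel, green = scalogram, blue = MFCC."""
    red, green, blue = (
        scale_to_uint8(resize_nearest(feature, height, width))
        for feature in (log_mel, scalogram, mfcc)
    )
    return [
        [(red[r][c], green[r][c], blue[r][c]) for c in range(width)]
        for r in range(height)
    ]
```

Calling `stack_rgb(log_mel, scalogram, mfcc, 299, 299)` would give an image at the input size commonly used with Xception. That size is likewise an assumption, since the abstract does not state the image dimensions.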

Keywords:
Spectrogram; Convolutional neural network; Computer science; Transfer of learning; Mel-frequency cepstrum; Artificial intelligence; Pattern recognition (psychology); Channel (broadcasting); Deep learning; Speech recognition; Cepstrum; RGB color model; Feature extraction; Task (project management); Engineering; Telecommunications

Metrics

Cited By: 17
FWCI (Field Weighted Citation Impact): 1.73
Refs: 18
Citation Normalized Percentile: 0.85

Topics

Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Animal Vocal Communication and Behavior
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Developmental Biology