Smart speakers have recently been widely adopted in consumer homes, largely as a communication interface between humans and machines. Beyond voice interaction, these speakers can monitor sounds other than human speech, for example, to watch over elderly people living alone and to issue notifications when changes in their usual activities may indicate a health problem. In this paper, we focus on sound classification using machine learning, which usually requires a large amount of training data to achieve good accuracy. Our main contribution is a data augmentation technique that generates new sounds by shuffling and mixing two existing sounds of the same class in the dataset. This technique creates new variations in both the temporal sequence and the density of the sound events. On DCASE 2018 Task 5, the proposed data augmentation method combined with our proposed convolutional neural network (CNN) achieves a macro-averaged F1 score of 89.95%, averaged over the 4 folds of the development dataset. This is a significant improvement over the baseline result of 84.50%. In addition, we verify that our proposed data augmentation technique also improves classification performance on the UrbanSound8K dataset.
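The abstract only sketches the shuffle-and-mix idea, so the following is a minimal, hypothetical illustration of one plausible reading: split two same-class waveforms into segments, permute the segment order of each independently, and average the two shuffled clips. The function name, segment count, and mixing weights are assumptions for illustration, not the authors' exact procedure.

```python
import numpy as np

def shuffle_and_mix(x1, x2, n_segments=4, rng=None):
    """Sketch of a shuffle-and-mix augmentation: split each
    same-class clip into n_segments pieces, shuffle the piece
    order independently, then mix the two results by averaging."""
    rng = rng or np.random.default_rng()
    # Trim both clips to a common length divisible by n_segments.
    n = min(len(x1), len(x2))
    n -= n % n_segments

    def shuffled(x):
        segs = np.split(x[:n], n_segments)
        order = rng.permutation(n_segments)
        return np.concatenate([segs[i] for i in order])

    return 0.5 * (shuffled(x1) + shuffled(x2))

# Example: two 1-second clips at 16 kHz drawn from the same class.
a = np.random.randn(16000)
b = np.random.randn(16000)
augmented = shuffle_and_mix(a, b)
```

Because each clip's segments are reordered before mixing, the generated sample varies both the temporal sequence of events and, through the overlap of two clips, the density of events, matching the two kinds of variation the abstract describes.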