In recent years, the complexity of music production has gradually decreased, enabling many people to create music and upload it to streaming platforms. The sheer volume of music on these platforms forces listeners to spend considerable time searching for specific tracks, so techniques for quickly classifying music genres are increasingly important. As machine learning and deep learning technologies mature, Convolutional Neural Networks (CNNs) have been applied to many fields, and various CNN-based variants have emerged one after another. Traditional music genre classification requires relevant professional knowledge to manually extract features from time series data, whereas deep learning has been proven effective and efficient on such data. To save users' time when searching for music of different styles, we exploit the advantages and characteristics of CNNs on audio to implement a music genre classification model. In pre-processing, Librosa is used to convert the original audio files into their corresponding Mel spectrograms. Each converted Mel spectrogram is then fed into the proposed CNN model for training. Majority voting is applied to the decisions of the 10 classifiers, and the average accuracy obtained on the GTZAN dataset is 84%.