JOURNAL ARTICLE

Audio style conversion using deep learning

Aakash Ezhilan, R. Dheekksha, S. Shridevi

Year: 2021 Journal: International Journal of Applied Science and Engineering Vol: 18 (5) Pages: 1-8

Abstract

Style transfer is one of the most popular uses of neural networks. It has been thoroughly researched for images, for example by extracting the style of famous paintings and applying it to other images to create synthetic paintings. Generative adversarial networks (GANs) are used to achieve this. This paper explores ways in which the same results can be achieved for audio-related tasks, for which a plethora of new applications can be found. Different techniques for transferring audio style are analyzed and implemented, specifically changing the gender of the voice in the audio. The Crowdsourced high-quality UK and Ireland English Dialect speech data set was used. The input is a male or female waveform, and the network synthesizes the opposite gender’s waveform while the spoken content remains the same. Different architectures are explored, from naive techniques that train convolutional neural networks (CNNs) directly on audio waveforms to more extensive algorithms originally researched for image style conversion, in which spectrograms are generated (using GANs) and then trained on CNNs. This research has a broader scope in converting music from one genre to another, identifying synthetic voices, and curating voices for AIs based on preference.
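
The abstract mentions turning waveforms into spectrograms before applying image-style networks. As a rough illustration of that preprocessing step only, here is a minimal sketch of a magnitude spectrogram computed with a short-time Fourier transform in NumPy. It is not the authors' implementation: the function name, frame length, hop size, sample rate, and the synthetic 250 Hz test tone are all assumptions chosen for the example.

```python
import numpy as np

def magnitude_spectrogram(wave, frame_len=256, hop=128):
    """Return STFT magnitudes with shape (n_frames, frame_len // 2 + 1).

    Hypothetical helper for illustration; frame_len and hop are assumed values.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(wave) - frame_len) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack(
        [wave[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Real FFT of each frame; keep only the magnitude.
    return np.abs(np.fft.rfft(frames, axis=1))

# Synthetic one-second tone standing in for a speech recording (assumption).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
wave = np.sin(2 * np.pi * 250.0 * t)

spec = magnitude_spectrogram(wave)
# Frequency bin with the most energy on average, converted to Hz.
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 256
```

The resulting 2-D array can be treated like a single-channel image, which is what makes image-oriented CNN and GAN architectures applicable to audio in the first place.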

Keywords:
Computer science, Spectrogram, Scope (computer science), Convolutional neural network, Style (visual arts), Set (abstract data type), Generative grammar, Speech recognition, Artificial intelligence, Artificial neural network, Natural language processing, Multimedia, Art, Visual arts

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.29
Refs: 0
Citation Normalized Percentile: 0.54

Topics

Speech and Audio Processing
Physical Sciences → Computer Science → Signal Processing
Music and Audio Processing
Physical Sciences → Computer Science → Signal Processing
Speech Recognition and Synthesis
Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Audio style conversion using deep learning

Aakash Ezhilan, R. Dheekksha, S. Shridevi

Journal: Greater South Information System Year: 2021
BOOK-CHAPTER

Audio to Text Conversion Using Deep Learning

K. S. Kalaivani, S. M. Thissyakkanna, R. Sanjay, S. K. Nirenjhanram

Communications in Computer and Information Science Year: 2025 Pages: 69-81
JOURNAL ARTICLE

Font Style Conversion Based on Deep Learning

Da Lv, Yijun Liu

Journal: Proceedings of the 2018 International Conference on Network, Communication, Computer Engineering (NCCE 2018) Year: 2018