JOURNAL ARTICLE

Nonintrusive Perceptual Audio Quality Assessment for User-Generated Content Using Deep Learning

Deebha Mumtaz, Vinit Jakhetiya, Karan Nathwani, Badri Narayan Subudhi, Sharath Chandra Guntuku

Year: 2021   Journal: IEEE Transactions on Industrial Informatics   Vol: 18 (11)   Pages: 7780-7789   Publisher: Institute of Electrical and Electronics Engineers

Abstract

With the boom of social media communication, teleconferencing, and online classes, audiovisual communication over bandwidth-strained networks has become an integral part of our lives. Consequently, the growing demand for the quality of experience necessitates developing algorithms to measure and enrich user experience. Prior studies have mainly focused on assessing speech quality and intelligibility with reference to audio quality assessment, while other categories in user-generated multimedia (UGM) are less explored. Moreover, frequency-domain properties of speech and UGM audio are significantly different from each other. Furthermore, there is a lack of a standard dataset for the quality assessment of UGM. Considering these limitations, in this article, we first develop the IIT-JMU-UGM audio dataset consisting of 1150 audio clips, with diverse context, content, and types of degradation commonly observed in real-world scenarios and annotated with the subjective quality scores. Finally, we propose a non-intrusive audio quality assessment metric using a stacked gated-recurrent-unit-based deep learning framework. The proposed model outperforms several baseline methods, including state-of-the-art non-intrusive and intrusive approaches. The resulting Pearson's correlation coefficient of 0.834 indicates that the proposed method efficiently mirrors human auditory perception.

Keywords:
Computer science; Intelligibility (philosophy); Video quality; Speech recognition; Sound quality; Multimedia; Quality of experience; Metric (unit); Artificial intelligence; Quality of service; Telecommunications

Metrics

Cited By: 11
FWCI (Field Weighted Citation Impact): 1.30
Refs: 39
Citation Normalized Percentile: 0.82

Topics

Speech and Audio Processing: Physical Sciences → Computer Science → Signal Processing
Hearing Loss and Rehabilitation: Life Sciences → Neuroscience → Cognitive Neuroscience
Music and Audio Processing: Physical Sciences → Computer Science → Signal Processing