JOURNAL ARTICLE

AudioDec: An Open-Source Streaming High-Fidelity Neural Audio Codec

Abstract

A good audio codec for live applications such as telecommunication is characterized by three key properties: (1) compression, i.e., the bitrate that is required to transmit the signal should be as low as possible; (2) latency, i.e., encoding and decoding the signal needs to be fast enough to enable communication without or with only minimal noticeable delay; and (3) reconstruction quality of the signal. In this work, we propose an open-source, streamable, and real-time neural audio codec that achieves strong performance along all three axes: it can reconstruct highly natural sounding 48 kHz speech signals while operating at only 12 kbps and running with less than 6 ms (GPU)/10 ms (CPU) latency. An efficient training paradigm is also demonstrated for developing such neural audio codecs for real-world scenarios. Both objective and subjective evaluations using the VCTK corpus are provided. To sum up, AudioDec is a well-developed plug-and-play benchmark for audio codec applications.
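
To put the 12 kbps figure in context, the short sketch below compares it with uncompressed PCM at the same 48 kHz sampling rate. The 16-bit mono PCM baseline is an assumption made here for illustration and is not stated in the abstract; the helper names are hypothetical.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> float:
    """Bitrate of uncompressed PCM audio in kbps (1 kbps = 1000 bit/s)."""
    return sample_rate_hz * bit_depth * channels / 1000.0


def compression_ratio(raw_kbps: float, coded_kbps: float) -> float:
    """How many times smaller the coded stream is than the raw stream."""
    return raw_kbps / coded_kbps


# Assumption: the reference signal is 16-bit mono PCM sampled at 48 kHz.
raw = pcm_bitrate_kbps(48_000, 16)

# 12 kbps is the operating bitrate reported in the abstract.
ratio = compression_ratio(raw, 12.0)

print(f"{raw:.0f} kbps raw, {ratio:.0f}x reduction at 12 kbps")
```

Under that assumed baseline, the raw stream is 768 kbps, so the reported operating point corresponds to roughly a 64-fold bitrate reduction.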

Keywords:
Codec; Computer science; Adaptive Multi-Rate audio codec; Speech coding; Decoding methods; Latency (audio); Speech recognition; High fidelity; Audio signal; Codec2; Data compression; Real-time computing; Voice activity detection; Computer hardware; Speech processing; Linear predictive coding; Artificial intelligence; Telecommunications; Engineering

Metrics

Cited By: 52
FWCI (Field Weighted Citation Impact): 13.96
Refs: 49
Citation Normalized Percentile: 0.99
Is in top 1%
Is in top 10%

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing