JOURNAL ARTICLE

Generating Sample-Based Musical Instruments Using Neural Audio Codec Language Models

Shahan Nercessian, Johannes Imort, Ninon Devis, Frederik Blang

Year: 2024
Journal: Zenodo (CERN European Organization for Nuclear Research)
Publisher: European Organization for Nuclear Research

Abstract

In this paper, we propose and investigate the use of neural audio codec language models for the automatic generation of sample-based musical instruments based on text or reference audio prompts. Our approach extends a generative audio framework to condition on pitch across an 88-key spectrum, velocity, and a combined text/audio embedding. We identify maintaining timbral consistency within the generated instruments as a major challenge. To tackle this issue, we introduce three distinct conditioning schemes. We analyze our methods through objective metrics and human listening tests, demonstrating that our approach can produce compelling musical instruments. Specifically, we introduce a new objective metric to evaluate the timbral consistency of the generated instruments and adapt the average Contrastive Language-Audio Pretraining (CLAP) score for the text-to-instrument case, noting that its naive application is unsuitable for assessing this task. Our findings reveal a complex interplay between timbral consistency, the quality of generated samples, and their correspondence to the input prompt.

Keywords:
Codec, Active listening, Consistency (knowledge bases), Metric (unit), Quality (philosophy), Sound quality, Musical, Language model

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.42

Topics

Music Technology and Sound Studies
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Music and Audio Processing
Physical Sciences → Computer Science → Signal Processing
Generative Adversarial Networks and Image Synthesis
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Exploring Neural Audio Codec-based Contrastive Language-Audio Pre-training

Huang, Du

Journal: Zenodo (CERN European Organization for Nuclear Research)
Year: 2024