JOURNAL ARTICLE

Efficient Adapter Tuning of Pre-Trained Speech Models for Automatic Speaker Verification

Abstract

With excellent generalization ability, self-supervised speech models have shown impressive performance on various downstream speech tasks in the pre-training and fine-tuning paradigm. However, with the growing size of pre-trained models, fine-tuning becomes practically infeasible due to heavy computation and storage overhead, as well as the risk of overfitting. Adapters are lightweight modules inserted into pre-trained models to facilitate parameter-efficient adaptation. In this paper, we propose an effective adapter framework designed for adapting self-supervised speech models to the speaker verification task. With a parallel adapter design, our proposed framework inserts two types of adapters into the pre-trained model, allowing adaptation of latent features within intermediate Transformer layers and of output embeddings from all Transformer layers. We conduct comprehensive experiments to validate the efficiency and effectiveness of the proposed framework. Experimental results on the VoxCeleb1 dataset demonstrate that the proposed adapters surpass fine-tuning and other parameter-efficient transfer learning methods, achieving superior performance while updating only 5% of the parameters.
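To make the abstract's description more concrete, the sketch below illustrates the two kinds of adaptation it mentions: a bottleneck adapter running in parallel with a frozen Transformer sub-layer, and a learnable weighted combination of the embeddings from all Transformer layers. This is a minimal illustration under assumed design choices; the class names (ParallelAdapter, LayerAggregator), bottleneck width, and scaling factor are hypothetical and are not taken from the paper.

```python
import torch
import torch.nn as nn


class ParallelAdapter(nn.Module):
    """Bottleneck adapter applied in parallel to a frozen Transformer sub-layer.

    Hypothetical sketch: only the adapter parameters would be trained, while the
    pre-trained sub-layer stays frozen.
    """

    def __init__(self, dim: int, bottleneck: int = 64, scale: float = 1.0):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)  # down-projection
        self.up = nn.Linear(bottleneck, dim)    # up-projection
        self.act = nn.GELU()
        self.scale = scale

    def forward(self, x: torch.Tensor, sublayer_out: torch.Tensor) -> torch.Tensor:
        # The adapter branch takes the same input as the sub-layer and its
        # output is added to the sub-layer output (parallel insertion).
        return sublayer_out + self.scale * self.up(self.act(self.down(x)))


class LayerAggregator(nn.Module):
    """Learnable weighted sum over the output embeddings of all Transformer layers."""

    def __init__(self, num_layers: int):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(num_layers))

    def forward(self, layer_outputs: list[torch.Tensor]) -> torch.Tensor:
        # layer_outputs: one (batch, time, dim) tensor per Transformer layer.
        w = torch.softmax(self.weights, dim=0)
        stacked = torch.stack(layer_outputs, dim=0)         # (layers, batch, time, dim)
        return (w.view(-1, 1, 1, 1) * stacked).sum(dim=0)   # (batch, time, dim)
```

In this reading, ParallelAdapter would adapt latent features inside intermediate Transformer layers, while LayerAggregator would adapt how the outputs of all layers are combined before the speaker-embedding head; both contain only a small fraction of the parameters of the frozen backbone, which is what keeps the update budget around a few percent.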

Keywords:
Computer science, Transformer, Overfitting, Adapter (computing), Speech recognition, Computation, Generalization, Artificial intelligence, Transfer of learning, Machine learning, Artificial neural network, Computer hardware, Algorithm, Engineering

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 3.19
References: 36
Citation Normalized Percentile: 0.88


Topics

Speech Recognition and Synthesis (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Music and Audio Processing (Physical Sciences → Computer Science → Signal Processing)