JOURNAL ARTICLE

VPPT: Visual Pre-Trained Prompt Tuning Framework for Few-Shot Image Classification

Abstract

Large-scale pre-trained transformers have recently achieved remarkable success in several computer vision tasks. However, fully fine-tuning these models for downstream tasks remains highly challenging due to their expensive computational and storage costs. Recently, Parameter-Efficient Tuning (PETuning) techniques, e.g., Visual Prompt Tuning (VPT), have significantly reduced the computation cost by inserting lightweight prompt modules, such as prompt tokens or adapter layers, into the pre-trained models and tuning only these modules with a small number of trainable parameters, while keeping the transformer backbone frozen. Although these methods achieve encouraging results, existing PETuning approaches perform poorly under few-shot learning settings (i.e., extremely limited training data, with only 1 or 2 shots per class) because the supervision signal is scarce. To this end, we first empirically identify that the poor performance stems mainly from the inappropriate initialization of the prompt modules, an effect that has also been observed in pre-trained language models. We then propose a Visual Pre-trained Prompt Tuning framework (VPPT), which first pre-trains the prompt modules and then leverages these pre-trained modules, together with the pre-trained transformer backbone, to perform prompt tuning on downstream tasks. Extensive experiments show that our VPPT framework achieves an absolute improvement of 16.08% in average accuracy under the 1-shot setting across five fine-grained visual classification datasets, compared with previous PETuning techniques such as VPT, in few-shot image classification.
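The core mechanism the abstract describes, prepending a small set of trainable prompt tokens to the input sequence of a frozen Vision Transformer, can be illustrated with a minimal NumPy sketch. This is a toy illustration of VPT-style prompt insertion only, not the paper's VPPT implementation; the function name `insert_prompts` and the shape choices (196 patches and embedding dimension 768, the ViT-B/16 conventions) are assumptions for the example.

```python
import numpy as np

def insert_prompts(patch_tokens, prompt_tokens, cls_token):
    """Prepend the [CLS] token and learnable prompt tokens to patch embeddings.

    patch_tokens:  (batch, n_patches, dim) -- from the frozen backbone
    prompt_tokens: (n_prompts, dim)        -- the only trainable parameters
    cls_token:     (dim,)                  -- frozen classification token
    """
    b = patch_tokens.shape[0]
    # Broadcast the shared [CLS] and prompt tokens across the batch.
    cls = np.broadcast_to(cls_token, (b, 1, cls_token.shape[0]))
    prompts = np.broadcast_to(prompt_tokens, (b,) + prompt_tokens.shape)
    # The frozen transformer then attends over [CLS] + prompts + patches.
    return np.concatenate([cls, prompts, patch_tokens], axis=1)

rng = np.random.default_rng(0)
x = insert_prompts(
    rng.normal(size=(2, 196, 768)),  # 14x14 patches, ViT-B embedding dim
    rng.normal(size=(5, 768)),       # 5 prompt tokens (illustrative count)
    rng.normal(size=(768,)),
)
# Sequence length grows from 196 to 1 + 5 + 196 = 202.
```

During training, gradients would flow only to `prompt_tokens`; the paper's contribution is that these tokens are themselves pre-trained before downstream tuning, rather than randomly initialized.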

Keywords:
Computer science, Transformer, One shot, Initialization, Artificial intelligence, Machine learning, Pattern recognition, Contextual image classification, Computer vision, Algorithm, Engineering

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 1.02
Refs: 32
Citation Normalized Percentile: 0.75

Topics

Domain Adaptation and Few-Shot Learning (Physical Sciences → Computer Science → Artificial Intelligence)
Advanced Neural Network Applications (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Advanced Image and Video Retrieval Techniques (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)

Related Documents

JOURNAL ARTICLE

PPT: Pre-trained Prompt Tuning for Few-shot Learning

Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang

Journal: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) Year: 2022 Pages: 8410-8423
JOURNAL ARTICLE

MVP: Meta Visual Prompt Tuning for Few-Shot Remote Sensing Image Scene Classification

Junjie Zhu, Yiying Li, K.C. Yang, Naiyang Guan, Zunlin Fan, Chunping Qiu, Xiaodong Yi

Journal: IEEE Transactions on Geoscience and Remote Sensing Year: 2024 Vol: 62 Pages: 1-13
JOURNAL ARTICLE

Prompt tuning with preference ranking for few-shot pre-trained decision transformer

Shengchao Hu, Li Shen, Ya Zhang, Dacheng Tao

Journal: Science China Information Sciences Year: 2026 Vol: 69 (1)
JOURNAL ARTICLE

Few-shot medical relation extraction via prompt tuning enhanced pre-trained language model

Guoxiu He, Chen Huang

Journal: Neurocomputing Year: 2025 Vol: 633 Pages: 129752
JOURNAL ARTICLE

Build a Good Human-Free Prompt Tuning: Jointly Pre-Trained Template and Verbalizer for Few-Shot Classification

Mouxiang Chen, Han Fu, Chenghao Liu, Xiaoyun Joy Wang, Zhuo Li, Jianling Sun

Journal: IEEE Transactions on Knowledge and Data Engineering Year: 2025 Vol: 37 (5) Pages: 2253-2265