JOURNAL ARTICLE

Background-Enhanced Visual Prompting Transformer for Generalized Few-Shot Semantic Segmentation

Man Li, Xiaodong Ma

Year: 2025   Journal: Electronics   Vol: 14 (7)   Pages: 1389-1389   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Generalized few-shot semantic segmentation (GFSS), which requires strong segmentation performance on novel classes while retaining performance on base classes, is attracting increasing attention. Recent studies have demonstrated the effectiveness of visual prompts for GFSS, but unresolved issues remain. Because novel-class foreground is confused with background during base-class pre-training, the learned base visual prompts mislead the novel visual prompts during novel-class fine-tuning, leading to sub-optimal results. This paper proposes a background-enhanced visual prompting Transformer (Beh-VPT) to solve this problem. Specifically, we propose background visual prompts, which learn potential novel-class information in the background during base-class pre-training and transfer that information to the novel visual prompts during novel-class fine-tuning via our proposed Hybrid Causal Attention Module. Additionally, we propose a background-enhanced segmentation head that is used in conjunction with the background prompts to enhance the model's capacity for learning novel classes. Because the GFSS setting takes both base and novel classes into account, we introduce Singular Value Fine-Tuning in the non-meta-learning paradigm to further exploit the model's potential. Extensive experiments show that the proposed method achieves state-of-the-art GFSS performance on the PASCAL-5^i and COCO-20^i datasets. For example, on COCO-20^i, mIoU computed over both base and novel classes improves by 0.47% and 1.08% in the one-shot and five-shot scenarios, respectively. In addition, our method does not reduce base-class mIoU relative to the baseline.
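
The abstract reports mIoU separately for base classes and for base and novel classes together. As a minimal sketch of how such figures are commonly computed, the Python below derives per-class intersection-over-union from flattened label lists and averages it over a chosen class subset. The function names, the toy labels, and the split into `base_ids` and `novel_ids` are illustrative assumptions, not the authors' evaluation code; in particular, whether the paper averages over all classes or over the two group means is not stated here.

```python
def per_class_iou(pred, target, num_classes):
    """Return one IoU per class; None for classes absent from both inputs."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel correctly labeled: counts toward intersection and union.
            inter[p] += 1
            union[p] += 1
        else:
            # Mismatch: the pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    return [inter[c] / union[c] if union[c] else None
            for c in range(num_classes)]


def mean_iou(ious, class_ids):
    """Average IoU over the given class ids, skipping absent classes."""
    vals = [ious[c] for c in class_ids if ious[c] is not None]
    return sum(vals) / len(vals) if vals else float("nan")


# Toy example (hypothetical): classes 0 and 1 play the role of base classes,
# class 2 plays the role of a novel class.
pred = [0, 1, 1, 2, 2, 2]
target = [0, 1, 2, 2, 2, 1]
base_ids, novel_ids = [0, 1], [2]

ious = per_class_iou(pred, target, num_classes=3)
base_miou = mean_iou(ious, base_ids)
novel_miou = mean_iou(ious, novel_ids)
total_miou = mean_iou(ious, base_ids + novel_ids)
```

In practice the counts would be accumulated over every pixel of every test image before dividing, rather than computed per image and then averaged.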

Keywords:
Segmentation; Transformer; Computer science; Artificial intelligence; Computer vision; Shot (pellet); Computer graphics (images); Engineering; Materials science; Electrical engineering; Voltage

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 52
Citation Normalized Percentile: 0.07

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition