JOURNAL ARTICLE

Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers

Zhiqi Li, Wenhai Wang, Enze Xie, Zhiding Yu, Anima Anandkumar, José M. Alvarez, Ping Luo, Tong Lu

Year: 2022 Journal: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Pages: 1270-1279

Abstract

Panoptic segmentation combines semantic segmentation and instance segmentation, dividing image contents into two types: things and stuff. We present Panoptic SegFormer, a general framework for panoptic segmentation with transformers. It contains three innovative components: an efficient deeply-supervised mask decoder, a query decoupling strategy, and an improved post-processing method. We also use Deformable DETR, a fast and efficient version of DETR, to process multi-scale features. Specifically, we supervise the attention modules in the mask decoder in a layer-wise manner. This deep supervision strategy lets the attention modules quickly focus on meaningful semantic regions. It improves performance and halves the number of required training epochs compared to Deformable DETR. Our query decoupling strategy separates the responsibilities of the query set and avoids mutual interference between things and stuff. In addition, our post-processing strategy improves performance at no additional cost by jointly considering classification and segmentation quality when resolving conflicting mask overlaps. Our approach improves accuracy by 6.2% PQ over the baseline DETR model. Panoptic SegFormer achieves state-of-the-art results on COCO test-dev with 56.2% PQ. It also shows stronger zero-shot robustness than existing methods.
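The abstract does not spell out the post-processing procedure, so the following is only an illustrative sketch of the general idea of resolving overlaps by a joint classification and segmentation score. It assumes the score is the product of the class confidence and the mean soft-mask value over the mask's foreground pixels, that masks are binarized at 0.5, and that a mask is dropped when too little of it remains unoccupied. The function name `merge_masks`, the `keep_ratio` parameter, and the toy inputs are hypothetical and are not taken from the paper.

```python
def merge_masks(class_probs, mask_probs, labels, keep_ratio=0.5):
    """Greedy, confidence-ordered merging of overlapping soft masks.

    class_probs: per-query classification confidence in [0, 1]
    mask_probs:  per-query soft masks, each an H x W nested list in [0, 1]
    labels:      per-query class labels
    Returns (segment_map, segments); segment_map holds 0 for unassigned pixels.
    All thresholds here are assumptions made for illustration.
    """
    height, width = len(mask_probs[0]), len(mask_probs[0][0])

    # Score each query by classification confidence times mask quality.
    scored = []
    for idx, (cls_p, mask) in enumerate(zip(class_probs, mask_probs)):
        fg = [(y, x) for y in range(height) for x in range(width)
              if mask[y][x] > 0.5]          # assumed binarization threshold
        if not fg:
            continue                        # empty mask, nothing to place
        mask_quality = sum(mask[y][x] for y, x in fg) / len(fg)
        scored.append((cls_p * mask_quality, idx, fg))

    # Higher-scoring masks claim their pixels first.
    scored.sort(key=lambda item: item[0], reverse=True)

    segment_map = [[0] * width for _ in range(height)]
    segments = []
    for score, idx, fg in scored:
        free = [(y, x) for y, x in fg if segment_map[y][x] == 0]
        # Drop masks that are mostly covered by higher-scoring ones.
        if len(free) / len(fg) < keep_ratio:
            continue
        seg_id = len(segments) + 1
        for y, x in free:
            segment_map[y][x] = seg_id
        segments.append({"id": seg_id, "label": labels[idx], "score": score})
    return segment_map, segments


# Toy example: two 2 x 3 soft masks that overlap in the middle column.
masks = [
    [[0.9, 0.9, 0.1], [0.9, 0.9, 0.1]],   # query 0 covers columns 0 and 1
    [[0.1, 0.8, 0.8], [0.1, 0.8, 0.8]],   # query 1 covers columns 1 and 2
]
segment_map, segments = merge_masks([0.9, 0.6], masks, ["cat", "sky"])
print(segment_map)
```

In this toy case the higher-scoring query keeps the contested middle column, and the lower-scoring query survives with its remaining pixels because exactly half of its mask is still free. Ordering by a combined score, rather than by class confidence alone, is what lets a well-segmented but less confidently classified mask win an overlap.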

Keywords:
Computer science, Segmentation, Robustness (evolution), Artificial intelligence, Computer vision, Image segmentation

Metrics

Cited By: 137
FWCI (Field Weighted Citation Impact): 9.39
Refs: 67
Citation Normalized Percentile: 0.98

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Time-Space Transformers for Video Panoptic Segmentation

Andra Petrovai, Sergiu Nedevschi

Journal: 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Year: 2022

JOURNAL ARTICLE

Delving Deeper Into Astromorphic Transformers

Md Zesun Ahmed Mia, Malyaban Bal, Abhronil Sengupta

Journal: IEEE Transactions on Cognitive and Developmental Systems Year: 2025 Vol: 17 (6) Pages: 1436-1446

JOURNAL ARTICLE

CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation

Qihang Yu, Huiyu Wang, Dahun Kim, Siyuan Qiao, Maxwell D. Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

Journal: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Year: 2022 Pages: 2550-2560