JOURNAL ARTICLE

Panoptic Segmentation-Based Attention for Image Captioning

Wenjie Cai, Zheng Xiong, Xianfang Sun, Paul L. Rosin, Longcun Jin, Xinyi Peng

Year: 2020   Journal: Applied Sciences   Vol: 10 (1)   Pages: 391-391   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Image captioning is the task of generating textual descriptions of images. Attention mechanisms have been widely adopted in image captioning to obtain a better image representation. However, in existing models with detection-based attention, the rectangular attention regions are not fine-grained: they contain irrelevant areas around the object (e.g., background or overlapping objects), which leads the model to generate inaccurate captions. To address this issue, we propose panoptic segmentation-based attention, which performs attention at the mask level (i.e., over the shape of the main part of an instance). Our approach extracts feature vectors from the corresponding segmentation regions, which are more fine-grained than the regions used by current attention mechanisms. Moreover, to process features of different classes independently, we propose a dual-attention module that is generic and can be applied to other frameworks. Experimental results showed that our model could recognize overlapping objects and understand the scene better. Our approach achieved competitive performance against state-of-the-art methods. We made our code available.
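
The difference between rectangular and mask-level attention regions can be illustrated with a small sketch. This is not the authors' released code: it assumes NumPy, a toy feature map of shape (H, W, C), and hypothetical helper names (`mask_pool`, `box_pool`, `soft_attention`). It only shows how pooling over a segmentation mask excludes the background that a bounding box would include, followed by a generic dot-product soft attention over the pooled region features.

```python
import numpy as np


def mask_pool(feature_map, mask):
    """Average the feature vectors at positions where `mask` is True.

    feature_map: (H, W, C) array; mask: (H, W) boolean array.
    """
    mask = mask.astype(bool)
    if not mask.any():
        return np.zeros(feature_map.shape[-1], dtype=feature_map.dtype)
    return feature_map[mask].mean(axis=0)


def box_pool(feature_map, mask):
    """Average over the tight bounding box around `mask` (detection-style region)."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return np.zeros(feature_map.shape[-1], dtype=feature_map.dtype)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    box = feature_map[y0:y1, x0:x1]
    return box.reshape(-1, feature_map.shape[-1]).mean(axis=0)


def soft_attention(region_feats, query):
    """Generic dot-product soft attention (an assumption, not the paper's exact module).

    region_feats: (N, C) array of pooled region features; query: (C,) array.
    Returns the attended feature (C,) and the attention weights (N,).
    """
    scores = region_feats @ query
    weights = np.exp(scores - scores.max())  # subtract max for numerical stability
    weights /= weights.sum()
    return weights @ region_feats, weights


# Toy example: an L-shaped object (feature value 1) on a zero background.
feature_map = np.zeros((4, 4, 2))
mask = np.zeros((4, 4), dtype=bool)
mask[0:3, 0] = True
mask[2, 0:3] = True
feature_map[mask] = 1.0

mask_feat = mask_pool(feature_map, mask)  # uses only the 5 object pixels
box_feat = box_pool(feature_map, mask)    # 3x3 box also averages in 4 background pixels

regions = np.stack([mask_feat, box_feat])
attended, weights = soft_attention(regions, query=np.ones(2))
```

In this toy case the mask-pooled feature keeps the pure object value, while the box-pooled feature is diluted by the background pixels inside the rectangle, which is the effect the abstract attributes to detection-based attention.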

Keywords:
Closed captioning; Computer science; Artificial intelligence; Segmentation; Process (computing); Feature (linguistics); Image (mathematics); Representation (politics); Task (project management); Object (grammar); Computer vision; Pattern recognition (psychology); Linguistics

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.31
Refs: 69
Citation Normalized Percentile: 0.54

Topics

Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition