JOURNAL ARTICLE

Cascade Attention: Multiple Feature Based Learning for Image Captioning

Abstract

Most recent research in image captioning adopts an attention mechanism within the encoder-decoder framework, where the attention module aligns input features for the decoder and thereby improves performance. A common shortcoming of traditional attention methods is that they ignore the inequality among different types of inputs, which leaves certain informative features under-exploited. In this paper, we propose a novel cascade attention module that processes different types of input sequentially. The module lets higher-priority inputs influence the attention over the other inputs, thereby emphasizing this inequality. We implement our model by introducing the global feature of the image into the captioning process of R-CNN based frameworks, where this feature is rich in context information but has little effect when passed through a traditional attention module. Experimental results demonstrate that the proposed method is able to exploit features of different types, yielding improvements on multiple automatic metrics.
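
The abstract does not give the module's equations, so the following is only a minimal sketch of the general idea: feature groups are attended in priority order, and the context obtained from an earlier group changes the query used for the next one. The dot-product scoring, the additive query update, and all names (`attend`, `cascade_attention`, the toy vectors) are assumptions made for illustration, not the authors' formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, features):
    """Dot-product attention (assumed scoring): returns (context, weights)."""
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    context = [
        sum(w * feat[i] for w, feat in zip(weights, features))
        for i in range(len(query))
    ]
    return context, weights


def cascade_attention(query, feature_groups):
    """Attend over groups in priority order (highest priority first).

    After each stage the context is added to the query (an assumed,
    deliberately simple conditioning step), so earlier groups influence
    the attention weights computed for later groups.
    """
    contexts, all_weights = [], []
    for features in feature_groups:
        context, weights = attend(query, features)
        contexts.append(context)
        all_weights.append(weights)
        query = [q + c for q, c in zip(query, context)]
    return contexts, all_weights


# Toy example: one global image feature, then two region features.
hidden = [1.0, 0.0]                        # hypothetical decoder state
global_feats = [[0.0, 2.0]]                # hypothetical global feature
region_feats = [[1.0, 0.0], [0.0, 1.0]]    # hypothetical region features

_, plain_weights = attend(hidden, region_feats)
_, cascade_weights = cascade_attention(hidden, [global_feats, region_feats])
```

In this toy setting, plain attention favors the first region, while the cascaded version favors the second, because the global feature has shifted the query before the regions are scored.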

Keywords:
Closed captioning; Computer science; Feature (linguistics); Cascade; Context (archaeology); Encoder; Exploit; Process (computing); Artificial intelligence; Image (mathematics); Feature extraction; Pattern recognition (psychology); Machine learning; Engineering

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 0.21
Refs: 24
Citation Normalized Percentile: 0.54

Topics

Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

ATTENTION BASED IMAGE CAPTIONING USING DEEP LEARNING

Tejaswini Nakirekanti, D. Deepika

Journal: International Journal of Innovative Research in Advanced Engineering Year: 2021 Vol: 8 (12) Pages: 379-387

JOURNAL ARTICLE

Auxiliary feature extractor and dual attention-based image captioning

Qian Zhao, Guichang Wu

Journal: Signal Image and Video Processing Year: 2024 Vol: 18 (4) Pages: 3615-3626

BOOK-CHAPTER

Multiple-Level Feature-Based Network for Image Captioning

Kaidi Zheng, Chen Zhu, Shaopeng Lu, Yonggang Liu

Lecture Notes in Computer Science Year: 2018 Pages: 94-103