Most recent research in image captioning adopts an attention mechanism within the encoder-decoder framework, where the attention module aligns input features for the decoder and consequently boosts performance. A common defect of traditional attention methods is that they ignore the inequality among different types of inputs, leaving certain informative features under-exploited. In this paper, we propose a novel cascade attention module that processes different types of inputs sequentially. The cascade attention module lets inputs of higher priority affect the attention over other inputs, thereby emphasizing this inequality. We implement our model by introducing the global feature of the image into the captioning process of R-CNN-based frameworks; this feature is rich in context information but has little effect under traditional attention modules. Experimental results demonstrate that our proposed method effectively exploits features of different types, achieving improvements on multiple automatic metrics.
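The cascade idea described above — a higher-priority input (the global image feature) shaping the attention computed over other inputs (R-CNN region features) — can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes simple dot-product attention, a single decoder hidden state, and additive fusion of the global feature into the attention query; all function names and shapes are hypothetical.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def cascade_attention(hidden, global_feat, region_feats):
    """Hypothetical two-stage cascade (a sketch, not the paper's exact model).

    hidden:       (d,)   decoder hidden state
    global_feat:  (d,)   higher-priority global image feature
    region_feats: (n, d) R-CNN region features
    """
    # Stage 1: the higher-priority global feature modulates the query,
    # so global context influences how regions are attended.
    query = hidden + global_feat
    # Stage 2: attend region features with the modulated query.
    scores = region_feats @ query        # (n,)
    weights = softmax(scores)            # attention distribution over regions
    context = weights @ region_feats     # (d,) weighted sum of regions
    return context, weights
```

In a plain (non-cascaded) attention module, `query` would be `hidden` alone; the cascade makes the global feature a first-class input that biases the second attention stage rather than being averaged in as just another key.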
Qiujuan Tong, Chan He, Jiaqi Li, Yifan Li