JOURNAL ARTICLE

Multimodal attention-based transformer for video captioning

M. Hemalatha, Charu Chandra

Year: 2023 Journal: Applied Intelligence Vol: 53 (20) Pages: 23349-23368 Publisher: Springer Science+Business Media
Keywords:
Computer science; Closed captioning; Transformer; Encoder; Artificial intelligence; Convolutional neural network; Embedding; Block (permutation group theory); Pattern recognition (psychology); Computer vision; Image (mathematics)

Metrics

Cited By: 9
FWCI (Field Weighted Citation Impact): 1.64
Refs: 63
Citation Normalized Percentile: 0.81

Topics

Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

BOOK-CHAPTER

Multimodal Interaction Fusion Network Based on Transformer for Video Captioning

Hui Xu, Pengpeng Zeng, Abdullah Aman Khan

Communications in Computer and Information Science Year: 2022 Pages: 21-36

JOURNAL ARTICLE

UAT: Universal Attention Transformer for Video Captioning

Heeju Im, Yong Suk Choi

Journal: Sensors Year: 2022 Vol: 22 (13) Pages: 4817-4817

BOOK-CHAPTER

Diffusion-Based Multimodal Video Captioning

J. Kainulainen, Zixin Guo, Jorma Laaksonen

Lecture Notes in Computer Science Year: 2024 Pages: 148-165