JOURNAL ARTICLE

Multimodal-enhanced hierarchical attention network for video captioning

Maosheng Zhong, Youde Chen, Hao Zhang, Hao Xiong, Zhixiang Wang

Year: 2023   Journal: Multimedia Systems   Vol: 29 (5)   Pages: 2469-2482   Publisher: Springer Science+Business Media
Keywords:
Computer science, Closed captioning, Decoding methods, Modalities, Transformer, Encoder, Redundancy (engineering), Modality (human–computer interaction), Artificial intelligence, Context (archaeology), Speech recognition, Image (mathematics), Algorithm

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.55
Refs: 45
Citation Normalized Percentile: 0.60

Topics

Multimodal Machine Learning Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Video Analysis and Summarization
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Human Pose and Action Recognition
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Syntax-Guided Hierarchical Attention Network for Video Captioning

Jincan Deng, Liang Li, Beichen Zhang, Shuhui Wang, Zheng-Jun Zha, Qingming Huang

Journal: IEEE Transactions on Circuits and Systems for Video Technology   Year: 2021   Vol: 32 (2)   Pages: 880-892
JOURNAL ARTICLE

Stacked Multimodal Attention Network for Context-Aware Video Captioning

Yi Zheng, Yuejie Zhang, Rui Feng, Tao Zhang, Weiguo Fan

Journal: IEEE Transactions on Circuits and Systems for Video Technology   Year: 2021   Vol: 32 (1)   Pages: 31-42