JOURNAL ARTICLE

Multimodal representation fusion method for dense video captioning

Haojie Fang, Yonggang Li, Yingjian Li

Year: 2025; Journal: Knowledge-Based Systems; Vol: 324; Pages: 113856-113856; Publisher: Elsevier BV
Keywords:
Closed captioning, Representation (politics), Computer science, Fusion, Artificial intelligence, Natural language processing, Computer vision, Speech recognition, Linguistics, Image (mathematics)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 52
Citation Normalized Percentile: 0.18

Topics

Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Improving Dense Video Captioning with a Transformer-based Multimodal Fusion Model

Yixuan Liu, Ziwei Zhou, Shen Hui, Haoyuan Ma, Hong‐Ju Li, Zhibo Zhang

Journal: Journal of Industry and Engineering Management; Year: 2024; Vol: 2 (4); Pages: 33-40
JOURNAL ARTICLE

Event-Centric Hierarchical Representation for Dense Video Captioning

Teng Wang, Huicheng Zheng, Mingjing Yu, Qian Tian, Haifeng Hu

Journal: IEEE Transactions on Circuits and Systems for Video Technology; Year: 2020; Vol: 31 (5); Pages: 1890-1900
JOURNAL ARTICLE

Dense Video Captioning With Early Linguistic Information Fusion

Nayyer Aafaq, Ajmal Mian, Naveed Akhtar, Wei Liu, Mubarak Shah

Journal: IEEE Transactions on Multimedia; Year: 2022; Vol: 25; Pages: 2309-2322