BOOK-CHAPTER

Multimodal Large Language Models for Video Understanding

Yi Wang, Jiashuo Yu, Yinan He, Limin Wang, Yu Qiao

Year: 2025; Advances in Computer Vision and Pattern Recognition; Pages: 59-91; Publisher: Springer International Publishing
Keywords: Computer science

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 51
Citation Normalized Percentile: 0.63
Is in top 1%
Is in top 10%

Topics

Multimodal Machine Learning Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Human Pose and Action Recognition
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Video Analysis and Summarization
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Surveillance Video-and-Language Understanding: From Small to Large Multimodal Models

Tongtong Yuan, Xuange Zhang, Bo Liu, Kun Liu, Jian Gang Jin, Zhenzhen Jiao

Journal: IEEE Transactions on Circuits and Systems for Video Technology; Year: 2024; Vol: 35 (1); Pages: 300-314

JOURNAL ARTICLE

Large Language Models (LLMs) for Video Understanding

Journal: IEEE Transactions on Circuits and Systems for Video Technology; Year: 2024; Vol: 34 (10); Pages: 9758-9758