JOURNAL ARTICLE

Multi-Semantic Alignment Co-Reasoning Network for Video Question Answering

Abstract

Video question answering requires models to understand textual questions of varying complexity and to search for clues in visual content with different hierarchical semantics. In this paper, we propose a novel Multi-Semantic Alignment Co-Reasoning Network (MACN) that performs interactive inference between the question and the video input. MACN comprises two modules: Question-Centric Interaction (QCI) and Contextual Semantic Reasoning (CSR). Specifically, QCI builds a question-centric heterogeneous graph that aligns visual content at different temporal scales with the question, so that visual representations are extracted under a better understanding of the text. CSR uses self-attention to capture the contextual dependencies among visual semantics at different hierarchies and thereby co-reason over answer clues. Experiments on three benchmarks show that the proposed method outperforms previous state-of-the-art methods.
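The two ideas named in the abstract, question-guided alignment across temporal scales and self-attention across the aligned representations, can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`question_guided_pooling`, `self_attention`), the plain dot-product attention standing in for the heterogeneous graph, the two temporal scales, and the 4-dimensional toy vectors are all assumptions made for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    # Scaled dot-product attention of one query vector over key/value vectors.
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def question_guided_pooling(question, scales):
    # Hypothetical stand-in for question-centric alignment: at each temporal
    # scale, pool the segment features using the question vector as the query.
    return [attend(question, segments, segments) for segments in scales]

def self_attention(nodes):
    # Hypothetical stand-in for contextual reasoning: every scale-level
    # representation attends to all of them (itself included).
    return [attend(node, nodes, nodes) for node in nodes]

# Toy inputs (assumed): one question vector and features at two temporal scales.
question = [1.0, 0.0, 0.0, 0.0]
frame_level = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
clip_level = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]

aligned = question_guided_pooling(question, [frame_level, clip_level])
fused = self_attention(aligned)
```

In this sketch, `aligned` holds one question-weighted vector per temporal scale, and `fused` mixes those vectors so each scale sees context from the others. A real model would use learned projections and a classifier head on top, which are omitted here.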

Keywords:
Computer science; Question answering; Semantics (computer science); Inference; Exploit; Artificial intelligence; Graph; Natural language processing; Theoretical computer science; Programming language

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.18
Refs: 18
Citation Normalized Percentile: 0.42

Topics

Multimodal Machine Learning Applications
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Video Question Answering With Semantic Disentanglement and Reasoning

Jin Liu, Guoxiang Wang, Jialong Xie, Fengyu Zhou, Huijuan Xu

Journal: IEEE Transactions on Circuits and Systems for Video Technology, Year: 2023, Vol: 34 (5), Pages: 3663-3673

JOURNAL ARTICLE

Reasoning with Heterogeneous Graph Alignment for Video Question Answering

Jiang Pin, Yahong Han

Journal: Proceedings of the AAAI Conference on Artificial Intelligence, Year: 2020, Vol: 34 (07), Pages: 11109-11116

JOURNAL ARTICLE

Collaborative Aware Bidirectional Semantic Reasoning for Video Question Answering

Xize Wu, Jiasong Wu, Lei Zhu, Lotfi Senhadji, Huazhong Shu

Journal: IEEE Transactions on Circuits and Systems for Video Technology, Year: 2024, Vol: 35 (3), Pages: 2074-2086