JOURNAL ARTICLE

Distillation-Centric Approaches in Visual Question Answering with Mixture of Experts

H. Hoang, Tung D. Le, Nguyen Tien Huy

Year: 2025 Journal: Research and Development on Information and Communication Technology Pages: 5-5

Abstract

Recent advances in computer vision and natural language processing have been applied to the Visual Question Answering task. However, many of the most accurate models have very large architectures, which hinders bringing the technology to practical applications such as assistive devices for blind and visually impaired users, and other related fields. Our research focuses on compressing a Visual Question Answering model on a Vietnamese dataset using knowledge distillation. To improve accuracy further, we also develop a Mixture of ViVQA Experts system that adapts to each question type, adding only a few parameters and avoiding the cost of retraining the entire system from scratch. With a total of 204M parameters, this approach reduces model size by 24.51% compared to the original model while reducing accuracy by only 6.59% on the overall test set. By question type, accuracy improves over our distillation model by 1.35% on "number" questions and by 0.48% on "color" questions. The code and pretrained models are available at: anonymous.
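The abstract does not state which distillation objective is used. As a minimal sketch, assuming the common soft-target formulation of knowledge distillation (a temperature-softened KL term between teacher and student plus a cross-entropy term on the true answer), the loss for one example could look like the following. The function names, the `temperature` and `alpha` parameters, and the toy logits are illustrative assumptions, not details taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical per-example loss: weighted soft-target and hard-label terms.

    alpha weights the teacher-matching term; (1 - alpha) weights the
    ordinary cross-entropy against the ground-truth answer index.
    """
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft term's gradient scale comparable
    # across temperatures in the standard formulation.
    soft_term = (temperature ** 2) * kl_divergence(soft_teacher, soft_student)
    hard_term = -math.log(softmax(student_logits)[label])
    return alpha * soft_term + (1 - alpha) * hard_term

# Toy usage: three candidate answers, the correct one at index 2.
loss = distillation_loss([0.5, 1.0, 2.0], [0.2, 0.8, 3.0], label=2)
```

When the student's logits match the teacher's, the soft term vanishes and only the hard-label term remains, which is the sense in which the smaller model is trained to imitate the larger one.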

Related Documents

JOURNAL ARTICLE

Adaptive Momentum Mixture-of-Experts for Continual Visual Question Answering

Tianyu Huai, Jie Zhou, Qin Chen, Qingchun Bai, Ze Zhou, Xipeng Qiu, Liang He

Journal:   IEEE Transactions on Circuits and Systems for Video Technology Year: 2025 Pages: 1-1
BOOK-CHAPTER

Answer Distillation for Visual Question Answering

Zhiwei Fang, Jing Liu, Qu Tang, Yong Li, Hanqing Lu

Lecture notes in computer science Year: 2019 Pages: 72-87