Embodied Question Answering Multi-Step Ahead Image Prediction with Mixture of Experts for Embodied Question Answering

Yuya Kamiwano; Kanata Suzuki; Naoya Chiba; Hiroki MORI; Tetsuya Ogata

doi:10.1299/jsmermd.2023.2a1-h03

JOURNAL ARTICLE

Year: 2023
Journal: The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)
Vol: 2023 (0)
Pages: 2A1-H03
Publisher: Japan Society of Mechanical Engineers

Abstract

In this study, we propose a subtask that combines visual field prediction at multiple scales and investigate its effectiveness for Embodied Question Answering (EQA). In EQA, the path to the target object depends on the instructions given, so the agent should be able to select a prediction scale automatically according to the situation. Previous studies, however, have examined subtask learning only with a limited prediction scale and target. We propose a mixture-of-experts model in which multiple expert networks predict future images at different time steps, and a higher-level gating network estimates the distribution of each expert's output. By sequentially adjusting the outputs of the expert networks, the proposed method enables robot navigation that takes multiple prediction scales into account. Comparison experiments on the EQA MP3D dataset show that the proposed method improves the model's prediction accuracy regardless of the distance to the target.
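
The gating idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the gating network emits one score per expert, that the scores are normalized with a softmax, and that the experts' predicted images (flattened here to short lists of pixel values) are blended by a weighted sum. The names `softmax` and `mix_expert_predictions` and the toy data are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over the gating scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_expert_predictions(
    expert_outputs: Sequence[Sequence[float]],
    gate_scores: Sequence[float],
) -> List[float]:
    """Blend per-expert predictions with softmax gating weights.

    Assumption: each expert predicts the image at a different number of
    steps ahead, and the final prediction is a weighted sum of them.
    """
    if len(expert_outputs) != len(gate_scores):
        raise ValueError("one gate score is required per expert")
    weights = softmax(gate_scores)
    size = len(expert_outputs[0])
    mixed = [0.0] * size
    for weight, output in zip(weights, expert_outputs):
        if len(output) != size:
            raise ValueError("all experts must predict the same image size")
        for i, value in enumerate(output):
            mixed[i] += weight * value
    return mixed


# Hypothetical toy data: three experts (short, medium, and long horizon),
# each predicting a two-pixel "image".
expert_outputs = [[0.0, 0.0], [1.0, 1.0], [2.0, 4.0]]
gate_scores = [0.0, 0.0, 0.0]  # equal scores give equal weights
mixed = mix_expert_predictions(expert_outputs, gate_scores)
```

With equal gate scores the result is the plain average of the three predictions. Raising one score shifts the blend toward that expert's horizon, which is one way to picture selecting a prediction scale according to the situation.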

Keywords: Computer science

Topics

Robotics and Automated Systems

Physical Sciences → Engineering → Control and Systems Engineering

Innovation in Digital Healthcare Systems

Health Sciences → Health Professions → Health Information Management

Video Analysis and Summarization

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Multi-Timestep-Ahead Prediction with Mixture of Experts for Embodied Question Answering

Embodied Question Answering

Multi-Target Embodied Question Answering

Knowledge-Based Embodied Question Answering

Multi-agent Embodied Question Answering in Interactive Environments