Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation

Dong‐Hoon Lee; Tung M. Luu; Young-Hwan Lee; Chang D. Yoo

doi:10.1109/icassp49660.2025.10888998

JOURNAL ARTICLE

Year: 2025; Pages: 1-5

Abstract

Recent research highlights the potential of multimodal foundation models for complex decision-making tasks. However, their large parameter counts make real-world deployment resource-intensive and often impractical on constrained systems. Reinforcement learning (RL) shows promise for training task-specific agents but suffers from high sample complexity, which limits practical applications. To address these challenges, we introduce LVLM to Policy (LVLM2P), a novel framework that distills knowledge from large vision-language models (LVLMs) into more efficient RL agents. Our approach uses the LVLM as a teacher that provides instructional actions based on trajectories collected by the RL agent. This guidance reduces less meaningful exploration in the early stages of learning and thereby significantly accelerates the agent's learning progress. Additionally, because the LVLM suggests actions directly from visual observations, the approach eliminates the need for manual textual descriptors of the environment, enhancing applicability across diverse tasks. Experiments show that LVLM2P significantly enhances the sample efficiency of baseline RL algorithms.
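The teacher-student idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the loss, the policy architecture, or how the LVLM is queried. The sketch assumes a tabular softmax student, a cross-entropy distillation term, and a hypothetical `teacher_action` stub standing in for an LVLM call on a visual observation. All names and parameters below are illustrative assumptions.

```python
import math

N_ACTIONS = 3  # assumed size of a small discrete action space


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_action(observation):
    """Hypothetical stand-in for querying an LVLM with an observation.

    A real teacher would receive an image and return a suggested action;
    here a fixed rule is used so the sketch runs without any model.
    """
    return observation % N_ACTIONS


def distill_step(logits_table, trajectory, lr=0.5):
    """One pass of cross-entropy distillation toward teacher actions.

    `trajectory` is a list of observation indices visited by the student.
    Returns the mean cross-entropy loss measured during the pass.
    """
    total_loss = 0.0
    for obs in trajectory:
        probs = softmax(logits_table[obs])
        target = teacher_action(obs)
        total_loss += -math.log(probs[target])
        # Gradient of cross-entropy w.r.t. logits is probs - onehot(target).
        for a in range(N_ACTIONS):
            grad = probs[a] - (1.0 if a == target else 0.0)
            logits_table[obs][a] -= lr * grad
    return total_loss / len(trajectory)


if __name__ == "__main__":
    # Four observations, uniform initial policy.
    table = [[0.0] * N_ACTIONS for _ in range(4)]
    visited = [0, 1, 2, 3] * 5  # observations from a student rollout
    for step in range(5):
        print(step, round(distill_step(table, visited), 4))
```

In a full agent, a term like this would presumably be added to the base RL objective, with a weight that decays as training progresses so that teacher guidance matters mostly during early exploration. That weighting scheme is likewise an assumption, not a detail stated in the abstract.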

Keywords: Distillation; Reinforcement learning; Computer science; Sample (material); Artificial intelligence; Machine learning; Sample complexity; Chemistry; Chromatography

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.05

Topics

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Sample-efficient Model-based Reinforcement Learning

VLMs-Guided Representation Distillation for Efficient Vision-Based Reinforcement Learning

Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning

Sample Trajectory Selection Method Based on Large Language Model in Reinforcement Learning

Reinforcement Learning Friendly Vision-Language Model for Minecraft