JOURNAL ARTICLE

Robust Driving Policy Learning with Guided Meta Reinforcement Learning

Abstract

Although deep reinforcement learning (DRL) has shown promising results for autonomous navigation in interactive traffic scenarios, existing work typically adopts a fixed behavior policy to control social vehicles in the training environment. This may cause the learned driving policy to overfit to that environment, so that it struggles to interact with vehicles exhibiting different, unseen behaviors. In this work, we introduce an efficient method to train diverse driving policies for social vehicles as a single meta-policy. By randomizing the interaction-based reward functions of social vehicles, we generate diverse objectives and efficiently train the meta-policy through guiding policies that achieve specific objectives. We further propose a training strategy that enhances the robustness of the ego vehicle's driving policy by training it in an environment where social vehicles are controlled by the learned meta-policy. In a challenging uncontrolled T-intersection scenario, our method learns an ego driving policy that generalizes well to unseen situations in which social agents exhibit out-of-distribution (OOD) behaviors.
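
To make the idea of randomized interaction-based rewards concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not give the reward's functional form, so the weighted-sum reward, the sampling range, and every name here (`Objective`, `sample_objective`, `interaction_reward`) are assumptions introduced only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Objective:
    """Hypothetical objective for one social vehicle (not taken from the paper)."""
    ego_weight: float     # weight on the vehicle's own progress
    social_weight: float  # weight on the progress of the vehicle it interacts with


def sample_objective(rng: random.Random) -> Objective:
    """Draw a random trade-off between self-interested and cooperative driving."""
    # Assumed range: negative values penalize the other vehicle's progress,
    # positive values reward it.
    social = rng.uniform(-1.0, 1.0)
    return Objective(ego_weight=1.0, social_weight=social)


def interaction_reward(obj: Objective, own_progress: float, other_progress: float) -> float:
    """Weighted sum of the vehicle's own progress and the other vehicle's progress."""
    return obj.ego_weight * own_progress + obj.social_weight * other_progress


# Each sampled objective defines a different reward for the same traffic situation.
rng = random.Random(0)
objectives = [sample_objective(rng) for _ in range(3)]
rewards = [interaction_reward(o, own_progress=1.0, other_progress=0.5) for o in objectives]
```

Under these assumptions, a single meta-policy could take the sampled objective as an additional input, so that one network covers the whole range of social-vehicle behaviors.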

Keywords:
Reinforcement learning; Computer science; Meta learning (computer science); Policy learning; Artificial intelligence; Machine learning; Engineering; Systems engineering

Metrics

Cited By: 7
FWCI (Field Weighted Citation Impact): 1.74
Refs: 35
Citation Normalized Percentile: 0.82

Topics

Traffic control and management (Physical Sciences → Engineering → Control and Systems Engineering)
Transportation and Mobility Innovations (Physical Sciences → Engineering → Automotive Engineering)
Traffic Prediction and Management Techniques (Physical Sciences → Engineering → Building and Construction)
