Goal-conditioned offline reinforcement learning through state space partitioning

Mianchu Wang; Yue Jin; Giovanni Montana

doi:10.1007/s10994-023-06500-z

JOURNAL ARTICLE

Year: 2024; Journal: Machine Learning; Vol: 113 (5); Pages: 2435-2465; Publisher: Springer Science+Business Media

Abstract

Offline reinforcement learning (RL) aims to create policies for sequential decision-making using exclusively offline datasets. This presents a significant challenge, especially when attempting to accomplish multiple distinct goals or outcomes within a given scenario while receiving sparse rewards. Prior methods using advantage weighting for offline goal-conditioned learning improve policies monotonically. However, they still face challenges from distribution shift and multi-modality that arise due to conflicting ways to reach a goal. This issue is especially challenging in long-horizon tasks, where the presence of multiple, often conflicting, solutions makes it hard to identify a single optimal policy for transitioning from a state to a desired goal. To address these challenges, we introduce a complementary advantage-based weighting scheme that incorporates an additional source of inductive bias. Given a value-based partitioning of the state space, the contribution of actions expected to lead to target regions that are easier to reach, compared to the final goal, is further increased. Our proposed approach, Dual-Advantage Weighted Offline Goal-conditioned RL, outperforms several competing offline algorithms in widely used benchmarks. Furthermore, we provide a theoretical guarantee that the learned policy will not be inferior to the underlying behavior policy.
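The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the exact formulas. The sketch assumes the usual exponential form of advantage-weighted regression, equal-width binning of goal-conditioned values as the "value-based partitioning", and a clipped weight. All function names and the parameters beta, beta_region, num_regions and max_weight are hypothetical.

```python
import numpy as np

def partition_index(values, num_regions):
    """Assign each state a region index by binning its goal-conditioned
    value into equal-width bins (assumed partitioning rule)."""
    values = np.asarray(values, dtype=float)
    v_min, v_max = values.min(), values.max()
    span = max(v_max - v_min, 1e-8)  # avoid division by zero
    scaled = (values - v_min) / span
    idx = np.floor(scaled * num_regions).astype(int)
    return np.clip(idx, 0, num_regions - 1)

def dual_advantage_weights(adv_goal, adv_region, beta=1.0,
                           beta_region=1.0, max_weight=10.0):
    """Combine the goal advantage with a second advantage measured
    toward a nearer target region (assumed exponential form)."""
    adv_goal = np.asarray(adv_goal, dtype=float)
    adv_region = np.asarray(adv_region, dtype=float)
    weights = np.exp(beta * adv_goal + beta_region * adv_region)
    return np.minimum(weights, max_weight)  # clip for numerical stability

def weighted_bc_loss(log_probs, weights):
    """Weighted negative log-likelihood of dataset actions."""
    return -np.mean(np.asarray(weights) * np.asarray(log_probs))

# Hypothetical toy batch: two actions with equal goal advantage,
# but the second is expected to lead to an easier-to-reach region.
regions = partition_index([0.0, 0.5, 1.0], num_regions=2)
weights = dual_advantage_weights([0.0, 0.0], [0.0, 1.0])
loss = weighted_bc_loss([-1.0, -1.0], weights)
```

In this toy batch the second action receives the larger weight, so a policy trained on the weighted loss would imitate it more strongly. That is the inductive bias the abstract describes.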

Keywords:

Reinforcement learning; Computer science; Weighting; Artificial intelligence; Machine learning; State space; Space (punctuation); Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 0.64

Citation Normalized Percentile: 0.63

Topics

Reinforcement Learning in Robotics

Physical Sciences → Computer Science → Artificial Intelligence

Adaptive Dynamic Programming Control

Physical Sciences → Computer Science → Computational Theory and Mathematics

Advanced Multi-Objective Optimization Algorithms

Physical Sciences → Computer Science → Computational Theory and Mathematics

Related Documents

Hierarchical Planning Through Goal-Conditioned Offline Reinforcement Learning

Offline Goal-Conditioned Reinforcement Learning with Distributional Perspective

Curriculum Goal-Conditioned Imitation for Offline Reinforcement Learning

State Representation Learning for Goal-Conditioned Reinforcement Learning

Offline Goal-Conditioned Model-Based Reinforcement Learning in Pixel-Based Environment