JOURNAL ARTICLE

Tidy-up Task Planner based on Q-learning

Min-Gyu Yang, Kuk-Hyun Ahn, Jae-Bok Song

Year: 2021 Journal: The Journal of Korea Robotics Society Vol: 16 (1) Pages: 56-63

Abstract

A central problem in learning in complex environments is balancing exploration of untested actions against exploitation of actions that are known to be good. The benefit of exploration can be estimated using the classical notion of Value of Information: the expected improvement in future decision quality that might arise from the information acquired by exploration. Estimating this quantity requires an assessment of the agent's uncertainty about its current value estimates for states. In this paper, we adopt a Bayesian approach to maintaining this uncertain information. We extend Watkins' Q-learning by maintaining and propagating probability distributions over the Q-values. These distributions are used to compute a myopic approximation to the value of information for each action and hence to select the action that best balances exploration and exploitation. We establish the convergence properties of our algorithm and show experimentally that it can exhibit substantial improvements over other well-known model-free exploration strategies.
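The action-selection idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the belief about each Q-value in the current state is an independent Gaussian with a known mean and standard deviation; the abstract does not specify the distribution family, and the function names (`expected_excess`, `myopic_vpi`, `select_action`) are hypothetical. The myopic value of information of an action is taken here as the expected gain from learning its true value: for the currently best action, the gain if it turns out worse than the runner-up; for any other action, the gain if it turns out better than the current best.

```python
import math


def _pdf(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    # Standard normal cumulative distribution.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_excess(mean, std, threshold):
    """E[max(q - threshold, 0)] for q ~ Normal(mean, std)."""
    if std <= 0.0:
        return max(mean - threshold, 0.0)
    z = (mean - threshold) / std
    return (mean - threshold) * _cdf(z) + std * _pdf(z)


def myopic_vpi(means, stds, action):
    """Myopic value of information for one action (needs >= 2 actions).

    Assumption: Gaussian beliefs over Q-values, one per action.
    """
    order = sorted(range(len(means)), key=lambda a: means[a], reverse=True)
    best, second = order[0], order[1]
    if action == best:
        # Gain if the best action is really worse than the runner-up:
        # E[max(mu_second - q, 0)], computed by negating q and the threshold.
        return expected_excess(-means[action], stds[action], -means[second])
    # Gain if this action is really better than the current best.
    return expected_excess(means[action], stds[action], means[best])


def select_action(means, stds):
    """Pick the action maximizing expected value plus value of information."""
    scores = [means[a] + myopic_vpi(means, stds, a) for a in range(len(means))]
    return max(range(len(means)), key=lambda a: scores[a])


# Action 1 has a slightly lower mean but a much wider belief, so exploring
# it is worth more than exploiting action 0.
choice = select_action([1.0, 0.9, 0.0], [0.01, 2.0, 0.01])
```

With no uncertainty (all standard deviations zero) the rule reduces to greedy selection, which is one way to check that the exploration bonus comes only from the spread of the beliefs.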

Keywords:
Value of information, Computer science, Convergence (economics), Planner, Q-learning, Action (physics), Task (project management), Bayesian probability, Reinforcement learning, Value (mathematics), Artificial intelligence, Quality (philosophy), Mathematical optimization, Machine learning, Mathematics, Engineering, Economics

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 20
Citation Normalized Percentile: 0.03

Topics

Machine Learning and Algorithms
Physical Sciences →  Computer Science →  Artificial Intelligence
Reinforcement Learning in Robotics
Physical Sciences →  Computer Science →  Artificial Intelligence
AI-based Problem Solving and Planning
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Tidy-Up Tasks Using Trajectory-based Imitation Learning

Doo-Jun Kim, Hyun-Jun Jo, Jae‐Bok Song

Journal: 2021 21st International Conference on Control, Automation and Systems (ICCAS) Year: 2021 Pages: 496-499
JOURNAL ARTICLE

Tidy up

Journal: The New Scientist Year: 2025 Vol: 266 (3547) Pages: 46-46
JOURNAL ARTICLE

Tidy-up time!

Journal: Early Years Educator Year: 2006 Vol: 8 (3) Pages: 60-60
JOURNAL ARTICLE

Time to tidy up

Karen Hart

Journal: Practical Pre-School Year: 2017 Vol: 2017 (Sup197) Pages: 7-8
JOURNAL ARTICLE

Reinforcement learning task planner for construction task, assisted by LLM

Guzmán-Merino, Miguel; Plönnigs, Jörn

Journal: Strathprints: The University of Strathclyde institutional repository (University of Strathclyde) Year: 2025