Offline Quantum Reinforcement Learning in a Conservative Manner

Zhihao Cheng; Kaining Zhang; Li Shen; Dacheng Tao

doi:10.1609/aaai.v37i6.25872

JOURNAL ARTICLE

Year: 2023
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Vol: 37 (6)
Pages: 7148-7156
Publisher: Association for the Advancement of Artificial Intelligence

Abstract

Empowering reinforcement learning (RL) with quantum computing, an approach known as quantum RL (QRL), has recently attracted much attention as a way to reap quantum advantage. However, current QRL algorithms employ an online learning scheme: the policy, which runs on a quantum computer, must interact with the environment to collect experience, which can be expensive and dangerous in practical applications. In this paper, we address this problem through offline learning. Specifically, we develop the first offline quantum RL (offline QRL) algorithm, CQ2L (Conservative Quantum Q-learning), which learns from offline samples and requires no interaction with the environment. CQ2L represents the agent's Q-value functions with variational quantum circuits (VQCs) that are enhanced with data re-uploading and scaling parameters. To suppress the overestimation of Q-values that arises from offline data, we first employ a double Q-learning framework to reduce the overestimation bias, and then design a penalty term that encourages conservative Q-values. Extensive experiments demonstrate that CQ2L can successfully solve offline QRL tasks that its online counterpart cannot.
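
To make the two overestimation remedies named in the abstract concrete, the following is a minimal sketch of a double Q-learning target combined with a conservative penalty. It is not the authors' implementation. The abstract does not state the form of the penalty, so this sketch assumes a CQL-style log-sum-exp regularizer. The function names, argument names, and the weight alpha are hypothetical. Q-values are passed in as plain NumPy arrays, whereas in CQ2L they would be produced by VQCs.

```python
import numpy as np


def logsumexp(q, axis=-1):
    """Numerically stable log-sum-exp along the given axis."""
    m = np.max(q, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(q - m), axis=axis))


def conservative_double_q_loss(q, q_next_online, q_next_target,
                               actions, rewards, dones,
                               gamma=0.99, alpha=1.0):
    """Hypothetical loss for one batch of offline transitions (s, a, r, s').

    q:             (B, A) Q(s, .) from the online Q-function
    q_next_online: (B, A) Q(s', .) from the online Q-function
    q_next_target: (B, A) Q(s', .) from the target Q-function
    actions:       (B,)   integer actions stored in the dataset
    rewards:       (B,)   rewards stored in the dataset
    dones:         (B,)   1.0 where the episode ended, else 0.0
    """
    idx = np.arange(q.shape[0])
    q_taken = q[idx, actions]

    # Double Q-learning: the online Q-function selects the next action,
    # the target Q-function evaluates it.
    greedy = np.argmax(q_next_online, axis=1)
    target = rewards + gamma * (1.0 - dones) * q_next_target[idx, greedy]
    td_loss = np.mean((q_taken - target) ** 2)

    # Assumed CQL-style penalty: push down Q-values over all actions while
    # pushing up the Q-values of actions that appear in the dataset.
    penalty = np.mean(logsumexp(q, axis=1) - q_taken)

    return td_loss + alpha * penalty, td_loss, penalty


# Tiny hand-made batch with two transitions and two actions.
q = np.array([[1.0, 2.0], [0.5, 0.5]])
q_next_online = np.array([[0.0, 1.0], [1.0, 0.0]])
q_next_target = np.array([[3.0, 2.0], [5.0, 5.0]])
actions = np.array([1, 0])
rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])

total, td, pen = conservative_double_q_loss(
    q, q_next_online, q_next_target, actions, rewards, dones,
    gamma=0.5, alpha=2.0)
print(total, td, pen)
```

The penalty in this sketch is never negative, because the log-sum-exp over all actions is at least as large as any single Q-value. Minimizing it therefore lowers the Q-values of actions absent from the dataset relative to those present in it.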

Keywords:

Reinforcement learning; Computer science; Upload; Offline learning; Quantum computer; Quantum; Online and offline; Scheme (mathematics); Q-learning; Theoretical computer science; Artificial intelligence; Algorithm; Online learning; Mathematics; Quantum mechanics

Metrics

FWCI (Field Weighted Citation Impact): 0.87

Citation Normalized Percentile: 0.66

Topics

Quantum Computing Algorithms and Architecture

Physical Sciences → Computer Science → Artificial Intelligence

Quantum Information and Cryptography

Physical Sciences → Computer Science → Artificial Intelligence

Neural Networks and Reservoir Computing

Physical Sciences → Computer Science → Artificial Intelligence
