Budgeted Policy Learning for Task-Oriented Dialogue Systems

Zhirui Zhang; Xiujun Li; Jianfeng Gao; Enhong Chen

doi:10.18653/v1/p19-1364

JOURNAL ARTICLE

Year: 2019 Pages: 3742-3751

Abstract

This paper presents a new approach that extends Deep Dyna-Q (DDQ) with Budget-Conscious Scheduling (BCS) to make the best use of a fixed, small number of user interactions (the budget) when learning task-oriented dialogue agents. BCS consists of (1) a Poisson-based global scheduler that allocates the budget over the different stages of training; (2) a controller that decides at each training step whether the agent is trained on real or simulated experiences; and (3) a user goal sampling module that generates the experiences that are most effective for policy learning. Experiments on a movie-ticket booking task with simulated and real users show that our approach leads to significant improvements in success rate over state-of-the-art baselines under the fixed budget.
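As a rough illustration of what a Poisson-based budget scheduler might look like, the sketch below splits a fixed interaction budget across training stages in proportion to Poisson probability mass. This is not the paper's formulation: the function names `poisson_pmf` and `allocate_budget`, the parameter `lam`, and the largest-remainder rounding are all assumptions made for this example.

```python
import math
from typing import List


def poisson_pmf(k: int, lam: float) -> float:
    """Probability mass of a Poisson(lam) distribution at k."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def allocate_budget(total_budget: int, num_stages: int, lam: float) -> List[int]:
    """Split total_budget over num_stages in proportion to Poisson weights.

    Hypothetical helper: stage k receives a share proportional to
    poisson_pmf(k, lam), renormalized over the stages that exist.
    """
    weights = [poisson_pmf(k, lam) for k in range(num_stages)]
    total_weight = sum(weights)
    shares = [total_budget * w / total_weight for w in weights]

    # Round down first, then hand the leftover interactions to the stages
    # with the largest fractional parts so the total is preserved exactly.
    alloc = [math.floor(s) for s in shares]
    leftover = total_budget - sum(alloc)
    by_remainder = sorted(
        range(num_stages), key=lambda i: shares[i] - alloc[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc


if __name__ == "__main__":
    # Example: 100 real-user dialogues spread over 6 training stages.
    print(allocate_budget(100, 6, lam=3.0))
```

Under these assumptions, `lam` controls where in training most of the budget is spent: a small value front-loads real interactions, while a larger value shifts them toward later stages.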

Keywords:

Computer science; Scheduling (production processes); Ticket; Reinforcement learning; Task (project management); Budget constraint; Poisson distribution; Human–computer interaction; Distributed computing; Artificial intelligence; Real-time computing; Computer security; Engineering; Operations management

Metrics

FWCI (Field Weighted Citation Impact): 3.07

Citation Normalized Percentile: 0.93

Topics

Speech and dialogue systems

Physical Sciences → Computer Science → Artificial Intelligence

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Intelligent Tutoring Systems and Adaptive Learning

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Domain Complexity and Policy Learning in Task-Oriented Dialogue Systems

Continual Learning for Task-Oriented Dialogue Systems

Continual Learning in Task-Oriented Dialogue Systems

Task-wrapped Continual Learning in Task-Oriented Dialogue Systems

Transfer reinforcement learning for task-oriented dialogue systems