Zhi Chen, Lu Chen, Xiaoyuan Liu, Kai Yu
A task-oriented spoken dialogue system (SDS) aims to assist a human user in accomplishing a specific task (e.g., hotel booking). Dialogue management is a core part of an SDS. It has two main tasks: dialogue belief state tracking (summarising the conversation history) and dialogue decision-making (deciding how to reply to the user). In this work, we focus only on devising a policy that chooses which dialogue action to take in response to the user. The sequential system decision-making process can be abstracted as a partially observable Markov decision process (POMDP). Under this framework, reinforcement learning approaches can be used for automated policy optimization. In the past few years, many deep reinforcement learning (DRL) algorithms, which use neural networks (NN) as function approximators, have been investigated for dialogue policy.
Miloš S. Stanković, Miloš Beko, Miloš Pavlović, Ilija Popadić, Srđan Stanković
Pei-Hao Su, Paweł Budzianowski, Stefan Ultes, Milica Gašić, Steve Young
Lu Chen, Zhi Chen, Bowen Tan, Sishan Long, Milica Gašić, Kai Yu
Michael T. Rosenstein, Andrew G. Barto, Jennie Si, Andy Barto, Warren B. Powell, Donald C. Wunsch