Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Lixin Su, Xueqi Cheng
This paper is concerned with open-domain question answering (OpenQA). Recently, some works have viewed this problem as a reading comprehension (RC) task and directly applied successful RC models to it. However, such models perform worse on OpenQA than they do on the RC task. In our opinion, the RC perspective ignores three characteristics of the OpenQA task: 1) many paragraphs without the answer span are included in the data collection; 2) multiple answer spans may exist within one given paragraph; 3) the end position of an answer span depends on the start position. In this paper, we first propose a new probabilistic formulation of OpenQA, based on a three-level hierarchical structure, i.e., the question level, the paragraph level, and the answer span level. Then a Hierarchical Answer Spans Model (HAS-QA) is designed to capture each probability. HAS-QA is able to tackle the above three problems, and experiments on public OpenQA datasets show that it significantly outperforms traditional RC baselines and recent OpenQA baselines.
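The three-level structure described in the abstract can be illustrated with a small numeric sketch. This is not the authors' implementation: the function names, the toy probability tables, and the choice of summation to combine multiple spans within a paragraph are all assumptions made for illustration. It only shows how a span probability that factors into a start term and an end-given-start term could be combined across spans and then across paragraphs, with a paragraph that contains no answer span contributing zero.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Span = Tuple[int, int]  # (start index, end index), hypothetical representation


def span_prob(p_start: Sequence[float],
              p_end_given_start: Sequence[Sequence[float]],
              start: int, end: int) -> float:
    """Span level: P(span) = P(start) * P(end | start)."""
    return p_start[start] * p_end_given_start[start][end]


def paragraph_answer_prob(spans: Sequence[Span],
                          p_start: Sequence[float],
                          p_end_given_start: Sequence[Sequence[float]],
                          aggregate: Callable[[Iterable[float]], float] = sum) -> float:
    """Paragraph level: combine all answer spans found in one paragraph.

    Summation is an assumed aggregation; `max` would be another option.
    A paragraph without any answer span contributes 0.
    """
    if not spans:
        return 0.0
    return aggregate(span_prob(p_start, p_end_given_start, s, e) for s, e in spans)


def question_answer_prob(paragraph_probs: Sequence[float],
                         answer_probs: Sequence[float]) -> float:
    """Question level: weight each paragraph's answer probability
    by the probability that the paragraph is the relevant one."""
    return sum(pp * ap for pp, ap in zip(paragraph_probs, answer_probs))


# Toy example: two paragraphs, each with two token positions.
p_start = [0.5, 0.5]
p_end_given_start = [[0.5, 0.5], [0.0, 1.0]]

para_1 = paragraph_answer_prob([(0, 1), (1, 1)], p_start, p_end_given_start)
para_2 = paragraph_answer_prob([], p_start, p_end_given_start)  # no answer span
answer = question_answer_prob([0.5, 0.5], [para_1, para_2])
```

In this toy setup the first paragraph holds two answer spans and the second holds none, so only the first contributes to the question-level probability.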
Tieke He, Li Yu, Zhipeng Zou, Qing Wu
Manoj Ghuhan Arivazhagan, Lan Liu, Peng Qi, Xinchi Chen, William Yang Wang, Zhiheng Huang
Ye Liu, Kazuma Hashimoto, Yingbo Zhou, Semih Yavuz, Caiming Xiong, Philip S. Yu
Cunli Mao, Lina Li, Zhengtao Yu, Lu Han, Jianyi Guo, Xiong-Li Lei
Jun Suzuki, Yutaka Sasaki, Eisaku Maeda