BOOK-CHAPTER

Open-Domain Question Answering Framework Using Wikipedia

Saleem Ameen, Hyunsuk Chung, Soyeon Caren Han, Byeong Ho Kang

Series: Lecture Notes in Computer Science   Year: 2016   Pages: 623-635   Publisher: Springer Science+Business Media

Abstract

This paper explores the feasibility of implementing a model for an open-domain, automated question answering framework that leverages Wikipedia's knowledge base. While Wikipedia implicitly contains answers to common questions, the ambiguity of natural language and the difficulty of developing an information retrieval process that produces specific answers present significant challenges. However, observational analysis suggests that it is possible to disregard the syntactic and lexical structure of a sentence when a question contains a specific target entity (words that identify a person, location or organisation) and queries a property related to that entity. To investigate this, we implemented an algorithmic process that extracted the target entity from the question using CRF-based named entity recognition (NER) and treated all remaining words as potential properties. Using DBpedia, an ontological database of Wikipedia's knowledge, we searched for the closest matching property that would produce an answer by applying standard string matching algorithms, including the Levenshtein distance, similar text and Dice's coefficient. Our experimental results illustrate that using Wikipedia as a knowledge base produces high precision for questions that contain a single unambiguous entity as the subject, but lower accuracy for questions where the entity appears as part of the object.
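The property-matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the target entity has already been removed by an NER step (not shown), scores each remaining question word against each candidate property name using the Levenshtein distance and Dice's coefficient over character bigrams, and averages the two scores. The averaging rule, the function names and the small property list are assumptions made for illustration only, and the "similar text" measure is omitted.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def dice_coefficient(a: str, b: str) -> float:
    """Dice's coefficient over multisets of character bigrams."""
    if a == b:
        return 1.0
    bigrams_a = Counter(a[i:i + 2] for i in range(len(a) - 1))
    bigrams_b = Counter(b[i:i + 2] for i in range(len(b) - 1))
    total = sum(bigrams_a.values()) + sum(bigrams_b.values())
    if total == 0:
        return 0.0
    overlap = sum((bigrams_a & bigrams_b).values())
    return 2.0 * overlap / total


def best_property(words, properties):
    """Return (score, word, property) for the highest-scoring pair.

    Assumption: the score is the plain average of a normalised Levenshtein
    similarity and Dice's coefficient; the source does not say how the
    measures were combined.
    """
    best = None
    for word in words:
        for prop in properties:
            w, p = word.lower(), prop.lower()
            lev_sim = 1.0 - levenshtein(w, p) / max(len(w), len(p), 1)
            score = (lev_sim + dice_coefficient(w, p)) / 2.0
            if best is None or score > best[0]:
                best = (score, word, prop)
    return best


# Question: "What is the population of Australia?"
# Hypothetical NER output: entity "Australia"; the remaining words are candidates.
remaining_words = ["what", "is", "the", "population", "of"]
# Illustrative property names, standing in for those returned for the entity.
candidate_properties = ["populationTotal", "areaTotal", "capital", "currency"]

score, word, prop = best_property(remaining_words, candidate_properties)
print(word, "->", prop)
```

In this example the word "population" pairs with "populationTotal", so no syntactic analysis of the question is needed once the entity is known. A question whose wording shares few characters with the property name (for instance "born" against "birthPlace") would score poorly under pure string matching.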

Keywords:
Computer science; Levenshtein distance; Question answering; Information retrieval; Domain (mathematical analysis); Entity linking; Open domain; Sentence; Natural language processing; Matching (statistics); Named-entity recognition; Process (computing); Property (philosophy); Subject (documents); Object (grammar); Natural language; Artificial intelligence; Knowledge base; World Wide Web; Task (project management)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 9
Citation Normalized Percentile: 0.26

Topics

Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Wikis in Education and Collaboration
Social Sciences →  Social Sciences →  Communication
