Domain-specific keyphrase extraction

Yifang Wu; Quanzhi Li; Razvan Stefan Bot; Xin Chen

doi:10.1145/1099554.1099628

JOURNAL ARTICLE

Year: 2005 Pages: 283-284

Abstract

Document keyphrases provide semantic metadata that characterizes a document and gives an overview of its content. They can be used in many text-mining and knowledge-management applications. This paper describes the Keyphrase Identification Program (KIP), which extracts document keyphrases by using prior positive samples of human-identified domain keyphrases to assign weights to candidate keyphrases. The logic of our algorithm is that the more keywords a candidate keyphrase contains, and the more significant those keywords are, the more likely the candidate is a keyphrase. To obtain prior positive inputs, KIP first populates its glossary database with manually identified keyphrases and keywords. It then checks the composition of every noun phrase in a document, looks up the database, and calculates a score for each noun phrase. The noun phrases with the highest scores are extracted as keyphrases.
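The scoring idea in the abstract can be illustrated with a minimal sketch. The abstract does not give KIP's actual scoring formula, so the rule below (sum of the glossary weights of the words in a phrase, plus the weight of the phrase itself when it is a known keyphrase) is an assumption made only for illustration. The names `KEYWORD_WEIGHTS`, `KNOWN_KEYPHRASES`, `score_phrase`, and `extract_keyphrases` are hypothetical, as are all weights. Noun-phrase detection is assumed to have happened already, so the candidates are passed in as plain strings.

```python
from typing import Dict, List, Tuple

# Hypothetical glossary database. In KIP the entries come from manually
# identified domain keyphrases and keywords; these weights are made up.
KEYWORD_WEIGHTS: Dict[str, float] = {
    "keyphrase": 0.9,
    "extraction": 0.7,
    "text": 0.4,
    "mining": 0.6,
}
KNOWN_KEYPHRASES: Dict[str, float] = {
    "text mining": 0.8,
}


def score_phrase(phrase: str) -> float:
    """Score a candidate noun phrase.

    Assumed rule (not taken from the paper): add up the weights of the
    glossary keywords the phrase contains, then add the phrase's own
    weight if the whole phrase is a known keyphrase.
    """
    words = phrase.lower().split()
    keyword_score = sum(KEYWORD_WEIGHTS.get(word, 0.0) for word in words)
    phrase_score = KNOWN_KEYPHRASES.get(" ".join(words), 0.0)
    return keyword_score + phrase_score


def extract_keyphrases(
    noun_phrases: List[str], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k highest-scoring candidates with their scores."""
    # Normalize case and whitespace so duplicates are scored once.
    unique = {" ".join(p.lower().split()) for p in noun_phrases}
    scored = [(p, score_phrase(p)) for p in unique]
    # Sort by descending score; break ties alphabetically for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]
```

Calling `extract_keyphrases(["keyphrase extraction", "text mining", "last week"], top_k=2)` with these made-up weights ranks "text mining" above "keyphrase extraction" and drops "last week", which contains no glossary keywords. A phrase with more, or more heavily weighted, keywords scores higher, which is the behavior the abstract describes.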

Keywords:

Computer science; Noun phrase; Natural language processing; Phrase; Information retrieval; Domain (mathematical analysis); Metadata; Artificial intelligence; Glossary; Noun; World Wide Web; Linguistics

Metrics

FWCI (Field Weighted Citation Impact): 1.53

Citation Normalized Percentile: 0.85

Topics

Advanced Text Analysis Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Software Keyphrase Extraction with Domain-Specific Features

DIKEA: Domain-Independent Keyphrase Extraction Algorithm

Domain-specific keyphrase extraction and near-duplicate article detection based on ontology

A New Domain Independent Keyphrase Extraction System

Capturing Global Informativeness in Open Domain Keyphrase Extraction