Abstract

In this paper we describe a new named entity extraction system. The system is based on a manually developed set of rules that rely heavily on crucial lexical information, linguistic constraints of English, and contextual information. It achieves state-of-the-art results on protein name detection, the task that many current name extraction systems address. We discuss the need for detecting chemical names and show not only that we recognize chemicals with a high degree of success, but also that chemical name detection can help improve the precision of protein name detection. We use context and surrounding words to categorize named entities and find the results encouraging.
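
The claim that chemical name detection can raise protein precision can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the suffix list, the element-symbol pattern, the "letters plus digit" protein shape, and the function names are hypothetical and are not the rules used by the system described above. The sketch also leaves out the context-word cues mentioned in the abstract.

```python
import re

# Hypothetical cues invented for this sketch; they are NOT the paper's rules.
CHEMICAL_SUFFIXES = ("ol", "ine", "ate", "ide")
# Two or more element symbols from a small, assumed set, each optionally
# followed by a count (for example "MgCl2").
FORMULA = re.compile(r"^(?:(?:Na|Cl|Mg|Ca|H|C|N|O|K|P|S)\d*){2,}$")
# Naive protein cue: a token that mixes letters and digits (for example "p53").
PROTEIN_SHAPE = re.compile(r"^(?=.*[A-Za-z])(?=.*\d)[A-Za-z0-9-]+$")


def tokenize(text):
    """Split text into alphanumeric tokens, keeping internal hyphens."""
    return re.findall(r"[A-Za-z0-9-]+", text)


def is_chemical(token):
    """Flag formula-like tokens and longer words with a chemical suffix."""
    return bool(FORMULA.match(token)) or (
        len(token) > 5 and token.lower().endswith(CHEMICAL_SUFFIXES)
    )


def extract(text, use_chemical_filter=True):
    """Return (proteins, chemicals); optionally drop chemicals from proteins."""
    proteins, chemicals = [], []
    for token in tokenize(text):
        chemical = is_chemical(token)
        if chemical:
            chemicals.append(token)
        if PROTEIN_SHAPE.match(token) and not (use_chemical_filter and chemical):
            proteins.append(token)
    return proteins, chemicals


sentence = "p53 binds DNA in the presence of MgCl2 and ethanol."
print(extract(sentence, use_chemical_filter=False))
print(extract(sentence, use_chemical_filter=True))
```

With the filter off, the naive shape rule also labels `MgCl2` as a protein. With the filter on, the chemical rule claims that token first, so the false positive is removed without losing `p53`.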

Keywords:
Categorization; Computer science; Context (archaeology); Information extraction; Task (project management); Natural language processing; Biomedical text mining; Information retrieval; Analytics; Set (abstract data type); Artificial intelligence; Named-entity recognition; Data science; Text mining; Archaeology; Engineering; Programming language; Geography

Metrics

Cited By: 117
FWCI (Field Weighted Citation Impact): 4.74
Refs: 8
Citation Normalized Percentile: 0.95

Topics

Biomedical Text Mining and Ontologies
Life Sciences → Biochemistry, Genetics and Molecular Biology → Molecular Biology
Semantic Web and Ontologies
Physical Sciences → Computer Science → Artificial Intelligence
Topic Modeling
Physical Sciences → Computer Science → Artificial Intelligence
© 2026 ScienceGate Book Chapters — All rights reserved.