JOURNAL ARTICLE

Knowledge-intensive and entity-centric natural language processing

Zhang, Wenzheng

Year: 2025 · Journal: Rutgers University Community Repository (Rutgers University) · Publisher: Rutgers, The State University of New Jersey

Abstract

Accessing large-scale external knowledge while maintaining a consistent understanding of real-world entities is essential for modern natural language processing (NLP) systems. This thesis investigates two fundamental capabilities that support this objective: knowledge-intensive language processing, which enables models to retrieve and integrate external information, and entity-centric language understanding, which facilitates identifying, linking, and reasoning about entities in context.

We first explore knowledge-intensive language processing through the lens of retrieval-based methods. We present a theoretical and empirical analysis of hard negatives in the Noise Contrastive Estimation (NCE) training objective, improve multi-task retrieval by promoting task specialization, and propose a retrieval-augmented generation framework that allows models to express their information needs implicitly, eliminating the need for human-specified queries.

Next, we focus on entity-centric language understanding. We introduce a novel approach that reframes entity linking as an inverse open-domain question answering problem, addressing the challenge of predicting mentions without knowing their corresponding entities and naturally extending NCE to support multi-label retrieval. We also propose a simple yet effective sequence-to-sequence model for coreference resolution, which maps input text to linearized coreference annotations and achieves strong performance with no task-specific model design.

These contributions advance the development of NLP systems that reason more effectively over external knowledge and entities, enabling stronger performance on a wide range of information-seeking and understanding tasks.
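The abstract centers part of the thesis on hard negatives in the NCE training objective. As a rough illustration of why hard negatives matter in contrastive retrieval training, here is a minimal single-query NCE-style softmax loss over a positive passage and sampled negatives. This is a toy sketch with dot-product scoring on plain Python lists; the function name `nce_loss` and all vectors are illustrative assumptions, not the thesis's actual implementation.

```python
import math

def nce_loss(query, positive, negatives):
    """NCE-style softmax loss for one query.

    Scores the positive passage embedding against sampled negative
    embeddings (including "hard" negatives: passages whose embeddings
    are close to the positive but incorrect) and returns the negative
    log-likelihood of the positive under a softmax over all candidates.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Candidate scores: positive first, then the sampled negatives.
    scores = [dot(query, positive)] + [dot(query, n) for n in negatives]

    # Numerically stable log of the softmax partition function.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))

    # -log p(positive | query); always >= 0.
    return log_z - scores[0]
```

A hard negative (an embedding near the positive) yields a larger loss than an easy, distant one, which is the extra gradient signal that makes hard negatives useful during retriever training.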

Keywords:
Coreference, Question answering, Natural language, Language understanding, Natural language understanding, Language model

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.82

Topics

Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Text and Document Classification Technologies (Physical Sciences → Computer Science → Artificial Intelligence)
Sentiment Analysis and Opinion Mining (Physical Sciences → Computer Science → Artificial Intelligence)

Related Documents

JOURNAL ARTICLE

Neural methods for entity-centric knowledge extraction and reasoning in natural language

Bhowmik, Rajarshi

Journal: Rutgers University Community Repository (Rutgers University) · Year: 2021
JOURNAL ARTICLE

Revealing the technology development of natural language processing: A Scientific entity-centric perspective

Heng Zhang, Chengzhi Zhang, Yuzhuo Wang

Journal: Information Processing & Management · Year: 2023 · Vol: 61 (1) · Pages: 103574-103574
JOURNAL ARTICLE

Knowledge-intensive natural language generation

Paul S. Jacobs

Journal: Artificial Intelligence · Year: 1987 · Vol: 33 (3) · Pages: 325-378