Accessing large-scale external knowledge while maintaining a consistent understanding of real-world entities is essential for modern natural language processing (NLP) systems. This thesis investigates two fundamental capabilities that support this objective: knowledge-intensive language processing, which enables models to retrieve and integrate external information, and entity-centric language understanding, which facilitates identifying, linking, and reasoning about entities in context.

We first explore knowledge-intensive language processing through the lens of retrieval-based methods. We present a theoretical and empirical analysis of hard negatives in the Noise Contrastive Estimation (NCE) training objective, improve multi-task retrieval by promoting task specialization, and propose a retrieval-augmented generation framework that allows models to express their information needs implicitly, eliminating the need for human-specified queries.

Next, we focus on entity-centric language understanding. We introduce a novel approach that reframes entity linking as an inverse open-domain question answering problem, addressing the challenge of predicting mentions without knowing their corresponding entities and naturally extending NCE to support multi-label retrieval. We also propose a simple yet effective sequence-to-sequence model for coreference resolution, which maps input text to linearized coreference annotations and achieves strong performance with no task-specific model design.

These contributions advance the development of NLP systems that reason more effectively over external knowledge and entities, enabling stronger performance on a wide range of information-seeking and understanding tasks.
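The NCE-style contrastive objective with hard negatives mentioned above can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the thesis's implementation; the function name and the dot-product scorer are illustrative assumptions.

```python
import numpy as np

def nce_loss_with_hard_negatives(query, positive, negatives):
    """Softmax-based contrastive (NCE-style) loss over one positive
    and a set of negatives, as commonly used in dense retrieval.

    query:     (d,) query embedding
    positive:  (d,) embedding of the gold passage
    negatives: (k, d) embeddings of negative passages; "hard" negatives
               are high-scoring non-gold passages, e.g. mined by the
               retriever itself rather than sampled at random.
    Returns the negative log-likelihood of the positive candidate.
    """
    candidates = np.vstack([positive[None, :], negatives])  # (k+1, d)
    scores = candidates @ query                             # dot-product similarity
    scores = scores - scores.max()                          # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return -np.log(probs[0])                                # positive is index 0
```

The loss shrinks as the positive outscores the negatives, so mining harder (higher-scoring) negatives yields a more informative training signal than random ones.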