Natural Language Generation (NLG) is a challenging subdomain of Natural Language Processing (NLP), requiring diverse language capabilities encompassing semantic comprehension, logical reasoning, and fluent text production. In recent years, the advent of new deep learning methodologies has yielded notable breakthroughs in the NLG domain, particularly with the emergence of large language models (LLMs). Accordingly, current research on LLMs for NLG predominantly seeks to optimise the integration of knowledge into text generation, which is the focal point of this thesis. This thesis investigates knowledge-enhanced Natural Language Generation across several distinct yet representative domains, comprising three studies on Story Generation, one on Text Summarisation, three on Dialogue Generation, and one on Tongue Twister Generation. While Story Generation and Text Summarisation pertain to long-form text generation, the remaining studies focus on short-form text generation, covering dialogue generation for conventional spoken language and tongue twister generation for creative language. Furthermore, the included studies span both open-domain and domain-specific NLG, collectively covering a diverse range of task categories. The studies in this thesis aim to address overarching challenges that persist across NLG applications, including (1) limitations in effectively understanding input semantics and generating relevant and coherent output, and (2) inadequate background knowledge, such as terminology and logical relationships that are not readily accessible from training data. These challenges can be mitigated by injecting task-required knowledge into base language models, \textit{e.g.} injecting commonsense concepts and their relations to support commonsense dialogue generation. In this thesis, a range of novel techniques is investigated to better incorporate background or task-oriented knowledge into the inference of language models.

These techniques can be summarised into the following directions: (1) introducing additional labels obtained through dependency parsing or terminology recognition; (2) constructing new datasets and fine-tuning LLMs on specific domains; and (3) modifying neural network architectures to improve text generation, in particular by exploiting heterogeneous representations of textual and graph knowledge. In addition, extensive experiments are conducted to demonstrate the effectiveness of our methodologies at either the training stage or the inference stage of language models during text generation. The results indicate that proper utilisation of knowledge can substantially improve the performance of various language models in generating text. By explicitly modelling knowledge, \textit{e.g.} conceptual knowledge and domain-specific background knowledge, we anticipate that language models can evolve into complex systems with comprehensive, human-like abilities such as reading comprehension, logical reasoning, and creativity. The details of these techniques and experiments are presented in the main content of this thesis.