Abstract

We present BLESS, a comprehensive performance benchmark of the most recent state-of-the-art Large Language Models (LLMs) on the task of text simplification (TS). We examine how well off-the-shelf LLMs can solve this challenging task, assessing a total of 44 models, differing in size, architecture, pre-training methods, and accessibility, on three test sets from different domains (Wikipedia, news, and medical) under a few-shot setting. Our analysis considers a suite of automatic metrics, as well as a large-scale quantitative investigation into the types of common edit operations performed by the different models. Furthermore, we perform a manual qualitative analysis on a subset of model outputs to better gauge the quality of the generated simplifications. Our evaluation indicates that the best LLMs, despite not being trained on TS, perform comparably with state-of-the-art TS baselines. Additionally, we find that certain LLMs demonstrate a greater range and diversity of edit operations. Our performance benchmark will be available as a resource for the development of future TS methods and evaluation metrics.
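To make the few-shot setting concrete, the short Python sketch below assembles a k-shot simplification prompt from exemplar complex/simple sentence pairs. The instruction wording, the exemplar sentences, and the build_prompt helper are illustrative assumptions for this summary, not the prompts actually used in BLESS.

# Minimal sketch of few-shot prompting for sentence simplification.
# The template, exemplars, and names are illustrative assumptions;
# the BLESS benchmark's actual prompts may differ.

FEW_SHOT_EXAMPLES = [
    ("The physician administered the medication intravenously.",
     "The doctor gave the medicine through a vein."),
    ("The legislation was ratified unanimously by the committee.",
     "Everyone on the committee approved the law."),
]

def build_prompt(complex_sentence: str, examples=FEW_SHOT_EXAMPLES) -> str:
    """Assemble a k-shot prompt: exemplar (complex, simple) pairs
    followed by the target sentence, leaving the simple side blank
    for the model to complete."""
    lines = ["Rewrite each sentence so that it is easier to read."]
    for complex_src, simple_tgt in examples:
        lines.append(f"Complex: {complex_src}")
        lines.append(f"Simple: {simple_tgt}")
    lines.append(f"Complex: {complex_sentence}")
    lines.append("Simple:")
    return "\n".join(lines)

if __name__ == "__main__":
    print(build_prompt("The municipality precluded vehicular access to the thoroughfare."))

Model completions produced from such prompts would then be scored against reference simplifications with standard automatic TS metrics such as SARI, mirroring the evaluation setup described above.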

Keywords:
Benchmarking, Computer science, Natural language processing, Sentence, Linguistics, Artificial intelligence, Natural language, Philosophy, Management, Economics

Metrics

Cited by: 13
FWCI (Field-Weighted Citation Impact): 3.32
References: 71
Citation Normalized Percentile: 0.91 (in the top 10% of the field)

Topics

Text Readability and Simplification (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Interpreting and Communication in Healthcare (Health Sciences → Health Professions → General Health Professions)