JOURNAL ARTICLE

Improving Deep Assertion Generation via Fine-Tuning Retrieval-Augmented Pre-trained Language Models

Quanjun Zhang, Chunrong Fang, Yi Zheng, Yaxin Zhang, Yuan Zhao, Rubing Huang, Jianyi Zhou, Yun Yang, Tao Zheng, Zhenyu Chen

Year: 2025 · Journal: ACM Transactions on Software Engineering and Methodology · Publisher: Association for Computing Machinery

Abstract

Unit testing validates the correctness of the units of the software system under test and serves as the cornerstone of improving software quality and reliability. To reduce the manual effort of writing unit tests, several techniques have been proposed to generate test assertions automatically, including deep learning (DL)-based, retrieval-based, and integration-based ones. Among them, recent integration-based approaches inherit from both DL-based and retrieval-based approaches and are considered state-of-the-art. Despite being promising, such integration-based approaches suffer from inherent limitations, such as retrieving assertions with lexical matching while ignoring meaningful code semantics, and generating assertions with a limited training corpus. In this paper, we propose a novel Retrieval-Augmented Deep Assertion Generation approach, namely RetriGen, based on a hybrid assertion retriever and a pre-trained language model (PLM)-based assertion generator. Given a focal-test, RetriGen first builds a hybrid assertion retriever to search for the most relevant test-assert pair from external codebases. The retrieval process takes both lexical similarity and semantic similarity into account via a token-based and an embedding-based retriever, respectively. RetriGen then treats assertion generation as a sequence-to-sequence task and designs a PLM-based assertion generator to predict a correct assertion from historical test-assert pairs and the retrieved external assertion. Although our concept is general and can be adapted to various off-the-shelf encoder-decoder PLMs, we implement RetriGen on top of the recent CodeT5 model. We conduct extensive experiments to evaluate RetriGen against six state-of-the-art approaches across two large-scale datasets and two metrics.
The experimental results demonstrate that RetriGen achieves 57.66% accuracy and 73.24% CodeBLEU, outperforming all baselines with average improvements of 50.66% and 14.14%, respectively. Furthermore, RetriGen generates 1598 and 1818 unique correct assertions that all baselines fail to produce, 3.71X and 4.58X more than the most recent approach EditAS. We also demonstrate that adopting other PLMs provides substantial improvements, e.g., four additionally-utilized PLMs outperform EditAS by 7.91%∼12.70% in accuracy, indicating the generalizability of RetriGen. Overall, our study highlights the promising future of fine-tuning off-the-shelf PLMs to generate accurate assertions by incorporating external knowledge sources.
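The hybrid retrieval described in the abstract — scoring candidate test-assert pairs by both lexical (token-based) and semantic (embedding-based) similarity — can be sketched as follows. This is a minimal illustrative toy, not RetriGen's actual implementation: the paper uses a trained PLM encoder for embeddings, whereas this sketch substitutes a bag-of-words count vector, and the weighting scheme, function names, and example codebase are all hypothetical.

```python
# Hypothetical sketch of a hybrid assertion retriever: a candidate's score
# is a weighted sum of lexical (Jaccard) and semantic (cosine) similarity.
from collections import Counter
import math

def token_similarity(query_tokens, cand_tokens):
    """Jaccard overlap of token sets (lexical similarity)."""
    q, c = set(query_tokens), set(cand_tokens)
    return len(q & c) / len(q | c) if q | c else 0.0

def embed(tokens):
    """Toy embedding: bag-of-words count vector (stand-in for a PLM encoder)."""
    return Counter(tokens)

def cosine_similarity(v1, v2):
    dot = sum(v1[t] * v2[t] for t in v1)
    n1 = math.sqrt(sum(x * x for x in v1.values()))
    n2 = math.sqrt(sum(x * x for x in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def retrieve(focal_test, codebase, alpha=0.5):
    """Return the test-assert pair with the highest combined score."""
    q_tokens = focal_test.split()
    q_vec = embed(q_tokens)

    def score(pair):
        c_tokens = pair["test"].split()
        lex = token_similarity(q_tokens, c_tokens)
        sem = cosine_similarity(q_vec, embed(c_tokens))
        return alpha * lex + (1 - alpha) * sem

    return max(codebase, key=score)

# Hypothetical external codebase of test-assert pairs.
codebase = [
    {"test": "assertEquals expected list size", "assert": "assertEquals(3, list.size());"},
    {"test": "assertTrue stack empty check", "assert": "assertTrue(stack.isEmpty());"},
]
best = retrieve("check that the list size equals expected", codebase)
print(best["assert"])  # → assertEquals(3, list.size());
```

The retrieved assertion would then be concatenated with the focal-test as input to the sequence-to-sequence generator, giving the PLM an external exemplar to ground its prediction.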

Keywords:
Computer science · Assertion · Artificial intelligence · Natural language processing · Programming language

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 16.49
Refs: 42
Citation Normalized Percentile: 0.94 (in top 10%)


Topics

Software Testing and Debugging Techniques (Physical Sciences → Computer Science → Software)
Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)