JOURNAL ARTICLE

Reinforcement-Learning-Guided Source Code Summarization Using Hierarchical Attention

Wenhua Wang, Yuqun Zhang, Yulei Sui, Yao Wan, Zhou Zhao, Jian Wu, Philip S. Yu, Guandong Xu

Year: 2020  Journal: IEEE Transactions on Software Engineering  Vol: 48 (1)  Pages: 102-119  Publisher: IEEE Computer Society

Abstract

Code summarization (aka comment generation) produces a high-level natural-language description of the function performed by code, which benefits software maintenance, code categorization, and retrieval. To the best of our knowledge, the state-of-the-art approaches follow an encoder-decoder framework that encodes source code into a hidden space and later decodes it into a natural-language space. Such approaches suffer from the following drawbacks: (a) they mainly represent code as a flat sequence of tokens, ignoring the code hierarchy; (b) most encoders consume only simple features (e.g., tokens), ignoring features that could help capture the correlations between comments and code; (c) the decoders are typically trained to predict the next word by maximizing the likelihood of the ground-truth word, whereas in the real world they are expected to generate the entire word sequence from scratch. These drawbacks lead to inferior and inconsistent comment-generation accuracy. To address these limitations, this paper presents a new code summarization approach that uses a hierarchical attention network and incorporates multiple code features, including type-augmented abstract syntax trees and program control flows. These features, along with plain code sequences, are injected into a deep reinforcement learning (DRL) framework (an actor-critic network) for comment generation. Our approach assigns weights (pays "attention") to tokens and statements when constructing the code representation, so as to reflect the hierarchical code structure under different contexts regarding code features (e.g., control flows and abstract syntax trees).
Our reinforcement learning mechanism further strengthens the predictions through an actor network and a critic network: the actor network provides the confidence of predicting the next word given the current state, while the critic network computes the reward values of all possible extensions of the current state to provide global guidance for exploration. Finally, we employ an advantage reward to train both networks and conduct a set of experiments on a real-world dataset. The experimental results demonstrate that our approach outperforms the baselines by around 22 to 45 percent in BLEU-1 and outperforms the state-of-the-art approaches by around 5 to 60 percent in terms of S-BLEU and C-BLEU.
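The two mechanisms the abstract names, two-level attention over tokens and statements and the advantage signal that couples the actor and critic, can be illustrated with a minimal, dependency-free sketch. The dot-product scoring and function names below are illustrative assumptions for exposition, not the paper's actual architecture:

```python
import math

def softmax(scores):
    """Turn raw scores into attention weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(vectors, query):
    """Attention pooling: weight each vector by its score against the query,
    then return the weighted sum and the weights themselves."""
    weights = softmax([dot(v, query) for v in vectors])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors))
              for i in range(len(query))]
    return pooled, weights

def hierarchical_encode(statements, query):
    """Two-level ("hierarchical") attention: first pool the token vectors
    inside each statement, then pool the resulting statement vectors."""
    stmt_vecs = [attend(tokens, query)[0] for tokens in statements]
    return attend(stmt_vecs, query)

def advantage(reward, value_next, value_curr, gamma=0.99):
    """Advantage estimate r + gamma*V(s') - V(s): the signal that scales the
    actor's policy update while the critic regresses toward observed returns."""
    return reward + gamma * value_next - value_curr
```

In the paper's setting the queries and scores come from learned layers and the critic is a trained value network; this sketch only shows the shape of the computation, i.e., attention applied once per level of the code hierarchy and an advantage term replacing the raw reward in the actor's update.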

Keywords:
Computer science, Automatic summarization, Reinforcement learning, Artificial intelligence, Language model, Code generation, Natural language, Source code, Natural language processing, Abstract syntax tree, Syntax, Programming language

Metrics

Cited By: 98
FWCI (Field Weighted Citation Impact): 16.14
Refs: 111
Citation Normalized Percentile: 0.99 (in top 1%, in top 10%)

Citation History

Topics

Software Engineering Research
Physical Sciences →  Computer Science →  Information Systems
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Software Testing and Debugging Techniques
Physical Sciences →  Computer Science →  Software

Related Documents

JOURNAL ARTICLE

Improving automatic source code summarization via deep reinforcement learning

Wan, Yao

Journal: Zenodo (CERN European Organization for Nuclear Research)  Year: 2018
JOURNAL ARTICLE

The Source Code Comment Generation Based on Deep Reinforcement Learning and Hierarchical Attention

Daoyang Ming, Weicheng Xiong

Journal: Academic Journal of Science and Technology  Year: 2023  Vol: 8 (1)  Pages: 74-81
JOURNAL ARTICLE

Code Structure–Guided Transformer for Source Code Summarization

Shuzheng Gao, Cuiyun Gao, Yulan He, Jichuan Zeng, Lunyiu Nie, Xin Xia, Michael R. Lyu

Journal: ACM Transactions on Software Engineering and Methodology  Year: 2022  Vol: 32 (1)  Pages: 1-32