Code completion automatically generates the missing parts of a program from the existing code. It helps programmers code more efficiently and make fewer errors, and it is most useful when the correct next token appears near the top of the recommendation list. This work advances next-token prediction for code completion. We use GPT-2 as the underlying architecture, and research has shown that incorporating the structural information of abstract syntax trees (ASTs) is effective for code prediction; in particular, prediction of the next token's type improves significantly. We propose an algorithm that segments ASTs while preserving their structural characteristics, and we use Tree-LSTM to extract the structural information of the ASTs. We evaluate the approach on a standard dataset and run ablation experiments that remove individual components to validate its effectiveness.
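To make the idea of segmenting an AST without breaking its structure concrete, the following is a minimal sketch and not the algorithm proposed in this work. It assumes a simple greedy rule: a subtree is kept whole as one segment whenever it fits within a node budget, and is otherwise replaced by the segments of its children. The names `subtree_size`, `segment_ast`, and `max_nodes` are hypothetical, and Python's standard `ast` module stands in for whatever parser the approach actually uses.

```python
import ast


def subtree_size(node: ast.AST) -> int:
    """Count the nodes in the subtree rooted at `node` (the root included)."""
    return 1 + sum(subtree_size(child) for child in ast.iter_child_nodes(node))


def segment_ast(node: ast.AST, max_nodes: int) -> list[ast.AST]:
    """Split an AST into subtree roots that each hold at most `max_nodes` nodes.

    Illustrative assumption: every segment is an intact subtree, so all
    parent-child links inside a segment are preserved. A node whose subtree
    exceeds the budget is not emitted itself; only its children's segments are.
    """
    if max_nodes < 1:
        raise ValueError("max_nodes must be at least 1")
    if subtree_size(node) <= max_nodes:
        return [node]
    segments: list[ast.AST] = []
    for child in ast.iter_child_nodes(node):
        segments.extend(segment_ast(child, max_nodes))
    return segments


if __name__ == "__main__":
    source = "def f(x):\n    return x + 1\n\ny = f(2)\n"
    tree = ast.parse(source)
    for segment in segment_ast(tree, max_nodes=8):
        print(type(segment).__name__, subtree_size(segment))
```

Each returned root could then be passed to a tree encoder such as a Tree-LSTM. One limitation of this simplified rule is that oversized ancestor nodes are dropped, so a real segmentation scheme would need some way to retain the links between segments.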
Jiahao Li, Linbo Zhu, Bowen Lv, Jun Ding
Chen Lin, Zhichao Ouyang, Junqing Zhuang, Jianqiang Chen, Hui Li, Rongxin Wu
Tiancheng Hu, Zijing Xu, Yilin Fang, Yueming Wu, Bin Yuan, Deqing Zou, Hai Jin
Baojiang Cui, Jiansong Li, Tao Guo, Jianxin Wang, Ding Ma