JOURNAL ARTICLE

MPT‐embedding: An unsupervised representation learning of code for software defect prediction

Ke Shi, Yang Lu, Guangliang Liu, Zhenchun Wei, Jingfei Chang

Year: 2020   Journal: Journal of Software: Evolution and Process   Vol: 33 (4)   Publisher: Wiley

Abstract

Software project defect prediction can help developers allocate debugging resources. Existing software defect prediction models are usually based on machine learning methods, especially deep learning. Deep learning‐based methods tend to build end‐to‐end models that directly use source code‐based abstract syntax trees (ASTs) as input. They do not pay enough attention to the front‐end data representation. In this paper, we propose a new framework to represent source code called multiperspective tree embedding (MPT‐embedding), which is an unsupervised representation learning method. MPT‐embedding parses the nodes of ASTs from multiple perspectives and encodes the structural information of a tree into a vector sequence. Experiments on both cross‐project defect prediction (CPDP) and within‐project defect prediction (WPDP) show that, on average, MPT‐embedding provides improvements over the state‐of‐the‐art method.
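
The abstract does not say which perspectives MPT‐embedding uses or how the vectors are learned, so the following is only a minimal sketch of the general idea of turning an AST into a sequence of per-node vectors. It uses Python's standard `ast` module for convenience. The three views chosen here (node type, depth, parent type) and the helper names `node_views` and `to_vectors` are assumptions made for illustration, not the authors' method.

```python
import ast


def node_views(source):
    """Return one (node_type, depth, parent_type) tuple per AST node, in pre-order.

    These three views are illustrative assumptions, not the perspectives
    defined in the paper.
    """
    tree = ast.parse(source)
    rows = []

    def visit(node, depth, parent_type):
        node_type = type(node).__name__
        rows.append((node_type, depth, parent_type))
        for child in ast.iter_child_nodes(node):
            visit(child, depth + 1, node_type)

    visit(tree, 0, "<root>")
    return rows


def to_vectors(rows):
    """Replace symbolic views with integer ids so each node becomes a small vector."""
    vocab = {}

    def idx(token):
        # Assign the next free id the first time a token is seen.
        return vocab.setdefault(token, len(vocab))

    vectors = [[idx(node_type), depth, idx(parent_type)]
               for node_type, depth, parent_type in rows]
    return vectors, vocab


rows = node_views("def f(x):\n    return x + 1\n")
vectors, vocab = to_vectors(rows)
```

A real pipeline would feed such a sequence into an unsupervised embedding model before training a defect classifier. This sketch stops at the integer-encoded sequence.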

Keywords:
Computer science; Embedding; Abstract syntax; Source code; Artificial intelligence; Representation (politics); Machine learning; Debugging; Software; Deep learning; Tree (set theory); Code (set theory); Feature learning; Syntax; Programming language; Natural language processing

Metrics

Cited By: 24
FWCI (Field Weighted Citation Impact): 3.96
Refs: 53
Citation Normalized Percentile: 0.94

Topics

Software Engineering Research
Physical Sciences →  Computer Science →  Information Systems
Software System Performance and Reliability
Physical Sciences →  Computer Science →  Computer Networks and Communications
Software Reliability and Analysis Research
Physical Sciences →  Computer Science →  Software