JOURNAL ARTICLE

C source code clustering using bisecting K-means

Ahmad Ridha, Arizal Notyasa

Year: 2023 Journal: AIP Conference Proceedings Vol: 2931 Pages: 020003-020003 Publisher: American Institute of Physics

Abstract

Due to the COVID-19 pandemic, educational institutions have had to rely on e-learning tools, including in programming courses. Automatic graders can speed up the evaluation of correctness. Unfortunately, answers to coding exercises are easily plagiarized, and manual grading of all student submissions may not be feasible. Therefore, a system that helps detect similar code is needed. Detection can be done by grouping similar source code based on its structure. Previous research applied this method using an automatic K-means iterations algorithm. Although that algorithm produced decent clusters, it had a long execution time. The purpose of this research is to improve time efficiency and cluster quality by using the bisecting K-means algorithm. The results show a significant improvement in execution time, from 11.68 seconds to 6.64 seconds. Bisecting K-means also produced fewer clusters, with a slightly better Rand Index than K-means iterations. We also conducted experiments using 2-grams to 6-grams and confirmed that 4-grams give the best performance.
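
The abstract names three building blocks: token n-grams, bisecting K-means, and the Rand Index. The following is a minimal Python sketch of how they could fit together, not the authors' implementation. Several details are assumptions made for illustration: submissions are taken to be already tokenized with identifiers and literals normalized to `id` and `num`, distance is squared Euclidean over n-gram counts, the cluster with the largest sum of squared errors is the one split next, and each split keeps the best of a few random 2-means trials. All function names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def ngram_profile(tokens, n=4):
    """Count the overlapping token n-grams of one submission."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse count vectors."""
    return sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in set(a) | set(b))


def centroid(members):
    """Mean of a non-empty list of sparse vectors."""
    total = Counter()
    for m in members:
        total.update(m)
    return {k: v / len(members) for k, v in total.items()}


def sse(vectors, idx):
    """Sum of squared distances from the members in idx to their centroid."""
    c = centroid([vectors[i] for i in idx])
    return sum(sq_dist(vectors[i], c) for i in idx)


def two_means(vectors, idx, rng, iters=20):
    """Split idx into two groups with plain 2-means; ([], []) if it fails."""
    centers = [vectors[i] for i in rng.sample(idx, 2)]
    left, right = [], []
    for _ in range(iters):
        new_left = [i for i in idx
                    if sq_dist(vectors[i], centers[0]) <= sq_dist(vectors[i], centers[1])]
        new_right = [i for i in idx if i not in new_left]
        if not new_left or not new_right or (new_left, new_right) == (left, right):
            break
        left, right = new_left, new_right
        centers = [centroid([vectors[i] for i in g]) for g in (left, right)]
    return left, right


def bisecting_kmeans(vectors, k, seed=0, trials=5):
    """Return one cluster label per vector, splitting until k clusters exist."""
    rng = random.Random(seed)
    clusters = [list(range(len(vectors)))]
    while len(clusters) < k:
        splittable = [c for c in clusters if len(c) > 1]
        if not splittable:
            break
        # Assumption: split the cluster with the largest SSE next.
        target = max(splittable, key=lambda c: sse(vectors, c))
        best = None
        for _ in range(trials):
            left, right = two_means(vectors, target, rng)
            if left and right:
                cost = sse(vectors, left) + sse(vectors, right)
                if best is None or cost < best[0]:
                    best = (cost, left, right)
        if best is None:  # no usable split found, e.g. all members identical
            break
        clusters.remove(target)
        clusters += [best[1], best[2]]
    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def rand_index(a, b):
    """Fraction of item pairs on which two labelings agree (same vs. different)."""
    pairs = list(combinations(range(len(a)), 2))
    if not pairs:
        return 1.0
    return sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical, pre-normalized token streams for three submissions.
    subs = ["for ( id = num ; id < num ; id ++ ) { id += id ; }",
            "for ( id = num ; id < num ; id ++ ) { id += id ; id ++ ; }",
            "if ( id == num ) return num ; return id * call ( id - num ) ;"]
    vectors = [ngram_profile(s.split(), n=4) for s in subs]
    print(bisecting_kmeans(vectors, k=2))
```

Changing `n` in `ngram_profile` is how an experiment over 2-grams to 6-grams could be run in this sketch. `rand_index` would be applied to the predicted labels and a manually prepared reference grouping.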

Keywords:
Computer science, Correctness, Cluster analysis, Coding (social sciences), Source code, Grading (engineering), Code (set theory), Process (computing), k-means clustering, Economic shortage, Data mining, Algorithm, Artificial intelligence, Programming language, Set (abstract data type), Mathematics

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 18
Citation Normalized Percentile: 0.20

Topics

Software Testing and Debugging Techniques
Physical Sciences →  Computer Science →  Software
Software Engineering Research
Physical Sciences →  Computer Science →  Information Systems
Advanced Malware Detection Techniques
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

JOURNAL ARTICLE

Text Document clustering using partial Fractionation and Bisecting K-means

Harpreet Singh, Sarabpreet Kaur, Chetna Kaushal

Journal: 2022 10th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) Year: 2022 Vol: 1 Pages: 1-6
JOURNAL ARTICLE

Parallel bisecting k-means with prediction clustering algorithm

Yanjun Li, Soon M. Chung

Journal: The Journal of Supercomputing Year: 2006 Vol: 39 (1) Pages: 19-37
JOURNAL ARTICLE

Bisecting K-Means for Clustering Web Log data

Ruchika Patil, Amreen Khan

Journal: International Journal of Computer Applications Year: 2015 Vol: 116 (19) Pages: 36-41
BOOK-CHAPTER

An Improved Bisecting K-Means Text Clustering Method

Zi Ye, Kun Liang, Zhiyuan Zhang, Chunfeng Wang, Zhe Peng

Advances in intelligent systems and computing Year: 2019 Pages: 155-162