Ishaya Gambo, Clavers Chabi, Simon Yange, Rhoda Ikono, Theresa Omodunbi
Context: Mining an unstructured software repository entails analyzing and thoroughly understanding its data in order to produce quality software products that satisfy users. However, the vast amount of information entering the repository, together with the data's unstructured and noisy nature, makes it difficult to obtain timely and error-free information. This hinders the timely completion of development projects and leads to delays in delivery. Objective: The chapter aims to develop a recommendation system that helps software developers produce new products of good quality. The overarching goal is to understand and resolve the challenges, complexities and peculiarities of data in software repositories so that developers can locate useful data in a software development project without wasting time. Methods: We adopted a quantitative research approach, experimenting with a tool we developed called System Analysis and Mining List Recommendation System (SAMLRS). We used PyDriller to collect data from GitHub; specifically, one thousand (1,000) projects were extracted. The dataset was processed and analyzed using Google BigQuery and Natural Language Processing (NLP). We formulated a model that handles the unstructured data in the repositories using an Artificial Neural Network (ANN) technique. The model was specified in the Unified Modeling Language (UML), and the system was implemented in the Python programming language. We used recall, precision and execution time as parameters to evaluate the performance of the model. Results: The system produced a timely recommended list of requirements documents upon programmers' requests. Of the 1,000 repositories extracted from GitHub as the raw dataset, more than 700 were well structured. We obtained a performance improvement of 75% in structuring data in the repository and an 84% performance improvement in data recommendation, with an execution time of 1.98 seconds. Our results imply that programmers can locate functional requirements and design documents more effectively and efficiently.
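The three evaluation parameters named in the abstract (recall, precision and execution time) can be illustrated with a minimal sketch. This is not the SAMLRS implementation: the function names, the document file names and the relevance judgments below are hypothetical and serve only to show how such metrics are commonly computed for a single recommended list.

```python
import time


def precision_recall(recommended, relevant):
    """Return (precision, recall) for one recommendation list.

    precision = relevant items recommended / all items recommended
    recall    = relevant items recommended / all relevant items
    """
    recommended_set = set(recommended)
    relevant_set = set(relevant)
    hits = len(recommended_set & relevant_set)
    precision = hits / len(recommended_set) if recommended_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical data: documents returned for one programmer request,
# and the documents a human judged relevant to that request.
recommended = ["req_login.md", "design_auth.md", "readme.md", "req_export.md"]
relevant = ["req_login.md", "design_auth.md", "req_export.md",
            "design_db.md", "req_search.md"]

(precision, recall), elapsed = timed(precision_recall, recommended, relevant)
print(f"precision={precision:.2f} recall={recall:.2f} time={elapsed:.6f}s")
```

In this toy case three of the four recommended documents are relevant, and three of the five relevant documents are retrieved. In an actual evaluation, the timed call would wrap the full recommendation step, not just the metric computation.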
Qais A. Shreda, Abualsoud Hanani
Mikalai Krapivin, Aliaksandr Autayeu, Maurizio Marchese, Enrico Blanzieri, Nicola Segata