JOURNAL ARTICLE

Predicting the Borrower’s Genuineness in Loan Repayment through Big Data Analytics

Abstract

Banks facilitate economic activity, allocate financial resources, and manage risk, and lending is one of their fundamental functions. This research addresses "Predicting Borrower's Integrity in Loan Repayment," with the aim of mitigating risk and supporting prudent lending decisions. For the predictive analysis, we used a loan dataset provided by Lending Club Bank, consisting of 2.2 million records with 151 features each. Running machine learning predictions on a dataset of this size (1.3 gigabytes) is challenging, so we used Apache Spark as our primary big data tool and ran it on Google Cloud's Dataproc platform. Feature selection reduced the original 151 features to 28 significant ones, and the selected features were then transformed into a form the models could consume. Logistic Regression and Random Forest Classification models were used to predict loan status, classifying each loan as either 'fully paid' or 'charged off'. These models achieved accuracies of 95.9 percent and 86 percent, respectively. This research contributes to loan assessment practice and to the refinement of risk management strategies in the banking sector.
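To make the classification step concrete, the following is a minimal sketch of binary loan-status prediction with logistic regression. It is not the authors' pipeline: the study ran on Apache Spark, whereas this sketch uses plain Python with batch gradient descent so that it runs without a cluster. The two feature columns, their values, the learning rate, and the epoch count are all illustrative assumptions, and the six-row dataset is synthetic.

```python
import math

# Assumed label mapping; the abstract names only the two classes.
LABELS = {"Fully Paid": 1, "Charged Off": 0}


def encode_status(status):
    """Map a loan-status string to a binary label (1 = fully paid)."""
    return LABELS[status.strip().title()]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (fully paid) when the predicted probability is at least 0.5."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


def accuracy(weights, bias, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict(weights, bias, x) == y for x, y in zip(rows, labels))
    return hits / len(labels)


# Synthetic data: two hypothetical risk features already scaled to [0, 1].
rows = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
        [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
statuses = ["Fully Paid", "fully paid", "Fully Paid",
            "Charged Off", "charged off", "Charged Off"]
labels = [encode_status(s) for s in statuses]

weights, bias = train_logistic(rows, labels)
acc = accuracy(weights, bias, rows, labels)
print(f"training accuracy: {acc:.2f}")
```

The accuracy here is computed on the training rows only and says nothing about the figures reported in the abstract. At the scale described (2.2 million records), a distributed library would take the place of these hand-written loops.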

Keywords:
Analytics, Loan, Big data, Computer science, Data science, Econometrics, Actuarial science, Business, Data mining, Finance, Economics

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 1.18
Refs: 28
Citation Normalized Percentile: 0.82

Topics

Financial Distress and Bankruptcy Prediction
Social Sciences → Business, Management and Accounting → Accounting
Impact of AI and Big Data on Business and Society
Social Sciences → Decision Sciences → Management Science and Operations Research
Insurance and Financial Risk Management
Social Sciences → Economics, Econometrics and Finance → Economics and Econometrics

Related Documents

JOURNAL ARTICLE

Predicting Loan Repayment: A Machine Learning Approach

Dr. Vikas Singhal, Prashant Tiwari

Journal: International Research Journal of Computer Science, Year: 2025, Vol: 12 (04), Pages: 125-130
JOURNAL ARTICLE

Does repayment burden of student loan lower the job expectation? : Focusing on the student loan borrower’s reservation wage

Jinkwon Lee, Jae-Woon Hwang

Journal: The Korean Society for the Economics and Finance of Education, Year: 2017, Vol: 26 (3), Pages: 107-134
JOURNAL ARTICLE

Modeling on a Flexible Loan Repayment Method Based on the Borrower’s Asset with Boundary

进 梁

Journal: Finance, Year: 2020, Vol: 10 (02), Pages: 95-103