Abstract
This study presents a holistic predictive analytics framework for improving loan approval and risk assessment in financial institutions. Using a synthetic Kaggle dataset of 20,000 records, we applied four classification and regression models: Logistic Regression, XGBoost, Decision Tree and Random Forest. The results show that Logistic Regression classifies loan approvals most accurately, with 95.7% accuracy, while Random Forest performs best at predicting risk scores, with an R-squared of 0.87. The variables with the greatest influence on the results are TotalDebtToIncomeRatio, BankruptcyHistory and CreditScore. These findings suggest that data-driven strategies built on key financial indicators can improve the loan approval process, increase its reliability and strengthen risk management. The study identifies a gap in comparative studies of the classification and regression models used for such assessments and highlights the importance of feature importance analysis for transparent decision-making. For financial institutions, the implication is that adopting predictive models can reduce human error and shorten processing time without sacrificing efficiency. Further research can build on this framework by exploring more advanced machine learning techniques, such as deep learning, to further improve predictive accuracy and decision support in a constantly changing financial landscape.
