dc.description.abstract |
The main aim of this study is to compare machine learning models with traditional statistical models in predicting credit risk for a commercial bank. The evaluation is conducted at varying levels of data balancing to determine how balancing affects the performance of the models under study. Logistic Regression serves as the statistical baseline, while the machine learning techniques, selected from the reviewed literature, are k-NN, SVM, Decision Tree, MLP, and RBFNN. Logistic Regression showed consistent AUC values around 0.72, while SVM excelled at higher balance levels with an AUC of 0.73. The MLP model was superior on a fully balanced dataset, achieving an AUC of 0.78. The performance of Decision Tree and k-NN, however, varied with dataset balance, and RBFNN underperformed. The analysis concludes that no single model is universally superior. The choice of credit risk model by financial institutions should therefore be based on the specifics of the data and the predictive requirements, taking into account the financial impact of prediction errors. |
en |