A machine learning-based credit risk prediction engine system using a stacked classifier and a filter-based feature selection method


Authors

Emmanuel, Ileberi
Sun, Yanxia
Wang, Zenghui

Issue Date

2024-02-01

Type

Journal Article

Abstract

Credit risk prediction is a crucial task for financial institutions. Technological advancements in machine learning, coupled with the availability of data and computing power, have given rise to more credit risk prediction models in financial institutions. In this paper, we propose a stacked classifier approach coupled with a filter-based feature selection (FS) technique to achieve efficient credit risk prediction using multiple datasets. The proposed stacked model includes the following base estimators: Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGB). Furthermore, the estimators in the stacked architecture were linked sequentially to extract the best performance. The filter-based FS method used in this research is based on information gain (IG) theory. The proposed algorithm was evaluated using accuracy, F1-score and the Area Under the Curve (AUC). Furthermore, the stacked algorithm was compared to the following methods: Artificial Neural Network (ANN), Decision Tree (DT), and k-Nearest Neighbour (KNN). The experimental results show that the stacked model obtained AUCs of 0.934, 0.944 and 0.870 on the Australian, German and Taiwan datasets, respectively. These results, in conjunction with the accuracy and F1-score metrics, demonstrate that the proposed stacked classifier outperforms the individual estimators and other existing methods.
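
As an illustration of the filter-based, information-gain feature selection the abstract mentions, the following is a minimal sketch and not the authors' implementation. It assumes discrete feature values, and the function names (`entropy`, `information_gain`, `select_top_k`), the feature names and the toy data are all hypothetical. Continuous features would first need to be discretized, a step not shown here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature column."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each group.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest information gain."""
    scores = {name: information_gain(col, labels)
              for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy, made-up data (not from the paper): 1 = default, 0 = repaid.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
columns = {
    "has_prior_default": [1, 1, 1, 0, 0, 0, 0, 1],  # identical to the label
    "owns_home":         [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
    "employed":          [0, 0, 1, 1, 1, 1, 1, 0],  # partly informative
}

selected = select_top_k(columns, labels, k=2)
print(selected)
```

Because the scores are computed from the data alone, before any classifier is trained, this is a filter method: the retained columns would then be passed to whatever model follows, such as the stacked classifier described above.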

Citation

Journal of Big Data. 2024 Feb 01;11(1):23
