A machine learning-based credit risk prediction engine system using a stacked classifier and a filter-based feature selection method

Emmanuel, Ileberi; Sun, Yanxia; Wang, Zenghui

dc.contributor.author	Emmanuel, Ileberi
dc.contributor.author	Sun, Yanxia
dc.contributor.author	Wang, Zenghui
dc.date.accessioned	2024-03-04T10:29:48Z
dc.date.available	2024-03-04T10:29:48Z
dc.date.issued	2024-02-01
dc.identifier.citation	Journal of Big Data. 2024 Feb 01;11(1):23
dc.identifier.uri	https://doi.org/10.1186/s40537-024-00882-0
dc.identifier.uri	https://hdl.handle.net/10500/30919
dc.description.abstract	Credit risk prediction is a crucial task for financial institutions. Technological advancements in machine learning, coupled with the availability of data and computing power, have given rise to more credit risk prediction models in financial institutions. In this paper, we propose a stacked classifier approach coupled with a filter-based feature selection (FS) technique to achieve efficient credit risk prediction using multiple datasets. The proposed stacked model includes the following base estimators: Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGB). Furthermore, the estimators in the stacked architecture were linked sequentially to extract the best performance. The filter-based FS method used in this research is based on information gain (IG) theory. The proposed algorithm was evaluated using the accuracy, the F1-score and the Area Under the Curve (AUC). Furthermore, the stacked algorithm was compared to the following methods: Artificial Neural Network (ANN), Decision Tree (DT), and k-Nearest Neighbour (KNN). The experimental results show that the stacked model obtained AUCs of 0.934, 0.944 and 0.870 on the Australian, German and Taiwan datasets, respectively. These results, in conjunction with the accuracy and F1-score metrics, demonstrate that the proposed stacked classifier outperforms the individual estimators and other existing methods.
dc.title	A machine learning-based credit risk prediction engine system using a stacked classifier and a filter-based feature selection method
dc.type	Journal Article
dc.date.updated	2024-03-04T10:29:48Z
dc.language.rfc3066	en
dc.rights.holder	The Author(s)
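
The abstract states that the filter-based FS step is based on information gain (IG) but this record gives no further detail. As a rough illustration only, the sketch below computes IG for discrete features and ranks them. The function names, the toy feature names and the toy data are all hypothetical and are not taken from the paper or its datasets; the stacked classifier itself is not covered here.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """IG = H(labels) - H(labels | feature) for one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return feature names sorted by decreasing information gain."""
    scores = {name: information_gain(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data invented for illustration (hypothetical, not a real credit dataset).
toy_labels = [1, 1, 0, 0]
toy_columns = {
    "has_default_history": [1, 1, 0, 0],  # matches the label exactly
    "owns_phone": [1, 0, 1, 0],           # independent of the label
}
ranking = rank_features(toy_columns, toy_labels)
```

In a filter-based setup of this kind, the top-ranked features would be kept and passed to the downstream classifier; how many features are kept, and how continuous features are handled, are choices this sketch leaves open.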