Institutional Repository

Exploring the accuracy-explainability trade-off on credit scoring classifiers

dc.contributor.advisor Malan, Katherine M.
dc.contributor.advisor Jankowitz, M. D.
dc.contributor.author Mtiyane, Sibusiso
dc.date.accessioned 2024-08-26T14:06:02Z
dc.date.available 2024-08-26T14:06:02Z
dc.date.issued 2024-02
dc.identifier.uri https://hdl.handle.net/10500/31537
dc.description Text in English with summaries in Afrikaans and Tswana en
dc.description.abstract Recent research has highlighted the significance of the accuracy and explainability of classification models applied across various disciplines. A wide range of classification models and combinations of models have been extensively studied to determine those with superior performance. These studies demonstrate that models that tend to be more accurate are also difficult to understand; there appears to be a trade-off between accuracy and explainability. Consequently, this has led to an increased focus on explainable artificial intelligence, a field of research concerned with explaining model predictions. Although explainable artificial intelligence is an area of research with growing popularity in the scientific community, there are still limited case studies that explore its applications in credit default risk. Credit default risk refers to the potential financial loss incurred by a credit provider when an obligor fails to meet their debt obligations. To proactively quantify, mitigate and manage the risk associated with granting credit, credit providers utilise scoring classifiers to assess the risk of credit applicants before credit is granted. Furthermore, credit providers are legally required to explain the predictions of scoring classifiers. Popular classifiers used in credit risk include logistic regression, discriminant analysis, decision trees, random forests, bootstrap aggregation, neural networks, support vector machines and gradient boosting algorithms. Logistic regression and discriminant analysis are widely adopted in the financial industry because they perform reasonably well and are inherently interpretable. However, these approaches are giving way to alternatives that offer improved accuracy in risk assessment, even though they lack interpretability; they are less comprehensible and are often regarded as black boxes. 
This lack of interpretability has resulted in a reluctance to adopt these alternative techniques in credit granting. The aim of this study is to remove this barrier to using black box models by utilising explainable artificial intelligence methods, such as Shapley additive explanations and local interpretable model-agnostic explanations. The study also examines the accuracy-explainability trade-off of different classifiers by developing and evaluating eight classification models on two publicly available credit datasets. The eight classification models comprise decision trees, logistic regression, linear discriminant analysis, support vector machines, artificial neural networks, bootstrap aggregation, random forests and a light gradient boosting classifier. Their performance and interpretability were assessed after training and tuning the hyperparameters for optimal comparison on training, testing and validation subsets of the data. Predictive accuracy was measured using the area under the curve on 30 random subsets generated from the validation data. Furthermore, the Kruskal-Wallis test and Dunn's multiple-comparison test were used to rank the predictive models by accuracy and to determine whether the differences in mean accuracy are statistically significant. The interpretability analysis covered both transparent and black box models. To this end, key preprocessing steps were developed to reduce the complexity of local and global model interpretation. In addition, Shapley additive explanations and local interpretable model-agnostic explanations were utilised to analyse the relative importance of features and their impact on predictions. The experiments show that the artificial neural network, ensembles and other tree-based algorithms significantly outperform logistic regression and linear discriminant analysis in the first case study. 
However, contradictory results are obtained in the second case study, where the performance of the classifiers is relatively comparable. This indicates that model performance depends on the data from which the models are constructed. These two case studies show that the perceived trade-off between accuracy and explainability does not always hold. Furthermore, Shapley additive explanations yielded results that are consistent with the intrinsic interpretability results of the transparent methods. This post-hoc interpretability enables us to understand how predictions are made and which factors contributed to them. This is important for creating a reliable and trustworthy framework that uses black box models for credit decisions. The research highlights the benefits of using alternative methods for credit risk scoring, showing that performance can vary significantly. It also demonstrates the effectiveness of Shapley additive explanations and local interpretable model-agnostic explanations in explaining the predictions of black box classifiers. However, it identifies challenges in using Shapley additive explanations: the mean absolute value may be sensitive to outliers, which could affect feature importance. Therefore, further work is required to enhance the efficiency of calculating Shapley additive explanation values for linear classifiers and some ensembles. en
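The evaluation protocol described in the abstract — area-under-the-curve scores on 30 random validation subsets, compared with a Kruskal-Wallis test and ranked by mean accuracy — can be sketched as follows. This is a minimal illustration, not code from the dissertation: the classifier names and AUC distributions below are synthetic assumptions.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(0)

# Hypothetical AUC scores for three classifiers, each evaluated on
# 30 random subsets of the validation data (values are illustrative).
aucs = {
    "logistic_regression": rng.normal(0.74, 0.01, 30),
    "random_forest":       rng.normal(0.78, 0.01, 30),
    "lightgbm":            rng.normal(0.79, 0.01, 30),
}

# Kruskal-Wallis H-test: do the classifiers' AUC distributions differ?
h_stat, p_value = kruskal(*aucs.values())
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")

# Rank models by mean AUC; a post-hoc pairwise comparison such as
# Dunn's test (e.g. posthoc_dunn from the scikit-posthocs package)
# would then identify which specific pairs differ significantly.
ranking = sorted(aucs, key=lambda m: np.mean(aucs[m]), reverse=True)
print("ranking by mean AUC:", ranking)
```

The Kruskal-Wallis test is used here, as in the study, because AUC scores on resampled subsets need not be normally distributed, so a rank-based omnibus test is the safer choice.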
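The abstract's closing caveat — that the mean absolute Shapley value may be sensitive to outliers and thereby distort feature importance — can be demonstrated with a small numpy sketch. The attribution matrix below is synthetic, standing in for the output of a real SHAP explainer; it is not data from the study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative SHAP-style attribution matrix: 200 instances x 3 features.
shap_values = rng.normal(0.0, 0.05, size=(200, 3))
shap_values[:, 0] *= 2.0   # a genuinely influential feature
shap_values[0, 2] = 5.0    # one extreme outlier attribution

# Global importance as the mean absolute SHAP value per feature ...
mean_abs = np.abs(shap_values).mean(axis=0)
# ... versus the median absolute value, which resists the outlier.
median_abs = np.median(np.abs(shap_values), axis=0)

# The single outlier inflates the mean-based importance of feature 2
# well above its median-based importance.
print("mean |SHAP| per feature:  ", np.round(mean_abs, 3))
print("median |SHAP| per feature:", np.round(median_abs, 3))
```

A robust aggregate such as the median, or trimming extreme attributions before averaging, is one commonly suggested mitigation for this sensitivity.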
dc.description.abstract Onlangse navorsing het die belangrikheid uitgelig van die akkuraatheid en verduidelikbaarheid van klassifikasiemodelle wat dwarsoor verskeie dissiplines toegepas word. ’n Wye reeks klassifikasiemodelle en modelkombinasies is omvattend bestudeer om daardie modelle met voortreflike prestasie te bepaal. Hierdie studies het gedemonstreer dat modelle wat neig om meer akkuraat te wees, ook moeilik is om te verstaan; dit kom voor of daar ’n kompromie is tussen akkuraatheid en verduidelikbaarheid. Dit het gevolglik aanleiding gegee tot ’n verhoogde fokus op verduidelikbare kunsmatige intelligensie, ’n navorsingsveld wat met die verduideliking van modelvoorspellings gemoeid is. Alhoewel verduidelikbare kunsmatige intelligensie ’n navorsingsgebied is wat besig is om in gewildheid toe te neem binne die wetenskapgemeenskap, is daar steeds beperkte gevallestudies wat die toepassing daarvan op kredietwanbetalingsrisiko ondersoek. Kredietwanbetalingsrisiko verwys na die potensiële finansiële verlies of risiko waaraan ’n kredietverskaffer blootgestel word wanneer ’n skuldenaar in gebreke bly om hul skuldverpligtinge na te kom. Ten einde die risiko wat met kredietverskaffing geassosieer word proaktief te kwantifiseer, versag en bestuur, moet kredietverskaffers kredietgraderingsklassifiseerders gebruik om die moontlike risiko te evalueer wat kredietaansoekers inhou, voordat krediet toegestaan word. Voorts is kredietrisikoverskaffers volgens wet verplig om die voorspellings van kredietgraderingsklassifiseerders te verduidelik. Gewilde klassifiseerders wat in kredietrisiko gebruik word, sluit logistieke regressie, diskriminantanalise, besluitnemingsbome, ewekansige woude, skoenlussamevoeging, neurale netwerke, ondersteuningsvektormasjiene en gradiëntversterkingsalgoritmes in. Logistieke regressie en diskriminantanalise is algemeen deur die finansiële bedryf aanvaar aangesien hulle redelik goed presteer en inherent verduidelikbaar is. 
Hierdie benaderings skep egter ruimte vir alternatiewe benaderings wat verbeterde akkuraatheid ten opsigte van risiko-assessering bied selfs al gaan hierdie alternatiewe benaderings mank aan interpreteerbaarheid; hulle is nie so verstaanbaar nie en word dikwels as swartkissies (black boxes) gesien. Hierdie gebrek aan interpreteerbaarheid het tot gevolg dat daar ’n traagheid is om hierdie alternatiewe kredietverleningstegnieke aan te neem. Hierdie studie het ten doel om die voorafgenoemde versperring tot die gebruik van swartkissiemodelle te verwyder deur verduidelikbare kunsmatige intelligensiemetodes soos Shapley se additiewe verduidelikings en plaaslike interpreteerbare model-agnostiese verklarings te gebruik. Die studie ondersoek ook die akkuraatheidverduidelikbaarheidskompromie van verskillende klassifiseerders deur agt klassifikasiemodelle vir twee openbaar beskikbare kredietdatastelle te ontwikkel en te evalueer. Agt klassifikasiemodelle is saamgestel, naamlik besluitnemingsbome, logistieke regressie, lineêre diskriminantanalise, ondersteuningsvektormasjiene, kunsmatige neurale netwerke, skoenlussamevoeging, ewekansige woud en ligte gradiëntversterkingsklassifiseerder. Hul prestasie en interpreteerbaarheid is geassesseer na opleiding en instelling van die hiperparameters vir optimale vergelyking van opleiding, toetsing en geldigverklaring van deelversamelings van die data. Prestasie-akkuraatheid is gemeet deur van die area onder die kurwe van 30 ewekansige deelversamelings wat uit die geldigverklaarde data gegenereer is, gebruik te maak. Voorts is daar van die Kruskal-Wallis-toets en Dunn se multivergelykingstoets gebruik gemaak om die voorspellingsmodelle ten opsigte van akkuraatheid te klassifiseer en te bepaal of die verskille in gemiddelde akkuraatheid statisties beduidend is. Die interpreteerbaarheid van hierdie klassifiseerders is vir beide deursigtige en swartkassiemodelle uitgevoer. 
Om hierdie resultate te verkry, is belangrike voorverwerkingstappe ontwikkel om die kompleksiteite van plaaslike sowel as globale modelinterpretasie te verminder. Daarbenewens is Shapley se additiewe verduidelikings en plaaslike interpreteerbare model-agnostiese verduidelikings ook ingespan om die relatiewe belangrikheid van kenmerke en die impak op voorspellings te ontleed. Die eksperimente toon dat die kunsmatige neurale netwerk, ensembles en ander boomgebaseerde algoritmes in die eerste gevallestudie beduidend beter as die logistieke regressie en lineêre diskriminantanalise presteer het. Die tweede gevallestudie het egter teenstrydige resultate opgelewer. In die tweede gevallestudie is die prestasie van die klassifiseerders relatief vergelykbaar. Dit is ’n aanduiding dat modelprestasie afhanklik is van die data waaruit die modelle saamgestel is. Hierdie twee gevallestudies toon dat die waargenome kompromie tussen akkuraatheid en verduidelikbaarheid nie altyd waar is nie. Boonop het die Shapley additiewe verduidelikings resultate opgelewer wat met die intrinsieke interpreteerbaarheidsresultate van die deursigtige metodes ooreenstem. Hierdie post-hoc interpreteerbaarheid help ons om te verstaan hoe die voorspellings gemaak word en watter faktore tot die voorspellings bygedra het. Laasgenoemde is belangrik ten einde ’n betroubare en geloofwaardige raamwerk te skep wat van swartkassiemodelle vir kredietbesluite gebruik maak. Die navorsing beklemtoon die voordele van die gebruik van alternatiewe metodes vir kredietrisikogradering; dit toon dat die prestasie aansienlik kan varieer. Dit demonstreer ook die doeltreffendheid van die Shapley additiewe verduidelikings en plaaslike interpreteerbare model-agnostiese verduidelikings in die verduideliking van voorspellings van swartkissieklassifiseerders. Dit is egter so dat dit uitdagings ten opsigte van die Shapley additiewe verduidelikings identifiseer. 
Die gemiddelde absolute waarde mag dalk sensitief wees vir uitskieters wat ’n impak op die belangrikheid van kenmerke kan hê. Daarom is verdere werk nodig om die doeltreffendheid van die berekening van Shapley se additiewe verduidelikings se waardes vir lineêre klassifiseerders en sommige ensembles te versterk. afr
dc.description.abstract Diphuputso tsa morao tjena di totobaditse bohlokwa ba ho nepahala le ho hlaloswa ha mefuta ya dihlopha e sebediswang dikarolong tse fapaneng. Mefuta e mengata e fapaneng ya dihlopha le motswako wa mefuta e nnile ya ithutwa haholo ho fumana hore na ke efe e nang le tshebetso e phahameng. Diphuputso tsena di bontsha hore mehlala e atisang ho nepahala haholwanyane le yona e thata ho e utlwisisa; ho bonahala ho e na le kgwebo pakeng tsa ho nepahala le ho hlalosa. Ka lebaka leo, sena se lebisitse tlhokomelong e eketsehileng ho bohlale bo hlakileng ba maiketsetso, lefapha la dipatlisiso le amanang le ho hlalosa dikgakanyo tsa mohlala. Leha bohlale ba maiketsetso bo hlaloswang e le sebaka sa dipatlisiso se ntseng se hola setumo se ntseng se hola setjhabeng sa mahlale, ho ntse ho na le dithuto tse fokolang tse hlahlobang tshebediso ya yona kotsing ya ho se be teng ha mekitlane. Kotsi ya ho se be teng ha mokitlane e bolela tahlehelo ya ditjhelete e ka bang teng kapa kotsi e hlahiswang ke mofani wa mokoloto ha motho ya tlamang a hloleha ho fihlela mekoloto ya hae. Ho lekanya, ho fokotsa le ho laola kotsi e amanang le ho fana ka mokoloto ka potlako, bafani ba mekitlane ba sebedisa dihlopha tsa dintlha ho lekola kotsi ya bakopi ba mekitlane pele ba fana ka mokoloto. Ho feta moo, bafani ba kotsi ya mokoloto ba hlokwa ka molao ho hlalosa dikgakanyo tsa dihlopha tsa dintlha. Dihlopha tse tsebahalang tse sebediswang e le kotsi ya mokoloto di kenyelletsa ho theola maemo, hlahlobo ya kgethollo, difate tsa diqeto, meru e sa rerwang, pokello ya bootstrap, marangrang a neural, metjhini ya divector ya tshehetso le dialgorithms tse matlafatsang. Phokotso ya dintho le hlahlobo ya kgethollo di amohelwa haholo indastering ya ditjhelete hobane di sebetsa hantle ka mokgwa o utlwahalang mme ka tlhaho di ka tolokwa. 
Leha ho le jwalo, mekgwa ena e fana ka mokgwa wa mekgwa e meng e fanang ka ho nepahala ho ntlafetseng ha ho hlahlojwa kotsi, le hoja mekgwa ena e meng e se na tlhaloso; ha di utlwisisehe mme hangata di nkwa e le mabokose a matsho. Kgaello ena ya hlaloso e bakile ho qeaqea ho sebedisa mekgwa ena e meng ya ho fana ka mekoloto. Sepheo sa thuto ena ke ho tlosa mokwallo o boletsweng ka hodimo wa ho sebedisa mehlala ya diblackbox ka ho sebedisa mekgwa e hlakileng ya bohlale ba maiketsetso, jwalo ka dihlaloso tsa tlatsetso tsa Shapley le dihlaloso tsa sebaka sa habo bona tsa agnostic. Boithuto bona bo boetse bo hlahloba kgwebo e nepahetseng le hlaloso e nepahetseng ya dihlopha tse fapaneng ka ho theha le ho lekola mefuta e robedi ya dikarolo ho didatabase tse pedi tse fumanehang phatlalatso ya tsa mekoloto. Ho ile ha ahwa mefuta e robedi ya dikarolo, ho kenyeletswa lifate tsa liqeto, ho theoha ha thepa, hlahlobo ya kgethollo e tshwanang, metjhini ya divector tse tshehetsang, marangrang a maiketsetso a neural, aggregation ya bootstrap, moru o sa rerwang, le sehlopha se matlafatsang se bobebe. Tshebetso ya bona le hlaloso ya bona di ile tsa hlahlojwa ka mora ho kwetliswa le ho lokisa di-hyperparameters bakeng sa papiso e nepahetseng mabapi le kwetliso, diteko le ho netefatsa dikarolwana tsa data. Ho nepahala ha tshebetso ho ile ha lekanyetswa ho sebediswa sebaka se ka tlasa lekgalo ho disubsets tse 30 tse sa rerwang tse hlahisitsweng ho data ya netefatso. Ho feta moo, teko ya Kruskal Wallis le ya Dunn ya ho bapisa dintho tse ngata di ile tsa sebediswa ho beha maemo a ponelopele ka ho nepahala le ho fumana hore na diphapano tsa ho nepahala ha moelelo di bohlokwa ho latela dipalo. Hlaloso ya dihlopha tsena e ile ya etswa bakeng sa mehlala ya dibox tse bonaletsang le tse ntsho. Ho finyella diphello tsena, mehato ya bohlokwa ya ho lokisa esale pele e ile ya ntlafatswa ho fokotsa ho rarahana ha hlaloso ya mohlala ya lehae le ya lefatshe. 
Ntle le moo, dihlaloso tsa tlatsetso tsa Shapley le dihlaloso tsa sebaka sa sebaka sa motlolo wa agnostic di ile tsa sebediswa ho sekaseka bohlokwa bo lekanyeditsweng ba dikarolo le phello ya dikgakanyo. Diteko di bontsha hore marangrang a maiketsetso a methapo ya kutlo, di-ensembles le di-algorithms tse ding tse thehilweng sefateng di feta haholo ho theoha ha thepa le hlahlobo e fapaneng ya kgethollo thutong ya pele. Leha ho le jwalo, diphetho tse hanyetsanang di fumanwa bakeng sa thuto ya mohlala ya bobedi, kaha tshebetso ya dihlopha di batla di bapiswa. Sena se bontsha hore tshebetso ya mohlala e itshetlehile ka data eo mehlala e ahilweng ho yona. Dithuto tsena tse pedi tsa dinyewe di bontsha hore phapang pakeng tsa ho nepahala le ho hlalosa ha se kamehla e leng nnete. Ho feta moo, dihlaloso tsa tlatsetso tsa Shapley di hlahisitse ditholwana tse tsamaellanang le sephetho sa ho toloka ha mekgwa e pepeneneng. Hlaloso ena ya post-hoc e re thusa ho utlwisisa hore na dikgakanyo di etswa jwang le hore na ke dintlha dife tse tlatseditseng ho bolela esale pele. Sena ke sa bohlokwa ho theha moralo o ka tsheptjwang le o ka tsheptjwang o sebedisang mehlala ya lebokose le letsho bakeng sa diqeto tsa mokitlane. Patlisiso e totobatsa melemo ya ho sebedisa mekgwa e meng bakeng sa dintlha tsa kotsi ya mokoloto, e bontsha hore tshebetso e ka fapana haholo. E boetse e bontsa katleho ya dihlaloso tsa tlatsetso ya Shapley le dihlaloso tsa sebaka seo ho ka tolokwang tsa mohlala-agnostic ho hlalosa dikgakanyo tsa dihlopha tsa diblackbox. Leha ho le jwalo, e supa mathata a ho sebedisa dihlaloso tsa tlatsetso ya Shapley. Theko ya boleng bo felletseng e kanna ya ameha ho barekisi ba kantle, e ka amang bohlokwa ba karolo. Ka hona, mosebetsi o mong o a hlokahala ho ntlafatsa bokgoni ba ho bala boleng ba dihlaloso tsa tlatsetso tsa Shapley bakeng sa dihlopha tsa linear le diensembles tse ding. tsw
dc.format.extent 1 online resource (xx, 117 leaves): color illustrations en
dc.language.iso en en
dc.subject SDG 9 Industry, Innovation and Infrastructure en
dc.subject.other UCTD en
dc.title Exploring the accuracy-explainability trade-off on credit scoring classifiers en
dc.type Dissertation en
dc.description.department College of Engineering, Science and Technology en
dc.description.degree M. Sc. (Operations Research) en

