dc.contributor.advisor |
Twala, Bhekisipho
|
|
dc.contributor.author |
Kgare, Mahlodi
|
|
dc.date.accessioned |
2022-06-21T09:27:27Z |
|
dc.date.available |
2022-06-21T09:27:27Z |
|
dc.date.issued |
2021-09 |
|
dc.identifier.uri |
https://hdl.handle.net/10500/28996 |
|
dc.description.abstract |
Policy lapse is a vital consideration in life insurance, as it affects future pricing and the solvency of the life insurer. Accurate lapse prediction helps insurers implement personalised retention strategies based on the model’s output. The major contribution of the dissertation is the empirical comparison and benchmarking of nine classifiers for life insurance lapse prediction: seven machine learning models (Decision Tree, Gradient Boost, Random Forest, Support Vector Machine with a linear kernel, Support Vector Machine with a polynomial kernel, Neural Network trained with Levenberg-Marquardt, and Neural Network trained with backpropagation) and two traditional algorithms (Logistic Regression with forward variable selection and Logistic Regression with backward variable selection). Model performance was evaluated on datasets from two insurers with different distributions (Insurer 1 and Insurer 2) and under two feature selection methods, namely Principal Component Analysis (PCA) and Chi-squared. Accuracy, F-measure, sensitivity, specificity, and the Receiver Operating Characteristic (ROC) curve were used as performance measures. The results show that ensemble models (Gradient Boost and Random Forest) predict lapse better than single classifiers, and strongly indicate that suitable parameter tuning and boosting improve model performance. The best overall classifier is Gradient Boosting, with accuracies of 92% and 76% and F-measures of 92% and 84% for the Insurer 1 and Insurer 2 datasets, respectively. The study recommends the use of ensemble models instead of single classifiers, as they performed better at predicting life insurance lapses in this comparison. |
en |
dc.format.extent |
1 online resource (xii, 102 pages) : illustrations, graphs (some color) |
|
dc.language.iso |
en |
en |
dc.subject |
Decision tree |
en |
dc.subject |
Generalised linear models |
en |
dc.subject |
Logistic regression |
en |
dc.subject |
Lapse |
en |
dc.subject |
Machine learning |
en |
dc.subject.ddc |
368.016 |
|
dc.subject.lcsh |
Life insurance policies -- Statistics |
|
dc.subject.lcsh |
Twisting (Life insurance fraud) |
|
dc.subject.lcsh |
Lapse (Law) |
|
dc.subject.lcsh |
Linear models (Statistics) |
|
dc.subject.lcsh |
Computer algorithms |
|
dc.subject.lcsh |
Forecasting |
|
dc.title |
Predicting lapse rate in life insurance using machine learning algorithms |
en |
dc.type |
Dissertation |
en |
dc.description.department |
Statistics |
en |
dc.description.degree |
M. Sc. (Statistics) |
|