Predicting lapse rate in life insurance using machine learning algorithms

Kgare, Mahlodi

dc.contributor.advisor	Twala, Bhekisipho
dc.contributor.author	Kgare, Mahlodi
dc.date.accessioned	2022-06-21T09:27:27Z
dc.date.available	2022-06-21T09:27:27Z
dc.date.issued	2021-09
dc.identifier.uri	https://hdl.handle.net/10500/28996
dc.description.abstract	Policy lapse is a vital component in life insurance as it affects future pricing and impacts the solvency of the life insurer. Accurate prediction of lapse helps insurers implement personalised retention strategies based on the model’s outcome. The major contribution of the dissertation is the empirical comparison and benchmarking of nine classifier models: seven machine learning models (i.e., Decision Tree, Gradient Boost, Random Forest, Support Vector Machine trained with a linear kernel, Support Vector Machine trained with a polynomial kernel, Neural Network trained with Levenberg-Marquardt, and Neural Network trained with backpropagation) and two traditional algorithms (i.e., Logistic Regression with forward variable selection and Logistic Regression with backward variable selection) for life insurance lapse prediction. The models’ accuracy was observed over two insurer datasets with different distributions (Insurer 1 and Insurer 2) and two feature selection methods, namely Principal Component Analysis (PCA) and Chi-squared. Accuracy, F-measure, sensitivity, specificity, and the Receiver Operating Characteristic (ROC) curve were used as performance measures. The results show the stronger predictive ability of ensemble models (Gradient Boost and Random Forest) over single classifiers, and there is a strong indication that suitable parameter tuning and model boosting improve model performance. The best overall classifier is Gradient Boosting, with an accuracy of 92% and 76% and an F-measure of 92% and 84% for the Insurer 1 and Insurer 2 datasets, respectively. The study recommends the use of ensemble models instead of single-model classifiers, as they were found to work better when predicting life insurance lapses.	en
dc.format.extent	1 online resource (xii, 102 pages) : illustrations, graphs (some color)
dc.language.iso	en	en
dc.subject	Decision tree	en
dc.subject	Generalised linear models	en
dc.subject	Logistic regression	en
dc.subject	Lapse	en
dc.subject	Machine learning	en
dc.subject.ddc	368.016
dc.subject.lcsh	Life insurance policies -- Statistics
dc.subject.lcsh	Twisting (Life insurance fraud)
dc.subject.lcsh	Lapse (Law)
dc.subject.lcsh	Linear models (Statistics)
dc.subject.lcsh	Computer algorithms
dc.subject.lcsh	Forecasting
dc.title	Predicting lapse rate in life insurance using machine learning algorithms	en
dc.type	Dissertation	en
dc.description.department	Statistics	en
dc.description.degree	M. Sc. (Statistics)
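
Note on the performance measures named in the abstract: accuracy, sensitivity, specificity and F-measure can all be derived from the four cells of a binary confusion matrix. The sketch below is illustrative only and is not taken from the dissertation. It assumes that "lapse" is the positive class, and the function name and the example counts are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common metrics from confusion-matrix counts.

    Assumes 'lapse' is the positive class and that every
    denominator below is non-zero (illustrative sketch only).
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    # F-measure: harmonic mean of precision and sensitivity
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


# Hypothetical counts, not results from the dissertation.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because accuracy alone can look favourable when one class dominates, reporting sensitivity, specificity and F-measure alongside it gives a fuller picture of how well lapses themselves are identified.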