Modeling lapse rates using machine learning: a comparison between survival forests and Cox proportional hazards techniques

Authors

  • Jorge Luis Andrade Universidad Complutense de Madrid (Spain)
  • José Luis Valencia Universidad Complutense de Madrid (Spain)

DOI:

https://doi.org/10.26360/2021_7

Keywords:

survival analysis, machine learning, lapse rates, random survival forest

Abstract

This study compares the performance of machine learning and traditional survival analysis techniques in the insurance industry. The techniques compared are the traditional Cox Proportional Hazards (CPH) model and two machine learning models, Random Survival Forests (RSF) and Conditional Inference Forests (CIF). They are applied in a case study of the insurance portfolio of one of Ecuador's largest insurers. The study shows that the machine learning techniques predict the survival function better than CPH, as measured by the C-index and Brier score. It also shows that the predictive contribution of covariates in the RSF model is consistent with that in the traditional CPH model.
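
For readers unfamiliar with the C-index named in the abstract, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' implementation. The function name `concordance_index`, the argument names, and the toy data are hypothetical, and pairs with tied observed times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index (hypothetical helper, illustration only).

    times:       observed durations (time to lapse or to censoring)
    events:      1 if the lapse was observed, 0 if the record is censored
    risk_scores: model output where a higher score means an earlier expected lapse
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the policy with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Simplification: pairs with tied times are skipped entirely.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # the model ranks the earlier lapse as riskier
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration, not from the study).
times = [5, 10, 12, 20, 24]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.4, 0.7, 0.2, 0.1]
c_index = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of lapse times. The Brier score, the other metric named in the abstract, additionally requires weighting for censoring and is not sketched here.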

Published

2021-12-15

How to Cite

Andrade, J. L., & Valencia, J. L. (2021). Modeling lapse rates using machine learning: a comparison between survival forests and Cox proportional hazards techniques. Anales Del Instituto De Actuarios Españoles, (27), 161–183. https://doi.org/10.26360/2021_7

Section

Research articles