Collaborating Authors

 Qazvini, Marjan


Classification problem in liability insurance using machine learning models: a comparative study

arXiv.org Machine Learning

Insurance companies use a range of factors to classify policyholders. In this study, we apply several machine learning models, such as nearest neighbour and logistic regression, to the Actuarial Challenge dataset used by Qazvini (2019) to classify liability insurance policies into two groups: (1) policies with claims and (2) policies without claims. Applications of Machine Learning (ML) models and Artificial Intelligence (AI) in areas such as medical diagnosis, economics, banking, fraud detection, and agriculture have been known for many years, and ML models have changed these industries remarkably. However, despite their high predictive power and their ability to capture nonlinear transformations and interactions between variables, they are being introduced only slowly into the insurance industry and actuarial fields.


Analysis of ELSA COVID-19 Substudy response rate using machine learning algorithms

arXiv.org Machine Learning

Every year, National Statistical Organisations spend time and money collecting information through surveys. Some of these surveys include follow-up studies, and some participants usually do not take part in later surveys because of factors such as death, immigration, change of employment, or health. In this study, we focus on the English Longitudinal Study of Ageing (ELSA) COVID-19 Substudy, which was carried out during the COVID-19 pandemic in two waves. In this substudy, some participants from wave 1 did not participate in wave 2. Our purpose is to predict non-response using Machine Learning (ML) algorithms such as K-nearest neighbours (KNN), random forest (RF), AdaBoost, logistic regression, neural networks (NN), and support vector classifier (SVC). We find that RF outperforms the other models in terms of balanced accuracy, KNN in terms of precision and test accuracy, and logistic regression in terms of the area under the receiver operating characteristic (ROC) curve, i.e. AUC.
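
The abstract above ranks models by balanced accuracy, precision, and AUC, and different models win on different metrics. As a minimal sketch of how these three metrics are computed for a binary non-response target, the following uses plain Python with made-up labels and scores. The function names, the coding of 1 as non-response, the 0.5 threshold, and the data are all assumptions for illustration; none of them come from the study.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity; assumes both classes are present."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of predicted positives that are true positives."""
    tp, _, fp, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = did not respond in wave 2, 0 = responded.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(round(balanced_accuracy(y_true, y_pred), 4))
print(round(precision(y_true, y_pred), 4))
print(round(roc_auc(y_true, scores), 4))
```

Balanced accuracy and precision depend on the chosen threshold, whereas AUC depends only on how the scores rank the cases, which is one reason a single model need not come out best on all three.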


Forecasting Mortality in the Middle-Aged and Older Population of England: A 1D-CNN Approach

arXiv.org Machine Learning

Longitudinal surveys are follow-up studies in which participants' information is recorded at different time steps, say every two years. The difference between time series analysis and longitudinal studies is that in the former we have records of, say, economic variables over a long period, whereas in the latter we have only a few records per participant, and the total number of records depends on the number of participants. One problem with such studies is that we may have a large number of drop-outs, known as right-censoring in survival analysis, due to death, illness, immigration, etc. Another difference is that in time series analysis we aim to predict, say, future stock prices based on past data, whereas in a longitudinal study our goal is to predict a target based on some features. Longitudinal studies are used to study life events, for example in clinical psychology. Generalised linear models (GLMs) (McCullagh and Nelder, 1989) and generalised linear mixed models (GLMMs) (Frees, 2004) are traditional methods used in longitudinal studies. Qazvini (2023) employs a GLMM to analyse the survival rate among the English population using the ELSA dataset.