Wüthrich, Mario V.
The Credibility Transformer
Richman, Ronald, Scognamiglio, Salvatore, Wüthrich, Mario V.
Feed-forward neural networks (FNNs) provide state-of-the-art deep learning regression models for actuarial pricing. FNNs can be seen as extensions of generalized linear models (GLMs): covariates are taken as inputs to the FNN, feature-engineered through several hidden FNN layers, and then used as inputs to a GLM. An advantage of FNNs over classical GLMs is that they are able to find functional forms and interactions in the covariates that cannot easily be captured by GLMs, and which typically require the modeler to have specific deeper insights into the data generation process. Since such insights are not always readily available, FNNs may support the modeler in finding this structure. Taking inspiration from the recent huge success of large language models (LLMs), the natural question arises whether there are network architectures other than FNNs that share more similarity with LLMs and can further improve the predictive performance of neural networks in actuarial pricing. LLMs are based on the Transformer architecture introduced by Vaswani et al. [31]. The Transformer architecture builds on attention layers, special network modules that allow covariate components to communicate with each other.
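The attention mechanism mentioned above can be illustrated with a minimal single-head self-attention sketch in NumPy; this is not the paper's Credibility Transformer architecture, only the generic scaled dot-product attention of Vaswani et al., with random illustrative weight matrices (`Wq`, `Wk`, `Wv`) and a toy set of embedded covariate components:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention: each embedded
    covariate component (a row of X) attends to every other one, which
    is how the components 'communicate' with each other."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])          # scaled dot products
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ V                              # attention-weighted mix

rng = np.random.default_rng(0)
n_tokens, d = 5, 8                       # e.g. 5 embedded covariate components
X = rng.normal(size=(n_tokens, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                         # one enriched embedding per component
```

Each output row is a convex combination of the value vectors of all components, so information from every covariate can flow into every other one.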
Conditional expectation network for SHAP
Richman, Ronald, Wüthrich, Mario V.
A very popular model-agnostic technique for explaining predictive models is the SHapley Additive exPlanation (SHAP). The two most popular versions of SHAP are a conditional expectation version and an unconditional expectation version (the latter is also known as interventional SHAP). Except for tree-based methods, the unconditional version is usually used (for computational reasons). We provide a (surrogate) neural network approach which allows us to calculate the conditional version efficiently for both neural networks and other regression models, and which properly accounts for the dependence structure in the feature components. This proposal is also useful for providing drop1 and anova analyses in complex regression models, similar to their generalized linear model (GLM) counterparts, and we provide a partial dependence plot (PDP) counterpart that respects the dependence structure in the feature components.
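The distinction between the two value functions can be made concrete with a Monte Carlo toy example; this is only an illustrative sketch (toy model `f`, made-up dependence between the two features), not the paper's surrogate-network method. With dependent features, fixing a feature and averaging the other unconditionally (interventional) versus conditionally gives different values:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)    # strongly dependent features

f = lambda a, b: a + b                      # toy regression model

x1_star = 2.0
# unconditional (interventional) value: break the dependence, keep x2's marginal
v_uncond = f(x1_star, x2).mean()
# conditional value: average x2 from its conditional law given x1 = x1_star
v_cond = f(x1_star, 0.8 * x1_star + 0.6 * rng.normal(size=n)).mean()

print(round(v_uncond, 2), round(v_cond, 2))  # ≈ 2.0 vs ≈ 3.6
```

The gap (here roughly 1.6) is exactly the information carried by the dependence structure that interventional SHAP ignores and the conditional version retains.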
Isotonic Recalibration under a Low Signal-to-Noise Ratio
Wüthrich, Mario V., Ziegel, Johanna
There are two seemingly unrelated problems in insurance pricing that we tackle in this paper. First, an insurance pricing system should not have any systematic cross-financing between different price cohorts. Systematic cross-financing means that some parts of the portfolio are under-priced, and this is compensated by other parts of the portfolio that are over-priced. We can prevent systematic cross-financing between price cohorts by ensuring that the pricing system is auto-calibrated. We propose to apply isotonic recalibration, which turns any regression function into an auto-calibrated pricing system.
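Isotonic recalibration can be sketched with the classical pool-adjacent-violators (PAV) algorithm: regress the observed responses monotonically on the ranks of the candidate model's scores. This is a minimal self-contained sketch (toy Poisson portfolio, noisy candidate scores are assumptions), not the paper's implementation:

```python
import numpy as np

def isotonic_recalibrate(scores, y):
    """Pool-adjacent-violators: isotonic regression of responses y on the
    ranks of the model scores. The result is a monotone step function of
    the scores, i.e. a recalibrated price per score-ordered cohort."""
    order = np.argsort(scores)
    y_sorted = y[order].astype(float)
    vals, wts = [], []                       # pooled block means and sizes
    for v in y_sorted:
        vals.append(v); wts.append(1.0)
        while len(vals) > 1 and vals[-2] > vals[-1]:   # violation: pool blocks
            w = wts[-2] + wts[-1]
            pooled = (vals[-2] * wts[-2] + vals[-1] * wts[-1]) / w
            vals[-2:] = [pooled]; wts[-2:] = [w]
    fitted = np.repeat(vals, np.array(wts, dtype=int))
    out = np.empty_like(fitted)
    out[order] = fitted                      # undo the sorting
    return out

rng = np.random.default_rng(2)
mu = rng.uniform(0.5, 2.0, size=1000)        # true expected claim frequencies
y = rng.poisson(mu).astype(float)            # observed claim counts
scores = mu + rng.normal(scale=0.3, size=1000)   # a noisy candidate model
pi = isotonic_recalibrate(scores, y)
print(np.isclose(pi.mean(), y.mean()))       # True: global balance holds
```

Because PAV pooling preserves weighted block sums, the recalibrated prices balance the observations within each pooled cohort, which is the auto-calibration property that rules out systematic cross-financing.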
A multi-task network approach for calculating discrimination-free insurance prices
Lindholm, Mathias, Richman, Ronald, Tsanakas, Andreas, Wüthrich, Mario V.
In applications of predictive modeling, such as insurance pricing, indirect or proxy discrimination is an issue of major concern. Namely, there exists the possibility that protected policyholder characteristics are implicitly inferred from non-protected ones by predictive models, and thus have an undesirable (or illegal) impact on prices. A technical solution to this problem relies on building a best-estimate model using all policyholder characteristics (including protected ones) and then averaging out the protected characteristics for calculating individual prices. However, such approaches require full knowledge of policyholders' protected characteristics, which may in itself be problematic. Here, we address this issue by using a multi-task neural network architecture for claim predictions, which can be trained using only partial information on protected characteristics, and which produces prices that are free from proxy discrimination. We demonstrate the use of the proposed model and find that its predictive accuracy is comparable to a conventional feedforward neural network (on full information). However, this multi-task network has clearly superior performance in the case of partially missing policyholder information. Keywords: Indirect discrimination, proxy discrimination, discrimination-free insurance pricing, unawareness price, best-estimate price, protected information, discriminatory covariates, fairness, incomplete information, multi-task learning, multioutput network.
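The averaging-out step can be illustrated with a toy best-estimate model; the model `mu` and both probability vectors are made-up numbers, and this sketch shows only the pricing formula, not the multi-task network. The discrimination-free price averages the protected attribute with its portfolio-wide marginal distribution, whereas the unawareness price implicitly uses the conditional distribution of the protected attribute given the non-protected covariates, which is exactly how proxy discrimination enters:

```python
import numpy as np

# Toy best-estimate model mu(x, d): x non-protected, d in {0, 1} protected.
mu = lambda x, d: np.exp(0.1 * x + 0.5 * d)

p_marg = np.array([0.6, 0.4])     # marginal P(D = d) over the portfolio
p_cond = np.array([0.2, 0.8])     # P(D = d | X = x): here x proxies d

x = 1.0
# unawareness price: d is integrated out conditionally on x (proxy effect)
price_unaware = sum(p * mu(x, d) for d, p in enumerate(p_cond))
# discrimination-free price: d is integrated out with its marginal law
price_dfree = sum(p * mu(x, d) for d, p in enumerate(p_marg))

print(price_dfree < price_unaware)   # True: the proxy loading is removed
```

Replacing the conditional weights by marginal ones is what removes the implicit inference of the protected characteristic from the non-protected ones.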
LocalGLMnet: interpretable deep learning for tabular data
Richman, Ronald, Wüthrich, Mario V.
Deep learning models celebrate great success in statistical modeling because they often provide superior predictive power over classical regression models. This success is based on the fact that deep learning models perform representation learning of features, which means that they bring features into the right structure to extract maximal information for the prediction task at hand. This feature engineering is done internally in a non-transparent way by the deep learning model. For this reason, deep learning solutions are often criticized as non-explainable and non-interpretable, in particular because this process of representation learning is performed in high-dimensional spaces, analyzing bits and pieces of the feature information. Recent research has focused on interpreting machine learning predictions in retrospect, see, e.g., Friedman's partial dependence plot (PDP) [10], the accumulated local effects (ALE) method of Apley-Zhu [4], the locally interpretable model-agnostic explanation (LIME) introduced by Ribeiro et al. [23], the SHapley Additive exPlanations (SHAP) of Lundberg-Lee [18] or the marginal attribution by conditioning on quantiles (MACQ) method proposed by Merz et al. [20].
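Of the retrospective tools listed above, Friedman's PDP is the simplest to sketch: fix one feature at each grid value and average the fitted model over the empirical distribution of the remaining features. The model and data below are toy illustrations, not from the paper:

```python
import numpy as np

def partial_dependence(model, X, j, grid):
    """Friedman's PDP: average the model output over the empirical
    distribution of all other features, with feature j pinned at each
    grid value in turn."""
    pdp = []
    for v in grid:
        Xv = X.copy()
        Xv[:, j] = v                 # intervene on feature j only
        pdp.append(model(Xv).mean()) # average over the other features
    return np.array(pdp)

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 3))
model = lambda Z: Z[:, 0] ** 2 + Z[:, 1]    # toy fitted regression surface
grid = np.linspace(-2, 2, 5)
pdp = partial_dependence(model, X, 0, grid)
print(pdp.round(1))   # ≈ grid**2 shifted by the mean of the second feature
```

Note that this is the unconditional (interventional) average; when features are dependent, the pinned rows can be unrealistic, which is the motivation for dependence-aware variants such as ALE or conditional SHAP.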