concordance probability
Incorporating Expert Opinion on Observable Quantities into Statistical Models -- A General Framework
This article describes an approach for incorporating expert opinion on observable quantities through a loss function that updates a prior belief, as opposed to specifying parameters of the priors. Eliciting information on observable quantities allows experts to provide meaningful information on a quantity familiar to them. By contrast, elicitation on model parameters may be complicated by interactions with other parameters or by the non-linear transformations needed to obtain an observable quantity. The approach described in this paper is distinctive in that we do not specify a prior to match an expert's opinion on an observable quantity; rather, we obtain a posterior by updating the model parameters through a loss function. This loss function contains the observable quantity, expressed as a function of the parameters, and relates it to the expert's opinion, which is typically operationalized as a statistical distribution. Parameters that generate observable quantities further from the expert's opinion incur a higher loss, so the model parameters are estimated based on their fidelity to both the data and the expert opinion, with the relative strength determined by the number of observations and the precision of the elicited belief. Including expert opinion in this fashion allows for a flexible specification of the opinion and in many situations is straightforward to implement with commonly used probabilistic programming software. We highlight this using three worked examples of varying model complexity: survival models, a multivariate normal distribution, and a regression problem.
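As a rough illustration of the idea in this abstract, the sketch below adds a loss term on an observable quantity to an ordinary log-posterior. Everything specific here is an assumption made for illustration and is not taken from the paper: an exponential survival model without censoring, an Exponential(1) prior on the rate, an expert opinion on the survival probability at a hypothetical time `t_star` expressed as a normal distribution, a loss equal to the negative log of that normal density (up to a constant), and a plain grid approximation in place of probabilistic programming software.

```python
import math

def log_posterior(rate, times, expert_mean, expert_sd, t_star):
    """Unnormalised log-posterior: log prior + log likelihood - expert loss."""
    if rate <= 0:
        return float("-inf")
    # Assumed Exponential(1) prior on the rate (up to a constant).
    log_prior = -rate
    # Exponential survival likelihood, assuming every event time is observed.
    log_lik = sum(math.log(rate) - rate * t for t in times)
    # Observable quantity as a function of the parameter: survival at t_star.
    survival = math.exp(-rate * t_star)
    # Assumed loss: negative log normal density of the expert's opinion,
    # so rates implying survival far from expert_mean are penalised more.
    loss = 0.5 * ((survival - expert_mean) / expert_sd) ** 2
    return log_prior + log_lik - loss

def posterior_mean_rate(times, expert_mean, expert_sd, t_star, grid):
    """Grid approximation of the posterior mean of the rate."""
    log_post = [log_posterior(r, times, expert_mean, expert_sd, t_star) for r in grid]
    peak = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - peak) for lp in log_post]
    return sum(r * w for r, w in zip(grid, weights)) / sum(weights)

# Made-up event times and a made-up expert belief: survival at t = 2 is about 0.6.
times = [1.0, 2.0, 0.5, 1.5]
grid = [k / 1000 for k in range(1, 5001)]
with_expert = posterior_mean_rate(times, 0.6, 0.05, 2.0, grid)
vague_expert = posterior_mean_rate(times, 0.6, 1e6, 2.0, grid)
```

With a precise expert belief (`expert_sd = 0.05`) the estimated rate is pulled toward values consistent with the stated survival probability, while a very imprecise belief (`expert_sd = 1e6`) leaves the data and prior to dominate, which mirrors the abstract's point that relative strength depends on the precision of the elicited belief.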
Computational Efficient Approximations of the Concordance Probability in a Big Data Setting
Van Oirbeek, Robin, Ponnet, Jolien, Verdonck, Tim
Performance measurement is an essential task once a statistical model is created. The Area Under the Receiver Operating Characteristic Curve (AUC) is the most popular measure for evaluating the quality of a binary classifier. In this case, the AUC is equal to the concordance probability, a frequently used measure of the discriminatory power of a model. Unlike the AUC, the concordance probability can also be extended to the situation with a continuous response variable. Given the size of present-day data sets, determining this discriminatory measure requires a very large number of costly computations and is hence extremely time consuming, particularly in the case of a continuous response variable. Therefore, we propose two estimation methods that calculate the concordance probability in a fast and accurate way and that can be applied to both the discrete and the continuous setting. Extensive simulation studies show the excellent performance and fast computing times of both estimators. Finally, experiments on two real-life data sets confirm the conclusions of the simulations.
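To make the quantity in this abstract concrete, here is a minimal brute-force sketch of the concordance probability. It is not one of the two estimators the authors propose; it is the naive pairwise computation whose cost motivates them. The function name is hypothetical, and counting tied predictions as half a concordant pair is an assumption chosen to match the usual AUC convention.

```python
from itertools import combinations

def concordance_probability(y, pred):
    """Share of comparable pairs whose predictions are ordered like their responses."""
    concordant = 0.0
    comparable = 0
    for (y_i, p_i), (y_j, p_j) in combinations(zip(y, pred), 2):
        if y_i == y_j:
            continue  # pairs with equal responses are not comparable
        comparable += 1
        # Orient the pair so that p_hi belongs to the larger observed response.
        p_hi, p_lo = (p_i, p_j) if y_i > y_j else (p_j, p_i)
        if p_hi > p_lo:
            concordant += 1.0
        elif p_hi == p_lo:
            concordant += 0.5  # assumption: tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Binary response: this coincides with the AUC.
auc_like = concordance_probability([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
# Continuous response: the same definition applies to every pair with y_i != y_j.
continuous = concordance_probability([1.0, 2.0, 3.0], [0.2, 0.1, 0.9])
```

The loop visits all n(n-1)/2 pairs, so the cost grows quadratically with the number of observations; with a continuous response nearly every pair is comparable, which is consistent with the abstract's remark that this setting is the most expensive.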
Concordance probability in a big data setting: application in non-life insurance
Van Oirbeek, Robin, Grumiau, Christopher, Verdonck, Tim
The concordance probability or C-index is a popular measure to capture the discriminatory ability of a regression model. In this article, the definition of this measure is adapted to the specific needs of the frequency and severity model, typically used during the technical pricing of a non-life insurance product. Due to the typical large sample size of the frequency data in particular, two different adaptations of the estimation procedure of the concordance probability are presented. Note that the latter procedures can be applied to all different versions of the concordance probability. When determining the premium of a non-life insurance product, such as a car insurance product, its technical tariff is mostly used as a starting point.