EU-Agent-Bench: Measuring Illegal Behavior of LLM Agents Under EU Law

Lichkovski, Ilija, Müller, Alexander, Ibrahim, Mariam, Mhundwa, Tiwai

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly deployed as agents in various contexts, equipped with tools at their disposal. However, LLM agents can exhibit unpredictable behaviors, including taking undesirable and/or unsafe actions. To measure the latent propensity of LLM agents for taking illegal actions under an EU legislative context, we introduce EU-Agent-Bench, a verifiable, human-curated benchmark that evaluates an agent's alignment with EU legal norms in situations where benign user inputs could lead to unlawful actions. Our benchmark spans scenarios across several categories, including data protection, bias/discrimination, and scientific integrity, with each user request allowing for both compliant and non-compliant execution of the requested actions. Comparing the model's function calls against a rubric exhaustively supported by citations of the relevant legislation, we evaluate the legal compliance of frontier LLMs, and we further investigate the effect on compliance of providing the relevant legislative excerpts in the agent's system prompt along with explicit instructions to comply. We release a public preview set for the research community, while holding out a private test set to prevent data contamination when evaluating upcoming models. We encourage future work extending agentic safety benchmarks to different legal jurisdictions and to multi-turn and multilingual interactions. We release our code at https://github.com/ilijalichkovski/eu-agent-bench.


A new wave of vehicle insurance fraud fueled by generative AI

Hever, Amir, Orr, Itai

arXiv.org Artificial Intelligence

Generative AI is supercharging insurance fraud by making it easier to falsify accident evidence at scale and at speed. Insurance fraud is a pervasive and costly problem, amounting to tens of billions of dollars in losses each year. In the vehicle insurance sector, fraud schemes have traditionally involved staged accidents, exaggerated damage, or forged documents. The rise of generative AI, including deepfake image and video generation, has introduced new methods for committing fraud at scale. Fraudsters can now fabricate highly realistic crash photos, damage evidence, and even fake identities or documents with minimal effort, exploiting AI tools to bolster false insurance claims. Insurers have begun deploying countermeasures such as AI-based deepfake detection software and enhanced verification processes to detect and mitigate these AI-driven scams. However, current mitigation strategies face significant limitations. Detection tools can suffer from false positives and negatives, and sophisticated fraudsters continuously adapt their tactics to evade automated checks. This cat-and-mouse arms race between generative AI and detection technology, combined with resource and cost barriers for insurers, means that combating AI-enabled insurance fraud remains an ongoing challenge. In this white paper, we present UVeye's layered solution for vehicle fraud, representing a major leap forward in the ability to detect, mitigate, and deter this new wave of fraud.


A Bayesian Approach for Prioritising Driving Behaviour Investigations in Telematic Auto Insurance Policies

McLeod, Mark, Perez-Orozco, Bernardo, Lee, Nika, Zilli, Davide

arXiv.org Machine Learning

Automotive insurers increasingly have access to telematic information via black-box recorders installed in the insured vehicle, and wish to identify undesirable behaviour which may signify increased risk or uninsured activities. However, identifying such behaviour with machine learning is non-trivial, and results are far from perfect, requiring human investigation to verify suspected cases. An appropriately formed priority score, generated by automated analysis of GPS data, allows underwriters to make more efficient use of their time, improving detection of the behaviour under investigation. An example of such behaviour is the use of a privately insured vehicle for commercial purposes, such as delivering meals and parcels. We first make use of trip GPS and accelerometer data, augmented by geospatial information, to train an imperfect classifier for delivery driving on a per-trip basis. We use a mixture of Beta-Binomial distributions to model the propensity of a policyholder to undertake trips which result in a positive classification as being drawn from either a rare high-scoring or a common low-scoring group, and learn the parameters of this model using MCMC. This model provides us with a posterior probability that any policyholder will be a regular generator of automated alerts, given any number of trips and alerts. This posterior probability is converted to a priority score, which is used to select the most valuable candidates for manual investigation. Testing over a 1-year period ranked policyholders by likelihood of commercial driving activity on a weekly basis. The top 0.9% have been reviewed at least once by the underwriters at the time of writing, and of those 99.4% have been confirmed as correctly identified, showing the approach has achieved a significant improvement in the efficiency of human resource allocation compared to manual searching.
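A minimal sketch of the two-component mixture idea described above: given a policyholder with k positive trip classifications out of n trips, Bayes' rule over two Beta-Binomial components yields the posterior probability of belonging to the rare high-scoring group. The mixture weight and Beta parameters below are illustrative assumptions; in the paper they are learned via MCMC.

```python
from scipy.stats import betabinom

# Hypothetical fitted parameters (the paper infers these with MCMC).
W_HIGH = 0.02              # prior weight of the rare high-scoring group
A_HIGH, B_HIGH = 6.0, 4.0  # Beta params: high per-trip alert propensity
A_LOW, B_LOW = 1.0, 30.0   # Beta params: low per-trip alert propensity

def posterior_high(k_alerts: int, n_trips: int) -> float:
    """P(policyholder is in the high-alert group | k alerts in n trips)."""
    lik_high = betabinom(n_trips, A_HIGH, B_HIGH).pmf(k_alerts)
    lik_low = betabinom(n_trips, A_LOW, B_LOW).pmf(k_alerts)
    num = W_HIGH * lik_high
    return num / (num + (1 - W_HIGH) * lik_low)
```

Ranking policyholders by this posterior (or a priority score derived from it) concentrates manual review on the accounts most likely to generate alerts regularly.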


Privacy-Enhancing Collaborative Information Sharing through Federated Learning -- A Case of the Insurance Industry

Dong, Panyi, Quan, Zhiyu, Edwards, Brandon, Wang, Shih-han, Feng, Runhuan, Wang, Tianyang, Foley, Patrick, Shah, Prashant

arXiv.org Artificial Intelligence

The report demonstrates the benefits (in terms of improved claims loss modeling) of harnessing the value of Federated Learning (FL) to learn a single model across multiple insurance industry datasets without requiring the datasets themselves to be shared from one company to another. The application of FL addresses two of the most pressing concerns: limited data volume and data variety, which are caused by privacy concerns, the rarity of claim events, the lack of informative rating factors, etc. During each round of FL, collaborators compute improvements on the model using their local private data, and these insights are combined to update a global model. Such aggregation of insights increases the effectiveness of forecasting claims losses compared to models trained individually at each collaborator. Critically, this approach enables machine learning collaboration without the need for raw data to leave the compute infrastructure of each respective data owner. Additionally, the open-source framework used in our experiments, OpenFL, is designed so that it can be run using confidential computing as well as with additional algorithmic protections against leakage of information via the shared model updates. In this way, FL is implemented as a privacy-enhancing collaborative learning technique that addresses the challenges posed by the sensitivity and privacy of data in traditional machine learning solutions. This paper's application of FL can also be expanded to other areas, including fraud detection, catastrophe modeling, etc., that have a similar need to incorporate data privacy into machine learning collaborations. Our framework and empirical results provide a foundation for future collaborations among insurers, regulators, academic researchers, and InsurTech experts.
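The round structure described above can be sketched as a minimal FedAvg-style loop: each collaborator improves the global model on its private data, and the results are averaged, weighted by data size. The linear model, learning rate, and weighting are illustrative assumptions, not details of the paper or of OpenFL.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One collaborator's gradient steps on its private data
    (toy linear model with mean-squared-error loss)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(global_w, datasets):
    """Average locally improved models, weighted by each insurer's
    data size; raw data never leaves the collaborator."""
    sizes = np.array([len(y) for _, y in datasets], dtype=float)
    local_models = [local_update(global_w, X, y) for X, y in datasets]
    return np.average(local_models, axis=0, weights=sizes)
```

Only model weights cross organizational boundaries here; frameworks such as OpenFL add orchestration and optional protections (e.g. confidential computing) on top of this basic pattern.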


An engine to simulate insurance fraud network data

Campo, Bavo D. C., Antonio, Katrien

arXiv.org Artificial Intelligence

Traditionally, the detection of fraudulent insurance claims relies on business rules and expert judgement, which makes it a time-consuming and expensive process (Óskarsdóttir et al., 2022). Consequently, researchers have been examining ways to develop efficient and accurate analytic strategies to flag suspicious claims. Feeding learning methods with features engineered from the social network of parties involved in a claim is a particularly promising strategy (see for example Van Vlasselaer et al. (2016); Tumminello et al. (2023)). When developing a fraud detection model, however, we are confronted with several challenges. The uncommon nature of fraud, for example, creates a high class imbalance which complicates the development of well-performing analytic classification models. In addition, only a small number of claims are investigated and get a label, which results in a large corpus of unlabeled data. Yet another challenge is the lack of publicly available data. This hinders not only the development of new methods, but also the validation of existing techniques. We therefore design a simulation machine that is engineered to create synthetic data with a network structure and available covariates similar to the real-life insurance fraud data set analyzed in Óskarsdóttir et al. (2022). Further, the user has control over several data-generating mechanisms. We can specify the total number of policyholders and parties, the desired level of imbalance and the (effect size of the) features in the fraud generating model. As such, the simulation engine enables researchers and practitioners to examine several methodological challenges as well as to test their (development strategy of) insurance fraud detection models in a range of different settings. Moreover, large synthetic data sets can be generated to evaluate the predictive performance of (advanced) machine learning techniques.
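As a rough illustration of such a controllable data-generating mechanism, the sketch below draws claim covariates and fraud labels from a logistic model whose intercept sets the class imbalance and whose coefficients set the feature effect sizes. The feature count and parameter values are hypothetical, and the actual engine additionally generates the social-network structure, which this toy version omits.

```python
import numpy as np

def simulate_fraud_data(n_claims=10_000, imbalance=0.01,
                        betas=(1.2, 0.8), seed=0):
    """Toy fraud-data simulator: Gaussian covariates plus a logistic
    fraud-generating model.  `imbalance` sets the baseline fraud rate
    via the intercept; `betas` are the feature effect sizes."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_claims, len(betas)))
    intercept = np.log(imbalance / (1 - imbalance))  # baseline log-odds
    p = 1 / (1 + np.exp(-(intercept + X @ np.array(betas))))
    y = rng.binomial(1, p)  # 1 = fraudulent claim, 0 = legitimate
    return X, y
```

Varying `imbalance` and `betas` lets one stress-test a detection model under different class-imbalance regimes and signal strengths, mirroring the kind of experimentation the simulation engine is built to support.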


Conformal prediction for frequency-severity modeling

Graziadei, Helton, Marques F., Paulo C., de Melo, Eduardo F. L., Targino, Rodrigo S.

arXiv.org Artificial Intelligence

The statistical modeling of insurance claims is a crucial task of the property and casualty insurance industry. An essential ingredient in this modeling process is the two-stage approach, encompassing a frequency model and a severity model. At the first stage, a frequency model predicts the number of claims, while, at the second stage, a severity model predicts the average financial impact or size of a claim, given that it has occurred. Together, these two models map relevant predictors such as the policyholder's age, geographical location, and claim history, to the response variables describing the frequency and severity of the claims. This classic approach, known as the frequency-severity model, has been instrumental in the process of risk categorization, premium calculation, and, in a broader context, risk quantification of business portfolios for specific industry segments [1, 2].
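A minimal sketch of the two-stage pipeline described above, on synthetic data with illustrative (assumed) predictors and coefficients: a Poisson GLM for claim frequency, a Gamma GLM for severity among policies with claims, and their product as the expected pure premium.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor, GammaRegressor

rng = np.random.default_rng(1)
# Illustrative predictors, e.g. age, location index, claim history.
X = rng.normal(size=(2000, 3))
# Synthetic claim counts and per-claim severities (assumed ground truth).
freq = rng.poisson(np.exp(0.2 * X[:, 0] - 1.0))
sev = np.exp(1.0 + 0.3 * X[:, 1] + rng.normal(scale=0.2, size=2000))

# Stage 1: frequency model on all policies.
freq_model = PoissonRegressor().fit(X, freq)
# Stage 2: severity model, fitted only where a claim occurred.
has_claim = freq > 0
sev_model = GammaRegressor().fit(X[has_claim], sev[has_claim])

# Expected pure premium per policy: E[N] * E[severity | claim occurred].
pure_premium = freq_model.predict(X) * sev_model.predict(X)
```

Both regressors use a log link, so predictions stay positive; the product gives the per-policy expected loss that feeds into premium calculation.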


Insights From Insurance for Fair Machine Learning: Responsibility, Performativity and Aggregates

Fröhlich, Christian, Williamson, Robert C.

arXiv.org Artificial Intelligence

We argue that insurance can act as an analogon for the social situatedness of machine learning systems, hence allowing machine learning scholars to take insights from the rich and interdisciplinary insurance literature. Tracing the interaction of uncertainty, fairness and responsibility in insurance provides a fresh perspective on fairness in machine learning. We link insurance fairness conceptions to their machine learning relatives, and use this bridge to problematize fairness as calibration. In this process, we bring to the forefront three themes that have been largely overlooked in the machine learning literature: responsibility, performativity and tensions between aggregate and individual.


The Role of AI in Insurance: From Underwriting to Claims Processing

#artificialintelligence

One of the most significant changes in recent years in the insurance sector has been the incorporation of artificial intelligence (AI) into various phases of the insurance process. From underwriting to claims processing, artificial intelligence has the potential to transform the business by increasing efficiency, lowering costs, and improving customer experience. In this article, we will look at the function of artificial intelligence in insurance and its possible impact on the sector. Underwriting is an important part of the insurance process that involves assessing potential policyholders' risks and establishing the appropriate premium. This has traditionally been a time-consuming and labor-intensive procedure, but artificial intelligence has the potential to make it faster, more efficient, and more accurate.


Natural Disasters, AI and Insurance Risk Assessment

#artificialintelligence

Hurricane Ian made its way across Florida in late September 2022, causing tens of billions of dollars in estimated insurance losses due to wind and flood damage. Now, half a year after the disaster, homeowners are still picking up the pieces and rebuilding with the payouts that have been slowly arriving from their insurance policies. However, many have had the unexpected shock of learning that flooding was not part of their homeowners insurance. Here we explain natural disasters, AI and insurance risk assessment. This event and many like it are stark reminders to both individuals and businesses that reviewing their policies with their insurance company needs to happen regularly, not because something may have gone unnoticed, but because things change.


Alliance Insurance Services Announces Partnership with ReFocus AI to Increase Retention - Digital Journal

#artificialintelligence

Industry experts estimate that acquiring a new policyholder is 5-25x more expensive than retaining an existing one, but until now, there was no way to predictively anticipate if an account would churn. ReFocus AI's Converge platform offers a unique, turnkey solution that allows brokers to leverage the predictive power of their policyholder data to pinpoint the ideal retention action to reduce churn by as much as 25%. "The ReFocus AI platform allows us to better serve our clients by understanding their concerns and how best we can serve them," says Christopher Cook, CEO of Alliance. "These insights will help us realize growth and prioritize our efforts around renewals. We believe that AI-powered churn analytics will result in higher customer retention and significantly better customer experiences."