AITopics | insurance

insurance

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Federated Learning for the Design of Parametric Insurance Indices under Heterogeneous Renewable Production Losses

Niakh, Fallou

arXiv.org Machine Learning, Jan-21-2026

We propose a federated learning framework for the calibration of parametric insurance indices under heterogeneous renewable energy production losses. Producers locally model their losses using Tweedie generalized linear models and private data, while a common index is learned through federated optimization without sharing raw observations. The approach accommodates heterogeneity in variance and link functions and directly minimizes a global deviance objective in a distributed setting. We implement and compare FedAvg, FedProx and FedOpt, and benchmark them against an existing approximation-based aggregation method. An empirical application to solar power production in Germany shows that federated learning recovers comparable index coefficients under moderate heterogeneity, while providing a more general and scalable framework.
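The federated calibration described in this abstract can be illustrated with a minimal sketch. It assumes a log link, a Tweedie power of 1.5, plain gradient descent for the local updates, and synthetic data; the function names (`simulate_clients`, `local_update`, `fedavg`) and all hyperparameters are hypothetical choices for illustration and are not taken from the paper. Only FedAvg is shown, not FedProx or FedOpt.

```python
import math
import random


def simulate_clients(seed=0, true_beta=(0.5, -0.3)):
    """Build synthetic private datasets for three producers (illustrative only)."""
    rng = random.Random(seed)
    # Each producer sees a different sample size and covariate range (heterogeneity).
    specs = [(200, -1.0, 0.5), (300, -0.5, 1.0), (400, -1.0, 1.0)]
    clients = []
    for n, lo, hi in specs:
        data = []
        for _ in range(n):
            x = (1.0, rng.uniform(lo, hi))  # intercept plus one index covariate
            mu = math.exp(sum(b * xi for b, xi in zip(true_beta, x)))
            y = mu * rng.uniform(0.5, 1.5)  # positive loss with mean mu
            data.append((x, y))
        clients.append(data)
    return clients


def local_update(beta, data, power=1.5, lr=0.05, steps=20):
    """Gradient steps on one producer's mean Tweedie deviance under a log link."""
    beta = list(beta)
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, y in data:
            mu = math.exp(sum(b * xi for b, xi in zip(beta, x)))
            # Derivative of the unit deviance w.r.t. the linear predictor:
            # 2 * mu^(1 - p) * (mu - y)
            g = 2.0 * mu ** (1.0 - power) * (mu - y)
            for j, xi in enumerate(x):
                grad[j] += g * xi / n
        beta = [b - lr * gj for b, gj in zip(beta, grad)]
    return beta


def fedavg(clients, dim, rounds=30):
    """Server loop: broadcast coefficients, collect local updates, average by sample size."""
    beta = [0.0] * dim
    total = sum(len(d) for d in clients)
    for _ in range(rounds):
        updates = [local_update(beta, d) for d in clients]
        beta = [
            sum(len(d) * u[j] for d, u in zip(clients, updates)) / total
            for j in range(dim)
        ]
    return beta


if __name__ == "__main__":
    print(fedavg(simulate_clients(seed=0), dim=2))
```

Only coefficient vectors cross the client boundary in this sketch; the raw `(x, y)` observations stay inside `local_update`, which is the property the abstract emphasizes.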

artificial intelligence, machine learning, producer, (17 more...)

2601.12178

Country:

North America > United States > Virginia (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > France (0.04)

Genre: Research Report (0.82)

Industry:

Energy > Renewable > Solar (1.00)
Energy > Power Industry (1.00)
Banking & Finance > Insurance (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

I Have a Job Offer I Can't Refuse. The Company It Comes From Has a Terrible Reputation for Women.

Slate, Nov-18-2025, 18:00:00 GMT

My company unexpectedly outsourced my entire department to a firm that uses AI for our jobs, even though I don't work a job that can really be done by machine learning. I have some savings but can't go without health insurance: my daughter and I both have the same complex chronic condition. I was briefly on public insurance in the past and it was a nightmare of waitlists leading to a cascade of hospital stays.

artificial intelligence, insurance, social media, (9 more...)

Country: North America > United States (0.05)

Industry:

Health & Medicine (0.50)
Marketing (0.38)

Technology:

Information Technology > Artificial Intelligence (0.56)
Information Technology > Communications > Social Media (0.34)

Design, Results and Industry Implications of the World's First Insurance Large Language Model Evaluation Benchmark

Zhou, Hua, Ma, Bing, Zhang, Yufei, Zhao, Yi

arXiv.org Artificial Intelligence, Nov-12-2025

This paper comprehensively elaborates on the construction methodology, multi-dimensional evaluation system, and underlying design philosophy of CUFEInse v1.0. Adhering to the principles of "quantitative-oriented, expert-driven, and multi-validation," the benchmark establishes an evaluation framework covering 5 core dimensions, 54 sub-indicators, and 14,430 high-quality questions, encompassing insurance theoretical knowledge, industry understanding, safety and compliance, intelligent agent application, and logical rigor. Based on this benchmark, a comprehensive evaluation was conducted on 11 mainstream large language models. The evaluation results reveal that general-purpose models suffer from common bottlenecks such as weak actuarial capabilities and inadequate compliance adaptation. High-quality domain-specific training demonstrates significant advantages in insurance vertical scenarios but exhibits shortcomings in business adaptation and compliance. The evaluation also accurately identifies the common bottlenecks of current large models in professional scenarios such as insurance actuarial, underwriting and claim settlement reasoning, and compliant marketing copywriting. The establishment of CUFEInse not only fills the gap in professional evaluation benchmarks for the insurance field, providing academia and industry with a professional, systematic, and authoritative evaluation tool, but also its construction concept and methodology offer important references for the evaluation paradigm of large models in vertical fields, serving as an authoritative reference for academic model optimization and industrial model selection. Finally, the paper looks forward to the future iteration direction of the evaluation benchmark and the core development direction of "domain adaptation + reasoning enhancement" for insurance large models.

large language model, machine learning, natural language, (18 more...)

2511.07794

Country: Asia > China (0.05)

Genre: Research Report (0.64)

Industry:

Law (1.00)
Health & Medicine (1.00)
Banking & Finance > Risk Management (1.00)
Banking & Finance > Insurance (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

ConVerse: Benchmarking Contextual Safety in Agent-to-Agent Conversations

Gomaa, Amr, Salem, Ahmed, Abdelnabi, Sahar

arXiv.org Artificial Intelligence, Nov-10-2025

As language models evolve into autonomous agents that act and communicate on behalf of users, ensuring safety in multi-agent ecosystems becomes a central challenge. Interactions between personal assistants and external service providers expose a core tension between utility and protection: effective collaboration requires information sharing, yet every exchange creates new attack surfaces. We introduce ConVerse, a dynamic benchmark for evaluating privacy and security risks in agent-agent interactions. ConVerse spans three practical domains (travel, real estate, insurance) with 12 user personas and over 864 contextually grounded attacks (611 privacy, 253 security). Unlike prior single-agent settings, it models autonomous, multi-turn agent-to-agent conversations where malicious requests are embedded within plausible discourse. Privacy is tested through a three-tier taxonomy assessing abstraction quality, while security attacks target tool use and preference manipulation. Evaluating seven state-of-the-art models reveals persistent vulnerabilities; privacy attacks succeed in up to 88% of cases and security breaches in up to 60%, with stronger models leaking more. By unifying privacy and security within interactive multi-agent contexts, ConVerse reframes safety as an emergent property of communication.

artificial intelligence, deep learning, machine learning, (16 more...)

2511.05359

Country:

North America > Montserrat (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Greece (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Unsupervised Evaluation of Multi-Turn Objective-Driven Interactions

Soroka, Emi, Chopra, Tanmay, Desai, Krish, Lall, Sanjay

arXiv.org Artificial Intelligence, Nov-6-2025

Large language models (LLMs) have seen increasing popularity in enterprise applications where AI agents and humans engage in objective-driven interactions. However, these systems are difficult to evaluate: data may be complex and unlabeled; human annotation is often impractical at scale; custom metrics can monitor for specific errors, but not previously-undetected ones; and LLM judges can produce unreliable results. We introduce the first set of unsupervised metrics for objective-driven interactions, leveraging statistical properties of unlabeled interaction data and using fine-tuned LLMs to adapt to distributional shifts. We develop metrics for labeling user goals, measuring goal completion, and quantifying LLM uncertainty without grounding evaluations in human-generated ideal responses. Our approach is validated on open-domain and task-specific interaction data.

completion, large language model, machine learning, (21 more...)

2511.03047

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(8 more...)

Genre: Research Report (0.51)

Industry:

Banking & Finance > Insurance (1.00)
Health & Medicine > Health Care Providers & Services (0.93)
Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

InsurAgent: A Large Language Model-Empowered Agent for Simulating Individual Behavior in Purchasing Flood Insurance

Geng, Ziheng, Liu, Jiachen, Cao, Ran, Cheng, Lu, Frangopol, Dan M., Cheng, Minghui

arXiv.org Artificial Intelligence, Nov-5-2025

Flood insurance is an effective strategy for individuals to mitigate disaster-related losses. However, participation rates among at-risk populations in the United States remain strikingly low. This gap underscores the need to understand and model the behavioral mechanisms underlying insurance decisions. Large language models (LLMs) have recently exhibited human-like intelligence across wide-ranging tasks, offering promising tools for simulating human decision-making. This study constructs a benchmark dataset to capture insurance purchase probabilities across factors. Using this dataset, the capacity of LLMs is evaluated: while LLMs exhibit a qualitative understanding of factors, they fall short in estimating quantitative probabilities. To address this limitation, InsurAgent, an LLM-empowered agent comprising five modules including perception, retrieval, reasoning, action, and memory, is proposed. The retrieval module leverages retrieval-augmented generation (RAG) to ground decisions in empirical survey data, achieving accurate estimation of marginal and bivariate probabilities. The reasoning module leverages LLM common sense to extrapolate beyond survey data, capturing contextual information that is intractable for traditional models. The memory module supports the simulation of temporal decision evolutions, illustrated through a roller coaster life trajectory. Overall, InsurAgent provides a valuable tool for behavioral modeling and policy analysis.

large language model, machine learning, natural language, (20 more...)

2511.02119

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Florida > Miami-Dade County > Coral Gables (0.04)
Asia > Japan (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Banking & Finance > Insurance (1.00)
Government > Regional Government > North America Government > United States Government (0.68)
Education > Educational Setting > K-12 Education (0.46)
Education > Educational Setting > Higher Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Not ready for the bench: LLM legal interpretation is unstable and out of step with human judgments

Purushothama, Abhishek, Min, Junghyun, Waldon, Brandon, Schneider, Nathan

arXiv.org Artificial Intelligence, Oct-30-2025

Legal interpretation frequently involves assessing how a legal text, as understood by an 'ordinary' speaker of the language, applies to the set of facts characterizing a legal dispute in the U.S. judicial system. Recent scholarship has proposed that legal practitioners add large language models (LLMs) to their interpretive toolkit. This work offers an empirical argument against LLM interpretation as recently practiced by legal scholars and federal judges. Our investigation in English shows that models do not provide stable interpretive judgments: varying the question format can lead the model to wildly different conclusions. Moreover, the models show weak to moderate correlation with human judgment, with large variance across model and question variant, suggesting that it is dangerous to give much credence to the conclusions produced by generative AI.

large language model, machine learning, natural language, (18 more...)

2510.25356

Country:

Europe > Austria > Vienna (0.14)
Asia > Singapore (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Law (1.00)
Banking & Finance > Insurance (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

GPO: Learning from Critical Steps to Improve LLM Reasoning

Yu, Jiahao, Cheng, Zelei, Wu, Xian, Xing, Xinyu

arXiv.org Artificial Intelligence, Oct-22-2025

Large language models (LLMs) are increasingly used in various domains, showing impressive potential on different tasks. Recently, reasoning LLMs have been proposed to improve the reasoning or thinking capabilities of LLMs to solve complex problems. Despite the promising results of reasoning LLMs, enhancing the multi-step reasoning capabilities of LLMs remains a significant challenge. While existing optimization methods have advanced LLM reasoning capabilities, they often treat reasoning trajectories as a whole, without considering the underlying critical steps within the trajectory. In this paper, we introduce Guided Pivotal Optimization (GPO), a novel fine-tuning strategy that dives into the reasoning process to enable more effective improvements. GPO first identifies the 'critical step' within a reasoning trajectory: a point at which the model must proceed carefully to succeed at the problem. We locate the critical step by estimating the advantage function. GPO then resets the policy to the critical step, samples new rollouts, and prioritizes the learning process on those rollouts. This focus allows the model to learn more effectively from pivotal moments within the reasoning process to improve the reasoning performance. We demonstrate that GPO is a general strategy that can be integrated with various optimization methods to improve reasoning performance. Besides theoretical analysis, our experiments across challenging reasoning benchmarks show that GPO can consistently and significantly enhance the performance of existing optimization methods, showcasing its effectiveness and generalizability in improving LLM reasoning by concentrating on pivotal moments within the generation process.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

2509.16456

Country:

North America > United States > Virginia (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Workflow (0.95)

Industry:

Education (0.92)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

WebGen-V Bench: Structured Representation for Enhancing Visual Design in LLM-based Web Generation and Evaluation

Wang, Kuang-Da, Wang, Zhao, Shimose, Yotaro, Wang, Wei-Yao, Takamatsu, Shingo

arXiv.org Artificial Intelligence, Oct-20-2025

Building on recent advances in leveraging LLMs for coding and multimodal understanding, we present WebGen-V, a new benchmark and framework for instruction-to-HTML generation that enhances both data quality and evaluation granularity. WebGen-V contributes three key innovations: (1) an unbounded and extensible agentic crawling framework that continuously collects real-world webpages and can be leveraged to augment existing benchmarks; (2) a structured, section-wise data representation that integrates metadata, localized UI screenshots, and JSON-formatted text and image assets, with explicit alignment between content, layout, and visual components for detailed multimodal supervision; and (3) a section-level multimodal evaluation protocol aligning text, layout, and visuals for high-granularity assessment. Experiments with state-of-the-art LLMs and ablation studies validate the effectiveness of our structured data and section-wise evaluation, as well as the contribution of each component. To the best of our knowledge, WebGen-V is the first work to enable high-granularity agentic crawling and evaluation for instruction-to-HTML generation, providing a unified pipeline from real-world data acquisition and webpage generation to structured multimodal assessment.

large language model, machine learning, natural language, (17 more...)

2510.15306

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States (0.14)
Europe > Austria > Vienna (0.14)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Transportation (1.00)
Health & Medicine > Consumer Health (1.00)
Consumer Products & Services > Travel (1.00)
(12 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

My Mom Cured Her Post-Divorce Loneliness by Becoming a Scammer. I Have to Get Her to Stop.

Slate, Oct-15-2025, 19:25:42 GMT

My mother and father divorced two years ago after a long marriage. She is 64, and the divorce hit her hard. She was very upset because, among other things, my father started dating soon after the divorce and has been steadily going out with a woman for the past six months. Meanwhile, my mother had a hard time dating. She complained about it bitterly, saying it was not fair my father got to restart his life so easily while no one would go out with her.

artificial intelligence, social media, (8 more...)

Industry:

Marketing (1.00)
Health & Medicine (0.97)
Information Technology > Security & Privacy (0.52)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.42)

Technology:

Information Technology > Communications > Social Media (0.75)
Information Technology > Artificial Intelligence (0.73)
