AITopics | risk score

Collaborating Authors

risk score

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

British Police Built a Sprawling Crime-Prediction Machine. Some Results Couldn't Be Trusted

WIREDJun-25-2026, 10:00:00 GMT

British Police Built a Sprawling Crime-Prediction Machine. Some Results Couldn't Be Trusted As UK police embrace the AI revolution, a WIRED investigation reveals the messy inside story of one region's experiment with predictive analytics. The Think Family Database holds records on close to half a million people who live in the city of Bristol, England. For many years, few of them knew anything about it. Launched in 2016 by the Bristol City Council and the regional Avon and Somerset Police, the database has stored all manner of sensitive information--police intelligence reports, housing status, mental health records, teenage pregnancies, enrollment in parenting courses, free school meals. On top of this sensitive data, officials built machine-learning models to assign scores to thousands of adults and children. They hoped to build what they called a "picture of threat, harm, and risk" in the region. At an event in early 2022 to help officials tackle child exploitation crimes, one police data scientist described part of the approach this way: "I essentially dump all that data in a big bucket and stir it with a data-science spatula, and we come out with a lovely risk score for everybody." This risk scoring inside the Think Family Database was just one part of Avon and Somerset Police's sprawling predictive analytics program.

artificial intelligence, data mining, machine learning, (14 more...)

WIRED

Country: Europe > United Kingdom > England > Bristol (0.24)

Genre: Research Report (0.68)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > Europe Government > United Kingdom Government (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

PromptShift-CRC: Drift-Aware Conformal Risk Control for Foundation Models Under Prompt and Domain Shift

Opoku, Jeffery, Banahene, David

arXiv.org Machine LearningJun-16-2026

Foundation models are now used in settings where the prompts they receive can change quickly. Users change, topics change, policies change, and the model may suddenly face a kind of request that was rare in the calibration data. This makes fixed calibration risky. Conformal prediction and conformal risk control give model-agnostic ways to control error, but they work best when the calibration data still look like the future data. This paper develops PromptShift CRC, a drift-aware conformal risk control method for foundation-model outputs under prompt and domain shift. The method embeds prompts and responses, measures how far the current prompt stream has moved from the calibration pool, gives more weight to relevant or recent calibration examples, and updates the risk level online after observed violations. It reports three practical diagnostics: realized risk error, prompt drift, and effective calibration size. We give conditions under which the method controls risk up to terms for distribution mismatch and weighted quantile uncertainty. In a synthetic prompt-shift benchmark, static conformal risk control fails sharply after drift, while PromptShift-CRC gives the best coverage among the adaptive baselines considered. We then evaluate the same calibration layer on public benchmark derived streams for question answering, toxicity, summarization factuality, and long-context hallucination risk

large language model, machine learning, question answering, (17 more...)

arXiv.org Machine Learning

2606.15964

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.35)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)

Add feedback

Accident Anticipation via Temporal Occurrence Prediction

Neural Information Processing SystemsJun-14-2026, 16:52:41 GMT

Accident anticipation aims to predict potential collisions in an online manner, enabling timely alerts to enhance road safety. Existing methods typically predict frame-level risk scores as indicators of hazard. However, these approaches rely on ambiguous binary supervision--labeling all frames in accident videos as positive--despite the fact that risk varies continuously over time, leading to unreliable learning and false alarms. To address this, we propose a novel paradigm that shifts the prediction target from current-frame risk scoring to directly estimating accident scores at multiple future time steps (e.g., 0.1s-2.0s

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Transportation > Ground > Road (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Multimodality Stacking with Blockwise missing values and application to the PIONeeR biomarkers study for prediction of resistance to immunotherapy

Boussena, Mohamed, Monville, Florence, Fieschi-Meric, Jacques, Vely, Frederic, Milpied, Pierre, Mazieres, Julien, Perol, Maurice, Vivier, Eric, Greillier, Laurent, Barlesi, Fabrice, Benzekry, Sebastien

arXiv.org Machine LearningMay-26-2026

Integrating multimodal datasets in clinical oncology is frequently hindered by high dimensionality and blockwise missingness, where entire data sources are unavailable for specific patient subsets. Standard survival models often struggle with these gaps, leading to biased results or patient exclusion. We introduce Multimodality Stacking with Blockwise missing values (MSB), a late-fusion framework for survival analysis that independently models modality-specific features before aggregating predictions via a cross-validated stacking meta-learner. MSB was validated on the PIONeeR study (n=443 patients, 378 biomarkers across eight heterogeneous sources) to predict progression-free survival in advanced non-small cell lung cancer patients receiving immunotherapy. MSB yielded higher predictive performance (C-index) than baseline algorithms. Improvements varied by baseline strength: linear models showed a 15.9% increase (p<0.001 for the Wilcoxon signed-rank test), random survival forests gained 5.4% (p=0.002), and gradient boosting methods improved by 2.1% (p=0.030). Beyond discrimination, MSB reduced the generalization gap (train-test difference in 5 folds cross-validation repeated 3 times: 0.055 vs 0.380 for linear models). Permutation importance analysis identified routine laboratory markers, clinical features, and PD-L1 expression as primary predictive drivers. Missing block indicators showed negligible importance, suggesting the model learned from biomarker values rather than data availability patterns. MSB provides a statistically validated framework for multimodal survival prediction with blockwise missingness. By enabling systematic biomarker evaluation without requiring complete data, MSB offers a practical tool for predictive modeling in biomedical research, pending external validation. Implementation is available at https://github.com/MohamedBoussena/MSB under Inria license.

artificial intelligence, machine learning, train 0, (17 more...)

arXiv.org Machine Learning

2605.2505

Country: Europe > France (0.69)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Add feedback

Learning Interpretable Point-Based Clinical Risk Scores via Direct Optimization

Cui, Ying, Li, Albert M, Charu, Vivek, Hwang, Yeon-Mi, Hernandez-Boussard, Tina, Tian, Lu

arXiv.org Machine LearningMay-20-2026

Many clinical risk scores are deployed as additive rules with nonnegative integer points assigned to relevant binary predictive features. These integer weights not only make the score easier to use in practice but also promote sparsity in the resulting prediction model. Such risk scores are often derived by first fitting a regression model and then rounding the estimated coefficients to the nearest integer after appropriate scaling. This approach is computationally fast but does not guarantee optimality of the resulting score. Alternatively, one may search over all possible integer weights to directly optimize a value function by posing the problem as an integer programming task. However, the associated computational burden can be substantial, especially when the value function is nonconcave or even discontinuous. In this paper, we develop new machine learning algorithms that employ a flexible greedy optimization strategy to learn such additive scoring directly under explicit and sensible optimality objectives. We apply the proposed method to a large electronic health record (EHR) cohort in Epic Cosmos to construct an integer-weighted comorbidity score for measuring the risk of post-discharge mortality. We also conduct a simulation study to examine the finite-sample operating characteristics.

artificial intelligence, machine learning, predictor, (16 more...)

arXiv.org Machine Learning

2605.19113

Country: North America (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

1f3cbee17170c3ffff3e413d2df54f6b-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 17:47:00 GMT

data mining, machine learning, prediction, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas (0.28)
North America > United States > Iowa > Johnson County > Iowa City (0.14)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Transportation > Ground > Road (0.93)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science > Data Mining (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Evaluating language models as risk scores

Neural Information Processing SystemsMar-22-2026, 02:37:17 GMT

Current question-answering benchmarks predominantly focus on accuracy in realizable prediction tasks.Conditioned on a question and answer-key, does the most likely token match the ground truth?Such benchmarks necessarily fail to evaluate LLMs' ability to quantify ground-truth outcome uncertainty.In this work, we focus on the use of LLMs as risk scores for unrealizable prediction tasks.We introduce folktexts, a software package to systematically generate risk scores using LLMs, and evaluate them against US Census data products.A flexible API enables the use of different prompting schemes, local or web-hosted models, and diverse census columns that can be used to compose custom prediction tasks.We evaluate 17 recent LLMs across five proposed benchmark tasks.We find that zero-shot risk scores produced by multiple-choice question-answering have high predictive signal but are widely miscalibrated.Base models consistently overestimate outcome uncertainty, while instruction-tuned models underestimate uncertainty and produce over-confident risk scores.In fact, instruction-tuning polarizes answer distribution regardless of true underlying data uncertainty.This reveals a general inability of instruction-tuned models to express data uncertainty using multiple-choice answers.A separate experiment using verbalized chat-style risk queries yields substantially improved calibration across instruction-tuned models.These differences in ability to quantify data uncertainty cannot be revealed in realizable settings, and highlight a blind-spot in the current evaluation ecosystem that folktexts covers.

artificial intelligence, large language model, natural language, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Auditing for Human Expertise

Neural Information Processing SystemsFeb-18-2026, 02:21:36 GMT

High-stakes prediction tasks (e.g., patient diagnosis) are often handled by trained

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Government Relations & Public Policy (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (0.68)
Health & Medicine > Diagnostic Medicine (0.67)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

Appendix A Additional results This appendix section shows additional results and corresponding plots to support the insights

Neural Information Processing SystemsFeb-17-2026, 12:24:05 GMT

Section A.2 shows results using a chat-style verbalized numeric Section A.3 shows results on four extra benchmark tasks made available with Finally, Section A.5 presents and discusses results on feature In this section, we evaluate risk score calibration on the income prediction task across different subpopulations, such as typically done as part of a fairness audit. Figures A1-A2 show group-conditional calibration curves for all models on the ACSIncome task, evaluated on three subgroups specified by the race attribute in the ACS data. We show the three race categories with largest representation. The'Mixtral 8x22B' and'Yi 34B' models shown are the worst offenders, where samples belonging to the'Black' population see consistently lower scores for the same positive label probability when compared to the'Asian' or'White' populations. On average, the'Mixtral 8x22B (it)' model classifies a Black individual with a In fact, this score bias can be reversed for some base models, overestimating scores from Black individuals compared with other subgroups.

large language model, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: