AITopics

Inference-time control of diffusion models aims to steer model outputs to satisfy new constraints without retraining. Previous approaches have mostly relied on heuristic guidance or have been coupled with Sequential Monte Carlo (SMC) for bias correction. In this paper, we propose a flexible alternative based on replica exchange, an algorithm designed initially for sampling problems. We refer to this method as the CREPE (Controlling with REPlica Exchange). Unlike SMC, CREPE: (1) generates particles sequentially, (2) maintains high diversity in the generated samples after a burn-in period, and (3) enables online refinement or early termination. We demonstrate its versatility across various tasks, including temperature annealing, reward-tilting, model composition and classifier-free guidance debiasing, with competitive performance compared to prior SMC methods.

artificial intelligence, crepe, machine learning, (15 more...)

2509.23265

Country: Europe (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)

Costa, Davi Bastos, Vicente, Renato

Deceive, Detect, and Disclose: Large Language Models Play Mini-Mafia

Mafia is a social deduction game where informed mafia compete against uninformed townsfolk. Its asymmetry of information and reliance on theory-of-mind reasoning mirror real-world multi-agent scenarios, making it a useful testbed for evaluating the social intelligence of large language models (LLMs). To support a systematic study, we introduce Mini-Mafia: a simplified four-player variant with one mafioso, one detective, and two villagers. We set the mafioso to kill a villager and the detective to investigate the mafioso during the night, reducing the game to a single day phase of discussion and voting. This setup isolates three interactive capabilities through role-specific win conditions: the mafioso must deceive, the villagers must detect deception, and the detective must effectively disclose information. To measure these skills, we have LLMs play against each other, creating the Mini-Mafia Benchmark: a two-stage framework that first estimates win rates within fixed opponent configurations, then aggregates performance across them using standardized scoring. Built entirely from model interactions without external data, the benchmark evolves as new models are introduced, with each one serving both as a new opponent and as a subject of evaluation. Our experiments reveal counterintuitive results, including cases where smaller models outperform larger ones. Beyond benchmarking, Mini-Mafia enables quantitative study of emergent multi-agent dynamics such as name bias and last-speaker advantage. It also contributes to AI safety by generating training data for deception detectors and by tracking models' deception capabilities against human baselines.

background, large language model, machine learning, (19 more...)

2509.23023

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Morton, David P., Dowson, Oscar, Pagnoncelli, Bernardo K.

MDP modeling for multi-stage stochastic programs

We study a class of multi-stage stochastic programs, which incorporate modeling features from Markov decision processes (MDPs). This class includes structured MDPs with continuous state and action spaces. We extend policy graphs to include decision-dependent uncertainty for one-step transition probabilities as well as a limited form of statistical learning. We focus on the expressiveness of our modeling approach, illustrating ideas with a series of examples of increasing complexity. As a solution method, we develop new variants of stochastic dual dynamic programming, including approximations to handle non-convexities.

artificial intelligence, machine learning, policy graph, (15 more...)

2509.22981

Country:

Europe (0.67)
North America > United States > New Jersey (0.28)

Genre: Research Report (0.40)

Industry:

Government > Regional Government (0.67)
Energy > Power Industry (0.46)
Energy > Renewable (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Patrone, Paul, Kearsley, Anthony

Probabilistic Consistency in Machine Learning and Its Connection to Uncertainty Quantification

Machine learning (ML) is often viewed as a powerful data analysis tool that is easy to learn because of its black-box nature. Yet this very nature also makes it difficult to quantify confidence in predictions extracted from ML models, and more fundamentally, to understand how such models are mathematical abstractions of training data. The goal of this paper is to unravel these issues and their connections to uncertainty quantification (UQ) by pursuing a line of reasoning motivated by diagnostics. In such settings, prevalence - i.e. the fraction of elements in class - is often of inherent interest. Here we analyze the many interpretations of prevalence to derive a level-set theory of classification, which shows that certain types of self-consistent ML models are equivalent to class-conditional probability distributions. We begin by studying the properties of binary Bayes optimal classifiers, recognizing that their boundary sets can be reinterpreted as level-sets of pairwise density ratios. By parameterizing Bayes classifiers in terms of the prevalence, we then show that they satisfy important monotonicity and class-switching properties that can be used to deduce the density ratios without direct access to the boundary sets. Moreover, this information is sufficient for tasks such as constructing the multiclass Bayes-optimal classifier and estimating inherent uncertainty in the class assignments. In the multiclass case, we use these results to deduce normalization and self-consistency conditions, the latter being equivalent to the law of total probability for classifiers. We also show that these are necessary conditions for arbitrary ML models to have valid probabilistic interpretations. Throughout we demonstrate how this analysis informs the broader task of UQ for ML via an uncertainty propagation framework.

artificial intelligence, classifier, machine learning, (17 more...)

2507.2167

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
(2 more...)

Zhao, Lingjun, Daumé, Hal III

A Necessary Step toward Faithfulness: Measuring and Improving Consistency in Free-Text Explanations

Faithful free-text explanations are important to ensure transparency in high-stakes AI decision-making contexts, but they are challenging to generate by language models and assess by humans. In this paper, we present a measure for Prediction-EXplanation (PEX) consistency, by extending the concept of weight of evidence. This measure quantifies how much a free-text explanation supports or opposes a prediction, serving as an important aspect of explanation faithfulness. Our analysis reveals that more than 62% explanations generated by large language models lack this consistency. We show that applying direct preference optimization improves the consistency of generated explanations across three model families, with improvement ranging from 43.1% to 292.3%. Furthermore, we demonstrate that optimizing this consistency measure can improve explanation faithfulness by up to 9.7%.

explanation, large language model, machine learning, (17 more...)

2505.19299

Country:

Asia (0.67)
North America > United States > Maryland (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Balafas, Vasileios, Tsouros, Dimos, Ploskas, Nikolaos, Stergiou, Kostas

Overcoming Over-Fitting in Constraint Acquisition via Query-Driven Interactive Refinement

Manual modeling in Constraint Programming is a substantial bottleneck, which Constraint Acquisition (CA) aims to automate. However, passive CA methods are prone to over-fitting, often learning models that include spurious global constraints when trained on limited data, while purely active methods can be query-intensive. We introduce a hybrid CA framework specifically designed to address the challenge of over-fitting in CA. Our approach integrates passive learning for initial candidate generation, a query-driven interactive refinement phase that utilizes probabilistic confidence scores (initialized by machine learning priors) to systematically identify over-fitted constraints, and a specialized subset exploration mechanism to recover valid substructures from rejected candidates. A final active learning phase ensures model completeness. Extensive experiments on diverse benchmarks demonstrate that our interactive refinement phase is crucial for achieving high target model coverage and overall model accuracy from limited examples, doing so with manageable query complexity. This framework represents a substantial advancement towards robust and practical constraint acquisition in data-limited scenarios.

constraint, machine learning, natural language, (16 more...)

2509.24489

Country: Europe (1.00)

Genre: Research Report > New Finding (0.45)

Industry: Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Neural Information Processing SystemsSep-29-2025, 16:26:33 GMT

3de568f8597b94bda53149c7d7f5958c-Paper.pdf

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country: North America > Canada (0.46)

Genre: Research Report > New Finding (0.68)

Industry: Energy > Oil & Gas (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Neural Information Processing SystemsSep-29-2025, 07:07:48 GMT

e58026e2b2929108e1bd24cbfa1c8e4b-Paper-Conference.pdf

artificial intelligence, bayesian inference, machine learning, (20 more...)

Neural Information Processing Systems

Country: Europe > Germany (0.68)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine (1.00)
Government (0.67)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
(2 more...)

Neural Information Processing SystemsSep-29-2025, 04:54:37 GMT

cb1ba6a42814bf83974ed45ffdb72efa-Paper-Conference.pdf

data quality, diffusion model, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.67)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Quality (0.68)
Information Technology > Data Science > Data Mining (0.67)
(2 more...)

Murphy-Barltrop, C. J. R., Wadsworth, J. L., de Carvalho, M., Youngman, B. D.

Modelling non-stationary extremal dependence through a geometric approach

arXiv.org Machine LearningSep-29-2025

Non-stationary extremal dependence, whereby the relationship between the extremes of multiple variables evolves over time, is commonly observed in many environmental and financial data sets. However, most multivariate extreme value models are only suited to stationary data. A recent approach to multivariate extreme value modelling uses a geometric framework, whereby extremal dependence features are inferred through the limiting shapes of scaled sample clouds. This framework can capture a wide range of dependence structures, and a variety of inference procedures have been proposed in the stationary setting. In this work, we first extend the geometric framework to the non-stationary setting and outline assumptions to ensure the necessary convergence conditions hold. We then introduce a flexible, semi-parametric modelling framework for obtaining estimates of limit sets in the non-stationary setting. Through rigorous simulation studies, we demonstrate that our proposed framework can capture a wide range of dependence forms and is robust to different model formulations. We illustrate the proposed methods on financial returns data and present several practical uses.

boundary, copula example, radius estimate, (16 more...)

arXiv.org Machine Learning

2509.22501

Country:

Europe > North Sea (0.14)
North America > United States > New York (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
(6 more...)

Genre: Research Report > Experimental Study (0.67)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)