AITopics

2605.1484

Country: Europe (0.28)

Genre: Research Report (0.50)

Industry: Law > Civil Rights & Constitutional Law (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Cho, Young Hyun, Sun, Will Wei

When Should an AI Workflow Release? Always-Valid Inference for Black-Box Generate-Verify Systems

arXiv.org Machine Learning May-14-2026

LLM-enabled AI workflows increasingly produce outputs through iterative generate-evaluate-revise loops. Each iteration can improve the candidate, but it also creates a release decision: when to stop and output the current result? This raises a statistical challenge because deployment-time evaluator scores are adaptively generated and repeatedly monitored, yet the likelihood models or exchangeability assumptions typically used for calibration are unavailable. We propose an always-valid release wrapper for existing generator-evaluator pipelines. The wrapper builds a hard-negative reference pool of high-scoring failures, calibrates deployment-time evaluator scores against this pool, and accumulates the resulting evidence with an e-process. This separates two roles: the reference pool turns black-box scores into conservative evidence, while the e-process provides validity under optional stopping. In theory, we show that a conservative reference pool yields finite-sample control of the probability of releasing on infeasible tasks, that is, tasks for which the given workflow is not capable of producing a reliable solution. We also characterize conditions under which the same conservative rule still achieves nontrivial release on feasible tasks. In an MBPP+ coding-agent case study, the wrapper reduces premature incorrect release relative to baseline stopping rules while still releasing on tasks for which the workflow repeatedly accumulates moderate supporting evidence.
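The release rule described above can be illustrated with a minimal sketch. It assumes a conformal-style p-value computed against the reference pool and a power calibrator e = kappa * p^(kappa - 1) to turn p-values into e-values. These choices, along with the names `conformal_p_value`, `p_to_e`, and `release_round`, are illustrative assumptions and not necessarily the construction used in the paper.

```python
from typing import Iterable, Optional, Sequence


def conformal_p_value(score: float, reference_pool: Sequence[float]) -> float:
    """Rank-based p-value of a deployment score against hard-negative scores.

    Small when the score exceeds most high-scoring failures in the pool.
    (Assumed calibration step; the paper's exact rule may differ.)
    """
    exceed = sum(1 for r in reference_pool if r >= score)
    return (1 + exceed) / (len(reference_pool) + 1)


def p_to_e(p: float, kappa: float = 0.5) -> float:
    """Power calibrator: maps a p-value to an e-value, for kappa in (0, 1)."""
    return kappa * p ** (kappa - 1.0)


def release_round(
    scores: Iterable[float],
    reference_pool: Sequence[float],
    alpha: float = 0.1,
    kappa: float = 0.5,
) -> Optional[int]:
    """Return the first round at which accumulated evidence reaches 1/alpha.

    Returns None if the threshold is never reached, i.e. no release.
    """
    wealth = 1.0  # running product of e-values
    for t, score in enumerate(scores, start=1):
        wealth *= p_to_e(conformal_p_value(score, reference_pool), kappa)
        if wealth >= 1.0 / alpha:
            return t
    return None
```

Stopping the first time the running product reaches 1/alpha is the usual Ville's-inequality threshold. It bounds the false-release probability by alpha only if each e-value has conditional expectation at most 1 on infeasible tasks, which is the role the abstract assigns to a conservative reference pool.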

large language model, machine learning, natural language, (19 more...)

2605.12947

Genre:

Workflow (1.00)
Research Report > New Finding (0.46)

Industry: Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Futami, Futoshi, Fujisawa, Masahiro

Information-Theoretic Generalization Bounds for Sequential Decision Making

arXiv.org Machine Learning May-13-2026

Information-theoretic generalization bounds based on the supersample construction are a central tool for algorithm-dependent generalization analysis in the batch i.i.d. setting. However, existing supersample conditional mutual information (CMI) bounds do not directly apply to sequential decision-making problems such as online learning, streaming active learning, and bandits, where data are revealed adaptively and the learner evolves along a causal trajectory. To address this limitation, we develop a sequential supersample framework that separates the learner filtration from a proof-side enlargement used for ghost-coordinate comparisons. Under a row-wise exchangeability assumption, the sequential generalization gap is controlled by sequential CMI, a sum of roundwise selector-loss information terms. We also establish a Bernstein-type refinement that yields faster rates under suitable variance conditions. The selector-SCMI proof strategy applies to online learning, streaming active learning with importance weighting, and stochastic multi-armed bandits.

data mining, machine learning, reinforcement learning, (19 more...)

2605.1219

Country: Asia > Japan (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Data Science > Data Mining > Big Data (0.48)

Kone, Cyrille, Jamieson, Kevin

Optimal Posterior Sampling for Policy Identification in Tabular Markov Decision Processes

arXiv.org Machine Learning May-6-2026

We study the $(\varepsilon, \delta)$-PAC policy identification problem in finite-horizon episodic Markov Decision Processes. Existing approaches provide finite-time guarantees for approximate settings ($\varepsilon>0$) but suffer from high computational cost, rendering them hard to implement, and also suffer from suboptimal dependence on $\log(1/\delta)$. We propose a randomized and computationally efficient algorithm for best policy identification that combines posterior sampling with an online learning algorithm to guide exploration in the MDP. Our method achieves asymptotic optimality in sample complexity, also in terms of posterior contraction rate, and runs in $O(S^2AH)$ per episode, matching standard model-based approaches. Unlike prior algorithms such as MOCA and PEDEL, our guarantees remain meaningful in the asymptotic regime and avoid sub-optimal polynomial dependence on $\log(1/\delta)$. Our results provide both theoretical insights and practical tools for efficient policy identification in tabular MDPs.

artificial intelligence, machine learning, sth, (15 more...)

2605.03921

Genre: Research Report > New Finding (0.48)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Neural Information Processing Systems Apr-25-2026, 13:32:49 GMT

Statistical Inference with M-Estimators on Adaptively Collected Data

Bandit algorithms are increasingly used in real-world sequential decision-making problems. Associated with this is an increased desire to be able to use the resulting datasets to answer scientific questions like: Did one type of ad lead to more purchases? In which contexts is a mobile health intervention effective? However, classical statistical approaches fail to provide valid confidence intervals when used with data collected with bandit algorithms. Alternative methods have recently been developed for simple models (e.g., comparison of means). Yet there is a lack of general methods for conducting statistical inference using more complex models on data collected with (contextual) bandit algorithms; for example, current methods cannot be used for valid inference on parameters in a logistic regression model for a binary reward. In this work, we develop theory justifying the use of M-estimators, which include estimators based on empirical risk minimization as well as maximum likelihood, on data collected with adaptive algorithms, including (contextual) bandit algorithms. Specifically, we show that M-estimators, modified with particular adaptive weights, can be used to construct asymptotically valid confidence regions for a variety of inferential targets.
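As a rough illustration of a weighted M-estimator, the sketch below estimates a single arm's mean reward by minimizing a weighted squared loss. The abstract does not say which weights are used, so the square-root importance weights here, the reference probability `stable_prob`, and the function name `weighted_mean_estimate` are all assumptions made for the example.

```python
from math import sqrt
from typing import Sequence


def weighted_mean_estimate(
    rewards: Sequence[float],
    action_probs: Sequence[float],
    stable_prob: float = 0.5,
) -> float:
    """Minimizer of sum_t w_t * (y_t - theta)^2 over theta.

    rewards:      observed rewards y_t for the rounds in which the arm was pulled
    action_probs: probability the bandit assigned to that arm in each such round
    stable_prob:  fixed reference probability (hypothetical parameter)
    The weights w_t = sqrt(stable_prob / action_probs[t]) are an assumed choice.
    """
    weights = [sqrt(stable_prob / p) for p in action_probs]
    # Closed-form solution of the weighted least-squares estimating equation.
    return sum(w * y for w, y in zip(weights, rewards)) / sum(weights)
```

When every action probability equals `stable_prob`, all weights are 1 and the estimate reduces to the ordinary sample mean.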

artificial intelligence, data mining, machine learning, (17 more...)

Country:

North America > United States (0.46)
Europe (0.46)

Genre:

Research Report > Experimental Study (0.48)
Research Report > New Finding (0.34)

Industry:

Government (0.68)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Neural Information Processing Systems Feb-18-2026, 07:43:10 GMT

MambaLRP: Explaining Selective State Space Sequence Models

To foster the reliable use of these models in real-world scenarios, it is crucial to augment their transparency.

large language model, machine learning, natural language, (21 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Ohio (0.05)
(8 more...)

Genre: Research Report (0.93)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Asier Mujika, Florian Meier, Angelika Steger

Approximating Real-Time Recurrent Learning with Random Kronecker Factors

Neural Information Processing Systems Feb-14-2026, 21:04:38 GMT

We also confirm these theoretical results experimentally. Further, we show empirically that the KF-RTRL algorithm captures long-term dependencies and almost matches the performance of TBPTT on real world tasks by training Recurrent Highway Networks on a synthetic string memorization task and on the Penn TreeBank task, respectively.

algorithm, artificial intelligence, machine learning, (17 more...)

Country:

Europe > Switzerland (0.05)
North America > Canada > Quebec > Montreal (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Ehsan Hajiramezanali, Arman Hasanzadeh, Krishna Narayanan, Nick Duffield, Mingyuan Zhou, Xiaoning Qian

Variational Graph Recurrent Neural Networks

Neural Information Processing Systems Feb-13-2026, 10:34:28 GMT

Our experiments with multiple real-world dynamic graph datasets demonstrate that SI-VGRNN and VGRNN consistently outperform the existing baseline and state-of-the-art methods by a significant margin in dynamic link prediction.

artificial intelligence, graph, machine learning, (15 more...)

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Neural Information Processing Systems Feb-12-2026, 00:37:35 GMT

APPENDIX A Preprocessing and tokenization details

When inserting a new trial, the oldest trial will be removed. We use a population size of 25 and tournament size of 5.
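The settings quoted above (oldest trial removed on insertion, population size 25, tournament size 5) are consistent with an aging-style evolutionary search. A minimal sketch under that reading follows. The function name `aging_evolution` and the callables `objective`, `mutate`, and `init` are hypothetical placeholders, not taken from the paper.

```python
import random
from collections import deque
from typing import Callable, Deque, Optional, Tuple, TypeVar

T = TypeVar("T")
Trial = Tuple[T, float]  # (candidate, objective value)


def aging_evolution(
    objective: Callable[[T], float],
    mutate: Callable[[T, random.Random], T],
    init: Callable[[random.Random], T],
    num_trials: int,
    population_size: int = 25,
    tournament_size: int = 5,
    seed: int = 0,
) -> Tuple[Optional[Trial], Deque[Trial]]:
    """Maximize `objective` with a fixed-size, first-in-first-out population."""
    rng = random.Random(seed)
    # A deque with maxlen drops the oldest trial when a new one is appended.
    population: Deque[Trial] = deque(maxlen=population_size)
    best: Optional[Trial] = None
    for _ in range(num_trials):
        if len(population) < population_size:
            # Fill the population with fresh candidates first.
            candidate = init(rng)
        else:
            # Tournament: sample a few trials and mutate the best of them.
            contestants = rng.sample(list(population), tournament_size)
            parent = max(contestants, key=lambda trial: trial[1])[0]
            candidate = mutate(parent, rng)
        trial = (candidate, objective(candidate))
        population.append(trial)
        if best is None or trial[1] > best[1]:
            best = trial
    return best, population
```

Removing the oldest trial rather than the worst one means even a strong candidate eventually leaves the population, so it influences later trials only through its mutated descendants.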

artificial intelligence, machine learning, optformer, (19 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems Feb-12-2026, 00:37:31 GMT

cf6501108fced72ee5c47e2151c4e153-Paper-Conference.pdf

Thus, most meta and transfer-learning HPO methods [7-16] consider a restrictive setting where all tasks must share the same set of hyperparameters so that the input data can be represented as fixed-sized vectors.

artificial intelligence, machine learning, optimization, (19 more...)

Country: Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)