AITopics | realizability

realizability

Q-MMR: Off-Policy Evaluation via Recursive Reweighting and Moment Matching

Li, Xiang, Jiang, Nan

arXiv.org Machine Learning, May-11-2026

We present a novel theoretical framework, Q-MMR, for off-policy evaluation in finite-horizon MDPs. Q-MMR learns a set of scalar weights, one for each data point, such that the reweighted rewards approximate the expected return under the target policy. The weights are learned inductively in a top-down manner via a moment matching objective against a value-function discriminator class. Notably, and perhaps surprisingly, a data-dependent finite-sample guarantee for general function approximation can be established under only the realizability of $Q^π$, with a dimension-free bound -- that is, the error does not depend on the statistical complexity of the function class. We also establish connections to several existing methods, such as importance sampling and linear FQE. Further theoretical analyses shed new light on the nature of coverage, a concept of fundamental importance to offline RL.
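The abstract names importance sampling as one of the existing methods that Q-MMR connects to. As a rough illustration of the reweighting idea only, and not of Q-MMR itself, the sketch below computes ordinary trajectory-wise importance weights and the resulting reweighted return estimate. The function names and the `(target_prob, behavior_prob, reward)` step layout are assumptions made for this example.

```python
from math import prod

def trajectory_weights(trajectories):
    """One scalar weight per trajectory: product of probability ratios, divided by n.

    Each trajectory is a list of (target_prob, behavior_prob, reward) steps.
    This is plain importance sampling, not the Q-MMR weight-learning procedure.
    """
    n = len(trajectories)
    return [prod(pt / pb for pt, pb, _ in traj) / n for traj in trajectories]

def reweighted_return(trajectories):
    """Estimate the target policy's expected return as a weighted sum of logged returns."""
    weights = trajectory_weights(trajectories)
    returns = [sum(r for _, _, r in traj) for traj in trajectories]
    return sum(w * g for w, g in zip(weights, returns))

# Hypothetical horizon-1 data: uniform behavior policy over two actions,
# target policy picks the reward-3 action with probability 0.75.
data = [
    [(0.25, 0.5, 1.0)],
    [(0.75, 0.5, 3.0)],
    [(0.25, 0.5, 1.0)],
    [(0.75, 0.5, 3.0)],
]
estimate = reweighted_return(data)
```

In this toy data the estimate matches the target policy's true expected reward, 0.25 * 1 + 0.75 * 3 = 2.5. Importance sampling fixes the weights from known policy probabilities, whereas the abstract describes learning the weights through a moment matching objective.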

ddh, machine learning, reinforcement learning, (19 more...)

2605.06474

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Adaptivity Under Realizability Constraints: Comparing In-Context and Agentic Learning

Kratsios, Anastasis, Neuman, A. Martina, Petersen, Philipp

arXiv.org Machine Learning, May-7-2026

We compare in-context learning with fixed queries and agentic learning with adaptive queries for uniform approximation of task families. We consider two settings: an unrestricted regime, where querying and approximation are arbitrary functions, and a realizable regime, where we require these operations to be implemented by ReLU neural networks. In both settings, adaptivity never hinders approximation performance. However, this advantage can change when one passes from the unrestricted regime to the realizable regime. We identify four distinct approximation scenarios, each witnessed by an explicit task family: (a) no advantage of adaptivity; (b) an advantage in the unrestricted regime that persists under ReLU realizability; (c) an advantage that arises only under realizability; and (d) an advantage that disappears under realizability. This demonstrates that representational constraints interact profoundly with the effect of adaptivity.

learner, machine learning, natural language, (19 more...)

2605.04995

Country:

Europe > Austria (0.28)
North America > Canada (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Bellman Residual Orthogonalization for Offline Reinforcement Learning

Neural Information Processing Systems, Apr-24-2026, 18:11:15 GMT

We propose and analyze a reinforcement learning principle that approximates the Bellman equations by enforcing their validity only along a user-defined space of test functions. Focusing on applications to model-free offline RL with function approximation, we exploit this principle to derive confidence intervals for off-policy evaluation, as well as to optimize over policies within a prescribed policy class. We prove an oracle inequality on our policy optimization procedure in terms of a trade-off between the value and uncertainty of an arbitrary comparator policy. Different choices of test function spaces allow us to tackle different problems within a common framework. We characterize the loss of efficiency in moving from on-policy to off-policy data using our procedures, and establish connections to concentrability coefficients studied in past work. We examine in depth the implementation of our methods with linear function approximation, and provide theoretical guarantees with polynomial-time implementations even when Bellman closure does not hold.
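To make the idea of enforcing the Bellman equations only along test functions concrete, here is a minimal sketch under strong simplifying assumptions: a linear value model, test functions equal to the features themselves, and a fixed discount factor. Under these assumptions the orthogonality conditions reduce to an LSTD-style linear system. The function names and data layout are hypothetical, and this is not the paper's procedure for confidence intervals or policy optimization.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def orthogonalized_bellman_fit(phi, rewards, phi_next, gamma):
    """Find theta so the empirical Bellman residual is orthogonal to every feature.

    Residual of sample i: phi_i . theta - r_i - gamma * phi_next_i . theta.
    Requiring sum_i phi_i[j] * residual_i = 0 for each j gives A theta = b.
    """
    d = len(phi[0])
    A = [[0.0] * d for _ in range(d)]
    b = [0.0] * d
    for f, r, fn in zip(phi, rewards, phi_next):
        for j in range(d):
            b[j] += f[j] * r
            for k in range(d):
                A[j][k] += f[j] * (f[k] - gamma * fn[k])
    return solve_linear(A, b)

# Hypothetical two-state chain with one-hot features: 0 -> 1 (reward 1), 1 -> 0 (reward 2).
theta = orthogonalized_bellman_fit(
    phi=[[1, 0], [0, 1]],
    rewards=[1.0, 2.0],
    phi_next=[[0, 1], [1, 0]],
    gamma=0.5,
)
```

With one-hot features the test functions span every function on the two states, so the fit recovers the exact values 8/3 and 10/3. A smaller test space would enforce the Bellman equations only approximately, which is the regime the abstract is concerned with.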

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

North America > United States (0.28)
Europe > United Kingdom > England (0.27)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

6058d0c628a03fd95dfe5c72cbdf9e64-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-12-2026, 14:16:50 GMT

artificial intelligence, data mining, machine learning, (17 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.93)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)

Asymptotically Exact Error Characterization of Offline Policy Evaluation with Misspecified Linear Models

Neural Information Processing Systems, Feb-11-2026, 20:20:54 GMT

Recently, theoretical understanding of OPE has been rapidly advanced under (approximate) realizability assumptions, i.e., where the environments of interest are well approximated with the given hypothetical models.

artificial intelligence, arxiv preprint arxiv, machine learning, (18 more...)

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

2a095b46705d7e6f81fc50270fe770c2-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-9-2026, 08:45:25 GMT

arxiv preprint arxiv, q-function, realizability, (11 more...)

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > Strength High (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage

Neural Information Processing Systems, Feb-9-2026, 08:45:21 GMT

We tackle this by introducing two novel value-based algorithms.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > Strength High (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

An Exponential Lower Bound for Linearly-Realizable MDPs with Constant Suboptimality Gap

Neural Information Processing Systems, Feb-8-2026, 15:06:49 GMT

A fundamental question in the theory of reinforcement learning is: suppose the optimal Q-function lies in the linear span of a given d-dimensional feature mapping, is sample-efficient reinforcement learning (RL) possible? The recent and remarkable result of Weisz et al. (2020) resolves this question in the negative, providing an exponential (in d) sample size lower bound, which holds even if the agent has access to a generative model of the environment. One may hope that such a lower bound can be circumvented with an even stronger assumption that there is a constant gap between the optimal Q-value of the best action and that of the second-best action (for all states); indeed, the construction in Weisz et al.
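The two assumptions discussed in this abstract can be written out as follows. This is a sketch in standard notation, with the symbols $\theta^\star$, $\phi$, and $\Delta$ introduced here for illustration and any stage index for finite-horizon problems omitted; the paper's own notation may differ.

```latex
% Linear realizability: the optimal Q-function lies in the span of a
% d-dimensional feature map \phi.
\exists\, \theta^\star \in \mathbb{R}^d : \quad
Q^\star(s,a) = \langle \theta^\star, \phi(s,a) \rangle
\quad \forall (s,a)

% Constant suboptimality gap \Delta > 0, with V^\star(s) = \max_{a} Q^\star(s,a):
V^\star(s) - Q^\star(s,a) \ge \Delta
\quad \forall s,\ \forall a \notin \arg\max_{a'} Q^\star(s,a')
```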

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country: