AITopics | rmax 1

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Theoretical Foundations and Effective Algorithms for Policy-Aware Simulator Learning

Dann, Christoph, Mansour, Yishay, Mohri, Mehryar

arXiv.org Machine Learning, May-29-2026

Model-based reinforcement learning (MBRL) agents typically learn world models by minimizing predictive loss. However, powerful RL optimizers inevitably exploit minor model inaccuracies, leading to simulator exploitation and a reality gap where policies succeed in simulation but fail in the real world. We propose that the objective for learning simulators should be strategic robustness rather than predictive accuracy, and formulate this as a zero-sum minimax game between a model player and an adversarial policy player. We provide a comprehensive theoretical analysis: (1) an online learning guarantee showing the game is learnable with sublinear regret bounds; (2) a tractable critic-based simplification bounding the global policy-value gap by the local critic's loss; and (3) an Error-MDP duality, proving that finding the worst-case policy is formally dual to a standard RL problem where the reward is the one-step critic error. This duality yields a provably convergent active data selection algorithm. Experiments on continuous control tasks demonstrate that our approach reduces prediction error in strategically important regions by $1.5$-$2.2\times$ and enables policies trained purely in simulation to match near-optimal real-world performance.

machine learning, reinforcement learning, simulator, (13 more...)

2605.29032

Genre: Research Report (0.81)

Industry:

Education (0.48)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Appendix

Neural Information Processing Systems, Feb-11-2026, 19:47:24 GMT

According to Alg. 2, in each exploration, at least one leaf node will be expanded. Moreover, the overall size of the belief tree is $O((|A| \min(P_{\delta}^{\max}, N_{\max}))^{D})$, where $N_{\max}$ is the maximum sample size given by KLD-Sampling, $P_{\delta}^{\max} = \sup_{b,a} P_{\delta}(Y_{b,a})$, and $Y_{b,a}$ is the set of reachable beliefs after executing action $a$ at belief $b$. The tree size is limited since $N_{\max}$ is finite. The weights are normalized, i.e., There exist bounded functions $\alpha$ and $\alpha'$ such that $V(b) = \int \alpha(s) b(s)\,ds$ and $V(b') = \int \alpha'(s) b'(s)\,ds$. We can bound the first and third terms, respectively, by $\lambda$ in light of the assumptions.
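To give a feel for how the tree-size bound in the excerpt above scales, here is a minimal sketch that evaluates only the expression inside the big-O, (|A| · min(Pδmax, Nmax))^D. The function name and parameter names are hypothetical, the constants hidden by the O-notation are ignored, and D is assumed to be the tree depth.

```python
def belief_tree_size_bound(num_actions, p_delta_max, n_max, depth):
    """Evaluate (|A| * min(P_delta_max, N_max)) ** D.

    Illustrative only: this is the expression inside the O(.) bound,
    not the exact number of nodes built by any particular planner.
    """
    # Each belief node branches over actions, and each action over at most
    # min(P_delta_max, N_max) child beliefs (hypothetical reading of the bound).
    branching = num_actions * min(p_delta_max, n_max)
    return branching ** depth


# Even if P_delta_max is unbounded, a finite N_max keeps the bound finite.
print(belief_tree_size_bound(4, float("inf"), 10, 1))
```

The second call shows why a finite sample cap matters: the `min` picks the finite `n_max` whenever the number of reachable beliefs is larger.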

artificial intelligence, rmax 1, rocksample, (17 more...)

Technology: Information Technology > Artificial Intelligence (0.70)

Visual Adversarial Imitation Learning using Variational Models

Neural Information Processing Systems, Feb-7-2026, 16:06:10 GMT

Behaviour cloning (BC) is a classic algorithm to imitate expert demonstrations [7], which uses supervised learning to greedily match the expert behaviour at demonstrated expert states. Due to environment stochasticity, covariate shift, and policy approximation error, the agent may drift away from the expert state distribution and ultimately fail to mimic the demonstrator [8].
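As a rough illustration of the idea in the excerpt above, the following sketch implements behaviour cloning in its simplest tabular form: for each demonstrated state, the policy copies the expert's most frequent action. This is an assumption-laden toy, not the method of the listed paper; the function names, the `fallback` argument, and the demonstration data are all hypothetical.

```python
from collections import Counter, defaultdict


def behaviour_cloning(demos):
    """Fit a tabular policy from (state, action) expert pairs.

    Supervised learning in its simplest form: a majority vote over the
    expert's actions at each demonstrated state.
    """
    counts = defaultdict(Counter)
    for state, action in demos:
        counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


def act(policy, state, fallback=None):
    """Return the cloned action, or `fallback` for states never demonstrated."""
    return policy.get(state, fallback)


# Hypothetical expert demonstrations over integer states.
demos = [(0, "right"), (1, "right"), (1, "right"), (1, "left"), (2, "up")]
policy = behaviour_cloning(demos)

print(act(policy, 1))  # majority action at a demonstrated state
print(act(policy, 3))  # state 3 was never demonstrated, so no label exists
```

The lookup for state 3 returns the fallback because the expert never visited it, which is the tabular analogue of drifting away from the expert state distribution.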

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
