Policy Improvement via Imitation of Multiple Oracles

Neural Information Processing Systems

Despite its promise, reinforcement learning's real-world adoption has been hampered by the need for costly exploration to learn a good policy. Imitation learning (IL) mitigates this shortcoming by using an oracle policy during training as a bootstrap to accelerate the learning process. However, in many practical situations, the learner has access to multiple suboptimal oracles, which may provide conflicting advice in a state. The existing IL literature provides a limited treatment of such scenarios. Whereas in the single-oracle case, the return of the oracle's policy provides an obvious benchmark for the learner to compete against, neither such a benchmark nor principled ways of outperforming it are known for the multi-oracle setting. In this paper, we propose the state-wise maximum of the oracle policies' values as a natural baseline to resolve conflicting advice from multiple oracles. Using a reduction of policy optimization to online learning, we introduce a novel IL algorithm MAMBA, which can provably learn a policy competitive with this benchmark.
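The benchmark described above, the state-wise maximum of the oracle policies' values, can be sketched in a few lines. This is an illustrative toy, assuming value estimates are already available as an array; the names and shapes are ours, not the paper's.

```python
import numpy as np

def max_baseline(oracle_values: np.ndarray) -> np.ndarray:
    """Compute the state-wise maximum over oracle value estimates.

    oracle_values: array of shape (num_oracles, num_states), where
    entry (k, s) is oracle k's estimated value at state s.
    Returns an array of shape (num_states,) giving, for each state,
    the best value achievable by following any single oracle there.
    """
    return oracle_values.max(axis=0)

# Two suboptimal oracles giving conflicting advice: neither dominates
# the other in every state, but the baseline takes the best of each.
values = np.array([
    [1.0, 3.0, 0.5],   # oracle 1's state values
    [2.0, 1.0, 4.0],   # oracle 2's state values
])
print(max_baseline(values))  # [2. 3. 4.]
```

A policy competitive with this baseline can outperform every individual oracle, since the baseline picks the strongest oracle separately in each state.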


Learning Dexterous Manipulation Skills from Imperfect Simulations

Hsieh, Elvis, Hsieh, Wen-Han, Wang, Yen-Jen, Lin, Toru, Malik, Jitendra, Sreenath, Koushil, Qi, Haozhi

arXiv.org Artificial Intelligence

Figure 1: We propose DexScrew, a sim-to-real framework for learning dexterous manipulation skills when the environment cannot be accurately simulated. In simulation, we use simplified objects to learn transferable rotational skills, which are then used to collect data and train tactile policies in the real world. We demonstrate the framework on contact-rich screwdriving (top row) and nut-bolt fastening (middle row). We also show generalization across different objects (bottom row). More videos and code are available on https://dexscrew.github.io.

Abstract: Reinforcement learning and sim-to-real transfer have made significant progress in dexterous manipulation. However, progress remains limited by the difficulty of simulating complex contact dynamics and multisensory signals, especially tactile feedback. In this work, we propose DexScrew, a sim-to-real framework that addresses these limitations and demonstrates its effectiveness on nut-bolt fastening and screwdriving with multi-fingered hands. The framework has three stages. First, we train reinforcement learning policies in simulation using simplified object models that lead to the emergence of correct finger gaits. We then use the learned policy as a skill primitive within a teleoperation system to collect real-world demonstrations that contain tactile and proprioceptive information. Finally, we train a behavior cloning policy that incorporates tactile sensing and show that it generalizes to nuts and screwdrivers with diverse geometries. Experiments across both tasks show high task progress ratios compared to direct sim-to-real transfer and robust performance even on unseen object shapes and under external perturbations.
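The third stage, behavior cloning from tactile and proprioceptive demonstrations, can be sketched minimally as a regression from fused sensor features to demonstrated actions. This is a toy stand-in, assuming synthetic data and a ridge-regularized linear policy in place of the paper's learned network; all dimensions and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for a real-world demonstration dataset:
tactile = rng.normal(size=(200, 16))   # 200 demo steps, 16 tactile readings
proprio = rng.normal(size=(200, 12))   # 12 joint-angle readings
actions = rng.normal(size=(200, 12))   # demonstrated joint targets

# Behavior cloning: fuse modalities, then fit features -> actions.
features = np.hstack([tactile, proprio])

# Ridge-regularized least squares: W = (X^T X + lam*I)^{-1} X^T A
lam = 1e-3
W = np.linalg.solve(
    features.T @ features + lam * np.eye(features.shape[1]),
    features.T @ actions,
)

def bc_policy(tac: np.ndarray, prop: np.ndarray) -> np.ndarray:
    """Map one step of tactile + proprioceptive input to an action."""
    return np.concatenate([tac, prop]) @ W

action = bc_policy(tactile[0], proprio[0])
print(action.shape)  # (12,)
```

The design point this illustrates is that tactile signals, which are hard to simulate, enter only at this real-world cloning stage, while the simulation stage supplies the demonstrations' underlying skill.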


Loss Functions for Multiset Prediction

Sean Welleck, Zixin Yao, Yu Gai, Jialin Mao, Zheng Zhang, Kyunghyun Cho

Neural Information Processing Systems

We study the problem of multiset prediction. The goal of multiset prediction is to train a predictor that maps an input to a multiset consisting of multiple items.
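A multiset differs from a set in that items may repeat, which is why ordinary set-prediction metrics do not apply directly. As a small illustration (not a loss from the paper), one can represent multisets with `collections.Counter` and score a prediction by F1 over the multiset intersection; the function name and metric choice here are ours.

```python
from collections import Counter

def multiset_f1(pred: list, target: list) -> float:
    """Score a predicted multiset against a target multiset.

    Uses Counter intersection, which respects multiplicities:
    a duplicated item only counts as matched as many times as it
    appears in both multisets.
    """
    if not pred or not target:
        return 0.0
    p, t = Counter(pred), Counter(target)
    overlap = sum((p & t).values())  # size of the multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(target)
    return 2 * precision * recall / (precision + recall)

# "dog" matches twice (it appears twice in both), "cat" and "bird" do not:
print(multiset_f1(["cat", "dog", "dog"], ["dog", "dog", "bird"]))  # ~0.667
```

Note that order plays no role in the score, which is what makes training a sequential predictor for this target structure nontrivial.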