AITopics | learned model

learned model

Online and Offline Reinforcement Learning by Planning with a Learned Model

Neural Information Processing Systems | Dec-25-2025, 03:31:08 GMT

Learning efficiently from small amounts of data has long been the focus of model-based reinforcement learning, both for the online case when interacting with the environment, and the offline case when learning from a fixed dataset. However, to date no single unified algorithm could demonstrate state-of-the-art results for both settings. In this work, we describe the Reanalyse algorithm, which uses model-based policy and value improvement operators to compute improved training targets for existing data points, allowing for efficient learning at data budgets varying by several orders of magnitude. We further show that Reanalyse can also be used to learn completely without environment interactions, as in the case of Offline Reinforcement Learning (Offline RL). Combining Reanalyse with the MuZero algorithm, we introduce MuZero Unplugged, a single unified algorithm for any data budget, including Offline RL. In contrast to previous work, our algorithm requires no special adaptations for the off-policy or Offline RL settings.
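The idea of recomputing training targets for stored data points can be illustrated with a minimal sketch. It assumes a one-step lookahead as the improvement operator (MuZero itself uses tree search), and the names `StoredPoint`, `reanalyse`, `model`, `value_fn`, and `toy_model` are hypothetical, chosen only for illustration; this is not the paper's implementation.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StoredPoint:
    """A data point already in the replay buffer, with possibly stale targets."""
    state: int
    policy_target: List[float]
    value_target: float


def reanalyse(
    buffer: List[StoredPoint],
    model: Callable[[int, int], Tuple[float, int]],  # (state, action) -> (reward, next_state)
    value_fn: Callable[[int], float],
    num_actions: int,
    gamma: float = 0.99,
    temperature: float = 1.0,
) -> List[StoredPoint]:
    """Recompute targets for stored states using the current model.

    No new environment steps are taken: only the stored states are revisited.
    Assumption: a one-step lookahead stands in for the improvement operator.
    """
    refreshed = []
    for point in buffer:
        # One-step action values under the current model and value estimate.
        q_values = []
        for action in range(num_actions):
            reward, next_state = model(point.state, action)
            q_values.append(reward + gamma * value_fn(next_state))
        best = max(q_values)
        # Softmax over action values gives the refreshed policy target.
        exps = [math.exp((q - best) / temperature) for q in q_values]
        total = sum(exps)
        refreshed.append(
            StoredPoint(point.state, [e / total for e in exps], best)
        )
    return refreshed


def toy_model(state: int, action: int) -> Tuple[float, int]:
    """Hypothetical chain environment: action 1 moves right, reaching state 3 pays 1."""
    next_state = min(state + action, 3)
    reward = 1.0 if (next_state == 3 and state != 3) else 0.0
    return reward, next_state


if __name__ == "__main__":
    stale = [StoredPoint(state=2, policy_target=[0.5, 0.5], value_target=0.0)]
    print(reanalyse(stale, toy_model, lambda s: 0.0, num_actions=2))
```

Because the targets depend only on stored states and the current model, the same routine applies whether the buffer is still growing (online) or fixed (offline).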

algorithm, name change, online and offline reinforcement learning, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)

Online and Offline Reinforcement Learning by Planning with a Learned Model

Neural Information Processing Systems | Jan-19-2025, 11:20:15 GMT

algorithm, offline reinforcement learning, online and offline reinforcement learning, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Interpreting the Learned Model in MuZero Planning

Guei, Hung; Ju, Yan-Ru; Chen, Wei-Yu; Wu, Ti-Rong

arXiv.org Artificial Intelligence | Nov-7-2024

MuZero has achieved superhuman performance in various games by using a dynamics network to predict environment dynamics for planning, without relying on simulators. However, the latent states learned by the dynamics network make its planning process opaque. This paper aims to demystify MuZero's model by interpreting the learned latent states. We incorporate observation reconstruction and state consistency into MuZero training and conduct an in-depth analysis to evaluate latent states across two board games: 9x9 Go and Outer-Open Gomoku, and three Atari games: Breakout, Ms. Pacman, and Pong. Our findings reveal that while the dynamics network becomes less accurate over longer simulations, MuZero still performs effectively by using planning to correct errors. Our experiments also show that the dynamics network learns better latent states in board games than in Atari games. These insights contribute to a better understanding of MuZero and offer directions for future research to improve the playing performance, robustness, and interpretability of the MuZero algorithm.
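The observation that a learned dynamics model becomes less accurate over longer simulations can be illustrated with a toy sketch. It assumes scalar states standing in for latent vectors and a hypothetical learned step with a small constant bias; `unroll_error`, `true_step`, and `learned_step` are illustrative names, not part of MuZero or the paper's analysis.

```python
from typing import Callable, List, Sequence

Step = Callable[[float, int], float]  # (state, action) -> next state


def unroll_error(
    true_step: Step,
    learned_step: Step,
    start: float,
    actions: Sequence[int],
) -> List[float]:
    """Unroll both models from the same start and record the gap after each step.

    The learned model is fed its own predictions, so per-step errors compound.
    """
    true_state = learned_state = start
    errors = []
    for action in actions:
        true_state = true_step(true_state, action)
        learned_state = learned_step(learned_state, action)
        errors.append(abs(true_state - learned_state))
    return errors


def true_step(state: float, action: int) -> float:
    """Hypothetical ground-truth dynamics."""
    return state + action


def learned_step(state: float, action: int) -> float:
    """Hypothetical learned dynamics with a small constant bias (an assumption)."""
    return state + action + 0.1


if __name__ == "__main__":
    # The gap widens as the unroll gets longer.
    print(unroll_error(true_step, learned_step, 0.0, [1, 1, 1, 1, 1]))
```

In this toy setting the gap grows with every unrolled step, which is the kind of drift that planning over many short simulations would need to tolerate.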

board game, learned model, muzero, (17 more...)

2411.0458

Country:

Asia > Taiwan > Taiwan Province > Taipei (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

MOReL: Model-Based Offline Reinforcement Learning

Kidambi, Rahul; Rajeswaran, Aravind; Netrapalli, Praneeth; Joachims, Thorsten

arXiv.org Artificial Intelligence | Jul-20-2020

In offline reinforcement learning (RL), the goal is to learn a highly rewarding policy based solely on a dataset of historical interactions with the environment. The ability to train RL policies offline can greatly expand the applicability of RL, its data efficiency, and its experimental velocity. Prior work in offline RL has been confined almost exclusively to model-free RL approaches. In this work, we present MOReL, an algorithmic framework for model-based offline RL. This framework consists of two steps: (a) learning a pessimistic MDP (P-MDP) using the offline dataset; and (b) learning a near-optimal policy in this P-MDP. The learned P-MDP has the property that for any policy, the performance in the real environment is approximately lower-bounded by the performance in the P-MDP. This enables it to serve as a good surrogate for purposes of policy evaluation and learning, and overcome common pitfalls of model-based RL like model exploitation. Theoretically, we show that MOReL is minimax optimal (up to log factors) for offline RL. Through experiments, we show that MOReL matches or exceeds state-of-the-art results in widely studied offline RL benchmarks. Moreover, the modular design of MOReL enables future advances in its components (e.g. generative modeling, uncertainty estimation, planning etc.) to directly translate into advances for offline RL.
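The two-step structure above (build a pessimistic MDP, then evaluate or optimize policies inside it) can be sketched as follows. The abstract does not say how pessimism is implemented, so this sketch assumes that unfamiliar state-action pairs are flagged by disagreement within an ensemble of learned models and that the rollout then halts with a fixed penalty. The names `disagreement`, `pessimistic_return`, `threshold`, and `penalty` are hypothetical.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[float, ...]
Model = Callable[[State, int], State]  # (state, action) -> predicted next state


def disagreement(models: Sequence[Model], state: State, action: int) -> float:
    """Largest coordinate-wise gap between any two ensemble predictions."""
    preds = [m(state, action) for m in models]
    return max(
        max(abs(x - y) for x, y in zip(p, q))
        for p in preds
        for q in preds
    )


def pessimistic_return(
    models: Sequence[Model],
    reward_fn: Callable[[State, int], float],
    policy: Callable[[State], int],
    start: State,
    horizon: int,
    threshold: float,
    penalty: float,
    gamma: float = 0.99,
) -> float:
    """Discounted return of `policy` in a pessimistic version of the learned model.

    Assumption: when the ensemble disagrees by more than `threshold`, the pair
    is treated as unknown, the rollout halts, and `-penalty` is received.
    """
    state, total, discount = start, 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        if disagreement(models, state, action) > threshold:
            total += discount * -penalty  # unknown region: halt with a penalty
            break
        total += discount * reward_fn(state, action)
        state = models[0](state, action)  # follow one ensemble member
        discount *= gamma
    return total


if __name__ == "__main__":
    # Two hypothetical ensemble members that agree only for small states.
    m1 = lambda s, a: (s[0] + a,)
    m2 = lambda s, a: (s[0] + a + (0.0 if s[0] < 2 else 1.0),)
    value = pessimistic_return(
        [m1, m2], lambda s, a: 1.0, lambda s: 1, (0.0,),
        horizon=5, threshold=0.5, penalty=10.0, gamma=1.0,
    )
    print(value)
```

A policy that wanders into regions the dataset does not cover is scored low, so optimizing this return discourages exploiting model errors.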

dataset, deep learning, neural network, (17 more...)

2005.05951

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

r/MachineLearning - [R] [1911.08265] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

#artificialintelligence | Nov-21-2019, 15:44:28 GMT

Much of it is the same as Value Prediction Networks, which proposes that instead of training a model to minimize L2 prediction-loss, you just train it to get the long-term reward/value right for a start state and a series of actions. That gets around a lot of the difficulty of using MBRL for Atari-like things, where it's very hard to accurately predict next pixels. They pretty much simulate a dense tree to some short depth, assign estimated values to the nodes, and use that for action selection. One drawback is that you're probably simulating a lot of states that your value-function would tell you are DEFINITELY not worthwhile. Atari has 16 actions -- it's infeasible to simulate more than 3 states deep. And since you're simulating in all directions, but only taking the best (e-greedy) action, you're not going to gather training data on most of the transitions you're estimating.
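The dense short-depth search described in this comment, and why its cost grows so quickly, can be sketched as follows. This is a minimal illustration under assumptions: `model` and `value_fn` are hypothetical stand-ins for a learned model and value estimate, and the function names are not taken from any of the cited systems.

```python
from typing import Callable, Tuple

Model = Callable[[int, int], Tuple[float, int]]  # (state, action) -> (reward, next_state)


def dense_tree_nodes(num_actions: int, depth: int) -> int:
    """Number of simulated states in a full tree: b + b^2 + ... + b^depth."""
    return sum(num_actions ** d for d in range(1, depth + 1))


def lookahead_value(
    state: int,
    depth: int,
    model: Model,
    value_fn: Callable[[int], float],
    num_actions: int,
    gamma: float = 1.0,
) -> float:
    """Exhaustively expand every action to `depth`, using value_fn at the leaves."""
    if depth == 0:
        return value_fn(state)
    best = float("-inf")
    for action in range(num_actions):
        reward, next_state = model(state, action)
        child = lookahead_value(
            next_state, depth - 1, model, value_fn, num_actions, gamma
        )
        best = max(best, reward + gamma * child)
    return best


def select_action(
    state: int,
    depth: int,
    model: Model,
    value_fn: Callable[[int], float],
    num_actions: int,
    gamma: float = 1.0,
) -> int:
    """Pick the root action whose dense subtree has the highest backed-up value."""
    def q(action: int) -> float:
        reward, next_state = model(state, action)
        return reward + gamma * lookahead_value(
            next_state, depth - 1, model, value_fn, num_actions, gamma
        )
    return max(range(num_actions), key=q)


if __name__ == "__main__":
    # With 16 actions, a depth-3 dense tree already needs thousands of simulations.
    print(dense_tree_nodes(16, 3))
```

Every branch is expanded regardless of how unpromising it looks, which is the inefficiency the comment points out.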

chess and shogi, machinelearning, mastering atari, (3 more...)

Industry:

Media > News (0.40)
Leisure & Entertainment > Games > Chess (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.76)
