AITopics | Remi Munos

Collaborating Authors

Remi Munos

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Safe and Efficient Off-Policy Reinforcement Learning

Remi Munos, Tom Stepleton, Anna Harutyunyan, Marc Bellemare

Neural Information Processing SystemsJun-2-2025, 06:29:12 GMT

In this work, we take a fresh look at some old and new algorithms for off-policy, return-based reinforcement learning. Expressing these in a common form, we derive a novel algorithm, Retrace(λ), with three desired properties: (1) it has low variance; (2) it safely uses samples collected from any behaviour policy, whatever its degree of "off-policyness"; and (3) it is efficient as it makes the best use of samples collected from near on-policy behaviour policies. We analyze the contractive nature of the related operator under both off-policy policy evaluation and control settings and derive online sample-based algorithms. We believe this is the first return-based off-policy control algorithm converging a.s. to Q

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > Spain (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Memory-Efficient Backpropagation Through Time

Audrunas Gruslys, Remi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves

Neural Information Processing SystemsJun-2-2025, 05:07:48 GMT

We propose a novel approach to reduce memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs). Our approach uses dynamic programming to balance a trade-off between caching of intermediate results and recomputation. The algorithm is capable of tightly fitting within almost any user-set memory budget while finding an optimal execution policy minimizing the computational cost. Computational devices have limited memory capacity and maximizing a computational performance given a fixed memory budget is a practical use-case. We provide asymptotic computational upper bounds for various regimes. The algorithm is particularly effective for long sequences. For sequences of length 1000, our algorithm saves 95% of memory usage while using only one third more time per iteration than the standard BPTT.

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: Europe > Spain (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Planning in entropy-regularized Markov decision processes and games

Jean-Bastien Grill, Omar Darwiche Domingues, Pierre Menard, Remi Munos, Michal Valko

Neural Information Processing SystemsMay-31-2025, 17:11:25 GMT

We propose SmoothCruiser, a new planning algorithm for estimating the value function in entropy-regularized Markov decision processes and two-player games, given a generative model of the environment.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.14)
Europe > France (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.62)

Add feedback

Successor Features for Transfer in Reinforcement Learning

Andre Barreto, Will Dabney, Remi Munos, Jonathan J. Hunt, Tom Schaul, Hado P. van Hasselt, David Silver

Neural Information Processing SystemsMay-28-2025, 00:16:01 GMT

Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks. We propose a transfer framework for the scenario where the reward function changes between tasks but the environment's dynamics remain the same. Our approach rests on two key ideas: successor features, a value function representation that decouples the dynamics of the environment from the rewards, and generalized policy improvement, a generalization of dynamic programming's policy improvement operation that considers a set of policies rather than a single one. Put together, the two ideas lead to an approach that integrates seamlessly within the reinforcement learning framework and allows the free exchange of information across tasks. The proposed method also provides performance guarantees for the transferred policy even before any learning has taken place. We derive two theorems that set our approach in firm theoretical ground and present experiments that show that it successfully promotes transfer in practice, significantly outperforming alternative methods in a sequence of navigation tasks and in the control of a simulated robotic arm.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Optimistic optimization of a Brownian

Jean-Bastien Grill, Michal Valko, Remi Munos

Neural Information Processing SystemsMay-26-2025, 09:13:29 GMT

We address the problem of optimizing a Brownian motion. We consider a (random) realization W of a Brownian motion with input space in [0, 1]. Given W, our goal is to return an ε-approximation of its maximum using the smallest possible number of function evaluations, the sample complexity of the algorithm.

artificial intelligence, brownian motion, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > France (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

Optimistic optimization of a Brownian

Jean-Bastien Grill, Michal Valko, Remi Munos

Neural Information Processing SystemsMar-26-2025, 20:36:30 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, brownian motion, machine learning, (16 more...)

Neural Information Processing Systems

Country: Europe > France (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

Add feedback

Multiagent Evaluation under Incomplete Information

Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Perolat, Michal Valko, Georgios Piliouras, Remi Munos

Neural Information Processing SystemsMar-23-2025, 13:25:49 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, strategy profile, survey article, (18 more...)

Neural Information Processing Systems

Country:

North America (0.46)
Asia (0.46)

Genre: Overview (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Planning in entropy-regularized Markov decision processes and games

Jean-Bastien Grill, Omar Darwiche Domingues, Pierre Menard, Remi Munos, Michal Valko

Neural Information Processing SystemsMar-23-2025, 12:58:03 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America (0.28)
Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.62)

Add feedback

Bounded Regret for Finite-Armed Structured Bandits

Tor Lattimore, Remi Munos

Neural Information Processing SystemsFeb-12-2025, 00:59:27 GMT

We study a new type of K-armed bandit problem where the expected return of one arm may depend on the returns of other arms. We present a new algorithm for this general class of problems and show that under certain circumstances it is possible to achieve finite expected cumulative regret. We also give problemdependent lower bounds on the cumulative regret showing that at least in special cases the new algorithm is nearly optimal.

data mining, finite regret, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America > Canada (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.51)

Add feedback

Multiagent Evaluation under Incomplete Information

Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Perolat, Michal Valko, Georgios Piliouras, Remi Munos

Neural Information Processing SystemsJan-23-2025, 16:31:26 GMT

This paper investigates the evaluation of learned multiagent strategies in the incomplete information setting, which plays a critical role in ranking and training of agents. Traditionally, researchers have relied on Elo ratings for this purpose, with recent works also using methods based on Nash equilibria. Unfortunately, Elo is unable to handle intransitive agent interactions, and other techniques are restricted to zero-sum, two-player settings or are limited by the fact that the Nash equilibrium is intractable to compute. Recently, a ranking method called α-Rank, relying on a new graph-based game-theoretic solution concept, was shown to tractably apply to general games. However, evaluations based on Elo or α-Rank typically assume noise-free game outcomes, despite the data often being collected from noisy simulations, making this assumption unrealistic in practice. This paper investigates multiagent evaluation in the incomplete information regime, involving general-sum many-player games with noisy outcomes. We derive sample complexity guarantees required to confidently rank agents in this setting. We propose adaptive algorithms for accurate ranking, provide correctness and sample complexity guarantees, then introduce a means of connecting uncertainties in noisy match outcomes to uncertainties in rankings. We evaluate the performance of these approaches in several domains, including Bernoulli games, a soccer meta-game, and Kuhn poker.

artificial intelligence, strategy profile, survey article, (18 more...)

Neural Information Processing Systems

Country: