AITopics | David Silver

Well File:

Well Planning ( results)
- Shallow Hazard Analysis ( results)
- Well Plat ( results)
Wellbore Schematic ( results)
Directional Survey ( results)
Fluid Sample ( results)
Log ( results)
- Density ( results)
- Gamma Ray ( results)
- Mud ( results)
- Resistivity ( results)
Report ( results)
- Daily Report ( results)
- End of Well Report ( results)
- Well Completion Report ( results)
Rock Sample ( results)

David Silver

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Natural Value Approximators: Learning when to Trust Past Estimates

Zhongwen Xu, Joseph Modayil, Hado P. van Hasselt, Andre Barreto, David Silver, Tom Schaul

Neural Information Processing SystemsMay-28-2025, 05:56:36 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Industry:

Leisure & Entertainment > Games (0.69)
Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Imagination-Augmented Agents for Deep Reinforcement Learning

Sébastien Racanière, Theophane Weber, David Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra

Neural Information Processing SystemsMay-28-2025, 03:19:48 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Successor Features for Transfer in Reinforcement Learning

Andre Barreto, Will Dabney, Remi Munos, Jonathan J. Hunt, Tom Schaul, Hado P. van Hasselt, David Silver

Neural Information Processing SystemsMay-28-2025, 00:16:01 GMT

Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks. We propose a transfer framework for the scenario where the reward function changes between tasks but the environment's dynamics remain the same. Our approach rests on two key ideas: successor features, a value function representation that decouples the dynamics of the environment from the rewards, and generalized policy improvement, a generalization of dynamic programming's policy improvement operation that considers a set of policies rather than a single one. Put together, the two ideas lead to an approach that integrates seamlessly within the reinforcement learning framework and allows the free exchange of information across tasks. The proposed method also provides performance guarantees for the transferred policy even before any learning has taken place. We derive two theorems that set our approach in firm theoretical ground and present experiments that show that it successfully promotes transfer in practice, significantly outperforming alternative methods in a sequence of navigation tasks and in the control of a simulated robotic arm.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Meta-Gradient Reinforcement Learning

Zhongwen Xu, Hado P. van Hasselt, David Silver

Neural Information Processing SystemsMay-26-2025, 05:11:34 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: North America > Canada > Alberta (0.14)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Meta-Gradient Reinforcement Learning

Zhongwen Xu, Hado P. van Hasselt, David Silver

Neural Information Processing SystemsMay-23-2025, 21:53:57 GMT

The goal of reinforcement learning algorithms is to estimate and/or optimise the value function. However, unlike supervised learning, no teacher or oracle is available to provide the true value function. Instead, the majority of reinforcement learning algorithms estimate and/or optimise a proxy for the value function. This proxy is typically based on a sampled and bootstrapped approximation to the true value function, known as a return. The particular choice of return is one of the chief components determining the nature of the algorithm: the rate at which future rewards are discounted; when and how values should be bootstrapped; or even the nature of the rewards themselves.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: North America > Canada > Alberta (0.14)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

The Option Keyboard: Combining Skills in Reinforcement Learning

Andre Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Jonathan hunt, Shibl Mourad, David Silver, Doina Precup

Neural Information Processing SystemsMar-22-2025, 23:06:20 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country: North America (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Discovery of Useful Questions as Auxiliary Tasks

Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Janarthanan Rajendran, Richard L. Lewis, Junhyuk Oh, Hado P. van Hasselt, David Silver, Satinder Singh

Neural Information Processing SystemsMar-22-2025, 16:53:59 GMT

Arguably, intelligent agents ought to be able to discover their own questions so that in learning answers for them they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions. We present a novel method for a reinforcement learning (RL) agent to discover questions formulated as general value functions or GVFs, a fairly rich form of knowledge representation. Specifically, our method uses non-myopic meta-gradients to learn GVF-questions such that learning answers to them, as an auxiliary task, induces useful representations for the main task faced by the RL agent. We demonstrate that auxiliary tasks based on the discovered GVFs are sufficient, on their own, to build representations that support main task learning, and that they do so better than popular hand-designed auxiliary tasks from the literature. Furthermore, we show, in the context of Atari 2600 videogames, how such auxiliary tasks, meta-learned alongside the main task, can improve the data efficiency of an actor-critic agent.

auxiliary task, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.68)
North America > United States (0.46)

Genre: Research Report (0.88)

Industry: Leisure & Entertainment > Games > Computer Games (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

The Option Keyboard: Combining Skills in Reinforcement Learning

Andre Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Jonathan hunt, Shibl Mourad, David Silver, Doina Precup

Neural Information Processing SystemsJan-22-2025, 09:18:21 GMT

The ability to combine known skills to create new ones may be crucial in the solution of complex reinforcement learning problems that unfold over extended periods. We argue that a robust way of combining skills is to define and manipulate them in the space of pseudo-rewards (or "cumulants"). Based on this premise, we propose a framework for combining skills using the formalism of options. We show that every deterministic option can be unambiguously represented as a cumulant defined in an extended domain. Building on this insight and on previous results on transfer learning, we show how to approximate options whose cumulants are linear combinations of the cumulants of known options. This means that, once we have learned options associated with a set of cumulants, we can instantaneously synthesise options induced by any linear combination of them, without any learning involved. We describe how this framework provides a hierarchical interface to the environment whose abstract actions correspond to combinations of basic skills. We demonstrate the practical benefits of our approach in a resource management problem and a navigation task involving a quadrupedal simulated robot.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Discovery of Useful Questions as Auxiliary Tasks

Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Janarthanan Rajendran, Richard L. Lewis, Junhyuk Oh, Hado P. van Hasselt, David Silver, Satinder Singh

Neural Information Processing SystemsJan-21-2025, 18:05:54 GMT

auxiliary task, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Michigan (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > Canada > Alberta (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.88)

Industry: Leisure & Entertainment > Games > Computer Games (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Learning values across many orders of magnitude

Hado P. van Hasselt, Arthur Guez, Arthur Guez, Matteo Hessel, Volodymyr Mnih, David Silver

Neural Information Processing SystemsJan-20-2025, 11:06:48 GMT

Most learning algorithms are not invariant to the scale of the signal that is being approximated. We propose to adaptively normalize the targets used in the learning updates. This is important in value-based reinforcement learning, where the magnitude of appropriate value approximations can change over time when we update the policy of behavior. Our main motivation is prior work on learning to play Atari games, where the rewards were clipped to a predetermined range. This clipping facilitates learning across many different games with a single learning algorithm, but a clipped reward function can result in qualitatively different behavior. Using adaptive normalization we can remove this domain-specific heuristic without diminishing overall performance.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: