AITopics | Ruderman, Avraham

Ruderman, Avraham

Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures

Uesato, Jonathan, Kumar, Ananya, Szepesvari, Csaba, Erez, Tom, Ruderman, Avraham, Anderson, Keith, Dvijotham, Krishnamurthy, Heess, Nicolas, Kohli, Pushmeet

arXiv.org Machine Learning, Dec-4-2018

This paper addresses the problem of evaluating learning systems in safety-critical domains such as autonomous driving, where failures can have catastrophic consequences. We focus on two problems: searching for scenarios in which learned agents fail and assessing their probability of failure. The standard method for agent evaluation in reinforcement learning, Vanilla Monte Carlo, can miss failures entirely, leading to the deployment of unsafe agents. We demonstrate this is an issue for current agents, where even matching the compute used for training is sometimes insufficient for evaluation. To address this shortcoming, we draw upon the rare event probability estimation literature and propose an adversarial evaluation approach. Our approach focuses evaluation on adversarially chosen situations, while still providing unbiased estimates of failure probabilities. The key difficulty is in identifying these adversarial situations: since failures are rare, there is little signal to drive optimization. To solve this, we propose a continuation approach that learns failure modes in related but less robust agents. Our approach also allows reuse of data already collected for training the agent. We demonstrate the efficacy of adversarial evaluation on two standard domains: humanoid control and simulated driving. Experimental results show that our methods can find catastrophic failures and estimate failure rates of agents multiple orders of magnitude faster than standard evaluation schemes, in minutes to hours rather than days.
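
The gap between vanilla Monte Carlo and evaluation focused on adversarially chosen situations can be illustrated with a toy importance-sampling sketch. This is not the paper's method. The Gaussian "environment", the failure threshold, the hand-picked proposal, and all function names are assumptions made for illustration. It shows only the general idea from the rare-event estimation literature: sample mostly from failure-prone situations, then reweight so the failure-probability estimate stays unbiased.

```python
import math
import random


def true_failure_prob(threshold):
    """Exact P(Z > threshold) for Z ~ N(0, 1), used only as a reference value."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


def vanilla_monte_carlo(threshold, n, rng):
    """Sample situations from the true distribution and count failures."""
    failures = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > threshold)
    return failures / n


def importance_sampling(threshold, n, rng):
    """Sample from a proposal N(threshold, 1) centred on the failure region.

    Each failure is weighted by p(x) / q(x) = exp(-threshold * x + threshold**2 / 2),
    the ratio of the N(0, 1) density to the N(threshold, 1) density, which keeps
    the estimator unbiased.
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            total += math.exp(-threshold * x + 0.5 * threshold * threshold)
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    threshold, n = 4.0, 20000  # hypothetical toy settings
    print("reference          :", true_failure_prob(threshold))
    print("vanilla Monte Carlo:", vanilla_monte_carlo(threshold, n, rng))
    print("importance sampling:", importance_sampling(threshold, n, rng))
```

With a failure probability of roughly 3e-5, a vanilla run of 20,000 samples expects to see less than one failure, so it will often report zero. The reweighted estimator draws about half of its samples from the failure region.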

agent, deep learning, neural network, (20 more...)

arXiv: 1812.01647

Country: North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment (0.67)
Automobiles & Trucks (0.48)
Transportation > Ground > Road (0.34)
Information Technology > Robotics & Automation (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Human-level performance in first-person multiplayer games with population-based deep reinforcement learning

Jaderberg, Max, Czarnecki, Wojciech M., Dunning, Iain, Marris, Luke, Lever, Guy, Castaneda, Antonio Garcia, Beattie, Charles, Rabinowitz, Neil C., Morcos, Ari S., Ruderman, Avraham, Sonnerat, Nicolas, Green, Tim, Deason, Louise, Leibo, Joel Z., Silver, David, Hassabis, Demis, Kavukcuoglu, Koray, Graepel, Thore

arXiv.org Machine Learning, Jul-3-2018

Recent progress in artificial intelligence through reinforcement learning (RL) has shown great success on increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents, and environments reflecting this degree of complexity remain an open challenge. In this work, we demonstrate for the first time that an agent can achieve human-level performance in a popular 3D multiplayer first-person video game, Quake III Arena Capture the Flag, using only pixels and game points as input. These results were achieved by a novel two-tier optimisation process in which a population of independent RL agents is trained concurrently from thousands of parallel matches, with agents playing in teams together and against each other on randomly generated environments. Each agent in the population learns its own internal reward signal to complement the sparse delayed reward from winning, and selects actions using a novel temporally hierarchical representation that enables the agent to reason at multiple timescales. During game-play, these agents display human-like behaviours such as navigating, following, and defending based on a rich learned representation that is shown to encode high-level game knowledge. In an extensive tournament-style evaluation, the trained agents exceeded the win-rate of strong human players both as teammates and opponents, and proved far stronger than existing state-of-the-art agents. These results demonstrate a significant jump in the capabilities of artificial agents, bringing us closer to the goal of human-level intelligence.

agent, computer game, deep learning, (21 more...)

arXiv: 1807.01281

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Learned Deformation Stability in Convolutional Neural Networks

Ruderman, Avraham, Rabinowitz, Neil, Morcos, Ari S., Zoran, Daniel

arXiv.org Machine Learning, Apr-12-2018

Conventional wisdom holds that interleaved pooling layers in convolutional neural networks lead to stability to small translations and deformations. In this work, we investigate this claim empirically. We find that while pooling confers stability to deformation at initialization, the deformation stability at each layer changes significantly over the course of training and even decreases in some layers, suggesting that deformation stability is not unilaterally helpful. Surprisingly, after training, the pattern of deformation stability across layers is largely independent of whether or not pooling was present. We then show that a significant factor in determining deformation stability is filter smoothness. Moreover, filter smoothness and deformation stability are not simply a consequence of the distribution of input images, but depend crucially on the joint distribution of images and labels. This work demonstrates a way in which biases such as deformation stability can in fact be learned and provides an example of understanding how a simple property of learned network weights contributes to the overall network computation.

deep learning, deformation stability, neural network, (20 more...)

arXiv: 1804.04438

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Model-Free Episodic Control

Blundell, Charles, Uria, Benigno, Pritzel, Alexander, Li, Yazhe, Ruderman, Avraham, Leibo, Joel Z, Rae, Jack, Wierstra, Daan, Hassabis, Demis

arXiv.org Machine Learning, Jun-14-2016

State-of-the-art deep reinforcement learning algorithms take many millions of interactions to attain human-level performance. Humans, on the other hand, can very quickly exploit highly rewarding nuances of an environment upon first discovery. In the brain, such rapid learning is thought to depend on the hippocampus and its capacity for episodic memory. Here we investigate whether a simple model of hippocampal episodic control can learn to solve difficult sequential decision-making tasks. We demonstrate that it not only attains a highly rewarding strategy significantly faster than state-of-the-art deep reinforcement learning algorithms, but also achieves a higher overall reward on some of the more challenging domains.

computer game, deep learning, episodic controller, (21 more...)

arXiv: 1606.0446

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Games > Computer Games (0.69)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Tighter Variational Representations of f-Divergences via Restriction to Probability Measures

Ruderman, Avraham, Reid, Mark, Garcia-Garcia, Dario, Petterson, James

arXiv.org Machine Learning, Jun-18-2012

We show that the variational representations for f-divergences currently used in the literature can be tightened. This has implications for a number of recently proposed methods based on this representation. As an example application, we use our tighter representation to derive a general f-divergence estimator based on two i.i.d. samples and derive the dual program for this estimator that performs well empirically. We also point out a connection between our estimator and MMD.
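
As a rough numerical illustration of what a "tighter" variational representation can mean, the sketch below compares two well-known lower bounds on the KL divergence, which is the f-divergence with f(t) = t log t. The first uses the convex conjugate f*(s) = exp(s - 1); the second is the Donsker-Varadhan form. Treating the Donsker-Varadhan form as the representative tightened bound is an assumption of this sketch and is not stated in the abstract. The discrete distributions and function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) for discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def conjugate_bound(p, q, g):
    """E_P[g] - E_Q[f*(g)] with f(t) = t log t, so f*(s) = exp(s - 1)."""
    expect_p = sum(pi * gi for pi, gi in zip(p, g))
    expect_q = sum(qi * math.exp(gi - 1.0) for qi, gi in zip(q, g))
    return expect_p - expect_q


def donsker_varadhan_bound(p, q, g):
    """E_P[g] - log E_Q[exp(g)]; never below conjugate_bound because log(u) <= u / e."""
    expect_p = sum(pi * gi for pi, gi in zip(p, g))
    log_partition = math.log(sum(qi * math.exp(gi) for qi, gi in zip(q, g)))
    return expect_p - log_partition


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    q = [0.2, 0.3, 0.5]
    # Test function g = log(p / q), deliberately not shifted by a constant.
    g = [math.log(pi / qi) for pi, qi in zip(p, q)]
    print("KL divergence         :", kl_divergence(p, q))
    print("conjugate bound       :", conjugate_bound(p, q, g))
    print("Donsker-Varadhan bound:", donsker_varadhan_bound(p, q, g))
```

For this choice of g, the Donsker-Varadhan bound recovers the KL divergence exactly, while the conjugate bound falls short by exp(-1). The conjugate bound becomes tight only after g is shifted by the right constant, whereas the Donsker-Varadhan bound is unaffected by constant shifts.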

artificial intelligence, estimator, machine learning, (15 more...)

arXiv: 1206.4664

Country:

Oceania > Australia (0.14)
North America > United States (0.14)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
