AITopics | deepmind paper

Collaborating Authors

deepmind paper

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

DeepMind papers at ICLR 2018 DeepMind

#artificialintelligenceMay-11-2018, 14:21:30 GMT

Here you can read details of all DeepMind's accepted papers and find out where you can see the accompanying poster sessions and talks. We introduce a new algorithm for reinforcement learning called Maximum a posteriori Policy Optimisation (MPO) based on coordinate ascent on a relative entropy objective. We show that several existing methods can directly be related to our derivation. We develop two off-policy algorithms and demonstrate that they are competitive with the state-of-the-art in deep reinforcement learning. In particular, for continuous control, our method outperforms existing methods with respect to sample efficiency, premature convergence and robustness to hyperparameter settings.

artificial intelligence, deep learning, machine learning, (17 more...)

#artificialintelligence

Country: North America > Canada (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

DeepMind papers at NIPS 2017 DeepMind

#artificialintelligenceApr-28-2018, 21:16:04 GMT

Learning in models with discrete latent variables is challenging due to high-variance gradient estimators. Previous approaches either produced high-variance, unbiased gradients or low-variance, biased gradients. REBAR uses control variates and the reparameterization trick to get the best of both: low-variance, unbiased gradients that result in faster convergence to a better result. "We describe a new family of approaches for imagination-based planning...We also introduce architectures which provide new ways for agents to learn and construct plans to maximise the efficiency of a task. These architectures are efficient, robust to complex and imperfect models, and can adopt flexible strategies for exploiting their imagination. The agents we introduce benefit from an'imagination encoder'- a neural network which learns to extract any information useful for the agent's future decisions, but ignore that which is not relevant."

deepmind paper, large language model, machine learning, (7 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.76)

Add feedback

DeepMind papers at ICML 2017 (part one) DeepMind

@machinelearnbotAug-7-2017, 17:10:05 GMT

We consider the problem of provably optimal exploration in reinforcement learning for finite horizon MDPs. We show that an optimistic modification to value iteration achieves a regret bound of order (HSAT)1/2 (up to a logarithmic factor) where H is the time horizon, S the number of states, A the number of actions and T the number of time-steps. This result improves over the best previous known bound HS(AT)1/2 achieved by the UCRL2 algorithm of [Jaksch, Ortner, Auer, 2010]. The key significance of our new results is that for large T, the sample complexity of our algorithm matches the optimal lower bound of Ω(HSAT)1/2. Our analysis contains two key insights.

deepmind paper, large language model, machine learning, (7 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.76)

Add feedback

Learning Policies For Learning Policies -- Meta Reinforcement Learning (RL²) in Tensorflow

#artificialintelligenceFeb-1-2017, 16:15:24 GMT

Reinforcement Learning provides a framework for training agents to solve problems in the world. One of the limitations of these agents however is their inflexibility once trained. They are able to learn a policy to solve a specific problem (formalized as an MDP), but that learned policy is often useless in new problems, even relatively similar ones. Imagine the simplest possible agent: one trained to solve a two-armed bandit task in which one arm always provides a positive reward, and the other arm always provides no reward. Using any RL algorithm such as Q-Learning or Policy Gradient, the agent can quickly learn to always choose the arm with the positive reward.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback