Reinforcement Learning


DeepMind says it's given AI an imagination. Let's take a closer look at that

#artificialintelligence

In two papers published this week – "Imagination-Augmented Agents for Deep Reinforcement Learning" and "Learning model-based planning from scratch" – the AI biz's brain boffins, based in Britain, describe novel techniques for improving deep reinforcement learning through what can generously be described as imaginative planning. The researchers tested their imaginative agent on Sokoban, a puzzle-oriented video game created in Japan in 1981 that involves moving boxes around a warehouse, and on a spaceship navigation game. "Because agents are able to extract more knowledge from internal simulations, they can solve tasks with fewer imagination steps than conventional search methods, like Monte Carlo tree search," the researchers write. Thinking before acting makes machine learning slower, but, they contend, "This is essential in irreversible domains, where actions can have catastrophic outcomes, such as in Sokoban."
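The core idea is simple to sketch: rather than acting immediately, the agent rolls a learned model of the environment forward a few steps for each candidate action and judges the imagined outcomes. Below is a minimal, hypothetical Python sketch of that kind of model-based lookahead, not DeepMind's actual I2A architecture; `env_model`, `policy`, and `value_estimate` are assumed stand-ins for learned components.

```python
# Minimal sketch of "imagination"-style planning: roll a learned model of
# the environment forward for a few steps per candidate action and pick
# the action whose imagined trajectory scores best. This is NOT DeepMind's
# I2A architecture; env_model, policy, and value_estimate are hypothetical
# stand-ins for learned components.

def imagined_return(env_model, policy, value_estimate, state, first_action, depth=3):
    """Estimate the return of taking first_action in state by imagining
    depth steps with the learned model."""
    total, s, a = 0.0, state, first_action
    for _ in range(depth):
        s, reward = env_model(s, a)   # model predicts next state and reward
        total += reward
        a = policy(s)                 # continue the imagined rollout
    return total + value_estimate(s)  # bootstrap the tail with a value estimate


def plan_with_imagination(env_model, policy, value_estimate, state, candidate_actions):
    # Score each candidate first action by its imagined return.
    scores = {a: imagined_return(env_model, policy, value_estimate, state, a)
              for a in candidate_actions}
    return max(scores, key=scores.get)
```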


DeepMind's AI is teaching itself parkour, and the results are adorable

#artificialintelligence

The research explores how reinforcement learning (or RL) can be used to teach a computer to navigate unfamiliar and complex environments. The novelty here is that the researchers are exploring how difficult environments can teach an agent complex and robust movements (e.g., using its knee to get purchase on top of a high wall). Reinforcement learning usually produces fragile behavior that breaks down in unfamiliar circumstances, like a baby who knows how to tackle the stairs at home but can't understand an escalator. This research shows that isn't always the case, and that RL can be used to teach complex, robust movements.


Google's DeepMind uses reinforcement learning to master parkour

#artificialintelligence

Google has taught its DeepMind AI to navigate a parkour course using reinforcement learning. Reinforcement learning is the practice of rewarding desirable behaviour: the faster the AI could navigate the virtual parkour course, the greater the reward. It's fascinating (and humorous) to observe all the leaps, crouches, and limbos the AI decided were the best way of navigating the course.
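As a rough illustration of the reward structure described here, the following is a hypothetical sketch of a progress-based reward; the weights and the fall penalty are assumptions for illustration, not values from DeepMind's paper.

```python
# Minimal sketch of a speed-based reward: the faster the simulated body
# progresses along the course, the larger the per-step reward. The weights
# and the fall penalty are illustrative assumptions.

def locomotion_reward(x_before, x_after, dt, action_effort, fell_over):
    forward_velocity = (x_after - x_before) / dt  # progress along the course
    reward = 1.0 * forward_velocity               # reward moving forward quickly
    reward -= 0.005 * action_effort               # mild penalty for wasteful flailing
    if fell_over:
        reward -= 1.0                             # discourage falling over
    return reward
```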


Google's AI bots are learning to get around obstacles

Daily Mail

DeepMind researchers have trained a number of simulated bodies, including a headless 'walker,' a four-legged 'ant,' and a 3D humanoid, to learn more complex behaviours as they carry out different locomotion tasks. The results, while comical, show how these systems can learn to improve their own techniques as they interact with the different environments, eventually allowing them to run, jump, crouch and turn as needed. The approach relies on a reinforcement learning algorithm, developed using components from several recent deep learning systems.


Two Giants of AI Team Up to Head Off the Robot Apocalypse

WIRED

"If you're worried about bad things happening, the best thing we can do is study the relatively mundane things that go wrong in AI systems today," says Dario Amodei, a curly-haired researcher on OpenAI's small team working on AI safety. Amodei says the project shows it's possible to do practical work right now on making machine learning systems less able to produce nasty surprises. DeepMind and OpenAI's solution is to have reinforcement learning software take feedback from human trainers instead, and use their input to define its virtual reward system. Longer term, Amodei says, spending the next few years working on making existing, modestly smart machine learning systems more aligned with human goals could also lay the groundwork for our potential future face-off with superintelligence.


Industrial AI Podcast – Bonsai – Medium

#artificialintelligence

We've partnered with the This Week in Machine Learning & AI podcast for a seven-part series on Industrial AI. In Part 3 of TWIML's Industrial AI series, Sam Charrington digs into robotics and reinforcement learning with Berkeley PhD student Chelsea Finn. Chelsea also talks about what it's like pursuing a PhD in machine learning and how to keep up with such a rapidly advancing field. If you want to explore using reinforcement learning in your own organization, learn more at bons.ai.


Google's DeepMind Turns to Canada for Artificial Intelligence Boost

#artificialintelligence

The new research center, which will work closely with the University of Alberta, is the United Kingdom-based DeepMind's first international AI research lab. Richard Sutton, in particular, is a noted expert in a subset of AI technologies called reinforcement learning and was an advisor to DeepMind in 2010. Google has also incorporated some of the reinforcement learning techniques used by DeepMind in its data centers to discover the best calibrations that result in lower power consumption. DeepMind has also been investigated by the United Kingdom's Information Commissioner's Office for failing to comply with the Data Protection Act as it expands its technology into the healthcare space.


Automating AI to Make Enterprises Smarter, Faster

#artificialintelligence

In fact, what if machine learning software could be developed by machine learning software? Google's initiative, called AutoML, was a prime topic at the company's recent I/O software developer conference, and was also explained in a recent blog post by Quoc Le & Barret Zoph, research scientists in the Google Brain research group. The goal is to automate the design of machine learning models through reinforcement-learning algorithms. "In our approach, a controller neural net can propose a 'child' model architecture, which can then be trained and evaluated for quality on a particular task," they write.
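To make the quoted loop concrete, here is a deliberately simplified, hypothetical sketch: a crude bandit-style "controller" proposes child architectures, receives each child's evaluation score as a reward, and reinforces the design choices that scored well. The tiny search space and `train_and_evaluate` are assumptions; Google's actual approach uses a recurrent-network controller trained with policy gradients.

```python
import random

# Simplified sketch of the loop in the quote: a controller proposes a
# "child" architecture, the child is trained and scored, and the score
# comes back as the controller's reward. The search space, the bandit-style
# controller, and train_and_evaluate are illustrative assumptions, not
# Google's AutoML implementation.

SEARCH_SPACE = {
    "layers":  [2, 4, 8],
    "filters": [32, 64, 128],
    "kernel":  [3, 5],
}

def sample_architecture(preferences):
    # The "controller": sample each design choice in proportion to its learned score.
    return {name: random.choices(options,
                                 weights=[preferences[(name, o)] for o in options])[0]
            for name, options in SEARCH_SPACE.items()}

def architecture_search(train_and_evaluate, steps=50, lr=0.1):
    preferences = {(n, o): 1.0 for n, opts in SEARCH_SPACE.items() for o in opts}
    best_reward, best_arch = float("-inf"), None
    for _ in range(steps):
        arch = sample_architecture(preferences)
        reward = train_and_evaluate(arch)   # e.g. the child's validation accuracy
        if reward > best_reward:
            best_reward, best_arch = reward, arch
        for name, value in arch.items():
            preferences[(name, value)] += lr * reward  # reinforce choices that scored well
    return best_arch
```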


DALI 2017 – Workshop – Data Efficient Reinforcement Learning

#artificialintelligence

With data collection on the rise, machine learning is a hot topic. Computers that mimic human thinking are rapidly exceeding human capabilities in everything from chess to picking the winner of a song contest.