"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
It was not long ago that the world watched World Chess Champion Garry Kasparov lose a decisive match against a supercomputer. IBM's Deep Blue embodied the state of the art in the late 1990s, when a machine defeating a world (human) champion at a complex game such as chess was still unheard of. Fast-forward to today, and not only have supercomputers greatly surpassed Deep Blue in chess, they have managed to achieve superhuman performance in a string of other games, often much more complex than chess, ranging from Go to Dota to classic Atari titles. Many of these games have been mastered just in the last five years, pointing to a pace of innovation much quicker than the two decades prior. Recently, Google released work on Agent57, which for the first time showcased superior performance over existing benchmarks across all 57 Atari 2600 games.
A startup called CogitAI has developed a platform that lets companies use reinforcement learning, the technique that gave AlphaGo mastery of the board game Go. Gaining experience: AlphaGo, an AI program developed by DeepMind, taught itself to play Go by practicing. It's practically impossible for a programmer to manually code in the best strategies for winning. Instead, reinforcement learning let the program figure out how to defeat the world's best human players on its own. Drug delivery: Reinforcement learning is still an experimental technology, but it is gaining a foothold in industry.
What will be the next thing to revolutionize data science in 2019? Reinforcement learning will be the next big thing in data science in 2019. While RL has been around for a long time in academia, it has hardly seen any industry adoption at all. Why? Partly because there have been plenty of low-hanging fruits to pick in predictive analytics, but mostly because of the barriers in implementation, knowledge and available tools. The potential value in using RL in proactive analytics and AI is enormous, but it also demands a greater skillset to master.
However even these reinforcement learning algorithms couldn't transfer what they'd learned about one task to acquiring a new task. In order to realize this achievement, DeepMind supercharged a reinforcement learning algorithm called A3C. In so-called actor-critic reinforcement learning, of which A3C is one variety, acting and learning are decoupled so that one neural network, the critic, evaluates the other, the actor. Together, they drive the learning process. This was already the state of the art, but DeepMind added a new off-policy correction algorithm called V-trace to the mix, which made the learning more efficient, and crucially, better able to achieve positive transfer between tasks.
We are building technology that enables existing robot hardware to handle a much wider range of tasks where existing solutions break down, for example, bin picking of complex shapes, kitting, assembly, depalletizing of irregular stacks, and manipulation of deformable objects such as wires, cables, fabrics, linens, fluid-bags, and food. To equip existing robots with these skills, our software builds on the latest advances in deep reinforcement learning, deep imitation learning, and few-shot learning, to all of which the founding team has made significant contributions.
Canada has produced several big breakthroughs in artificial intelligence in recent years, and its government is keen to establish the country as a global epicenter of AI. The country's prime minister, Justin Trudeau, also hopes that the technology will learn Canadian values as it grows up. Speaking at a major AI event in Toronto today, Trudeau demonstrated an impressive enthusiasm for AI and machine learning, at one point even taking a stab at describing the concept of deep reinforcement learning, an approach that lets computers learn to do complex things that can't be programmed manually (see "10 Breakthrough Technologies 2017: Reinforcement Learning"). Both deep reinforcement learning and deep neural networks, which the method exploits, were pioneered by researchers working at Canadian universities. The country's government is now investing in big efforts to spur more AI research.