 Reinforcement Learning


Data Science: Supervised Machine Learning in Python

#artificialintelligence

In recent years, we've seen a resurgence in AI, or artificial intelligence, and machine learning. Machine learning has led to some amazing results, like being able to analyze medical images and predict diseases on par with human experts. Google's AlphaGo program was able to beat a world champion in the strategy game Go using deep reinforcement learning. Machine learning is even being used to program self-driving cars, which is going to change the automotive industry forever. Imagine a world with drastically reduced car accidents, simply by removing the element of human error.


ARTIFICIAL INTELLIGENCE: Deep Q-Learning-SENSOR WORLD_2 Part 10 PHASE I completed

#artificialintelligence

This is a Deep Reinforcement Learning project which will eventually employ a Deep Q-Learning algorithm. In SENSOR WORLD_2, a robot with sensors starts with no information about its environment and uses its sensors to define its State Space. Here the Robot explores randomly to detect all of Sensor World.


Reinforcement Learning: The Business Use Case, Part 1

#artificialintelligence

The buzz around reinforcement learning started with the advent of AlphaGo by DeepMind, the AI system built to play the game Go. Since then, various companies have invested a great deal of time, energy, and research, and today reinforcement learning is one of the hot topics within Deep Learning. That said, most businesses are struggling to find use cases for reinforcement learning or ways to incorporate it into their business logic. So far, it's been studied only in risk-free, observed environments that are easy to simulate, which means that industries like finance, health, insurance, and tech consulting are reluctant to risk their own money to explore its applications. What's more, the aspect of "risk factoring" within reinforcement learning puts a high strain on systems.


Bonsai is Bringing the BRAINs to Microsoft

#artificialintelligence

Microsoft has become increasingly keen on AI and has sought to commercialize the ideas its own researchers come up with. This is a strategy also employed by its main rivals Amazon and Google, as well as other big technology companies. It plans to do even more in this field with the acquisition of AI startup Bonsai, in order to ease the on-ramp for building AI on Microsoft. Bonsai is quite the expert at combining the power of machine teaching and deep reinforcement learning into an end-to-end platform that is accessible to data scientists, software engineers and subject matter experts. This makes Microsoft very well-positioned to lead the penetration of AI into the enterprise market.


Scale-invariant temporal history (SITH): optimal slicing of the past in an uncertain world

arXiv.org Artificial Intelligence

In both the human brain and any general artificial intelligence (AI), a representation of the past is necessary to predict the future. However, perfect storage of all experiences is not feasible. One possibility, utilized in many applications, is to retain information about the past in a buffer. A limitation of this approach is that, although events in the buffer are represented with perfect accuracy, the resources necessary to represent information at multiple time scales go up rapidly. Here we present a neurally-plausible, compressed, scale-free memory representation we call Scale-Invariant Temporal History (SITH). This representation covers an exponentially large period of time at the cost of sacrificing temporal accuracy for events further in the past. The form of this decay is scale-invariant and can be shown to be optimal, in that it is able to respond to worlds with a wide range of relevant time scales. We demonstrate the utility of this representation in learning to play video games at different levels of complexity. In these environments, SITH exhibits better learning performance than both a fixed-size buffer history representation and a representation with exponentially decaying features. Whereas the buffer performs well as long as the temporal dependencies can be represented within the buffer, SITH performs well over a much larger range of time scales with the same amount of resources. Finally, we discuss how the application of SITH, along with other human-inspired models of cognition, could improve reinforcement and machine learning algorithms in general.
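
The trade-off the abstract describes (covering an exponentially large span of the past while giving up temporal accuracy for older events) can be illustrated with a small sketch. This is not the SITH construction itself, which the abstract does not spell out; it is only a toy comparison under stated assumptions. The helper names `log_spaced_bins` and `compress`, and the bin ratio of 2, are hypothetical choices made for illustration.

```python
def log_spaced_bins(n_slots, ratio=2):
    """Return (start, end) lag ranges whose widths grow geometrically.

    Hypothetical helper: slot i covers lags start .. end-1, so older
    slots are wider and therefore less precise in time.
    """
    bins = []
    start = 1
    for _ in range(n_slots):
        end = start * ratio
        bins.append((start, end))
        start = end
    return bins


def compress(history, bins):
    """Average the past inside each bin.

    history[0] is the most recent value (lag 1), history[k] is lag k+1.
    """
    out = []
    for start, end in bins:
        window = history[start - 1:end - 1]
        out.append(sum(window) / len(window) if window else 0.0)
    return out


n_slots = 8
bins = log_spaced_bins(n_slots)

# A plain buffer with 8 slots remembers lags 1..8 exactly.
buffer_span = n_slots
# Eight geometrically growing slots reach back to lag 255.
compressed_span = bins[-1][1] - 1

# A single event 200 steps ago falls outside the buffer but is still
# visible in the oldest slot, smeared over that slot's 128 lags.
history = [0.0] * compressed_span
history[199] = 1.0
memory = compress(history, bins)
print(buffer_span, compressed_span, memory[-1])
```

With the same eight slots, the buffer sees nothing older than lag 8, while the log-spaced memory still registers the event at lag 200, although only as a blurred average over the oldest slot.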


Policy Optimization as Wasserstein Gradient Flows

arXiv.org Machine Learning

Policy optimization is a core component of reinforcement learning (RL), and most existing RL methods directly optimize parameters of a policy based on maximizing the expected total reward, or its surrogate. Though often achieving encouraging empirical success, its underlying mathematical principle on policy-distribution optimization is unclear. We place policy optimization into the space of probability measures, and interpret it as Wasserstein gradient flows. On the probability-measure space, under specified circumstances, policy optimization becomes a convex problem in terms of distribution optimization. To make optimization feasible, we develop efficient algorithms by numerically solving the corresponding discrete gradient flows. Our technique is applicable to several RL settings, and is related to many state-of-the-art policy-optimization algorithms. Empirical results verify the effectiveness of our framework, often obtaining better performance compared to related algorithms.
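
For readers unfamiliar with the baseline the abstract refers to, "directly optimizing parameters of a policy to maximize expected reward" can be sketched as follows. This is a minimal illustration of plain parameter-space gradient ascent, not the Wasserstein gradient flow method the paper proposes. The three-armed bandit, its reward values, the learning rate, and all function names are assumptions made for the example.

```python
import math


def softmax(theta):
    """Turn a list of preferences into action probabilities."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def expected_reward(theta, rewards):
    """J(theta) = sum_a pi_theta(a) * r(a) for a one-step problem."""
    return sum(p * r for p, r in zip(softmax(theta), rewards))


def policy_gradient_step(theta, rewards, lr=0.5):
    """One exact gradient-ascent step on J for a softmax policy.

    For a softmax policy, dJ/dtheta_a = pi(a) * (r(a) - J).
    """
    probs = softmax(theta)
    j = sum(p * r for p, r in zip(probs, rewards))
    grad = [p * (r - j) for p, r in zip(probs, rewards)]
    return [t + lr * g for t, g in zip(theta, grad)]


# Hypothetical three-armed bandit with fixed, known rewards.
rewards = [1.0, 0.0, 0.5]
theta = [0.0, 0.0, 0.0]
initial = expected_reward(theta, rewards)

for _ in range(200):
    theta = policy_gradient_step(theta, rewards)

final = expected_reward(theta, rewards)
probs = softmax(theta)
print(initial, final, probs)
```

The update moves probability mass toward the highest-reward action by adjusting the parameters `theta`; the paper's contribution, by contrast, is to view the optimization over the policy distribution itself.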


Exponential improvements for quantum-accessible reinforcement learning

arXiv.org Artificial Intelligence

Quantum computers can offer dramatic improvements over classical devices for data analysis tasks such as prediction and classification. However, less is known about the advantages that quantum computers may bring in the setting of reinforcement learning, where learning is achieved via interaction with a task environment. Here, we consider a special case of reinforcement learning, where the task environment allows quantum access. In addition, we impose certain "naturalness" conditions on the task environment, which rule out the kinds of oracle problems that are studied in quantum query complexity (and for which quantum speedups are well-known). Within this framework of quantum-accessible reinforcement learning environments, we demonstrate that quantum agents can achieve exponential improvements in learning efficiency, surpassing previous results that showed only quadratic improvements. A key step in the proof is to construct task environments that encode well-known oracle problems, such as Simon's problem and Recursive Fourier Sampling, while satisfying the above "naturalness" conditions for reinforcement learning. Our results suggest that quantum agents may perform well in certain game-playing scenarios, where the game has recursive structure, and the agent can learn by playing against itself.


Researchers teach an AI how to dribble

#artificialintelligence

While this animated fellow looks like something out of NBA 2K18, it's really an AI that's learning how to dribble in real time. The AI starts out fumbling the ball a bit and by cycle 95 it is able to do some real Harlem Globetrotters stuff. In short, what you're watching is a human-like avatar learning a very specialized human movement. To do this, researchers at Carnegie Mellon and DeepMotion, Inc. created a "physics-based, real-time method for controlling animated characters that can learn dribbling skills from experience." The system, which uses "deep reinforcement learning," can use motion capture data to learn basic movements.


Artificial Intelligence: Reinforcement Learning in Python

#artificialintelligence

When people talk about artificial intelligence, they usually don't mean supervised and unsupervised machine learning. These tasks are pretty trivial compared to what we think of AIs doing: playing chess and Go, driving cars, and beating video games at a superhuman level. Reinforcement learning has recently become popular for doing all of that and more. Much like deep learning, a lot of the theory was discovered in the 70s and 80s, but it hasn't been until recently that we've been able to observe firsthand the amazing results that are possible. In 2016 we saw Google's AlphaGo beat the world champion in Go.


Advanced AI: Deep Reinforcement Learning in Python

#artificialintelligence

This course is all about the application of deep learning and neural networks to reinforcement learning. If you've taken my first reinforcement learning class, then you know that reinforcement learning is on the bleeding edge of what we can do with AI. Specifically, the combination of deep learning with reinforcement learning has led to AlphaGo beating a world champion in the strategy game Go, it has led to self-driving cars, and it has led to machines that can play video games at a superhuman level. Reinforcement learning has been around since the 70s, but none of this has been possible until now. The world is changing at a very fast pace. The state of California is changing its regulations so that self-driving car companies can test their cars without a human in the car to supervise.