Reinforcement Learning


Reimagining Language Learning with NLP and Reinforcement Learning

#artificialintelligence

The way we learn natural languages hasn't really changed for decades. We now have beautiful apps like Duolingo and Spaced Repetition software like Anki, but I'm talking about our fundamental approach. We still follow pre-defined curricula, and do essentially random exercises. Learning isn't personalized, and learning isn't driven by data. And I think there's a big opportunity to change that.


Towards a Common Implementation of Reinforcement Learning for Multiple Robotic Tasks

arXiv.org Artificial Intelligence

Mobile robots are increasingly being employed for performing complex tasks in dynamic environments. Reinforcement learning (RL) methods are recognized to be promising for specifying such tasks in a relatively simple manner. However, the strong dependency between the learning method and the task to learn is a well-known problem that restricts practical implementations of RL in robotics, often requiring major modifications of parameters and the addition of other techniques for each particular task. In this paper we present a practical core implementation of RL which enables the learning process for multiple robotic tasks with minimal or no per-task tuning. Based on value iteration methods, this implementation includes a novel approach for action selection, called Q-biased softmax regression (QBIASSR), which avoids poor performance of the learning process when the robot reaches new unexplored states. Our approach takes advantage of the structure of the state space by attending to the physical variables involved (e.g., distances to obstacles, X, Y, θ pose, etc.), so that experienced sets of states may inform the decision-making process in unexplored or rarely explored states. This improvement plays an important role in reducing the tuning of the algorithm for particular tasks. Experiments with real and simulated robots, performed with the software framework also introduced here, show that our implementation is effectively able to learn different robotic tasks without tuning the learning method. Results also suggest that the combination of true online SARSA(λ) with QBIASSR can outperform the existing RL core algorithms in low-dimensional robotic tasks.
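The abstract does not spell out how QBIASSR biases its choices, so the sketch below shows only the plain softmax (Boltzmann) action selection that such a method would presumably build on. It is an illustration under assumptions, not the authors' implementation: the function names, the `temperature` parameter, and the list-of-Q-values representation are all hypothetical.

```python
import math
import random


def softmax_probabilities(q_values, temperature=1.0):
    """Return Boltzmann (softmax) probabilities for a list of Q-values.

    Hypothetical helper for illustration; this is plain softmax,
    not the Q-biased variant (QBIASSR) named in the abstract.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, temperature=1.0, rng=random):
    """Sample an action index in proportion to its softmax probability."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

A lower temperature concentrates probability on the highest-valued action, while a higher one spreads it more evenly. In a state that has never been visited, all Q-values are typically equal, so plain softmax degenerates to a uniform random choice, which is the kind of situation the abstract says QBIASSR is meant to handle better.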


Google's Artificial Intelligence Becoming 'Human-Like' With Aggressive, Greedy Behavior - We Are Change

#artificialintelligence

Will artificial intelligence get more aggressive and selfish the more intelligent it becomes? A new report out of Google's DeepMind AI division suggests this is possible based on the outcome of millions of video game sessions it monitored. The results of the two games indicate that as artificial intelligence becomes more complex, it is more likely to take extreme measures to ensure victory, including sabotage and greed. The first game, Gathering, is a simple one that involves gathering digital fruit. Two DeepMind AI agents were pitted against each other after being trained in the ways of deep reinforcement learning.


GitHub - Microsoft/AirSim: Open source simulator based on Unreal Engine for autonomous vehicles from Microsoft AI & Research

#artificialintelligence

AirSim is a simulator for drones (and soon other vehicles) built on Unreal Engine. It is open-source, cross platform and supports hardware-in-loop with popular flight controllers such as Pixhawk for physically and visually realistic simulations. It is developed as an Unreal plugin that can simply be dropped in to any Unreal environment you want. Our goal is to develop AirSim as a platform for AI research to experiment with deep learning, computer vision and reinforcement learning algorithms for autonomous vehicles. For this purpose, AirSim also exposes APIs to retrieve data and control vehicles in a platform independent way.


Google's Artificial Intelligence Is Becoming 'Human-Like' -- and That Might Be a Bad Thing

#artificialintelligence

Will artificial intelligence get more aggressive and selfish the more intelligent it becomes? A new report out of Google's DeepMind AI division suggests this is possible based on the outcome of millions of video game sessions it monitored. The results of the two games indicate that as artificial intelligence becomes more complex, it is more likely to take extreme measures to ensure victory, including sabotage and greed. The first game, Gathering, is a simple one that involves gathering digital fruit. Two DeepMind AI agents were pitted against each other after being trained in the ways of deep reinforcement learning.


Microsoft offers drone lovers a simulator

#artificialintelligence

Microsoft has created and released a simulator for drone pilots to help them avoid destroying their toys while running machine learning experiments. Not unreasonably, Redmond has figured out that UAV-fanciers would like a way to generate training data for machine learning algorithms governing autonomous flight in a simulator, instead of having the toys buzz about in meatspace where hobbyists will need to take out their wallets every time they crash into a tree, or remortgage should they happen to collide with a litigious passer-by. Dubbed AirSim, the simulator for drones (and Microsoft plans for other vehicles to be supported soon) has been built on Unreal Engine, but is otherwise open source and available today on GitHub. It is designed as a platform for artificial intelligence researchers to gobble training data and experiment with their various deep learning, computer vision and reinforcement learning algorithms to achieve functioning autonomous vehicles. While an official Linux build is due in a few weeks, the current code base is cross-platform and supports hardware-in-loop with flight controllers – such as Pixhawk – directly interacting with the simulation environment.


Deep Q Learning with Keras and Gym – IIoT & Machine Learning

#artificialintelligence

This blog post will demonstrate how deep reinforcement learning (deep Q-learning) can be implemented and applied to play a CartPole game using Keras and Gym, in only 78 lines of code! I'll explain everything without requiring any prerequisite knowledge about reinforcement learning.
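The snippet above does not include the post's code, so the following is only a rough sketch of two pieces that deep Q-learning agents commonly rely on: a bounded replay memory and the one-step Q-learning target. It deliberately leaves out the Keras network and the Gym environment; `ReplayMemory`, `td_target`, and the default `capacity` and `gamma` values are assumptions made for illustration, not taken from the post.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples.

    Hypothetical helper; the oldest transition is dropped once the
    buffer is full.
    """

    def __init__(self, capacity=2000):
        self.buffer = deque(maxlen=capacity)

    def remember(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Sample without replacement, capped at the number stored so far.
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.95):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    For a terminal transition there is no next state to bootstrap from,
    so the target is just the reward.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full agent, `next_q_values` would come from the network's prediction for the next state, and the network would then be fitted so that its output for the taken action moves toward `td_target(...)` for each sampled transition.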


Accelerated Gradient Temporal Difference Learning

AAAI Conferences

The family of temporal difference (TD) methods spans a spectrum from computationally frugal linear methods like TD(λ) to data-efficient least squares methods. Least squares methods make the best use of available data by directly computing the TD solution, and thus do not require tuning a typically highly sensitive learning rate parameter, but they require quadratic computation and storage. Recent algorithmic developments have yielded several sub-quadratic methods that use an approximation to the least squares TD solution, but incur bias. In this paper, we propose a new family of accelerated gradient TD (ATD) methods that (1) provide similar data efficiency benefits to least squares methods at a fraction of the computation and storage, (2) significantly reduce parameter sensitivity compared to linear TD methods, and (3) are asymptotically unbiased. We illustrate these claims with a proof of convergence in expectation and experiments on several benchmark domains and a large-scale industrial energy allocation domain.
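For context on the "computationally frugal" end of that spectrum, here is a minimal sketch of one step of standard linear TD(λ) with accumulating eligibility traces. It is the baseline the abstract compares against, not the proposed ATD method; the function name, the plain-list feature vectors, and the default step size are assumptions for illustration.

```python
def td_lambda_update(weights, traces, features, reward, next_features,
                     alpha=0.1, gamma=0.99, lam=0.9, done=False):
    """Apply one linear TD(lambda) step and return (weights, traces, delta).

    Uses accumulating traces. Cost is linear in the number of features,
    in contrast to the quadratic cost of least squares TD methods.
    """
    value = sum(w * x for w, x in zip(weights, features))
    # A terminal next state has value zero by definition.
    next_value = 0.0 if done else sum(
        w * x for w, x in zip(weights, next_features))
    # TD error: how much the one-step return disagrees with the estimate.
    delta = reward + gamma * next_value - value
    # Decay the old traces, then add the current feature vector.
    new_traces = [gamma * lam * e + x for e, x in zip(traces, features)]
    # Move every weight along its trace, scaled by the TD error.
    new_weights = [w + alpha * delta * e
                   for w, e in zip(weights, new_traces)]
    return new_weights, new_traces, delta
```

The step size `alpha` is the "highly sensitive learning rate parameter" the abstract refers to: too large and the weights diverge, too small and learning is slow. Traces should be reset to zero at the start of each episode.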


Learning Options in Multiobjective Reinforcement Learning

AAAI Conferences

Reinforcement Learning (RL) is a successful technique to train autonomous agents. However, the classical RL methods take a long time to learn how to solve tasks. Option-based solutions can be used to accelerate learning and transfer learned behaviors across tasks by encapsulating a partial policy into an action. However, the literature reports only single-agent and single-objective option-based methods, whereas many RL tasks, especially real-world problems, are better described through multiple objectives. We here propose a method to learn options in Multiobjective Reinforcement Learning domains in order to accelerate learning and reuse knowledge across tasks. Our initial experiments in the Goldmine Domain show that our proposal learns useful options that accelerate learning in multiobjective domains. Our next steps are to use the learned options to transfer knowledge across tasks and evaluate this method with stochastic policies.


Arnold: An Autonomous Agent to Play FPS Games

AAAI Conferences

Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions. However, most of these games take place in 2D environments that are fully observable to the agent. In this paper, we present Arnold, a completely autonomous agent to play First-Person Shooter Games using only screen pixel data and demonstrate its effectiveness on Doom, a classic first-person shooter game. Arnold is trained with deep reinforcement learning using a recent Action-Navigation architecture, which uses separate deep neural networks for exploring the map and fighting enemies. Furthermore, it uses several techniques, such as augmenting high-level game features, reward shaping, and sequential updates, for efficient training and effective performance. Arnold outperforms average humans as well as in-built game bots on different variations of the deathmatch. It also obtained the highest kill-to-death ratio in both tracks of the Visual Doom AI Competition and placed second in terms of the number of frags.