AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Reimagining Language Learning with NLP and Reinforcement Learning

#artificialintelligence | Feb-22-2017, 13:40:51 GMT

The way we learn natural languages hasn't really changed for decades. We now have beautiful apps like Duolingo and Spaced Repetition software like Anki, but I'm talking about our fundamental approach. We still follow pre-defined curricula, and do essentially random exercises. Learning isn't personalized, and learning isn't driven by data. And I think there's a big opportunity to change that.

learner, machine learning, reinforcement learning, (14 more...)

#artificialintelligence

Industry: Education > Curriculum > Subject-Specific Education (0.43)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.41)

Towards a Common Implementation of Reinforcement Learning for Multiple Robotic Tasks

Martínez-Tenor, Angel, Fernández-Madrigal, Juan Antonio, Cruz-Martín, Ana, González-Jiménez, Javier

arXiv.org Artificial Intelligence | Feb-21-2017

Mobile robots are increasingly being employed for performing complex tasks in dynamic environments. Reinforcement learning (RL) methods are recognized to be promising for specifying such tasks in a relatively simple manner. However, the strong dependency between the learning method and the task to learn is a well-known problem that restricts practical implementations of RL in robotics, often requiring major modifications of parameters and adding other techniques for each particular task. In this paper we present a practical core implementation of RL which enables the learning process for multiple robotic tasks with minimal per-task tuning or none. Based on value iteration methods, this implementation includes a novel approach for action selection, called Q-biased softmax regression (QBIASSR), which avoids poor performance of the learning process when the robot reaches new unexplored states. Our approach takes advantage of the structure of the state space by attending to the physical variables involved (e.g., distances to obstacles, X, Y, θ pose, etc.), so that experienced sets of states may favor the decision-making process of unexplored or rarely explored states. This improvement has a relevant role in reducing the tuning of the algorithm for particular tasks. Experiments with real and simulated robots, performed with the software framework also introduced here, show that our implementation is effectively able to learn different robotic tasks without tuning the learning method. Results also suggest that the combination of true online SARSA(λ) with QBIASSR can outperform the existing RL core algorithms in low-dimensional robotic tasks.
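
The abstract describes QBIASSR as a variant of softmax action selection but does not give its details here. As background only, the sketch below shows plain softmax (Boltzmann) action selection over Q-values, which is the standard technique such a method would build on. It is not QBIASSR itself; the function names and the `temperature` parameter are illustrative assumptions, not taken from the paper.

```python
import math
import random


def softmax_probabilities(q_values, temperature=1.0):
    """Return Boltzmann (softmax) selection probabilities for a list of Q-values."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    max_q = max(q_values)
    exps = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, temperature=1.0, rng=random):
    """Sample an action index in proportion to its softmax probability."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

A lower temperature concentrates probability on the highest-valued action, while a higher one makes the choice closer to uniform. In a state the agent has never visited, all Q-values are typically equal, so plain softmax picks uniformly at random; that is the situation the abstract says QBIASSR is designed to improve.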

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.eswa.2017.11.011

arXiv: 1702.06329

Country:

North America > United States (0.93)
Europe (0.67)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Google's Artificial Intelligence Becoming 'Human-Like' With Aggressive, Greedy Behavior – We Are Change

#artificialintelligence | Feb-20-2017, 06:05:23 GMT

Will artificial intelligence get more aggressive and selfish the more intelligent it becomes? A new report out of Google's DeepMind AI division suggests this is possible based on the outcome of millions of video game sessions it monitored. The results of the two games indicate that as artificial intelligence becomes more complex, it is more likely to take extreme measures to ensure victory, including sabotage and greed. The first game, Gathering, is a simple one that involves gathering digital fruit. Two DeepMind AI agents were pitted against each other after being trained in the ways of deep reinforcement learning.

large language model, machine learning, reinforcement learning, (11 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.57)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

GitHub - Microsoft/AirSim: Open source simulator based on Unreal Engine for autonomous vehicles from Microsoft AI & Research

#artificialintelligence | Feb-16-2017, 07:35:26 GMT

AirSim is a simulator for drones (and soon other vehicles) built on Unreal Engine. It is open-source, cross platform and supports hardware-in-loop with popular flight controllers such as Pixhawk for physically and visually realistic simulations. It is developed as an Unreal plugin that can simply be dropped in to any Unreal environment you want. Our goal is to develop AirSim as a platform for AI research to experiment with deep learning, computer vision and reinforcement learning algorithms for autonomous vehicles. For this purpose, AirSim also exposes APIs to retrieve data and control vehicles in a platform independent way.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.57)

Google's Artificial Intelligence Is Becoming 'Human-Like' -- and That Might Be a Bad Thing

#artificialintelligence | Feb-15-2017, 21:25:06 GMT

large language model, machine learning, reinforcement learning, (11 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.57)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Microsoft offers drone lovers a simulator

#artificialintelligence | Feb-15-2017, 16:15:09 GMT

Microsoft has created and released a simulator for drone pilots to help them avoid destroying their toys while running machine learning experiments. Not unreasonably, Redmond has figured out that UAV-fanciers would like a way to generate training data for machine learning algorithms governing autonomous flight in a simulator, instead of having the toys buzz about in meatspace where hobbyists will need to take out their wallets every time they crash into a tree, or remortgage should they happen to collide with a litigious passer-by. Dubbed AirSim, the simulator for drones (and Microsoft plans for other vehicles to be supported soon) has been built on Unreal Engine, but is otherwise open source and available today on GitHub. It is designed as a platform for artificial intelligence researchers to gobble training data and experiment with their various deep learning, computer vision and reinforcement learning algorithms to achieve functioning autonomous vehicles. While an official Linux build is due in a few weeks, the current code base is cross-platform and supports hardware-in-loop with flight controllers – such as Pixhawk – directly interacting with the simulation environment.

machine learning, reinforcement learning, simulator, (9 more...)

#artificialintelligence

Industry:

Transportation > Air (0.59)
Information Technology > Robotics & Automation (0.59)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.59)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.59)

Deep Q Learning with Keras and Gym – IIoT & Machine Learning

#artificialintelligence | Feb-15-2017, 04:25:31 GMT

This blog post will demonstrate how deep reinforcement learning (deep q learning) can be implemented and applied to play a CartPole game using Keras and Gym, in only 78 lines of code! I'll explain everything without requiring any prerequisite knowledge about reinforcement learning.
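
The linked post builds a deep Q network with Keras; its code is not reproduced here. As a dependency-free illustration of the Q-learning rule that deep Q learning approximates with a neural network, here is a minimal tabular sketch. The function name, the dictionary-based table, and the default `alpha` and `gamma` values are assumptions made for this example, not the post's implementation.

```python
from collections import defaultdict


def q_learning_update(q_table, state, action, reward, next_state, n_actions,
                      done, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Bootstrapped target: reward plus the discounted best next-state value,
    # with no bootstrapping once the episode has ended.
    if done:
        best_next = 0.0
    else:
        best_next = max(q_table[(next_state, a)] for a in range(n_actions))
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q_table[(state, action)] += alpha * (target - q_table[(state, action)])
    return q_table[(state, action)]


# Hypothetical usage: unseen (state, action) pairs default to a value of 0.0.
q = defaultdict(float)
q_learning_update(q, "s0", 0, 1.0, "s1", n_actions=2, done=False)
```

CartPole has a continuous state, so a table like this would need discretized states; a deep Q network replaces the table lookup with a learned function of the raw observation.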

artificial intelligence, machine learning, reinforcement learning, (1 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Accelerated Gradient Temporal Difference Learning

Pan, Yangchen (Indiana University) | White, Adam (Indiana University) | White, Martha (Indiana University)

AAAI Conferences | Feb-14-2017

The family of temporal difference (TD) methods spans a spectrum from computationally frugal linear methods like TD(λ) to data-efficient least squares methods. Least squares methods make the best use of available data, directly computing the TD solution, and thus do not require tuning a typically highly sensitive learning rate parameter, but they require quadratic computation and storage. Recent algorithmic developments have yielded several sub-quadratic methods that use an approximation to the least squares TD solution, but incur bias. In this paper, we propose a new family of accelerated gradient TD (ATD) methods that (1) provide similar data efficiency benefits to least squares methods, at a fraction of the computation and storage, (2) significantly reduce parameter sensitivity compared to linear TD methods, and (3) are asymptotically unbiased. We illustrate these claims with a proof of convergence in expectation and experiments on several benchmark domains and a large-scale industrial energy allocation domain.
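
To make the "computationally frugal" end of that spectrum concrete, the sketch below shows one step of linear TD(0), the λ = 0 special case of TD(λ). It is not the ATD method proposed in the paper; the function name and default parameter values are illustrative assumptions.

```python
def td0_update(weights, features, reward, next_features, alpha=0.1, gamma=0.99):
    """Return updated weights after one linear TD(0) step."""
    # Linear value estimates for the current and next state.
    value = sum(w * x for w, x in zip(weights, features))
    next_value = sum(w * x for w, x in zip(weights, next_features))
    # TD error: how far the bootstrapped target is from the current estimate.
    td_error = reward + gamma * next_value - value
    # Step each weight along its feature, scaled by the learning rate alpha.
    return [w + alpha * td_error * x for w, x in zip(weights, features)]
```

Each step costs time and memory linear in the number of features, but the result depends directly on `alpha`, the learning rate the abstract describes as highly sensitive. A terminal next state can be represented in this sketch by an all-zero `next_features` vector.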

algorithm, approximation, atd, (15 more...)

AAAI Conferences

Thirty-First AAAI Conference on Artificial Intelligence

Country:

North America > Canada > Alberta (0.14)
North America > United States > Indiana (0.04)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Learning Options in Multiobjective Reinforcement Learning

Bonini, Rodrigo Cesar (Escola Politécnica da Universidade de São Paulo) | Silva, Felipe Leno da (Escola Politécnica da Universidade de São Paulo) | Costa, Anna Helena Reali (Escola Politécnica da Universidade de São Paulo)

AAAI Conferences | Feb-14-2017

Reinforcement Learning (RL) is a successful technique to train autonomous agents. However, the classical RL methods take a long time to learn how to solve tasks. Option-based solutions can be used to accelerate learning and transfer learned behaviors across tasks by encapsulating a partial policy into an action. However, the literature reports only single-agent and single-objective option-based methods, whereas many RL tasks, especially real-world problems, are better described through multiple objectives. We here propose a method to learn options in Multiobjective Reinforcement Learning domains in order to accelerate learning and reuse knowledge across tasks. Our initial experiments in the Goldmine Domain show that our proposal learns useful options that accelerate learning in multiobjective domains. Our next steps are to use the learned options to transfer knowledge across tasks and evaluate this method with stochastic policies.

algorithm, objective, reward function, (15 more...)

AAAI Conferences

Thirty-First AAAI Conference on Artificial Intelligence

Country:

South America > Brazil > São Paulo (0.07)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Arnold: An Autonomous Agent to Play FPS Games

Chaplot, Devendra Singh (Carnegie Mellon University) | Lample, Guillaume (Carnegie Mellon University)

AAAI Conferences | Feb-14-2017

Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions. However, most of these games take place in 2D environments that are fully observable to the agent. In this paper, we present Arnold, a completely autonomous agent to play First-Person Shooter Games using only screen pixel data and demonstrate its effectiveness on Doom, a classical first-person shooter game. Arnold is trained with deep reinforcement learning using a recent Action-Navigation architecture, which uses separate deep neural networks for exploring the map and fighting enemies. Furthermore, it utilizes several techniques, such as augmenting high-level game features, reward shaping and sequential updates, for efficient training and effective performance. Arnold outperforms average humans as well as in-built game bots on different variations of the deathmatch. It also obtained the highest kill-to-death ratio in both tracks of the Visual Doom AI Competition and placed second in terms of the number of frags.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

AAAI Conferences

Thirty-First AAAI Conference on Artificial Intelligence

Country: North America > United States (0.15)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)
