AITopics | Reinforcement Learning


Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.


Torch Dueling Deep Q-Networks

#artificialintelligence May-23-2016, 15:15:51 GMT

Deep Q-networks (DQNs) [1] have reignited interest in neural networks for reinforcement learning, proving their abilities on the challenging Arcade Learning Environment (ALE) benchmark [2]. The ALE is a reinforcement learning interface for over 50 video games for the Atari 2600; with a single architecture and choice of hyperparameters, the DQN was able to achieve superhuman scores on over half of these games. The original work has since been superseded by several advancements, a number of which can be found on GitHub. As training on the ALE can take over a week on a GPU, the code is also set up to learn how to play a simpler game of catch in a couple of hours on a CPU. Most recent deep learning research has focused on supervised learning, which involves finding a mapping from input data $x$ to target data $y$.
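
The excerpt does not spell out what "dueling" means, so the following is only a minimal sketch of the aggregation step as it is commonly described for dueling networks: a state value V(s) and per-action advantages A(s, a) are combined into Q-values, with the mean advantage subtracted. The function name and the example numbers are hypothetical, and this is not taken from the linked Torch code.

```python
import numpy as np


def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) using the mean-subtracted form."""
    advantages = np.asarray(advantages, dtype=float)
    # Subtracting the mean advantage pins down the split between V and A:
    # without it, a constant could be shifted freely between the two streams.
    centered = advantages - advantages.mean(axis=-1, keepdims=True)
    return state_value + centered


# Hypothetical outputs of the two streams for one state with three actions.
q = dueling_q_values(2.0, [1.0, 0.0, -1.0])
print(q)  # [3. 2. 1.]
```

With this form, the mean of the Q-values over actions equals the state value, which is one way to read the design choice.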

artificial intelligence, machine learning, reinforcement learning, (15 more...)

#artificialintelligence

Industry:

Leisure & Entertainment > Games (0.56)
Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Q-learning with Neural Networks

#artificialintelligence May-21-2016, 13:33:22 GMT

We've made it to what we've all been waiting for: Q-learning with neural networks. Since I'm sure a lot of people didn't follow parts 1 and 2 because they were kind of boring, I will attempt to make this post relatively (but not completely) self-contained. In this post, we will dive into using Q-learning to train an agent (player) to play Gridworld. Gridworld is a simple text-based game in which there is a 4x4 grid of tiles and 4 objects placed therein: a player, a pit, a goal, and a wall. The player can move up, down, left, or right ($a \in A = \{\text{up}, \text{down}, \text{left}, \text{right}\}$), and the point of the game is to get to the goal, where the player will receive a numerical reward. Unfortunately, we have to avoid the pit, because if we land on it we are penalized with a negative 'reward'.
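
To make the game concrete, here is a rough sketch of Gridworld with a tabular Q-learning loop. Note the swap: the post trains a neural network, while this sketch uses a plain lookup table to keep the example short. The object positions, the reward values (+10 goal, -10 pit, -1 per move), and the hyperparameters are all assumptions chosen for illustration; the excerpt does not state them.

```python
import random

SIZE = 4
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# Hypothetical layout and rewards (not given in the excerpt).
START, WALL, PIT, GOAL = (0, 0), (1, 1), (2, 2), (3, 3)
STEP_REWARD, PIT_REWARD, GOAL_REWARD = -1.0, -10.0, 10.0


def step(state, action):
    """Apply one move and return (next_state, reward, done)."""
    d_row, d_col = ACTIONS[action]
    nxt = (state[0] + d_row, state[1] + d_col)
    # Moves off the grid or into the wall leave the player where it is.
    if not (0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE) or nxt == WALL:
        nxt = state
    if nxt == GOAL:
        return nxt, GOAL_REWARD, True
    if nxt == PIT:
        return nxt, PIT_REWARD, True
    return nxt, STEP_REWARD, False


def train(episodes=5000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0
         for r in range(SIZE) for c in range(SIZE) for a in ACTIONS}
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            # Terminal states have no future value to bootstrap from.
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from START until the game ends."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


q_table = train()
path = greedy_path(q_table)
print("reached goal:", path[-1] == GOAL, "in", len(path) - 1, "moves")
```

A neural-network version would replace the `q` dictionary with a function approximator that maps a state encoding to four action values, while the update target stays the same.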

artificial intelligence, machine learning, reinforcement learning, (8 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


TensorFlow in Action: TensorBoard, Training a Model, and Deep Q-learning - Blog on All Things Cloud Foundry

@machinelearnbot May-19-2016, 18:05:22 GMT

Peter Morgan is a published author and computer science industry veteran with twenty years' experience working within the IT industry. Before entering industry, he solved high energy physics problems while enrolled in the PhD program in physics at the University of Massachusetts at Amherst. After spending three years as a Research Associate on an experiment led by Stanford University to measure the mass of the neutrino, Peter now works as a technical director at Data Science Partnership--a company he co-founded--overseeing business development and helping clients to design and implement their deep learning solutions.

artificial intelligence, machine learning, reinforcement learning, (7 more...)

@machinelearnbot

Country: North America > United States > Massachusetts (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)


An Introduction to Semi-supervised Reinforcement Learning

#artificialintelligence May-17-2016, 19:51:00 GMT

As usual, our goal is to quickly learn a policy which receives a high reward per episode. We can apply a traditional RL algorithm to the semi-supervised setting by simply ignoring all of the unlabelled episodes. This will generally result in very slow learning. The interesting challenge is to learn efficiently from the unlabelled episodes. I think that semi-supervised RL is a valuable ingredient for AI control, as well as an interesting research problem in reinforcement learning.
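
The naive baseline mentioned above (ignore every unlabelled episode) can be sketched in a few lines. The `Episode` class, the use of `None` to mark a missing reward, and the sample data are hypothetical choices made for this illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Episode:
    actions: List[str]
    reward: Optional[float]  # None marks an unlabelled episode (assumed encoding)


def labelled_only(episodes):
    """Baseline from the text: drop every episode whose reward was not observed."""
    return [ep for ep in episodes if ep.reward is not None]


# Hypothetical data: half of the episodes come without a reward label.
episodes = [
    Episode(["left", "right"], 1.0),
    Episode(["right", "right"], None),
    Episode(["left", "left"], 0.0),
    Episode(["right", "left"], None),
]
usable = labelled_only(episodes)
print(len(usable), "of", len(episodes), "episodes carry a reward")
```

A traditional RL algorithm would then train on `usable` alone, which is why learning is slow when labels are rare: the discarded episodes still cost interaction time.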

artificial intelligence, machine learning, reinforcement learning, (16 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


Build a Game AI using OpenAI's Gym & Deep Q Learning • /r/MachineLearning

@machinelearnbot May-17-2016, 13:05:23 GMT

That was actually a quite nice overview for beginners. Also, I have no clue why your speaking style reminds me of hack the planet.

deep learning, machinelearning, reinforcement learning, (5 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.58)


Deep Reinforcement Learning

#artificialintelligence May-17-2016, 09:35:17 GMT

Goal: In this week's summary we introduce the basic concepts behind reinforcement learning and some ways it is applied in very controlled environments. Motivation: Reinforcement learning methods recently attracted a wave of attention when AlphaGo ranked alongside the best human Go players. Furthermore, the complexity of Go might ease the transfer of reinforcement learning to very large NLP tasks like dialog handling. Steps: Reinforcement learning is usually applied to tasks where an environment is partially observable and a certain action has to be taken. Any kind of game basically fits this description.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games > Go (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


NN with Q-learning: which activation function with which cost function? • /r/MachineLearning

@machinelearnbot May-16-2016, 10:55:19 GMT

I've been messing around with Q-learning adapted to neural networks (NNs) after I read these two articles. I'm not yet ready to understand and implement convolutional NNs, so I just fooled around with a normal NN. I've been told to use sigmoid as the activation function and cross-entropy as the cost function. The problem is that this doesn't seem to work well with Q-learning, since I want my output to be a real number; using a probability output seems like a bad hack to me. The papers I read seem to use the quadratic cost function, but they give no detail about the activation function. I checked the GitHub of someone who implemented all of this, and he seems not to use any activation function at all.
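
One way to read the poster's observation is that the output layer is left linear (no activation) and trained with a squared-error cost, so the network can emit any real-valued Q-value. The sketch below illustrates that combination with NumPy; the layer sizes, the ReLU hidden layer, the weights, and the target value are all assumptions for illustration and do not come from the post or the repository it mentions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_hidden, n_actions = 4, 8, 2  # hypothetical sizes

W1 = rng.normal(scale=0.1, size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_actions))
b2 = np.zeros(n_actions)


def q_values(state):
    """Forward pass: non-linear hidden layer, linear (identity) output layer."""
    hidden = np.maximum(0.0, state @ W1 + b1)  # ReLU, an assumed choice
    return hidden @ W2 + b2                    # no squashing: any real number


def squared_error(state, action, target):
    """Quadratic cost on the Q-value of the action that was taken."""
    return 0.5 * (q_values(state)[action] - target) ** 2


state = np.array([1.0, 0.0, -1.0, 0.5])
# A target outside (0, 1), which a sigmoid output could never reach.
loss = squared_error(state, action=0, target=-3.5)
print(q_values(state).shape, float(loss) >= 0.0)
```

The design point is only about the last layer: hidden layers can use whatever non-linearity works, while the output stays unbounded to match the range of the returns.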

activation function, artificial intelligence, reinforcement learning, (4 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


openai/gym

#artificialintelligence May-11-2016, 21:45:32 GMT

OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. This is the gym open-source library, which gives you access to an ever-growing variety of environments. You can use it from Python code, and soon from other languages. If you're not sure where to start, we recommend beginning with the docs on our site. There are two basic concepts in reinforcement learning: the environment (namely, the outside world) and the agent (namely, the algorithm you are writing).
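
The environment/agent split can be sketched without installing anything. The toy classes below are hypothetical stand-ins, not part of the gym library; they only mirror the reset/step pattern as I understand it from the gym release of the time, where `reset()` returns an observation and `step(action)` returns an observation, a reward, a done flag, and an info dictionary. Treat the exact signatures as an assumption and check the docs for the version you use.

```python
import random


class CoinFlipEnv:
    """Hypothetical environment: guess the coin face shown in the observation."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0
        self.coin = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        """Score the action, advance the world, and report what happened."""
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        self.coin = self.rng.randint(0, 1)
        done = self.t >= self.episode_length
        return self.coin, reward, done, {}


class CopyAgent:
    """Hypothetical agent: repeats the observation back as its action."""

    def act(self, observation):
        return observation


def run_episode(env, agent):
    """The basic interaction loop: observe, act, receive reward, repeat."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = agent.act(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward


print(run_episode(CoinFlipEnv(), CopyAgent()))  # 10.0
```

Because the loop only touches `reset`, `step`, and `act`, either side can be swapped out independently, which is the point of the two-concept split.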

large language model, machine learning, reinforcement learning, (19 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.79)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.61)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.61)


Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning

Dann, Christoph, Brunskill, Emma

arXiv.org Artificial Intelligence May-11-2016

Recently, there has been significant progress in understanding reinforcement learning in discounted infinite-horizon Markov decision processes (MDPs) by deriving tight sample complexity bounds. However, in many real-world applications, an interactive learning agent operates for a fixed or bounded period of time, for example tutoring students for exams or handling customer service requests. Such scenarios can often be better treated as episodic fixed-horizon MDPs, for which only looser bounds on the sample complexity exist. A natural notion of sample complexity in this setting is the number of episodes required to guarantee a certain performance with high probability (PAC guarantee). In this paper, we derive an upper PAC bound $\tilde O(\frac{|\mathcal S|^2 |\mathcal A| H^2}{\epsilon^2} \ln\frac 1 \delta)$ and a lower PAC bound $\tilde \Omega(\frac{|\mathcal S| |\mathcal A| H^2}{\epsilon^2} \ln \frac 1 {\delta + c})$ that match up to log-terms and an additional linear dependency on the number of states $|\mathcal S|$. The lower bound is the first of its kind for this setting. Our upper bound leverages Bernstein's inequality to improve on previous bounds for episodic finite-horizon MDPs which have a time-horizon dependency of at least $H^3$.
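
To see where the "additional linear dependency on the number of states" comes from, it may help to divide the two expressions directly. Writing $U$ and $L$ (labels introduced here, not in the abstract) for the quantities inside the upper and lower bounds and ignoring the log factors hidden by the tilde notation:

```latex
U = \frac{|\mathcal S|^2 |\mathcal A| H^2}{\epsilon^2} \ln\frac{1}{\delta},
\qquad
L = \frac{|\mathcal S| |\mathcal A| H^2}{\epsilon^2} \ln\frac{1}{\delta + c},
\qquad
\frac{U}{L} = |\mathcal S| \cdot \frac{\ln(1/\delta)}{\ln\bigl(1/(\delta + c)\bigr)}
```

The factors $|\mathcal A|$, $H^2$, and $\epsilon^{-2}$ cancel, leaving one factor of $|\mathcal S|$ and a ratio of logarithmic terms.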

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

1510.08906

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.86)


A New 'Gym' for Building and Testing A.I. - Dice Insights

#artificialintelligence May-10-2016, 18:21:23 GMT

If you're interested in working with machine learning and artificial-intelligence algorithms--but unsure of how to start--check out the OpenAI Gym, now in beta. The premise behind OpenAI Gym is simple: it's a toolkit for building reinforcement learning (RL) algorithms, which govern bots' decision-making and motor-control capabilities. Reinforcement learning is a key element in A.I. development, as it allows software to deal with random, unpredictable environments; one "classic" problem involves balancing an untethered pole on a rolling cart. OpenAI is a non-profit "artificial intelligence research company" funded by some heavy hitters in the tech world, including Tesla CEO Elon Musk and venture capitalist Peter Thiel. Its altruistic goal is to develop open-source A.I. software that's "friendly" to humanity. According to a blog posting accompanying the launch of OpenAI Gym, RL research is slowed by two factors: a need for better benchmarks, and a lack of standardization of environments used in publications.

large language model, machine learning, reinforcement learning, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.83)
