AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

Compression and Localization in Reinforcement Learning for ATARI Games

Moniz, Joel Ruben Antony, Patra, Barun, Garg, Sarthak

arXiv.org Artificial IntelligenceApr-20-2019

Deep neural networks have become commonplace in the domain of reinforcement learning, but are often expensive in terms of the number of parameters needed. While compressing deep neural networks has of late assumed great importance to overcome this drawback, little work has been done to address this problem in the context of reinforcement learning agents. This work aims at making first steps towards model compression in an RL agent. In particular, we compress networks to drastically reduce the number of parameters in them (to sizes less than 3% of their original size), further facilitated by applying a global max pool after the final convolution layer, and propose using Actor-Mimic in the context of compression. Finally, we show that this global max-pool allows for weakly supervised object localization, improving the ability to identify the agent's points of focus.

machine learning, reinforcement learning, student network, (15 more...)

arXiv.org Artificial Intelligence

1904.09489

Country: North America (0.29)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deep Q Learning Driven CT Pancreas Segmentation with Geometry-Aware U-Net

Man, Yunze, Huang, Yangsibo, Feng, Junyi, Li, Xi, Wu, Fei

arXiv.org Artificial IntelligenceApr-19-2019

Segmentation of pancreas is important for medical image analysis, yet it faces great challenges of class imbalance, background distractions and non-rigid geometrical features. To address these difficulties, we introduce a Deep Q Network(DQN) driven approach with deformable U-Net to accurately segment the pancreas by explicitly interacting with contextual information and extract anisotropic features from pancreas. The DQN based model learns a context-adaptive localization policy to produce a visually tightened and precise localization bounding box of the pancreas. Furthermore, deformable U-Net captures geometry-aware information of pancreas by learning geometrically deformable filters for feature extraction. Experiments on NIH dataset validate the effectiveness of the proposed framework in pancreas segmentation.

machine learning, reinforcement learning, segmentation, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TMI.2019.2911588

1904.0912

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.89)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Emergence of Compositional Language with Deep Generational Transmission

Cogswell, Michael, Lu, Jiasen, Lee, Stefan, Parikh, Devi, Batra, Dhruv

arXiv.org Artificial IntelligenceApr-19-2019

Consider a collaborative task that requires communication. Two agents are placed in an environment and must create a language from scratch in order to coordinate. Recent work has been interested in what kinds of languages emerge when deep reinforcement learning agents are put in such a situation, and in particular in the factors that cause language to be compositional-i.e. meaning is expressed by combining words which themselves have meaning. Evolutionary linguists have also studied the emergence of compositional language for decades, and they find that in addition to structural priors like those already studied in deep learning, the dynamics of transmitting language from generation to generation contribute significantly to the emergence of compositionality. In this paper, we introduce these cultural evolutionary dynamics into language emergence by periodically replacing agents in a population to create a knowledge gap, implicitly inducing cultural transmission of language. We show that this implicit cultural transmission encourages the resulting languages to exhibit better compositional generalization and suggest how elements of cultural dynamics can be further integrated into populations of deep agents.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

1904.09067

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Learn about the Types of Machine Learning Algorithms

#artificialintelligenceApr-18-2019, 07:42:57 GMT

Isn't it true that we are living in a digitalized world that has eliminated tons of human work by positioning automation?. In fact, it is the most defined period as Google's self-driving car has been invented. But, this period is not in its final stages instead is multiplying to create many more awesome things to surface in the near future. The most exciting concept that sits beside all these major transformations is Machine Learning, which is nothing but allowing computers to learn on their own to arrive at useful insights. Supervised learning is similar to a teacher teaching his students with examples and after sufficient practice, the teacher stops supervising and let the students derive at their own solution.

artificial intelligence, machine learning, reinforcement learning, (5 more...)

#artificialintelligence

Country: Asia > India > Karnataka > Bengaluru (0.08)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.58)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.57)

Add feedback

Introduction to Deep Q-Learning for Reinforcement Learning (in Python)

#artificialintelligenceApr-18-2019, 07:41:26 GMT

I have always been fascinated with games. The seemingly infinite options available to perform an action under a tight timeline – it's a thrilling experience. So when I read about the incredible algorithms DeepMind was coming up with (like AlphaGo and AlphaStar), I was hooked. I wanted to learn how to make these systems on my own machine. And that led me into the world of deep reinforcement learning (Deep RL).

artificial intelligence, machine learning, reinforcement learning, (16 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games > Computer Games (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Decoding Molecular Graph Embeddings with Reinforcement Learning

Kearnes, Steven, Li, Li, Riley, Patrick

arXiv.org Machine LearningApr-18-2019

We present RL-VAE, a graph-to-graph variational autoencoder that uses reinforcement learning to decode molecular graphs from latent embeddings. Methods have been described previously for graph-to-graph autoencoding, but these approaches require sophisticated decoders that increase the complexity of training and evaluation (such as requiring parallel encoders and decoders or non-trivial graph matching). Here, we repurpose a simple graph generator to enable efficient decoding and generation of molecular graphs.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Machine Learning

1904.08915

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

When is a Prediction Knowledge?

Kearney, Alex, Pilarski, Patrick M.

arXiv.org Artificial IntelligenceApr-18-2019

Within Reinforcement Learning, there is a growing collection of research which aims to express all of an agent's knowledge of the world through predictions about sensation, behaviour, and time. This work can be seen not only as a collection of architectural proposals, but also as the beginnings of a theory of machine knowledge in reinforcement learning. Recent work has expanded what can be expressed using predictions, and developed applications which use predictions to inform decision-making on a variety of synthetic and real-world problems. While promising, we here suggest that the notion of predictions as knowledge in reinforcement learning is as yet underdeveloped: some work explicitly refers to predictions as knowledge, what the requirements are for considering a prediction to be knowledge have yet to be well explored. This specification of the necessary and sufficient conditions of knowledge is important; even if claims about the nature of knowledge are left implicit in technical proposals, the underlying assumptions of such claims have consequences for the systems we design. These consequences manifest in both the way we choose to structure predictive knowledge architectures, and how we evaluate them. In this paper, we take a first step to formalizing predictive knowledge by discussing the relationship of predictive knowledge learning methods to existing theories of knowledge in epistemology. Specifically, we explore the relationships between Generalized Value Functions and epistemic notions of Justification and Truth.

machine learning, prediction, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

1904.09024

Country: North America > Canada > Alberta (0.30)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)

Add feedback

Making Meaning: Semiotics Within Predictive Knowledge Architectures

Kearney, Alex, Oxton, Oliver

arXiv.org Artificial IntelligenceApr-18-2019

Within Reinforcement Learning, there is a fledgling approach to conceptualizing the environment in terms of predictions. Central to this predictive approach is the assertion that it is possible to construct ontologies in terms of predictions about sensation, behaviour, and time---to categorize the world into entities which express all aspects of the world using only predictions. This construction of ontologies is integral to predictive approaches to machine knowledge where objects are described exclusively in terms of how they are perceived. In this paper, we ground the Pericean model of semiotics in terms of Reinforcement Learning Methods, describing Peirce's Three Categories in the notation of General Value Functions. Using the Peircean model of semiotics, we demonstrate that predictions alone are insufficient to construct an ontology; however, we identify predictions as being integral to the meaning-making process. Moreover, we discuss how predictive knowledge provides a particularly stable foundation for semiosis\textemdash the process of making meaning\textemdash and suggest a possible avenue of research to design algorithmic methods which construct semantics and meaning using predictions.

machine learning, prediction, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

1904.09023

Country:

North America > Canada > Alberta (0.16)
North America > Canada > Quebec (0.15)
North America > Canada > Ontario (0.14)

Genre: Research Report (0.50)

Industry: Professional Services (0.41)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.91)

Add feedback

Improving Interactive Reinforcement Agent Planning with Human Demonstration

Li, Guangliang, Gomez, Randy, Nakamura, Keisuke, Lin, Jinying, Zhang, Qilei, He, Bo

arXiv.org Artificial IntelligenceApr-18-2019

TAMER has proven to be a powerful interactive reinforcement learning method for allowing ordinary people to teach and personalize autonomous agents' behavior by providing evaluative feedback. However, a TAMER agent planning with UCT---a Monte Carlo Tree Search strategy, can only update states along its path and might induce high learning cost especially for a physical robot. In this paper, we propose to drive the agent's exploration along the optimal path and reduce the learning cost by initializing the agent's reward function via inverse reinforcement learning from demonstration. We test our proposed method in the RL benchmark domain---Grid World---with different discounts on human reward. Our results show that learning from demonstration can allow a TAMER agent to learn a roughly optimal policy up to the deepest search and encourage the agent to explore along the optimal path. In addition, we find that learning from demonstration can improve the learning efficiency by reducing total feedback, the number of incorrect actions and increasing the ratio of correct actions to obtain an optimal policy, allowing a TAMER agent to converge faster.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

1904.08621

Country: Asia (0.14)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

r/deeplearning - Learning to paint: A Painting AI

#artificialintelligenceApr-17-2019, 10:47:55 GMT

Abstract: We show how to teach machines to paint like human painters, who can use a few strokes to create fantastic paintings. By combining the neural renderer and model-based Deep Reinforcement Learning (DRL), our agent can decompose texture-rich images into strokes and make long-term plans. For each stroke, the agent directly determines the position and color of the stroke. Excellent visual effect can be achieved using hundreds of strokes. The training process does not require experience of human painting or stroke tracking data.

artificial intelligence, machine learning, reinforcement learning, (1 more...)

#artificialintelligence

Industry: Media > News (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.73)

Add feedback