 Reinforcement Learning


Representations for Continuous Learning

AAAI Conferences

Systems deployed in unstructured environments must be able to adapt to novel situations. This requires the ability to perform in domains that may be vastly different from training domains. My dissertation focuses on the representations used in lifelong learning and how these representations enable predictions and knowledge sharing over time, allowing an agent to continuously learn and adapt in changing environments. Specifically, my contributions will enable lifelong learning systems to efficiently accumulate data, use prior knowledge to predict models for novel tasks, and alter existing models to account for changes in the environment.


Improving Deep Reinforcement Learning with Knowledge Transfer

AAAI Conferences

Recent successes in applying Deep Learning techniques to Reinforcement Learning algorithms have led to a wave of breakthrough developments in agent theory and established the field of Deep Reinforcement Learning (DRL). While DRL has shown great results for single-task learning, the multi-task case is still underrepresented in the available literature. This D.Sc. research proposal aims at extending DRL to the multi-task case by leveraging the power of Transfer Learning (TL) algorithms to improve the training time and results for multi-task learning. Our focus lies on defining a novel framework for scalable DRL agents that detects similarities between tasks and balances various TL techniques, such as parameter initialization and policy or skill transfer.


Accelerating Multiagent Reinforcement Learning through Transfer Learning

AAAI Conferences

Reinforcement Learning (RL) is a widely used solution for sequential decision-making problems and has been used in many complex domains. However, RL algorithms suffer from scalability issues, especially when multiple agents are acting in a shared environment. This research intends to accelerate learning in multiagent sequential decision-making tasks by reusing previous knowledge, both from past solutions and advising between agents. We intend to contribute a Transfer Learning framework focused on Multiagent RL, requiring as few domain-specific hand-coded parameters as possible.


SEAPoT-RL: Selective Exploration Algorithm for Policy Transfer in RL

AAAI Conferences

We propose a new method for transferring a policy from a source task to a target task in model-based reinforcement learning. Our work is motivated by scenarios where a robotic agent operates in similar but challenging environments, such as hospital wards, differentiated by structural arrangements or obstacles, such as furniture. We address problems that require fast responses adapted from incomplete, prior knowledge of the agent in new scenarios. We present an efficient selective exploration strategy that maximally reuses the source task policy. Reuse efficiency is effected through identifying sub-spaces that are different in the target environment, thus limiting the exploration needed in the target task. We empirically show that SEAPoT performs better in terms of jump starts and cumulative average rewards, as compared to existing state-of-the-art policy reuse methods.


Policy Reuse in Deep Reinforcement Learning

AAAI Conferences

Driven by recent developments in Artificial Intelligence research, a promising new technology for building intelligent agents has evolved. The approach is termed Deep Reinforcement Learning and combines the classic field of Reinforcement Learning (RL) with the representational power of modern Deep Learning approaches. It is very well suited for single task learning but needs a long time to learn any new task. To speed up this process, we propose to extend the concept to multi-task learning by adapting Policy Reuse, a Transfer Learning approach from classic RL, to use with Deep Q-Networks.
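As a rough illustration of how a reused policy can steer exploration, the sketch below implements a pi-reuse style action selection: with probability `psi` the agent follows the action suggested by a previously learned policy, and otherwise it acts epsilon-greedily on the Q-values of the policy being learned. The function name, its parameters, and the plain list of Q-values standing in for the output of a Deep Q-Network are assumptions made for this example; the abstract does not describe the authors' implementation.

```python
import random


def pi_reuse_action(q_values, past_action, psi, epsilon, rng):
    """Pick an action index, biased toward a previously learned policy.

    q_values    -- Q-value estimates for the current state (hypothetical
                   stand-in for the output of a Deep Q-Network)
    past_action -- action the reused (source) policy would take here
    psi         -- probability of following the reused policy
    epsilon     -- probability of a uniformly random action otherwise
    rng         -- random.Random instance, passed in for reproducibility
    """
    if rng.random() < psi:
        return past_action                      # reuse prior knowledge
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))     # plain exploration
    # Greedy action with respect to the new policy's Q-values.
    return max(range(len(q_values)), key=q_values.__getitem__)


rng = random.Random(0)
always_reuse = pi_reuse_action([0.1, 0.9, 0.3], past_action=2,
                               psi=1.0, epsilon=0.1, rng=rng)
pure_greedy = pi_reuse_action([0.1, 0.9, 0.3], past_action=2,
                              psi=0.0, epsilon=0.0, rng=rng)
```

In a typical schedule `psi` would start high and decay within each episode, so the agent leans on the old policy early and on its own estimates later; the decay rate is a tuning choice that this sketch leaves out.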


Handwriting Profiling Using Generative Adversarial Networks

AAAI Conferences

Handwriting is a skill learned by humans from a very early age. The ability to develop one's own unique handwriting as well as mimic another person's handwriting is a task learned by the brain with practice. This paper deals with this very problem, where an intelligent system tries to learn the handwriting of an entity using Generative Adversarial Networks (GANs). We propose a modified architecture of DCGAN (Radford, Metz, and Chintala 2015) to achieve this. We also discuss applying reinforcement learning techniques to achieve faster learning. We hope our algorithm gives new insights in this area; its uses include identification of forged documents, signature verification, computer-generated art, and digitization of documents, among others. Our early implementation of the algorithm performs well on MNIST datasets.


Cornhole: A Widely-Accessible AI Robotics Task

AAAI Conferences

In this paper we present the game of cornhole as a compelling, accessible, and adaptable AI robotics task. Cornhole is a fun and social game with simple rules, but involves strategy and physical training for humans to play competitively; thus, developing a robot that can play at the level of even the average human player presents a multitude of opportunities for curricular integration at a variety of levels. We characterize the AI tasks involved with the game, and present results and resources gained from preliminary offerings.


Maximizing the Probability of Arriving on Time: A Practical Q-Learning Method

AAAI Conferences

The stochastic shortest path problem is of crucial importance for the development of sustainable transportation systems. Existing methods based on the probability tail model seek the path that maximizes the probability of arriving at the destination before a deadline. However, they suffer from low accuracy and/or high computational cost. We design a novel Q-learning method in which the converged Q-values can be interpreted as the actual probabilities of arriving on time, which improves accuracy. By further adopting dynamic neural networks to learn the value function, our method scales well to large road networks with arbitrary deadlines. Experimental results on real road networks demonstrate the significant advantages of our method over its counterparts.
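To make the "Q-values as arrival probabilities" idea concrete, here is a minimal tabular sketch, not the authors' neural-network method. It assumes a state is a (node, remaining time budget) pair, the reward is 1 only when the destination is reached within the budget, and there is no discounting, so each Q-value estimates the probability of arriving on time. The three-node network, its travel-time distributions, and all function names are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy road network: node -> next node -> [(travel_time, prob)].
EDGES = {
    "A": {"D": [(2, 0.5), (6, 0.5)], "B": [(1, 1.0)]},
    "B": {"D": [(2, 0.8), (5, 0.2)]},
}
DEST = "D"


def sample_time(outcomes, rng):
    """Draw one travel time from a discrete distribution."""
    times, probs = zip(*outcomes)
    return rng.choices(times, weights=probs, k=1)[0]


def q_learning(start, deadline, episodes=20000, seed=0):
    """Tabular Q-learning; Q[(node, budget, next_node)] ~ P(on time)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    visits = defaultdict(int)
    for _ in range(episodes):
        node, budget = start, deadline
        while node != DEST and budget >= 0:
            # Uniformly random behavior policy (Q-learning is off-policy).
            nxt = rng.choice(sorted(EDGES[node]))
            new_budget = budget - sample_time(EDGES[node][nxt], rng)
            if new_budget < 0:
                target = 0.0        # deadline missed
            elif nxt == DEST:
                target = 1.0        # arrived on time
            else:
                # Undiscounted bootstrap from the best next action.
                target = max(q[(nxt, new_budget, a)] for a in EDGES[nxt])
            key = (node, budget, nxt)
            visits[key] += 1
            # Step size 1/n turns the update into a running average.
            q[key] += (target - q[key]) / visits[key]
            node, budget = nxt, new_budget
    return q


q = q_learning("A", deadline=4)
p_direct = q[("A", 4, "D")]   # exact value in this toy network: 0.5
p_via_b = q[("A", 4, "B")]    # exact value in this toy network: 0.8
```

In this toy network the detour through B is the better choice under a deadline of 4 even though it uses more edges, because the direct edge exceeds the budget half of the time.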


Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition

AAAI Conferences

A key challenge in fine-grained recognition is how to find and represent discriminative local regions. Recent attention models are capable of learning discriminative region localizers only from category labels with reinforcement learning. However, because they do not use any explicit part information, they are not able to accurately find multiple distinctive regions. In this work, we introduce an attribute-guided attention localization scheme where the local region localizers are learned under the guidance of part attribute descriptions. By designing a novel reward strategy, we are able to learn to locate regions that are spatially and semantically distinctive with a reinforcement learning algorithm. The attribute labeling requirement of the scheme is more amenable than the accurate part location annotation required by traditional part-based fine-grained recognition methods. Experimental results on the CUB-200-2011 dataset demonstrate the superiority of the proposed scheme on both fine-grained recognition and attribute recognition.


An Efficient Approach to Model-Based Hierarchical Reinforcement Learning

AAAI Conferences

We propose a model-based approach to hierarchical reinforcement learning that exploits shared knowledge and selective execution at different levels of abstraction, to efficiently solve large, complex problems. Our framework adopts a new transition dynamics learning algorithm that identifies the common action-feature combinations of the subtasks, and evaluates the subtask execution choices through simulation. The framework is sample efficient, and tolerates uncertain and incomplete problem characterization of the subtasks. We test the framework on common benchmark problems and complex simulated robotic environments. It compares favorably against the state-of-the-art algorithms, and scales well in very large problems.