Konidaris, George


Robust and Efficient Transfer Learning with Hidden-Parameter Markov Decision Processes

arXiv.org Machine Learning

We introduce a new formulation of the Hidden Parameter Markov Decision Process (HiP-MDP), a framework for modeling families of related tasks using low-dimensional latent embeddings. Our new framework correctly models the joint uncertainty in the latent parameters and the state space. We also replace the original Gaussian Process-based model with a Bayesian Neural Network, enabling more scalable inference. Thus, we expand the scope of the HiP-MDP to applications with higher dimensions and more complex dynamics.


Mean Actor Critic

arXiv.org Machine Learning

We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning. MAC is a policy gradient algorithm that uses the agent's explicit representation of all action values to estimate the gradient of the policy, rather than using only the actions that were actually executed. This significantly reduces variance in the gradient updates and removes the need for a variance reduction baseline. We show empirical results on two control domains where MAC performs as well as or better than other policy gradient approaches, and on five Atari games, where MAC is competitive with state-of-the-art policy search algorithms.


Reinforcement Learning with Parameterized Actions

AAAI Conferences

We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions—discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. We introduce the Q-PAMDP algorithm for learning in these domains, show that it converges to a local optimum, and compare it to direct policy search in the goal-scoring and Platform domains.


Reinforcement Learning with Parameterized Actions

arXiv.org Artificial Intelligence

We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions-discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. We introduce the Q-PAMDP algorithm for learning in these domains, show that it converges to a local optimum, and compare it to direct policy search in the goal-scoring and Platform domains.


A Summary of the Twenty-Ninth AAAI Conference on Artificial Intelligence

AI Magazine

The Twenty-Ninth AAAI Conference on Artificial Intelligence, (AAAI-15) was held in January 2015 in Austin, Texas (USA) The conference program was cochaired by Sven Koenig and Blai Bonet. This report contains reflective summaries of the main conference, the robotics program, the AI and robotics workshop, the virtual agent exhibition, the what's hot track, the competition panel, the senior member track, student and outreach activities, the student abstract and poster program, the doctoral consortium, the women's mentoring event, and the demonstrations program.


A Summary of the Twenty-Ninth AAAI Conference on Artificial Intelligence

AI Magazine

The Twenty-Ninth AAAI Conference on Artificial Intelligence, (AAAI-15) was held in January 2015 in Austin, Texas (USA) The conference program was cochaired by Sven Koenig and Blai Bonet. This report contains reflective summaries of the main conference, the robotics program, the AI and robotics workshop, the virtual agent exhibition, the what's hot track, the competition panel, the senior member track, student and outreach activities, the student abstract and poster program, the doctoral consortium, the women's mentoring event, and the demonstrations program.


Symbol Acquisition for Probabilistic High-Level Planning

AAAI Conferences

We introduce a framework that enables an agent to autonomously learn its own symbolic representation of a low-level, continuous environment. Propositional symbols are formalized as names for probability distributions, providing a natural means of dealing with uncertain representations and probabilistic plans. We determine the symbols that are sufficient for computing the probability with which a plan will succeed, and demonstrate the acquisition of a symbolic representation in a computer game domain.


Reports of the AAAI 2014 Conference Workshops

AI Magazine

The AAAI-14 Workshop program was held Sunday and Monday, July 27–28, 2012, at the Québec City Convention Centre in Québec, Canada. The AAAI-14 workshop program included fifteen workshops covering a wide range of topics in artificial intelligence. The titles of the workshops were AI and Robotics; Artificial Intelligence Applied to Assistive Technologies and Smart Environments; Cognitive Computing for Augmented Human Intelligence; Computer Poker and Imperfect Information; Discovery Informatics; Incentives and Trust in Electronic Communities; Intelligent Cinematography and Editing; Machine Learning for Interactive Systems: Bridging the Gap between Perception, Action and Communication; Modern Artificial Intelligence for Health Analytics; Multiagent Interaction without Prior Coordination; Multidisciplinary Workshop on Advances in Preference Handling; Semantic Cities -- Beyond Open Data to Models, Standards and Reasoning; Sequential Decision Making with Big Data; Statistical Relational AI; and The World Wide Web and Public Health Intelligence. This article presents short summaries of those events.


Reports of the AAAI 2014 Conference Workshops

AI Magazine

The AAAI-14 Workshop program was held Sunday and Monday, July 27–28, 2012, at the Québec City Convention Centre in Québec, Canada. Canada. The AAAI-14 workshop program included fifteen workshops covering a wide range of topics in artificial intelligence. The titles of the workshops were AI and Robotics; Artificial Intelligence Applied to Assistive Technologies and Smart Environments; Cognitive Computing for Augmented Human Intelligence; Computer Poker and Imperfect Information; Discovery Informatics; Incentives and Trust in Electronic Communities; Intelligent Cinematography and Editing; Machine Learning for Interactive Systems: Bridging the Gap between Perception, Action and Communication; Modern Artificial Intelligence for Health Analytics; Multiagent Interaction without Prior Coordination; Multidisciplinary Workshop on Advances in Preference Handling; Semantic Cities — Beyond Open Data to Models, Standards and Reasoning; Sequential Decision Making with Big Data; Statistical Relational AI; and The World Wide Web and Public Health Intelligence. This article presents short summaries of those events.


Constructing Symbolic Representations for High-Level Planning

AAAI Conferences

We consider the problem of constructing a symbolic description of a continuous, low-level environment for use in planning. We show that symbols that can represent the preconditions and effects of an agent's actions are both necessary and sufficient for high-level planning. This eliminates the symbol design problem when a representation must be constructed in advance, and in principle enables an agent to autonomously learn its own symbolic representations. The resulting representation can be converted into PDDL, a canonical high-level planning representation that enables very fast planning.