Planning & Scheduling
Learning Plannable Representations with Causal InfoGAN
Kurutach, Thanard, Tamar, Aviv, Yang, Ge, Russell, Stuart J., Abbeel, Pieter
In recent years, deep generative models have been shown to 'imagine' convincing high-dimensional observations such as images, audio, and even video, learning directly from raw data. In this work, we ask how to imagine goal-directed visual plans -- a plausible sequence of observations that transition a dynamical system from its current configuration to a desired goal state, which can later be used as a reference trajectory for control. We focus on systems with high-dimensional observations, such as images, and propose an approach that naturally combines representation learning and planning. Our framework learns a generative model of sequential observations, where the generative process is induced by a transition in a low-dimensional planning model, and an additional noise. By maximizing the mutual information between the generated observations and the transition in the planning model, we obtain a low-dimensional representation that best explains the causal nature of the data. We structure the planning model to be compatible with efficient planning algorithms, and we propose several such models based on either discrete or continuous states. Finally, to generate a visual plan, we project the current and goal observations onto their respective states in the planning model, plan a trajectory, and then use the generative model to transform the trajectory to a sequence of observations. We demonstrate our method on imagining plausible visual plans of rope manipulation.
Monte-Carlo Tree Search for Constrained POMDPs
Lee, Jongmin, Kim, Geon-hyeong, Poupart, Pascal, Kim, Kee-Eung
Monte-Carlo Tree Search (MCTS) has been successfully applied to very large POMDPs, a standard model for stochastic sequential decision-making problems. However, many real-world problems inherently have multiple goals, where multi-objective formulations are more natural. The constrained POMDP (CPOMDP) is such a model that maximizes the reward while constraining the cost, extending the standard POMDP model. To date, solution methods for CPOMDPs assume an explicit model of the environment, and thus are hardly applicable to large-scale real-world problems. In this paper, we present CC-POMCP (Cost-Constrained POMCP), an online MCTS algorithm for large CPOMDPs that leverages the optimization of LP-induced parameters and only requires a black-box simulator of the environment. In the experiments, we demonstrate that CC-POMCP converges to the optimal stochastic action selection in CPOMDP and pushes the state-of-the-art by being able to scale to very large problems.
Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies
Sohn, Sungryull, Oh, Junhyuk, Lee, Honglak
We introduce a new RL problem where the agent is required to generalize to a previously-unseen environment characterized by a subtask graph which describes a set of subtasks and their dependencies. Unlike existing hierarchical multitask RL approaches that explicitly describe what the agent should do at a high level, our problem only describes properties of subtasks and relationships among them, which requires the agent to perform complex reasoning to find the optimal subtask to execute. To solve this problem, we propose a neural subtask graph solver (NSGS) which encodes the subtask graph using a recursive neural network embedding. To overcome the difficulty of training, we propose a novel non-parametric gradient-based policy, graph reward propagation, to pre-train our NSGS agent and further fine-tune it through an actor-critic method. The experimental results on two 2D visual domains show that our agent can perform complex reasoning to find a near-optimal way of executing the subtask graph and generalize well to unseen subtask graphs. In addition, we compare our agent with a Monte-Carlo tree search (MCTS) method, showing that our method is much more efficient than MCTS and that the performance of NSGS can be further improved by combining it with MCTS.
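To make the notion of a subtask graph concrete, here is a minimal sketch under stated assumptions. The abstract does not specify the graph format, so the dictionary layout, the subtask names, the integer rewards, and the helpers `eligible` and `greedy_order` are all hypothetical. The greedy rule is only a naive baseline for illustration; it is not NSGS or graph reward propagation.

```python
# Hypothetical subtask graph: each subtask has a reward and a set of
# prerequisite subtasks that must be finished before it can be executed.
GRAPH = {
    "get wood":     {"reward": 1,  "requires": set()},
    "get stone":    {"reward": 2,  "requires": set()},
    "make pickaxe": {"reward": 10, "requires": {"get wood", "get stone"}},
    "mine ore":     {"reward": 20, "requires": {"make pickaxe"}},
}


def eligible(graph, done):
    """Subtasks not yet done whose prerequisites are all done (sorted by name)."""
    return sorted(
        name
        for name, spec in graph.items()
        if name not in done and spec["requires"] <= done
    )


def greedy_order(graph):
    """Naive baseline: always execute the eligible subtask with the highest reward."""
    done, order, total = set(), [], 0
    while True:
        options = eligible(graph, done)
        if not options:
            break
        pick = max(options, key=lambda name: graph[name]["reward"])
        done.add(pick)
        order.append(pick)
        total += graph[pick]["reward"]
    return order, total


order, total = greedy_order(GRAPH)
print(order, total)
```

A greedy rule like this ignores how finishing a low-reward subtask can unlock a high-reward one under a limited budget, which is the kind of reasoning the abstract says the agent must perform.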
StarAlgo: A Squad Movement Planning Library for StarCraft using Monte Carlo Tree Search and Negamax
Viazovskyi, Mykyta, Certicky, Michal
Real-Time Strategy (RTS) games have recently become a popular testbed for artificial intelligence research. They represent a complex adversarial domain providing a number of interesting AI challenges. There exists a wide variety of research-supporting software tools, libraries and frameworks for one RTS game in particular -- StarCraft: Brood War. These tools are designed to address various specific sub-problems, such as resource allocation or opponent modelling, so that researchers can focus exclusively on the tasks relevant to them. We present one such tool -- a library called StarAlgo that produces plans for the coordinated movement of squads (groups of combat units) within the game world. The StarAlgo library can solve the squad movement planning problem using one of two algorithms: Monte Carlo Tree Search Considering Durations (MCTSCD) and a slightly modified version of Negamax. We evaluate both algorithms, compare them, and demonstrate their usage. The library is implemented as a static C++ library that can be easily plugged into most StarCraft AI bots.
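For readers unfamiliar with Negamax, the following is a minimal sketch of the textbook algorithm on a toy take-away game (players alternately remove 1 to 3 stones; whoever takes the last stone wins). It is not StarAlgo's modified variant and does not use the library's API; the game and the function names `negamax` and `best_move` are assumptions chosen for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def negamax(stones: int) -> int:
    """Value of the position for the player to move: +1 is a win, -1 a loss."""
    if stones == 0:
        # The opponent took the last stone, so the player to move has lost.
        return -1
    best = -1
    for take in (1, 2, 3):
        if take <= stones:
            # Zero-sum game: our value is the negation of the opponent's value.
            best = max(best, -negamax(stones - take))
    return best


def best_move(stones: int) -> int:
    """Number of stones to take that maximizes the mover's negamax value."""
    moves = [take for take in (1, 2, 3) if take <= stones]
    return max(moves, key=lambda take: -negamax(stones - take))


if __name__ == "__main__":
    for pile in (4, 5, 7):
        print(pile, negamax(pile), best_move(pile))
```

The single recursive function covers both players because each level negates the value returned by the level below, which is what distinguishes Negamax from a two-function minimax.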
Tech Giant AI Researchers Boycott Nature 'Machine Intelligence' Journal
Renowned artificial intelligence (AI) experts from almost all of the tech giants are planning to boycott a new journal from Nature Publishing Group, which is widely regarded as one of the most influential science publishers in the world. Nature's new Machine Intelligence Journal is due to be published for the first time in January 2019. Nature said it will cover the "best research from across the field of artificial intelligence" but it will also be a closed access journal, and this has angered many in the AI community who want to see AI research openly available to everyone. Over 2,000 people -- including more than 75 from Google, 25 from Microsoft, 23 from DeepMind, 16 from Facebook, and 11 from Amazon -- have pledged to "not submit to, review, or edit for this new journal". They each signed a statement from Oregon State University's Professor Thomas Dietterich that was published on Monday.
Toward Cognitive and Immersive Systems: Experiments in a Cognitive Microworld
Peveler, Matthew, Govindarajulu, Naveen Sundar, Bringsjord, Selmer, Srivastava, Biplav, Talamadupula, Kartik, Su, Hui
As computational power has continued to increase, and sensors have become more accurate, the corresponding advent of systems that are at once cognitive and immersive has arrived. These cognitive and immersive systems (CAISs) fall squarely into the intersection of AI with HCI/HRI: such systems interact with and assist the human agents that enter them, in no small part because such systems are infused with AI able to understand and reason about these humans and their knowledge, beliefs, goals, communications, plans, etc. We herein explain our approach to engineering CAISs. We emphasize the capacity of a CAIS to develop and reason over a 'theory of the mind' of its human partners. This capacity entails that the AI in question has a sophisticated model of the beliefs, knowledge, goals, desires, emotions, etc. of these humans. To accomplish this engineering, a formal framework of very high expressivity is needed. In our case, this framework is a cognitive event calculus, a particular kind of quantified multi-operator modal logic, and a matching high-expressivity automated reasoner and planner. To explain, advance, and to a degree validate our approach, we show that a calculus of this type satisfies a set of formal requirements, and can enable a CAIS to understand a psychologically tricky scenario couched in what we call the cognitive polysolid framework (CPF). We also formally show that a room that satisfies these requirements can have a useful property we term expectation of usefulness. CPF, a sub-class of cognitive microworlds, includes machinery able to represent and plan over not merely blocks and actions (such as seen in the primitive 'blocks worlds' of old), but also over agents and their mental attitudes about both other agents and inanimate objects.
Lifelong Path Planning with Kinematic Constraints for Multi-Agent Pickup and Delivery
Ma, Hang, Hönig, Wolfgang, Kumar, T. K. Satish, Ayanian, Nora, Koenig, Sven
The Multi-Agent Pickup and Delivery (MAPD) problem models applications where a large number of agents attend to a stream of incoming pickup-and-delivery tasks. Token Passing (TP) is a recent MAPD algorithm that is efficient and effective. We make TP even more efficient and effective by using a novel combinatorial search algorithm, called Safe Interval Path Planning with Reservation Table (SIPPwRT), for single-agent path planning. SIPPwRT uses an advanced data structure that allows for fast updates and lookups of the current paths of all agents in an online setting. The resulting MAPD algorithm TP-SIPPwRT takes kinematic constraints of real robots into account directly during planning, computes continuous agent movements with given velocities that work on non-holonomic robots rather than discrete agent movements with uniform velocity, and is complete for well-formed MAPD instances. We demonstrate its benefits for automated warehouses using both an agent simulator and a standard robot simulator. For example, we demonstrate that it can compute paths for hundreds of agents and thousands of tasks in seconds and is more efficient and effective than existing MAPD algorithms that use a post-processing step to adapt their paths to continuous agent movements with given velocities.
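The idea of planning over safe intervals can be illustrated with a small sketch. It assumes a reservation table that stores, per grid cell, a list of half-open time intervals [start, end) during which some other agent occupies the cell. This layout and the helper names `safe_intervals` and `earliest_entry` are hypothetical and far simpler than the data structure the abstract refers to; the sketch shows only the interval bookkeeping, not SIPPwRT itself.

```python
import math


def safe_intervals(reserved, horizon=math.inf):
    """Return the maximal free intervals [start, end) of one cell.

    `reserved` is a list of half-open (start, end) intervals during which
    the cell is occupied; overlapping reservations are merged on the fly.
    """
    free = []
    t = 0.0  # earliest time not yet known to be reserved
    for start, end in sorted(reserved):
        if start > t:
            free.append((t, start))
        t = max(t, end)
    if t < horizon:
        free.append((t, horizon))
    return free


def earliest_entry(free, ready, dwell):
    """Earliest time >= `ready` at which an agent can occupy the cell for `dwell`."""
    for start, end in free:
        enter = max(start, ready)
        if enter + dwell <= end:
            return enter
    return None  # no safe interval is long enough


if __name__ == "__main__":
    cell_free = safe_intervals([(2, 4), (7, 8)])
    print(cell_free)
    print(earliest_entry(cell_free, ready=1, dwell=2))
```

Searching over a handful of safe intervals per cell instead of every discrete timestep keeps the search space small even when movement durations are continuous.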
A Recap of the AAAI and IAAI 2018 Conferences and the EAAI Symposium
McIlraith, Sheila (University of Toronto) | Weinberger, Kilian (Cornell University) | Youngblood, G. Michael (PARC) | Myers, Karen (SRI International) | Eaton, Eric (University of Pennsylvania) | Wollowski, Michael (Rose-Hulman Institute of Technology)
The 2018 AAAI Conference on Artificial Intelligence, the 2018 Innovative Applications of Artificial Intelligence, and the 2018 Symposium on Educational Advances in Artificial Intelligence were held February 2–7, 2018 at the Hilton New Orleans Riverside, New Orleans, Louisiana, USA. This report, based on the prefaces contained in the AAAI-18 proceedings and program, summarizes the events of the conference.
Reports of the Workshops of the 32nd AAAI Conference on Artificial Intelligence
Bouchard, Bruno (Université du Québec à Chicoutimi) | Bouchard, Kevin (Université du Québec à Chicoutimi) | Brown, Noam (Carnegie Mellon University) | Chhaya, Niyati (Adobe Research, Bangalore) | Farchi, Eitan (IBM Research, Haifa) | Gaboury, Sebastien (Université du Québec à Chicoutimi) | Geib, Christopher (Smart Information Flow Technologies) | Gyrard, Amelie (Wright State University) | Jaidka, Kokil (University of Pennsylvania) | Keren, Sarah (Technion – Israel Institute of Technology) | Khardon, Roni (Tufts University) | Kordjamshidi, Parisa (Tulane University) | Martinez, David (MIT Lincoln Laboratory) | Mattei, Nicholas (IBM Research, TJ Watson) | Michalowski, Martin (University of Minnesota School of Nursing) | Mirsky, Reuth (Ben Gurion University) | Osborn, Joseph (Pomona College) | Sahin, Cem (MIT Lincoln Laboratory) | Shehory, Onn (Bar Ilan University) | Shaban-Nejad, Arash (University of Tennessee Health Science Center) | Sheth, Amit (Wright State University) | Shimshoni, Ilan (University of Haifa) | Shrobe, Howie (Massachusetts Institute of Technology) | Sinha, Arunesh (University of Southern California) | Sinha, Atanu R. (Adobe Research, Bangalore) | Srivastava, Biplav (IBM Research, Yorktown Heights) | Streilein, William (MIT Lincoln Laboratory) | Theocharous, Georgios (Adobe Research, San Jose) | Venable, K. Brent (Tulane University and IHMC) | Wagner, Neal (MIT Lincoln Laboratory) | Zamansky, Anna (University of Haifa)
The AAAI-18 workshop program included 15 workshops covering a wide range of topics in AI. Workshops were held Sunday and Monday, February 2–7, 2018, at the Hilton New Orleans Riverside in New Orleans, Louisiana, USA. This report contains summaries of the Affective Content Analysis workshop; the Artificial Intelligence Applied to Assistive Technologies and Smart Environments; the AI and Marketing Science workshop; the Artificial Intelligence for Cyber Security workshop; the AI for Imperfect-Information Games; the Declarative Learning Based Programming workshop; the Engineering Dependable and Secure Machine Learning Systems workshop; the Health Intelligence workshop; the Knowledge Extraction from Games workshop; the Plan, Activity, and Intent Recognition workshop; the Planning and Inference workshop; the Preference Handling workshop; the Reasoning and Learning for Human-Machine Dialogues workshop; and the AI Enhanced Internet of Things Data Processing for Intelligent Applications workshop.
TuSeRACT: Turn-Sample-Based Real-Time Traffic Signal Control
Dhamija, Srishti, Varakantham, Pradeep
Real-time traffic signal control systems can effectively reduce urban traffic congestion but can also become significant contributors to congestion if poorly timed. Real-time traffic signal control is typically challenging owing to constantly changing traffic demand patterns, very limited planning time and various sources of uncertainty in the real world (due to vehicle detection or unobserved vehicle turn movements, for instance). SURTRAC (Scalable URban TRAffic Control) is a recently developed traffic signal control approach which computes delay-minimising and coordinated (across neighbouring traffic lights) schedules of oncoming vehicle clusters in real time. To ensure real-time responsiveness in the presence of turn-induced uncertainty, SURTRAC computes schedules which minimize the delay for the expected turn movements as opposed to minimizing the expected delay under turn-induced uncertainty. Furthermore, expected outgoing traffic clusters are communicated to downstream intersections. These approximations ensure real-time tractability, but degrade solution quality in the presence of turn-induced uncertainty. To address this limitation, we introduce TuSeRACT (Turn-Sample-based Real-time trAffic signal ConTrol), a distributed sample-based scheduling approach to traffic signal control. Unlike SURTRAC, TuSeRACT computes schedules that minimize expected delay over sampled turn movements of observed traffic, and communicates samples of traffic outflows to neighbouring intersections. We formulate this sample-based scheduling problem as a constraint program, and empirically evaluate our approach on synthetic traffic networks. We demonstrate that our approach results in substantially lower average vehicle waiting times as compared to SURTRAC when turn-induced uncertainty is present.
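The distinction between minimizing delay for the expected turn movements and minimizing expected delay over sampled turn movements can be seen in a toy calculation. Everything in it is an assumption made for illustration: a single intersection, ten vehicles, a ten-second cycle split between a left-turn phase and a through phase, one vehicle served per green second, and each unserved vehicle waiting one extra cycle. It is not the constraint program from the abstract, nor the delay model of SURTRAC or TuSeRACT.

```python
def delay(green_left, left_turners, total=10, cycle=10):
    """Toy delay: every vehicle not served in its phase waits one extra cycle."""
    through = total - left_turners
    green_through = cycle - green_left
    unserved = max(0, left_turners - green_left) + max(0, through - green_through)
    return cycle * unserved


def expected_delay(green_left, samples):
    """Average toy delay of one schedule over sampled left-turn counts."""
    return sum(delay(green_left, s) for s in samples) / len(samples)


# Hypothetical samples of how many of the 10 vehicles turn left.
samples = [1, 1, 2, 2, 9]
candidates = range(0, 11)  # possible left-turn green times in seconds

# Plan for the expected turn movement (mean of the samples).
mean_turners = sum(samples) / len(samples)
plan_for_mean = min(candidates, key=lambda g: delay(g, mean_turners))

# Plan that minimizes the expected delay over the samples.
plan_for_samples = min(candidates, key=lambda g: expected_delay(g, samples))

print(plan_for_mean, expected_delay(plan_for_mean, samples))
print(plan_for_samples, expected_delay(plan_for_samples, samples))
```

Because the delay is not linear in the number of turning vehicles, the schedule that is best for the average case is not the schedule with the lowest average delay, which is the gap the sample-based formulation targets.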