Planning & Scheduling
Artificial Intelligence โ the Future of Automation in Workforce Management
Artificial Intelligence (AI) has already started to influence processes and automate decision making in manufacturing, health care, finance and customer service industries. By many measures, HR appears to be next on that list. While the technology is still nascent, the building blocks exist to suggest that machine learning could ease the burden of complex analysis, surface insights, and trigger actions on behalf of managers in the workplace. One way is by comparing real-time data to historical data or benchmarks to identify statistically significant deviations from the norm. For example, if scheduled labor hours as a percent of sales is significantly higher than the norm in a group of stores, the system can detect this and send that information proactively to management. Another way is to take a series of interrelated decisions and use algorithms to model scenarios.
Make Peace, Not War The AI of Total War (Part 4)
Welcome to part 4 of my series on the AI of Total War. A game that completely re-built the campaign AI systems to accommodate for an increasingly more complex series of mechanics, resources and consequences. Rome II's adoption of the Monte Carlo Tree Search (MCTS) algorithm is a critical step in bring the campaign AI up to spec for more contemporary entries in the series. In this entry I'm going to look at how the MCTS systems were improved upon, as well as how the diplomacy systems have been scaled up for the modern era as Rome gave way to 2015's Total War: Attila. Attila is the ninth entry in the Total War franchise and transposes the conflict to the late 4th and early 5th century: an phase of history known as the Migration Period.
ScottyActivity: Mixed Discrete-Continuous Planning with Convex Optimization
Fernandez-Gonzalez, Enrique, Williams, Brian, Karpas, Erez
The state of the art practice in robotics planning is to script behaviors manually, where each behavior is typically generated using trajectory optimization. However, in order for robots to be able to act robustly and adapt to novel situations, they need to plan these activity sequences autonomously. Since the conditions and effects of these behaviors are tightly coupled through time, state and control variables, many problems require that the tasks of activity planning and trajectory optimization are considered together. There are two key issues underlying effective hybrid activity and trajectory planning: the sufficiently accurate modeling of robot dynamics and the capability of planning over long horizons. Hybrid activity and trajectory planners that employ mixed integer programming within a discrete time formulation are able to accurately model complex dynamics for robot vehicles, but are often restricted to relatively short horizons. On the other hand, current hybrid activity planners that employ continuous time formulations can handle longer horizons but they only allow actions to have continuous effects with constant rate of change, and restrict the allowed state constraints to linear inequalities. This is insufficient for many robotic applications and it greatly limits the expressivity of the problems that these approaches can solve. In this work we present the ScottyActivity planner, that is able to generate practical hybrid activity and motion plans over long horizons by employing recent methods in convex optimization combined with methods for planning with relaxed plan graphs and heuristic forward search. Unlike other continuous time planners, ScottyActivity can solve a broad class of robotic planning problems by supporting convex quadratic constraints on state variables and control variables that are jointly constrained and that affect multiple state variables simultaneously. In order to support planning over long horizons, ScottyActivity does not resort to time, state or control variable discretization. While straightforward formulations of consistency checks are not convex and do not scale, we present an efficient convex formulation, in the form of a Second Order Cone Program (SOCP), that is very fast to solve. We also introduce several new realistic domains that demonstrate the capabilities and scalability of our approach, and their simplified linear versions, that we use to compare with other state of the art planners. This work demonstrates the power of integrating advanced convex optimization techniques with discrete search methods and paves the way for extensions dealing with non-convex disjoint constraints, such as obstacle avoidance.
Learning to guide task and motion planning using score-space representation
Kim, Beomjoon, Wang, Zi, Kaelbling, Leslie Pack, Lozano-Perez, Tomas
In this paper, we propose a learning algorithm that speeds up the search in task and motion planning problems. Our algorithm proposes solutions to three different challenges that arise in learning to improve planning efficiency: what to predict, how to represent a planning problem instance, and how to transfer knowledge from one problem instance to another. We propose a method that predicts constraints on the search space based on a generic representation of a planning problem instance, called score-space, where we represent a problem instance in terms of the performance of a set of solutions attempted so far. Using this representation, we transfer knowledge, in the form of constraints, from previous problems based on the similarity in score space. We design a sequential algorithm that efficiently predicts these constraints, and evaluate it in three different challenging task and motion planning problems. Results indicate that our approach performs orders of magnitudes faster than an unguided planner
Decentralized Cooperative Planning for Automated Vehicles with Hierarchical Monte Carlo Tree Search
Kurzer, Karl, Zhou, Chenyang, Zรถllner, J. Marius
Today's automated vehicles lack the ability to cooperate implicitly with others. This work presents a Monte Carlo Tree Search (MCTS) based approach for decentralized cooperative planning using macro-actions for automated vehicles in heterogeneous environments. Based on cooperative modeling of other agents and Decoupled-UCT (a variant of MCTS), the algorithm evaluates the state-action-values of each agent in a cooperative and decentralized manner, explicitly modeling the interdependence of actions between traffic participants. Macro-actions allow for temporal extension over multiple time steps and increase the effective search depth requiring fewer iterations to plan over longer horizons. Without predefined policies for macro-actions, the algorithm simultaneously learns policies over and within macro-actions. The proposed method is evaluated under several conflict scenarios, showing that the algorithm can achieve effective cooperative planning with learned macro-actions in heterogeneous environments.
Counterexample-Guided Cartesian Abstraction Refinement for Classical Planning
Seipp, Jendrik, Helmert, Malte
Counterexample-guided abstraction refinement (CEGAR) is a method for incrementally computing abstractions of transition systems. We propose a CEGAR algorithm for computing abstraction heuristics for optimal classical planning. Starting from a coarse abstraction of the planning task, we iteratively compute an optimal abstract solution, check if and why it fails for the concrete planning task and refine the abstraction so that the same failure cannot occur in future iterations. A key ingredient of our approach is a novel class of abstractions for classical planning tasks that admits efficient and very fine-grained refinement. Since a single abstraction usually cannot capture enough details of the planning task, we also introduce two methods for producing diverse sets of heuristics within this framework, one based on goal atoms, the other based on landmarks. In order to sum their heuristic estimates admissibly we introduce a new cost partitioning algorithm called saturated cost partitioning. We show that the resulting heuristics outperform other state-of-the-art abstraction heuristics in many benchmark domains.
Learning Plannable Representations with Causal InfoGAN
Kurutach, Thanard, Tamar, Aviv, Yang, Ge, Russell, Stuart, Abbeel, Pieter
Pieter Abbeel UC Berkeley In recent years, deep generative models have been shown to'imagine' convincing high-dimensional observations such as images, audio, and even video, learning directly from raw data. In this work, we ask how to imagine goal-directed visual plans - a plausible sequence of observations that transition a dynamical system from its current configuration to a desired goal state, which can later be used as a reference trajectory for control. We focus on systems with high-dimensional observations, such as images, and propose an approach that naturally combines representation learning and planning. Our framework learns a generative model of sequential observations, where the generative process is induced by a transition in a low-dimensional planning model, and an additional noise. By maximizing the mutual information between the generated observations and the transition in the planning model, we obtain a low-dimensional representation that best explains the causal nature of the data. We structure the planning model to be compatible with efficient planning algorithms, and we propose several such models based on either discrete or continuous states. Finally, to generate a visual plan, we project the current and goal observations onto their respective states in the planning model, plan a trajectory, and then use the generative model to transform the trajectory to a sequence of observations. We demonstrate our method on imagining plausible visual plans of rope manipulation.
Potentially Guided Bidirectionalized RRT* for Fast Optimal Path Planning in Cluttered Environments
Tahir, Zaid, Qureshi, Ahmed H., Ayaz, Yasar, Nawaz, Raheel
Rapidly-exploring Random Tree star (RRT*) has recently gained immense popularity in the motion planning community as it provides a probabilistically complete and asymptotically optimal solution without requiring the complete information of the obstacle space. In spite of all of its advantages, RRT* converges to an optimal solution very slowly. Hence to improve the convergence rate, its bidirectional variants were introduced, the Bi-directional RRT* (B-RRT*) and Intelligent Bi-directional RRT* (IB-RRT*). However, as both variants perform pure exploration, they tend to suffer in highly cluttered environments. In order to overcome these limitations, we introduce a new concept of potentially guided bidirectional trees in our proposed Potentially Guided Intelligent Bi-directional RRT* (PIB-RRT*) and Potentially Guided Bi-directional RRT* (PB-RRT*). The proposed algorithms greatly improve the convergence rate and have a more efficient memory utilization. Theoretical and experimental evaluation of the proposed algorithms have been made and compared to the latest state of the art motion planning algorithms under different challenging environmental conditions and have proven their remarkable improvement in efficiency and convergence rate.
Baidu Apollo EM Motion Planner
Fan, Haoyang, Zhu, Fan, Liu, Changchun, Zhang, Liangliang, Zhuang, Li, Li, Dong, Zhu, Weicheng, Hu, Jiangtao, Li, Hongye, Kong, Qi
In this manuscript, we introduce a real-time motion planning system based on the Baidu Apollo (open source) autonomous driving platform. The developed system aims to address the industrial level-4 motion planning problem while considering safety, comfort and scalability. The system covers multilane and single-lane autonomous driving in a hierarchical manner: (1) The top layer of the system is a multilane strategy that handles lane-change scenarios by comparing lane-level trajectories computed in parallel. (2) Inside the lane-level trajectory generator, it iteratively solves path and speed optimization based on a Frenet frame. (3) For path and speed optimization, a combination of dynamic programming and spline-based quadratic programming is proposed to construct a scalable and easy-to-tune framework to handle traffic rules, obstacle decisions and smoothness simultaneously. The planner is scalable to both highway and lower-speed city driving scenarios. We also demonstrate the algorithm through scenario illustrations and on-road test results. The system described in this manuscript has been deployed to dozens of Baidu Apollo autonomous driving vehicles since Apollo v1.5 was announced in September 2017. As of May 16th, 2018, the system has been tested under 3,380 hours and approximately 68,000 kilometers (42,253 miles) of closed-loop autonomous driving under various urban scenarios. The algorithm described in this manuscript is available at https://github.com/ApolloAuto/apollo/tree/master/modules/planning.
Multitask Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies
Sohn, Sungryull, Oh, Junhyuk, Lee, Honglak
We introduce a new RL problem where the agent is required to execute a given subtask graph which describes a set of subtasks and their dependency. Unlike existing multitask RL approaches that explicitly describe what the agent should do, a subtask graph in our problem only describes properties of subtasks and relationships among them, which requires the agent to perform complex reasoning to find the optimal sequence of subtasks to execute. To tackle this problem, we propose a neural subtask graph solver (NSS) which encodes the subtask graph using a recursive neural network. To overcome the difficulty of training, we propose a novel non-parametric gradient-based policy to pre-train our NSS agent. % and further finetune it through actor-critic method. The experimental results on two 2D visual domains show that our agent can perform complex reasoning to find a near-optimal way of executing the subtask graph and generalize well to the unseen subtask graphs. In addition, we compare our agent with a Monte-Carlo tree search (MCTS) method showing that (1) our method is much more efficient than MCTS and (2) combining MCTS with NSS dramatically improves the search performance.