Planning & Scheduling
Why Isn't 'Arrow' On Tonight? CW Superhero Crossover Event Causes Schedule Changes
If you were looking forward to a new episode of "Arrow" tonight, then this news might be a bit disappointing. The hit CW series will not be airing tonight because the network switched around its broadcast schedule this week, and its latest new episode aired this past Monday night instead. The reason for this change was to accommodate the four-show crossover event, "Crisis on Earth-X," with "Supergirl," "Arrow," "The Flash" and "DC's Legends of Tomorrow." With the "Arrow" part of the crossover airing on Monday night after "Supergirl," it allowed for the event to take place over two nights instead of three, and also prevented the flow from breaking up due to the usual Wednesday block of superhero-less programming. The scheduling for "Arrow" will go back to normal next week, just in time for its Season 6 midseason finale.
Time and Space Bounds for Planning
Bäckström, Christer, Jonsson, Peter
There is an extensive literature on the complexity of planning, but explicit bounds on time and space complexity are very rare. On the other hand, problems like the constraint satisfaction problem (CSP) have been thoroughly analysed in this respect. We provide a number of upper- and lower-bound results (the latter based on various complexity-theoretic assumptions such as the Exponential Time Hypothesis) for both satisficing and optimal planning. We show that many classes of planning instances exhibit a dichotomy: either they can be solved in polynomial time or they cannot be solved in subexponential time. In many cases, we can even prove closely matching upper and lower bounds. Our results also indicate, analogously to CSPs, the existence of sharp phase transitions. We finally study and discuss the trade-off between time and space. In particular, we show that depth-first search may sometimes be a viable option for planning under severe space constraints.
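The space argument for depth-first search can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a STRIPS-like encoding with sets of propositions and uses iterative deepening so that memory grows only with the depth bound. The names `Action`, `depth_limited` and `iddfs_plan` are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Optional, Sequence


class Action(NamedTuple):
    """A STRIPS-like action (hypothetical encoding for this sketch)."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def depth_limited(state: FrozenSet[str], goal: FrozenSet[str],
                  actions: Sequence[Action], limit: int,
                  path: List[str]) -> Optional[List[str]]:
    """Depth-first search that stores only the current path."""
    if goal <= state:
        return list(path)
    if limit == 0:
        return None
    for a in actions:
        if a.pre <= state:
            nxt = (state - a.delete) | a.add
            path.append(a.name)
            found = depth_limited(nxt, goal, actions, limit - 1, path)
            if found is not None:
                return found
            path.pop()
    return None


def iddfs_plan(init: FrozenSet[str], goal: FrozenSet[str],
               actions: Sequence[Action],
               max_depth: int) -> Optional[List[str]]:
    """Try depth bounds 0, 1, ..., max_depth; the first plan found is a shortest one."""
    for limit in range(max_depth + 1):
        plan = depth_limited(init, goal, actions, limit, [])
        if plan is not None:
            return plan
    return None


# Toy instance (made up for illustration): move an object from table to shelf.
pick = Action("pick", frozenset({"hand_empty", "on_table"}),
              frozenset({"holding"}), frozenset({"hand_empty", "on_table"}))
place = Action("place", frozenset({"holding"}),
               frozenset({"on_shelf", "hand_empty"}), frozenset({"holding"}))
toy_plan = iddfs_plan(frozenset({"hand_empty", "on_table"}),
                      frozenset({"on_shelf"}), [pick, place], max_depth=5)
```

The search keeps no closed list, so it may revisit states many times; that is the time cost paid for the small memory footprint.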
The wealthy get the biggest benefit from House Republican tax plan, analysis finds
The greatest benefit from the House Republican tax bill would go to upper-income households, according to an analysis released Monday by the nonpartisan Tax Policy Center. Middle-income taxpayers -- those earning between $48,600 and $86,100 annually -- would receive an average tax cut of $700 next year, or about 1% of their after-tax income, the analysis said. The top 20% of the nation's earners -- those making more than $149,400 a year -- would receive an average tax cut of $4,850, or about 1.4% of after-tax income. Those top earners would also receive 60% of the total tax benefits under the plan. Of that, the top 1% of earners, defined as those making more than $730,000 a year, receive about 22% of the total amount of tax cuts in 2018, the Tax Policy Center said.
Monte-Carlo Tree Search by Best Arm Identification
Kaufmann, Emilie, Koolen, Wouter
We consider two-player zero-sum turn-based interactions, in which the sequence of possible successive moves is represented by a maximin game tree T. This tree models the possible action sequences by a collection of MAX nodes, which correspond to states in the game in which player A should take action, MIN nodes, for states in the game in which player B should take action, and leaves, which specify the payoff for player A. The goal is to determine the best action at the root for player A. For deterministic payoffs this search problem is primarily algorithmic, with several powerful pruning strategies available [20]. We look at problems with stochastic payoffs, which in addition present a major statistical challenge. Sequential identification questions in game trees with stochastic payoffs arise naturally as robust versions of bandit problems. They are also a core component of Monte Carlo tree search (MCTS) approaches for solving intractably large deterministic tree search problems, where an entire sub-tree is represented by a stochastic leaf in which randomized play-outs and/or evaluations are performed [4]. A play-out consists of finishing the game with some simple, typically random, policy and observing the outcome for player A. For example, MCTS is used within the AlphaGo system [21], and the evaluation of a leaf position combines supervised learning and (smart) play-outs. While MCTS algorithms for Go have now reached expert human level, such algorithms remain very costly, in that many (expensive) leaf evaluations or play-outs are necessary to output the next action to be taken by the player. In this paper, we focus on the sample complexity of Monte-Carlo Tree Search methods, about which very little is known. For this purpose, we work under a simplified model for MCTS already studied by [22], which generalizes the depth-two framework of [10].
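The maximin tree with stochastic leaves can be made concrete with a naive baseline. This is not the best-arm-identification procedure of the paper: it simply draws the same number of samples at every leaf, averages them, and propagates the estimates with max and min. The `Node` class and the helpers `bernoulli`, `estimate` and `best_root_action` are hypothetical names for this sketch.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    kind: str  # "MAX", "MIN" or "LEAF"
    children: List["Node"] = field(default_factory=list)
    # For leaves only: draws one random payoff for player A.
    sample: Optional[Callable[[random.Random], float]] = None


def bernoulli(p: float) -> Callable[[random.Random], float]:
    """Leaf payoff that is 1 with probability p and 0 otherwise."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


def estimate(node: Node, n: int, rng: random.Random) -> float:
    """Estimate the maximin value using n samples per leaf (uniform allocation)."""
    if node.kind == "LEAF":
        return sum(node.sample(rng) for _ in range(n)) / n
    values = [estimate(child, n, rng) for child in node.children]
    return max(values) if node.kind == "MAX" else min(values)


def best_root_action(root: Node, n: int, rng: random.Random) -> int:
    """Index of the root child with the highest estimated value."""
    values = [estimate(child, n, rng) for child in root.children]
    return max(range(len(values)), key=values.__getitem__)


def leaf(p: float) -> Node:
    return Node("LEAF", sample=bernoulli(p))


# Depth-two toy tree: action 0 has value min(0.9, 0.2), action 1 has min(0.5, 0.6).
toy_tree = Node("MAX", [
    Node("MIN", [leaf(0.9), leaf(0.2)]),
    Node("MIN", [leaf(0.5), leaf(0.6)]),
])
chosen = best_root_action(toy_tree, n=2000, rng=random.Random(0))
```

Uniform allocation wastes samples on leaves that cannot affect the decision at the root, which is the kind of cost a sample-complexity analysis aims to reduce.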
RADAR - A Proactive Decision Support System for Human-in-the-Loop Planning
Sengupta, Sailik (Arizona State University) | Chakraborti, Tathagata (Arizona State University) | Sreedharan, Sarath (Arizona State University) | Vadlamudi, Satya Gautam (Arizona State University) | Kambhampati, Subbarao (Arizona State University)
Proactive Decision Support (PDS) aims at improving the decision making experience of human decision makers by enhancing both the quality of the decisions and the ease of making them. In this paper, we ask what role automated decision-making technologies can play in the deliberative process of the human decision maker. Specifically, we focus on expert humans in the loop who now share a detailed, if not complete, model of the domain with the assistant, but may still be unable to compute plans due to cognitive overload. To this end, we propose a PDS framework, RADAR, based on research in the automated planning community, that aids the human decision maker in constructing plans. We will situate our discussion on principles of interface design laid out in the literature on the degrees of automation and its effect on the collaborative decision-making process. Also, at the heart of our design is the principle of naturalistic decision making, which has been shown to be a necessary requirement of such systems, thus focusing more on providing suggestions rather than enforcing decisions and executing actions. We will demonstrate the different properties of such a system through examples in a fire-fighting domain, where human commanders are involved in building response strategies to mitigate a fire outbreak. The paper is written to serve both as a position paper, by motivating requirements of an effective proactive decision support system, and as an emerging application of these ideas in the context of the role of an automated planner in human decision making, in a platform that can prove to be a valuable test bed for research on the same.
Towards Intelligent Decision Support in Human Team Planning
Kim, Joseph (Massachusetts Institute of Technology) | Shah, Julie A. (Massachusetts Institute of Technology)
Inherent human limitations in teaming environments coupled with complex planning problems spur the integration of intelligent decision support (IDS) systems for human-agent planning. However, prior research in human-agent planning has been limited to dyadic interaction between a single human and a single planning agent. In this paper, we highlight an emerging research area of IDS for human team planning, i.e. environments where the agent works with a team of human planners to enhance the quality of their plans and the ease of making them. We review prior works in human-agent planning and identify research challenges for an agent participating in human team planning.
Toward Crowd-Sensitive Path Planning
Aroor, Anoop (City University of New York) | Epstein, Susan L. (Hunter College, City University of New York)
If a robot can predict crowds in parts of its environment that are inaccessible to its sensors, then it can plan to avoid them. This paper proposes a fast, online algorithm that learns average crowd densities in different areas. It also describes how these densities can be incorporated into existing navigation architectures. In simulation across multiple challenging crowd scenarios, the robot reaches its target faster, travels less, and risks fewer collisions than if it were to plan with the traditional A* algorithm.
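One way to picture how learned densities could feed into a planner is the sketch below. It is an assumption-laden illustration, not the authors' architecture: crowd density is kept as a running mean of observed head counts per grid cell, and A* on a 4-connected grid charges `1 + weight * density` for entering a cell. Setting `weight` to 0 recovers plain shortest-path A*. `DensityMap`, `plan_path` and the grid layout are hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


class DensityMap:
    """Online running mean of observed crowd counts per cell."""

    def __init__(self) -> None:
        self.mean: Dict[Cell, float] = {}
        self.count: Dict[Cell, int] = {}

    def observe(self, cell: Cell, people: float) -> None:
        k = self.count.get(cell, 0) + 1
        m = self.mean.get(cell, 0.0)
        self.count[cell] = k
        self.mean[cell] = m + (people - m) / k  # incremental mean update

    def density(self, cell: Cell) -> float:
        return self.mean.get(cell, 0.0)  # unobserved cells count as empty


def plan_path(width: int, height: int, start: Cell, goal: Cell,
              density: DensityMap, weight: float) -> Optional[List[Cell]]:
    """A* where entering a cell costs 1 + weight * learned density."""
    def h(c: Cell) -> int:
        # Manhattan distance stays admissible because every step costs at least 1.
        return abs(c[0] - goal[0]) + abs(c[1] - goal[1])

    frontier = [(h(start), 0.0, start)]
    best = {start: 0.0}
    parent: Dict[Cell, Cell] = {}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return path[::-1]
        if g > best[cell]:
            continue  # stale queue entry
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue
            ng = g + 1.0 + weight * density.density(nxt)
            if ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return None


crowd = DensityMap()
crowd.observe((1, 0), 4)
crowd.observe((1, 0), 6)
direct = plan_path(3, 3, (0, 0), (2, 0), crowd, weight=0.0)
detour = plan_path(3, 3, (0, 0), (2, 0), crowd, weight=1.0)
```

In this toy grid the weighted plan walks around the crowded cell `(1, 0)`, while the unweighted plan goes straight through it.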
Resolving Over-Constrained Temporal Problems with Uncertainty through Conflict-Directed Relaxation
Yu, Peng, Williams, Brian, Fang, Cheng, Cui, Jing, Haslum, Patrik
Over-subscription, that is, being assigned too many things to do, is commonly encountered in temporal scheduling problems. As human beings, we often want to do more than we can actually do, and underestimate how long it takes to perform each task. Decision makers can benefit from aids that identify when these failure situations are likely, the root causes of these failures, and resolutions to these failures. In this paper, we present a decision assistant that helps users resolve over-subscribed temporal problems. The system works like an experienced advisor that can quickly identify the cause of failure underlying temporal problems and compute resolutions. The core of the decision assistant is the Best-first Conflict-Directed Relaxation (BCDR) algorithm, which can detect conflicting sets of constraints within temporal problems and compute continuous relaxations for them that weaken constraints to the minimum extent, instead of removing them completely. BCDR is an extension of the Conflict-Directed A* algorithm, first developed in the model-based reasoning community to compute most likely system diagnoses or reconfigurations. It generalizes discrete conflicts and relaxations to hybrid conflicts and relaxations, which denote minimal inconsistencies and minimal relaxations to both discrete and continuous relaxable constraints. In addition, BCDR is capable of handling temporal uncertainty, expressed as either set-bounded or probabilistic durations, and can compute preferred trade-offs between the risk of violating a schedule requirement and the loss of utility from weakening those requirements. BCDR has been applied to several decision support applications in different domains, including deep-sea exploration, urban travel planning and transit system management. It has demonstrated its effectiveness in helping users resolve over-subscribed scheduling problems and evaluate the robustness of existing solutions.
In our benchmark experiments, BCDR has also demonstrated its efficiency on solving large-scale scheduling problems in the aforementioned domains. Thanks to its conflict-driven approach for computing relaxations, BCDR achieves one to two orders of magnitude improvements on runtime performance when compared to state-of-the-art numerical solvers.
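The notion of a conflict and of a continuous relaxation can be illustrated on the simplest case, a simple temporal network without uncertainty. The sketch below is not BCDR: it assumes constraints of the form `t_v - t_u <= w`, stored as edges `(u, v, w)`, and uses Bellman-Ford to find a negative cycle, which is a set of constraints that cannot all hold. The absolute value of the cycle weight is the total amount by which bounds on that cycle must be loosened. `find_conflict` and `shortfall` are hypothetical names.

```python
from typing import List, Optional, Tuple

Edge = Tuple[int, int, int]  # (u, v, w) encodes the constraint t_v - t_u <= w


def find_conflict(n: int, edges: List[Edge]) -> Optional[List[int]]:
    """Return indices of edges forming a negative cycle, or None if consistent."""
    dist = [0] * n  # as if a virtual source reached every event at distance 0
    pred: List[Optional[int]] = [None] * n  # last edge that improved each event
    last = None
    for _ in range(n):
        last = None
        for i, (u, v, w) in enumerate(edges):
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                pred[v] = i
                last = v
        if last is None:
            return None  # no update in a full pass: the network is consistent
    # An update in the n-th pass implies a negative cycle; step back n times
    # along predecessor edges to be sure we stand on the cycle itself.
    node = last
    for _ in range(n):
        node = edges[pred[node]][0]
    cycle = []
    cur = node
    while True:
        i = pred[cur]
        cycle.append(i)
        cur = edges[i][0]
        if cur == node:
            break
    return cycle[::-1]


def shortfall(edges: List[Edge], cycle: List[int]) -> int:
    """Total loosening needed on the cycle's bounds to remove this conflict."""
    return -sum(edges[i][2] for i in cycle)


# Toy schedule (made up): events 0 = start, 1 = task A done, 2 = task B done.
constraints = [
    (1, 0, -30),  # A takes at least 30 minutes: t1 - t0 >= 30
    (2, 1, -40),  # B takes at least 40 minutes: t2 - t1 >= 40
    (0, 2, 60),   # deadline: everything done within 60 minutes
]
conflict = find_conflict(3, constraints)
needed = shortfall(constraints, conflict)
```

Here the three constraints form one conflict with a shortfall of 10 minutes, which could be resolved by extending the deadline to 70, shortening a task, or splitting the 10 minutes between them. Choosing among such options by preference is the part that a relaxation algorithm has to address.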
AlphaGo Zero: Minimal Policy Improvement, Expectation Propagation and other Connections
This is a post about the new reinforcement learning technique that enables AlphaGo Zero to learn Go from scratch via self-play. The paper has been out for a week, so I guess it's now considered old - sorry for the latency. I'm no expert in RL, so I'm pretty sure many of you are going to come at me with pitchforks shouting "this is all trivial" or "this has been done before" or "this is no different from X". Please do, I'm here to learn. Background: The original AlphaGo used a combination of two neural networks - the policy and value networks - and a Monte Carlo Tree Search (MCTS) algorithm to play Go. For each move, the policy network is first evaluated to give an initial strategy $\pmb{p}$.