Computing Stackelberg Equilibria in Discounted Stochastic Games
Vorobeychik, Yevgeniy (Sandia National Laboratories) | Singh, Satinder (University of Michigan)
Stackelberg games increasingly influence security policies deployed in real-world settings. Much of the work to date focuses on devising a fixed randomized strategy for the defender, accounting for an attacker who optimally responds to it. In practice, defense policies are often subject to constraints and vary over time, allowing an attacker to infer characteristics of future policies based on current observations. A defender must therefore account for an attacker's observation capabilities in devising a security policy. We show that this general modeling framework can be captured using stochastic Stackelberg games (SSGs), where a defender commits to a dynamic policy to which the attacker devises an optimal dynamic response. We then offer the following contributions. 1) We show that Markov stationary policies suffice in SSGs, 2) present a finite-time mixed-integer non-linear program for computing a Stackelberg equilibrium in SSGs, and 3) present a mixed-integer linear program to approximate it. 4) We illustrate our algorithms on a simple SSG representing an adversarial patrolling scenario, where we study the impact of attacker patience and risk aversion on optimal defense policies.
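The commit-then-respond idea behind a Stackelberg equilibrium can be illustrated on a much simpler one-shot game than the stochastic games treated in the paper. The sketch below is not the authors' mixed-integer program: it assumes two targets, hypothetical payoffs, a plain grid search over the defender's coverage probability, and an attacker who breaks ties in the defender's favor.

```python
from fractions import Fraction

# Hypothetical payoffs per target: (payoff if target is covered, payoff if uncovered).
DEFENDER = [(Fraction(0), Fraction(-5)), (Fraction(0), Fraction(-2))]
ATTACKER = [(Fraction(-1), Fraction(5)), (Fraction(-1), Fraction(2))]


def expected(payoffs, target, coverage):
    """Expected payoff when `target` is attacked under the given coverage."""
    covered, uncovered = payoffs[target]
    return coverage[target] * covered + (1 - coverage[target]) * uncovered


def best_response(coverage):
    """Attacker's best target; ties are broken in the defender's favor."""
    top = max(expected(ATTACKER, t, coverage) for t in range(2))
    ties = [t for t in range(2) if expected(ATTACKER, t, coverage) == top]
    return max(ties, key=lambda t: expected(DEFENDER, t, coverage))


def solve(grid=300):
    """Grid search over the probability p of covering target 0."""
    best = None
    for i in range(grid + 1):
        p = Fraction(i, grid)
        coverage = [p, 1 - p]
        target = best_response(coverage)
        value = expected(DEFENDER, target, coverage)
        if best is None or value > best[0]:
            best = (value, p, target)
    return best


value, p, target = solve()
print(f"cover target 0 with probability {p}; attacker hits target {target}; "
      f"defender value {value}")
```

Exact fractions are used so that the attacker's indifference point is detected without floating-point error; with these assumed payoffs the defender's best commitment makes the attacker indifferent between the two targets.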
Last-Mile Restoration for Multiple Interdependent Infrastructures
Coffrin, Carleton (Brown University) | Hentenryck, Pascal Van (NICTA) | Bent, Russell (Los Alamos National Laboratory)
This paper considers the restoration of multiple interdependent infrastructures after a man-made or natural disaster. Modern infrastructures feature complex cyclic interdependencies and require a holistic restoration process. This paper presents the first scalable approach for the last-mile restoration of the joint electrical power and gas infrastructures. It builds on an earlier three-stage decomposition for restoring the power network that decouples the restoration ordering and the routing aspects. The key contributions of the paper are (1) mixed-integer programming models for finding a minimal restoration set and a restoration ordering and (2) a randomized adaptive decomposition to obtain high-quality solutions within the required time constraints. The approach is validated on a large selection of benchmarks based on the United States infrastructures and state-of-the-art weather and fragility simulation tools. The results show significant improvements over current field practices.
Influence-Based Abstraction for Multiagent Systems
Oliehoek, Frans Adriaan (Maastricht University) | Witwicki, Stefan J. (INESC-ID) | Kaelbling, Leslie Pack (Massachusetts Institute of Technology)
This paper presents a theoretical advance by which factored POSGs can be decomposed into local models. We formalize the interface between such local models as the influence agents can exert on one another; and we prove that this interface is sufficient for decoupling them. The resulting influence-based abstraction substantially generalizes previous work on exploiting weakly-coupled agent interaction structures. Therein lie several important contributions. First, our general formulation sheds new light on the theoretical relationships among previous approaches, and promotes future empirical comparisons that could come by extending them beyond the more specific problem contexts for which they were developed. More importantly, the influence-based approaches that we generalize have shown promising improvements in the scalability of planning for more restrictive models. Thus, our theoretical result here serves as the foundation for practical algorithms that we anticipate will bring similar improvements to more general planning contexts, and also into other domains such as approximate planning, decision-making in adversarial domains, and online learning.
From Streamlined Combinatorial Search to Efficient Constructive Procedures
Bras, Ronan Le (Cornell University) | Gomes, Carla (Cornell University) | Selman, Bart (Cornell University)
In recent years, significant progress in the area of search, constraint satisfaction, and automated reasoning has been driven in part by the study of challenge problems from combinatorics and finite algebra. This work has led to the discovery of interesting discrete structures with intricate mathematical properties. While some of those results have resolved open questions and conjectures, a shortcoming is that they generally do not provide further mathematical insights from which one could derive more general observations. We propose an approach that integrates specialized combinatorial search, using so-called streamlining, with a human computation component. We use this approach to discover efficient constructive procedures for generating certain classes of combinatorial objects of any size. More specifically, using our framework, we discovered two complementary efficient constructions for generating so-called Spatially Balanced Latin squares (SBLS) of any order N such that 2N+1 is prime. Previously, constructions for SBLSs were not known. Our approach also enabled us to derive a new lower bound for so-called weak Schur numbers, improving on a series of earlier results for Schur numbers.
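A lower bound on a weak Schur number is typically certified by exhibiting a partition. Under the standard definition (assumed here, since the abstract does not state it), a set is weakly sum-free if it contains no x, y, z with x + y = z and x ≠ y, and a partition of {1, ..., n} into k such sets shows that the k-th weak Schur number is at least n. The checker below is a minimal sketch of that verification step only, not of the authors' search procedure; the function names are hypothetical.

```python
from itertools import combinations


def is_weakly_sum_free(block):
    """True if no two distinct members of `block` sum to another member."""
    members = set(block)
    return all(x + y not in members for x, y in combinations(sorted(members), 2))


def is_weak_schur_partition(blocks, n):
    """True if `blocks` partition {1, ..., n} into weakly sum-free sets."""
    flat = sorted(x for block in blocks for x in block)
    return flat == list(range(1, n + 1)) and all(
        is_weakly_sum_free(block) for block in blocks
    )


# A two-block partition of {1, ..., 8}.
partition = [{1, 2, 4, 8}, {3, 5, 6, 7}]
print(is_weak_schur_partition(partition, 8))
```

Adding 9 to either block of this example breaks the property (1 + 8 = 9 in the first, 3 + 6 = 9 in the second), which is why this particular partition cannot be extended.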
Planning Under Time Pressure
Burns, Ethan Andrew (University of New Hampshire)
Heuristic search is a technique used pervasively in the fields of artificial intelligence, automated planning, and operations research to solve a wide range of problems, from planning military deployments to planning tasks for a robot that must clean a messy kitchen. An automated agent can use heuristic search to construct a plan that, when executed, will achieve a desired task. The search algorithm explores different sequences of actions that the agent can execute, looking for a sequence that will lead it to a desired goal state. In many situations, an agent is given a task that it would like to solve as quickly as possible. The agent must allocate its time between searching for the actions that will achieve the task and actually executing them. We call this problem planning under time pressure.
Approximating the Sum Operation for Marginal-MAP Inference
Cheng, Qiang (Tsinghua University) | Chen, Feng (Tsinghua University) | Dong, Jianwu (Tsinghua University) | Xu, Wenli (Tsinghua University) | Ihler, Alexander (University of California, Irvine)
We study the marginal-MAP problem on graphical models, and present a novel approximation method based on direct approximation of the sum operation. A primary difficulty of marginal-MAP problems lies in the non-commutativity of the sum and max operations, so that even in highly structured models, marginalization may produce a densely connected graph over the variables to be maximized, resulting in an intractable potential function with exponential size. We propose a chain decomposition approach for summing over the marginalized variables, in which we produce a structured approximation to the MAP component of the problem consisting of only pairwise potentials. We show that this approach is equivalent to the maximization of a specific variational free energy, and it provides an upper bound of the optimal probability. Finally, experimental results demonstrate that our method performs favorably compared to previous methods.
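The non-commutativity of sum and max mentioned in the abstract can be seen on a two-variable toy problem. The sketch below assumes a hypothetical unnormalized potential f(x, y) with x as the variable to maximize and y as the variable to sum out; it only illustrates why the two operations cannot be swapped and does not implement the chain decomposition.

```python
# Hypothetical potential table: f[x][y] for binary x (maximized) and y (summed).
f = [
    [1, 4],
    [3, 1],
]

# Marginal-MAP objective: sum over y first, then maximize over x.
marginal_map = max(sum(row) for row in f)

# Swapping the order (maximize over x inside the sum over y) gives a different,
# never smaller, value.
swapped = sum(max(column) for column in zip(*f))

print(marginal_map, swapped)
```

Swapping the order can only increase the value, because each term of the sum is maximized separately, so the swapped quantity is an upper bound on the marginal-MAP value rather than a solution to it.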
Stochastic Safest and Shortest Path Problems
Teichteil-Königsbuch, Florent (ONERA)
Optimal solutions to Stochastic Shortest Path Problems (SSPs) usually require that there exists at least one policy that reaches the goal with probability 1 from the initial state. This condition is very strong and prevents solving many interesting problems, for instance those where all possible policies reach some dead-end states with a positive probability. We introduce a more general and richer dual optimization criterion, which minimizes the average (undiscounted) cost of only paths leading to the goal among all policies that maximize the probability to reach the goal. We present policy update equations in the form of dynamic programming for this new dual criterion, which are different from the standard Bellman equations, but produce the same solution if there exists one policy leading to the goal with probability 1 from the initial state. We demonstrate that our equations converge in infinite horizon without any condition on the structure of the problem or on its policies, which actually extends the class of SSPs that can be solved. We experimentally show that our dual criterion provides well-founded solutions to SSPs that cannot be solved by the standard criterion, and that using a discount factor with the latter does yield solution policies, but ones that are not optimal with respect to our well-founded criterion.
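The dual criterion reads as a lexicographic choice: first maximize the probability of reaching the goal, then minimize the average cost of the goal-reaching paths. The sketch below illustrates only that selection rule over an explicitly enumerated set of hypothetical policies; it does not implement the paper's dynamic programming equations, and it assumes the path cost is averaged conditionally on reaching the goal.

```python
from fractions import Fraction

# Hypothetical policies, each summarized by its possible trajectories:
# (probability, accumulated cost, reaches_goal).
POLICIES = {
    "risky_shortcut": [(Fraction(1, 2), 2, True), (Fraction(1, 2), 1, False)],
    "safe_detour": [(Fraction(9, 10), 6, True), (Fraction(1, 10), 3, False)],
    "safe_direct": [(Fraction(9, 10), 4, True), (Fraction(1, 10), 8, False)],
}


def goal_probability(trajectories):
    """Total probability of the trajectories that reach the goal."""
    return sum(p for p, _, reached in trajectories if reached)


def goal_cost(trajectories):
    """Average cost over goal-reaching trajectories only (conditional mean).

    Assumes the goal is reached with positive probability.
    """
    mass = goal_probability(trajectories)
    return sum(p * cost for p, cost, reached in trajectories if reached) / mass


def select(policies):
    """Among the policies with maximal goal probability, pick the cheapest."""
    best_prob = max(goal_probability(t) for t in policies.values())
    safest = [name for name, t in policies.items()
              if goal_probability(t) == best_prob]
    return min(safest, key=lambda name: goal_cost(policies[name]))


print(select(POLICIES))
```

In this example the cheapest policy overall is discarded because it reaches the goal less often, and the cost incurred on dead-end trajectories plays no role in the comparison.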
A Novel and Scalable Spatio-Temporal Technique for Ocean Eddy Monitoring
Faghmous, James H. (The University of Minnesota) | Chamber, Yashu (The University of Minnesota) | Boriah, Shyam (The University of Minnesota) | Vikebø, Frode (Institute of Marine Research) | Liess, Stefan (The University of Minnesota) | Mesquita, Michel dos Santos (Bjerknes Centre for Climate Research) | Kumar, Vipin (The University of Minnesota)
Swirls of ocean currents known as ocean eddies are a crucial component of the ocean's dynamics. In addition to dominating the ocean's kinetic energy, eddies play a significant role in the transport of water, salt, heat, and nutrients. Therefore, understanding current and future eddy patterns is a central climate challenge to address future sustainability of marine ecosystems. The emergence of sea surface height observations from satellite radar altimeters has recently enabled researchers to track eddies at a global scale. The majority of studies that identify eddies from observational data employ highly parametrized connected component algorithms using expert-filtered data, which makes reproducibility and scalability challenging. In this paper, we frame the challenge of monitoring ocean eddies as an unsupervised learning problem. We present a novel change detection algorithm that automatically identifies and monitors eddies in sea surface height data based on heuristics derived from basic eddy properties. Our method is accurate, efficient, and scalable. To demonstrate its performance we analyze eddy activity in the Nordic Sea (60-80N and 20W-20E), an area that has received limited attention and has proven to be difficult to analyze using other methods.
TD-DeltaPi: A Model-Free Algorithm for Efficient Exploration
Silva, Bruno C. da (University of Massachusetts Amherst) | Barto, Andrew G. (University of Massachusetts Amherst)
We study the problem of finding efficient exploration policies for the case in which an agent is momentarily not concerned with exploiting, and instead tries to compute a policy for later use. We first formally define the Optimal Exploration Problem as one of sequential sampling and show that its solutions correspond to paths of minimum expected length in the space of policies. We derive a model-free, local linear approximation to such solutions and use it to construct efficient exploration policies. We compare our model-free approach to other exploration techniques, including one with the best known PAC bounds, and show that ours is both based on a well-defined optimization problem and empirically efficient.
Optimization and Controlled Systems: A Case Study on Thermal Aware Workload Dispatching
Bartolini, Andrea (University of Bologna) | Lombardi, Michele (University of Bologna) | Milano, Michela (University of Bologna) | Benini, Luca (DEIS, University of Bologna)
Although successfully employed on many industrial problems, Combinatorial Optimization still has limited applicability in several real-world domains, often due to modeling difficulties. This is typically the case for systems under the control of an on-line policy: even when the policy itself is well known, capturing its effect on the system in a declarative model is often impossible by conventional means. Such a difficulty is at the root of the classical, sharp separation between off-line and on-line approaches. In this paper, we investigate a general method to model controlled systems, based on the integration of Machine Learning and Constraint Programming (CP). Specifically, we use an Artificial Neural Network (ANN) to learn the behavior of a controlled system (a multicore CPU with thermal controllers) and plug it into a CP model by means of Neuron Constraints. The method obtains significantly better results compared to an approach with no ANN guidance. Neuron Constraints were first introduced in [Bartolini et al., 2011b] as a means to model complex systems: providing evidence of their applicability to controlled systems is a significant step forward, broadening the application field of combinatorial methods and disclosing opportunities for hybrid off-line/on-line optimization.