Chatterjee, Krishnendu
Reinforcement Learning of Risk-Constrained Policies in Markov Decision Processes
Brazdil, Tomas, Chatterjee, Krishnendu, Novotny, Petr, Vahala, Jiri
Markov decision processes (MDPs) are the de facto framework for sequential decision making in the presence of stochastic uncertainty. A classical optimization criterion for MDPs is to maximize the expected discounted-sum payoff, which ignores low-probability catastrophic events with highly negative impact on the system. On the other hand, risk-averse policies require the probability of undesirable events to be below a given threshold, but they do not account for optimization of the expected payoff. We consider MDPs with discounted-sum payoff with failure states which represent catastrophic outcomes. The objective of risk-constrained planning is to maximize the expected discounted-sum payoff among risk-averse policies that ensure the probability to encounter a failure state is below a desired threshold. Our main contribution is an efficient risk-constrained planning algorithm that combines UCT-like search with a predictor learned through interaction with the MDP (in the style of AlphaZero) and with risk-constrained action selection via linear programming. We demonstrate the effectiveness of our approach with experiments on classical MDPs from the literature, including benchmarks with on the order of 10^6 states.
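As a rough illustration of the last ingredient, here is a minimal sketch of risk-constrained action selection via linear programming (not the paper's exact formulation): given per-action payoff and failure-probability estimates, which in the paper would come from the UCT search and the learned predictor, a small LP finds the payoff-maximizing action distribution whose expected risk stays below the threshold. All numbers are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

payoff = np.array([10.0, 6.0, 1.0])   # estimated discounted-sum payoff per action
risk   = np.array([0.30, 0.05, 0.0])  # estimated failure probability per action
delta  = 0.10                          # risk threshold

# Maximize payoff . x  <=>  minimize -payoff . x
res = linprog(
    c=-payoff,
    A_ub=risk.reshape(1, -1), b_ub=[delta],  # expected risk stays <= delta
    A_eq=np.ones((1, 3)), b_eq=[1.0],        # x is a probability distribution
    bounds=[(0.0, 1.0)] * 3,
)
print("action distribution:", res.x)  # mixes actions to exhaust the risk budget
```

Note that the optimum here is a randomized choice (roughly 0.2 on the risky action, 0.8 on the safe one): deterministic action selection cannot in general trade payoff against risk this finely.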
Algorithms and Conditional Lower Bounds for Planning Problems
Chatterjee, Krishnendu (Institute of Science and Technology Austria) | Dvořák, Wolfgang (Vienna University of Technology) | Henzinger, Monika (University of Vienna) | Svozil, Alexander (University of Vienna)
We consider planning problems for graphs, Markov decision processes (MDPs), and games on graphs. While graphs represent the most basic planning model, MDPs represent interaction with nature and games on graphs represent interaction with an adversarial environment. We consider two planning problems where there are k different target sets, and the problems are as follows: (a) the coverage problem asks whether there is a plan for each individual target set, and (b) the sequential target reachability problem asks whether the targets can be reached in sequence. For the coverage problem, we present a linear-time algorithm for graphs, and a quadratic conditional lower bound for MDPs and games on graphs. For the sequential target problem, we present a linear-time algorithm for graphs, a sub-quadratic algorithm for MDPs, and a quadratic conditional lower bound for games on graphs. Our results with conditional lower bounds establish (i) model-separation results showing that for the coverage problem MDPs and games on graphs are harder than graphs and for the sequential reachability problem games on graphs are harder than MDPs and graphs; and (ii) objective-separation results showing that for MDPs the coverage problem is harder than the sequential target problem.
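To make the graph case concrete, here is a minimal sketch of the linear-time coverage check for graphs (the MDP and game algorithms in the paper are more involved): a single BFS from the start vertex computes the reachable set, and each of the k target sets is coverable iff it intersects that set. The toy graph is hypothetical.

```python
from collections import deque

def coverage(adj, start, target_sets):
    # One BFS computes the reachable set in O(|V| + |E|).
    reachable = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in reachable:
                reachable.add(v)
                queue.append(v)
    # Each of the k target sets is then checked against the reachable set.
    return [bool(reachable & targets) for targets in target_sets]

adj = {0: [1, 2], 1: [3], 2: []}
print(coverage(adj, 0, [{3}, {2}, {4}]))  # [True, True, False]
```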
Sensor Synthesis for POMDPs with Reachability Objectives
Chatterjee, Krishnendu (Institute of Science and Technology Austria) | Chmelik, Martin (TTTech Computertechnik AG) | Topcu, Ufuk (University of Texas at Austin)
Partially observable Markov decision processes (POMDPs) are widely used in probabilistic planning problems in which an agent interacts with an environment using noisy and imprecise sensors. We study a setting in which the sensors are only partially defined and the goal is to synthesize “weakest” additional sensors, such that in the resulting POMDP, there is a small-memory policy for the agent that almost-surely (with probability 1) satisfies a reachability objective. We show that the problem is NP-complete, and present a symbolic algorithm by encoding the problem into SAT instances. We illustrate trade-offs between the amount of memory of the policy and the number of additional sensors on a simple example. We have implemented our approach and consider three classical POMDP examples from the literature, and show that in all the examples the number of sensors can be significantly decreased (as compared to the existing solutions in the literature) without increasing the complexity of the policies.
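A toy sketch of the covering structure that such a SAT encoding exploits (the paper's actual encoding also captures the existence of a small-memory almost-sure policy, which this omits): each boolean variable marks a candidate sensor as selected, and every pair of states the policy must distinguish contributes a clause requiring some selected sensor that separates the pair. All names and data below are hypothetical; a SAT solver replaces the brute-force loop at scale.

```python
from itertools import combinations

# hypothetical candidate sensors and the state pairs each one separates
distinguishes = {
    "door":  {(0, 1)},
    "light": {(0, 1), (2, 3)},
    "temp":  {(2, 3)},
}
critical_pairs = {(0, 1), (2, 3)}  # pairs the policy must tell apart

def weakest_sensor_set(distinguishes, critical_pairs):
    """Smallest sensor selection satisfying every 'separate this pair' clause."""
    sensors = list(distinguishes)
    for k in range(len(sensors) + 1):      # try selections of increasing size
        for sel in combinations(sensors, k):
            covered = set()
            for s in sel:
                covered |= distinguishes[s]
            if critical_pairs <= covered:  # every clause has a true literal
                return sel
    return None

print(weakest_sensor_set(distinguishes, critical_pairs))  # ('light',)
```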
Expectation Optimization with Probabilistic Guarantees in POMDPs with Discounted-sum Objectives
Chatterjee, Krishnendu, Elgyütt, Adrián, Novotný, Petr, Rouillé, Owen
Partially-observable Markov decision processes (POMDPs) with discounted-sum payoff are a standard framework to model a wide range of problems related to decision making under uncertainty. Traditionally, the goal has been to obtain policies that optimize the expectation of the discounted-sum payoff. A key drawback of the expectation measure is that even low probability events with extreme payoff can significantly affect the expectation, and thus the obtained policies are not necessarily risk-averse. An alternate approach is to optimize the probability that the payoff is above a certain threshold, which allows obtaining risk-averse policies, but ignores optimization of the expectation. We consider the expectation optimization with probabilistic guarantee (EOPG) problem, where the goal is to optimize the expectation ensuring that the payoff is above a given threshold with at least a specified probability. We present several results on the EOPG problem, including the first algorithm to solve it.
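A toy numeric illustration of the EOPG objective on hypothetical data: each policy below is reduced to the finite distribution of discounted-sum payoffs it induces (real POMDP policies induce such distributions implicitly), and we pick the best expectation among the policies that meet the probabilistic guarantee.

```python
t, p = 10.0, 0.9  # payoff threshold and required guarantee probability

# policy -> list of (probability, discounted-sum payoff); hypothetical data
policies = {
    "aggressive": [(0.80, 50.0), (0.20, 0.0)],   # high mean, risky
    "cautious":   [(0.95, 20.0), (0.05, 12.0)],  # lower mean, safe
}

def expectation(dist):
    return sum(pr * pay for pr, pay in dist)

def guarantee(dist):
    # probability that the payoff is at least the threshold t
    return sum(pr for pr, pay in dist if pay >= t)

feasible = {name: d for name, d in policies.items() if guarantee(d) >= p}
best = max(feasible, key=lambda name: expectation(feasible[name]))
print(best)  # "cautious": the aggressive policy has P(payoff >= 10) = 0.8 < 0.9
```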
Computational Approaches for Stochastic Shortest Path on Succinct MDPs
Chatterjee, Krishnendu, Fu, Hongfei, Goharshady, Amir Kafshdar, Okati, Nastaran
We consider the stochastic shortest path (SSP) problem for succinct Markov decision processes (MDPs), where the MDP consists of a set of variables, and a set of nondeterministic rules that update the variables. First, we show that several examples from the AI literature can be modeled as succinct MDPs. Then we present computational approaches for upper and lower bounds for the SSP problem: (a) for computing upper bounds, our method is polynomial-time in the implicit description of the MDP; (b) for lower bounds, we present a polynomial-time (in the size of the implicit description) reduction to quadratic programming. Our approach is applicable even to infinite-state MDPs. Finally, we present experimental results to demonstrate the effectiveness of our approach on several classical examples from the AI literature.
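For intuition, here is a minimal sketch of the classical linear-programming characterization of SSP values on a tiny explicit MDP; the paper itself works on succinct (variable-based) MDPs and obtains lower bounds via a reduction to quadratic programming, neither of which this shows. The toy MDP data is made up.

```python
import numpy as np
from scipy.optimize import linprog

# states 0 and 1; state 2 is the goal (its value is fixed to 0 and eliminated)
# per state: list of (cost, {successor: probability}) over available actions
actions = {
    0: [(1.0, {1: 1.0}), (4.0, {2: 1.0})],
    1: [(1.0, {0: 0.5, 2: 0.5})],
}
n = 2
A_ub, b_ub = [], []
for s, acts in actions.items():
    for cost, succ in acts:
        # Bellman inequality: v_s - sum_s' P(s'|s,a) v_s' <= cost
        row = np.zeros(n)
        row[s] = 1.0
        for sp, pr in succ.items():
            if sp < n:          # the goal state contributes value 0
                row[sp] -= pr
        A_ub.append(row)
        b_ub.append(cost)

# Maximizing the sum of values subject to the inequalities yields the
# optimal expected costs to reach the goal.
res = linprog(c=-np.ones(n), A_ub=np.array(A_ub), b_ub=b_ub,
              bounds=[(0, None)] * n)
print(res.x)  # ~[4. 3.]
```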
Optimizing Expectation with Guarantees in POMDPs
Chatterjee, Krishnendu (The Institute of Science and Technology Austria) | Novotný, Petr (The Institute of Science and Technology Austria) | Pérez, Guillermo A. (Université Libre de Bruxelles) | Raskin, Jean-François (Université Libre de Bruxelles) | Žikelić, Đorđe (University of Cambridge)
A standard objective in partially-observable Markov decision processes (POMDPs) is to find a policy that maximizes the expected discounted-sum payoff. However, such policies may still permit unlikely but highly undesirable outcomes, which is problematic especially in safety-critical applications. Recently, there has been a surge of interest in POMDPs where the goal is to maximize the probability that the payoff is at least a given threshold, but these approaches do not consider any optimization beyond satisfying this threshold constraint. In this work we go beyond both the “expectation” and “threshold” approaches and consider a “guaranteed payoff optimization (GPO)” problem for POMDPs, where we are given a threshold t and the objective is to find a policy σ such that a) each possible outcome of σ yields a discounted-sum payoff of at least t, and b) the expected discounted-sum payoff of σ is optimal (or near-optimal) among all policies satisfying a). We present a practical approach to tackle the GPO problem and evaluate it on standard POMDP benchmarks.
Stochastic Shortest Path with Energy Constraints in POMDPs
Brázdil, Tomáš, Chatterjee, Krishnendu, Chmelík, Martin, Gupta, Anchit, Novotný, Petr
We consider partially observable Markov decision processes (POMDPs) with a set of target states and positive integer costs associated with every transition. The traditional optimization objective (stochastic shortest path) asks to minimize the expected total cost until the target set is reached. We extend the traditional framework of POMDPs to model energy consumption, which represents a hard constraint. The energy levels may increase and decrease with transitions, and the hard constraint requires that the energy level must remain positive in all steps until the target is reached. First, we present a novel algorithm for solving POMDPs with energy levels, building on existing POMDP solvers and using RTDP as its main method. Our second contribution is related to policy representation. For larger POMDP instances the policies computed by existing solvers are too large to be understandable. We present an automated procedure based on machine learning techniques that extracts the important decisions of the policy, allowing us to compute succinct, human-readable policies. Finally, we show experimentally that our algorithm performs well and computes succinct policies on a number of POMDP instances from the literature that were naturally enhanced with energy levels.
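A minimal sketch, on hypothetical data, of the standard product construction that makes the energy constraint explicit (the paper's RTDP-based solver then runs on top of such a model): each state is paired with its current energy level, recharging is capped at a capacity, and any move that would drop the energy to zero or below is pruned, so every surviving run keeps the energy positive until the target.

```python
def energy_product(transitions, capacity):
    """transitions: {state: [(action, energy_delta, successor), ...]};
    returns the same structure over (state, energy) pairs."""
    product = {}
    for s, outs in transitions.items():
        for e in range(1, capacity + 1):          # only positive energy levels
            moves = []
            for a, delta, t in outs:
                e2 = min(e + delta, capacity)     # recharging is capped
                if e2 > 0:                        # hard constraint: stay positive
                    moves.append((a, (t, e2)))
            product[(s, e)] = moves
    return product

toy = {"s": [("go", -2, "goal"), ("charge", +1, "s")]}
print(energy_product(toy, capacity=3)[("s", 1)])  # only ('charge', ('s', 2)) survives
```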
A Symbolic SAT-Based Algorithm for Almost-Sure Reachability with Small Strategies in POMDPs
Chatterjee, Krishnendu (IST Austria) | Chmelík, Martin (IST Austria) | Davies, Jessica (IST Austria)
The de facto model for dynamic systems with probabilistic and nondeterministic behavior are Markov decision processes (MDPs) (Howard 1960). MDPs provide the appropriate model to solve control and probabilistic planning problems (Filar and Vrieze 1997; Puterman 1994), where the nondeterminism represents the choice of the control actions for the controller (or planner), while the stochastic response of the system to control actions is represented by the probabilistic transitions. The qualitative problem is of great importance as in several applications it is required that the correct behavior happens with probability 1, e.g., in the analysis of randomized embedded schedulers, the important question is whether every thread progresses with probability 1. Also in applications where it might be sufficient that the correct behavior happens with probability at least λ < 1, the correct choice of the threshold λ can still be challenging, due to simplifications and imprecisions introduced during modeling.
Optimal Cost Almost-Sure Reachability in POMDPs
Chatterjee, Krishnendu (IST Austria) | Chmelik, Martin (IST Austria) | Gupta, Raghav (IIT Bombay) | Kanodia, Ayush (IIT Bombay)
We consider partially observable Markov decision processes (POMDPs) with a set of target states, where every transition is associated with an integer cost. The optimization objective we study asks to minimize the expected total cost until the target set is reached, while ensuring that the target set is reached almost-surely (with probability 1). We show that for integer costs, approximating the optimal cost is undecidable. For positive costs, our results are as follows: (i) we establish matching lower and upper bounds for the optimal cost, and the bound is double exponential; (ii) we show that the problem of approximating the optimal cost is decidable and present approximation algorithms building on the existing algorithms for POMDPs with finite-horizon objectives. While the worst-case running time of our algorithm is double exponential, we present efficient stopping criteria for the algorithm and show experimentally that it performs well in many examples of interest.