AITopics

One prominent way to deal with conflicting view-points among agents is to conduct an argumentative debate: by exchanging arguments, agents can seek to persuade each other. In this paper we investigate the problem, for an agent, of optimizing a sequence of moves to be put forward in a debate, against an opponent assumed to behave stochastically, and equipped with an unknown initial belief state. Despite the prohibitive number of states induced by a naive mapping to Markov models, we show that exploiting several features of such interaction settings allows for optimal resolution in practice, in particular: (1) as debates take place in a public space (or common ground), they can readily be modelled as Mixed Observability Markov Decision Processes, (2) as argumentation problems are highly structured, one can design optimization techniques to prune the initial instance. We report on the experimental evaluation of these techniques.

agent, argument, opponent, (17 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > United States > Indiana (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
(4 more...)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.75)

Gilbert, Hugo (Sorbonne Universités, UPMC University of Paris 06, UMR 7606, LIP6 and CNRS, UMR 7606, LIP6) | Spanjaard, Olivier (Sorbonne Universités, UPMC University of Paris 06, UMR 7606, LIP6 and CNRS, UMR 7606, LIP6) | Viappiani, Paolo (Sorbonne Universités, UPMC University of Paris 06, UMR 7606, LIP6 and CNRS, UMR 7606, LIP6) | Weng, Paul (SYSU-CMU Joint Institute of Engineering, Guangzhou and SYSU-CMU Shunde International Joint Research Institute, Shunde)

Solving MDPs with Skew Symmetric Bilinear Utility Functions

In this paper we adopt Skew Symmetric Bilinear (SSB) utility functions to compare policies in Markov Decision Processes (MDPs). By considering pairs of alternatives, SSB utility theory generalizes von Neumann and Morgenstern's expected utility (EU) theory to encompass rational decision behaviors that EU cannot accommodate. We provide a game-theoretic analysis of the problem of identifying an SSB-optimal policy in finite horizon MDPs and propose an algorithm based on a double oracle approach for computing an optimal (possibly randomized) policy. Finally, we present and discuss experimental results where SSB-optimal policies are computed for a popular TV contest according to several instantiations of SSB utility functions.

augmented mdp, mdp, utility function, (16 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)

Kordjamshidi, Parisa (University of Illinois at Urbana-Champaign) | Roth, Dan (University of Illinois at Urbana-Champaign) | Wu, Hao (University of Illinois at Urbana-Champaign)

Saul: Towards Declarative Learning Based Programming

We present Saul, a new probabilistic programming language designed to address some of the shortcomings of programming languages that aim at advancing and simplifying the development of AI systems. Such languages need to interact with messy, naturally occurring data, to allow a programmer to specify what needs to be done at an appropriate level of abstraction rather than at the data level, to be developed on a solid theory that supports moving to and reasoning at this level of abstraction and, finally, to support flexible integration of these learning and inference models within an application program. Saul is an object-functional programming language written in Scala that facilitates these by (1) allowing a programmer to learn, name and manipulate named abstractions over relational data; (2) supporting seamless incorporation of trainable (probabilistic or discriminative) components into the program, and (3) providing a level of inference over trainable models to support composition and make decisions that respect domain and application constraints. Saul is developed over a declaratively defined relational data model, can use piecewise learned factor graphs with declaratively specified learning and inference objectives, and it supports inference over probabilistic models augmented with declarative knowledge-based constraints.We describe the key constructs of Saul and exemplify its use in developing applications that require relational feature engineering and structured output prediction.

inference, programmer, saul, (16 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > United States > Illinois (0.04)
North America > United States > Massachusetts (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
(2 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Wray, Kyle Hollins (University of Massachusetts, Amherst) | Zilberstein, Shlomo (University of Massachusetts, Amherst)

Multi-Objective POMDPs with Lexicographic Reward Preferences

We propose a model, Lexicographic Partially Observable Markov Decision Process (LPOMDP), which extends POMDPs with lexicographic preferences over multiple value functions. It allows for slack--slightly less-than-optimal values--for higher-priority preferences to facilitate improvement in lower-priority value functions. Many real life situations are naturally captured by LPOMDPs with slack. We consider a semi-autonomous driving scenario in which time spent on the road is minimized, while maximizing time spent driving autonomously. We propose two solutions to LPOMDPs--Lexicographic Value Iteration (LVI) and Lexicographic Point-Based Value Iteration (LPBVI), establishing convergence results and correctness within strong slack bounds. We test the algorithms using real-world road data provided by Open Street Map (OSM) within 10 major cities. Finally, we present GPU-based optimizations for point-based solvers, demonstrating that their application enables us to quickly solve vastly larger LPOMDPs and other variations of POMDPs.

lpomdp, slack, value function, (13 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Iceland > Capital Region > Reykjavik (0.04)

Industry:

Health & Medicine (0.94)
Transportation > Ground > Road (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Planning for Stochastic Games with Co-Safe Objectives

Song, Lei (University of Technology Sydney) | Feng, Yuan (University of Technology Sydney) | Zhang, Lijun (Chinese Academy of Sciences)

We consider planning problems for stochastic games with objectives specified by a branching-time logic, called probabilistic computation tree logic (PCTL). This problem has been shown to be undecidable if strategies with perfect recall, i.e., history-dependent, are considered. In this paper, we show that, if restricted to co-safe properties, a subset of PCTL properties capable to specify a wide range of properties in practice including reachability ones, the problem turns to be decidable, even when the class of general strategies is considered. We also give an algorithm for solving robust stochastic planning, where a winning strategy is tolerant to some perturbations of probabilities in the model. Our result indicates that satisfiability of co-safe PCTL is decidable as well.

objective, probability, stochastic game, (13 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Asia > China (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report (0.34)

Industry: Leisure & Entertainment > Games (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.34)

Point-Based Planning for Multi-Objective POMDPs

Roijers, Diederik Marijn (University of Amsterdam) | Whiteson, Shimon (University of Amsterdam) | Oliehoek, Frans A. (University of Liverpool)

Many sequential decision-making problems require an agent to reason about both multiple objectives and uncertainty regarding the environment's state. Such problems can be naturally modelled as multi-objective partially observable Markov decision processes (MOPOMDPs). We propose optimistic linear support with alpha reuse (OLSAR), which computes a bounded approximation of the optimal solution set for all possible weightings of the objectives. The main idea is to solve a series of scalarized single-objective POMDPs, each corresponding to a different weighting of the objectives. A key insight underlying OLSAR is that the policies and value functions produced when solving scalarized POMDPs in earlier iterations can be reused to more quickly solve scalarized POMDPs in later iterations. We show experimentally that OLSAR outperforms, both in terms of runtime and approximation quality, alternative methods and a variant of OLSAR that does not leverage reuse.

corner weight, solvescalarizedpomdp, vector, (14 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Oliehoek, Frans Adriaan (University of Amsterdam and University of Liverpool) | Spaan, Matthijs T. J. (Delft University of Technology) | Witwicki, Stefan John (Swiss Federal Institute of Technology (EPFL))

Factored Upper Bounds for Multiagent Planning Problems under Uncertainty with Non-Factored Value Functions

Nowadays, multiagent planning under uncertainty scales to tens or even hundreds of agents. However, current methods either are restricted to problems with factored value functions, or provide solutions without any guarantees on quality. Methods in the former category typically build on heuristic search using upper bounds on the value function. Unfortunately, no techniques exist to compute such upper bounds for problems with non-factored value functions, which would additionally allow for meaningful benchmarking of methods of the latter category. To mitigate this problem, this paper introduces a family of influence-optimistic upper bounds for factored Dec-POMDPs without factored value functions. We demonstrate how we can achieve firm quality guarantees for problems with hundreds of agents.

agent, assumption, dec-pomdp, (16 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > Michigan (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Metareasoning for Planning Under Uncertainty

Lin, Christopher H. (University of Washington) | Kolobov, Andrey (Microsoft Research) | Kamar, Ece (Microsoft Research) | Horvitz, Eric (Microsoft Research)

The conventional model for online planning under uncertainty assumes that an agent can stop and plan without incurring costs for the time spent planning. However, planning time is not free in most real-world settings. For example, an autonomous drone is subject to nature's forces, like gravity, even while it thinks, and must either pay a price for counteracting these forces to stay in place, or grapple with the state change caused by acquiescing to them. Policy optimization in these settings requires metareasoning---a process that trades off the cost of planning and the potential policy improvement that can be achieved. We formalize and analyze the metareasoning problem for Markov Decision Processes (MDPs). Our work subsumes previously studied special cases of metareasoning and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking. For reasons we discuss, optimal general metareasoning turns out to be impractical, motivating approximations. We present approximate metareasoning procedures which rely on special properties of the BRTDP planning algorithm and explore the effectiveness of our methods on a variety of problems.

agent, algorithm, brtdp, (17 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Washington > King County > Redmond (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Metareasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Lang, Jérôme (CNRS, Université Paris-Dauphine) | Zanuttini, Bruno (Université de Caen Basse-Normandie)

Probabilistic Knowledge-Based Programs

We introduce Probabilistic Knowledge-Based Programs (PKBPs), a new, compact representation of policies for factored partially observable Markov decision processes. PKBPs use branching conditions such as if the probability of φ is larger than p, and many more. While similar in spirit to value-based policies, PKBPs leverage the factored representation for more compactness. They also cope with more general goals than standard state-based rewards, such as pure information-gathering goals. Compactness comes at the price of reactivity, since evaluating branching conditions on-line is not polynomial in general. In this sense, PKBPs are complementary to other representations. Our intended application is as a tool for experts to specify policies in a natural, compact language, then have them verified automatically. We study succinctness and the complexity of verification for PKBPs.

pkbp, probability, representation, (17 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Europe > France (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Optimal Policy Generation for Partially Satisfiable Co-Safe LTL Specifications

Lacerda, Bruno (University of Birmingham) | Parker, David (University of Birmingham) | Hawes, Nick (University of Birmingham)

We present a method to calculate cost-optimal policies for co-safe linear temporal logic task specifications over a Markov decision process model of a stochastic system. Our key contribution is to address scenarios in which the task may not be achievable with probability one. We formalise a task progression metric and, using multi-objective probabilistic model checking, generate policies that are formally guaranteed to, in decreasing order of priority: maximise the probability of finishing the task; maximise progress towards completion, if this is not possible; and minimise the expected time or cost required. We illustrate and evaluate our approach in a robot task planning scenario, where the task is to visit a set of rooms that may be inaccessible during execution.

mdp, probability, specification, (15 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Europe > United Kingdom > England > West Midlands > Birmingham (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)