Agents
Visibility Induction for Discretized Pursuit-Evasion Games
Abdelrazek, Ahmed Abdelkader (Alexandria University) | El-Alfy, Hazem M (Alexandria University)
We study a two-player pursuit-evasion game, in which an agent moving amongst obstacles is to be maintained within ``sight" of a pursuing robot. Using a discretization of the environment, our main contribution is to design an efficient algorithm that decides, given initial positions of both pursuer and evader, if the evader can take any moving strategy to go out of sight of the pursuer at any time instant. If that happens, we say that the evader wins the game. We analyze the algorithm, present several optimizations and show results for different environments. For situations where the evader cannot win, we compute, in addition, a pursuit strategy that keeps the evader within sight, for every strategy the evader can take. Finally, if it is determined that the evader wins, we compute its optimal escape trajectory and the corresponding optimal pursuit trajectory.
Modeling Context Aware Dynamic Trust Using Hidden Markov Model
Liu, Xin (École Polytechnique Fédérale de Lausanne EPFL) | Datta, Anwitaman (Nanyang Technological University)
Modeling trust in complex dynamic environments is an important yet challenging issue since an intelligent agent may strategically change its behavior to maximize its profits. In thispaper, we propose a context aware trust model to predict dynamic trust by using a Hidden Markov Model (HMM) to model an agent's interactions. Although HMMs have already been applied in the past to model an agent's dynamic behavior to greatly improve the traditional static probabilistic trust approaches, most HMM based trust models only focus on outcomes of the past interactions without considering interaction context, which we believe, reflects immensely on the dynamic behavior or intent of an agent. Interaction contextual information is comprehensively studied and integrated into the model to more precisely approximate an agent's dynamic behavior. Evaluation using real auction data and synthetic data demonstrates the efficacy of our approach in comparison with previous state-of-the-art trust mechanisms.
Using Sliding Windows to Generate Action Abstractions in Extensive-Form Games
Hawkin, John Alexander (University of Alberta) | Holte, Robert (University of Alberta) | Szafron, Duane (University of Alberta)
In extensive-form games with a large number of actions, careful abstraction of the action space is critically important to performance. In this paper we extend previous work on action abstraction using no-limit poker games as our test domains. We show that in such games it is no longer necessary to choose, a priori, one specific range of possible bet sizes. We introduce an algorithm that adjusts the range of bet sizes considered for each bet individually in an iterative fashion. This flexibility results in a substantially improved game value in no-limit Leduc poker. When applied to no-limit Texas Hold'em our algorithm produces an action abstraction that is about one third the size of a state of the art hand-crafted action abstraction, yet has a better overall game value.
Planning in Factored Action Spaces with Symbolic Dynamic Programming
Raghavan, Aswin (Oregon State University) | Joshi, Saket (Oregon State University) | Fern, Alan (Oregon State University) | Tadepalli, Prasad (Oregon State University) | Khardon, Roni (Tufts University)
We consider symbolic dynamic programming (SDP) for solving Markov Decision Processes (MDP) with factored state and action spaces, where both states and actions are described by sets of discrete variables. Prior work on SDP has considered only the case of factored states and ignored structure in the action space, causing them to scale poorly in terms of the number of action variables. Our main contribution is to present the first SDP-based planning algorithm for leveraging both state and action space structure in order to compute compactly represented value functions and policies. Since our new algorithm can potentially require more space than when action structure is ignored, our second contribution is to describe an approach for smoothly trading-off space versus time via recursive conditioning. Finally, our third contribution is to introduce a novel SDP approximation that often significantly reduces planning time with little loss in quality by exploiting action structure in weakly coupled MDPs. We present empirical results in three domains with factored action spaces that show that our algorithms scale much better with the number of action variables as compared to state-of-the-art SDP algorithms.
Width and Complexity of Belief Tracking in Non-Deterministic Conformant and Contingent Planning
Bonet, Blai (Universidad Simon Bolivar) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)
It has been shown recently that the complexity of belief tracking in deterministic conformant and contingent planning is exponential in a width parameter that is often bounded and small. In this work, we introduce a new width notion that applies to non-deterministic conformant and contingent problems as well. We also develop a belief tracking algorithm for non-deterministic problems that is exponential in the problem width, analyze the width of non-deterministic benchmarks, compare the new notion to the previous one over deterministic problems, and present experimental results.
A Distributed Approach to Summarizing Spaces of Multiagent Schedules
Jr., James Calvin Boerkoel (University of Michigan) | Durfee, Edmund H. (University of Michigan)
We introduce the Multiagent Disjunctive Temporal Problem (MaDTP), a new distributed formulation of the widely-adopted Disjunctive Temporal Problem (DTP) representation. An agent that generates a summary of all viable schedules, rather than a single schedule, can be more useful in dynamic environments. We show how a (Ma)DTP with the properties of minimality and decomposability provides a particularly efficacious solution space summary.However, in the multiagent case, these properties sacrifice an agent's strategic interests while incurring significant computational overhead. We introduce a new property called local decomposability that exploits loose-coupling between agents' problems, protects strategic interests, and supports typical queries. We provide and evaluate a new distributed algorithm that summarizes agents' solution spaces in significantly less time and space by using local, rather than full, decomposability.
Agent-Human Coordination with Communication Costs Under Uncertainty
Frieder, Asaf (Bar-Ilan University) | Lin, Raz (Bar-Ilan University) | Kraus, Sarit (Bar-Ilan University)
Coordination in mixed agent-human environments is an important, yet not a simple, problem. Little attention has been given to the issues raised in teams that consist of both computerized agents and people. In such situations different considerations are in order, as people tend to make mistakes and they are affected by cognitive, social and cultural factors. In this paper we present a novel agent designed to proficiently coordinate with a human counterpart. The agent uses a neural network model that is based on a pre-existing knowledge base which allows it to achieve an efficient modeling of a human's decisions and predict their behavior. A novel communication mechanism which takes into account the expected effect of communication on the other member will allow communication costs to be minimized. In extensive simulations involving more than 200 people we investigated our approach and showed that our agent achieves better coordination when involved, compared to settings in which only humans or another state-of-the-art agent are involved.
Algorithmic and Human Teaching of Sequential Decision Tasks
Cakmak, Maya (Georgia Institute of Technology) | Lopes, Manuel (INRIA)
A helpful teacher can significantly improve the learning rate of a learning agent. Teaching algorithms have been formally studied within the field of Algorithmic Teaching. These give important insights into how a teacher can select the most informative examples while teachinga new concept. However the field has so far focused purely on classification tasks. In this paper we introducea novel method for optimally teaching sequential decision tasks. We present an algorithm that automatically selects the set of most informative demonstrations andevaluate it on several navigation tasks. Next, we explore the idea of using this algorithm to produce instructions for humans on how to choose examples when teaching sequential decision tasks. We present a user study that demonstrates the utility of such instructions.
Possible Winners in Noisy Elections
Wojtas, Krzysztof (AGH University of Science and Technology) | Faliszewski, Piotr (AGH Univesity of Science and Technology)
We consider the problem of predicting winners in elections given complete knowledge about all possible candidates, all possible voters (together with their preferences), but in the case where it is uncertain either which candidates exactly register for the election or which voters cast their votes. Under reasonable assumptions our problems reduce to counting variants of election control problems. We either give polynomial-time algorithms or prove #P-completeness results for counting variants of control by adding/deleting candidates/voters for Plurality, k -Approval, Approval, Condorcet, and Maximin voting rules.
A Robust Bayesian Truth Serum for Small Populations
Witkowski, Jens (Albert-Ludwigs-Universität Freiburg) | Parkes, David C. (Harvard University)
Peer prediction mechanisms allow the truthful elicitation of private signals (e.g., experiences, or opinions) in regard to a true world state when this ground truth is unobservable. The original peer prediction method is incentive compatible for any number of agents n >= 2, but relies on a common prior, shared by all agents and the mechanism. The Bayesian Truth Serum (BTS) relaxes this assumption. While BTS still assumes that agents share a common prior, this prior need not be known to the mechanism. However, BTS is only incentive compatible for a large enough number of agents, and the particular number of agents required is uncertain because it depends on this private prior. In this paper, we present a robust BTS for the elicitation of binary information which is incentive compatible for every n >= 3, taking advantage of a particularity of the quadratic scoring rule. The robust BTS is the first peer prediction mechanism to provide strict incentive compatibility for every n >= 3 without relying on knowledge of the common prior. Moreover, and in contrast to the original BTS, our mechanism is numerically robust and ex post individually rational.