Europe
Policy Invariance under Reward Transformations for General-Sum Stochastic Games
Lu, X., Schwartz, H. M., Givigi, S. N.
We extend the potential-based shaping method from Markov decision processes to multi-player general-sum stochastic games. We prove that the Nash equilibria in a stochastic game remains unchanged after potential-based shaping is applied to the environment. The property of policy invariance provides a possible way of speeding convergence when learning to play a stochastic game.
Technical Note: Towards ROC Curves in Cost Space
Hernández-Orallo, José, Flach, Peter, Ferri, Cèsar
ROC curves and cost curves are two popular ways of visualising classifier performance, finding appropriate thresholds according to the operating condition, and deriving useful aggregated measures such as the area under the ROC curve (AUC) or the area under the optimal cost curve. In this note we present some new findings and connections between ROC space and cost space, by using the expected loss over a range of operating conditions. In particular, we show that ROC curves can be transferred to cost space by means of a very natural way of understanding how thresholds should be chosen, by selecting the threshold such that the proportion of positive predictions equals the operating condition (either in the form of cost proportion or skew). We call these new curves {ROC Cost Curves}, and we demonstrate that the expected loss as measured by the area under these curves is linearly related to AUC. This opens up a series of new possibilities and clarifies the notion of cost curve and its relation to ROC analysis. In addition, we show that for a classifier that assigns the scores in an evenly-spaced way, these curves are equal to the Brier Curves. As a result, this establishes the first clear connection between AUC and the Brier score.
Information, Utility & Bounded Rationality
Ortega, Pedro A., Braun, Daniel A.
Perfectly rational decision-makers maximize expected utility, but crucially ignore the resource costs incurred when determining optimal actions. Here we propose an axiomatic framework for bounded rational decision-making based on a thermodynamic interpretation of resource costs as information costs. We show that this axiomatic framework enforces a unique conversion law between utility and information, which can be characterized by a variational "free utility" principle akin to thermodynamical free energy. This variational principle constitutes a normative criterion that trades off utility and information costs, the latter measured by the Kullback-Leibler deviation between a distribution representing a desired policy and a reference distribution representing an initial default policy. We show that bounded optimal control solutions can be derived from this variational principle, which leads in general to stochastic policies. Furthermore, we show that risk-sensitive and robust (minimax) control schemes fall out naturally from this framework if the environment is considered as an adversarial opponent. When resource costs are ignored, the maximum expected utility principle is recovered.
On the Undecidability of Fuzzy Description Logics with GCIs with Lukasiewicz t-norm
Cerami, Marco, Straccia, Umberto
Recently there have been some unexpected results concerning Fuzzy Description Logics (FDLs) with General Concept Inclusions (GCIs). They show that, unlike the classical case, the DL ALC with GCIs does not have the finite model property under Lukasiewicz Logic or Product Logic and, specifically, knowledge base satisfiability is an undecidable problem for Product Logic. We complete here the analysis by showing that knowledge base satisfiability is also an undecidable problem for Lukasiewicz Logic.
HyFlex: A Benchmark Framework for Cross-domain Heuristic Search
Burke, Edmund, Curtois, Tim, Hyde, Matthew, Ochoa, Gabriela, Vazquez-Rodriguez, Jose A.
Automating the design of heuristic search methods is an active research field within computer science, artificial intelligence and operational research. In order to make these methods more generally applicable, it is important to eliminate or reduce the role of the human expert in the process of designing an effective methodology to solve a given computational search problem. Researchers developing such methodologies are often constrained on the number of problem domains on which to test their adaptive, self-configuring algorithms; which can be explained by the inherent difficulty of implementing their corresponding domain specific software components. This paper presents HyFlex, a software framework for the development of cross-domain search methodologies. The framework features a common software interface for dealing with different combinatorial optimisation problems, and provides the algorithm components that are problem specific. In this way, the algorithm designer does not require a detailed knowledge the problem domains, and thus can concentrate his/her efforts in designing adaptive general-purpose heuristic search algorithms. Four hard combinatorial problems are fully implemented (maximum satisfiability, one dimensional bin packing, permutation flow shop and personnel scheduling), each containing a varied set of instance data (including real-world industrial applications) and an extensive set of problem specific heuristics and search operators. The framework forms the basis for the first International Cross-domain Heuristic Search Challenge (CHeSC), and it is currently in use by the international research community. In summary, HyFlex represents a valuable new benchmark of heuristic search generality, with which adaptive cross-domain algorithms are being easily developed, and reliably compared.
Robustness of Anytime Bandit Policies
Salomon, Antoine, Audibert, Jean-Yves
This paper studies the deviations of the regret in a stochastic multi-armed bandit problem. When the total number of plays n is known beforehand by the agent, Audibert et al. (2009) exhibit a policy such that with probability at least 1-1/n, the regret of the policy is of order log(n). They have also shown that such a property is not shared by the popular ucb1 policy of Auer et al. (2002). This work first answers an open question: it extends this negative result to any anytime policy. The second contribution of this paper is to design anytime robust policies for specific multi-armed bandit problems in which some restrictions are put on the set of possible distributions of the different arms.
A Short Introduction to Preferences: Between AI and Social Choice
Rossi, Francesca, Venable, Kristen Brent, Walsh, Toby
Computational social choice is an expanding field that merges classical topics like economics and voting theory with more modern topics like artificial intelligence, multiagent systems, and computational complexity. This book provides a concise introduction to the main research lines in this field, covering aspects such as preference modelling, uncertainty reasoning, social choice, stable matching, and computational aspects of preference aggregation and manipulation. The book is centered around the notion of preference reasoning, both in the single-agent and the multi-agent setting. It presents the main approaches to modeling and reasoning with preferences, with particular attention to two popular and powerful formalisms, soft constraints and CP-nets. The authors consider preference elicitation and various forms of uncertainty in soft constraints.
Towards Spatial Methods for Socially Assistive Robotics: Validation with Children with Autism Spectrum Disorders
Feil-Seifer, David (University of Southern California)
Socially Assistive Robotics (SAR) defines the research regarding robots which provide assistance to users through social interaction. Socially assistive robots are being studied for therapeutic use with children with autism spectrum disorders (ASD). It has been observed that children with ASD interact with robots differently than with people or toys. This may indicate an intrinsic interest in such machines, which could be applied as a robot augmentation for an intervention for children with ASD. Preliminary studies suggest that robots may act as intrinsically-rewarding social partners for children with autism. However, enabling a robot to understand social behavior, and do so while interacting with the child, is a challenging problem. Children are highly individual and thus technology used for social interaction requires recognition of a wide-range of social behavior. This work addresses the challenge of designing behaviors for socially assistive robots in order to enable them to recognize and appropriately respond to a child’s free-form behavior in unstructured play contexts. The focus on free-form behavior is inspired by and grounded in existing approaches to therapeutic intervention with children with ASD. This model emphasizes creating circles of communication and fostering engagement through play. A key aspect of this approach is to recognize social behavior and use “engagements” to bolster social interaction behavior, and to study the ethical implications of therapeutic robotics applications.
Norm Compliance of Rule-Based Cognitive Agents
Rotolo, Antonino (University of Bologna)
Deliberation itself can be a computationally costly process and requires This paper shows how belief revision techniques an appropriate intention reconsideration policy which can be used in Defeasible Logic to change rulebased helps the agent to deliberate only when necessary. In this picture, theories characterizing the deliberation process it is still overlooked the problem of changing intentions of cognitive agents. We discuss intention reconsideration not because of the change of beliefs, but because the normative as a strategy to make agents compliant constraints require to do so.
Unsupervised Lexicon Acquisition for HPSG-Based Relation Extraction
Rozenfeld, Benjamin (Digital Trowel) | Feldman, Ronen (Hebrew University of Jerusalem)
The paper describes a method of relation extraction, which is based on parsing the input text using a combination of a generic HPSG-based grammar and a highly focused domain- and relation-specific lexicon. We also show a method of unsupervised acquisition of such a lexicon from a large unlabeled corpus. Together, the methods introduce a novel approach to the “Open IE” task, which is superior in accuracy and in quality of relation identification to the existing approaches.