 Agents


Learning Generalized Models by Interrogating Black-Box Autonomous Agents

arXiv.org Artificial Intelligence

This paper develops a new approach for estimating the internal model of an autonomous agent that can plan and act, by interrogating it. In this approach, the user may ask an autonomous agent a series of questions, which the agent answers truthfully. Our main contribution is an algorithm that generates an interrogation policy in the form of a sequence of questions to be posed to the agent. Answers to these questions are used to derive a minimal, functionally indistinguishable class of agent models. While asking questions exhaustively for every aspect of the model can be infeasible even for small models, our approach generates questions in a hierarchical fashion to eliminate large classes of models that are inconsistent with the agent. Empirical evaluation of our approach shows that for a class of agents that may use arbitrary black-box transition systems for planning, our approach correctly and efficiently computes STRIPS-like agent models through this interrogation process.


SensAI+Expanse Adaptation on Human Behaviour Towards Emotional Valence Prediction

arXiv.org Artificial Intelligence

Leonel Garcia-Marques, CICPSI, Faculdade de Psicologia, Universidade de Lisboa, Portugal (garcia_marques@sapo.pt). Abstract: An agent, artificial or human, must continuously adjust its behaviour in order to thrive in a more or less demanding environment. An artificial agent able to predict human emotional valence in a geospatial and temporal context requires proper adaptation to its mobile device environment, which imposes strict restrictions on resource consumption (e.g., battery power). The developed distributed system includes a mobile device embodied agent (SensAI) plus Cloud-expanded (Expanse) cognition and memory resources. The system is designed with several adaptive mechanisms in a best effort for the agent to cope with its interacting humans and to be resilient in collecting data for machine learning towards prediction. These mechanisms encompass homeostatic-like adjustments such as automatically recovering from an unexpected failure in the mobile device, forgetting repeated data to save local memory, adjusting actions to a proper moment (e.g., notifying only when the human is interacting), and automatically adjusting the parameters of the complementary learning algorithms in Expanse. Regarding emotional valence prediction performance, a comparison study between state-of-the-art algorithms found Extreme Gradient Boosting to be, on average, the best model for prediction, with efficient energy use and explainability through feature importance inspection. Therefore, this work contributes a smartphone sensing-based system, distributed in the Cloud, robust to unexpected behaviours from humans and the environment, and able to predict emotional valence states with very good performance. Introduction: The scientific evidence of epigenetics reveals on/off mechanisms inside the chromosomes of human agents and reinforces the importance of any entity's continuous adaptation to its environment.


A Survey of Deep Reinforcement Learning in Video Games

arXiv.org Artificial Intelligence

Deep reinforcement learning (DRL) has made great achievements since it was proposed. Generally, DRL agents receive high-dimensional inputs at each step and take actions according to deep-neural-network-based policies. This learning mechanism updates the policy to maximize the return in an end-to-end manner. In this paper, we survey the progress of DRL methods, including value-based, policy gradient, and model-based algorithms, and compare their main techniques and properties. DRL also plays an important role in game artificial intelligence (AI). We review the achievements of DRL in various video games, including classical Arcade games, first-person perspective games and multi-agent real-time strategy games, from 2D to 3D, and from single-agent to multi-agent. A large number of video game AIs with DRL have achieved super-human performance, but some challenges remain in this domain. Therefore, we also discuss some key points when applying DRL methods to this field, including exploration-exploitation, sample efficiency, generalization and transfer, multi-agent learning, imperfect information, and delayed sparse rewards, as well as some research directions.


A Logical Model for Supporting Social Commonsense Knowledge Acquisition

arXiv.org Artificial Intelligence

To make machines exhibit human-like abilities in domains like robotics and conversation, social commonsense knowledge (SCK), i.e., common sense about social contexts and social roles, is absolutely necessary. Therefore, our ultimate goal is to acquire large-scale SCK to support much more intelligent applications. Before that, we need to know clearly what SCK is and how to represent it, since automatic information processing requires that data and knowledge be organized in structured and semantically related ways. For this reason, in this paper, we identify and formalize three basic types of SCK based on first-order theory. First, we identify and formalize the interrelationships, such as having-role and having-social_relation, among social contexts, roles and players, treating both contexts and roles as first-order citizens and not generating role instances. Second, we provide a four-level structure to identify and formalize the intrinsic information, such as events and desires, of social contexts, roles and players, and illustrate how to harvest the intrinsic information of social contexts and roles from the behaviour players exhibit in concrete contexts. Third, motivated by some observations of actual contexts, we further introduce and formalize the embedding of social contexts, and describe how to extract the intrinsic information of social contexts and roles from the embedded smaller and simpler contexts. The results of this paper lay the foundation not only for formalizing much more complex SCK but also for acquiring these three basic types of SCK.


The Temporal Dynamics of Belief-based Updating of Epistemic Trust: Light at the End of the Tunnel?

arXiv.org Artificial Intelligence

We start with the distinction between outcome-based and belief-based Bayesian models of the sequential update of agents' beliefs and of the subjective reliability of sources (trust). We then focus on discussing the influential Bayesian model of belief-based trust update by Eric Olsson, which models dichotomic events and explicitly represents anti-reliability. After sketching some disastrous recent results for this perhaps most promising model of belief update, we show new simulation results for the temporal dynamics of belief learning with and without trust update and with and without communication. The results seem to shed at least a somewhat more positive light on the communicating-and-trust-updating agents. This may be a light at the end of the tunnel of belief-based models of trust updating, but the interpretation of the clear findings is much less clear.


Bidding in Spades

arXiv.org Artificial Intelligence

We present a Spades bidding algorithm that is superior to recreational human players and to publicly available bots. As in Bridge, the game of Spades is composed of two independent phases, bidding and playing. This paper focuses on the bidding algorithm, since this phase holds a precise challenge: based on the input, choose the bid that maximizes the agent's winning probability. Our Bidding-in-Spades (BIS) algorithm heuristically determines the bidding strategy by comparing the expected utility of each possible bid. A major challenge is how to estimate these expected utilities. To this end, we propose a set of domain-specific heuristics, and then correct them via machine learning using data from real-world players. The BIS algorithm we present can be attached to any playing algorithm. It beats rule-based bidding bots when all use the same playing component. When combined with a rule-based playing algorithm, it is superior to the average recreational human.
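The idea of comparing the expected utility of each possible bid can be sketched as follows. This is only an illustration, not the BIS algorithm itself: the scoring function is a simplified Spades score (10 points per bid trick plus 1 per overtrick when the bid is made, minus 10 per bid trick otherwise), and the names `score`, `expected_utility`, `best_bid` and the trick distribution `trick_probs` are all hypothetical.

```python
def score(bid, tricks):
    """Simplified Spades scoring (an assumption, not taken from the paper)."""
    if tricks >= bid:
        # Bid made: 10 points per bid trick plus 1 point per overtrick.
        return 10 * bid + (tricks - bid)
    # Bid failed: lose 10 points per bid trick.
    return -10 * bid


def expected_utility(bid, trick_probs):
    """Expected score of a bid under a distribution over tricks taken."""
    return sum(p * score(bid, t) for t, p in trick_probs.items())


def best_bid(trick_probs, max_bid=13):
    """Return the bid in 1..max_bid with the highest expected score."""
    return max(range(1, max_bid + 1),
               key=lambda b: expected_utility(b, trick_probs))


# Hypothetical estimate: probability of taking 2, 3 or 4 tricks.
trick_probs = {2: 0.2, 3: 0.5, 4: 0.3}
print(best_bid(trick_probs))
```

In this toy distribution the most likely trick count is 3, yet bidding 2 has the higher expected score because a failed bid is penalized heavily. Estimating the distribution itself is the hard part that the abstract attributes to heuristics corrected by machine learning.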


Data-driven Discovery of Emergent Behaviors in Collective Dynamics

arXiv.org Machine Learning

Particle- and agent-based systems are a ubiquitous modeling tool in many disciplines. We consider the fundamental problem of inferring interaction kernels from observed trajectories of agent-based dynamical systems, in particular collective dynamical systems exhibiting emergent behaviors with complicated interaction kernels, both in a nonparametric fashion and for kernels that are parametrized by a single unknown parameter. We extend the estimators introduced in \cite{PNASLU}, which are based on suitably regularized least squares estimators, to these larger classes of systems. We provide extensive numerical evidence that the estimators provide faithful approximations to the interaction kernels, and provide accurate predictions for trajectories started at new initial conditions, both throughout the "training" time interval in which the observations were made, and often much beyond. We demonstrate these features on prototypical systems displaying collective behaviors, including opinion dynamics, flocking dynamics, self-propelling particle dynamics, synchronized oscillator dynamics, and a gravitational system. Our experiments also suggest that our estimated systems can display the same emergent behaviors as the observed systems, including behaviors that occur at larger timescales than those used in the training data. Finally, in the case of families of systems governed by a parameterized family of interaction kernels, we introduce novel estimators that estimate the parameterized family of kernels, splitting it into a common interaction kernel and the action of parameters. We demonstrate this in the case of gravity, by learning both the "common component" $1/r^2$ and the dependency on mass, without any a priori knowledge of either one, from observations of planetary motions in our solar system.
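For the single-parameter case, a least squares fit has a closed form, which the following minimal sketch illustrates. It rests on several assumptions that are not in the abstract: a one-dimensional first-order model in which agent i moves with velocity (1/N) * sum_j theta * psi(|x_j - x_i|) * (x_j - x_i), a known kernel shape `psi`, noise-free velocity observations, and no regularization. All function and variable names are hypothetical.

```python
import random


def unit_velocity(x, psi):
    """Velocity of each agent when the unknown scale theta equals 1."""
    n = len(x)
    return [sum(psi(abs(xj - xi)) * (xj - xi) for xj in x) / n for xi in x]


def estimate_theta(snapshots, observed, psi):
    """Closed-form least squares for the model v = theta * g.

    Minimizing sum (v - theta * g)^2 gives theta = sum(v * g) / sum(g * g).
    """
    num = den = 0.0
    for x, v in zip(snapshots, observed):
        g = unit_velocity(x, psi)
        num += sum(vi * gi for vi, gi in zip(v, g))
        den += sum(gi * gi for gi in g)
    return num / den


def psi(r):
    """Hypothetical known kernel shape."""
    return 1.0 / (1.0 + r * r)


theta_true = 2.5  # hypothetical ground truth used to synthesize the data
rng = random.Random(0)
# 50 snapshots of 5 agents with positions drawn uniformly from [0, 2].
snapshots = [[rng.uniform(0.0, 2.0) for _ in range(5)] for _ in range(50)]
# Noise-free velocities generated from the assumed model.
observed = [[theta_true * g for g in unit_velocity(x, psi)] for x in snapshots]

theta_hat = estimate_theta(snapshots, observed, psi)
print(theta_hat)
```

Because the synthetic data are noise-free and generated by the same model, the estimate matches `theta_true` up to floating-point error. The nonparametric setting described in the abstract replaces the scalar `theta` with coefficients over a basis for the kernel and adds regularization.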


Optimizing Collision Avoidance in Dense Airspace using Deep Reinforcement Learning

arXiv.org Artificial Intelligence

New methodologies will be needed to ensure the airspace remains safe and efficient as traffic densities rise to accommodate new unmanned operations. This paper explores how unmanned free-flight traffic may operate in dense airspace. We develop and analyze autonomous collision avoidance systems for aircraft operating in dense airspace where traditional collision avoidance systems fail. We propose a metric for quantifying the decision burden on a collision avoidance system as well as a metric for measuring the impact of the collision avoidance system on airspace. We use deep reinforcement learning to compute corrections for an existing collision avoidance approach to account for dense airspace. The results show that a corrected collision avoidance system can operate more efficiently than traditional methods in dense airspace while maintaining high levels of safety.


The Blockchain Game: Synthesis of Byzantine Systems and Nash Equilibria

arXiv.org Artificial Intelligence

This position paper presents a synthesis viewpoint of blockchains from two orthogonal perspectives: fault-tolerant distributed systems and game theory. Specifically, we formulate a new game-theoretical problem in the context of blockchains and sketch a closed-form Nash equilibrium to the problem. Blockchains have drawn much research interest, way beyond their first realization, Bitcoin [3], a cryptocurrency application built upon blockchains. From a systems perspective, various facets, especially performance and scalability, have been intensively studied by multiple computer systems communities including but not limited to: computer security [7], distributed systems [11], and database systems [9]. Works on the theoretical foundation of blockchains are, however, comparatively limited, and mostly in the cryptocurrency context [6], [8], [10], usually in a permissionless setup where nodes are free to join or leave the blockchain network. In permissioned blockchains such as Hyperledger Fabric [2], where Practical Byzantine Fault-Tolerance [4] (PBFT) is the de facto consensus protocol, much work has focused on PBFT and its variants without in-depth reasoning on the node's (or user's) rationality: analyses simply assume that a node is either faulty or non-faulty.


Strategic Abstention based on Preference Extensions: Positive Results and Computer-Generated Impossibilities

Journal of Artificial Intelligence Research

Voting rules allow multiple agents to aggregate their preferences in order to reach joint decisions. A common flaw of some voting rules, known as the no-show paradox, is that agents may obtain a more preferred outcome by abstaining from an election. We study strategic abstention for set-valued voting rules based on Kelly's and Fishburn's preference extensions. Our contribution is twofold. First, we show that, whenever there are at least five alternatives and seven agents, every Pareto-optimal majoritarian voting rule suffers from the no-show paradox with respect to Fishburn's extension. This is achieved by reducing the statement to a finite - yet very large - problem, which is encoded as a formula in propositional logic and then shown to be unsatisfiable by a SAT solver. We also provide a human-readable proof which we extracted from a minimal unsatisfiable core of the formula. Second, we prove that no voting rule that satisfies two natural conditions can be manipulated by strategic abstention with respect to Kelly's extension, and we give examples of well-known Pareto-optimal majoritarian voting rules that meet these requirements.
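The no-show paradox itself can be illustrated with a small, self-contained example. The sketch below uses plurality with runoff, a single-winner rule chosen only because it makes the paradox easy to see; it is not one of the set-valued majoritarian rules studied in the paper, and the preference profile and the function name `plurality_runoff` are hypothetical. Ties are not handled carefully, since none occur in this profile.

```python
from collections import Counter


def plurality_runoff(ballots):
    """Winner under plurality with runoff.

    ballots: list of rankings (tuples of candidates, best first).
    The two candidates with the most first-place votes advance, and the
    one preferred by a strict majority of all voters wins the runoff.
    """
    first_places = Counter(ranking[0] for ranking in ballots)
    a, b = [cand for cand, _ in first_places.most_common(2)]
    a_votes = sum(1 for r in ballots if r.index(a) < r.index(b))
    return a if a_votes > len(ballots) - a_votes else b


# Hypothetical profile: 6 voters A>B>C, 5 voters B>C>A, 8 voters C>B>A.
full = ([("A", "B", "C")] * 6
        + [("B", "C", "A")] * 5
        + [("C", "B", "A")] * 8)

# The same profile after two of the A>B>C voters abstain.
reduced = ([("A", "B", "C")] * 4
           + [("B", "C", "A")] * 5
           + [("C", "B", "A")] * 8)

print(plurality_runoff(full), plurality_runoff(reduced))
```

With everyone voting, B is eliminated in the first round and C wins the runoff, which is the worst outcome for the A>B>C voters. If two of them stay home, A is eliminated instead and B beats C in the runoff, so the abstainers obtain their second choice rather than their last.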