Goto

Collaborating Authors

 Agents


Resolving Conflicts in Clinical Guidelines using Argumentation

arXiv.org Artificial Intelligence

Automatically reasoning with conflicting generic clinical guidelines is a burning issue in patient-centric medical reasoning where patient-specific conditions and goals need to be taken into account. It is even more challenging in the presence of preferences such as patient's wishes and clinician's priorities over goals. We advance a structured argumentation formalism for reasoning with conflicting clinical guidelines, patient-specific information and preferences. Our formalism integrates assumption-based reasoning and goal-driven selection among reasoning outcomes. Specifically, we assume applicability of guideline recommendations concerning the generic goal of patient well-being, resolve conflicts among recommendations using patient's conditions and preferences, and then consider prioritised patient-centered goals to yield non-conflicting, goal-maximising and preference-respecting recommendations. We rely on the state-of-the-art Transition-based Medical Recommendation model for representing guideline recommendations and augment it with context given by the patient's conditions, goals, as well as preferences over recommendations and goals. We establish desirable properties of our approach in terms of sensitivity to recommendation conflicts and patient context.


Artificial intelligence -- and a few jokes -- will help keep future Mars crews sane

#artificialintelligence

When the first human explorers head for Mars, they're likely to have a non-human judging their performance and tweaking their interpersonal relationships when necessary. NASA and outside researchers are already working on artificial intelligence agents to monitor how future long-duration space crews interact, sort of like the holographic doctor on "Star Trek: Voyager." But there'll also be a need for the human touch -- in the form of crew members who could serve the roles of social directors or easygoing jokesters. That's the upshot of research initiatives discussed over the weekend here at the annual meeting of the American Association for the Advancement of Science. Using AI to assess astronauts' mental state is the focus of a NASA program known as Human Capabilities Assessments for Autonomous Missions, or H-CAAM, said Tom Williams, a researcher at NASA's Johnson Space Center who concentrates on human factors and performance for the space agency's Human Research Program.


Measuring Compositionality in Representation Learning

arXiv.org Machine Learning

Many machine learning algorithms represent input data with vector embeddings or discrete codes. When inputs exhibit compositional structure (e.g. objects built from parts or procedures from subroutines), it is natural to ask whether this compositional structure is reflected in the the inputs' learned representations. While the assessment of compositionality in languages has received significant attention in linguistics and adjacent fields, the machine learning literature lacks general-purpose tools for producing graded measurements of compositional structure in more general (e.g. vector-valued) representation spaces. We describe a procedure for evaluating compositionality by measuring how well the true representation-producing model can be approximated by a model that explicitly composes a collection of inferred representational primitives. We use the procedure to provide formal and empirical characterizations of compositional structure in a variety of settings, exploring the relationship between compositionality and learning dynamics, human judgments, representational similarity, and generalization.


Level-0 Models for Predicting Human Behavior in Games

Journal of Artificial Intelligence Research

Behavioral game theory seeks to describe the way actual people (as compared to idealized, "rational" agents) act in strategic situations. Our own recent work has identified iterative models, such as quantal cognitive hierarchy, as the state of the art for predicting human play in unrepeated, simultaneous-move games. Iterative models predict that agents reason iteratively about their opponents, building up from a specification of nonstrategic behavior called level-0. A modeler is in principle free to choose any description of level-0 behavior that makes sense for a given setting. However, in practice almost all existing work specifies this behavior as a uniform distribution over actions. In most games it is not plausible that even nonstrategic agents would choose an action uniformly at random, nor that other agents would expect them to do so. A more accurate model for level-0 behavior has the potential to dramatically improve predictions of human behavior, since a substantial fraction of agents may play level-0 strategies directly, and furthermore since iterative models ground all higher-level strategies in responses to the level-0 strategy. Our work considers models of the way in which level-0 agents construct a probability distribution over actions, given an arbitrary game. We considered a large space of alternatives and, in the end, recommend a model that achieved excellent performance across the board: a linear weighting of four binary features, each of which is general in the sense that it can be computed from any normal form game. Adding real-valued variants of the same four features yielded further improvements in performance, albeit with a corresponding increase in the number of parameters needing to be estimated. We evaluated the effects of combining these new level-0 models with several iterative models and observed large improvements in predictive accuracy.


On Voting Strategies and Emergent Communication

arXiv.org Artificial Intelligence

Humans use language to collectively execute complex strategies in addition to using it as a referential tool for referring to physical entities. While existing approaches that study the emergence of language in settings where the language mainly acts as a referential tool, in this paper, we study the role of emergent languages in discovering and implementing strategies in a multi-agent setting. The agents in our setup are connected via a network and are allowed to exchange messages in the form of sequences of discrete symbols. We formulate the problem as a voting game, where two candidate agents are contesting in an election and their goal is to convince the population members (other agents) in the network to vote for them by sending them messages. We use neural networks to parameterize the policies followed by agents in the game. We investigate the effect of choosing different training objectives and strategies for agents in the game and make observations about the emergent language in each case. To the best of our knowledge this is the first work that explores emergence of language for discovering and implementing strategies in a setting where agents are connected via an underlying network.


Accelerated Gossip in Networks of Given Dimension using Jacobi Polynomial Iterations

arXiv.org Machine Learning

Consider a network of agents connected by communication links, where each agent holds a real value. The gossip problem consists in estimating the average of the values diffused in the network in a distributed manner. We develop a method solving the gossip problem that depends only on the spectral dimension of the network, that is, in the communication network set-up, the dimension of the space in which the agents live. This contrasts with previous work that required the spectral gap of the network as a parameter, or suffered from slow mixing. Our method shows an important improvement over existing algorithms in the non-asymptotic regime, i.e., when the values are far from being fully mixed in the network. Our approach stems from a polynomial-based point of view on gossip algorithms, as well as an approximation of the spectral measure of the graphs with a Jacobi measure. We show the power of the approach with simulations on various graphs, and with performance guarantees on graphs of known spectral dimension, such as grids and random percolation bonds. An extension of this work to distributed Laplacian solvers is discussed. As a side result, we also use the polynomial-based point of view to show the convergence of the message passing algorithm for gossip of Moallemi \& Van Roy on regular graphs. The explicit computation of the rate of the convergence shows that message passing has a slow rate of convergence on graphs with small spectral gap.


Message-Dropout: An Efficient Training Method for Multi-Agent Deep Reinforcement Learning

arXiv.org Artificial Intelligence

In this paper, we propose a new learning technique named message-dropout to improve the performance for multi-agent deep reinforcement learning under two application scenarios: 1) classical multi-agent reinforcement learning with direct message communication among agents and 2) centralized training with decentralized execution. In the first application scenario of multi-agent systems in which direct message communication among agents is allowed, the message-dropout technique drops out the received messages from other agents in a block-wise manner with a certain probability in the training phase and compensates for this effect by multiplying the weights of the dropped-out block units with a correction probability. The applied message-dropout technique effectively handles the increased input dimension in multi-agent reinforcement learning with communication and makes learning robust against communication errors in the execution phase. In the second application scenario of centralized training with decentralized execution, we particularly consider the application of the proposed message-dropout to Multi-Agent Deep Deterministic Policy Gradient (MADDPG), which uses a centralized critic to train a decentralized actor for each agent. We evaluate the proposed message-dropout technique for several games, and numerical results show that the proposed message-dropout technique with proper dropout rate improves the reinforcement learning performance significantly in terms of the training speed and the steady-state performance in the execution phase.


Limited Lookahead in Imperfect-Information Games

arXiv.org Artificial Intelligence

Limited lookahead has been studied for decades in complete-information games. We initiate a new direction via two simultaneous deviation points: generalization to incomplete-information games and a game-theoretic approach. We study how one should act when facing an opponent whose lookahead is limited. We study this for opponents that differ based on their lookahead depth, based on whether they, too, have incomplete information, and based on how they break ties. We characterize the hardness of finding a Nash equilibrium or an optimal commitment strategy for either player, showing that in some of these variations the problem can be solved in polynomial time while in others it is PPAD-hard or NP-hard. We proceed to design algorithms for computing optimal commitment strategies---for when the opponent breaks ties favorably, according to a fixed rule, or adversarially. We then experimentally investigate the impact of limited lookahead. The limited-lookahead player often obtains the value of the game if she knows the expected values of nodes in the game tree for some equilibrium---but we prove this is not sufficient in general. Finally, we study the impact of noise in those estimates and different lookahead depths. This uncovers an incomplete-information game lookahead pathology.


Merkel Calls Russia a Partner, Urges Global Cooperation

U.S. News

"We are proud of our cars and so we should be," Merkel said, adding, however, that many were built in the United States and exported to China. "If that is viewed as a security threat to the United States, then we are shocked," she told the Munich Security Conference to applause from the audience.


Competitive Experience Replay

arXiv.org Machine Learning

Deep learning has achieved remarkable successes in solving challenging reinforcement learning (RL) problems when dense reward function is provided. However, in sparse reward environment it still often suffers from the need to carefully shape reward function to guide policy optimization. This limits the applicability of RL in the real world since both reinforcement learning and domain-specific knowledge are required. It is therefore of great practical importance to develop algorithms which can learn from a binary signal indicating successful task completion or other unshaped, sparse reward signals. We propose a novel method called competitive experience replay, which efficiently supplements a sparse reward by placing learning in the context of an exploration competition between a pair of agents. Our method complements the recently proposed hindsight experience replay (HER) by inducing an automatic exploratory curriculum. We evaluate our approach on the tasks of reaching various goal locations in an ant maze and manipulating objects with a robotic arm. Each task provides only binary rewards indicating whether or not the goal is achieved. Our method asymmetrically augments these sparse rewards for a pair of agents each learning the same task, creating a competitive game designed to drive exploration. Extensive experiments demonstrate that this method leads to faster converge and improved task performance.