As transparency becomes key for robotics and AI, it will be necessary to evaluate the methods through which transparency is provided, including automatically generated natural language (NL) explanations. Here, we explore parallels between the generation of such explanations and the much-studied field of evaluation of Natural Language Generation (NLG). Specifically, we investigate which of the NLG evaluation measures map well to explanations. We present the ExBAN corpus: a crowd-sourced corpus of NL explanations for Bayesian Networks. We run correlations comparing human subjective ratings with NLG automatic measures. We find that embedding-based automatic NLG evaluation methods, such as BERTScore and BLEURT, have a higher correlation with human ratings, compared to word-overlap metrics, such as BLEU and ROUGE. This work has implications for Explainable AI and transparent robotic and autonomous systems.
The development of recommender systems that optimize multi-turn interaction with users, and model the interactions of different agents (e.g., users, content providers, vendors) in the recommender ecosystem have drawn increasing attention in recent years. Developing and training models and algorithms for such recommenders can be especially difficult using static datasets, which often fail to offer the types of counterfactual predictions needed to evaluate policies over extended horizons. To address this, we develop RecSim NG, a probabilistic platform for the simulation of multi-agent recommender systems. RecSim NG is a scalable, modular, differentiable simulator implemented in Edward2 and TensorFlow. It offers: a powerful, general probabilistic programming language for agent-behavior specification; tools for probabilistic inference and latent-variable model learning, backed by automatic differentiation and tracing; and a TensorFlow-based runtime for running simulations on accelerated hardware. We describe RecSim NG and illustrate how it can be used to create transparent, configurable, end-to-end models of a recommender ecosystem, complemented by a small set of simple use cases that demonstrate how RecSim NG can help both researchers and practitioners easily develop and train novel algorithms for recommender systems.
This figure, reproduced in figure 1A is known by practitioners to provide a qualitative understanding of the overlapping domains of the respective methodologies, implying a continuum of models that offer various strengths and weaknesses for addressing specify types of materials modeling problems. While open to some interpretation, figures such as these are meant to illustrate a broad range of phenomena that can be captured, with each method achieving its own balance between ab initio calculation and empirical modeling. Indeed, our early work modeling of more complex phenomena within lattice-based kinetic Monte Carlo (KMC) atomistic simulations (figure 1B) is one example of attaining such a balance through empirically designed liquid local neighborhoods calibrated alongside analytical continuum models.
Abstract--Shared mental models are critical to team success; however, in practice, team members may have misaligned models due to a variety of factors. In safety-critical domains (e.g., aviation, healthcare), lack of shared mental models can lead to preventable errors and harm. Towards the goal of mitigating such preventable errors, here, we present a Bayesian approach to infer misalignment in team members' mental models during complex healthcare task execution. As an exemplary application, we demonstrate our approach using two simulated team-based scenarios, derived from actual teamwork in cardiac surgery. In these simulated experiments, our approach inferred model misalignment with over 75% recall, thereby providing a building block for enabling computer-assisted interventions to augment human cognition in the operating room and improve teamwork.
Multi-agent influence diagrams (MAIDs) are a popular form of Previous work on MAIDs has focussed on Nash equilibria as graphical model that, for certain classes of games, have been shown the core solution concept . Whilst this is arguably the most important to offer key complexity and explainability advantages over traditional solution concept in non-cooperative game theory, if there extensive form game (EFG) representations. In this paper, we are many Nash equilibria we often wish to remove some of those extend previous work on MAIDs by introducing the concept of a that are less'rational'. Many refinements to the Nash equilibrium MAID subgame, as well as subgame perfect and trembling hand have been proposed , with two of the most important being perfect equilibrium refinements. We then prove several equivalence subgame perfect Nash equilibria  and trembling hand perfect results between MAIDs and EFGs. Finally, we describe an open equilibria . The first rules out'non-credible' threats and the second source implementation for reasoning about MAIDs and computing requires that each player is still playing a best-response when their equilibria.
Wind farms are a crucial driver toward the generation of ecological and renewable energy. Due to their rapid increase in capacity, contemporary wind farms need to adhere to strict constraints on power output to ensure stability of the electricity grid. Specifically, a wind farm controller is required to match the farm's power production with a power demand imposed by the grid operator. This is a non-trivial optimization problem, as complex dependencies exist between the wind turbines. State-of-the-art wind farm control typically relies on physics-based heuristics that fail to capture the full load spectrum that defines a turbine's health status. When this is not taken into account, the long-term viability of the farm's turbines is put at risk. Given the complex dependencies that determine a turbine's lifetime, learning a flexible and optimal control strategy requires a data-driven approach. However, as wind farms are large-scale multi-agent systems, optimizing control strategies over the full joint action space is intractable. We propose a new learning method for wind farm control that leverages the sparse wind farm structure to factorize the optimization problem. Using a Bayesian approach, based on multi-agent Thompson sampling, we explore the factored joint action space for configurations that match the demand, while considering the lifetime of turbines. We apply our method to a grid-like wind farm layout, and evaluate configurations using a state-of-the-art wind flow simulator. Our results are competitive with a physics-based heuristic approach in terms of demand error, while, contrary to the heuristic, our method prolongs the lifetime of high-risk turbines.
This paper studies the problem of distributed classification with a network of heterogeneous agents. The agents seek to jointly identify the underlying target class that best describes a sequence of observations. The problem is first abstracted to a hypothesis-testing framework, where we assume that the agents seek to agree on the hypothesis (target class) that best matches the distribution of observations. Non-Bayesian social learning theory provides a framework that solves this problem in an efficient manner by allowing the agents to sequentially communicate and update their beliefs for each hypothesis over the network. Most existing approaches assume that agents have access to exact statistical models for each hypothesis. However, in many practical applications, agents learn the likelihood models based on limited data, which induces uncertainty in the likelihood function parameters. In this work, we build upon the concept of uncertain models to incorporate the agents' uncertainty in the likelihoods by identifying a broad set of parametric distribution that allows the agents' beliefs to converge to the same result as a centralized approach. Furthermore, we empirically explore extensions to non-parametric models to provide a generalized framework of uncertain models in non-Bayesian social learning.
We study a decentralized cooperative multi-agent multi-armed bandit problem with $K$ arms and $N$ agents connected over a network. In our model, each arm's reward distribution is same for all agents, and rewards are drawn independently across agents and over time steps. In each round, agents choose an arm to play and subsequently send a message to their neighbors. The goal is to minimize cumulative regret averaged over the entire network. We propose a decentralized Bayesian multi-armed bandit framework that extends single-agent Bayesian bandit algorithms to the decentralized setting. Specifically, we study an information assimilation algorithm that can be combined with existing Bayesian algorithms, and using this, we propose a decentralized Thompson Sampling algorithm and decentralized Bayes-UCB algorithm. We analyze the decentralized Thompson Sampling algorithm under Bernoulli rewards and establish a problem-dependent upper bound on the cumulative regret. We show that regret incurred scales logarithmically over the time horizon with constants that match those of an optimal centralized agent with access to all observations across the network. Our analysis also characterizes the cumulative regret in terms of the network structure. Through extensive numerical studies, we show that our extensions of Thompson Sampling and Bayes-UCB incur lesser cumulative regret than the state-of-art algorithms inspired by the Upper Confidence Bound algorithm. We implement our proposed decentralized Thompson Sampling under gossip protocol, and over time-varying networks, where each communication link has a fixed probability of failure.
Agent-based methods allow for defining simple rules that generate complex group behaviors. The governing rules of such models are typically set a priori and parameters are tuned from observed behavior trajectories. Instead of making simplifying assumptions across all anticipated scenarios, inverse reinforcement learning provides inference on the short-term (local) rules governing long term behavior policies by using properties of a Markov decision process. We use the computationally efficient linearly-solvable Markov decision process to learn the local rules governing collective movement for a simulation of the self propelled-particle (SPP) model and a data application for a captive guppy population. The estimation of the behavioral decision costs is done in a Bayesian framework with basis function smoothing. We recover the true costs in the SPP simulation and find the guppies value collective movement more than targeted movement toward shelter.
Emergency response to incidents such as accidents, medical calls, and fires is one of the most pressing problems faced by communities across the globe. In the last fifty years, researchers have developed statistical, analytical, and algorithmic approaches for designing emergency response management (ERM) systems. In this survey, we present models for incident prediction, resource allocation, and dispatch for emergency incidents. We highlight the strengths and weaknesses of prior work in this domain and explore the similarities and differences between different modeling paradigms. Finally, we present future research directions. To the best of our knowledge, our work is the first comprehensive survey that explores the entirety of ERM systems.