Agents
Bonsai joins Microsoft to cultivate our common vision: BRAINs for Autonomous Systems
Keen and I founded Bonsai in 2014 with the vision of putting AI in the hands of every developer. Over the past four years our team has worked tirelessly to make this vision a reality by combining the power of machine teaching and deep reinforcement learning into an end-to-end platform that is accessible not only to data scientists but also to software engineers and subject matter experts. The strongest initial commercial traction for this platform has been in the industrial verticals, where customers are improving the operations of dynamic control systems across applications including robotics, HVAC, engines, wind turbines and machine tuning. The 30x performance improvement Siemens recently realized auto-calibrating CNC machines powered by a Bonsai BRAIN is just scratching the surface of the significant business impact deep reinforcement learning can bring to these real-world systems. Going forward, we see a massive opportunity to empower enterprises and developers globally with the tools and technology needed to build and operate the BRAINs that power these intelligent autonomous systems.
Representing and Planning with Interacting Actions and Privacy
Shekhar, Shashank (Ben-Gurion University) | Brafman, Ronen I. (Ben-Gurion University)
Interacting actions — actions whose joint effect differs from the union of their individual effects — are challenging both to represent and to plan with due to their combinatorial nature. So far, there have been few attempts to provide a succinct language for representing them that can also support efficient centralized and distributed privacy preserving planning. In this paper we suggest an approach for representing interacting actions succinctly and show how such a domain model can be compiled into a standard single-agent planning problem as well as to privacy preserving multi-agent planning. We test the performance of our method on a number of novel domains involving interacting actions and privacy.
Adding Heuristics to Conflict-Based Search for Multi-Agent Path Finding
Felner, Ariel (Ben-Gurion University) | Li, Jiaoyang (University of Southern California) | Boyarski, Eli (Ben-Gurion University) | Ma, Hang (University of Southern California) | Cohen, Liron (University of Southern California) | Kumar, T. K. Satish (University of Southern California) | Koenig, Sven (University of Southern California)
Conflict-Based Search (CBS) and its enhancements are among the strongest algorithms for the multi-agent path-finding problem. However, existing variants of CBS do not use any heuristics that estimate future work. In this paper, we introduce different admissible heuristics for CBS by aggregating cardinal conflicts among agents. In our experiments, CBS with these heuristics outperforms previous state-of-the-art CBS variants by up to a factor of five.
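The abstract does not spell out the heuristics, so the following is only a minimal sketch of one way cardinal conflicts could be aggregated into an admissible estimate. It assumes that a cardinal conflict between two agents forces at least one of them to lengthen its path by at least one step, so any set of conflicts that share no agent gives a lower bound on the remaining cost. The function name and the representation of a conflict as a pair of agent ids are hypothetical.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: a cardinal conflict is a pair of agent ids.
Conflict = Tuple[int, int]


def cardinal_conflict_lower_bound(conflicts: Iterable[Conflict]) -> int:
    """Greedy lower bound on the extra cost needed to resolve all conflicts.

    Assumption: every cardinal conflict raises the solution cost by at least 1.
    Conflicts that involve disjoint agents must each be paid for separately,
    so the size of any agent-disjoint set of conflicts never overestimates.
    """
    used_agents: Set[int] = set()
    bound = 0
    for a, b in conflicts:
        # Count a conflict only if neither agent was already counted.
        if a not in used_agents and b not in used_agents:
            used_agents.update((a, b))
            bound += 1
    return bound


if __name__ == "__main__":
    # Agents 1-2 and 3-4 conflict independently; 2-3 overlaps with both.
    print(cardinal_conflict_lower_bound([(1, 2), (2, 3), (3, 4)]))
```

The greedy pass builds a matching over the conflict pairs. A larger bound may be obtainable by reasoning about the whole conflict graph, but a matching is enough to illustrate why the estimate stays admissible under the stated assumption.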
Towards a Grounded Dialog Model for Explainable Artificial Intelligence
Madumal, Prashan, Miller, Tim, Vetere, Frank, Sonenberg, Liz
To generate trust with their users, Explainable Artificial Intelligence (XAI) systems need to include an explanation model that can communicate the internal decisions, behaviours and actions to the interacting humans. Successful explanation involves both cognitive and social processes. In this paper we focus on the challenge of meaningful interaction between an explainer and an explainee and investigate the structural aspects of an explanation in order to propose a human explanation dialog model. We follow a bottom-up approach to derive the model by analysing transcripts of 398 different explanation dialog types. We use grounded theory to code and identify key components of which an explanation dialog consists. We carry out further analysis to identify the relationships between components and sequences and cycles that occur in a dialog. We present a generalized state model obtained by the analysis and compare it with an existing conceptual dialog model of explanation.
A Reputation System for Artificial Societies
Kolonin, Anton, Goertzel, Ben, Duong, Deborah, Ikle, Matt
One approach to achieving artificial general intelligence (AGI) is through the emergence of complex structures and dynamic properties arising from decentralized networks of interacting artificial intelligence (AI) agents. Understanding the principles of consensus in societies and finding ways to make consensus more reliable becomes critically important as connectivity and interaction speed increase in modern distributed systems of hybrid collective intelligences, which include both humans and computer systems. We propose a new form of reputation-based consensus with greater resistance to reputation gaming than current systems have. We discuss options for its implementation, and provide initial practical results.
Agent-Mediated Social Choice
Computational studies of voting are mostly motivated by two intended applications: the coordination of societies of artificial agents, and the study of human collective decisions whose complexity requires the use of computational techniques. Both research directions are too often confined to theoretical studies, with unrealistic assumptions constraining their significance for real-world situations. Most practical applications of these results are therefore confined to low-stakes decisions, which are of great importance in expanding the use of algorithms in society, but are far from high-stakes choices such as political elections, referenda, or parliamentary decisions, which societies still make using old-fashioned technologies like paper ballots. In this paper I argue in favour of conceiving "voting avatars", artificial agents that are able to act as proxies for voters in collective decisions at any level of society. Besides being an ideal test-bed for a large number of techniques developed in the field of multiagent systems and artificial intelligence in general, agent-mediated social choice may also suggest innovative solutions to the low voter participation that is endemic in most practical implementations of electronic decision processes.
Facing Multiple Attacks in Adversarial Patrolling Games with Alarmed Targets
De Nittis, Giuseppe, Gatti, Nicola
We focus on adversarial patrolling games on arbitrary graphs, where the Defender can control a mobile resource, the targets are alarmed by an alarm system, and the Attacker can observe the actions of the mobile resource of the Defender and perform different attacks exploiting multiple resources. This scenario can be modeled as a zero-sum extensive-form game in which each player can play multiple times. The game tree is exponentially large both in the size of the graph and in the number of attacking resources. We show that when the number of the Attacker's resources is free, the problem of computing the equilibrium path is NP-hard, while when the number of resources is fixed, the equilibrium path can be computed in polynomial time. We provide a dynamic-programming algorithm that, given the number of the Attacker's resources, computes the equilibrium path in time polynomial in the size of the graph and exponential in the number of the resources. Furthermore, since in real-world scenarios it is implausible that the Defender knows the number of attacking resources, we study the robustness of the Defender's strategy when she makes a wrong guess about that number. We show that even an error of just a single resource can lead to an arbitrary inefficiency, when the inefficiency is defined as the ratio of the Defender's utilities obtained with a wrong guess and a correct guess. However, a more suitable definition of inefficiency is given by the difference of the Defender's utilities: this way, we observe that the higher the error in the estimation, the higher the loss for the Defender. Then, we investigate the performance of online algorithms when no information about the Attacker's resources is available. Finally, we resort to randomized online algorithms, showing that we can obtain a competitive factor that is better by a factor of two than the one that can be achieved by any deterministic online algorithm.
How to Maximize the Spread of Social Influence: A Survey
De Nittis, Giuseppe, Gatti, Nicola
This survey presents the main results achieved for the influence maximization problem in social networks. This problem is well studied in the literature and, thanks to its recent applications, some of which are currently deployed in the field, it is receiving more and more attention in the scientific community. The problem can be formulated as follows: given a graph, with each node having a certain probability of influencing its neighbors, select a subset of vertices so that the number of nodes in the network that are influenced is maximized. Starting from this model, we introduce the main theoretical developments and computational results that have been achieved, taking into account different diffusion models describing how the information spreads throughout the network, various ways in which the sources of information could be placed, and how to tackle the problem in the presence of uncertainties affecting the network. Finally, we present one of the main applications that have been developed and deployed exploiting the tools and techniques previously discussed.
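The problem statement above can be illustrated with a small sketch. It assumes one particular diffusion model, the independent cascade, in which each newly influenced node gets a single chance to influence each neighbor with the probability on that edge, and it selects seeds greedily using Monte Carlo estimates of the spread. The graph encoding and all function names are hypothetical, and the survey covers other diffusion models and placement strategies that this sketch does not.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical encoding: node -> list of (neighbor, influence probability).
Graph = Dict[str, List[Tuple[str, float]]]


def simulate_cascade(graph: Graph, seeds: Sequence[str], rng: random.Random) -> int:
    """Run one independent-cascade simulation and return the number influenced."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        node = frontier.pop()
        for neighbor, prob in graph.get(node, []):
            # Each active node tries once to influence each inactive neighbor.
            if neighbor not in active and rng.random() < prob:
                active.add(neighbor)
                frontier.append(neighbor)
    return len(active)


def expected_spread(graph: Graph, seeds: Sequence[str], rng: random.Random,
                    runs: int = 200) -> float:
    """Monte Carlo estimate of the expected number of influenced nodes."""
    return sum(simulate_cascade(graph, seeds, rng) for _ in range(runs)) / runs


def greedy_seeds(graph: Graph, k: int, rng: random.Random, runs: int = 200) -> List[str]:
    """Pick k seeds, each time adding the node with the best estimated spread."""
    nodes = set(graph)
    for edges in graph.values():
        nodes.update(neighbor for neighbor, _ in edges)
    chosen: List[str] = []
    for _ in range(min(k, len(nodes))):
        candidates = sorted(nodes - set(chosen))  # sorted for reproducible ties
        best = max(candidates,
                   key=lambda v: expected_spread(graph, chosen + [v], rng, runs))
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    toy = {"a": [("b", 1.0)], "b": [("c", 1.0)], "d": [("e", 1.0)]}
    print(greedy_seeds(toy, 2, random.Random(0), runs=10))
```

With all probabilities set to 1.0 the toy run is deterministic; for realistic probabilities the result depends on the number of simulation runs and on the random seed.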
Beyond Local Nash Equilibria for Adversarial Networks
Oliehoek, Frans A., Savani, Rahul, Gallego, Jose, van der Pol, Elise, Groß, Roderich
Save for some special cases, current training methods for Generative Adversarial Networks (GANs) are at best guaranteed to converge to a 'local Nash equilibrium' (LNE). Such LNEs, however, can be arbitrarily far from an actual Nash equilibrium (NE), which implies that there are no guarantees on the quality of the found generator or classifier. This paper proposes to model GANs explicitly as finite games in mixed strategies, thereby ensuring that every LNE is an NE. With this formulation, we propose a solution method that is proven to monotonically converge to a resource-bounded Nash equilibrium (RB-NE): by increasing computational resources we can find better solutions. We empirically demonstrate that our method is less prone to typical GAN problems such as mode collapse, and produces solutions that are less exploitable than those produced by GANs and MGANs, and closely resemble theoretical predictions about NEs.
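To make the notion of exploitability concrete, here is a minimal sketch for a finite two-player zero-sum game given as a payoff matrix. This is an illustrative assumption, not the paper's method: the function name, the matrix encoding and the matching-pennies example are all hypothetical. Exploitability is taken here as the total amount the two players could gain by each switching to a best response, which is zero exactly at a Nash equilibrium of such a game.

```python
from typing import Sequence


def exploitability(payoff: Sequence[Sequence[float]],
                   row_mix: Sequence[float],
                   col_mix: Sequence[float]) -> float:
    """Sum of best-response gains in a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives
    its negative. row_mix and col_mix are mixed strategies (probabilities).
    """
    # Row player's payoff for each pure row against the column mixture.
    row_values = [sum(p * q for p, q in zip(row, col_mix)) for row in payoff]
    # Row player's payoff for each pure column against the row mixture.
    col_values = [sum(payoff[i][j] * row_mix[i] for i in range(len(payoff)))
                  for j in range(len(col_mix))]
    # Row maximizes, column minimizes; the gap is zero at an equilibrium.
    return max(row_values) - min(col_values)


if __name__ == "__main__":
    matching_pennies = [[1.0, -1.0], [-1.0, 1.0]]
    print(exploitability(matching_pennies, [0.5, 0.5], [0.5, 0.5]))
    print(exploitability(matching_pennies, [1.0, 0.0], [1.0, 0.0]))
```

In matching pennies the uniform mixtures cannot be exploited, while a pair of pure strategies can, which mirrors the abstract's point that equilibria in mixed strategies come with guarantees that arbitrary local solutions lack.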
A unified strategy for implementing curiosity and empowerment driven reinforcement learning
de Abril, Ildefons Magrans, Kanai, Ryota
Although there are many approaches to implementing intrinsically motivated artificial agents, the combined use of multiple intrinsic drives remains a relatively unexplored research area. Specifically, we hypothesize that a mechanism capable of quantifying and controlling the evolution of the information flow between the agent and the environment could be the fundamental component for implementing a higher degree of autonomy in artificial intelligent agents. This paper proposes a unified strategy for implementing two semantically orthogonal intrinsic motivations: curiosity and empowerment. Curiosity reward informs the agent about the relevance of a recent agent action, whereas empowerment is implemented as the opposite information flow from the agent to the environment that quantifies the agent's potential of controlling its own future. We show that an additional homeostatic drive is derived from the curiosity reward, which generalizes and enhances the information gain of a classical curious/heterostatic reinforcement learning agent. We show how an internal model shared by curiosity and empowerment facilitates more efficient training of the empowerment function. Finally, we discuss future directions for further leveraging the interplay between these two intrinsic rewards.