Agents
Dungeon Crawl Stone Soup as an Evaluation Domain for Artificial Intelligence
Dannenhauer, Dustin, Floyd, Michael W., Decker, Jonathan, Aha, David W.
Dungeon Crawl Stone Soup is a popular, single-player, free and open-source rogue-like video game with a sufficiently complex decision space that makes it an ideal testbed for research in cognitive systems and, more generally, artificial intelligence. This paper describes the properties of Dungeon Crawl Stone Soup that are conducive to evaluating new approaches of AI systems. We also highlight an ongoing effort to build an API for AI researchers in the spirit of recent game APIs such as MALMO, ELF, and the Starcraft II API. Dungeon Crawl Stone Soup's complexity offers significant opportunities for evaluating AI and cognitive systems, including human user studies. In this paper we provide (1) a description of the state space of Dungeon Crawl Stone Soup, (2) a description of the components for our API, and (3) the potential benefits of evaluating AI agents in the Dungeon Crawl Stone Soup video game.
Explanation in Human-AI Systems: A Literature Meta-Review, Synopsis of Key Ideas and Publications, and Bibliography for Explainable AI
Mueller, Shane T., Hoffman, Robert R., Clancey, William, Emrey, Abigail, Klein, Gary
This is an integrative review that addresses the question, "What makes for a good explanation?" with reference to AI systems. Pertinent literatures are vast. Thus, this review is necessarily selective. That said, most of the key concepts and issues are expressed in this Report. The Report encapsulates the history of computer science efforts to create systems that explain and instruct (intelligent tutoring systems and expert systems). The Report expresses the explainability issues and challenges in modern AI, and presents capsule views of the leading psychological theories of explanation. Certain articles stand out by virtue of their particular relevance to XAI, and their methods, results, and key points are highlighted. It is recommended that AI/XAI researchers be encouraged to include in their research reports fuller details on their empirical or experimental methods, in the fashion of experimental psychology research reports: details on Participants, Instructions, Procedures, Tasks, Dependent Variables (operational definitions of the measures and metrics), Independent Variables (conditions), and Control Conditions.
Neural Fictitious Self-Play on ELF Mini-RTS
Kawamura, Keigo, Tsuruoka, Yoshimasa
Despite the notable successes in video games such as Atari 2600, current AI is yet to defeat human champions in the domain of real-time strategy (RTS) games. One of the reasons is that an RTS game is a multi-agent game, in which single-agent reinforcement learning methods cannot simply be applied because the environment is not a stationary Markov Decision Process. In this paper, we present a first step toward finding a game-theoretic solution to RTS games by applying Neural Fictitious Self-Play (NFSP), a game-theoretic approach for finding Nash equilibria, to Mini-RTS, a small but nontrivial RTS game provided on the ELF platform. More specifically, we show that NFSP can be effectively combined with policy gradient reinforcement learning and be applied to Mini-RTS. Experimental results also show that the scalability of NFSP can be substantially improved by pretraining the models with simple self-play using policy gradients, which by itself gives a strong strategy despite its lack of theoretical guarantee of convergence.
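As a rough illustration of the action-selection step in Neural Fictitious Self-Play, the sketch below mixes a best-response policy with an average policy. It is a minimal sketch, not the authors' implementation: the function name `nfsp_act`, the parameter name `eta` (the mixing probability), and the two policy callables are hypothetical placeholders, and the learning updates for both policies are omitted.

```python
import random
from typing import Callable, Hashable, Tuple

# Hypothetical type: a policy maps a state to an action.
Policy = Callable[[Hashable], int]


def nfsp_act(
    best_response: Policy,
    average_policy: Policy,
    state: Hashable,
    eta: float,
    rng: random.Random,
) -> Tuple[int, bool]:
    """Pick an action by mixing two policies.

    With probability eta the agent follows its best-response policy
    (trained by reinforcement learning); otherwise it follows its
    average policy (trained by supervised learning on its own past
    best-response behaviour).

    Returns the action and a flag saying whether the best response was
    used, so the caller can decide whether to record the (state, action)
    pair as training data for the average policy.
    """
    if rng.random() < eta:
        return best_response(state), True
    return average_policy(state), False


if __name__ == "__main__":
    # Toy policies, assumed for illustration only.
    br = lambda s: 1   # "best response" always plays action 1
    avg = lambda s: 0  # "average policy" always plays action 0
    rng = random.Random(0)
    picks = [nfsp_act(br, avg, "s0", eta=0.1, rng=rng) for _ in range(5)]
    print(picks)
```

Returning the flag keeps the bookkeeping outside the selection function, which makes the mixing rule easy to test on its own.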
Interactively shaping robot behaviour with unlabeled human instructions
Najar, Anis, Sigaud, Olivier, Chetouani, Mohamed
In this paper, we propose a framework that enables a human teacher to shape a robot's behaviour by interactively providing it with unlabeled instructions. We ground the meaning of instruction signals in the task learning process, and use them simultaneously for guiding the latter. We implement our framework as a modular architecture, named TICS (Task-Instruction-Contingency-Shaping), that combines different information sources: a predefined reward function, human evaluative feedback and unlabeled instructions. This approach provides a novel perspective for robotic task learning that lies between Reinforcement Learning and Supervised Learning paradigms. We evaluate our framework both in simulation and with a real robot. The experimental results demonstrate the effectiveness of our framework in accelerating the task learning process and in reducing the amount of required teaching signals.
Learning to Schedule Communication in Multi-agent Reinforcement Learning
Kim, Daewoo, Moon, Sangwoo, Hostallero, David, Kang, Wan Ju, Lee, Taeyoung, Son, Kyunghwan, Yi, Yung
Many real-world reinforcement learning tasks require multiple agents to make sequential decisions while interacting with one another, and well-coordinated actions among the agents are crucial to achieving the target goal. One way to accelerate the coordination effect is to enable multiple agents to communicate with each other in a distributed manner and behave as a group. In this paper, we study a practical scenario in which (i) the communication bandwidth is limited and (ii) the agents share the communication medium so that only a restricted number of agents are able to simultaneously use the medium, as in the state-of-the-art wireless networking standards. This calls for a certain form of communication scheduling. In that regard, we propose a multi-agent deep reinforcement learning framework, called SchedNet, in which agents learn how to schedule themselves, how to encode the messages, and how to select actions based on received messages. SchedNet is capable of deciding which agents should be entitled to broadcast their (encoded) messages, by learning the importance of each agent's partially observed information. We evaluate SchedNet against multiple baselines under two different applications, namely, cooperative communication and navigation, and predator-prey. Our experiments show a non-negligible performance gap between SchedNet and other mechanisms such as the ones without communication and with vanilla scheduling methods, e.g., round robin, ranging from 32% to 43%.
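The scheduling idea can be sketched as follows: each agent produces a scalar importance weight, and only a limited number of agents may broadcast per step. This is a minimal sketch under stated assumptions, not SchedNet itself: the abstract does not give the selection rule, so picking the `k` largest weights is an assumption here, the weights are plain numbers rather than learned outputs, and the function names are hypothetical. A round-robin baseline of the kind named in the abstract is included for contrast.

```python
from typing import List, Sequence


def schedule_top_k(weights: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k agents with the largest weights.

    Assumption: a larger weight means the agent's partial observation
    is more important to share. Ties go to the lower index.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:k])


def schedule_round_robin(n_agents: int, k: int, step: int) -> List[int]:
    """Baseline: rotate through agents k at a time, ignoring observations."""
    if n_agents <= 0 or k < 0:
        raise ValueError("n_agents must be positive and k non-negative")
    start = (step * k) % n_agents
    return sorted((start + j) % n_agents for j in range(min(k, n_agents)))


if __name__ == "__main__":
    weights = [0.1, 0.9, 0.4, 0.7]  # made-up importance weights
    print(schedule_top_k(weights, k=2))
    print([schedule_round_robin(4, 2, t) for t in range(3)])
```

The contrast is the point: the round-robin schedule depends only on the step counter, while the weight-based schedule changes with what each agent currently observes.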
Directed Formation Control of n Planar Agents with Distance and Area Constraints
Liu, Tairan, de Queiroz, Marcio, Zhang, Pengpeng, Khaledyan, Milad
In this paper, we take a first step towards generalizing a recently proposed method for dealing with the problem of convergence to incorrect equilibrium points of distance-based formation controllers. Specifically, we introduce a distance and area-based scheme for the formation control of $n$-agent systems in two dimensions using directed graphs and the single-integrator model. We show that under certain conditions on the edge lengths of the triangulated desired formation, the control ensures almost-global convergence to the correct formation.
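To illustrate why an area term can complement distance constraints, the sketch below compares a triangle with its mirror image. It assumes that a reflected formation is one kind of incorrect equilibrium a distance-only controller can reach; the abstract does not say so explicitly. The code is not the authors' control law, and the helper names are hypothetical. It only shows that reflection preserves every edge length while flipping the sign of the signed area.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def distance(p: Point, q: Point) -> float:
    """Euclidean distance between two planar points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def signed_area(p1: Point, p2: Point, p3: Point) -> float:
    """Signed area of triangle (p1, p2, p3).

    Positive when the vertices are ordered counter-clockwise,
    negative when they are ordered clockwise.
    """
    return 0.5 * (
        (p2[0] - p1[0]) * (p3[1] - p1[1]) - (p3[0] - p1[0]) * (p2[1] - p1[1])
    )


def reflect_x(p: Point) -> Point:
    """Mirror a point across the x-axis."""
    return (p[0], -p[1])


if __name__ == "__main__":
    tri = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
    flipped = [reflect_x(p) for p in tri]

    # Edge lengths are identical for the triangle and its mirror image...
    for i, j in [(0, 1), (1, 2), (0, 2)]:
        assert math.isclose(distance(tri[i], tri[j]),
                            distance(flipped[i], flipped[j]))

    # ...but the signed area changes sign, so it can tell them apart.
    print(signed_area(*tri), signed_area(*flipped))
```

Because the two shapes agree on all distances, a constraint on the signed area is one way to distinguish the desired formation from its reflection.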
Embodied Multimodal Multitask Learning
Chaplot, Devendra Singh, Lee, Lisa, Salakhutdinov, Ruslan, Parikh, Devi, Batra, Dhruv
Recent efforts on training visual navigation agents conditioned on language using deep reinforcement learning have been successful in learning policies for different multimodal tasks, such as semantic goal navigation and embodied question answering. In this paper, we propose a multitask model capable of jointly learning these multimodal tasks, and transferring knowledge of words and their grounding in visual objects across the tasks. The proposed model uses a novel Dual-Attention unit to disentangle the knowledge of words in the textual representations and visual concepts in the visual representations, and align them with each other. This disentangled task-invariant alignment of representations facilitates grounding and knowledge transfer across both tasks. We show that the proposed model outperforms a range of baselines on both tasks in simulated 3D environments. We also show that this disentanglement of representations makes our model modular, interpretable, and allows for transfer to instructions containing new words by leveraging object detectors.
Behind The AI of Horizon Zero Dawn (Part 1)
AI and Games is a crowdfunded series about research and applications of artificial intelligence in video games. Horizon Zero Dawn stands as one of the most critically acclaimed of Sony's roster of PlayStation 4 exclusives. As the hunter Aloy, players venture across the post-apocalyptic landscapes of the future to uncover the mysteries of her past and how the world fell in the years before. Humanity's fall from grace has led to the rise of 'the machines' – robots of varying shapes and sizes that now run free across the lands.
Facebook and Google built a framework to study how AI agents talk to each other
The intricacies of evolutionary linguistics are myriad and underexplored, but new research involving artificial intelligence (AI) might unlock the door to new theories about how dialects develop among users. The Facebook and Google researchers' work isn't the first to investigate language with machine learning algorithms -- a paper published by Facebook researchers in June 2017 describes how two agents learned to "negotiate" with each other in chat messages. But they say that it's the first to use "latest-generation deep neural agents" capable of dealing with "rich perceptual input," and that it convincingly demonstrates that language can evolve from simple exchanges. The team deployed groups -- communities -- of agents equipped with the ability to communicate in a simulated environment, with complexities ranging from simple (a set of equations) to relatively complicated (a deep neural network). The "games" the agents were tasked with playing had several key properties: they were symmetric, enabling the agents to act as both "speakers" and "listeners"; they allowed the agents to communicate about something "external" to themselves, such as the sensory experience of something in their environment; and they took place in a world the agents could at least partially observe.
The Hanabi Challenge: A New Frontier for AI Research
Bard, Nolan, Foerster, Jakob N., Chandar, Sarath, Burch, Neil, Lanctot, Marc, Song, H. Francis, Parisotto, Emilio, Dumoulin, Vincent, Moitra, Subhodeep, Hughes, Edward, Dunning, Iain, Mourad, Shibl, Larochelle, Hugo, Bellemare, Marc G., Bowling, Michael
From the early days of computing, games have been important testbeds for studying how well machines can do sophisticated decision making. In recent years, machine learning has made dramatic advances with artificial agents reaching superhuman performance in challenge domains like Go, Atari, and some variants of poker. As with their predecessors of chess, checkers, and backgammon, these game domains have driven research by providing sophisticated yet well-defined challenges for artificial intelligence practitioners. We continue this tradition by proposing the game of Hanabi as a new challenge domain with novel problems that arise from its combination of purely cooperative gameplay and imperfect information in a two to five player setting. In particular, we argue that Hanabi elevates reasoning about the beliefs and intentions of other agents to the foreground. We believe developing novel techniques capable of imbuing artificial agents with such theory of mind will not only be crucial for their success in Hanabi, but also in broader collaborative efforts, and especially those with human partners. To facilitate future research, we introduce the open-source Hanabi Learning Environment, propose an experimental framework for the research community to evaluate algorithmic advances, and assess the performance of current state-of-the-art techniques.