Goto

Collaborating Authors

 Agents


DPLL(MAPF): an Integration of Multi-Agent Path Finding and SAT Solving Technologies

arXiv.org Artificial Intelligence

In multi-agent path finding (MAPF), the task is to find non-conflicting paths for multiple agents from their initial positions to given individual goal positions. MAPF represents a classical artificial intelligence problem often addressed by heuristic-search. An important alternative to search-based techniques is compilation of MAPF to a different formalism such as Boolean satisfiability (SAT). Contemporary SAT-based approaches to MAPF regard the SAT solver as an external tool whose task is to return an assignment of all decision variables of a Boolean model of input MAPF. We present in this short paper a novel compilation scheme called DPLL(MAPF) in which the consistency checking of partial assignments of decision variables with respect to the MAPF rules is integrated directly into the SAT solver. This scheme allows for far more automated compilation where the SAT solver and the consistency checking procedure work together simultaneously to create the Boolean model and to search for its satisfying assignment.


Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication

arXiv.org Artificial Intelligence

Communication is compositional if complex signals can be represented as a combination of simpler subparts. In this paper, we theoretically show that inductive biases on both the training framework and the data are needed to develop a compositional communication. Moreover, we prove that compositionality spontaneously arises in the signaling games, where agents communicate over a noisy channel. We experimentally confirm that a range of noise levels, which depends on the model and the data, indeed promotes compositionality. Finally, we provide a comprehensive study of this dependence and report results in terms of recently studied compositionality metrics: topographical similarity, conflict count, and context independence.


Personalized multi-faceted trust modeling to determine trust links in social media and its potential for misinformation management

arXiv.org Artificial Intelligence

In this paper, we present an approach for predicting trust links between peers in social media, one that is grounded in the artificial intelligence area of multiagent trust modeling. In particular, we propose a data-driven multi-faceted trust modeling which incorporates many distinct features for a comprehensive analysis. We focus on demonstrating how clustering of similar users enables a critical new functionality: supporting more personalized, and thus more accurate predictions for users. Illustrated in a trust-aware item recommendation task, we evaluate the proposed framework in the context of a large Yelp dataset. We then discuss how improving the detection of trusted relationships in social media can assist in supporting online users in their battle against the spread of misinformation and rumours, within a social networking environment which has recently exploded in popularity. We conclude with a reflection on a particularly vulnerable user base, older adults, in order to illustrate the value of reasoning about groups of users, looking to some future directions for integrating known preferences with insights gained through data analysis.


AI is now learning to evolve like earthly lifeforms

#artificialintelligence

This article is part of our reviews of AI research papers, a series of posts that explore the latest findings in artificial intelligence. Hundreds of millions of years of evolution have blessed our planet with a wide variety of lifeforms, each intelligent in its own fashion. Each species has evolved to develop innate skills, learning capacities, and a physical form that ensure its survival in its environment. But despite being inspired by nature and evolution, the field of artificial intelligence has largely focused on creating the elements of intelligence separately and fusing them together after development. While this approach has yielded great results, it has also limited the flexibility of AI agents in some of the basic skills found in even the simplest lifeforms.


Nvidia wants to fill the virtual and physical worlds with AI avatars

#artificialintelligence

Nvidia has announced a new platform for creating virtual agents named Omniverse Avatar. The platform combines a number of discrete technologies -- including speech recognition, synthetic speech, facial tracking, and 3D avatar animation -- which Nvidia says can be used to power a range of virtual agents. In a presentation at the company's annual GTC conference, Nvidia CEO Jensen Huang showed off a few demos using Omniverse Avatar tech. In one, a cute animated character in a digital kiosk talks a couple through the menu at a fast food restaurant, answering questions like which items are vegetarian. The character uses facial-tracking technology to maintain eye-contact with the customers and respond to their facial expressions. "This will be useful for smart retail, drive-throughs, and customer service," said Huang of the tech.


On the Use and Misuse of Absorbing States in Multi-agent Reinforcement Learning

arXiv.org Artificial Intelligence

The creation and destruction of agents in cooperative multi-agent reinforcement learning (MARL) is a critically under-explored area of research. Current MARL algorithms often assume that the number of agents within a group remains fixed throughout an experiment. However, in many practical problems, an agent may terminate before their teammates. This early termination issue presents a challenge: the terminated agent must learn from the group's success or failure which occurs beyond its own existence. We refer to propagating value from rewards earned by remaining teammates to terminated agents as the Posthumous Credit Assignment problem. Current MARL methods handle this problem by placing these agents in an absorbing state until the entire group of agents reaches a termination condition. Although absorbing states enable existing algorithms and APIs to handle terminated agents without modification, practical training efficiency and resource use problems exist. In this work, we first demonstrate that sample complexity increases with the quantity of absorbing states in a toy supervised learning task for a fully connected network, while attention is more robust to variable size input. Then, we present a novel architecture for an existing state-of-the-art MARL algorithm which uses attention instead of a fully connected layer with absorbing states. Finally, we demonstrate that this novel architecture significantly outperforms the standard architecture on tasks in which agents are created or destroyed within episodes as well as standard multi-agent coordination tasks.


Agent Spaces

arXiv.org Artificial Intelligence

Exploration is one of the most important tasks in Reinforcement Learning, but it is not well-defined beyond finite problems in the Dynamic Programming paradigm (see Subsection 2.4). We provide a reinterpretation of exploration which can be applied to any online learning method. We come to this definition by approaching exploration from a new direction. After finding that concepts of exploration created to solve simple Markov decision processes with Dynamic Programming are no longer broadly applicable, we reexamine exploration. Instead of extending the ends of dynamic exploration procedures, we extend their means. That is, rather than repeatedly sampling every state-action pair possible in a process, we define the act of modifying an agent to itself be explorative. The resulting definition of exploration can be applied in infinite problems and non-dynamic learning methods, which the dynamic notion of exploration cannot tolerate. To understand the way that modifications of an agent affect learning, we describe a novel structure on the set of agents: a collection of distances (see footnote 7) $d_{a} \in A$, which represent the perspectives of each agent possible in the process. Using these distances, we define a topology and show that many important structures in Reinforcement Learning are well behaved under the topology induced by convergence in the agent space.


PowerGridworld: A Framework for Multi-Agent Reinforcement Learning in Power Systems

arXiv.org Artificial Intelligence

We present the PowerGridworld software package to provide users with a lightweight, modular, and customizable framework for creating power-systems-focused, multi-agent Gym environments that readily integrate with existing training frameworks for reinforcement learning (RL). Although many frameworks exist for training multi-agent RL (MARL) policies, none can rapidly prototype and develop the environments themselves, especially in the context of heterogeneous (composite, multi-device) power systems where power flow solutions are required to define grid-level variables and costs. PowerGridworld is an open-source software package that helps to fill this gap. To highlight PowerGridworld's key features, we present two case studies and demonstrate learning MARL policies using both OpenAI's multi-agent deep deterministic policy gradient (MADDPG) and RLLib's proximal policy optimization (PPO) algorithms. In both cases, at least some subset of agents incorporates elements of the power flow solution at each time step as part of their reward (negative cost) structures.


Search in Imperfect Information Games

arXiv.org Artificial Intelligence

From the very dawn of the field, search with value functions was a fundamental concept of computer games research. Turing's chess algorithm from 1950 was able to think two moves ahead, and Shannon's work on chess from $1950$ includes an extensive section on evaluation functions to be used within a search. Samuel's checkers program from 1959 already combines search and value functions that are learned through self-play and bootstrapping. TD-Gammon improves upon those ideas and uses neural networks to learn those complex value functions -- only to be again used within search. The combination of decision-time search and value functions has been present in the remarkable milestones where computers bested their human counterparts in long standing challenging games -- DeepBlue for Chess and AlphaGo for Go. Until recently, this powerful framework of search aided with (learned) value functions has been limited to perfect information games. As many interesting problems do not provide the agent perfect information of the environment, this was an unfortunate limitation. This thesis introduces the reader to sound search for imperfect information games.


Democratic Forking: Choosing Sides with Social Choice

arXiv.org Artificial Intelligence

Any community in which membership is optional may eventually break apart, or fork. For example, forks may occur in political parties, business partnerships, social groups, cryptocurrencies, and federated governing bodies. Forking is typically the product of informal social processes or the organized action of an aggrieved minority, and it is not always amicable. Forks usually come at a cost, and can be seen as consequences of collective decisions that destabilize the community. Here, we provide a social choice setting in which agents can report preferences not only over a set of alternatives, but also over the possible forks that may occur in the face of disagreement. We study this social choice setting, concentrating on stability issues and concerns of strategic agent behavior.