 Agents


Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents

arXiv.org Artificial Intelligence

Recently there has been a rising interest in training agents, embodied in virtual environments, to perform language-directed tasks by deep reinforcement learning. In this paper, we propose a simple but effective neural language grounding module for embodied agents that can be trained end to end from scratch, taking raw pixels, unstructured linguistic commands, and sparse rewards as inputs. We model the language grounding process as a language-guided transformation of visual features, where latent sentence embeddings are used as the transformation matrices. In several language-directed navigation tasks that feature challenging partial observation and require simple reasoning, our module significantly outperforms the state of the art. We also release XWORLD 3D, an easy-to-customize 3D environment that can potentially be modified to evaluate a variety of embodied agents.
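The idea of using a sentence embedding as a transformation matrix can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`embedding_to_matrix`, `transform_features`), the plain reshape of the embedding, and the single linear step without bias or nonlinearity are all assumptions made for illustration.

```python
# Hypothetical sketch: a flat sentence embedding is reshaped into a square
# matrix and applied to the visual feature vector at every spatial location.
# The function names and the reshape rule are assumptions, not the paper's API.

def embedding_to_matrix(embedding, dim):
    """Reshape a flat embedding of length dim*dim into a dim x dim matrix."""
    if len(embedding) != dim * dim:
        raise ValueError("embedding length must equal dim * dim")
    return [embedding[r * dim:(r + 1) * dim] for r in range(dim)]


def transform_features(features, matrix):
    """Right-multiply each location's feature vector by the matrix."""
    dim = len(matrix)
    return [
        [sum(vec[k] * matrix[k][j] for k in range(dim)) for j in range(dim)]
        for vec in features
    ]


# Two spatial locations with two channels each (toy values).
features = [[1.0, 2.0], [0.0, 1.0]]
# A toy "sentence embedding" that reshapes to a channel-swapping matrix.
swap = embedding_to_matrix([0.0, 1.0, 1.0, 0.0], dim=2)
transformed = transform_features(features, swap)
```

In this toy case the embedding swaps the two channels at every location, which shows how the command, rather than fixed weights, determines what the visual features become.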


Depth-Limited Solving for Imperfect-Information Games

arXiv.org Artificial Intelligence

A fundamental challenge in imperfect-information games is that states do not have well-defined values. As a result, depth-limited search algorithms used in single-agent settings and perfect-information games do not apply. This paper introduces a principled way to conduct depth-limited solving in imperfect-information games by allowing the opponent to choose among a number of strategies for the remainder of the game at the depth limit. Each one of these strategies results in a different set of values for leaf nodes. This forces an agent to be robust to the different strategies an opponent may employ. We demonstrate the effectiveness of this approach by building a master-level heads-up no-limit Texas hold'em poker AI that defeats two prior top agents using only a 4-core CPU and 16 GB of memory. Developing such a powerful agent would have previously required a supercomputer.
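The robustness argument can be illustrated with a heavily simplified sketch. It assumes a single decision, pure strategies only, and a hypothetical table of leaf values (one value per opponent continuation strategy); the paper's actual method solves a game at the depth limit rather than taking a simple maximin over a table, and the names `robust_choice` and `leaf_values` are invented here.

```python
# Hypothetical sketch: each agent action maps to a list of leaf values, one per
# opponent continuation strategy. The agent picks the action whose worst-case
# value is highest, which makes it robust to the opponent's choice.

def robust_choice(leaf_values):
    """Return (action, value) maximizing the worst case over opponent strategies."""
    worst_case = {action: min(values) for action, values in leaf_values.items()}
    best_action = max(worst_case, key=worst_case.get)
    return best_action, worst_case[best_action]


# Toy values: columns correspond to three assumed opponent strategies.
leaf_values = {
    "bet": [3.0, -2.0, 1.0],
    "check": [0.5, 0.0, 1.0],
}
action, value = robust_choice(leaf_values)
```

Here "bet" has the best single outcome, but "check" is selected because its worst case (0.0) beats the worst case of "bet" (-2.0).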


Do deep reinforcement learning agents model intentions?

arXiv.org Artificial Intelligence

Inferring other agents' mental states such as their knowledge, beliefs and intentions is thought to be essential for effective interactions with other agents. Recently, multiagent systems trained via deep reinforcement learning have been shown to succeed in solving different tasks, but it remains unclear how each agent modeled or represented other agents in their environment. In this work we test whether deep reinforcement learning agents explicitly represent other agents' intentions (their specific aims or goals) during a task in which the agents had to coordinate the covering of different spots in a 2D environment. In particular, we tracked over time the performance of a linear decoder trained to predict the final goal of all agents from the hidden state of each agent's neural network controller. We observed that the hidden layers of agents represented explicit information about other agents' goals, i.e. the target landmark they ended up covering. We also performed a series of experiments, in which some agents were replaced by others with fixed goals, to test the level of generalization of the trained agents. We noticed that during the training phase the agents developed a differential preference for each goal, which hindered generalization. To alleviate the above problem, we propose simple changes to the MADDPG training algorithm that lead to better generalization against unseen agents. We believe that training protocols promoting more active intention reading mechanisms, e.g. by preventing simple symmetry-breaking solutions, are a promising direction towards achieving a more robust generalization in different cooperative and competitive tasks.


Correlation Clustering Based Coalition Formation For Multi-Robot Task Allocation

arXiv.org Artificial Intelligence

In this paper, we study the multi-robot task allocation problem where a group of robots needs to be allocated to a set of tasks so that the tasks can be finished optimally. One task may need more than one robot to finish it. Therefore, the robots need to form coalitions to complete these tasks. Multi-robot coalition formation for task allocation is a well-known NP-hard problem. To solve this problem, we use a linear-programming based graph partitioning approach along with a region growing strategy which allocates (near) optimal robot coalitions to tasks in a negligible amount of time. Our proposed algorithm is fast (taking only 230 seconds for 100 robots and 10 tasks) and it also finds a near-optimal solution (up to 97.66% of the optimal). We empirically demonstrate that the proposed approach always finds a solution which is closer (up to 9.1 times) to the optimal solution than a theoretical worst-case bound proved in an earlier work.


Safe Policy Learning from Observations

arXiv.org Artificial Intelligence

In this paper, we consider the problem of learning a policy by observing numerous non-expert agents. Our goal is to extract a policy that, with high confidence, outperforms the agents' average performance. Such a setting is important for real-world problems where expert data is scarce but non-expert data can easily be obtained, e.g. by crowdsourcing. Our approach is to pose this problem as safe policy improvement in Reinforcement Learning. First, we evaluate an average behavior policy and approximate its value function. Then, we develop a stochastic policy improvement algorithm, termed Rerouted Behavior Improvement (RBI), that safely improves the average behavior. The primary advantages of RBI over current safe learning methods are its stability in the presence of value estimation errors and the elimination of a policy search process. We demonstrate these advantages in a Taxi grid-world domain and in four games from the Atari learning environment.


Knowledge Aggregation via Epsilon Model Spaces

arXiv.org Artificial Intelligence

In many practical applications, machine learning is divided over multiple agents, where each agent learns a different task and/or learns from a different dataset. We present Epsilon Model Spaces (EMS), a framework for learning a global model by aggregating local learnings performed by each agent. Our approach forgoes sharing of data between agents, makes no assumptions on the distribution of data across agents, and requires minimal communication between agents. We empirically validate our techniques on MNIST experiments and discuss how EMS can generalize to a wide range of problem settings, including federated averaging and catastrophic forgetting. We believe our framework to be among the first to lay out a general methodology for "combining" distinct models.


Blockchain to Improve Security and Knowledge in Inter-Agent Communication and Collaboration over Restrict Domains of the Internet Infrastructure

arXiv.org Artificial Intelligence

This paper describes the implementation and deployment of a blockchain to improve security, knowledge, and intelligence in inter-agent communication and collaboration processes in restricted domains of the Internet infrastructure. The work proposes applying a platform-independent blockchain to a particular model of agents; because the results on that specific model were satisfactory, the approach can also be used in similar proposals.


How AI Redesigned Customer Service

#artificialintelligence

Across consumer-facing industries including hospitality and quality ranking sites, customer service is one of the most critical metrics determining the value of a product against competitors. Planned standards are well and good, but if a process doesn't work or malfunctions, consumers want help, and they want that help to be easy to access, able to handle the problem, and quick to resolve it. Moreover, they want anybody (including our often-hated self-check-in machines) to be understanding, considerate, and happy to assist. For decades this "human touch" has been the one thing the digital age could not offer, even as the shopping path and the products themselves have become more and more interwoven with computer and internet technology. The U.S. 3D design company Autodesk has teamed up with Soul Machines, a New Zealand developer of human-like avatars, to produce its first digital customer service agent. AVA (Autodesk virtual agent) is ready to interact with customers 24 hours a day to resolve their concerns.



Informal Team Assignment in a Pursuit-Evasion Game

AAAI Conferences

Control architectures and algorithms for large autonomous swarms are receiving increased research interest. Control of swarm systems becomes more difficult as the size of the agent swarm increases, making centralized control approaches inadequate. This paper presents the informal team assignment algorithm. By leveraging agent roles and signaling actions, the algorithm provides a local agent mechanism leading to the emergence of cooperative teams. In a modified pursuit-evasion domain, simulation results demonstrate that agent roles and inter-agent signaling spontaneously create small collaborative agent teams dedicated to shared task accomplishment. The result is higher win ratios for signal- and role-capable swarms.