AITopics

1511.0196

Country:

Asia > Japan (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
North America > Canada > British Columbia (0.04)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.45)

Tewari, Ujwal Padam, Bidawatka, Vishal, Raveendran, Varsha, Sudhakaran, Vinay

Intelligent Coordination among Multiple Traffic Intersections Using Multi-Agent Reinforcement Learning

arXiv.org Artificial Intelligence, Dec-8-2019

We use Asynchronous Advantage Actor Critic (A3C) to implement an AI agent in the controllers that optimize the flow of traffic across a single intersection, and then extend it to multiple intersections by considering a multi-agent setting. We explore three methodologies to address the multi-agent problem: (1) using the asynchronous property of A3C to control multiple intersections with a single agent, (2) utilising self/competitive play among independent agents across multiple intersections, and (3) introducing a global reward function among agents to encourage cooperative behavior between intersections. We observe that (1) and (2) lead to a reduction in traffic congestion. Additionally, using (3) together with (1) and (2) led to a further reduction in congestion.

agent, intersection, reward function, (13 more...)

1912.03851

Country:

Asia > India > Karnataka > Bengaluru (0.05)
North America > Canada (0.04)

Genre: Research Report (0.82)

Industry:

Transportation (0.71)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Zhang, Kaiqing, Yang, Zhuoran, Başar, Tamer

Decentralized Multi-Agent Reinforcement Learning with Networked Agents: Recent Advances

arXiv.org Artificial Intelligence, Dec-8-2019

Multi-agent reinforcement learning (MARL) has long been a significant research topic in both machine learning and control. With the recent development of (single-agent) deep RL, there is a resurgence of interest in developing new MARL algorithms, especially those that are backed by theoretical analysis. In this paper, we review some recent advances in a sub-area of this topic: decentralized MARL with networked agents. Specifically, multiple agents perform sequential decision-making in a common environment, without the coordination of any central controller. Instead, the agents are allowed to exchange information with their neighbors over a communication network. Such a setting finds broad applications in the control and operation of robots, unmanned vehicles, mobile sensor networks, and the smart grid. This review builds upon several of our research endeavors in this direction, together with progress made by other researchers along the same line. We hope this review inspires more research effort in this exciting yet challenging area.
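The neighbor-to-neighbor exchange described in this abstract can be illustrated with a generic consensus-averaging loop. This is only a sketch under stated assumptions, not one of the algorithms the review covers: each agent is assumed to hold a single scalar estimate, the communication graph is assumed undirected and fixed, and the mixing weights use the Metropolis rule. All function and variable names here are hypothetical.

```python
def metropolis_weights(neighbors):
    """Build a symmetric, doubly stochastic mixing matrix from an undirected graph.

    `neighbors` maps each agent index 0..n-1 to the list of agents it can talk to.
    """
    n = len(neighbors)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            # Metropolis rule: weight depends on the larger of the two degrees.
            weights[i][j] = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
        # The self-weight takes whatever mass is left so each row sums to 1.
        weights[i][i] = 1.0 - sum(weights[i])
    return weights


def consensus_step(values, weights):
    """One round of communication: each agent mixes its value with its neighbors'."""
    n = len(values)
    return [sum(weights[i][j] * values[j] for j in range(n)) for i in range(n)]


def run_consensus(values, neighbors, steps):
    """Repeat the local averaging step; no central controller is involved."""
    weights = metropolis_weights(neighbors)
    for _ in range(steps):
        values = consensus_step(values, weights)
    return values


# Assumed toy network: a line graph 0 - 1 - 2 with local estimates 0, 3 and 6.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
estimates = run_consensus([0.0, 3.0, 6.0], neighbors, steps=50)
```

On a connected graph, repeated local averaging of this kind drives every agent's estimate toward the network-wide mean, which is why it is a common building block when agents must agree on a quantity without a coordinator.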

agent, algorithm, function approximation, (10 more...)

1912.03821

Country:

North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)

Genre: Overview (1.00)

Industry:

Leisure & Entertainment > Games (0.68)
Energy > Power Industry (0.48)
Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Yang, Jiachen, Borovikov, Igor, Zha, Hongyuan

Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery

arXiv.org Machine Learning, Dec-7-2019

Human players in professional team sports achieve high-level coordination by dynamically choosing complementary skills and executing primitive actions to perform these skills. As a step toward creating intelligent agents with this capability for fully cooperative multi-agent settings, we propose a two-level hierarchical multi-agent reinforcement learning (MARL) algorithm with unsupervised skill discovery. Agents learn useful and distinct skills at the low level via independent Q-learning, while they learn to select complementary latent skill variables at the high level via centralized multi-agent training with an extrinsic team reward. The set of low-level skills emerges from an intrinsic reward that solely promotes the decodability of latent skill variables from the trajectory of a low-level skill, without the need for hand-crafted rewards for each skill. For scalable decentralized execution, each agent independently chooses latent skill variables and primitive actions based on local observations. Our overall method enables the use of general cooperative MARL algorithms for training high-level policies and single-agent RL for training low-level skills. Experiments on a stochastic high-dimensional team game show the emergence of useful skills and cooperative team play. The interpretability of the learned skills shows the promise of the proposed method for achieving human-AI cooperation in team sports games.

agent, latent skill, possession, (12 more...)

1912.03558

Country: North America > United States > Massachusetts (0.04)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment > Sports (0.88)
Leisure & Entertainment > Games > Computer Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Karwowski, Jan, Mańdziuk, Jacek, Żychowski, Adam

Anchoring Theory in Sequential Stackelberg Games

arXiv.org Artificial Intelligence, Dec-7-2019

An underlying assumption of Stackelberg Games (SGs) is perfect rationality of the players. However, in real-life situations (which are often modeled by SGs) the followers (terrorists, thieves, poachers or smugglers), like humans in general, may not act in a perfectly rational way, as their decisions may be affected by biases of various kinds which bound the rationality of their decisions. One of the popular models of bounded rationality (BR) is Anchoring Theory (AT), which claims that humans have a tendency to flatten probabilities of available options, i.e. they perceive a distribution of these probabilities as being closer to the uniform distribution than it really is. This paper proposes an efficient formulation of AT in sequential extensive-form SGs (named ATSG), suitable for Mixed-Integer Linear Program (MILP) solution methods. ATSG is implemented in three MILP/LP-based state-of-the-art methods for solving sequential SGs and two recently introduced non-MILP approaches: one relying on Monte Carlo sampling (O2UCT) and the other one (EASG) employing Evolutionary Algorithms. Experimental evaluation indicates that both non-MILP heuristic approaches scale better in time than MILP solutions while providing optimal or close-to-optimal solutions. Besides competitive time scalability, an additional asset of non-MILP methods is the flexibility of the BR formulations they are able to incorporate. While MILP approaches accept BR formulations with linear constraints only, no restrictions on the BR form are imposed in either of the two non-MILP methods.
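The "flattening" of probabilities described in this abstract can be sketched as a convex combination of the true distribution with the uniform one. This is an assumed, simplified form for illustration, not the ATSG MILP formulation from the paper; the function name `anchor` and the bias parameter `delta` are hypothetical.

```python
def anchor(probabilities, delta):
    """Return the distribution as perceived by an anchoring-biased observer.

    Each probability is pulled toward the uniform value 1/n:
        perceived = (1 - delta) * true + delta / n
    delta = 0 means no bias; delta = 1 means everything looks uniform.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    n = len(probabilities)
    return [(1.0 - delta) * p + delta / n for p in probabilities]


# Assumed example: a leader's mixed strategy over three targets.
true_strategy = [0.7, 0.2, 0.1]
perceived = anchor(true_strategy, delta=0.5)
```

The perceived values still sum to one, and the ordering of the options is preserved; only the gaps between them shrink, which is the sense in which the distribution looks closer to uniform than it really is.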

follower, formulation, probability, (16 more...)

1912.03564

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.67)

#artificialintelligence, Dec-6-2019, 14:44:54 GMT

Amazon proposes a home robot that asks you questions when it's confused

AI models invariably encounter ambiguous situations that they struggle to respond to with instructions alone. That's problematic for autonomous agents tasked with, say, navigating an apartment, because they run the risk of becoming stuck when presented with several paths. To solve this, researchers at Amazon's Alexa AI division developed a framework that endows agents with the ability to ask for help in certain situations. Using what's called a model-confusion-based method, the agents ask questions based on their level of confusion as determined by a predefined confidence threshold, which the researchers claim boosts the agents' success by at least 15%. "Consider the situation in which you want a robot assistant to get your wallet on the bed … with two doors in the scene and an instruction that only tells it to walk through the doorway," wrote the team in a preprint paper describing their work.
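The idea of asking a question once confusion crosses a predefined confidence threshold can be sketched as follows. The article does not say how confusion is measured, so this sketch assumes it is judged from the probability the agent assigns to its best action; the function name, the action labels and the threshold value are all hypothetical.

```python
def choose_or_ask(action_probs, threshold):
    """Act on the most likely action, or ask for help if confidence is too low.

    `action_probs` maps each candidate action to the agent's probability for it.
    Returns ("act", action) or ("ask", None).
    """
    best_action = max(action_probs, key=action_probs.get)
    if action_probs[best_action] < threshold:
        # The agent is too unsure to commit, so it requests clarification.
        return ("ask", None)
    return ("act", best_action)


# Assumed scene with two doors and an instruction that does not say which one.
ambiguous = {"left_door": 0.52, "right_door": 0.48}
clear = {"left_door": 0.95, "right_door": 0.05}

decision_ambiguous = choose_or_ask(ambiguous, threshold=0.8)
decision_clear = choose_or_ask(clear, threshold=0.8)
```

With the assumed threshold of 0.8, the agent asks in the ambiguous two-door scene and acts directly in the clear one.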

amazon propose, home robot, robot, (5 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.57)
Information Technology > Artificial Intelligence > Robots > Robots in the Home (0.42)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.37)

#artificialintelligence, Dec-5-2019, 19:05:14 GMT

MIT creates an AI that understands the laws of physics intuitively - Innowire

Dubbed ADEPT, the system is able, like a human being, to understand some laws of physics intuitively. It can look at an object in a video, predict how it should act based on what it knows of the laws of physics and then register surprise if what it was looking at subsequently vanishes or teleports. The team behind ADEPT say their model will allow other researchers to create smarter AIs in the future, as well as give us a better understanding of how infants understand the world around them. "By the time infants are three months old, they have some notion that objects don't wink in and out of existence, and can't move through each other or teleport," said Kevin A. Smith, one of the researchers that created ADEPT. "We wanted to capture and formalize that knowledge to build infant cognition into artificial-intelligence agents. We're now getting near human-like in the way models can pick apart basic implausible or plausible scenes."

adept, infant, mit create, (8 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.38)

arXiv.org Machine Learning, Dec-5-2019

Collective Learning

Farina, Francesco

Department of Electrical, Electronic and Information Engineering, Alma Mater Studiorum - Università di Bologna, Bologna, Italy

Abstract

In this paper, we introduce the concept of collective learning (CL), which exploits the notion of collective intelligence in the field of distributed semi-supervised learning. The proposed framework draws inspiration from the learning behavior of human beings, who alternate phases involving collaboration, confrontation and exchange of views with other phases consisting of studying and learning on their own. In this regard, CL comprises two main phases: a self-training phase in which learning is performed on local private (labeled) data only, and a collective training phase in which proxy-labels are assigned to shared (unlabeled) data by means of a consensus-based algorithm. In the considered framework, heterogeneous systems can be connected over the same network, each with different computational capabilities and resources, and each system in the network may take advantage of the cooperation and will eventually reach higher performance than it can reach on its own. An extensive experimental campaign on an image classification problem emphasizes the properties of CL by analyzing the performance achieved by the cooperating agents.

1 Introduction

The notion of collective intelligence was first introduced in [Engelbart, 1962] and popularized in the sociological field by Pierre Lévy in [Lévy and Bononno, 1997]. In the words of Lévy, collective intelligence "is a form of universally distributed intelligence, constantly enhanced, coordinated in real time, and resulting in the effective mobilization of skills". Moreover, "the basis and goal of collective intelligence is mutual recognition and enrichment of individuals rather than the cult of fetishized or hypostatized communities".

In this paper, we aim to exploit some concepts borrowed from the notion of collective intelligence in a distributed machine learning scenario. In fact, by cooperating with each other, machines may exhibit performance higher than what they can obtain by learning on their own. We call this framework collective learning (CL). Distributed systems¹ have received steadily growing attention in recent years and

¹ When talking about distributed systems, the word distributed can be used with different meanings.
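The collective training phase, in which agents agree on a proxy-label for a shared unlabeled sample, can be sketched with a plain average of the agents' class-probability predictions. Plain averaging is a stand-in chosen for illustration; it is not the consensus-based algorithm of the paper, and the function name and the example predictions are hypothetical.

```python
def proxy_label(agent_predictions):
    """Combine per-agent class probabilities for one shared, unlabeled sample.

    `agent_predictions` is a list with one probability vector per agent.
    Returns the agreed label (index of the largest averaged probability)
    together with the averaged probability vector.
    """
    n_agents = len(agent_predictions)
    n_classes = len(agent_predictions[0])
    averaged = [
        sum(prediction[c] for prediction in agent_predictions) / n_agents
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: averaged[c])
    return label, averaged


# Assumed predictions of three agents over three classes for one shared image.
predictions = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
label, averaged = proxy_label(predictions)
```

In this toy case the second agent on its own would pick class 1, but the pooled opinion settles on class 0, which each agent could then use as a training target for the shared sample.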

agent, artificial intelligence, machine learning, (13 more...)

1912.0258

Country:

Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.44)
North America > United States > Virginia (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)

arXiv.org Machine Learning, Dec-5-2019

Learning Human Objectives by Evaluating Hypothetical Behavior

Reddy, Siddharth, Dragan, Anca D., Levine, Sergey, Legg, Shane, Leike, Jan

We seek to align agent behavior with a user's objectives in a reinforcement learning setting with unknown dynamics, an unknown reward function, and unknown unsafe states. The user knows the rewards and unsafe states, but querying the user is expensive. To address this challenge, we propose an algorithm that safely and interactively learns a model of the user's reward function. We start with a generative model of initial states and a forward dynamics model trained on off-policy data. Our method uses these models to synthesize hypothetical behaviors, asks the user to label the behaviors with rewards, and trains a neural network to predict the rewards. The key idea is to actively synthesize the hypothetical behaviors from scratch by maximizing tractable proxies for the value of information, without interacting with the environment. We call this method reward query synthesis via trajectory optimization (ReQueST). We evaluate ReQueST with simulated users on a state-based 2D navigation task and the image-based Car Racing video game. The results show that ReQueST significantly outperforms prior methods in learning reward models that transfer to new environments with different initial state distributions. Moreover, ReQueST safely trains the reward model to detect unsafe states, and corrects reward hacking before deploying the agent.

query, reward model, trajectory, (14 more...)

1912.05652

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Education (0.49)
Health & Medicine (0.46)
Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Marot, Antoine, Donnot, Benjamin, Romero, Camilo, Veyrin-Forrer, Luca, Lerousseau, Marvin, Donon, Balthazar, Guyon, Isabelle

Learning to run a power network challenge for training topology controllers

arXiv.org Machine Learning, Dec-5-2019

For power grid operations, a large body of research focuses on using generation redispatching, load shedding or demand side management flexibilities. However, a less costly and potentially more flexible option would be grid topology reconfiguration, as already partially exploited by Coreso (European RSC) and RTE (French TSO) operations. Beyond previous work on branch switching, bus reconfigurations are a broader class of action and could provide some substantial benefits to route electricity and optimize the grid capacity to keep it within safety margins. Because of its non-linear and combinatorial nature, no existing optimal power flow solver can yet tackle this problem. We here propose a new framework to learn topology controllers through imitation and reinforcement learning. We present the design and the results of the first "Learning to Run a Power Network" challenge released with this framework. We finally develop a method providing performance upper-bounds (oracle), which highlights remaining unsolved challenges and suggests future directions of improvement.

agent, overload, scenario, (12 more...)

1912.04211

Country:

Europe > Portugal > Porto > Porto (0.05)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.40)

Industry:

Energy > Power Industry (1.00)
Leisure & Entertainment (0.93)
Energy > Renewable > Wind (0.46)
Energy > Renewable > Solar (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)