AITopics | Agents

We consider the networked multi-agent reinforcement learning (MARL) problem in a fully decentralized setting, where agents learn to coordinate to achieve joint success. This problem arises in many areas, including traffic control, distributed control, and smart grids. We assume that the reward function for each agent can be different and is observed only locally by the agent itself. Furthermore, each agent is located at a node of a communication network and can exchange information only with its neighbors. Using softmax temporal consistency and a decentralized optimization method, we obtain a principled and data-efficient iterative algorithm. In the first step of each iteration, an agent computes its local policy and value gradients and then updates only its policy parameters. In the second step, the agent propagates messages based on its value function to its neighbors and then updates its own value function. Hence we name the algorithm value propagation. We prove a non-asymptotic convergence rate of 1/T with nonlinear function approximation. To the best of our knowledge, it is the first MARL algorithm with a convergence guarantee in the control, off-policy, and nonlinear function approximation setting. We empirically demonstrate the effectiveness of our approach in experiments.
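The communication constraint in this abstract (rewards observed only locally, messages exchanged only with neighbors) can be illustrated with a plain consensus-averaging sketch. This is not the value propagation algorithm itself: it omits the policy and value gradients entirely and only shows how neighbor-only mixing lets every agent recover a network-wide quantity. The ring topology, the reward values, and the names `metropolis_weights` and `consensus_round` are hypothetical choices made for illustration.

```python
def metropolis_weights(neighbors):
    """Build a symmetric, doubly stochastic mixing matrix for an undirected graph.

    Uses the standard Metropolis rule: the weight on edge (i, j) is
    1 / (1 + max(degree_i, degree_j)); the self-weight takes the remainder.
    """
    n = len(neighbors)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            W[i][j] = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
        W[i][i] = 1.0 - sum(W[i])  # W[i][i] is still 0.0 when the sum is taken
    return W


def consensus_round(values, W, neighbors):
    """One communication round: each agent mixes its own value with the
    values received from its direct neighbors only."""
    return [
        W[i][i] * values[i] + sum(W[i][j] * values[j] for j in neighbors[i])
        for i in range(len(values))
    ]


# Hypothetical 4-agent ring; each agent sees only its own reward.
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
local_rewards = [1.0, 2.0, 3.0, 6.0]

W = metropolis_weights(neighbors)
estimates = list(local_rewards)
for _ in range(100):
    estimates = consensus_round(estimates, W, neighbors)

print(estimates)  # every entry approaches the network-wide mean reward
```

Because the mixing matrix is doubly stochastic, the sum of the estimates is preserved in every round, so the agents can only agree on the average of the local rewards. No agent ever reads another agent's reward directly.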

agent, propagation, value propagation, (12 more...)

arXiv.org Machine Learning

1901.09326

Country: North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.64)

Industry: Energy > Power Industry (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.67)

Identifying artificial intelligence 'blind spots'

#artificialintelligence Jan-26-2019, 21:17:56 GMT

A novel model developed by MIT and Microsoft researchers identifies instances in which autonomous systems have "learned" from training examples that don't match what's actually happening in the real world. Engineers could use this model to improve the safety of artificial intelligence systems, such as driverless vehicles and autonomous robots. The AI systems powering driverless cars, for example, are trained extensively in virtual simulations to prepare the vehicle for nearly every event on the road. But sometimes the car makes an unexpected error in the real world because an event occurs that should, but doesn't, alter the car's behavior. Consider a driverless car that wasn't trained, and more importantly doesn't have the sensors necessary, to differentiate between distinctly different scenarios, such as large, white cars and ambulances with red, flashing lights on the road.

artificial intelligence, blind spot, machine learning, (16 more...)

#artificialintelligence

Industry:

Transportation > Passenger (0.74)
Transportation > Ground > Road (0.74)
Information Technology > Robotics & Automation (0.74)
Automobiles & Trucks (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.69)

Google's AI, Chatbase Adds a New Service to Design More Versatile Virtual Agents Quicker

#artificialintelligence Jan-26-2019, 10:50:02 GMT

Chatbase is a conversational AI platform that replaces the risky status quo approach with a data-driven one based on Google's world-class machine learning and search capabilities. The results include faster development (by up to 10x) of a more helpful and versatile virtual agent, and happier customers! Initially, Chatbase provided a free-to-use analytics service for measuring and optimizing any AI-powered chatbot. After analyzing hundreds of thousands of bots and billions of messages in the first 18 months of existence, Google had two revelations about how to help bot builders in a more impactful way: one, that customer service virtual agents would become the primary use case for the technology; and two, that using ML to glean insights from live-chat transcripts at scale would drastically shorten development time for those agents while creating a better consumer experience. With those lessons learned, Chatbase Virtual Agent Modeling (currently available via an EAP) was born.

agent, artificial intelligence, virtual agent, (10 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

AI bots team up to wrangle digital swine in Minecraft

#artificialintelligence Jan-26-2019, 08:50:54 GMT

Wrangling a pig--even a virtual one--is much easier if you get a friend to help. This much seems clear from a contest organized by Microsoft researchers to test how artificially intelligent agents could cooperate to solve tricky problems. How best to cooperate with your pig-wrangling pal is another question. The competition addresses an area of artificial intelligence that has had relatively little attention so far. AI researchers often develop software capable of performing a specific human task, such as playing chess or Go, and then measure it according to its ability to defeat a human player.

agent, artificial intelligence, machine learning, (10 more...)

#artificialintelligence

Country: Europe > United Kingdom (0.17)

Industry: Leisure & Entertainment > Games > Computer Games (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.80)
Information Technology > Artificial Intelligence > Machine Learning (0.79)

Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning

Wen, Ying, Yang, Yaodong, Luo, Rui, Wang, Jun, Pan, Wei

arXiv.org Machine Learning Jan-26-2019

Humans are capable of attributing latent mental contents such as beliefs or intentions to others. This social skill is critical in everyday life for reasoning about the potential consequences of others' behaviors so as to plan ahead. It is known that humans use this reasoning ability recursively, i.e., by considering what others believe about their own beliefs. In this paper, we start from level-$1$ recursion and introduce a probabilistic recursive reasoning (PR2) framework for multi-agent reinforcement learning. Our hypothesis is that it is beneficial for each agent to account for how the opponents would react to its future behaviors. Under the PR2 framework, we adopt variational Bayes methods to approximate the opponents' conditional policy, to which each agent finds the best response and then improves its own policy. We develop decentralized-training-decentralized-execution algorithms, PR2-Q and PR2-Actor-Critic, that are proved to converge in the self-play scenario when there is one Nash equilibrium. Our methods are tested on both the matrix game and the differential game, which have a non-trivial equilibrium where common gradient-based methods fail to converge. Our experiments show that it is critical to reason about what the opponents believe about what the agent believes. We expect our work to contribute a new idea for modeling opponents to the multi-agent reinforcement learning community.

agent, conference paper, opponent, (13 more...)

arXiv.org Machine Learning

1901.09207

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Multi-Agent Generalized Recursive Reasoning

Wen, Ying, Yang, Yaodong, Lu, Rui, Wang, Jun

arXiv.org Artificial Intelligence Jan-26-2019

We propose a new reasoning protocol called generalized recursive reasoning (GR2) and embed it into the multi-agent reinforcement learning (MARL) framework. The GR2 model defines reasoning categories: a level-$0$ agent acts randomly, and a level-$k$ agent takes the best response to a mixture of agents distributed over levels $0$ to $k-1$. GR2 learners can take bounded rationality into account, and they do not need the assumption, required by many MARL algorithms, that the opponent agents play a Nash strategy in all stage games. We prove that when the level $k$ is large, the GR2 learners will converge to at least one Nash equilibrium (NE). In addition, if lower-level agents play the NE, higher-level agents will surely follow as well. We evaluate the GR2 Soft Actor-Critic algorithms in a series of games and high-dimensional environments; results show that the GR2 methods converge faster than strong MARL baselines.
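The level-$k$ hierarchy described in this abstract can be sketched on a small matrix game. The code below is only an illustration of the reasoning categories, not the authors' GR2 Soft Actor-Critic: the stag-hunt payoffs, the function names, and the truncated Poisson weighting over lower levels (with rate `lam`) are all assumptions introduced here, since the abstract does not say how lower levels are weighted.

```python
import math


def best_response(payoff, opponent_policy):
    """Return a one-hot policy that maximizes expected payoff.

    payoff[a][b] is the payoff for playing action a against opponent action b.
    """
    expected = [
        sum(p * u for p, u in zip(opponent_policy, row)) for row in payoff
    ]
    best = max(range(len(payoff)), key=lambda a: expected[a])
    return [1.0 if a == best else 0.0 for a in range(len(payoff))]


def level_k_policies(payoff, max_level, lam=1.5):
    """Build policies for levels 0..max_level in a symmetric two-player game.

    Level 0 acts uniformly at random. Level k best-responds to a mixture of
    levels 0..k-1, weighted here by a truncated Poisson(lam) distribution
    (an assumption made for this sketch).
    """
    n_actions = len(payoff)
    policies = [[1.0 / n_actions] * n_actions]  # level 0: uniform random
    for k in range(1, max_level + 1):
        raw = [lam ** h / math.factorial(h) for h in range(k)]
        total = sum(raw)
        weights = [w / total for w in raw]
        # Opponent model: weighted mixture of all lower-level policies.
        mixture = [
            sum(w * pol[a] for w, pol in zip(weights, policies))
            for a in range(n_actions)
        ]
        policies.append(best_response(payoff, mixture))
    return policies


# Hypothetical stag hunt: action 0 = stag, action 1 = hare.
stag_hunt = [
    [4, 0],  # stag pays 4 against stag, 0 against hare
    [3, 3],  # hare pays 3 regardless of the opponent
]

policies = level_k_policies(stag_hunt, max_level=3)
for level, policy in enumerate(policies):
    print(level, policy)
```

In this toy game every level from 1 upward selects hare, and (hare, hare) is a Nash equilibrium of the stag hunt, which matches the flavor of the abstract's claim that higher levels settle on an equilibrium. The example says nothing about the convergence proof itself.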

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

1901.09216

Country: Europe (0.28)

Genre: Research Report (0.70)

Industry: Leisure & Entertainment > Games (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

AI Dominates Human Professional Players in StarCraft II

#artificialintelligence Jan-25-2019, 23:50:38 GMT

An artificial intelligence has defeated two top-ranked human players in the computer game StarCraft II, using some strategies rarely encountered before. On Thursday, gamers were able to watch the AI agent, called AlphaStar, expertly command armies of "Protoss" units against the professional players. The result: The AI beat the humans in 10 of the 11 matches. "I was surprised by how strong the agent was," said Dario "TLO" Wünsch, one of the human players. "AlphaStar takes well-known strategies and turns them on their head."

artificial intelligence, deepmind, human player, (9 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.36)

A simple blueprint for building AI-powered customer service on GCP | Google Cloud Blog

#artificialintelligence Jan-25-2019, 22:04:44 GMT

As a Google Cloud customer engineer based in Amsterdam, I work with a lot of banks and insurance companies in the Netherlands. All of them have this common requirement: to help customer service agents (many of whom are poorly trained interns due to the expense of hiring) handle large numbers of customer calls, especially at the end of the year when many consumers want to change or update their insurance plan. Most of these requests are predictable and easily resolved with the exchange of a small amount of information, which is a perfect use case for an AI-powered customer service agent. Virtual agents can provide non-queued service around the clock, and can easily be programmed to handle simple requests as well as do a hand-off to well-trained live agents for more complicated issues. Furthermore, a well-designed solution can help ensure that consumer requests, regardless of the channel in which they are received (phone, chat, IoT), are routed to the correct resource.

artificial intelligence, building ai-powered customer service, customer service, (5 more...)

#artificialintelligence

Country: Europe > Netherlands > North Holland > Amsterdam (0.27)

Industry:

Information Technology > Services (0.63)
Banking & Finance > Insurance (0.60)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.87)

Identifying artificial intelligence 'blind spots'

#artificialintelligence Jan-25-2019, 22:04:04 GMT

A novel model developed by MIT and Microsoft researchers identifies instances in which autonomous systems have "learned" from training examples that don't match what's actually happening in the real world. Engineers could use this model to improve the safety of artificial intelligence systems, such as driverless vehicles and autonomous robots. The AI systems powering driverless cars, for example, are trained extensively in virtual simulations to prepare the vehicle for nearly every event on the road. But sometimes the car makes an unexpected error in the real world because an event occurs that should, but doesn't, alter the car's behavior. Consider a driverless car that wasn't trained, and more importantly doesn't have the sensors necessary, to differentiate between distinctly different scenarios, such as large, white cars and ambulances with red, flashing lights on the road.

artificial intelligence, blind spot, machine learning, (16 more...)

#artificialintelligence

Industry:

Transportation > Passenger (0.74)
Transportation > Ground > Road (0.74)
Information Technology > Robotics & Automation (0.74)
Automobiles & Trucks (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.69)

Self-driving cars, robots: Identifying AI 'blind spots'

#artificialintelligence Jan-25-2019, 18:29:38 GMT

The AI systems powering driverless cars, for example, are trained extensively in virtual simulations to prepare the vehicle for nearly every event on the road. But sometimes the car makes an unexpected error in the real world because an event occurs that should, but doesn't, alter the car's behavior. Consider a driverless car that wasn't trained, and more importantly doesn't have the sensors necessary, to differentiate between distinctly different scenarios, such as large, white cars and ambulances with red, flashing lights on the road. If the car is cruising down the highway and an ambulance flicks on its sirens, the car may not know to slow down and pull over, because it does not perceive the ambulance as different from a big white car. In a pair of papers -- presented at last year's Autonomous Agents and Multiagent Systems conference and the upcoming Association for the Advancement of Artificial Intelligence conference -- the researchers describe a model that uses human input to uncover these training "blind spots."

blind spot, ramakrishnan, real world, (16 more...)

#artificialintelligence

Industry: