AITopics

1810.08159

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)

Genre: Research Report (0.64)

Industry:

Automobiles & Trucks (0.93)
Transportation > Ground > Road (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)
(3 more...)

#artificialintelligenceApr-2-2020, 07:31:33 GMT

DeepMind's Agent57 AI agent can best human players across a suite of 57 Atari games – TechCrunch

Development of artificial intelligence agents tends to frequently be measured by their performance in games, but there's a good reason for that: Games tend to offer a wide proficiency curve, in terms of being relatively simple to grasp the basics, but difficult to master, and they almost always have a built-in scoring system to evaluate performance. DeepMind's agents have tackled board game Go, as well as real-time strategy video game StarCraft. But the Alphabet company's most recent feat is Agent57, a learning agent that can beat the average human on each of 57 Atari games with a wide range of difficulty, characteristics and gameplay styles. Being better than humans at 57 Atari games may seem like an odd benchmark against which to measure the performance of a deep learning agent, but it's actually a standard that goes all the way back to 2012, with a selection of Atari classics including Pitfall, Solaris, Montezuma's Revenge and many others. Taken together, these games represent a broad range of difficulty levels, as well as requiring a range of different strategies in order to achieve success.

agent, deepmind, human player, (7 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Siebert, Luciano Cavalcante, Mercuur, Rijk, Dignum, Virginia, Hoven, Jeroen van den, Jonker, Catholijn

Improving Confidence in the Estimation of Values and Norms

arXiv.org Artificial IntelligenceApr-2-2020

Autonomous agents (AA) will increasingly be interacting with us in our daily lives. While we want the benefits attached to AAs, it is essential that their behavior is aligned with our values and norms. Hence, an AA will need to estimate the values and norms of the humans it interacts with, which is not a straightforward task when solely observing an agent's behavior. This paper analyses to what extent an AA is able to estimate the values and norms of a simulated human agent (SHA) based on its actions in the ultimatum game. We present two methods to reduce ambiguity in profiling the SHAs: one based on search space exploration and another based on counterfactual analysis. We found that both methods are able to increase the confidence in estimating human values and norms, but differ in their applicability, the latter being more efficient when the number of interactions with the agent is to be minimized. These insights are useful to improve the alignment of AAs with human values and norms.

artificial intelligence, machine learning, value and norm, (17 more...)

2004.01056

Country:

Europe > Netherlands > South Holland > Delft (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
North America > United States > Virginia (0.04)
Europe > Sweden (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

arXiv.org Artificial IntelligenceApr-2-2020

Information State Embedding in Partially Observable Cooperative Multi-Agent Reinforcement Learning

Mao, Weichao, Zhang, Kaiqing, Miehling, Erik, Başar, Tamer

Multi-agent reinforcement learning (MARL) under partial observability has long been considered challenging, primarily due to the requirement for each agent to maintain a belief over all other agents' local histories -- a domain that generally grows exponentially over time. In this work, we investigate a partially observable MARL problem in which agents are cooperative. To enable the development of tractable algorithms, we introduce the concept of an information state embedding that serves to compress agents' histories. We quantify how the compression error influences the resulting value functions for decentralized control. Furthermore, we propose three natural embeddings, based on finite-memory truncation, principal component analysis, and recurrent neural networks. The output of these embeddings are then used as the information state, and can be fed into any MARL algorithm. The proposed embed-then-learn pipeline opens the black-box of existing MARL algorithms, allowing us to establish some theoretical guarantees (error bounds of value functions) while still achieving competitive performance with many end-to-end approaches.

agent, information, information state, (15 more...)

2004.01098

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology (0.46)
Transportation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.64)

Chu, Tianshu, Chinchali, Sandeep, Katti, Sachin

Multi-agent Reinforcement Learning for Networked System Control

arXiv.org Machine LearningApr-2-2020

This paper considers multi-agent reinforcement learning (MARL) in networked system control. Specifically, each agent learns a decentralized control policy based on local observations and messages from connected neighbors. We formulate such a networked MARL (NMARL) problem as a spatiotemporal Markov decision process and introduce a spatial discount factor to stabilize the training of each local agent. Further, we propose a new differentiable communication protocol, called NeurComm, to reduce information loss and non-stationarity in NMARL. Based on experiments in realistic NMARL scenarios of adaptive traffic signal control and cooperative adaptive cruise control, an appropriate spatial discount factor effectively enhances the learning curves of non-communicative MARL algorithms, while NeurComm outperforms existing communication protocols in both learning efficiency and control performance.

communication, neurcomm, scenario, (13 more...)

2004.01339

Country:

Europe > Monaco (0.06)
North America > United States > California > Santa Clara County > Stanford (0.04)

Genre: Research Report (0.40)

Industry:

Transportation > Infrastructure & Services (0.66)
Transportation > Ground > Road (0.66)
Consumer Products & Services > Travel (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Kajić, Ivana, Aygün, Eser, Precup, Doina

Learning to cooperate: Emergent communication in multi-agent navigation

arXiv.org Machine LearningApr-2-2020

Emergent communication in artificial agents has been studied to understand language evolution, as well as to develop artificial systems that learn to communicate with humans. We show that agents performing a cooperative navigation task in various gridworld environments learn an interpretable communication protocol that enables them to efficiently, and in many cases, optimally, solve the task. An analysis of the agents' policies reveals that emergent signals spatially cluster the state space, with signals referring to specific locations and spatial directions such as "left", "up", or "upper left room". Using populations of agents, we show that the emergent protocol has basic compositional structure, thus exhibiting a core property of natural language.

agent, receiver, sender, (16 more...)

2004.01097

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

arXiv.org Machine LearningApr-2-2020

Planning as Inference in Epidemiological Models

Wood, Frank, Warrington, Andrew, Naderiparizi, Saeid, Weilbach, Christian, Masrani, Vaden, Harvey, William, Scibior, Adam, Beronov, Boyan, Nasseri, Ali

In this work we demonstrate how existing software tools can be used to automate parts of infectious disease-control policy-making via performing inference in existing epidemiological dynamics models. The kind of inference tasks undertaken include computing, for planning purposes, the posterior distribution over putatively controllable, via direct policy-making choices, simulation model parameters that give rise to acceptable disease progression outcomes. Neither the full capabilities of such inference automation software tools nor their utility for planning is widely disseminated at the current time. Timely gains in understanding about these tools and how they can be used may lead to more fine-grained and less economically damaging policy prescriptions, particularly during the current COVID-19 pandemic.

inference, simulation, simulator, (14 more...)

2003.13221

Country:

North America > United States > Pennsylvania > Allegheny County (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
North America > Canada > British Columbia (0.04)
(13 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Capezzuto, Luca, Tarapore, Danesh, Ramchurn, Sarvapali D.

Anytime and Efficient Coalition Formation with Spatial and Temporal Constraints

arXiv.org Artificial IntelligenceApr-2-2020

The Coalition Formation with Spatial and Temporal constraints Problem (CFSTP) is a multi-agent task allocation problem where the agents are cooperative and few, the tasks are many, spatially distributed, with deadlines and workloads, and the objective is to find a schedule that maximises the number of completed tasks. The current state-of-the-art CFSTP solver, the Coalition Formation with Look-Ahead (CFLA) algorithm, has two main limitations. First, its time complexity is quadratic with the number of tasks and exponential with the number of agents, which makes it not efficient. Second, its look-ahead technique is not effective in real-world scenarios, such as open multi-agent systems, where new tasks can appear at any time. Motivated by this, we propose an extension of CFLA, which we call Coalition Formation with Improved Look-Ahead (CFLA+). Since CFLA+ inherits the limitations of CFLA, we also develop a novel algorithm to solve the CFSTP, the first to be both anytime and efficient, which we call Cluster-based Coalition Formation (CCF). We empirically show that, in settings where the look-ahead technique is highly effective, CCF completes up to 20% (resp. 10%) more tasks than CFLA (resp. CFLA+) while being up to four orders of magnitude faster. Our results affirm CCF as the new state-of-the-art CFSTP solver.

agent, algorithm, cfla, (16 more...)

2003.13806

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Hampshire > Southampton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.35)

#artificialintelligenceApr-1-2020, 08:56:02 GMT

DeepMind's Agent57 beats humans at 57 classic Atari games

In a preprint paper published this week by DeepMind, Google parent company Alphabet's U.K.-based research division, a team of scientists describe Agent57, which they say is the first system that outperforms humans on all 57 Atari games in the Arcade Learning Environment data set. Assuming the claim holds water, Agent57 could lay the groundwork for more capable AI decision-making models than have been previously released. This could be a boon for enterprises looking to boost productivity through workplace automation; imagine AI that automatically completes not only mundane, repetitive tasks like data entry, but which reasons about its environment. "With Agent57, we have succeeded in building a more generally intelligent agent that has above-human performance on all tasks in the Atari57 benchmark," wrote the study's coauthors. "Agent57 was able to scale with increasing amounts of computation: the longer it trained, the higher its score got."

large language model, machine learning, natural language, (16 more...)

#artificialintelligence

Genre: Research Report (0.51)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.81)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.56)

Barik, Adarsh, Honorio, Jean

Provable Sample Complexity Guarantees for Learning of Continuous-Action Graphical Games with Nonparametric Utilities

arXiv.org Machine LearningApr-1-2020

Game theory has been extensively used as a framework to model and study the strategic interactions amongst rational but selfish individual players who are trying to maximize their payoffs. Game theory has been applied in many fields including but not limited to social and political science, economics, communication, system design and computer science. In non-cooperative games each player decides its action based on the actions of others players. These games are characterized by the equilibrium solution concept such as Nash equilibrium (NE) [18] which serves a descriptive role of the stable outcome of the overall behavior of self-interested players (e.g., people, companies, governments, groups or autonomous systems) interacting strategically with each other in distributed settings. Graphical games, introduced within the AI community about two decades ago, graphical games [16], are a representation of multiplayer games which capture and exploit locality or sparsity of direct influences. They are most appropriate for large-scale population games in which the payoffs of each player are determined by the actions of only a small number of other players. Indeed, graphical games played a prominent role in establishing the computational complexity of computing NE in normal-form games as well as in succinctly representable multiplayer games (see, e.g., [5, 6, 7] and the references therein). Graphical games have been studied for both discrete and continuous actions.

artificial intelligence, game theory, machine learning, (17 more...)

2004.01022

Country:

North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
Europe > Portugal (0.04)
Europe > Iceland (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.34)