AITopics

1907.0981

Country: Europe > United Kingdom (0.14)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)
Information Technology > Game Theory (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Davis, Trevor, Schmid, Martin, Bowling, Michael

Low-Variance and Zero-Variance Baselines for Extensive-Form Games

Extensive-form games (EFGs) are a common model of multi-agent interactions with imperfect information. State-of-the-art algorithms for solving these games typically perform full walks of the game tree that can prove prohibitively slow in large games. Alternatively, sampling-based methods such as Monte Carlo Counterfactual Regret Minimization walk one or more trajectories through the tree, touching only a fraction of the nodes on each iteration, at the expense of requiring more iterations to converge due to the variance of sampled values. In this paper, we extend recent work that uses baseline estimates to reduce this variance. We introduce a framework of baseline-corrected values in EFGs that generalizes the previous work. Within our framework, we propose new baseline functions that result in significantly reduced variance compared to existing techniques. We show that one particular choice of such a function --- predictive baseline --- is provably optimal under certain sampling schemes. This allows for efficient computation of zero-variance value estimates even along sampled trajectories.

baseline, information, variance, (16 more...)

1907.09633

Country:

North America > Canada > Alberta (0.14)
North America > United States > Texas (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Oliehoek, Frans A., Witwicki, Stefan, Kaelbling, Leslie P.

A Sufficient Statistic for Influence in Structured Multiagent Environments

Making decisions in complex environments is a key challenge in artificial intelligence (AI). Situations involving multiple decision makers are particularly complex, leading to computation intractability of principled solution methods. A body of work in AI [4, 3, 41, 45, 47, 2] has tried to mitigate this problem by trying to bring down interaction to its core: how does the policy of one agent influence another agent? If we can find more compact representations of such influence, this can help us deal with the complexity, for instance by searching the space of influences rather than that of policies [45]. However, so far these notions of influence have been restricted in their applicability to special cases of interaction. In this paper we formalize influence-based abstraction (IBA), which facilitates the elimination of latent state factors without any loss in value, for a very general class of problems described as factored partially observable stochastic games (fPOSGs) [33]. This generalizes existing descriptions of influence, and thus can serve as the foundation for improvements in scalability and other insights in decision making in complex settings.

agent, artificial intelligence, machine learning, (19 more...)

1907.09278

Country: North America > United States (0.45)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Cazenille, Leo, Bredeche, Nicolas, Halloy, José

Automatic Calibration of Artificial Neural Networks for Zebrafish Collective Behaviours using a Quality Diversity Algorithm

During the last two decades, various models have been proposed for fish collective motion. These models are mainly developed to decipher the biological mechanisms of social interaction between animals. They consider very simple homogeneous unbounded environments and it is not clear that they can simulate accurately the collective trajectories. Moreover when the models are more accurate, the question of their scalability to either larger groups or more elaborate environments remains open. This study deals with learning how to simulate realistic collective motion of collective of zebrafish, using real-world tracking data. The objective is to devise an agent-based model that can be implemented on an artificial robotic fish that can blend into a collective of real fish. We present a novel approach that uses Quality Diversity algorithms, a class of algorithms that emphasise exploration over pure optimisation. In particular, we use CVT-MAP-Elites, a variant of the state-of-the-art MAP-Elites algorithm for high dimensional search space. Results show that Quality Diversity algorithms not only outperform classic evolutionary reinforcement learning methods at the macroscopic level (i.e. group behaviour), but are also able to generate more realistic biomimetic behaviours at the microscopic level (i.e. individual behaviour).

artificial intelligence, machine learning, trajectory, (17 more...)

doi: 10.1007/978-3-030-24741-6_4

1907.09209

Country: Asia > Japan (0.28)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Censi, Andrea, Bolognani, Saverio, Zilly, Julian G., Mousavi, Shima Sadat, Frazzoli, Emilio

Today Me, Tomorrow Thee: Efficient Resource Allocation in Competitive Settings using Karma Games

We present a new type of coordination mechanism among multiple agents for the allocation of a finite resource, such as the allocation of time slots for passing an intersection. We consider the setting where we associate one counter to each agent, which we call karma value, and where there is an established mechanism to decide resource allocation based on agents exchanging karma. The idea is that agents might be inclined to pass on using resources today, in exchange for karma, which will make it easier for them to claim the resource use in the future. To understand whether such a system might work robustly, we only design the protocol and not the agents' policies. We take a game-theoretic perspective and compute policies corresponding to Nash equilibria for the game. We find, surprisingly, that the Nash equilibria for a society of self-interested agents are very close in social welfare to a centralized cooperative solution. These results suggest that many resource allocation problems can have a simple, elegant, and robust solution, assuming the availability of a karma accounting mechanism.

agent, artificial intelligence, game theory, (16 more...)

1907.09198

Country: Europe (0.28)

Genre: Research Report (0.84)

Industry:

Transportation > Infrastructure & Services (0.68)
Leisure & Entertainment > Games (0.68)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Albrecht, Stefano V., Ramamoorthy, Subramanian

Comparative Evaluation of Multiagent Learning Algorithms in a Diverse Set of Ad Hoc Team Problems

This paper is concerned with evaluating different multiagent learning (MAL) algorithms in problems where individual agents may be heterogenous, in the sense of utilizing different learning strategies, without the opportunity for prior agreements or information regarding coordination. Such a situation arises in ad hoc team problems, a model of many practical multiagent systems applications. Prior work in multiagent learning has often been focussed on homogeneous groups of agents, meaning that all agents were identical and a priori aware of this fact. Also, those algorithms that are specifically designed for ad hoc team problems are typically evaluated in teams of agents with fixed behaviours, as opposed to agents which are adapting their behaviours. In this work, we empirically evaluate five MAL algorithms, representing major approaches to multiagent learning but originally developed with the homogeneous setting in mind, to understand their behaviour in a set of ad hoc team problems. All teams consist of agents which are continuously adapting their behaviours. The algorithms are evaluated with respect to a comprehensive characterisation of repeated matrix games, using performance criteria that include considerations such as attainment of equilibrium, social welfare and fairness. Our main conclusion is that there is no clear winner. However, the comparative evaluation also highlights the relative strengths of different algorithms with respect to the type of performance criteria, e.g., social welfare vs. attainment of equilibrium.

agent, algorithm, artificial intelligence, (18 more...)

1907.09189

Country: Europe (0.28)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Leisure & Entertainment > Games (0.68)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Apt, Krzysztof R., Wojtczak, Dominik

Open Problems in a Logic of Gossips

arXiv.org Artificial IntelligenceJul-21-2019

Gossip protocols are programs used in a setting in which each agent holds a secret and the aim is to reach a situation in which all agents know all secrets. Such protocols rely on a point-to-point or group communication. Distributed epistemic gossip protocols use epistemic formulas in the component programs for the agents. The advantage of the use of epistemic logic is that the resulting protocols are very concise and amenable for a simple verification. Recently, we introduced a natural modal logic that allows one to express distributed epistemic gossip protocols and to reason about their correctness. We proved that the resulting protocols are implementable and that all aspects of their correctness, including termination, are decidable. To establish these results we showed that both the definition of semantics and of truth of the underlying logic are decidable. We also showed that the analogous results hold for an extension of this logic with the 'common knowledge' operator. However, several, often deceptively simple, questions about this logic and the corresponding gossip protocols remain open. The purpose of this paper is to list and elucidate these questions and provide for them an appropriate background information in the form of partial of related results.

agent, gossip protocol, protocol, (14 more...)

doi: 10.4204/EPTCS.297.1

1907.09097

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Middle East > Iran (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Surynek, Pavel, Kumar, T. K. Satish, Koenig, Sven

Multi-Agent Path Finding with Capacity Constraints

arXiv.org Artificial IntelligenceJul-21-2019

In multi-agent path finding (MAPF) the task is to navigate agents from their starting positions to given individual goals. The problem takes place in an undirected graph whose vertices represent positions and edges define the topology. Agents can move to neighbor vertices across edges. In the standard MAPF, space occupation by agents is modeled by a capacity constraint that permits at most one agent per vertex. We suggest an extension of MAPF in this paper that permits more than one agent per vertex. Propositional satisfiability (SAT) models for these extensions of MAPF are studied. We focus on modeling capacity constraints in SAT-based formulations of MAPF and evaluation of performance of these models. We extend two existing SAT-based formulations with vertex capacity constraints: MDD-SAT and SMT-CBS where the former is an approach that builds the model in an eager way while the latter relies on lazy construction of the model.

agent, artificial intelligence, constraint, (16 more...)

1907.12648

Country:

Europe (0.93)
North America > United States > California (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Belardinelli, Francesco, Grandi, Umberto

Social Choice Methods for Database Aggregation

arXiv.org Artificial IntelligenceJul-21-2019

Knowledge can be represented compactly in multiple ways, from a set of propositional formulas, to a Kripke model, to a database. In this paper we study the aggregation of information coming from multiple sources, each source submitting a database modelled as a first-order relational structure. In the presence of integrity constraints, we identify classes of aggregators that respect them in the aggregated database, provided these are satisfied in all individual databases. We also characterise languages for first-order queries on which the answer to a query on the aggregated database coincides with the aggregation of the answers to the query obtained on each individual database. This contribution is meant to be a first step on the application of techniques from social choice theory to knowledge representation in databases.

artificial intelligence, logic & formal reasoning, nullu, (17 more...)

doi: 10.4204/EPTCS.297.4

1907.10492

Country: Europe (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)

arXiv.org Machine LearningJul-21-2019

Deep Reinforcement Learning for Autonomous Internet of Things: Model, Applications and Challenges

Lei, Lei, Tan, Yue, Liu, Shiwen, Zheng, Kan, Xuemin, null, Shen, null

The Internet of Things (IoT) extends the Internet connectivity into billions of IoT devices around the world, which collect and share information to reflect the status of physical world. The Autonomous Control System (ACS), on the other hand, performs control functions on the physical systems without external intervention over an extended period of time. The integration of IoT and ACS results in a new concept - autonomous IoT (AIoT). The sensors collect information on the system status, based on which intelligent agents in IoT devices as well as Edge/Fog/Cloud servers make control decisions for the actuators to react. In order to achieve autonomy, a promising method is for the intelligent agents to leverage the techniques in the field of artificial intelligence, especially reinforcement learning (RL) and deep reinforcement learning (DRL) for decision making. In this paper, we first provide comprehensive survey of the state-of-art research, and then propose a general model for the applications of RL/DRL in AIoT. Finally, the challenges and open issues for future research are identified.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Machine Learning

1907.09059

Genre:

Overview (1.00)
Research Report > Promising Solution (0.87)

Industry:

Information Technology > Smart Houses & Appliances (1.00)
Energy > Renewable (1.00)
Energy > Power Industry (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)