AITopics

2008.06495

Country:

North America > United States > Texas (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Bridge (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Mann, Vipul, Sivaram, Abhishek, Das, Laya, Venkatasubramanian, Venkat

Robust and Efficient Swarm Communication Topologies for Hostile Environments

arXiv.org Artificial IntelligenceAug-21-2020

Swarm Intelligence-based optimization techniques combine systematic exploration of the search space with information available from neighbors and rely strongly on communication among agents. These algorithms are typically employed to solve problems where the function landscape is not adequately known and there are multiple local optima that could result in premature convergence for other algorithms. Applications of such algorithms can be found in communication systems involving design of networks for efficient information dissemination to a target group, targeted drug-delivery where drug molecules search for the affected site before diffusing, and high-value target localization with a network of drones. In several of such applications, the agents face a hostile environment that can result in loss of agents during the search. Such a loss changes the communication topology of the agents and hence the information available to agents, ultimately influencing the performance of the algorithm. In this paper, we present a study of the impact of loss of agents on the performance of such algorithms as a function of the initial network configuration. We use particle swarm optimization to optimize an objective function with multiple sub-optimal regions in a hostile environment and study its performance for a range of network topologies with loss of agents. The results reveal interesting trade-offs between efficiency, robustness, and performance for different topologies that are subsequently leveraged to discover general properties of networks that maximize performance. Moreover, networks with small-world properties are seen to maximize performance under hostile conditions.

artificial intelligence, evolutionary algorithm, machine learning, (17 more...)

doi: 10.1016/j.swevo.2021.100848

2008.09575

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Zhao, Wenshuai, Queralta, Jorge Peña, Qingqing, Li, Westerlund, Tomi

Towards Closing the Sim-to-Real Gap in Collaborative Multi-Robot Deep Reinforcement Learning

arXiv.org Machine LearningAug-18-2020

Current research directions in deep reinforcement learning include bridging the simulation-reality gap, improving sample efficiency of experiences in distributed multi-agent reinforcement learning, together with the development of robust methods against adversarial agents in distributed learning, among many others. In this work, we are particularly interested in analyzing how multi-agent reinforcement learning can bridge the gap to reality in distributed multi-robot systems where the operation of the different robots is not necessarily homogeneous. These variations can happen due to sensing mismatches, inherent errors in terms of calibration of the mechanical joints, or simple differences in accuracy. While our results are simulation-based, we introduce the effect of sensing, calibration, and accuracy mismatches in distributed reinforcement learning with proximal policy optimization (PPO). We discuss on how both the different types of perturbances and how the number of agents experiencing those perturbances affect the collaborative learning effort. The simulations are carried out using a Kuka arm model in the Bullet physics engine. This is, to the best of our knowledge, the first work exploring the limitations of PPO in multi-robot systems when considering that different robots might be exposed to different environments where their sensors or actuators have induced errors. With the conclusions of this work, we set the initial point for future work on designing and developing methods to achieve robust reinforcement learning on the presence of real-world perturbances that might differ within a multi-robot system.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2008.07875

Country: Europe > Finland > Southwest Finland > Turku (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment > Games > Computer Games (0.51)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.50)

Chen, Xiaoyi, Chaudhari, Pratik

MIDAS: Multi-agent Interaction-aware Decision-making with Adaptive Strategies for Urban Autonomous Navigation

arXiv.org Machine LearningAug-17-2020

Autonomous navigation in crowded, complex urban environments requires interacting with other agents on the road. A common solution to this problem is to use a prediction model to guess the likely future actions of other agents. While this is reasonable, it leads to overly conservative plans because it does not explicitly model the mutual influence of the actions of interacting agents. This paper builds a reinforcement learning-based method named MIDAS where an ego-agent learns to affect the control actions of other cars in urban driving scenarios. MIDAS uses an attention-mechanism to handle an arbitrary number of other agents and includes a ''driver-type'' parameter to learn a single policy that works across different planning objectives. We build a simulation environment that enables diverse interaction experiments with a large number of agents and methods for quantitatively studying the safety, efficiency, and interaction among vehicles. MIDAS is validated using extensive experiments and we show that it (i) can work across different road geometries, (ii) results in an adaptive ego policy that can be tuned easily to satisfy performance criteria such as aggressive or cautious driving, (iii) is robust to changes in the driving policies of external agents, and (iv) is more efficient and safer than existing approaches to interaction-aware decision-making.

agent, artificial intelligence, machine learning, (17 more...)

2008.07081

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report (0.82)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Felli, Paolo, Gianola, Alessandro, Montali, Marco

SMT-based Safety Verification of Parameterised Multi-Agent Systems

arXiv.org Artificial IntelligenceAug-14-2020

In this paper we study the verification of parameterised multi-agent systems (MASs), and in particular the task of verifying whether unwanted states, characterised as a given state formula, are reachable in a given MAS, i.e., whether the MAS is unsafe. The MAS is parameterised and the model only describes the finite set of possible agent templates, while the actual number of concrete agent instances for each template is unbounded and cannot be foreseen. This makes the state-space infinite. As safety may of course depend on the number of agent instances in the system, the verification result must be correct irrespective of such number. We solve this problem via infinite-state model checking based on satisfiability modulo theories (SMT), relying on the theory of array-based systems: we present parameterised MASs as particular array-based systems, under two execution semantics for the MAS, which we call concurrent and interleaved. We prove our decidability results under these assumptions and illustrate our implementation approach, called SAFE: the Swarm Safety Detector, based on the third-party model checker MCMT, which we evaluate experimentally. Finally, we discuss how this approach lends itself to richer parameterised and data-aware MAS settings beyond the state-of-the-art solutions in the literature, which we leave as future work.

artificial intelligence, formula, formulae, (17 more...)

2008.04774

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Italy (0.04)

Genre:

Workflow (0.67)
Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

Dubey, Abhimanyu, Pentland, Alex

Cooperative Multi-Agent Bandits with Heavy Tails

arXiv.org Machine LearningAug-14-2020

We study the heavy-tailed stochastic bandit problem in the cooperative multi-agent setting, where a group of agents interact with a common bandit problem, while communicating on a network with delays. Existing algorithms for the stochastic bandit in this setting utilize confidence intervals arising from an averaging-based communication protocol known as~\textit{running consensus}, that does not lend itself to robust estimation for heavy-tailed settings. We propose \textsc{MP-UCB}, a decentralized multi-agent algorithm for the cooperative stochastic bandit that incorporates robust estimation with a message-passing protocol. We prove optimal regret bounds for \textsc{MP-UCB} for several problem settings, and also demonstrate its superiority to existing methods. Furthermore, we establish the first lower bounds for the cooperative bandit problem, in addition to providing efficient algorithms for robust bandit estimation of location.

agent, artificial intelligence, data mining, (17 more...)

2008.06244

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.56)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)

Dubey, Abhimanyu, Pentland, Alex

Kernel Methods for Cooperative Multi-Agent Contextual Bandits

arXiv.org Machine LearningAug-14-2020

Cooperative multi-agent decision making involves a group of agents cooperatively solving learning problems while communicating over a network with delays. In this paper, we consider the kernelised contextual bandit problem, where the reward obtained by an agent is an arbitrary linear function of the contexts' images in the related reproducing kernel Hilbert space (RKHS), and a group of agents must cooperate to collectively solve their unique decision problems. For this problem, we propose \textsc{Coop-KernelUCB}, an algorithm that provides near-optimal bounds on the per-agent regret, and is both computationally and communicatively efficient. For special cases of the cooperative problem, we also provide variants of \textsc{Coop-KernelUCB} that provides optimal per-agent regret. In addition, our algorithm generalizes several existing results in the multi-agent bandit setting. Finally, on a series of both synthetic and real-world multi-agent network benchmarks, we demonstrate that our algorithm significantly outperforms existing benchmarks.

agent, algorithm, artificial intelligence, (13 more...)

2008.0622

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Industry:

Education (0.34)
Health & Medicine (0.31)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)

Gutierrez, Julian, Najib, Muhammad, Perelli, Giuseppe, Wooldridge, Michael

Automated Temporal Equilibrium Analysis: Verification and Synthesis of Multi-Player Games

arXiv.org Artificial IntelligenceAug-12-2020

In the context of multi-agent systems, the rational verification problem is concerned with checking which temporal logic properties will hold in a system when its constituent agents are assumed to behave rationally and strategically in pursuit of individual objectives. Typically, those objectives are expressed as temporal logic formulae which the relevant agent desires to see satisfied. Unfortunately, rational verification is computationally complex, and requires specialised techniques in order to obtain practically useable implementations. In this paper, we present such a technique. This technique relies on a reduction of the rational verification problem to the solution of a collection of parity games. Our approach has been implemented in the Equilibrium Verification Environment (EVE) system. The EVE system takes as input a model of a concurrent/multi-agent system represented using the Simple Reactive Modules Language (SRML), where agent goals are represented as Linear Temporal Logic (LTL) formulae, together with a claim about the equilibrium behaviour of the system, also expressed as an LTL formula. EVE can then check whether the LTL claim holds on some (or every) computation of the system that could arise through agents choosing Nash equilibrium strategies; it can also check whether a system has a Nash equilibrium, and synthesise individual strategies for players in the multi-player game. After presenting our basic framework, we describe our new technique and prove its correctness. We then describe our implementation in the EVE system, and present experimental results which show that EVE performs favourably in comparison to other existing tools that support rational verification.

artificial intelligence, equilibrium, nash equilibrium, (15 more...)

2008.05638

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.65)

Blumenkamp, Jan, Prorok, Amanda

The Emergence of Adversarial Communication in Multi-Agent Reinforcement Learning

arXiv.org Artificial IntelligenceAug-6-2020

Many real-world problems require the coordination of multiple autonomous agents. Recent work has shown the promise of Graph Neural Networks (GNNs) to learn explicit communication strategies that enable complex multi-agent coordination. These works use models of cooperative multi-agent systems whereby agents strive to achieve a shared global goal. When considering agents with self-interested local objectives, the standard design choice is to model these as separate learning systems (albeit sharing the same environment). Such a design choice, however, precludes the existence of a single, differentiable communication channel, and consequently prohibits the learning of inter-agent communication strategies. In this work, we address this gap by presenting a learning model that accommodates individual non-shared rewards and a differentiable communication channel that is common among all agents. We focus on the case where agents have self-interested objectives, and develop a learning algorithm that elicits the emergence of adversarial communications. We perform experiments on multi-agent coverage and path planning problems, and employ a post-hoc interpretability technique to visualize the messages that agents communicate to each other. We show how a single self-interested agent is capable of learning highly manipulative communication strategies that allows it to significantly outperform a cooperative team of agents.

agent, artificial intelligence, machine learning, (17 more...)

2008.02616

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.92)

arXiv.org Artificial IntelligenceAug-4-2020

Social Choice Optimization

García-Camino, Andrés

Social choice is the theory about collective decision towards social welfare starting from individual opinions, preferences, interests or welfare. The field of Computational Social Welfare is somewhat recent and it is gaining impact in the Artificial Intelligence Community. Classical literature makes the assumption of single-peaked preferences, i.e. there exist a order in the preferences and there is a global maximum in this order. This year some theoretical results were published about Two-stage Approval Voting Systems (TAVs), Multi-winner Selection Rules (MWSR) and Incomplete (IPs) and Circular Preferences (CPs). The purpose of this paper is three-fold: Firstly, I want to introduced Social Choice Optimisation as a generalisation of TAVs where there is a max stage and a min stage implementing thus a Minimax, well-known Artificial Intelligence decision-making rule to minimize hindering towards a (Social) Goal. Secondly, I want to introduce, following my Open Standardization and Open Integration Theory (in refinement process) put in practice in my dissertation, the Open Standardization of Social Inclusion, as a global social goal of Social Choice Optimization.

artificial intelligence, machine learning, optimization problem, (15 more...)

2007.15393

Country: Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report > New Finding (0.69)

Industry: Transportation (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.47)