AITopics

2206.1519

Country:

South America > Brazil > São Paulo (0.04)
North America > United States > Indiana > Monroe County > Bloomington (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)

Mondal, Washim Uddin, Aggarwal, Vaneet, Ukkusuri, Satish V.

On the Near-Optimality of Local Policies in Large Cooperative Multi-Agent Reinforcement Learning

We show that in a cooperative $N$-agent network, one can design locally executable policies for the agents such that the resulting discounted sum of average rewards (value) well approximates the optimal value computed over all (including non-local) policies. Specifically, we prove that, if $|\mathcal{X}|, |\mathcal{U}|$ denote the sizes of the state and action spaces of individual agents, then for a sufficiently small discount factor, the approximation error is given by $\mathcal{O}(e)$ where $e\triangleq \frac{1}{\sqrt{N}}\left[\sqrt{|\mathcal{X}|}+\sqrt{|\mathcal{U}|}\right]$. Moreover, in a special case where the reward and state transition functions are independent of the action distribution of the population, the error improves to $\mathcal{O}(e)$ where $e\triangleq \frac{1}{\sqrt{N}}\sqrt{|\mathcal{X}|}$. Finally, we also devise an algorithm to explicitly construct a local policy. With the help of our approximation results, we further establish that the constructed local policy is within $\mathcal{O}(\max\{e,\epsilon\})$ distance of the optimal policy, and the sample complexity to achieve such a local policy is $\mathcal{O}(\epsilon^{-3})$, for any $\epsilon>0$.
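To get a feel for how the error scale $e$ in the abstract above shrinks with the number of agents, the short sketch below evaluates the two expressions for $e$ numerically. The function name `error_scale` and the example sizes are illustrative assumptions, not taken from the paper, and the constants hidden inside $\mathcal{O}(\cdot)$ are ignored, so the numbers indicate scaling only.

```python
import math


def error_scale(n_agents, n_states, n_actions=None):
    """Evaluate e = (sqrt(|X|) + sqrt(|U|)) / sqrt(N).

    If n_actions is None, evaluate the special case e = sqrt(|X|) / sqrt(N),
    which the abstract states for rewards and transitions that do not depend
    on the population's action distribution.
    """
    if n_agents <= 0 or n_states <= 0:
        raise ValueError("n_agents and n_states must be positive")
    numerator = math.sqrt(n_states)
    if n_actions is not None:
        if n_actions <= 0:
            raise ValueError("n_actions must be positive")
        numerator += math.sqrt(n_actions)
    return numerator / math.sqrt(n_agents)


# Hypothetical sizes chosen only for illustration: N = 10,000, |X| = 16, |U| = 9.
e_general = error_scale(n_agents=10_000, n_states=16, n_actions=9)
e_special = error_scale(n_agents=10_000, n_states=16)
print(e_general, e_special)
```

With these made-up sizes, the general expression is larger than the special-case one, and quadrupling `n_agents` halves both.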

machine learning, marl, reinforcement learning, (15 more...)

2209.03491

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.82)

Ornia, Daniel Jarne, Mazo, Manuel Jr

Robust Event-Driven Interactions in Cooperative Multi-Agent Learning

Lately, with the wide adoption of Deep Learning techniques for compact representations of value functions and policies in model-free problems [16, 21, 34], the field of Multi-Agent Reinforcement Learning (MARL) has seen an explosion in the applications of such algorithms to solve real-world problems [19]. However, this has naturally led to a trend where both the amount of data handled in such data-driven approaches and the complexity of the targeted problems grow exponentially. In a MARL setting where communication between agents is required, this may inevitably lead to restrictive requirements on the frequency and reliability of the communication to and from each agent (as already pointed out in [23]). The effect of asynchronous communication in dynamic programming problems was already studied in [2]. In particular, one of the first examples of how communication affects learning and policy performance in MARL is found in [31], where the author investigates the impact of agents sharing different combinations of state variable subsets or Q values.

agent, communication, robust event-driven interaction, (10 more...)

doi: 10.1007/978-3-031-15839-1_16

2204.03361

Country:

Europe > Netherlands > South Holland > Delft (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Khadivpour, Faraz, Banerjee, Arghasree, Guzdial, Matthew

Responsibility: An Example-based Explainable AI approach via Training Process Inspection

Explainable Artificial Intelligence (XAI) methods are intended to help human users better understand the decision making of an AI agent. However, many modern XAI approaches are unintuitive to end users, particularly those without prior AI or ML knowledge. In this paper, we present a novel XAI approach we call Responsibility that identifies the most responsible training example for a particular decision. This example can then be shown as an explanation: "this is what I (the AI) learned that led me to do that". We present experimental results across a number of domains along with the results of an Amazon Mechanical Turk user study, comparing responsibility and existing XAI methods on an image classification task. Our results demonstrate that responsibility can help improve accuracy for both human end users and secondary ML models.

responsibility, training data, xai approach, (16 more...)

2209.03433

Country:

North America > Canada > Alberta (0.14)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.66)

Jestel, Christian, Surmann, Hartmut, Stenzel, Jonas, Urbann, Oliver, Brehler, Marius

Obtaining Robust Control and Navigation Policies for Multi-Robot Navigation via Deep Reinforcement Learning

Multi-robot navigation is one of the main challenges in mobile robotics. Multiple robots must be coordinated simultaneously to finish their task and have to navigate through a complex dynamic environment without causing collisions. One approach to enable the coordination of multi-robot navigation is prioritized planning, where robots plan their trajectories sequentially one after another. Prioritized planning algorithms tend to find a deadlock-free solution for route planning, and centralized as well as decentralized planning solutions exist [1]. With a centralized approach all robots are coordinated by a single system, whereas navigation conflicts are resolved via communication between the robots in decentralized approaches. Prioritized path planning approaches tend to find solutions for scenarios with a high number of robots, while other approaches or reactive collision-avoidance algorithms like ORCA [2] fail. However, the main drawback of centralized approaches is their poor scalability, as the planning complexity increases drastically with the number of robots and the size and complexity of the environment [3]. Additionally, reliable and synchronized communication between the centralized planner and all robots is essential. Decentralized approaches often rely on communication between robots in order to share state information (e.g.

agent, algorithm, robot, (12 more...)

doi: 10.1109/ICARA51699.2021.9376457

2209.03097

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Germany (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(10 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Nair, Jayprakash S., Kulkarni, Divya D., Joshi, Ajitem, Suresh, Sruthy

On Decentralizing Federated Reinforcement Learning in Multi-Robot Scenarios

Federated Learning (FL) allows for collaboratively aggregating learned information across several computing devices and sharing the same amongst them, thereby tackling issues of privacy and the need for huge bandwidth. FL techniques generally use a central server or cloud for aggregating the models received from the devices. Such centralized FL techniques suffer from inherent problems such as failure of the central node and bottlenecks in channel bandwidth. When FL is used in conjunction with connected robots serving as devices, a failure of the central controlling entity can lead to a chaotic situation. This paper describes a mobile-agent-based paradigm to decentralize FL in multi-robot scenarios. Using Webots, a popular free open-source robot simulator, and Tartarus, a mobile agent platform, we present a methodology to decentralize federated learning in a set of connected robots. With Webots running on different connected computing systems, we show how mobile agents can perform the task of Decentralized Federated Reinforcement Learning (dFRL). Results obtained from experiments carried out using Q-learning and SARSA, by aggregating their corresponding Q-tables, show the viability of using decentralized FL in the domain of robotics. Since the proposed work can be used in conjunction with other learning algorithms and also real robots, it can act as a vital tool for the study of decentralized FL using heterogeneous learning algorithms concurrently in multi-robot scenarios.
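The abstract above mentions aggregating the robots' Q-tables but does not say which aggregation rule is used. As a minimal sketch, assuming a plain element-wise average over equally shaped tables (a common federated-averaging choice, not confirmed as the paper's method), the idea can be illustrated as follows. The function name `average_q_tables` and the sample tables are hypothetical.

```python
def average_q_tables(q_tables):
    """Return the element-wise mean of equally shaped Q-tables.

    Each Q-table is a list of rows (one row per state), and each row holds
    one value per action. Plain averaging is an assumption made for this
    sketch; the source does not specify the aggregation rule.
    """
    if not q_tables:
        raise ValueError("need at least one Q-table")
    n_states = len(q_tables[0])
    n_actions = len(q_tables[0][0])
    for table in q_tables:
        if len(table) != n_states or any(len(row) != n_actions for row in table):
            raise ValueError("all Q-tables must share the same shape")
    count = len(q_tables)
    return [
        [sum(table[s][a] for table in q_tables) / count for a in range(n_actions)]
        for s in range(n_states)
    ]


# Two hypothetical robots, each with 2 states and 2 actions.
robot_a = [[0.0, 1.0], [2.0, 4.0]]
robot_b = [[1.0, 3.0], [0.0, 2.0]]
merged = average_q_tables([robot_a, robot_b])
print(merged)
```

In a decentralized setting, a visiting mobile agent could apply such a merge to the tables it carries and the table it finds on each robot, so no central server is needed.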

agent, mobile agent, robot, (14 more...)

2207.09372

Country:

Asia > India > Assam > Guwahati (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents

Zheng, Kaizhi, Zhou, Kaiwen, Gu, Jing, Fan, Yue, Wang, Jialu, Di, Zonglin, He, Xuehai, Wang, Xin Eric

Building a conversational embodied agent to execute real-life tasks has been a long-standing yet quite challenging research goal, as it requires effective human-agent communication, multi-modal understanding, long-range sequential decision making, etc. Traditional symbolic methods have scaling and generalization issues, while end-to-end deep learning models suffer from data scarcity and high task complexity, and are often hard to explain. To benefit from both worlds, we propose JARVIS, a neuro-symbolic commonsense reasoning framework for modular, generalizable, and interpretable conversational embodied agents. First, it acquires symbolic representations by prompting large language models (LLMs) for language understanding and sub-goal planning, and by constructing semantic maps from visual observations. Then the symbolic module reasons for sub-goal planning and action generation based on task- and action-level common sense. Extensive experiments on the TEACh dataset validate the efficacy and efficiency of our JARVIS framework, which achieves state-of-the-art (SOTA) results on all three dialog-based embodied tasks, including Execution from Dialog History (EDH), Trajectory from Dialog (TfD), and Two-Agent Task Completion (TATC) (e.g., our method boosts the unseen Success Rate on EDH from 6.1\% to 15.8\%). Moreover, we systematically analyze the essential factors that affect the task performance and also demonstrate the superiority of our method in few-shot settings. Our JARVIS model ranks first in the Alexa Prize SimBot Public Benchmark Challenge.

agent, information, navigation, (16 more...)

2208.13266

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

#artificialintelligence Sep-6-2022, 22:35:15 GMT

Forget chess, DeepMind's training its new AI to play football

Researchers from DeepMind, the UK's juggernaut AI lab, have forsaken the noble games of chess and Go for a more plebeian delight: football. The Google sister company yesterday published a research paper and accompanying blog post detailing its new neural probabilistic motor primitives (NPMP) -- a method by which artificial intelligence agents can learn to operate physical bodies. An NPMP is a general-purpose motor control module that translates short-horizon motor intentions to low-level control signals, and it's trained offline or via RL by imitating motion capture (MoCap) data, recorded with trackers on humans or animals performing motions of interest. Up front: Essentially, the DeepMind team created an AI system that can learn how to do things inside of a physics simulator by watching videos of other agents performing those tasks. And, of course, if you've got a giant physics engine and an endless supply of curious robots, the only rational thing to do is to teach it how to dribble and shoot: We optimized teams of agents to play simulated football via reinforcement learning, constraining the solution space to that of plausible movements learned using human motion capture data. Background: In order to train AI to operate and control robots in the world, researchers have to prepare the machines for reality.

agent, deepmind, football, (9 more...)

#artificialintelligence

Industry:

Leisure & Entertainment > Sports (1.00)
Leisure & Entertainment > Games > Chess (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Venkata, Sanjay Sarma Oruganti, Parasuraman, Ramviyas, Pidaparti, Ramana

KT-BT: A Framework for Knowledge Transfer Through Behavior Trees in Multi-Robot Systems

arXiv.org Artificial Intelligence Sep-6-2022

Multi-Robot and Multi-Agent Systems demonstrate collective (swarm) intelligence through systematic and distributed integration of local behaviors in a group. Agents sharing knowledge about the mission and environment can enhance performance at individual and mission levels. However, this is difficult to achieve, partly due to the lack of a generic framework for transferring part of the known knowledge (behaviors) between agents. This paper presents a new knowledge representation framework and a transfer strategy called KT-BT: Knowledge Transfer through Behavior Trees. The KT-BT framework follows a query-response-update mechanism through an online Behavior Tree framework, where agents broadcast queries for unknown conditions and respond with appropriate knowledge using a condition-action-control sub-flow. We embed a novel grammar structure called stringBT that encodes knowledge, enabling behavior sharing. We theoretically investigate the properties of the KT-BT framework in achieving homogeneity of high knowledge across the entire group compared to a heterogeneous system without the capability of sharing their knowledge. We extensively verify our framework in a simulated multi-robot search and rescue problem. The results show successful knowledge transfers and improved group performance in various scenarios. We further study the effects of opportunities and communication range on group performance, knowledge spread, and functional heterogeneity in a group of agents, presenting interesting insights.

agent, knowledge, robot, (16 more...)

2209.02886

Country:

North America > United States > Georgia > Clarke County > Athens (0.14)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report > New Finding (0.87)

Industry:

Leisure & Entertainment > Games > Computer Games (0.46)
Automobiles & Trucks (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

arXiv.org Artificial Intelligence Sep-6-2022

A Zeroth-Order Momentum Method for Risk-Averse Online Convex Games

Wang, Zifan, Shen, Yi, Bell, Zachary I., Nivison, Scott, Zavlanos, Michael M., Johansson, Karl H.

We consider risk-averse learning in repeated unknown games where the goal of the agents is to minimize their individual risk of incurring significantly high cost. Specifically, the agents use the conditional value at risk (CVaR) as a risk measure and rely on bandit feedback in the form of the cost values of the selected actions at every episode to estimate their CVaR values and update their actions. A major challenge in using bandit feedback to estimate CVaR is that the agents can only access their own cost values, which, however, depend on the actions of all agents. To address this challenge, we propose a new risk-averse learning algorithm with momentum that utilizes the full historical information on the cost values. We show that this algorithm achieves sub-linear regret and matches the best known algorithms in the literature. We provide numerical experiments for a Cournot game that show that our method outperforms existing methods.
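For readers unfamiliar with the risk measure named in the abstract above, the sketch below estimates CVaR from a batch of observed cost samples as the mean of the worst fraction of costs. It illustrates only the risk measure, not the paper's zeroth-order momentum algorithm. The function name `empirical_cvar`, the `tail_fraction` parameter, and the sample costs are assumptions made for this example, and level conventions for CVaR differ between references.

```python
import math


def empirical_cvar(costs, tail_fraction):
    """Estimate CVaR as the mean of the largest `tail_fraction` of the costs.

    `tail_fraction` lies in (0, 1]; a value of 1.0 reduces to the plain mean.
    The tail size is rounded up so that at least one sample is always used.
    """
    if not costs:
        raise ValueError("need at least one cost sample")
    if not 0 < tail_fraction <= 1:
        raise ValueError("tail_fraction must be in (0, 1]")
    tail_size = max(1, math.ceil(tail_fraction * len(costs)))
    worst_costs = sorted(costs, reverse=True)[:tail_size]
    return sum(worst_costs) / tail_size


# Hypothetical per-episode costs observed by one agent.
observed_costs = [1.0, 2.0, 3.0, 10.0]
risk = empirical_cvar(observed_costs, tail_fraction=0.5)
average = empirical_cvar(observed_costs, tail_fraction=1.0)
print(risk, average)
```

Here the rare high cost dominates the tail average, which is why a risk-averse agent minimizing CVaR behaves differently from one minimizing the mean cost.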

agent, algorithm, gradient estimate, (15 more...)

2209.02838

Country:

North America > United States > North Carolina > Durham County > Durham (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.50)