AITopics

2209.10513

Country: Asia > India (0.04)

Genre:

Overview (0.67)
Research Report > New Finding (0.46)

Industry: Information Technology > Smart Houses & Appliances (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Communications > Networks > Sensor Networks (0.90)

arXiv.org Artificial IntelligenceSep-21-2022

Distributed Online Non-convex Optimization with Composite Regret

Jiang, Zhanhong, Balu, Aditya, Lee, Xian Yeow, Lee, Young M., Hegde, Chinmay, Sarkar, Soumik

Regret has been widely adopted as the metric of choice for evaluating the performance of online optimization algorithms for distributed, multi-agent systems. However, data/model variations associated with agents can significantly impact decisions and requires consensus among agents. Moreover, most existing works have focused on developing approaches for (either strongly or non-strongly) convex losses, and very few results have been obtained regarding regret bounds in distributed online optimization for general non-convex losses. To address these two issues, we propose a novel composite regret with a new network regret-based metric to evaluate distributed online optimization algorithms. We concretely define static and dynamic forms of the composite regret. By leveraging the dynamic form of our composite regret, we develop a consensus-based online normalized gradient (CONGD) approach for pseudo-convex losses, and it provably shows a sublinear behavior relating to a regularity term for the path variation of the optimizer. For general non-convex losses, we first shed light on the regret for the setting of distributed online non-convex learning based on recent advances such that no deterministic algorithm can achieve the sublinear regret. We then develop the distributed online non-convex optimization with composite regret (DINOCO) without access to the gradients, depending on an offline optimization oracle. DINOCO is shown to achieve sublinear regret; to our knowledge, this is the first regret bound for general distributed online non-convex learning.

artificial intelligence, machine learning, optimization, (16 more...)

2209.10105

Country:

North America > United States > Iowa (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceSep-21-2022

Towards a Standardised Performance Evaluation Protocol for Cooperative MARL

Gorsane, Rihab, Mahjoub, Omayma, de Kock, Ruan, Dubb, Roland, Singh, Siddarth, Pretorius, Arnu

Multi-agent reinforcement learning (MARL) has emerged as a useful approach to solving decentralised decision-making problems at scale. Research in the field has been growing steadily with many breakthrough algorithms proposed in recent years. In this work, we take a closer look at this rapid development with a focus on evaluation methodologies employed across a large body of research in cooperative MARL. By conducting a detailed meta-analysis of prior work, spanning 75 papers accepted for publication from 2016 to 2022, we bring to light worrying trends that put into question the true rate of progress. We further consider these trends in a wider context and take inspiration from single-agent RL literature on similar issues with recommendations that remain applicable to MARL. Combining these recommendations, with novel insights from our analysis, we propose a standardised performance evaluation protocol for cooperative MARL. We argue that such a standard protocol, if widely adopted, would greatly improve the validity and credibility of future research, make replication and reproducibility easier, as well as improve the ability of the field to accurately gauge the rate of progress over time by being able to make sound comparisons across different works.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

2209.10485

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry: Transportation > Ground (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Teaching Autonomous Systems Hands-On: Leveraging Modular Small-Scale Hardware in the Robotics Classroom

Betz, Johannes, Zheng, Hongrui, Zang, Zirui, Sauerbeck, Florian, Walas, Krzysztof, Dimitrov, Velin, Behl, Madhur, Zheng, Rosa, Biswas, Joydeep, Krovi, Venkat, Mangharam, Rahul

Although robotics courses are well established in higher education, the courses often focus on theory and sometimes lack the systematic coverage of the techniques involved in developing, deploying, and applying software to real hardware. Additionally, most hardware platforms for robotics teaching are low-level toys aimed at younger students at middle-school levels. To address this gap, an autonomous vehicle hardware platform, called F1TENTH, is developed for teaching autonomous systems hands-on. This article describes the teaching modules and software stack for teaching at various educational levels with the theme of "racing" and competitions that replace exams. The F1TENTH vehicles offer a modular hardware platform and its related software for teaching the fundamentals of autonomous driving algorithms. From basic reactive methods to advanced planning algorithms, the teaching modules enhance students' computational thinking through autonomous driving with the F1TENTH vehicle. The F1TENTH car fills the gap between research platforms and low-end toy cars and offers hands-on experience in learning the topics in autonomous systems. Four universities have adopted the teaching modules for their semester-long undergraduate and graduate courses for multiple years. Student feedback is used to analyze the effectiveness of the F1TENTH platform. More than 80% of the students strongly agree that the hardware platform and modules greatly motivate their learning, and more than 70% of the students strongly agree that the hardware-enhanced their understanding of the subjects. The survey results show that more than 80% of the students strongly agree that the competitions motivate them for the course.

artificial intelligence, student, vehicle, (18 more...)

2209.11181

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(13 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Transportation > Ground > Road (1.00)
Education > Educational Setting > K-12 Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Multi-Robot-Assisted Human Crowd Evacuation using Navigation Velocity Fields

Zheng, Tongjia, Yuan, Zhenyuan, Nayyar, Mollik, Wagner, Alan R., Zhu, Minghui, Lin, Hai

This work studies a robot-assisted crowd evacuation problem where we control a small group of robots to guide a large human crowd to safe locations. The challenge lies in how to model human-robot interactions and design robot controls to indirectly control a human population that significantly outnumbers the robots. To address the challenge, we treat the crowd as a continuum and formulate the evacuation objective as driving the crowd density to target locations. We propose a novel mean-field model which consists of a family of microscopic equations that explicitly model how human motions are locally guided by the robots and an associated macroscopic equation that describes how the crowd density is controlled by the navigation velocity fields generated by all robots. Then, we design density feedback controllers for the robots to dynamically adjust their states such that the generated navigation velocity fields drive the crowd density to a target density. Stability guarantees of the proposed controllers are proven. Agent-based simulations are included to evaluate the proposed evacuation algorithms.

artificial intelligence, evacuation, robot, (13 more...)

2209.09795

Country:

North America > United States > Pennsylvania > Centre County > University Park (0.04)
North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Liu, Guan-Horng, Chen, Tianrong, So, Oswin, Theodorou, Evangelos A.

Deep Generalized Schr\"odinger Bridge

Mean-Field Game (MFG) serves as a crucial mathematical framework in modeling the collective behavior of individual agents interacting stochastically with a large population. In this work, we aim at solving a challenging class of MFGs in which the differentiability of these interacting preferences may not be available to the solver, and the population is urged to converge exactly to some desired distribution. These setups are, despite being well-motivated for practical purposes, complicated enough to paralyze most (deep) numerical solvers. Nevertheless, we show that Schr\"odinger Bridge - as an entropy-regularized optimal transport model - can be generalized to accepting mean-field structures, hence solving these MFGs. This is achieved via the application of Forward-Backward Stochastic Differential Equations theory, which, intriguingly, leads to a computational framework with a similar structure to Temporal Difference learning. As such, it opens up novel algorithmic connections to Deep Reinforcement Learning that we leverage to facilitate practical training. We show that our proposed objective function provides necessary and sufficient conditions to the mean-field problem. Our method, named Deep Generalized Schr\"odinger Bridge (DeepGSB), not only outperforms prior methods in solving classical population navigation MFGs, but is also capable of solving 1000-dimensional opinion depolarization, setting a new state-of-the-art numerical solver for high-dimensional MFGs. Our code will be made available at https://github.com/ghliu/DeepGSB.

arxiv preprint arxiv, machine learning, reinforcement learning, (17 more...)

2209.09893

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Los Angeles County > Santa Monica (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (0.50)
Instructional Material > Course Syllabus & Notes (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Vecchio, Giuseppe, Palazzo, Simone, Guastella, Dario C., Carlucho, Ignacio, Albrecht, Stefano V., Muscato, Giovanni, Spampinato, Concetto

MIDGARD: A Simulation Platform for Autonomous Navigation in Unstructured Environments

We present MIDGARD, an open-source simulation platform for autonomous robot navigation in outdoor unstructured environments. MIDGARD is designed to enable the training of autonomous agents (e.g., unmanned ground vehicles) in photorealistic 3D environments, and to support the generalization skills of learning-based agents through the variability in training scenarios. MIDGARD's main features include a configurable, extensible, and difficulty-driven procedural landscape generation pipeline, with fast and photorealistic scene rendering based on Unreal Engine. Additionally, MIDGARD has built-in support for OpenAI Gym, a programming interface for feature extension (e.g., integrating new types of sensors, customizing exposing internal simulation variables), and a variety of simulated agent sensors (e.g., RGB, depth and instance/semantic segmentation). We evaluate MIDGARD's capabilities as a benchmarking tool for robot navigation utilizing a set of state-of-the-art reinforcement learning algorithms. The results demonstrate MIDGARD's suitability as a simulation and training environment, as well as the effectiveness of our procedural generation approach in controlling scene difficulty, which directly reflects on accuracy metrics. MIDGARD build, source code and documentation are available at https://midgardsim.org/.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

2205.08389

Country:

North America > United States (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Italy (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (0.54)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Target-Guided Open-Domain Conversation Planning

Kishinami, Yosuke, Akama, Reina, Sato, Shiki, Tokuhisa, Ryoko, Suzuki, Jun, Inui, Kentaro

Prior studies addressing target-oriented conversational tasks lack a crucial notion that has been intensively studied in the context of goal-oriented artificial intelligence agents, namely, planning. In this study, we propose the task of Target-Guided Open-Domain Conversation Planning (TGCP) task to evaluate whether neural conversational agents have goal-oriented conversation planning abilities. Using the TGCP task, we investigate the conversation planning abilities of existing retrieval models and recent strong generative models. The experimental results reveal the challenges facing current technology.

artificial intelligence, machine learning, natural language, (19 more...)

2209.09746

Country:

Asia > Japan > Honshū > Tōhoku (0.05)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.35)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.34)

Saulnier, Kelsey, Zhou, Lifeng, Pappas, George, Kumar, Vijay

Resilient Consensus via Voronoi Communication Graphs

Consensus algorithms form the foundation for many distributed algorithms by enabling multiple robots to converge to consistent estimates of global variables using only local communication. However, standard consensus protocols can be easily led astray by non-cooperative team members. As such, the study of resilient forms of consensus is necessary for designing resilient distributed algorithms. W-MSR consensus is one such resilient consensus algorithm that allows for resilient consensus with only local knowledge of the communication graph and no a priori model for the data being shared. However, the verification that a given communication graph meets the strict graph connectivity requirement makes W-MSR difficult to use in practice. In this paper, we show that a commonly used communication graph structure in robotics literature, the communication graph built based on the Voronoi tessellation, automatically results in a sufficiently connected graph to reject a single non-cooperative team member. Further, we show how this graph can be enhanced to reject two non-cooperative team members and provide a roadmap for modifications for further resilience. This contribution will allow for the easy application of resilient consensus to algorithms that already rely on Voronoi-based communication such as distributed coverage and exploration algorithms.

artificial intelligence, neighbor, robot, (17 more...)

2209.04381

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Datta, Siddhartha, Shadbolt, Nigel

Low-Loss Subspace Compression for Clean Gains against Multi-Agent Backdoor Attacks

Recent exploration of the multi-agent backdoor attack demonstrated the backfiring effect, a natural defense against backdoor attacks where backdoored inputs are randomly classified. This yields a side-effect of low accuracy w.r.t. clean labels, which motivates this paper's work on the construction of multi-agent backdoor defenses that maximize accuracy w.r.t. clean labels and minimize that of poison labels. Founded upon agent dynamics and low-loss subspace construction, we contribute three defenses that yield improved multi-agent backdoor robustness.

artificial intelligence, deep learning, machine learning, (19 more...)

2203.03692

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Asia > Nepal (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)