Goto

Collaborating Authors

 Agents


Diversifying Message Aggregation in Multi-Agent Communication via Normalized Tensor Nuclear Norm Regularization

arXiv.org Artificial Intelligence

Aggregating messages is a key component for the communication of multi-agent reinforcement learning (Comm-MARL). Recently, it has witnessed the prevalence of graph attention networks (GAT) in Comm-MARL, where agents can be represented as nodes and messages can be aggregated via the weighted passing. While successful, GAT can lead to homogeneity in the strategies of message aggregation, and the ``core'' agent may excessively influence other agents' behaviors, which can severely limit the multi-agent coordination. To address this challenge, we first study the adjacency tensor of the communication graph and demonstrate that the homogeneity of message aggregation could be measured by the normalized tensor rank. Since the rank optimization problem is known to be NP-hard, we define a new nuclear norm, which is a convex surrogate of normalized tensor rank, to replace the rank. Leveraging the norm, we further propose a plug-and-play regularizer on the adjacency tensor, named Normalized Tensor Nuclear Norm Regularization (NTNNR), to actively enrich the diversity of message aggregation during the training stage. We extensively evaluate GAT with the proposed regularizer in both cooperative and mixed cooperative-competitive scenarios. The results demonstrate that aggregating messages using NTNNR-enhanced GAT can improve the efficiency of the training and achieve higher asymptotic performance than existing message aggregation methods. When NTNNR is applied to existing graph-attention Comm-MARL methods, we also observe significant performance improvements on the StarCraft II micromanagement benchmarks.


Neural Payoff Machines: Predicting Fair and Stable Payoff Allocations Among Team Members

arXiv.org Artificial Intelligence

In many multi-agent settings, participants can form teams to achieve collective outcomes that may far surpass their individual capabilities. Measuring the relative contributions of agents and allocating them shares of the reward that promote long-lasting cooperation are difficult tasks. Cooperative game theory offers solution concepts identifying distribution schemes, such as the Shapley value, that fairly reflect the contribution of individuals to the performance of the team or the Core, which reduces the incentive of agents to abandon their team. Applications of such methods include identifying influential features and sharing the costs of joint ventures or team formation. Unfortunately, using these solutions requires tackling a computational barrier as they are hard to compute, even in restricted settings. In this work, we show how cooperative game-theoretic solutions can be distilled into a learned model by training neural networks to propose fair and stable payoff allocations. We show that our approach creates models that can generalize to games far from the training distribution and can predict solutions for more players than observed during training. An important application of our framework is Explainable AI: our approach can be used to speed-up Shapley value computations on many instances.


A Framework for Understanding and Visualizing Strategies of RL Agents

arXiv.org Artificial Intelligence

Recent years have seen significant advances in explainable AI as the need to understand deep learning models has gained importance with the increased emphasis on trust and ethics in AI. Comprehensible models for sequential decision tasks are a particular challenge as they require understanding not only individual predictions but a series of predictions that interact with environmental dynamics. We present a framework for learning comprehensible models of sequential decision tasks in which agent strategies are characterized using temporal logic formulas. Given a set of agent traces, we first cluster the traces using a novel embedding method that captures frequent action patterns. We then search for logical formulas that explain the agent strategies in the different clusters. We evaluate our framework on combat scenarios in StarCraft II (SC2), using traces from a handcrafted expert policy and a trained reinforcement learning agent. We implemented a feature extractor for SC2 environments that extracts traces as sequences of high-level features describing both the state of the environment and the agent's local behavior from agent replays. We further designed a visualization tool depicting the movement of units in the environment that helps understand how different task conditions lead to distinct agent behavior patterns in each trace cluster. Experimental results show that our framework is capable of separating agent traces into distinct groups of behaviors for which our approach to strategy inference produces consistent, meaningful, and easily understood strategy descriptions.


A Route Network Planning Method for Urban Air Delivery

arXiv.org Artificial Intelligence

High-tech giants and start-ups are investing in drone technologies to provide urban air delivery service, which is expected to solve the last-mile problem and mitigate road traffic congestion. However, air delivery service will not scale up without proper traffic management for drones in dense urban environment. Currently, a range of Concepts of Operations (ConOps) for unmanned aircraft system traffic management (UTM) are being proposed and evaluated by researchers, operators, and regulators. Among these, the tube-based (or corridor-based) ConOps has emerged in operations in some regions of the world for drone deliveries and is expected to continue serving certain scenarios that with dense and complex airspace and requires centralized control in the future. Towards the tube-based ConOps, we develop a route network planning method to design routes (tubes) in a complex urban environment in this paper. In this method, we propose a priority structure to decouple the network planning problem, which is NP-hard, into single-path planning problems. We also introduce a novel space cost function to enable the design of dense and aligned routes in a network. The proposed method is tested on various scenarios and compared with other state-of-the-art methods. Results show that our method can generate near-optimal route networks with significant computational time-savings.


Assurance Cases as Foundation Stone for Auditing AI-enabled and Autonomous Systems: Workshop Results and Political Recommendations for Action from the ExamAI Project

arXiv.org Artificial Intelligence

The European Machinery Directive and related harmonized standards do consider that software is used to generate safety-relevant behavior of the machinery but do not consider all kinds of software. In particular, software based on machine learning (ML) are not considered for the realization of safety-relevant behavior. This limits the introduction of suitable safety concepts for autonomous mobile robots and other autonomous machinery, which commonly depend on ML-based functions. We investigated this issue and the way safety standards define safety measures to be implemented against software faults. Functional safety standards use Safety Integrity Levels (SILs) to define which safety measures shall be implemented. They provide rules for determining the SIL and rules for selecting safety measures depending on the SIL. In this paper, we argue that this approach can hardly be adopted with respect to ML and other kinds of Artificial Intelligence (AI). Instead of simple rules for determining an SIL and applying related measures against faults, we propose the use of assurance cases to argue that the individually selected and applied measures are sufficient in the given case. To get a first rating regarding the feasibility and usefulness of our proposal, we presented and discussed it in a workshop with experts from industry, German statutory accident insurance companies, work safety and standardization commissions, and representatives from various national, European, and international working groups dealing with safety and AI. In this paper, we summarize the proposal and the workshop discussion. Moreover, we check to which extent our proposal is in line with the European AI Act proposal and current safety standardization initiatives addressing AI and Autonomous Systems


VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation

arXiv.org Artificial Intelligence

Benefiting from language flexibility and compositionality, humans naturally intend to use language to command an embodied agent for complex tasks such as navigation and object manipulation. In this work, we aim to fill the blank of the last mile of embodied agents -- object manipulation by following human guidance, e.g., "move the red mug next to the box while keeping it upright." To this end, we introduce an Automatic Manipulation Solver (AMSolver) system and build a Vision-and-Language Manipulation benchmark (VLMbench) based on it, containing various language instructions on categorized robotic manipulation tasks. Specifically, modular rule-based task templates are created to automatically generate robot demonstrations with language instructions, consisting of diverse object shapes and appearances, action types, and motion constraints. We also develop a keypoint-based model 6D-CLIPort to deal with multi-view observations and language input and output a sequence of 6 degrees of freedom (DoF) actions. We hope the new simulator and benchmark will facilitate future research on language-guided robotic manipulation.


Collective Conditioned Reflex: A Bio-Inspired Fast Emergency Reaction Mechanism for Designing Safe Multi-Robot Systems

arXiv.org Artificial Intelligence

A multi-robot system (MRS) is a group of coordinated robots designed to cooperate with each other and accomplish given tasks. Due to the uncertainties in operating environments, the system may encounter emergencies, such as unobserved obstacles, moving vehicles, and extreme weather. Animal groups such as bee colonies initiate collective emergency reaction behaviors such as bypassing obstacles and avoiding predators, similar to muscle-conditioned reflex which organizes local muscles to avoid hazards in the first response without delaying passage through the brain. Inspired by this, we develop a similar collective conditioned reflex mechanism for multi-robot systems to respond to emergencies. In this study, Collective Conditioned Reflex (CCR), a bio-inspired emergency reaction mechanism, is developed based on animal collective behavior analysis and multi-agent reinforcement learning (MARL). The algorithm uses a physical model to determine if the robots are experiencing an emergency; then, rewards for robots involved in the emergency are augmented with corresponding heuristic rewards, which evaluate emergency magnitudes and consequences and decide local robots' participation. CCR is validated on three typical emergency scenarios: \textit{turbulence, strong wind, and hidden obstacle}. Simulation results demonstrate that CCR improves robot teams' emergency reaction capability with faster reaction speed and safer trajectory adjustment compared with baseline methods.


Generative Thermal Design Through Boundary Representation and Multi-Agent Cooperative Environment

arXiv.org Artificial Intelligence

GANs generate new designs from an existing dataset utilizing a generator and a discriminator which are usually Deep Generative design has been growing across the Neural Networks (DNNs). The objective function of GANs design community as a viable method for design should be differentiable to utilize gradient-based optimization space exploration. Thermal design is more complex while reward of a deep RL can be defined based on the than mechanical or aerodynamic design because design requirements (Chen & Ahmed, 2021b). of the additional convection-diffusion equation and its pertinent boundary interaction. We Shape and Topology Optimization (TO) play a major role in present a generative thermal design using cooperative Generative models in engineering design (Chen & Ahmed, multi-agent deep reinforcement learning 2021a). Engineering design often require Finite Element and continuous geometric representation of the Analysis (FEA) or Computational Fluid Dynamics (CFD) fluid and solid domain. The proposed framework to assess the performance of the output design (Hoyer et al., consists of a pre-trained neural network surrogate 2019). These numerical approaches are computationally model as an environment to predict heat transfer expensive and require human expertise (Regenwetter et al., and pressure drop of the generated geometries.


What Artificial Neural Networks Can Tell Us About Human Language Acquisition

arXiv.org Artificial Intelligence

Rapid progress in machine learning for natural language processing has the potential to transform debates about how humans learn language. However, the learning environments and biases of current artificial learners and humans diverge in ways that weaken the impact of the evidence obtained from learning simulations. For example, today's most effective neural language models are trained on roughly one thousand times the amount of linguistic data available to a typical child. To increase the relevance of learnability results from computational models, we need to train model learners without significant advantages over humans. If an appropriate model successfully acquires some target linguistic knowledge, it can provide a proof of concept that the target is learnable in a hypothesized human learning scenario. Plausible model learners will enable us to carry out experimental manipulations to make causal inferences about variables in the learning environment, and to rigorously test poverty-of-the-stimulus-style claims arguing for innate linguistic knowledge in humans on the basis of speculations about learnability. Comparable experiments will never be possible with human subjects due to practical and ethical considerations, making model learners an indispensable resource. So far, attempts to deprive current models of unfair advantages obtain sub-human results for key grammatical behaviors such as acceptability judgments. But before we can justifiably conclude that language learning requires more prior domain-specific knowledge than current models possess, we must first explore non-linguistic inputs in the form of multimodal stimuli and multi-agent interaction as ways to make our learners more efficient at learning from limited linguistic input.


A Survey of Ad Hoc Teamwork Research

arXiv.org Artificial Intelligence

Ad hoc teamwork is the research problem of designing agents that can collaborate with new teammates without prior coordination. This survey makes a two-fold contribution: First, it provides a structured description of the different facets of the ad hoc teamwork problem. Second, it discusses the progress that has been made in the field so far, and identifies the immediate and long-term open problems that need to be addressed in ad hoc teamwork.