Agents
Learning Latent Traits for Simulated Cooperative Driving Tasks
DeCastro, Jonathan A., Gopinath, Deepak, Rosman, Guy, Sumner, Emily, Hakimi, Shabnam, Stent, Simon
To construct effective teaming strategies between humans and AI systems in complex, risky situations requires an understanding of individual preferences and behaviors of humans. Previously this problem has been treated in case-specific or data-agnostic ways. In this paper, we build a framework capable of capturing a compact latent representation of the human in terms of their behavior and preferences based on data from a simulated population of drivers. Our framework leverages, to the extent available, knowledge of individual preferences and types from samples within the population to deploy interaction policies appropriate for specific drivers. We then build a lightweight simulation environment, HMIway-env, for modelling one form of distracted driving behavior, and use it to generate data for different driver types and train intervention policies. We finally use this environment to quantify both the ability to discriminate drivers and the effectiveness of intervention policies.
An Enhanced Graph Representation for Machine Learning Based Automatic Intersection Management
Klimke, Marvin, Gerigk, Jasper, Völz, Benjamin, Buchholz, Michael
The improvement of traffic efficiency at urban intersections receives strong research interest in the field of automated intersection management. So far, mostly non-learning algorithms like reservation or optimization-based ones were proposed to solve the underlying multi-agent planning problem. At the same time, automated driving functions for a single ego vehicle are increasingly implemented using machine learning methods. In this work, we build upon a previously presented graph-based scene representation and graph neural network to approach the problem using reinforcement learning. The scene representation is improved in key aspects by using edge features in addition to the existing node features for the vehicles. This leads to an increased representation quality that is leveraged by an updated network architecture. The paper provides an in-depth evaluation of the proposed method against baselines that are commonly used in automatic intersection management. Compared to a traditional signalized intersection and an enhanced first-in-first-out scheme, a significant reduction of induced delay is observed at varying traffic densities. Finally, the generalization capability of the graph-based representation is evaluated by testing the policy on intersection layouts not seen during training. The model generalizes virtually without restrictions to smaller intersection layouts and within certain limits to larger ones.
Fast Convergence of Optimistic Gradient Ascent in Network Zero-Sum Extensive Form Games
Piliouras, Georgios, Ratliff, Lillian, Sim, Ryann, Skoulakis, Stratis
The study of learning in games has thus far focused primarily on normal form games. In contrast, our understanding of learning in extensive form games (EFGs) and particularly in EFGs with many agents lags far behind, despite them being closer in nature to many real world applications. We consider the natural class of Network Zero-Sum Extensive Form Games, which combines the global zero-sum property of agent payoffs, the efficient representation of graphical games as well the expressive power of EFGs. We examine the convergence properties of Optimistic Gradient Ascent (OGA) in these games. We prove that the time-average behavior of such online learning dynamics exhibits $O(1/T)$ rate convergence to the set of Nash Equilibria. Moreover, we show that the day-to-day behavior also converges to Nash with rate $O(c^{-t})$ for some game-dependent constant $c>0$.
Representation Learning of Image Schema
Yunus, Fajrian, Clavel, Chloé, Pelachaud, Catherine
Image schema is a recurrent pattern of reasoning where one entity is mapped into another. Image schema is similar to conceptual metaphor and is also related to metaphoric gesture. Our main goal is to generate metaphoric gestures for an Embodied Conversational Agent. We propose a technique to learn the vector representation of image schemas. As far as we are aware of, this is the first work which addresses that problem. Our technique uses Ravenet et al's algorithm which we use to compute the image schemas from the text input and also BERT and SenseBERT which we use as the base word embedding technique to calculate the final vector representation of the image schema. Our representation learning technique works by clustering: word embedding vectors which belong to the same image schema should be relatively closer to each other, and thus form a cluster. With the image schemas representable as vectors, it also becomes possible to have a notion that some image schemas are closer or more similar to each other than to the others because the distance between the vectors is a proxy of the dissimilarity between the corresponding image schemas. Therefore, after obtaining the vector representation of the image schemas, we calculate the distances between those vectors. Based on these, we create visualizations to illustrate the relative distances between the different image schemas.
Technology and Consciousness
We report on a series of eight workshops held in the summer of 2017 on the topic "technology and consciousness." The workshops covered many subjects but the overall goal was to assess the possibility of machine consciousness, and its potential implications. In the body of the report, we summarize most of the basic themes that were discussed: the structure and function of the brain, theories of consciousness, explicit attempts to construct conscious machines, detection and measurement of consciousness, possible emergence of a conscious technology, methods for control of such a technology and ethical considerations that might be owed to it. An appendix outlines the topics of each workshop and provides abstracts of the talks delivered. Update: Although this report was published in 2018 and the workshops it is based on were held in 2017, recent events suggest that it is worth bringing forward. In particular, in the Spring of 2022, a Google engineer claimed that LaMDA, one of their "large language models" is sentient or even conscious. This provoked a flurry of commentary in both the scientific and popular press, some of it interesting and insightful, but almost all of it ignorant of the prior consideration given to these topics and the history of research into machine consciousness. Thus, we are making a lightly refreshed version of this report available in the hope that it will provide useful background to the current debate and will enable more informed commentary. Although this material is five years old, its technical points remain valid and up to date, but we have "refreshed" it by adding a few footnotes highlighting recent developments.
Display of 3D Illuminations using Flying Light Specks
This paper presents techniques to display 3D illuminations using Flying Light Specks, FLSs. Each FLS is a miniature (hundreds of micrometers) sized drone with one or more light sources to generate different colors and textures with adjustable brightness. It is network enabled with a processor and local storage. Synchronized swarms of cooperating FLSs render illumination of virtual objects in a pre-specified 3D volume, an FLS display. We present techniques to display both static and motion illuminations. Our display techniques consider the limited flight time of an FLS on a fully charged battery and the duration of time to charge the FLS battery. Moreover, our techniques assume failure of FLSs is the norm rather than an exception. We present a hardware and a software architecture for an FLS-display along with a family of techniques to compute flight paths of FLSs for illuminations. With motion illuminations, one technique (ICF) minimizes the overall distance traveled by the FLSs significantly when compared with the other techniques.
Vision-based Relative Detection and Tracking for Teams of Micro Aerial Vehicles
Ge, Rundong, Lee, Moonyoung, Radhakrishnan, Vivek, Zhou, Yang, Li, Guanrui, Loianno, Giuseppe
In this paper, we address the vision-based detection and tracking problems of multiple aerial vehicles using a single camera and Inertial Measurement Unit (IMU) as well as the corresponding perception consensus problem (i.e., uniqueness and identical IDs across all observing agents). We design several vision-based decentralized Bayesian multi-tracking filtering strategies to resolve the association between the incoming unsorted measurements obtained by a visual detector algorithm and the tracked agents. We compare their accuracy in different operating conditions as well as their scalability according to the number of agents in the team. This analysis provides useful insights about the most appropriate design choice for the given task. We further show that the proposed perception and inference pipeline which includes a Deep Neural Network (DNN) as visual target detector is lightweight and capable of concurrently running control and planning with Size, Weight, and Power (SWaP) constrained robots on-board. Experimental results show the effective tracking of multiple drones in various challenging scenarios such as heavy occlusions.
Task Allocation with Load Management in Multi-Agent Teams
Wu, Haochen, Ghadami, Amin, Bayrak, Alparslan Emrah, Smereka, Jonathon M., Epureanu, Bogdan I.
In operations of multi-agent teams ranging from homogeneous robot swarms to heterogeneous human-autonomy teams, unexpected events might occur. While efficiency of operation for multi-agent task allocation problems is the primary objective, it is essential that the decision-making framework is intelligent enough to manage unexpected task load with limited resources. Otherwise, operation effectiveness would drastically plummet with overloaded agents facing unforeseen risks. In this work, we present a decision-making framework for multi-agent teams to learn task allocation with the consideration of load management through decentralized reinforcement learning, where idling is encouraged and unnecessary resource usage is avoided. We illustrate the effect of load management on team performance and explore agent behaviors in example scenarios. Furthermore, a measure of agent importance in collaboration is developed to infer team resilience when facing handling potential overload situations.
Why do Policy Gradient Methods work so well in Cooperative MARL? Evidence from Policy Representation
In cooperative multi-agent reinforcement learning (MARL), due to its on-policy nature, policy gradient (PG) methods are typically believed to be less sample efficient than value decomposition (VD) methods, which are off-policy. However, some recent empirical studies demonstrate that with proper input representation and hyper-parameter tuning, multi-agent PG can achieve surprisingly strong performance compared to off-policy VD methods. Why could PG methods work so well? In this post, we will present concrete analysis to show that in certain scenarios, e.g., environments with a highly multi-modal reward landscape, VD can be problematic and lead to undesired outcomes. In addition, PG methods with auto-regressive (AR) policies can learn multi-modal policies.
Characterization of Group-Fair Social Choice Rules under Single-Peaked Preferences
Sreedurga, Gogulapati, Sadhukhan, Soumyarup, Roy, Souvik, Narahari, Yadati
We study fairness in social choice settings under single-peaked preferences. Construction and characterization of social choice rules in the single-peaked domain has been extensively studied in prior works. In fact, in the single-peaked domain, it is known that unanimous and strategy-proof deterministic rules have to be min-max rules and those that also satisfy anonymity have to be median rules. Further, random social choice rules satisfying these properties have been shown to be convex combinations of respective deterministic rules. We non-trivially add to this body of results by including fairness considerations in social choice. Our study directly addresses fairness for groups of agents. To study group-fairness, we consider an existing partition of the agents into logical groups, based on natural attributes such as gender, race, and location. To capture fairness within each group, we introduce the notion of group-wise anonymity. To capture fairness across the groups, we propose a weak notion as well as a strong notion of fairness. The proposed fairness notions turn out to be natural generalizations of existing individual-fairness notions and moreover provide non-trivial outcomes for strict ordinal preferences, unlike the existing group-fairness notions. We provide two separate characterizations of random social choice rules that satisfy group-fairness: (i) direct characterization (ii) extreme point characterization (as convex combinations of fair deterministic social choice rules). We also explore the special case where there are no groups and provide sharper characterizations of rules that achieve individual-fairness.