
Collaborating Authors

 Chandra, Rohan


LIVEPOINT: Fully Decentralized, Safe, Deadlock-Free Multi-Robot Control in Cluttered Environments with High-Dimensional Inputs

arXiv.org Artificial Intelligence

Fully decentralized, safe, and deadlock-free multi-robot navigation in dynamic, cluttered environments is a critical challenge in robotics. Current methods require exact state measurements to enforce safety and liveness, e.g., via control barrier functions (CBFs), which are difficult to obtain directly from onboard sensors such as lidars and cameras. This work introduces LIVEPOINT, a decentralized control framework that synthesizes universal CBFs over point clouds to enable safe, deadlock-free, real-time multi-robot navigation in dynamic, cluttered environments. Further, LIVEPOINT ensures minimally invasive deadlock-avoidance behavior by dynamically adjusting agents' speeds based on a novel symmetric interaction metric. We validate our approach in simulation across highly constrained multi-robot scenarios such as doorways and intersections. Results demonstrate that LIVEPOINT achieves zero collisions and deadlocks and a 100% success rate in challenging settings, whereas optimization-based baselines such as MPC and ORCA and neural methods such as MPNet fail in such environments. Despite prioritizing safety and liveness, LIVEPOINT is 35% smoother than baselines in the doorway environment and maintains agility in constrained environments while remaining safe and deadlock-free.
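The abstract does not spell out the barrier construction, so the sketch below only illustrates the general idea of enforcing a CBF condition over raw point-cloud points: a nominal control is filtered through a small quadratic program for a single-integrator robot. The function name, the safety margin d_safe, and the class-K gain alpha are assumptions for illustration, not LIVEPOINT's actual formulation.

```python
# Illustrative only: a generic CBF-QP safety filter over raw point-cloud
# points for a single-integrator robot (p_dot = u). This is NOT the LIVEPOINT
# formulation; the margin and gain below are assumed values.
import numpy as np
import cvxpy as cp

def cbf_safety_filter(p, u_nom, points, d_safe=0.3, alpha=1.0):
    """Minimally modify u_nom so every point-cloud obstacle stays outside d_safe."""
    u = cp.Variable(2)
    constraints = []
    for q in points:
        diff = p - q                       # vector from obstacle point to robot
        h = diff @ diff - d_safe ** 2      # barrier value: h >= 0 means "safe"
        # h_dot = 2 * diff^T u under single-integrator dynamics;
        # enforce the CBF condition h_dot >= -alpha * h for this point.
        constraints.append(2 * diff @ u >= -alpha * h)
    prob = cp.Problem(cp.Minimize(cp.sum_squares(u - u_nom)), constraints)
    prob.solve()
    return u.value if u.value is not None else np.zeros(2)

# Example: the nominal control drives toward the goal; points would come from
# a lidar or depth sensor at each control step.
p = np.array([0.0, 0.0])
u_nom = np.array([1.0, 0.0])
points = np.array([[0.5, 0.05], [0.6, -0.1]])
print(cbf_safety_filter(p, u_nom, points))
```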


GameChat: Multi-LLM Dialogue for Safe, Agile, and Socially Optimal Multi-Agent Navigation in Constrained Environments

arXiv.org Artificial Intelligence

Safe, agile, and socially compliant multi-robot navigation in cluttered and constrained environments remains a critical challenge. This is especially difficult with self-interested agents in decentralized settings, where there is no central authority to resolve conflicts induced by spatial symmetry. We address this challenge by proposing a novel approach, GameChat, which facilitates safe, agile, and deadlock-free navigation for both cooperative and self-interested agents. Key to our approach is the use of natural language communication to resolve conflicts, enabling agents to prioritize more urgent tasks and break spatial symmetry in a socially optimal manner. Our algorithm ensures subgame perfect equilibrium, preventing agents from deviating from agreed-upon behaviors and supporting cooperation. Furthermore, we guarantee safety through control barrier functions and preserve agility by minimizing disruptions to agents' planned trajectories. We evaluate GameChat in simulated environments with doorways and intersections. The results show that, even in the worst case, GameChat reduces the time for all agents to reach their goals by over 35% compared to a naive baseline and by over 20% compared to SMG-CBF in the intersection scenario, while doubling the rate at which the agent with the higher-priority task reaches the goal first, from 50% (equivalent to random chance) to 100%, i.e., perfect performance at maximizing social welfare.
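As a toy illustration of the symmetry-breaking step only (not GameChat's negotiated protocol, which reaches a subgame perfect equilibrium through natural-language dialogue and keeps CBF safety active throughout), the sketch below grants priority to the agent reporting the more urgent task and breaks ties deterministically. The field names and tie-break rule are assumptions.

```python
# Hypothetical priority resolution between two agents that have exchanged
# task urgencies. The agent granted priority proceeds; the other yields.
def resolve_priority(agent_a, agent_b):
    """Each agent is a dict like {'id': 'A', 'urgency': 2}; returns (leader, follower)."""
    if agent_a["urgency"] != agent_b["urgency"]:
        ordered = sorted([agent_a, agent_b], key=lambda a: -a["urgency"])
    else:
        # Deterministic tie-break so both agents agree and no deadlock arises.
        ordered = sorted([agent_a, agent_b], key=lambda a: a["id"])
    return ordered[0]["id"], ordered[1]["id"]

print(resolve_priority({"id": "A", "urgency": 2}, {"id": "B", "urgency": 1}))  # ('A', 'B')
```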


Towards Large Language Models that Benefit for All: Benchmarking Group Fairness in Reward Models

arXiv.org Artificial Intelligence

As Large Language Models (LLMs) become increasingly powerful and accessible to human users, ensuring fairness across diverse demographic groups, i.e., group fairness, is a critical ethical concern. However, current fairness and bias research in LLMs is limited in two respects. First, unlike traditional group fairness in machine learning classification, it requires the non-sensitive attributes, in this case the prompt questions, to be the same across different groups. In many practical scenarios, however, different groups may prefer different prompt questions, and this requirement becomes impractical. Second, it evaluates group fairness only for the LLM's final output without identifying the source of possible bias. Namely, the bias in an LLM's output can result from both pretraining and finetuning; for finetuning, the bias can result from both the RLHF procedure and the learned reward model. Arguably, evaluating the group fairness of each component in the LLM pipeline could help develop better methods to mitigate possible bias. Recognizing these two limitations, this work benchmarks the group fairness of learned reward models. By using expert-written text from arXiv, we are able to benchmark the group fairness of reward models without requiring the same prompt questions across different demographic groups. Surprisingly, our results demonstrate that all the evaluated reward models (e.g., Nemotron-4-340B-Reward, ArmoRM-Llama3-8B-v0.1, and GRM-llama3-8B-sftreg) exhibit statistically significant group unfairness. We also observe that reward models that perform better on canonical performance metrics tend to demonstrate better group fairness.
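The abstract does not specify the statistical test used; as one plausible way to flag significant group unfairness in a reward model's scores, the sketch below compares per-group score distributions with Welch's t-test. The function and variable names, the choice of test, and the significance threshold are assumptions, not the paper's procedure.

```python
# Illustrative sketch: compare the reward scores a model assigns to
# expert-written text from two demographic groups and test whether the
# mean gap is statistically significant (Welch's t-test, assumed here).
import numpy as np
from scipy import stats

def reward_fairness_gap(scores_group_a, scores_group_b, alpha=0.05):
    gap = float(np.mean(scores_group_a) - np.mean(scores_group_b))
    t_stat, p_value = stats.ttest_ind(scores_group_a, scores_group_b, equal_var=False)
    return {"mean_gap": gap, "p_value": float(p_value), "significant": bool(p_value < alpha)}

# Example with synthetic scores; real scores would come from a reward model
# such as ArmoRM-Llama3-8B-v0.1 applied to expert-written arXiv text.
rng = np.random.default_rng(0)
a = rng.normal(0.62, 0.1, 200)
b = rng.normal(0.58, 0.1, 200)
print(reward_fairness_gap(a, b))
```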


Group Fairness in Multi-Task Reinforcement Learning

arXiv.org Artificial Intelligence

This paper addresses a critical societal consideration in the application of Reinforcement Learning (RL): ensuring equitable outcomes across different demographic groups in multi-task settings. While previous work has explored fairness in single-task RL, many real-world applications are multi-task in nature and require policies to maintain fairness across all tasks. We introduce a novel formulation of multi-task group fairness in RL and propose a constrained optimization algorithm that explicitly enforces fairness constraints across multiple tasks simultaneously. We show that, with high probability, our algorithm does not violate the fairness constraints and achieves sublinear regret in the finite-horizon episodic setting. Through experiments in RiverSwim and MuJoCo environments, we demonstrate that, in both the finite-horizon and infinite-horizon settings, our approach better ensures group fairness across multiple tasks than previous methods that lack explicit multi-task fairness constraints. Our results show that the proposed algorithm achieves smaller fairness gaps while maintaining comparable returns across different demographic groups and tasks, suggesting its potential for addressing fairness concerns in real-world multi-task RL applications.
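As a minimal sketch of the quantity such a constraint controls (not the constrained-optimization algorithm itself), the snippet below computes the worst-case per-task return gap across demographic groups and checks it against a tolerance. The array layout and the tolerance epsilon are assumptions for illustration.

```python
# Per-task group-fairness gap: for each task, the spread between the best-
# and worst-served demographic group's average return.
import numpy as np

def multitask_fairness_gaps(returns, epsilon=0.1):
    """returns: array of shape (num_tasks, num_groups) of average returns."""
    returns = np.asarray(returns, dtype=float)
    gaps = returns.max(axis=1) - returns.min(axis=1)     # worst-case gap per task
    return gaps, bool(np.all(gaps <= epsilon))           # feasible w.r.t. the constraint?

gaps, feasible = multitask_fairness_gaps([[0.82, 0.79], [0.65, 0.71]])
print(gaps, feasible)
```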


A Survey of In-Context Reinforcement Learning

arXiv.org Artificial Intelligence

Reinforcement learning (RL) agents typically optimize their policies by performing expensive backward passes to update their network parameters. However, some agents can solve new tasks without updating any parameters by simply conditioning on additional context such as their action-observation histories. This paper surveys work on such behavior, known as in-context reinforcement learning.
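A minimal sketch of the interface this describes, assuming a small recurrent summarizer rather than any particular architecture from the surveyed literature: the policy's parameters stay frozen, and any adaptation comes purely from the action-observation(-reward) history passed in as context.

```python
# Toy in-context policy: a frozen network conditioned on the running history.
# The GRU encoder, the token layout, and all dimensions are assumptions.
import torch
import torch.nn as nn

class InContextPolicy(nn.Module):
    def __init__(self, obs_dim=4, act_dim=2, hidden=64):
        super().__init__()
        # Each context token packs (observation, previous action, previous reward).
        self.encoder = nn.GRU(obs_dim + act_dim + 1, hidden, batch_first=True)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, history, obs):
        # history: (B, T, obs_dim + act_dim + 1); obs: (B, obs_dim)
        _, h = self.encoder(history)                         # summarize the context
        pad = torch.zeros(obs.shape[0], history.shape[-1] - obs.shape[-1])
        _, h = self.encoder(torch.cat([obs, pad], -1).unsqueeze(1), h)
        return self.head(h[-1])                              # action logits

policy = InContextPolicy()
with torch.no_grad():                                        # no parameter updates at test time
    history = torch.zeros(1, 10, 4 + 2 + 1)
    obs = torch.zeros(1, 4)
    print(policy(history, obs).shape)                        # torch.Size([1, 2])
```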


LiveNet: Robust, Minimally Invasive Multi-Robot Control for Safe and Live Navigation in Constrained Environments

arXiv.org Artificial Intelligence

Robots in densely populated real-world environments frequently encounter constrained and cluttered situations such as passing through narrow doorways, hallways, and corridor intersections, where conflicts over limited space result in collisions or deadlocks among the robots. Current decentralized state-of-the-art optimization- and neural network-based approaches (i) are predominantly designed for general open spaces and (ii) are overly conservative, guaranteeing either safety or liveness, but not both. While some solutions rely on centralized conflict resolution, their highly invasive trajectories make them impractical for real-world deployment. This paper introduces LiveNet, a fully decentralized and robust neural network controller that enables human-like yielding and passing, resulting in agile, non-conservative, deadlock-free, and safe navigation in congested, conflict-prone spaces. LiveNet is minimally invasive and requires no inter-agent communication or cooperative behavior. The key insight behind LiveNet is a unified CBF formulation for simultaneous safety and liveness, which we integrate within a neural network for robustness. We evaluated LiveNet in simulation and found that general multi-robot optimization- and learning-based navigation methods fail to even reach the goal, and that while methods designed specifically for such environments do succeed, they are 10-20 times slower, 4-5 times more invasive, and much less robust to variations in the scenario configuration, such as changes in the start and goal states. We open-source the LiveNet code at https://github.com/srikarg89/LiveNet.
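As a rough sketch of how safety and liveness conditions can both be expressed as differentiable penalties when training a neural controller (this is not the LiveNet formulation or architecture), the snippet below penalizes violations of a distance-based safety barrier and of a minimum-progress liveness condition under single-integrator dynamics. All barrier definitions, gains, and names are assumptions.

```python
# Hypothetical joint safety + liveness penalty for a learned controller.
import torch

def barrier_penalty(p, u, obstacle, goal, d_safe=0.3, alpha=1.0, v_min=0.05):
    diff = p - obstacle
    h_safe = (diff * diff).sum(-1) - d_safe ** 2            # stay away from the obstacle
    h_safe_dot = 2.0 * (diff * u).sum(-1)                   # single-integrator dynamics
    safety = torch.relu(-(h_safe_dot + alpha * h_safe))     # CBF-condition violation

    to_goal = goal - p
    progress = (to_goal * u).sum(-1) / (to_goal.norm(dim=-1) + 1e-6)
    liveness = torch.relu(v_min - progress)                 # keep making progress toward the goal

    return (safety + liveness).mean()

p = torch.tensor([[0.0, 0.0]]); u = torch.tensor([[0.5, 0.0]])
print(barrier_penalty(p, u, torch.tensor([[1.0, 0.2]]), torch.tensor([[2.0, 0.0]])))
```

In practice such a penalty would be added to an imitation or tracking loss during training of the controller network.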


Take an Emotion Walk: Perceiving Emotions from Gaits Using Hierarchical Attention Pooling and Affective Mapping

arXiv.org Artificial Intelligence

We present an autoencoder-based semi-supervised approach to classify perceived human emotions from walking styles obtained from videos or motion-captured data and represented as sequences of 3D poses. Given the motion on each joint in the pose at each time step extracted from 3D pose sequences, we hierarchically pool these joint motions in a bottom-up manner in the encoder, following the kinematic chains in the human body. We also constrain the latent embeddings of the encoder to contain the space of psychologically motivated affective features underlying the gaits. We train the decoder to reconstruct the motions per joint per time step in a top-down manner from the latent embeddings. For the annotated data, we also train a classifier to map the latent embeddings to emotion labels. Our semi-supervised approach achieves a mean average precision of 0.84 on the Emotion-Gait benchmark dataset, which contains both labeled and unlabeled gaits collected from multiple sources. We outperform current state-of-the-art algorithms for both emotion recognition and action recognition from 3D gaits by 7%-23% in absolute terms. More importantly, we improve the average precision by 10%-50% in absolute terms on classes that each make up less than 25% of the labeled part of the Emotion-Gait benchmark dataset.
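A minimal sketch of bottom-up pooling of per-joint motion features along kinematic chains, loosely mirroring the hierarchy the abstract describes; the chain indices, the pooling operator, and the feature sizes are assumptions rather than the paper's encoder.

```python
# Hypothetical bottom-up pooling: joints -> limbs (kinematic chains) -> body.
import torch

CHAINS = {"left_leg": [12, 13, 14], "right_leg": [15, 16, 17],
          "left_arm": [4, 5, 6], "right_arm": [7, 8, 9], "spine": [0, 1, 2, 3]}

def hierarchical_pool(joint_feats):
    """joint_feats: (B, T, num_joints, F) per-joint motion features."""
    limb_feats = [joint_feats[:, :, idx, :].max(dim=2).values   # pool within each chain
                  for idx in CHAINS.values()]
    body = torch.stack(limb_feats, dim=2).max(dim=2).values     # pool across chains
    return body                                                 # (B, T, F) body-level feature

x = torch.randn(2, 30, 18, 16)
print(hierarchical_pool(x).shape)   # torch.Size([2, 30, 16])
```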


STEP: Spatial Temporal Graph Convolutional Networks for Emotion Perception from Gaits

arXiv.org Artificial Intelligence

We present a novel classifier network called STEP to classify perceived human emotion from gaits, based on a Spatial Temporal Graph Convolutional Network (ST-GCN) architecture. Given an RGB video of an individual walking, our formulation implicitly exploits the gait features to classify the emotional state of the human into one of four emotions: happy, sad, angry, or neutral. We use hundreds of annotated real-world gait videos and augment them with thousands of annotated synthetic gaits generated using a novel generative network called STEP-Gen, built on an ST-GCN based Conditional Variational Autoencoder (CVAE). We incorporate a novel push-pull regularization loss in the CVAE formulation of STEP-Gen to generate realistic gaits and improve the classification accuracy of STEP. We also release a novel dataset (E-Gait), which consists of 2,177 human gaits annotated with perceived emotions along with thousands of synthetic gaits. In practice, STEP learns the affective features and achieves a classification accuracy of 89% on E-Gait, which is 14-30% more accurate than prior methods.
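The push-pull regularization is not defined in the abstract; the sketch below shows one plausible push-pull style loss that pulls a generated gait embedding toward real gaits sharing its emotion label and pushes it away from gaits with other labels. The margin, the use of class-mean embeddings, and all names are assumptions, and the actual STEP-Gen loss may differ.

```python
# Hypothetical push-pull regularizer over gait embeddings.
import torch
import torch.nn.functional as F

def push_pull_loss(gen_emb, real_emb, gen_labels, real_labels, margin=1.0):
    loss = 0.0
    for emb, label in zip(gen_emb, gen_labels):
        same = real_emb[real_labels == label]
        other = real_emb[real_labels != label]
        pull = F.mse_loss(emb, same.mean(dim=0)) if len(same) else 0.0      # pull toward same class
        push = F.relu(margin - (emb - other.mean(dim=0)).norm()) if len(other) else 0.0  # push from others
        loss = loss + pull + push
    return loss / len(gen_emb)

gen = torch.randn(4, 32); real = torch.randn(10, 32)
print(push_pull_loss(gen, real, torch.tensor([0, 1, 2, 3]), torch.randint(0, 4, (10,))))
```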


Towards Imitation Learning in Real World Unstructured Social Mini-Games in Pedestrian Crowds

arXiv.org Artificial Intelligence

Imitation Learning (IL) strategies are used to generate policies for robot motion planning and navigation by learning from human trajectories. Recently, there has been a lot of excitement in applying IL to social interactions arising in urban environments such as university campuses, restaurants, grocery stores, and hospitals. However, obtaining numerous expert demonstrations in social settings can be expensive, risky, or even impossible. Current approaches therefore focus only on simulated social interaction scenarios. This raises the question: how can a robot learn to imitate an expert demonstrator from real-world multi-agent social interaction scenarios? It remains unknown which, if any, IL methods perform well and what assumptions they require. We benchmark representative IL methods in real-world social interaction scenarios on a motion planning task, using a novel pedestrian intersection dataset collected at the University of Texas at Austin campus. Our evaluation reveals two key findings: first, learning multi-agent cost functions is required for learning the diverse behavior modes of agents in tightly coupled interactions; and second, conditioning the training of IL methods on partial state information or providing global information in simulation can improve imitation learning, especially in real-world social interaction scenarios.


GAMEOPT+: Improving Fuel Efficiency in Unregulated Heterogeneous Traffic Intersections via Optimal Multi-agent Cooperative Control

arXiv.org Artificial Intelligence

Better fuel efficiency leads to better financial security as well as a cleaner environment. We propose a novel approach for improving fuel efficiency in unstructured and unregulated traffic environments. Existing intelligent transportation solutions for improving fuel efficiency apply only to traffic intersections with sparse traffic, to traffic where drivers obey the regulations, or both. We propose GameOpt+, a novel hybrid approach for cooperative intersection control in dynamic, multi-lane, unsignalized intersections. GameOpt+ combines an auction mechanism with an optimization-based trajectory planner: it generates a priority entrance sequence for each agent and computes velocity controls in real time, taking less than 10 milliseconds even in high-density traffic with over 10,000 vehicles per hour. Compared to fully optimization-based methods, it operates 100 times faster while ensuring fairness, safety, and efficiency. Tested on the SUMO simulator, our algorithm improves throughput by at least 25%, reduces the time to reach the goal by at least 70%, and decreases fuel consumption by 50% compared to auction-based and signaled approaches using traffic lights and stop signs. GameOpt+ is also unaffected by unbalanced traffic inflows, whereas some of the other baselines degrade in unbalanced-inflow environments.
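As a toy illustration of the auction step alone (GameOpt+ additionally computes optimization-based velocity controls), the sketch below ranks vehicles by their bids, breaks ties by arrival time, and schedules entry times with a fixed headway. The bid definition and headway value are assumptions for illustration.

```python
# Hypothetical auction-based priority entrance sequence for an unsignalized intersection.
from dataclasses import dataclass

@dataclass
class Vehicle:
    vid: int
    bid: float            # e.g., a function of waiting time and task urgency
    arrival_time: float

def entrance_sequence(vehicles, headway=1.5):
    """Return (vehicle id, scheduled entry time) pairs in priority order."""
    ranked = sorted(vehicles, key=lambda v: (-v.bid, v.arrival_time))
    schedule, t = [], 0.0
    for v in ranked:
        t = max(t, v.arrival_time)      # a vehicle cannot enter before it arrives
        schedule.append((v.vid, t))
        t += headway                    # enforce a safe entry gap between consecutive vehicles
    return schedule

print(entrance_sequence([Vehicle(1, 2.0, 0.0), Vehicle(2, 3.5, 0.5), Vehicle(3, 1.0, 0.2)]))
```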