
Collaborating Authors

 Fujimura, Kikuo


Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation

arXiv.org Artificial Intelligence

Deep reinforcement learning (DRL) provides a promising way for intelligent agents (e.g., autonomous vehicles) to learn to navigate complex scenarios. However, DRL with neural networks as function approximators is typically considered a black box with little explainability and often suffers from suboptimal performance, especially for autonomous navigation in highly interactive multi-agent environments. To address these issues, we propose three auxiliary tasks with spatio-temporal relational reasoning and integrate them into the standard DRL framework, which improves decision-making performance and provides explainable intermediate indicators. We propose to explicitly infer the internal states (i.e., traits and intentions) of surrounding agents (e.g., human drivers) and to predict their future trajectories in situations with and without the ego agent through counterfactual reasoning. These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents. Multiple variants of framework integration strategies are compared. We also employ a spatio-temporal graph neural network to encode relations between dynamic entities, which enhances both internal state inference and decision making of the ego agent. Moreover, we propose an interactivity estimation mechanism based on the difference between predicted trajectories in these two situations, which indicates the degree of influence of the ego agent on other agents. To validate the proposed method, we design an intersection driving simulator based on the Intelligent Intersection Driver Model (IIDM) that simulates vehicles and pedestrians. Our approach achieves robust, state-of-the-art performance in terms of standard evaluation metrics and provides explainable intermediate indicators (i.e., internal states and interactivity scores) for decision making.
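The interactivity estimation described above reduces to a simple comparison of the two counterfactual predictions for each surrounding agent. A minimal numpy sketch, assuming the two predicted trajectories are already available as (T, 2) position arrays (the predictor itself is not shown, and all names are illustrative rather than from the paper):

```python
import numpy as np

def interactivity_score(traj_with_ego: np.ndarray,
                        traj_without_ego: np.ndarray) -> float:
    """Average displacement between the two counterfactual predictions.

    Both arrays have shape (T, 2) and hold the predicted (x, y) positions of
    one surrounding agent over T future steps, once predicted with the ego
    vehicle present and once with it removed. A larger score means the ego
    vehicle is expected to influence this agent more strongly.
    """
    assert traj_with_ego.shape == traj_without_ego.shape
    return float(np.linalg.norm(traj_with_ego - traj_without_ego, axis=-1).mean())

# Toy example: the surrounding agent slows down when the ego vehicle is present.
t = np.arange(10, dtype=float)
traj_without = np.stack([1.0 * t, np.zeros_like(t)], axis=-1)  # constant speed
traj_with = np.stack([0.6 * t, np.zeros_like(t)], axis=-1)     # yields to the ego
print(interactivity_score(traj_with, traj_without))            # 1.8
```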


Robust Driving Policy Learning with Guided Meta Reinforcement Learning

arXiv.org Artificial Intelligence

Although deep reinforcement learning (DRL) has shown promising results for autonomous navigation in interactive traffic scenarios, existing work typically adopts a fixed behavior policy to control social vehicles in the training environment. This may cause the learned driving policy to overfit the environment, making it difficult to interact well with vehicles with different, unseen behaviors. In this work, we introduce an efficient method to train diverse driving policies for social vehicles as a single meta-policy. By randomizing the interaction-based reward functions of social vehicles, we can generate diverse objectives and efficiently train the meta-policy through guiding policies that achieve specific objectives. We further propose a training strategy to enhance the robustness of the ego vehicle's driving policy using the environment where social vehicles are controlled by the learned meta-policy. Our method successfully learns an ego driving policy that generalizes well to unseen situations with out-of-distribution (OOD) social agents' behaviors in a challenging uncontrolled T-intersection scenario.
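The reward-randomization idea can be pictured with a toy sketch: each training episode draws a different set of interaction-related reward weights for the social vehicles, and the shared meta-policy is conditioned on (and guided toward) the sampled objective. The weight names and ranges below are assumptions for illustration, not the paper's actual reward design:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_social_objective():
    """Draw a new interaction-related objective for the social vehicles in one
    training episode. Different draws correspond to more aggressive or more
    cooperative drivers; the weight names and ranges are illustrative only."""
    return {
        "w_progress": rng.uniform(0.5, 1.5),  # reward for progress toward the goal
        "w_yield": rng.uniform(0.0, 2.0),     # penalty for squeezing other vehicles
        "w_comfort": rng.uniform(0.0, 0.5),   # penalty on harsh acceleration
    }

def social_reward(w, progress, gap_violation, accel):
    """Per-step reward of a social vehicle under the sampled objective."""
    return w["w_progress"] * progress - w["w_yield"] * gap_violation - w["w_comfort"] * abs(accel)

# Each episode, a fresh objective conditions the shared meta-policy.
objective = sample_social_objective()
print(objective, social_reward(objective, progress=1.0, gap_violation=0.2, accel=0.5))
```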


Risk-Aware Lane Selection on Highway with Dynamic Obstacles

arXiv.org Artificial Intelligence

This paper proposes a discretionary lane selection algorithm. In particular, highway driving is considered as the target scenario, where each lane has a different level of traffic flow. When lane changing is discretionary, it is advised not to change lanes unless doing so is highly beneficial, e.g., it reduces travel time significantly or secures higher safety. Evaluating such a "benefit" is challenging, particularly with multiple surrounding vehicles whose speeds and headings change dynamically and are uncertain. We propose a real-time lane-selection algorithm with careful cost considerations and a modular design. The algorithm is a search-based optimization method that evaluates the uncertain, dynamic positions of other vehicles in a continuous time and space domain. For demonstration, we incorporate a state-of-the-art motion planner framework (Neural Networks integrated Model Predictive Control) in a CARLA simulation environment.
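As a rough illustration of this kind of cost-based lane evaluation, the sketch below scores candidate lanes by expected travel time, a lane-change penalty, and a collision risk derived from a Gaussian over the predicted gap to surrounding vehicles. The cost terms, weights, and numbers are illustrative assumptions, not the paper's formulation:

```python
import math

def gaussian_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def collision_risk(gap_mean, gap_std, safe_gap=5.0):
    """P(predicted gap < safe_gap), with the gap to the nearest vehicle in the
    candidate lane modeled as a Gaussian to capture prediction uncertainty."""
    return gaussian_cdf(safe_gap, gap_mean, gap_std)

def lane_cost(travel_time, risk, needs_change, w_time=1.0, w_risk=10.0, w_change=0.5):
    """Weighted sum of expected travel time, collision risk, and a small
    penalty for the lane-change maneuver itself."""
    return w_time * travel_time + w_risk * risk + w_change * float(needs_change)

# Score three candidate lanes and keep the cheapest one.
candidates = {
    "keep":  lane_cost(travel_time=12.0, risk=collision_risk(20.0, 4.0), needs_change=False),
    "left":  lane_cost(travel_time=10.0, risk=collision_risk(8.0, 4.0), needs_change=True),
    "right": lane_cost(travel_time=11.0, risk=collision_risk(15.0, 4.0), needs_change=True),
}
print(candidates, "->", min(candidates, key=candidates.get))
```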


Reinforcement Learning with Iterative Reasoning for Merging in Dense Traffic

arXiv.org Artificial Intelligence

In recent years, major progress has been made to deploy autonomous vehicles and improve safety. However, certain common driving situations like merging in dense traffic are still challenging for autonomous vehicles. Situations like the one illustrated in Figure 1 often involve negotiating with human drivers. To avoid the computational requirements of online methods, we can use reinforcement learning (RL) instead. In RL, the agent interacts with a simulation environment many times prior to execution, and at each simulation episode it improves its strategy. The resulting policy can then be deployed online and is often inexpensive to evaluate. RL provides a flexible framework to automatically find good policies.


Safe Reinforcement Learning on Autonomous Vehicles

arXiv.org Artificial Intelligence

There have been numerous advances in reinforcement learning, but the typically unconstrained exploration of the learning process prevents the adoption of these methods in many safety-critical applications. Recent work in safe reinforcement learning uses idealized models to achieve its guarantees, but these models do not easily accommodate the stochasticity or high dimensionality of real-world systems. We investigate how prediction provides a general and intuitive framework to constrain exploration, and show how it can be used to safely learn intersection handling behaviors on an autonomous vehicle.

I. INTRODUCTION
With the increasing complexity of robotic systems, and the continued advances in machine learning, it can be tempting to apply reinforcement learning (RL) to challenging control problems. However, the trial-and-error searches typical of RL methods are not appropriate for physical systems that act in the real world, where failure cases result in real consequences. To mitigate the safety concerns associated with training an RL agent, there have been various efforts at designing learning processes with safe exploration. As noted by Garcia and Fernandez [1], these approaches can be broadly classified into approaches that modify the objective function and approaches that constrain the search space. Modifying the objective function mostly focuses on catastrophic rare events which do not necessarily have a large impact on the expected return over many trials. Proposed methods take into account the variance of return [2], the worst outcome [3], [2], [4], and the probability of visiting error states [5].
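One way to picture prediction-based constrained exploration is to mask out, at each step, any action whose predicted outcome violates a safety margin, and explore only among the remaining actions. The sketch below uses a simple constant-acceleration rollout as the predictor and stands in for the general idea only; it is not the paper's model, and all names and thresholds are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
ACTIONS = np.array([-2.0, 0.0, 2.0])  # brake, keep speed, accelerate [m/s^2]

def predicted_min_gap(gap, ego_speed, other_speed, accel, horizon=3.0, dt=0.5):
    """Roll out simple kinematics and return the smallest predicted gap to the
    crossing vehicle over the horizon (the 'prediction' used to judge safety)."""
    min_gap = gap
    for _ in range(int(horizon / dt)):
        ego_speed = max(0.0, ego_speed + accel * dt)
        gap -= (ego_speed - other_speed) * dt
        min_gap = min(min_gap, gap)
    return min_gap

def safe_epsilon_greedy(q_values, gap, ego_speed, other_speed, eps=0.1, safe_gap=4.0):
    """Epsilon-greedy exploration restricted to actions the predictor deems safe."""
    safe = [i for i, a in enumerate(ACTIONS)
            if predicted_min_gap(gap, ego_speed, other_speed, a) > safe_gap]
    if not safe:          # no action is predicted safe: fall back to hardest braking
        return 0
    if rng.random() < eps:
        return int(rng.choice(safe))
    return max(safe, key=lambda i: q_values[i])

print(safe_epsilon_greedy(np.array([0.1, 0.5, 0.9]), gap=12.0, ego_speed=6.0, other_speed=5.0))
```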


Interaction-Aware Multi-Agent Reinforcement Learning for Mobile Agents with Individual Goals

arXiv.org Artificial Intelligence

In a multi-agent setting, the optimal policy of a single agent is largely dependent on the behavior of other agents. We investigate the problem of multi-agent reinforcement learning, focusing on decentralized learning in non-stationary domains for mobile robot navigation. We identify a cause of the difficulty in training non-stationary policies, namely mutual adaptation to sub-optimal behaviors, and we use this observation to motivate a curriculum-based strategy for learning interactive policies. The curriculum has two stages. First, the agent leverages policy gradient algorithms to learn a policy that is capable of achieving multiple goals. Second, the agent learns a modifier policy that determines how to interact with other agents in a multi-agent setting. We evaluated our approach on both an autonomous driving lane-change domain and a robot navigation domain.

Single-agent reinforcement learning (RL) algorithms have made significant progress in game playing [20] and robotics [13]; however, single-agent learning algorithms in multi-agent settings are prone to learning stereotyped behaviors that overfit to the training environment [22], [15]. There are several reasons why multi-agent environments are more difficult: 1) interacting with an unknown agent requires having either multiple responses to a given situation or a more nuanced ability to perceive differences. The former breaks the Markov assumption; the latter rules out simpler solutions that are likely to be found first.
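The two-stage curriculum can be pictured as a frozen goal-reaching policy whose output is adjusted by a small learned modifier that looks at the other agents. The additive composition and all names below are illustrative assumptions for a sketch, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def base_policy(goal_obs):
    """Stage 1: a goal-conditioned policy trained without other agents
    (stand-in here: steer proportionally toward the goal)."""
    return np.clip(0.5 * goal_obs, -1.0, 1.0)

def modifier_policy(others_obs, theta):
    """Stage 2: a small learned correction that accounts for nearby agents
    (stand-in here: a linear function of their relative states)."""
    return np.tanh(others_obs @ theta)

def interactive_action(goal_obs, others_obs, theta):
    # The modifier adjusts, rather than replaces, the stage-1 behavior.
    return np.clip(base_policy(goal_obs) + modifier_policy(others_obs, theta), -1.0, 1.0)

theta = rng.normal(scale=0.1, size=4)   # would be learned in stage 2; random here
goal_obs = np.array([0.8])              # relative goal direction
others_obs = rng.normal(size=4)         # relative states of nearby agents
print(interactive_action(goal_obs, others_obs, theta))
```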


Driving in Dense Traffic with Model-Free Reinforcement Learning

arXiv.org Artificial Intelligence

Traditional planning and control methods could fail to find a feasible trajectory for an autonomous vehicle to execute amongst dense traffic on roads. This is because the obstacle-free volume in spacetime is very small in these scenarios for the vehicle to drive through. However, that does not mean the task is infeasible since human drivers are known to be able to drive amongst dense traffic by leveraging the cooperativeness of other drivers to open a gap. The traditional methods fail to take into account the fact that the actions taken by an agent affect the behaviour of other vehicles on the road. In this work, we rely on the ability of deep reinforcement learning to implicitly model such interactions and learn a continuous control policy over the action space of an autonomous vehicle. The application we consider requires our agent to negotiate and open a gap in the road in order to successfully merge or change lanes. Our policy learns to repeatedly probe into the target road lane while trying to find a safe spot to move in to. We compare against two model-predictive control-based algorithms and show that our policy outperforms them in simulation.


Cooperation-Aware Lane Change Control in Dense Traffic

arXiv.org Artificial Intelligence

Sangjae Bae, Dhruv Saxena, Alireza Nakhaei, Chiho Choi, Kikuo Fujimura, and Scott Moura

This paper presents a real-time lane change control framework for autonomous driving in dense traffic, which exploits cooperative behaviors of human drivers. It focuses in particular on heavy traffic where vehicles cannot change lanes without cooperating with other drivers. In this case, classical robust controls may not apply since there is no "safe" area to merge into. That said, modeling complex and interactive human behaviors is nontrivial from the perspective of control engineers. We propose a mathematical control framework based on Model Predictive Control (MPC) encompassing a state-of-the-art Recurrent Neural Network (RNN) architecture. In particular, the RNN predicts interactive motions of human drivers in response to potential actions of the autonomous vehicle, which are then systematically evaluated against safety constraints. We also propose a real-time heuristic algorithm to find locally optimal control inputs. Finally, quantitative and qualitative analyses of simulation studies are presented, showing the strong potential of the proposed framework.

I. INTRODUCTION
An autonomous-driving vehicle is no longer a futuristic concept, and extensive research has been conducted in various aspects, spanning localization, perception, and control to implementation and validation. Particularly from the perspective of control engineers, designing a controller that ensures safety in various traffic conditions, such as driving on arterial roads or highways in free-flowing or dense traffic with or without traffic lights, has been a principal research focus.
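A highly simplified sketch of the control loop described above: candidate acceleration sequences are rolled out against a predictor of the target-lane follower's reaction, sequences violating the safety (gap) constraint are discarded, and the cheapest feasible one is kept. A hand-coded yielding model stands in for the RNN, the sampling-based search stands in for the paper's heuristic optimization, and all parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
DT, H = 0.2, 10                                   # time step [s] and horizon length

def predict_follower_gap(gap, v_ego, v_follower, accels, yield_rate=1.5):
    """Stand-in for the RNN predictor: the target-lane follower gradually yields
    (brakes) while the ego edges in. Returns the predicted gap at every step."""
    gaps = []
    for a in accels:
        v_ego = max(0.0, v_ego + a * DT)
        v_follower = max(0.0, v_follower - yield_rate * DT)
        gap += (v_ego - v_follower) * DT          # gap grows as the ego pulls ahead
        gaps.append(gap)
    return np.array(gaps)

def mpc_step(gap0=2.0, v_ego0=5.0, v_follower0=6.5, n_samples=100, safe_gap=2.5):
    """Evaluate sampled acceleration sequences, discard those whose predicted
    gap violates the safety constraint, and keep the lowest-cost feasible one."""
    best, best_cost = None, np.inf
    for _ in range(n_samples):
        accels = rng.uniform(-2.0, 2.0, size=H)
        gaps = predict_follower_gap(gap0, v_ego0, v_follower0, accels)
        if gaps.min() < 0.0 or gaps[-1] < safe_gap:      # collision or gap not opened
            continue
        cost = np.sum(accels ** 2) * DT - gaps[-1]       # control effort vs. gap gained
        if cost < best_cost:
            best, best_cost = accels, cost
    return best, best_cost

accels, cost = mpc_step()
print(cost, None if accels is None else np.round(accels[:3], 2))
```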


Cooperation-Aware Reinforcement Learning for Merging in Dense Traffic

arXiv.org Artificial Intelligence

Decision making in dense traffic can be challenging for autonomous vehicles. An autonomous system that relies only on predefined road priorities and considers other drivers as moving objects will cause the vehicle to freeze and fail the maneuver. Human drivers leverage the cooperation of other drivers to avoid such deadlock situations and convince others to change their behavior. Decision making algorithms must reason about the interaction with other drivers and anticipate a broad range of driver behaviors. In this work, we present a reinforcement learning approach to learn how to interact with drivers with different cooperation levels. We enhance the performance of traditional reinforcement learning algorithms by maintaining a belief over the level of cooperation of other drivers. We show that our agent successfully learns how to navigate a dense merging scenario with fewer deadlocks than with online planning methods.
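A belief over cooperation levels can be maintained with a simple Bayesian filter that compares observed reactions (e.g., how strongly the other driver brakes while the ego vehicle probes into the gap) with what each hypothesized level would predict, and the resulting belief vector augments the state given to the RL policy. The sketch below is a conceptual illustration with made-up numbers, not the paper's estimator:

```python
import numpy as np

LEVELS = np.array([0.0, 0.5, 1.0])        # hypothesized cooperation levels
EXPECTED_DECEL = LEVELS * 1.0             # cooperative drivers brake harder [m/s^2]

def update_belief(belief, observed_decel, sigma=0.3):
    """Bayesian update: weight each cooperation level by how well it explains
    the deceleration observed while the ego vehicle probes into the gap."""
    likelihood = np.exp(-0.5 * ((observed_decel - EXPECTED_DECEL) / sigma) ** 2)
    belief = belief * likelihood
    return belief / belief.sum()

belief = np.ones(3) / 3.0                 # uniform prior over cooperation levels
for obs in [0.8, 1.1, 0.9]:               # observed decelerations of the other driver
    belief = update_belief(belief, obs)
print(belief)                             # mass concentrates on the cooperative level

# The belief vector can then be appended to the physical state before it is
# fed to the RL policy, e.g. np.concatenate([ego_state, other_state, belief]).
```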


Uncertainty-Aware Data Aggregation for Deep Imitation Learning

arXiv.org Machine Learning

Estimating statistical uncertainties allows autonomous agents to communicate their confidence during task execution and is important for applications in safety-critical domains such as autonomous driving. In this work, we present the uncertainty-aware imitation learning (UAIL) algorithm for improving end-to-end control systems via data aggregation. UAIL applies Monte Carlo Dropout to estimate uncertainty in the control output of end-to-end systems, using states where it is uncertain to selectively acquire new training data. In contrast to prior data aggregation algorithms that force human experts to visit sub-optimal states at random, UAIL can anticipate its own mistakes and switch control to the expert in order to prevent visiting a series of sub-optimal states. Our experimental results from simulated driving tasks demonstrate that our proposed uncertainty estimation method can be leveraged to reliably predict infractions. Our analysis shows that UAIL outperforms existing data aggregation algorithms on a series of benchmark tasks.
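Monte Carlo Dropout uncertainty of the kind UAIL relies on can be obtained by keeping dropout active at inference time and measuring the spread of repeated stochastic forward passes; control is handed to the expert when that spread exceeds a threshold. The network, threshold, and switching rule below form a PyTorch sketch under those assumptions, not the paper's implementation:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small end-to-end control network with dropout (architecture is illustrative).
policy = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 2),                    # e.g. [steering, throttle]
)

def mc_dropout_control(obs, n_samples=20):
    """Monte Carlo Dropout: keep dropout active at inference time and use the
    spread of repeated stochastic forward passes as an uncertainty estimate."""
    policy.train()                       # keeps the dropout layers stochastic
    with torch.no_grad():
        samples = torch.stack([policy(obs) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)

obs = torch.randn(8)
action, uncertainty = mc_dropout_control(obs)

# Aggregation rule: hand control to the expert (and record this state for new
# labels) whenever the predicted uncertainty exceeds a tuned threshold.
THRESHOLD = 0.15
if uncertainty.max().item() > THRESHOLD:
    print("uncertain -> query expert for this state", uncertainty.tolist())
else:
    print("confident -> execute", action.tolist())
```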