Choi, Chiho
COPAL: Continual Pruning in Large Language Generative Models
Malla, Srikanth, Choi, Joon Hee, Choi, Chiho
Adapting pre-trained large language models to different domains in natural language processing raises two key challenges: high computational demands and the model's inability to adapt continually. To address both issues simultaneously, this paper presents COPAL (COntinual Pruning in Adaptive Language settings), an algorithm developed for pruning large language generative models under a continual model adaptation setting. While avoiding resource-heavy finetuning or retraining, our pruning process is guided by the proposed sensitivity analysis. The sensitivity measures the model's ability to withstand perturbations introduced by the new dataset and identifies the model weights that are relevant for all encountered datasets. As a result, COPAL allows seamless model adaptation to new domains while improving resource efficiency. Our empirical evaluation on LLMs of various sizes shows that COPAL outperforms baseline models, demonstrating its efficacy in both efficiency and adaptability.
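As a rough illustration of pruning guided by a score that is accumulated over every dataset seen so far, the sketch below keeps a running per-weight importance and prunes the lowest-scoring weights without any retraining. The scoring rule (weight magnitude times mean absolute input activation) and all function names are assumptions made for this example; they are not COPAL's sensitivity measure, which the abstract does not spell out.

```python
# Illustrative only: a generic importance-score pruning loop, NOT the COPAL algorithm.
# The score |w_ij| * mean|x_j| is an assumed stand-in for the paper's sensitivity measure.

def mean_abs_activation(inputs):
    """Mean absolute value of each input feature over a calibration dataset."""
    n_features = len(inputs[0])
    return [sum(abs(x[j]) for x in inputs) / len(inputs) for j in range(n_features)]

def update_scores(scores, weights, inputs):
    """Add this dataset's per-weight importance to the running scores."""
    act = mean_abs_activation(inputs)
    return [
        [scores[i][j] + abs(weights[i][j]) * act[j] for j in range(len(act))]
        for i in range(len(weights))
    ]

def prune(weights, scores, sparsity):
    """Zero out the `sparsity` fraction of weights with the lowest accumulated score."""
    ranked = sorted(
        (scores[i][j], i, j)
        for i in range(len(weights))
        for j in range(len(weights[0]))
    )
    n_prune = int(len(ranked) * sparsity)
    pruned = [row[:] for row in weights]  # copy so the original weights are untouched
    for _, i, j in ranked[:n_prune]:
        pruned[i][j] = 0.0
    return pruned

# Toy layer with two outputs and two inputs, adapted to two datasets in sequence.
weights = [[0.5, -2.0], [1.5, 0.1]]
scores = [[0.0, 0.0], [0.0, 0.0]]
dataset_a = [[1.0, 0.0], [3.0, 0.0]]    # only feature 0 is active
dataset_b = [[0.0, 1.0], [0.0, -1.0]]   # only feature 1 is active
for data in (dataset_a, dataset_b):
    scores = update_scores(scores, weights, data)
pruned = prune(weights, scores, sparsity=0.5)
```

Because the scores are summed across datasets, a weight that matters for an earlier dataset is not discarded just because the newest dataset does not exercise it, which is the intuition behind keeping weights "relevant for all encountered datasets".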
Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning
Sachdeva, Enna, Agarwal, Nakul, Chundi, Suhas, Roelofs, Sean, Li, Jiachen, Kochenderfer, Mykel, Choi, Chiho, Dariush, Behzad
The widespread adoption of commercial autonomous vehicles (AVs) and advanced driver assistance systems (ADAS) may largely depend on their acceptance by society, for which their perceived trustworthiness and interpretability to riders are crucial. In general, this task is challenging because modern autonomous systems software relies heavily on black-box artificial intelligence models. Towards this goal, this paper introduces a novel dataset, Rank2Tell, a multi-modal ego-centric dataset for Ranking the importance level and Telling the reason for the importance. Using a range of closed- and open-ended visual question answering, the dataset provides dense annotations of semantic, spatial, temporal, and relational attributes of important objects in complex traffic scenarios. The dense annotations and unique attributes of the dataset make it a valuable resource for researchers working on visual scene understanding and related fields. Furthermore, we introduce a joint model for importance level ranking and natural language caption generation to benchmark our dataset and demonstrate performance with quantitative evaluations.
Domain Knowledge Driven Pseudo Labels for Interpretable Goal-Conditioned Interactive Trajectory Prediction
Sun, Lingfeng, Tang, Chen, Niu, Yaru, Sachdeva, Enna, Choi, Chiho, Misu, Teruhisa, Tomizuka, Masayoshi, Zhan, Wei
Motion forecasting in highly interactive scenarios is a challenging problem in autonomous driving. In such scenarios, we need to accurately predict the joint behavior of interacting agents to ensure the safe and efficient navigation of autonomous vehicles. Recently, goal-conditioned methods have gained increasing attention due to their advantage in performance and their ability to capture the multimodality in trajectory distribution. In this work, we study the joint trajectory prediction problem with the goal-conditioned framework. In particular, we introduce a conditional-variational-autoencoder-based (CVAE) model to explicitly encode different interaction modes into the latent space. However, we discover that the vanilla model suffers from posterior collapse and cannot induce an informative latent space as desired. To address these issues, we propose a novel approach to avoid KL vanishing and induce an interpretable interactive latent space with pseudo labels. The proposed pseudo labels allow us to incorporate domain knowledge on interaction in a flexible manner. We motivate the proposed method using an illustrative toy example. In addition, we validate our framework on the Waymo Open Motion Dataset with both quantitative and qualitative evaluations.
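Posterior collapse (KL vanishing) can be made concrete through the KL term of a typical VAE-style objective. The sketch below assumes a diagonal-Gaussian posterior and a standard-normal prior, neither of which the abstract specifies, and the function name is hypothetical. It only shows what a "collapsed" latent looks like numerically, not the paper's pseudo-label remedy.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return sum(0.5 * (math.exp(lv) + m * m - 1.0 - lv) for m, lv in zip(mu, log_var))

# A collapsed posterior equals the prior, so the latent carries no information.
collapsed = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# An informative posterior moves away from the prior and pays a nonzero KL cost.
informative = kl_to_standard_normal([1.0, -2.0], [0.0, 0.0])
```

When training drives this term to zero for every input, the decoder is effectively ignoring the latent variable, which is the failure mode the abstract describes for the vanilla model.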
LOKI: Long Term and Key Intentions for Trajectory Prediction
Girase, Harshayu, Gang, Haiming, Malla, Srikanth, Li, Jiachen, Kanehara, Akira, Mangalam, Karttikeya, Choi, Chiho
Recent advances in trajectory prediction have shown that explicit reasoning about agents' intent is important to accurately forecast their motion. However, the current research activities are not directly applicable to intelligent and safety critical systems. This is mainly because very few public datasets are available, and they only consider pedestrian-specific intents for a short temporal horizon from a restricted egocentric view. To this end, we propose LOKI (LOng term and Key Intentions), a novel large-scale dataset that is designed to tackle joint trajectory and intention prediction for heterogeneous traffic agents (pedestrians and vehicles) in an autonomous driving setting. The LOKI dataset is created to discover several factors that may affect intention, including i) agent's own will, ii) social interactions, iii) environmental constraints, and iv) contextual information. We also propose a model that jointly performs trajectory and intention prediction, showing that recurrently ...
Figure 1: We show that reasoning about long-term goals and short-term intents plays a significant role in trajectory prediction. With a lack of comprehensive benchmarks for this purpose, we introduce a new dataset for intention and trajectory prediction. An example use case is illustrated in (a) where we predict the trajectory of the target vehicle. In (b), long-term goals are estimated from agent's own motion.
RAIN: Reinforced Hybrid Attention Inference Network for Motion Forecasting
Li, Jiachen, Yang, Fan, Ma, Hengbo, Malla, Srikanth, Tomizuka, Masayoshi, Choi, Chiho
Motion forecasting plays a significant role in various domains (e.g., autonomous driving, human-robot interaction), which aims to predict future motion sequences given a set of historical observations. However, the observed elements may be of different levels of importance. Some information may be irrelevant or even distracting to the forecasting in certain situations. To address this issue, we propose a generic motion forecasting framework (named RAIN) with dynamic key information selection and ranking based on a hybrid attention mechanism. The general framework is instantiated to handle multi-agent trajectory prediction and human motion forecasting tasks, respectively. In the former task, the model learns to recognize the relations between agents with a graph representation and to determine their relative significance. In the latter task, the model learns to capture the temporal proximity and dependency in long-term human motions. We also propose an effective double-stage training pipeline with an alternating training strategy to optimize the parameters in different modules of the framework. We validate the framework on both synthetic simulations and motion forecasting benchmarks in different domains, demonstrating that our method not only achieves state-of-the-art forecasting performance, but also provides interpretable and reasonable hybrid attention weights.
Shared Cross-Modal Trajectory Prediction for Autonomous Driving
Choi, Chiho, Choi, Joon Hee, Li, Jiachen, Malla, Srikanth
Predicting future trajectories of traffic agents in highly interactive environments is an essential and challenging problem for the safe operation of autonomous driving systems. Because self-driving vehicles are equipped with various types of sensors (e.g., LiDAR scanner, RGB camera, radar), we propose a Cross-Modal Embedding framework that aims to benefit from the use of multiple input modalities. At training time, our model learns to embed a set of complementary features in a shared latent space by jointly optimizing the objective functions across different types of input data. At test time, a single input modality (e.g., LiDAR data) is required to generate predictions from the input perspective (i.e., in the LiDAR space), while taking advantage of the model trained with multiple sensor modalities. An extensive evaluation is conducted to show the efficacy of the proposed framework using two benchmark driving datasets.
Social-STAGE: Spatio-Temporal Multi-Modal Future Trajectory Forecast
Malla, Srikanth, Choi, Chiho, Dariush, Behzad
This paper considers the problem of multi-modal future trajectory forecast with ranking. Here, multi-modality and ranking refer to the multiple plausible path predictions and the confidence in those predictions, respectively. We propose Social-STAGE, Social interaction-aware Spatio-Temporal multi-Attention Graph convolution network with novel Evaluation for multi-modality. Our main contributions include analysis and formulation of multi-modality with ranking using interaction and multi-attention, and introduction of new metrics to evaluate the diversity and associated confidence of multi-modal predictions. We evaluate our approach on the existing public datasets ETH and UCY and show that the proposed algorithm outperforms the state of the art on these datasets.
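To make "multi-modality with ranking" concrete, the sketch below scores a set of predicted paths against the ground truth in two ways: the best error over all modes, and the error of the mode the model is most confident in. It uses the common average displacement error (ADE); the function names and toy data are assumptions for illustration, and these are not the new metrics the paper introduces.

```python
import math

def ade(pred, truth):
    """Average displacement error: mean Euclidean distance over time steps."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)

def evaluate_ranked(predictions, confidences, truth):
    """Return (min ADE over all modes, ADE of the most confident mode)."""
    errors = [ade(p, truth) for p in predictions]
    top = max(range(len(confidences)), key=confidences.__getitem__)
    return min(errors), errors[top]

# Two candidate futures for one agent; the model ranks the second one higher.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
predictions = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],   # matches the ground truth
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],   # drifts away diagonally
]
confidences = [0.3, 0.7]
best_ade, top_ranked_ade = evaluate_ranked(predictions, confidences, truth)
```

A gap between the two numbers indicates that a good path was predicted but ranked poorly, which is why evaluating confidence alongside diversity matters.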
Cooperation-Aware Lane Change Control in Dense Traffic
Bae, Sangjae, Saxena, Dhruv, Nakhaei, Alireza, Choi, Chiho, Fujimura, Kikuo, Moura, Scott
This paper presents a real-time lane change control framework for autonomous driving in dense traffic, which exploits cooperative behaviors of human drivers. This paper especially focuses on heavy traffic where vehicles cannot change lanes without cooperating with other drivers. In this case, classical robust controls may not apply since there is no "safe" area to merge into. That said, modeling complex and interactive human behaviors is nontrivial from the perspective of control engineers. We propose a mathematical control framework based on Model Predictive Control (MPC) encompassing a state-of-the-art Recurrent Neural Network (RNN) architecture. In particular, the RNN predicts interactive motions of human drivers in response to potential actions of the autonomous vehicle, which are then systematically evaluated in safety constraints. We also propose a real-time heuristic algorithm to find locally optimal control inputs. Finally, quantitative and qualitative analyses of simulation studies are presented, showing the strong potential of the proposed framework.
Introduction: An autonomous-driving vehicle is no longer a futuristic concept, and extensive research has been conducted on various aspects, spanning from localization, perception, and control to implementation and validation. Particularly from the perspective of control engineers, designing a controller that secures safety in various traffic conditions, such as driving on arterial roads or highways, in free-flow or dense traffic, with or without traffic lights, has been a principal research focus.