unexpected situation


Meta-reasoning Using Attention Maps and Its Applications in Cloud Robotics

Lendinez, Adrian, Qiu, Renxi, Zanzi, Lanfranco, Li, Dayou

arXiv.org Artificial Intelligence

Meta-reasoning, a branch of AI, focuses on reasoning about reasoning. It has the potential to enhance robots' decision-making processes in unexpected situations. However, the concept has largely been confined to theoretical discussions and case-by-case investigations, lacking general and practical solutions when the Value of Computation (VoC) is undefined, which is common in unexpected situations. In this work, we propose a revised meta-reasoning framework that significantly improves the scalability of the original approach in unexpected situations. This is achieved by incorporating semantic attention maps and unsupervised "attention" updates into the meta-reasoning processes. To accommodate environmental dynamics, "lines of thought" are used to bridge context-specific objects with abstracted attentions, while meta-information is monitored and controlled at the meta-level for effective reasoning. The practicality of the proposed approach is demonstrated through cloud robots deployed in real-world scenarios, showing improved performance and robustness.

INTRODUCTION

Significant progress has been made in probabilistic robotics to improve the adaptability and robustness of robot operations [1]. By integrating probabilistic models and statistical methods into perception and decision-making processes, robots can address structured uncertainty and randomness. However, to remain robust in unexpected situations, autonomous systems must also manage their reasoning processes, such as effectively handling uncertainties at the ground level and adapting objects at the conceptual level. This capability, known as meta-reasoning, facilitates reasoning about reasoning [2].
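To make the abstract's core idea concrete, here is a minimal sketch of meta-level control: the controller picks the next reasoning step by Value of Computation (expected gain minus cost) when VoC can be estimated, and falls back to an attention-weighted choice when it cannot. All names, numbers, and the two-candidate scenario are illustrative assumptions, not the paper's actual framework.

```python
# Hypothetical sketch: VoC-based meta-reasoning with an attention-map
# fallback for unexpected situations (where VoC is undefined).

def value_of_computation(expected_gain, cost):
    """Classic VoC: expected utility gain of a computation minus its cost.
    Returns None when the gain cannot be estimated."""
    if expected_gain is None:
        return None
    return expected_gain - cost

def select_computation(candidates, attention):
    """Pick the next reasoning step.

    candidates: dict name -> (expected_gain or None, cost)
    attention:  dict name -> weight from a semantic attention map

    Prefer the highest VoC; if VoC is undefined for every candidate,
    defer to the most attended computation.
    """
    scored = {}
    for name, (gain, cost) in candidates.items():
        voc = value_of_computation(gain, cost)
        if voc is not None:
            scored[name] = voc
    if scored:
        return max(scored, key=scored.get)
    # Unexpected situation: no usable VoC, fall back to attention.
    return max(attention, key=attention.get)

# Both candidates have undefined gain, so attention decides.
candidates = {"replan_path": (None, 0.2), "re_localize": (None, 0.1)}
attention = {"replan_path": 0.3, "re_localize": 0.7}
print(select_computation(candidates, attention))  # -> re_localize
```

The fallback is the interesting design point: instead of stalling when utilities cannot be computed, the meta-level reuses the attention map as a proxy for relevance.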


DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences

Huang, Yidong, Sansom, Jacob, Ma, Ziqiao, Gervits, Felix, Chai, Joyce

arXiv.org Artificial Intelligence

Recent advancements in foundation models (FMs) have unlocked new prospects in autonomous driving, yet the experimental settings of these studies are preliminary, over-simplified, and fail to capture the complexity of real-world driving scenarios in human environments. It remains under-explored whether FM agents can handle long-horizon navigation tasks with free-form dialogue and deal with unexpected situations caused by environmental dynamics or task changes. To explore the capabilities and boundaries of FMs faced with the challenges above, we introduce DriVLMe, a video-language-model-based agent that facilitates natural and effective communication between humans and autonomous vehicles that perceive the environment and navigate. We develop DriVLMe from both embodied experiences in a simulated environment and social experiences from real human dialogue. While DriVLMe demonstrates competitive performance in both open-loop benchmarks and closed-loop human studies, we reveal several limitations and challenges, including unacceptable inference time, imbalanced training data, limited visual understanding, challenges with multi-turn interactions, simplified language generation from robotic experiences, and difficulties in handling on-the-fly unexpected situations like environmental dynamics and task changes.


Driving Everywhere with Large Language Model Policy Adaptation

Li, Boyi, Wang, Yue, Mao, Jiageng, Ivanovic, Boris, Veer, Sushant, Leung, Karen, Pavone, Marco

arXiv.org Artificial Intelligence

Adapting driving behavior to new environments, customs, and laws is a long-standing problem in autonomous driving, precluding the widespread deployment of autonomous vehicles (AVs). In this paper, we present LLaDA, a simple yet powerful tool that enables human drivers and autonomous vehicles alike to drive everywhere by adapting their tasks and motion plans to traffic rules in new locations. LLaDA achieves this by leveraging the impressive zero-shot generalizability of large language models (LLMs) in interpreting the traffic rules in the local driver handbook. Through an extensive user study, we show that LLaDA's instructions are useful in disambiguating in-the-wild unexpected situations. We also demonstrate LLaDA's ability to adapt AV motion planning policies in real-world datasets; LLaDA outperforms baseline planning approaches on all our metrics. Please check our website for more details: https://boyiliee.github.io/llada.
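The mechanism the abstract describes, prompting an LLM with the local driver handbook so it can adjust a plan to local rules, can be sketched roughly as follows. The prompt wording, the `adapt_plan` helper, and the `llm` callable are assumptions for illustration, not LLaDA's actual interface; a stub stands in for a real language model so the sketch stays self-contained.

```python
# Hypothetical sketch of handbook-driven policy adaptation in the spirit
# of LLaDA: prompt an LLM with local traffic rules, the situation, and
# the current plan, and ask for a rule-compliant revision.

def adapt_plan(llm, handbook_excerpt, situation, current_plan):
    prompt = (
        "You are a driving assistant.\n"
        f"Local traffic rules:\n{handbook_excerpt}\n"
        f"Situation: {situation}\n"
        f"Current plan: {current_plan}\n"
        "Revise the plan so it complies with the local rules."
    )
    return llm(prompt)

# Stub LLM: keyword lookup standing in for zero-shot rule interpretation.
def stub_llm(prompt):
    if "left-hand traffic" in prompt:
        return "Keep to the left lane and give way to the right at roundabouts."
    return "Keep the current plan."

print(adapt_plan(stub_llm,
                 "Drive on the left (left-hand traffic).",
                 "Entering a roundabout in the UK.",
                 "Keep to the right lane."))
```

The design choice worth noting is that the handbook text is passed verbatim in the prompt, so adapting to a new region only requires swapping the excerpt, with no retraining of the planner.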