control action
COVLM-RL: Critical Object-Oriented Reasoning for Autonomous Driving Using VLM-Guided Reinforcement Learning
Li, Lin, Cai, Yuxin, Fang, Jianwu, Xue, Jianru, Lv, Chen
End-to-end autonomous driving frameworks face persistent challenges in generalization, training efficiency, and interpretability. While recent methods leverage Vision-Language Models (VLMs) through supervised learning on large-scale datasets to improve reasoning, they often lack robustness in novel scenarios. Conversely, reinforcement learning (RL)-based approaches enhance adaptability but remain data-inefficient and lack transparent decision-making. To address these limitations, we propose COVLM-RL, a novel end-to-end driving framework that integrates Critical Object-oriented (CO) reasoning with VLM-guided RL. Specifically, we design a Chain-of-Thought (CoT) prompting strategy that enables the VLM to reason over critical traffic elements and generate high-level semantic decisions, effectively transforming multi-view visual inputs into structured semantic decision priors. These priors reduce the input dimensionality and inject task-relevant knowledge into the RL loop, accelerating training and improving policy interpretability. However, bridging high-level semantic guidance with continuous low-level control remains non-trivial. To this end, we introduce a consistency loss that encourages alignment between the VLM's semantic plans and the RL agent's control outputs, enhancing interpretability and training stability. Experiments conducted in the CARLA simulator demonstrate that COVLM-RL significantly improves the success rate by 30% in trained driving environments and by 50% in previously unseen environments, highlighting its strong generalization capability.
- Asia > Singapore (0.05)
- Asia > China > Shaanxi Province > Xi'an (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Transportation > Ground > Road (1.00)
- Information Technology > Robotics & Automation (0.63)
- Asia > China > Shanghai > Shanghai (0.06)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Asia > Middle East > Jordan (0.04)
- Transportation > Ground > Road (1.00)
- Information Technology (1.00)
- Automobiles & Trucks (0.85)
- North America > United States > California > Los Angeles County > Pasadena (0.05)
- Asia > China > Beijing > Beijing (0.04)
- Energy > Renewable (0.68)
- Energy > Power Industry (0.46)
- North America > United States > Georgia > Clarke County > Athens (0.14)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
Behavior-Aware Online Prediction of Obstacle Occupancy using Zonotopes
Carrizosa-Rendon, Alvaro, Zhou, Jian, Frisk, Erik, Puig, Vicenc, Nejjari, Fatiha
Abstract: Predicting the motion of surrounding vehicles is key to safe autonomous driving, especially in unstructured environments without prior information. This paper proposes a novel online method to accurately predict the occupancy sets of surrounding vehicles based solely on motion observations. The approach is divided into two stages: first, an Extended Kalman Filter and a Linear Programming (LP) problem are used to estimate a compact zonotopic set of control actions; then, a reachability analysis propagates this set to predict future occupancy. The effectiveness of the method has been validated through simulations in an urban environment, showing accurate and compact predictions without relying on prior assumptions or prior training data.
I. INTRODUCTION
Autonomous driving has generated great research interest given its expected benefits, such as reducing accidents and optimizing traffic efficiency and energy management [1]. However, ensuring safety remains a major challenge, particularly in urban environments, where multiple agents interact dynamically [2]. Predicting the motion of surrounding vehicles (SVs) is critical to designing safe motion planning and control strategies for autonomous vehicles.
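The second stage described in the abstract above (propagating a zonotopic set of control actions through a reachability analysis) can be illustrated with a minimal sketch. It assumes a linear discrete-time model x+ = A x + B u, which is a simplification chosen for illustration and not necessarily the vehicle model used in the paper. A zonotope is stored as a center c and a generator matrix G; the helper names (step, interval_hull) and the numbers in the usage lines are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]  # row-major: one inner list per row


def mat_vec(A: Matrix, x: Vector) -> Vector:
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def mat_mat(A: Matrix, B: Matrix) -> Matrix:
    """Matrix-matrix product A B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def step(c: Vector, G: Matrix, A: Matrix, B: Matrix,
         cu: Vector, Gu: Matrix) -> Tuple[Vector, Matrix]:
    """One reachability step for x in <c, G> and u in <cu, Gu>.

    A linear map of a zonotope maps its center and generators; the
    Minkowski sum of two zonotopes adds centers and concatenates generators.
    """
    c_next = [p + q for p, q in zip(mat_vec(A, c), mat_vec(B, cu))]
    AG, BGu = mat_mat(A, G), mat_mat(B, Gu)
    G_next = [row_a + row_b for row_a, row_b in zip(AG, BGu)]
    return c_next, G_next


def interval_hull(c: Vector, G: Matrix) -> List[Tuple[float, float]]:
    """Tightest axis-aligned box containing the zonotope <c, G>."""
    radius = [sum(abs(g) for g in row) for row in G]
    return [(ci - r, ci + r) for ci, r in zip(c, radius)]


# Hypothetical usage: 1-D double integrator (position, velocity), unit time
# step, known initial state, acceleration assumed to lie in [-0.5, 0.5].
A = [[1.0, 1.0], [0.0, 1.0]]
B = [[0.5], [1.0]]
c, G = [0.0, 1.0], [[0.0], [0.0]]
for _ in range(2):
    c, G = step(c, G, A, B, cu=[0.0], Gu=[[0.5]])
print(interval_hull(c, G))
```

The generator count grows by one column block per step, so a practical implementation would typically apply an order-reduction step; that is omitted here.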
Least Restrictive Hyperplane Control Barrier Functions
Trende, Mattias, Ögren, Petter
Control Barrier Functions (CBFs) can provide provable safety guarantees for dynamic systems. However, finding a valid CBF for a system of interest is often non-trivial, especially if the shape of the unsafe region is complex and the CBFs are of higher order. A common solution to this problem is to make a conservative approximation of the unsafe region in the form of a line/hyperplane, and use the corresponding conservative Hyperplane-CBF when deciding on safe control actions. In this letter, we note that conservative constraints are only a problem if they prevent us from doing what we want. Thus, instead of first choosing a CBF and then choosing a safe control with respect to the CBF, we optimize over a combination of CBFs and safe controls to get as close as possible to our desired control, while still having the safety guarantee provided by the CBF. We call the corresponding CBF the least restrictive Hyperplane-CBF. Finally, we also provide a way of creating a smooth parameterization of the CBF-family for the optimization, and illustrate the approach on a double integrator dynamical system with acceleration constraints, moving through a group of arbitrarily shaped static and moving obstacles.
- North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
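The idea in the Hyperplane-CBF abstract above, choosing the hyperplane and the safe control together so that the result stays as close as possible to the desired control, can be sketched for a 2-D double integrator and one static circular obstacle. Everything below is an illustrative assumption and not the letter's method: the second-order CBF condition with gains a1 and a2, the brute-force search over hyperplane angles (the letter describes a smooth parameterization instead), and the omission of acceleration limits. The function names are hypothetical.

```python
import math


def hyperplane_cbf_filter(p, v, u_des, obstacle, radius, theta, a1=1.0, a2=1.0):
    """Closest control to u_des that satisfies one hyperplane CBF constraint.

    The hyperplane is tangent to the obstacle with unit normal
    n = (cos theta, sin theta), so h(p) = n.(p - obstacle) - radius.
    For a double integrator (p' = v, v' = u) the second-order condition is
    n.u >= -(a1 + a2) * n.v - a1 * a2 * h.
    Returns (u_safe, correction_norm), or None if this hyperplane is not a
    valid barrier at the current state.
    """
    n = (math.cos(theta), math.sin(theta))
    h = n[0] * (p[0] - obstacle[0]) + n[1] * (p[1] - obstacle[1]) - radius
    h_dot = n[0] * v[0] + n[1] * v[1]
    if h < 0.0 or h_dot + a1 * h < 0.0:
        return None
    bound = -(a1 + a2) * h_dot - a1 * a2 * h
    # Projection of u_des onto the half-space n.u >= bound (n has unit length).
    correction = max(0.0, bound - (n[0] * u_des[0] + n[1] * u_des[1]))
    u_safe = (u_des[0] + correction * n[0], u_des[1] + correction * n[1])
    return u_safe, correction


def least_restrictive(p, v, u_des, obstacle, radius, num_angles=360):
    """Grid search over hyperplane angles for the smallest control correction."""
    best = None
    for k in range(num_angles):
        theta = 2.0 * math.pi * k / num_angles
        result = hyperplane_cbf_filter(p, v, u_des, obstacle, radius, theta)
        if result is not None and (best is None or result[1] < best[1]):
            best = result
    return best


# Hypothetical scenario: the robot approaches a unit-radius obstacle at the origin.
p, v, u_des = (-3.0, 0.5), (1.0, 0.0), (1.0, 0.0)
radial = math.atan2(p[1], p[0])  # conservative choice: normal points at the robot
print(hyperplane_cbf_filter(p, v, u_des, (0.0, 0.0), 1.0, radial))
print(least_restrictive(p, v, u_des, (0.0, 0.0), 1.0))
```

In this scenario the searched hyperplane requires a smaller change to the desired control than the fixed radial one, which is the effect the abstract describes. The sketch makes a fresh choice at every call and does not reproduce the safety argument given in the letter.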
Lyapunov-Aware Quantum-Inspired Reinforcement Learning for Continuous-Time Vehicle Control: A Feasibility Study
Kraipatthanapong, Nutkritta, Thathong, Natthaphat, Suksawas, Pannita, Klunklin, Thanunnut, Vongthonglua, Kritin, Attahakul, Krit, Aueawatthanaphisut, Aueaphum
This paper presents a novel Lyapunov-Based Quantum Reinforcement Learning (LQRL) framework that integrates quantum policy optimization with Lyapunov stability analysis for continuous-time vehicle control. The proposed approach combines the representational power of variational quantum circuits (VQCs) with a stability-aware policy gradient mechanism to ensure asymptotic convergence and safe decision-making under dynamic environments. The vehicle longitudinal control problem was formulated as a continuous-state reinforcement learning task, where the quantum policy network generates control actions subject to Lyapunov stability constraints. Simulation experiments were conducted in a closed-loop adaptive cruise control scenario using a quantum-inspired policy trained under stability feedback. The results demonstrate that the LQRL framework successfully embeds Lyapunov stability verification into quantum policy learning, enabling interpretable and stability-aware control performance. Although transient overshoot and Lyapunov divergence were observed under aggressive acceleration, the system maintained bounded state evolution, validating the feasibility of integrating safety guarantees within quantum reinforcement learning architectures. The proposed framework provides a foundational step toward provably safe quantum control in autonomous systems and hybrid quantum-classical optimization domains.
- Asia > Thailand > Pathum Thani > Pathum Thani (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Transportation (0.58)
- Energy (0.49)
- Automobiles & Trucks (0.37)
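The LQRL abstract above does not say how Lyapunov stability constraints enter the policy update. One common pattern, shown here purely as an assumption, is to penalize any increase of a candidate Lyapunov function along a transition. The quadratic candidate V and the two-component tracking error (gap error, speed error) are hypothetical choices for an adaptive cruise control setting.

```python
def lyapunov_value(error):
    """Hypothetical quadratic Lyapunov candidate V(e) = sum of squared errors."""
    return sum(e * e for e in error)


def lyapunov_penalty(error, next_error):
    """Positive when V increases along the transition, zero otherwise."""
    return max(0.0, lyapunov_value(next_error) - lyapunov_value(error))


# A transition that shrinks the tracking error incurs no penalty.
print(lyapunov_penalty((1.0, 0.0), (0.5, 0.0)))
# A transition that grows the tracking error is penalized.
print(lyapunov_penalty((0.5, 0.0), (1.0, 0.0)))
```

Subtracting a weighted version of this penalty from the reward is one way to make a policy gradient stability-aware. Whether the paper does this is not stated in the abstract.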
Active Measuring in Reinforcement Learning With Delayed Negative Effects
Gao, Daiqi, Xu, Ziping, Rawashdeh, Aseel, Klasnja, Predrag, Murphy, Susan A.
Measuring states in reinforcement learning (RL) can be costly in real-world settings and may negatively influence future outcomes. We introduce the Actively Observable Markov Decision Process (AOMDP), where an agent not only selects control actions but also decides whether to measure the latent state. The measurement action reveals the true latent state but may have a negative delayed effect on the environment. We show that this reduced uncertainty may provably improve sample efficiency and increase the value of the optimal policy despite these costs. We formulate an AOMDP as a periodic partially observable MDP and propose an online RL algorithm based on belief states. To approximate the belief states, we further propose a sequential Monte Carlo method to jointly approximate the posterior of unknown static environment parameters and unobserved latent states. We evaluate the proposed algorithm in a digital health application, where the agent decides when to deliver digital interventions and when to assess users' health status through surveys.
- North America > United States > North Carolina (0.04)
- North America > United States > Michigan (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
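The belief-state mechanism in the AOMDP abstract above can be sketched for a small discrete latent state. The paper approximates belief states with a sequential Monte Carlo method and also models a delayed negative effect of measuring; the exact discrete filter below is a simpler stand-in that omits both. It also assumes that a measurement reveals the state reached after the transition. The function names and the transition matrix are hypothetical.

```python
def predict(belief, transition):
    """Propagate a belief through transition[s][s_next] = P(s_next | s)."""
    n = len(belief)
    return [sum(belief[s] * transition[s][s2] for s in range(n)) for s2 in range(n)]


def belief_step(belief, transition, measured_state=None):
    """One belief update.

    If the agent chose to measure, the revealed latent state collapses the
    belief to a point mass; otherwise only the prediction step applies.
    """
    if measured_state is not None:
        return [1.0 if s == measured_state else 0.0 for s in range(len(belief))]
    return predict(belief, transition)


# Two latent states; the transition matrix depends on the chosen control action.
T = [[0.75, 0.25], [0.5, 0.5]]
b = [1.0, 0.0]
b = belief_step(b, T)                    # no measurement: uncertainty grows
b = belief_step(b, T)                    # still no measurement
print(b)
b = belief_step(b, T, measured_state=1)  # measurement: uncertainty removed
print(b)
```

The trade-off described in the abstract appears here as the choice between letting the belief spread and paying the cost of collapsing it.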
VeMo: A Lightweight Data-Driven Approach to Model Vehicle Dynamics
Oddo, Girolamo, Nuca, Roberto, Parsani, Matteo
Abstract: Developing a dynamic model for a high-performance vehicle is a complex problem that requires extensive structural information about the system under analysis. This information is often unavailable to those who did not design the vehicle and represents a typical issue in autonomous driving applications, which are frequently developed on top of existing vehicles; therefore, vehicle models are developed under conditions of information scarcity. This paper proposes a lightweight encoder-decoder model based on Gated Recurrent Unit layers to correlate the vehicle's future state with its past states, measured onboard, and the control actions the driver performs. The results demonstrate that the model achieves a maximum mean relative error below 2.6% in extreme dynamic conditions. It also shows good robustness when subjected to noisy input data across the frequency components of interest. Furthermore, despite being entirely data-driven and free from physical constraints, the model exhibits physical consistency in the output signals, such as longitudinal and lateral accelerations, yaw rate, and the vehicle's longitudinal velocity.
In the automotive sector, developing a representative vehicle dynamics model is a complex and multifaceted challenge [1]-[3]. Numerous nonlinear factors influence vehicle dynamics, including tire characteristics, suspension geometry, aerodynamics, drivetrain effects, and external environmental factors, such as road surface grip conditions and climatic effects (e.g., wind). Accurately capturing these effects in a computational model requires high-fidelity multibody simulation software and a profound understanding of the vehicle system.
- Automobiles & Trucks (1.00)
- Transportation > Ground > Road (0.48)
- Leisure & Entertainment > Sports > Motorsports (0.46)
- Information Technology > Robotics & Automation (0.34)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
A Hierarchical Control Architecture for Space Robots in On-Orbit Servicing Operations
The Kessler syndrome describes the self-sustaining cascade of collisions that could render orbital regions unusable (see Kessler and Cour-Palais (1978)). To mitigate this threat, two key strategies have emerged: Active Debris Removal (ADR) and In-Orbit Servicing (IOS). ADR focuses on the active removal of defunct satellites and fragments, while IOS extends the operational lifetime of active satellites through tasks such as refueling, repair, and upgrading, as explained in Flores-Abad et al. (2014); Shan et al. (2016). Space robots represent a promising solution for both ADR and IOS. Designing a coordinated controller for this kind of system, which requires autonomous capabilities in the space environment, is complex due to the dynamic couplings between the spacecraft and the robotic arm. For this reason, such systems have been studied for many years, starting from the pioneering work of Papadopoulos and Dubowsky (1991) up to the most recent works of Giordano et al. (2020) and Giordano et al. (2019). The inherent complexity of robotic systems is also due to the presence of uncertainties and external disturbances, which can be mitigated using robust control techniques. The works of Dubanchet et al. (2015) and Faure et al. (2022) represent the state of the art in the context of H
- North America > United States > California > Sacramento County > Sacramento (0.04)
- Europe > United Kingdom > England > Devon > Plymouth (0.04)
- Europe > Italy > Lombardy > Milan (0.04)
- Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)