AITopics

Bilevel Multi-Armed Bandit-Based Hierarchical Reinforcement Learning for Interaction-Aware Self-Driving at Unsignalized Intersections

Peng, Zengqi, Wang, Yubin, Zheng, Lei, Ma, Jun

In this work, we present BiM-ACPPO, a bilevel multi-armed bandit-based hierarchical reinforcement learning framework for interaction-aware decision-making and planning at unsignalized intersections. Essentially, it proactively takes the uncertainties associated with surrounding vehicles (SVs) into consideration, which encompass those stemming from the driver's intention, interactive behaviors, and the varying number of SVs. Intermediate decision variables are introduced to enable the high-level RL policy to provide an interaction-aware reference, for guiding low-level model predictive control (MPC) and further enhancing the generalization ability of the proposed framework. By leveraging the structured nature of self-driving at unsignalized intersections, the training problem of the RL policy is modeled as a bilevel curriculum learning task, which is addressed by the proposed Exp3.S-based BiMAB algorithm. It is noteworthy that the training curricula are dynamically adjusted, thereby facilitating the sample efficiency of the RL training process. Comparative experiments are conducted in the high-fidelity CARLA simulator, and the results indicate that our approach achieves superior performance compared to all baseline methods. Furthermore, experimental results in two new urban driving scenarios clearly demonstrate the commendable generalization performance of the proposed method.
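
The abstract names Exp3.S as the bandit underlying the curriculum selection. As a rough illustration only, the sketch below implements the standard single-level Exp3.S update (Auer et al., 2002) for rewards in [0, 1]. Treating each arm as a training curriculum and the reward as a learning-progress signal is an assumption made here; the class name `Exp3S`, the parameter values, and the weight rescaling step are hypothetical, and the paper's bilevel BiMAB structure is not reproduced.

```python
import math
import random


class Exp3S:
    """Standard Exp3.S bandit for non-stationary rewards in [0, 1] (illustrative)."""

    def __init__(self, n_arms, gamma=0.1, alpha=0.01, seed=None):
        self.k = n_arms
        self.gamma = gamma      # exploration rate
        self.alpha = alpha      # weight-sharing rate that lets the bandit track changes
        self.weights = [1.0] * n_arms
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the normalized weights with a uniform distribution.
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.k
                for w in self.weights]

    def select(self):
        # Sample an arm (here: a hypothetical curriculum index).
        return self.rng.choices(range(self.k), weights=self.probabilities())[0]

    def update(self, arm, reward):
        probs = self.probabilities()
        share = math.e * self.alpha / self.k * sum(self.weights)
        updated = []
        for i, w in enumerate(self.weights):
            # Importance-weighted reward estimate: nonzero only for the pulled arm.
            estimate = reward / probs[arm] if i == arm else 0.0
            updated.append(w * math.exp(self.gamma * estimate / self.k) + share)
        # Rescale to avoid overflow; the update and probabilities depend on ratios only.
        largest = max(updated)
        self.weights = [w / largest for w in updated]
```

In a training loop one would call `select()` to pick a curriculum, train for an episode, and pass a normalized progress signal to `update()`.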

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2502.0396

Country: Asia > China (0.28)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.34)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)
Energy > Oil & Gas (0.70)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Ramirez-Sanchez, Edgar, Tang, Catherine, Xu, Yaosheng, Renganathan, Nrithya, Jayawardana, Vindula, He, Zhengbing, Wu, Cathy

NeuralMOVES: A lightweight and microscopic vehicle emission estimation model based on reverse engineering and surrogate learning

The transportation sector's significant contribution to emissions makes it a critical target for climate change mitigation, as reducing emissions from transportation is essential for achieving global climate goals. The sector's transformation through electrification, automation, and intelligent infrastructure offers promising avenues for substantial emissions reductions (Sciarretta et al., 2020; International Energy Agency, 2023; McKinsey Center for Future Mobility, 2023). However, the success of these innovations depends critically on the availability of suitable and accurate emission estimation models to guide the design and deployment of new technologies. The Motor Vehicle Emission Simulator (MOVES) (U.S. Environmental Protection Agency, 2022), one of the most well-established emission estimation models, serves as the official and state-of-the-art emission estimation model in the U.S., provided, enforced, and maintained by the U.S. Environmental Protection Agency (EPA). Despite its technical certification, MOVES' processing and software are tailored for two specific governmental uses, State Implementation Plans and Conformity Analyses (U.S. Environmental Protection Agency, 2021), which help states achieve and maintain air quality standards. Its use beyond trained practitioners and these specific analyses faces two main limitations. First, a steep learning curve, computational demands, and complex inputs make it difficult for researchers and practitioners to use. In particular, MOVES has rigid input requirements, including a combination of toggle-based settings within its GUI and structured input files in specific formats. Second, MOVES is tailored for macroscopic analysis and is unsuitable for microscopic applications, such as control and optimization, which commonly require second-by-second emission calculations for individual actions and vehicles.

artificial intelligence, machine learning, optimization problem, (17 more...)

2502.04417

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (1.00)
Law > Environmental Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Hose, Henrik, Weisgerber, Jan, Trimpe, Sebastian

The Mini Wheelbot: A Testbed for Learning-based Balancing, Flips, and Articulated Driving

The Mini Wheelbot is a balancing, reaction wheel unicycle robot designed as a testbed for learning-based control. It is an unstable system with highly nonlinear yaw dynamics, non-holonomic driving, and discrete contact switches in a small, powerful, and rugged form factor. The Mini Wheelbot can use its wheels to stand up from any initial orientation - enabling automatic environment resets in repetitive experiments and even challenging half flips. We illustrate the effectiveness of the Mini Wheelbot as a testbed by implementing two popular learning-based control algorithms. First, we showcase Bayesian optimization for tuning the balancing controller. Second, we use imitation learning from an expert nonlinear MPC that uses gyroscopic effects to reorient the robot and can track higher-level velocity and orientation commands. The latter allows the robot to drive around based on user commands - for the first time in this class of robots. The Mini Wheelbot is not only compelling for testing learning-based control algorithms, but it is also just fun to work with, as demonstrated in the video of our experiments.

artificial intelligence, mini wheelbot, optimization problem, (15 more...)

2502.04582

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas (0.52)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Online Location Planning for AI-Defined Vehicles: Optimizing Joint Tasks of Order Serving and Spatio-Temporal Heterogeneous Model Fine-Tuning

Zheng, Bokeng, Rao, Bo, Zhu, Tianxiang, Tan, Chee Wei, Duan, Jingpu, Zhou, Zhi, Chen, Xu, Zhang, Xiaoxi

Advances in artificial intelligence (AI), including foundation models (FMs), are increasingly transforming human society, with smart city driving the evolution of urban living. Meanwhile, vehicle crowdsensing (VCS) has emerged as a key enabler, leveraging vehicles' mobility and sensor-equipped capabilities. In particular, ride-hailing vehicles can effectively facilitate flexible data collection and contribute towards urban intelligence, despite resource limitations. Therefore, this work explores a promising scenario, where edge-assisted vehicles perform joint tasks of order serving and the emerging foundation model fine-tuning using various urban data. However, integrating the VCS AI task with the conventional order serving task is challenging, due to their inconsistent spatio-temporal characteristics: (i) the distributions of ride orders and data points-of-interest (PoIs) may not coincide in geography, both following a priori unknown patterns; (ii) they have distinct forms of temporal effects, i.e., prolonged waiting makes orders become instantly invalid, while data with increased staleness gradually reduces its utility for model fine-tuning. To overcome these obstacles, we propose an online framework based on multi-agent reinforcement learning (MARL) with careful augmentation. A new quality-of-service (QoS) metric is designed to characterize and balance the utility of the two joint tasks, under the effects of varying data volumes and staleness.

[...] Each RSU, equipped with a server, stores a complete base model, enabling vehicles to perform real-time fine-tuning as they collect data and transfer the [...] Due to the large volume, data stored in the RSU server can be discarded in a certain period of time. In practice, these data can be descriptive features and feedback (labels) of recommendation or generative AR applications, generated by nearby visitors or residents. They can also be traffic/environment monitoring data with labels generated by [...] The government or any company that collaborates with the ride-hailing vehicle company has multiple types of VCS tasks to fulfill, each of which needs certain locations of data for fine-tuning UFMs.

[...] government agencies in better city management. Notably, ride-hailing vehicles are particularly advantageous for VCS tasks, due to their centralized ride-hailing platform management, which reduces the cost of deploying and executing crowdsensing tasks, and utilizes the data and computing resources from ride-hailing vehicles to maximize the VCS task utilities. [...] Foundation model (FM)-powered AI applications have revolutionized numerous aspects of human lives, including healthcare, education, industry, etc. FMs, e.g., BERT, GPT-4, ViT, serve [...]

[...] X. Zhang are with the School of Computer Science and [...] A previous version appears at IWQoS 2024 as a short paper.

data mining, machine learning, reinforcement learning, (24 more...)

2502.04399

Country:

Europe > Germany > Lower Saxony > Gottingen (0.14)
North America > United States > New York (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)
(6 more...)

Genre:

Personal > Honors (0.46)
Research Report > New Finding (0.46)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

Wanner, Marc, Jonasson, Johan, Carlsson, Emil, Dubhashi, Devdatt

Variational Quantum Optimization with Continuous Bandits

We introduce a novel approach to variational quantum algorithms (VQA) via continuous bandits. VQA are a class of hybrid quantum-classical algorithms in which the parameters of quantum circuits are optimized by classical algorithms. Previous work has used zeroth- and first-order gradient-based methods; however, such algorithms suffer from the barren plateau (BP) problem, where gradients and loss differences are exponentially small. We introduce an approach using bandit methods, which combine global exploration with local exploitation. We show how VQA can be formulated as a best arm identification problem in a continuous space of arms with Lipschitz smoothness. While regret minimization has been addressed in this setting, existing methods for pure exploration only cover discrete spaces. We give the first results for pure exploration in a continuous setting and derive a fixed-confidence, information-theoretic, instance-specific lower bound. Under certain assumptions on the expected payoff, we derive a simple algorithm that is near-optimal with respect to our lower bound. Finally, we apply our continuous bandit algorithm to two VQA schemes, a PQC and a QAOA quantum circuit, showing that we significantly outperform the previously known state-of-the-art methods (which were gradient-based).

artificial intelligence, data mining, machine learning, (19 more...)

2502.04021

Country:

Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.67)

Khosoussi, Kasra, Shames, Iman

Joint State and Noise Covariance Estimation

This paper tackles the problem of jointly estimating the noise covariance matrix alongside primary parameters (such as poses and points) from measurements corrupted by Gaussian noise. In such settings, the noise covariance matrix determines the weights assigned to individual measurements in the least squares problem. We show that the joint problem exhibits a convex structure and provide a full characterization of the optimal noise covariance estimate (with analytical solutions) within joint maximum a posteriori and likelihood frameworks and several variants. Leveraging this theoretical result, we propose two novel algorithms that jointly estimate the primary parameters and the noise covariance matrix. To validate our approach, we conduct extensive experiments across diverse scenarios and offer practical insights into their application in robotics and computer vision estimation problems with a particular focus on SLAM.
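
To give a feel for why an analytical covariance estimate is plausible, the block below writes out the textbook maximum-likelihood case in which m measurements share one Gaussian noise covariance. The symbols z_i, h_i, r_i, and m are notation assumed here rather than taken from the paper, and the paper's maximum a posteriori formulations and variants are not reproduced.

```latex
\begin{align}
  r_i(x) &= z_i - h_i(x), \qquad i = 1, \dots, m, \\
  \mathcal{L}(x, \Sigma) &= \frac{m}{2} \log\det\Sigma
      + \frac{1}{2} \sum_{i=1}^{m} r_i(x)^\top \Sigma^{-1} r_i(x), \\
  \hat{\Sigma}(x) &= \arg\min_{\Sigma \succ 0} \mathcal{L}(x, \Sigma)
      = \frac{1}{m} \sum_{i=1}^{m} r_i(x)\, r_i(x)^\top .
\end{align}
```

For fixed x, the negative log-likelihood is convex in the information matrix \Sigma^{-1}, and the closed form holds whenever the sample matrix of residuals is positive definite. Alternating between this update and a weighted least squares solve for x is one simple scheme consistent with the abstract, though not necessarily either of the paper's two algorithms.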

artificial intelligence, bayesian inference, machine learning, (16 more...)

2502.04584

Country:

Oceania > Australia > Queensland (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Robots (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Fourati, Fares, Kharrat, Salma, Aggarwal, Vaneet, Alouini, Mohamed-Slim

Every Call is Precious: Global Optimization of Black-Box Functions with Unknown Lipschitz Constants

arXiv.org Machine Learning, Feb-6-2025

Optimizing expensive, non-convex, black-box Lipschitz continuous functions presents significant challenges, particularly when the Lipschitz constant of the underlying function is unknown. Such problems often demand numerous function evaluations to approximate the global optimum, which can be prohibitive in terms of time, energy, or resources. In this work, we introduce Every Call is Precious (ECP), a novel global optimization algorithm that minimizes unpromising evaluations by strategically focusing on potentially optimal regions. Unlike previous approaches, ECP eliminates the need to estimate the Lipschitz constant, thereby avoiding additional function evaluations. ECP guarantees no-regret performance for infinite evaluation budgets and achieves minimax-optimal regret bounds within finite budgets. Extensive ablation studies validate the algorithm's robustness, while empirical evaluations show that ECP outperforms 10 benchmark algorithms including Lipschitz, Bayesian, bandits, and evolutionary methods across 30 multi-dimensional non-convex synthetic and real-world optimization problems, which positions ECP as a competitive approach for global optimization.
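
For background on how Lipschitz-based methods avoid unpromising evaluations, the sketch below shows the classic acceptance test used by LIPO-style optimizers for maximization: a candidate is evaluated only if a k-Lipschitz function consistent with the data could still reach the current best value there. This is not ECP's rule. Per the abstract, ECP avoids estimating the Lipschitz constant, whereas the constant `k` here is simply assumed, and the function name `could_beat_best` is hypothetical.

```python
import math


def could_beat_best(x, points, values, k):
    """Return True if a k-Lipschitz function matching the observed data
    could reach at least the best observed value at candidate x."""
    # Tightest upper bound on f(x) implied by every past evaluation.
    upper_bound = min(v + k * math.dist(x, p) for p, v in zip(points, values))
    return upper_bound >= max(values)
```

With evaluations f(0) = 0 and f(1) = 1 and k = 2, the candidate 0.1 is rejected (its upper bound is 0.2), while 0.9 is accepted (its upper bound is 1.2).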

artificial intelligence, ecp, optimization problem, (14 more...)

2502.0429

Country:

North America > United States > Wisconsin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report (0.81)

Industry: Transportation > Air (0.61)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Farzaneh, Amirmohammad, Simeone, Osvaldo

Ensuring Reliability via Hyperparameter Selection: Review and Advances

Hyperparameter selection is a critical step in the deployment of artificial intelligence (AI) models, particularly in the current era of foundational, pre-trained models. By framing hyperparameter selection as a multiple hypothesis testing problem, recent research has shown that it is possible to provide statistical guarantees on population risk measures attained by the selected hyperparameter. This paper reviews the Learn-Then-Test (LTT) framework, which formalizes this approach, and explores several extensions tailored to engineering-relevant scenarios. These extensions encompass different risk measures and statistical guarantees, multi-objective optimization, the incorporation of prior knowledge and dependency structures into the hyperparameter selection process, as well as adaptivity. The paper also includes illustrative applications for communication systems.
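
The multiple-hypothesis-testing view can be sketched in a few lines. The example below assumes losses bounded in [0, 1], uses a Hoeffding p-value for the null hypothesis "true risk exceeds alpha", and applies a Bonferroni correction over a finite candidate set. The function names, the candidate grid, and the choice of Hoeffding plus Bonferroni are illustrative assumptions; the extensions reviewed in the paper are not covered.

```python
import math


def hoeffding_p_value(empirical_risk, n, alpha):
    """p-value for H0: true risk > alpha, given n i.i.d. losses in [0, 1]."""
    gap = max(alpha - empirical_risk, 0.0)
    return math.exp(-2.0 * n * gap * gap)


def select_reliable(candidates, n, alpha, delta):
    """Return hyperparameters whose null is rejected at family-wise level delta.

    candidates maps each hyperparameter to its empirical risk on n
    held-out calibration samples.
    """
    threshold = delta / len(candidates)  # Bonferroni correction
    return [lam for lam, risk in candidates.items()
            if hoeffding_p_value(risk, n, alpha) <= threshold]
```

For example, with `candidates = {0.1: 0.30, 0.5: 0.12, 0.9: 0.05}`, `n=1000`, `alpha=0.2`, and `delta=0.1`, the call returns `[0.5, 0.9]`. Under the stated assumptions, every returned hyperparameter has true risk at most alpha with probability at least 1 - delta.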

artificial intelligence, machine learning, optimization problem, (15 more...)

2502.04206

Country:

Asia > Middle East > Jordan (0.05)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre:

Research Report (1.00)
Overview (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.35)

Efficient variable-length hanging tether parameterization for marsupial robot planning in 3D environments

Martínez-Rozas, S., Alejo, D., Caballero, F., Merino, L., Pérez-Cutiño, M. A., Rodriguez, F., Sánchez-Canales, V., Ventura, I., Díaz-Bañez, J. M.

This paper presents a novel approach to efficiently parameterize and estimate the state of a hanging tether for path and trajectory planning of a UGV tied to a UAV in a marsupial configuration. Most implementations in the state of the art assume a taut tether or make use of the catenary curve to model the shape of the hanging tether. The catenary model is complex to compute and must be instantiated thousands of times during the planning process, becoming a time-consuming task, while the taut tether assumption simplifies the problem, but might overly restrict the movement of the platforms. In order to accelerate the planning process, this paper proposes defining an analytical model to efficiently compute the hanging tether state, and a method to get a tether state parameterization free of collisions. We exploit the existing similarity between the catenary and parabola curves to derive analytical expressions of the tether state.
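
The similarity between the two curves can be checked numerically. The sketch below compares the sag of a catenary, a(cosh(x/a) - 1), with its second-order Taylor approximation x^2/(2a), measured from the lowest point of the curve. The function names and the parameter `a` are assumptions for illustration only; the paper's actual tether parameterization is not given in the abstract and is not reproduced here.

```python
import math


def catenary_sag(x, a):
    """Height of a catenary above its lowest point at horizontal offset x."""
    return a * (math.cosh(x / a) - 1.0)


def parabola_sag(x, a):
    """Second-order Taylor approximation of catenary_sag around x = 0."""
    return x * x / (2.0 * a)


def approximation_error(x, a):
    """Gap between the two curves; the leading term is x**4 / (24 * a**3)."""
    return catenary_sag(x, a) - parabola_sag(x, a)
```

The gap grows with the fourth power of x/a, so the parabola tracks the catenary closely when the horizontal span is small relative to a and drifts as the span grows.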

artificial intelligence, optimization problem, planning & scheduling, (20 more...)

2502.04467

Country:

South America > Chile > Antofagasta Region > Antofagasta Province > Antofagasta (0.04)
Asia > Singapore (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.68)