AITopics

Neural Information Processing SystemsJan-15-2025, 05:28:29 GMT

Trust Region Policy Optimization with Optimal Transport Discrepancies: Duality and Algorithm for Continuous Actions

Policy Optimization (PO) algorithms have been proven particularly suited to handle the high-dimensionality of real-world continuous control tasks. In this context, Trust Region Policy Optimization methods represent a popular approach to stabilize the policy updates. These usually rely on the Kullback-Leibler (KL) divergence to limit the change in the policy. The Wasserstein distance represents a natural alternative, in place of the KL divergence, to define trust regions or to regularize the objective function. However, state-of-the-art works either resort to its approximations or do not provide an algorithm for continuous state-action spaces, reducing the applicability of the method.In this paper, we explore optimal transport discrepancies (which include the Wasserstein distance) to define trust regions, and we propose a novel algorithm - Optimal Transport Trust Region Policy Optimization (OT-TRPO) - for continuous state-action spaces.

duality and algorithm, optimal transport discrepancy, trust region policy optimization, (5 more...)

Neural Information Processing Systems

Genre: Research Report (0.39)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.41)

ThinTact:Thin Vision-Based Tactile Sensor by Lensless Imaging

Xu, Jing, Chen, Weihang, Qian, Hongyu, Wu, Dan, Chen, Rui

Vision-based tactile sensors have drawn increasing interest in the robotics community. However, traditional lens-based designs impose minimum thickness constraints on these sensors, limiting their applicability in space-restricted settings. In this paper, we propose ThinTact, a novel lensless vision-based tactile sensor with a sensing field of over 200 mm2 and a thickness of less than 10 mm.ThinTact utilizes the mask-based lensless imaging technique to map the contact information to CMOS signals. To ensure real-time tactile sensing, we propose a real-time lensless reconstruction algorithm that leverages a frequency-spatial-domain joint filter based on discrete cosine transform (DCT). This algorithm achieves computation significantly faster than existing optimization-based methods. Additionally, to improve the sensing quality, we develop a mask optimization method based on the generic algorithm and the corresponding system matrix calibration algorithm.We evaluate the performance of our proposed lensless reconstruction and tactile sensing through qualitative and quantitative experiments. Furthermore, we demonstrate ThinTact's practical applicability in diverse applications, including texture recognition and contact-rich object manipulation. The paper will appear in the IEEE Transactions on Robotics: https://ieeexplore.ieee.org/document/10842357. Video: https://youtu.be/YrOO9BDMAHo

artificial intelligence, machine learning, sensor, (18 more...)

doi: 10.1109/TRO.2025.3530319

2501.09273

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Wang, Jingyi, Wang, Haowei, Petra, Cosmin G., Chiang, Nai-Yuan

On the convergence of noisy Bayesian Optimization with Expected Improvement

arXiv.org Machine LearningJan-15-2025

Expected improvement (EI) is one of the most widely-used acquisition functions in Bayesian optimization (BO). Despite its proven success in applications for decades, important open questions remain on the theoretical convergence behaviors and rates for EI. In this paper, we contribute to the convergence theories of EI in three novel and critical area. First, we consider objective functions that are under the Gaussian process (GP) prior assumption, whereas existing works mostly focus on functions in the reproducing kernel Hilbert space (RKHS). Second, we establish the first asymptotic error bound and its corresponding rate for GP-EI with noisy observations under the GP prior assumption. Third, by investigating the exploration and exploitation of the non-convex EI function, we prove improved error bounds for both the noise-free and noisy cases. The improved noiseless bound is extended to the RKHS assumption as well.

artificial intelligence, machine learning, modeling & simulation, (19 more...)

arXiv.org Machine Learning

2501.09262

Country:

North America > United States (0.46)
North America > Canada > Alberta (0.28)
Asia > Singapore (0.14)

Genre: Research Report (0.63)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Modeling & Simulation (0.67)

Valenzuela, Andrés Meseguer, Noguera, Francisco Blanes

Task Allocation in Mobile Robot Fleets: A review

Mobile robot fleets are currently used in different scenarios such as medical environments or logistics. The management of these systems provides different challenges that vary from the control of the movement of each robot to the allocation of tasks to be performed. Task Allocation (TA) problem is a key topic for the proper management of mobile robot fleets to ensure the minimization of energy consumption and quantity of necessary robots. Solutions on this aspect are essential to reach economic and environmental sustainability of robot fleets, mainly in industry applications such as warehouse logistics. The minimization of energy consumption introduces TA problem as an optimization issue which has been treated in recent studies. This work focuses on the analysis of current trends in solving TA of mobile robot fleets. Main TA optimization algorithms are presented, including novel methods based on Artificial Intelligence (AI). Additionally, this work showcases most important results extracted from simulations, including frameworks utilized for the development of the simulations. Finally, some conclusions are obtained from the analysis to target on gaps that must be treated in the future.

algorithm, application, robot, (14 more...)

2501.08726

Country: Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Transportation > Freight & Logistics Services (0.47)
Transportation > Ground > Road (0.46)
Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

GS-LIVO: Real-Time LiDAR, Inertial, and Visual Multi-sensor Fused Odometry with Gaussian Mapping

Hong, Sheng, Zheng, Chunran, Shen, Yishu, Li, Changze, Zhang, Fu, Qin, Tong, Shen, Shaojie

In recent years, 3D Gaussian splatting (3D-GS) has emerged as a novel scene representation approach. However, existing vision-only 3D-GS methods often rely on hand-crafted heuristics for point-cloud densification and face challenges in handling occlusions and high GPU memory and computation consumption. LiDAR-Inertial-Visual (LIV) sensor configuration has demonstrated superior performance in localization and dense mapping by leveraging complementary sensing characteristics: rich texture information from cameras, precise geometric measurements from LiDAR, and high-frequency motion data from IMU. Inspired by this, we propose a novel real-time Gaussian-based simultaneous localization and mapping (SLAM) system. Our map system comprises a global Gaussian map and a sliding window of Gaussians, along with an IESKF-based odometry. The global Gaussian map consists of hash-indexed voxels organized in a recursive octree, effectively covering sparse spatial volumes while adapting to different levels of detail and scales. The Gaussian map is initialized through multi-sensor fusion and optimized with photometric gradients. Our system incrementally maintains a sliding window of Gaussians, significantly reducing GPU computation and memory consumption by only optimizing the map within the sliding window. Moreover, we implement a tightly coupled multi-sensor fusion odometry with an iterative error state Kalman filter (IESKF), leveraging real-time updating and rendering of the Gaussian map. Our system represents the first real-time Gaussian-based SLAM framework deployable on resource-constrained embedded systems, demonstrated on the NVIDIA Jetson Orin NX platform. The framework achieves real-time performance while maintaining robust multi-sensor fusion capabilities. All implementation algorithms, hardware designs, and CAD models will be publicly available.

gaussian, optimization, representation, (16 more...)

2501.08672

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.48)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Diffusion Models as Network Optimizers: Explorations and Analysis

Liang, Ruihuai, Yang, Bo, Chen, Pengyu, Li, Xianjin, Xue, Yifan, Yu, Zhiwen, Cao, Xuelin, Zhang, Yan, Debbah, Mérouane, Poor, H. Vincent, Yuen, Chau

Network optimization is a fundamental challenge in the Internet of Things (IoT) network, often characterized by complex features that make it difficult to solve these problems. Recently, generative diffusion models (GDMs) have emerged as a promising new approach to network optimization, with the potential to directly address these optimization problems. However, the application of GDMs in this field is still in its early stages, and there is a noticeable lack of theoretical research and empirical findings. In this study, we first explore the intrinsic characteristics of generative models. Next, we provide a concise theoretical proof and intuitive demonstration of the advantages of generative models over discriminative models in network optimization. Based on this exploration, we implement GDMs as optimizers aimed at learning high-quality solution distributions for given inputs, sampling from these distributions during inference to approximate or achieve optimal solutions. Specifically, we utilize denoising diffusion probabilistic models (DDPMs) and employ a classifier-free guidance mechanism to manage conditional guidance based on input parameters. We conduct extensive experiments across three challenging network optimization problems. By investigating various model configurations and the principles of GDMs as optimizers, we demonstrate the ability to overcome prediction errors and validate the convergence of generated solutions to optimal solutions. We provide code and data at https://github.com/qiyu3816/DiffSG.

optimal solution, optimization, optimization problem, (14 more...)

doi: 10.1109/JIOT.2025.3528955

2411.00453

Country:

Asia > Singapore (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)
(4 more...)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness

Gong, Xiaochuan, Hao, Jie, Liu, Mingrui

This paper investigates a class of stochastic bilevel optimization problems where the upper-level function is nonconvex with potentially unbounded smoothness and the lower-level problem is strongly convex. These problems have significant applications in sequential data learning, such as text classification using recurrent neural networks. The unbounded smoothness is characterized by the smoothness constant of the upper-level function scaling linearly with the gradient norm, lacking a uniform upper bound. Existing state-of-the-art algorithms require $\widetilde{O}(1/\epsilon^4)$ oracle calls of stochastic gradient or Hessian/Jacobian-vector product to find an $\epsilon$-stationary point. However, it remains unclear if we can further improve the convergence rate when the assumptions for the function in the population level also hold for each random realization almost surely. To address this issue, we propose a new Accelerated Bilevel Optimization algorithm named AccBO. The algorithm updates the upper-level variable by normalized stochastic gradient descent with recursive momentum and the lower-level variable by the stochastic Nesterov accelerated gradient descent algorithm with averaging. We prove that our algorithm achieves an oracle complexity of $\widetilde{O}(1/\epsilon^3)$ to find an $\epsilon$-stationary point, when the lower-level stochastic gradient's variance is $O(\epsilon)$. Our proof relies on a novel lemma characterizing the dynamics of stochastic Nesterov accelerated gradient descent algorithm under distribution drift with high probability for the lower-level variable, which is of independent interest and also plays a crucial role in analyzing the hypergradient estimation error over time. Experimental results on various tasks confirm that our proposed algorithm achieves the predicted theoretical acceleration and significantly outperforms baselines in bilevel optimization.

algorithm, lower-level variable, optimization, (14 more...)

2409.19212

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Large Language Model is Secretly a Protein Sequence Optimizer

Wang, Yinkai, He, Jiaxing, Du, Yuanqi, Chen, Xiaohui, Li, Jianan Canal, Liu, Li-Ping, Xu, Xiaolin, Hassoun, Soha

We consider the protein sequence engineering problem, which aims to find protein sequences with high fitness levels, starting from a given wild-type sequence. Directed evolution has been a dominating paradigm in this field which has an iterative process to generate variants and select via experimental feedback. We demonstrate large language models (LLMs), despite being trained on massive texts, are secretly protein sequence optimizers. With a directed evolutionary method, LLM can perform protein engineering through Pareto and experiment-budget constrained optimization, demonstrating success on both synthetic and experimental fitness landscapes. Protein engineering aims to develop novel protein sequences exhibiting improved or new-to-nature functions (Romero & Arnold, 2009).

artificial intelligence, large language model, natural language, (15 more...)

2501.09274

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Foundations of Large Language Models

Xiao, Tong, Zhu, Jingbo

The development of neural sequence models, such as Transformers [Vaswani et al., 2017], along with the improvements in large-scale self-supervised learning, has opened the door to universal language understanding and generation. This achievement is largely motivated by pre-training: we separate common components from many neural network-based systems, and then train them on huge amounts of unlabeled data using self-supervision. These pre-trained models serve as foundation models that can be easily adapted to different tasks via fine-tuning or prompting. As a result, the paradigm of NLP has been enormously changed. In many cases, large-scale supervised learning for specific tasks is no longer required, and instead, we only need to adapt pre-trained foundation models.

large language model, machine learning, reinforcement learning, (30 more...)

2501.09223

Country:

Europe (1.00)
North America > United States (0.92)

Genre:

Workflow (1.00)
Research Report > Promising Solution (1.00)
Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Leisure & Entertainment > Sports (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(10 more...)