Goto

Collaborating Authors

 Optimization


Sampling-Based Global Optimal Control and Estimation via Semidefinite Programming

arXiv.org Artificial Intelligence

Global optimization has gained attraction over the past decades, thanks to the development of both theoretical foundations and efficient numerical routines. Among recent advances, Kernel Sum of Squares (KernelSOS) provides a powerful theoretical framework, combining the expressivity of kernel methods with the guarantees of SOS optimization. In this paper, we take KernelSOS from theory to practice and demonstrate its use on challenging control and robotics problems. We identify and address the practical considerations required to make the method work in applied settings: restarting strategies, systematic calibration of hyperparameters, methods for recovering minimizers, and the combination with fast local solvers. As a proof of concept, the application of KernelSOS to robot localization highlights its competitiveness with existing SOS approaches that rely on heuristics and handcrafted reformulations to render the problem polynomial. Even in the high-dimensional, non-parametric setting of trajectory optimization with simulators treated as black boxes, we demonstrate how KernelSOS can be combined with fast local solvers to uncover higher-quality solutions without compromising overall runtimes.


LoRA meets Riemannion: Muon Optimizer for Parametrization-independent Low-Rank Adapters

arXiv.org Artificial Intelligence

This work presents a novel, fully Riemannian framework for Low-Rank Adaptation (LoRA) that geometrically treats low-rank adapters by optimizing them directly on the fixed-rank manifold. This formulation eliminates the parametrization ambiguity present in standard Euclidean optimizers. Our framework integrates three key components to achieve this: (1) we derive Riemannion, a new Riemannian optimizer on the fixed-rank matrix manifold that generalizes the recently proposed Muon optimizer; (2) we develop a Riemannian gradient-informed LoRA initialization, and (3) we provide an efficient implementation without prominent overhead that uses automatic differentiation to compute arising geometric operations while adhering to best practices in numerical linear algebra. Comprehensive experimental results on both LLM and diffusion model architectures demonstrate that our approach yields consistent and noticeable improvements in convergence speed and final task performance over both standard LoRA and its state-of-the-art modifications.


Initial Distribution Sensitivity of Constrained Markov Decision Processes

arXiv.org Artificial Intelligence

Constrained Markov Decision Processes (CMDPs) are notably more complex to solve than standard MDPs due to the absence of universally optimal policies across all initial state distributions. This necessitates re-solving the CMDP whenever the initial distribution changes. In this work, we analyze how the optimal value of CMDPs varies with different initial distributions, deriving bounds on these variations using duality analysis of CMDPs and perturbation analysis in linear programming. Moreover, we show how such bounds can be used to analyze the regret of a given policy due to unknown variations of the initial distribution.






Supplementary Material to Improving Inference for Neural Compression S1 Stochastic Annealing

Neural Information Processing Systems

Here we provide conceptual illustrations of our stochastic annealing idea on a simple example. As mentioned in Section 3.3 and 4, lossy bits-back modifies the above Base Hyperprior as follows: All methods were tuned on a best-effort basis to ensure convergence, except that STE consistently encountered convergence issues even with a tiny learning rate (see [Yin et al., 2019]). The rate-distortion results for MAP and STE were calculated with early stopping (i.e., using the intermediate Figures in the bottom row focus on the same cropped region of images in the top row. RGB; higher values are better.


Improving Inference for Neural Image Compression

Neural Information Processing Systems

Habibian et al., 2019, Y ang et al., 2020a], which can reduce a sizable amount of global internet traffic. State-of-the-art neural methods for lossy image compression [Ballé et al., 2018, Minnen et al., 2018, Lee et al., 2019] learn a mapping between images and latent variables with a variational


Submodular Maximization Through Barrier Functions

Neural Information Processing Systems

In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular maximization but also provides the state of the art guarantee. More precisely, for maximizing a monotone submodular function subject to the combination of a k -matchoid and null -knapsack constraints (for null k), we propose a potential function that can be approximately minimized. Once we minimize the potential function up to an ε error, it is guaranteed that we have found a feasible set with a 2(k +1+ ε)-approximation factor which can indeed be further improved to ( k +1+ ε) by an enumeration technique. We extensively evaluate the performance of our proposed algorithm over several real-world applications, including a movie recommendation system, summarization tasks for Y ouTube videos, Twitter feeds and Y elp business locations, and a set cover problem.