Optimization
Optimization Guarantees for Square-Root Natural-Gradient Variational Inference
Kumar, Navish, Möllenhoff, Thomas, Khan, Mohammad Emtiyaz, Lucchi, Aurelien
Variational inference with natural-gradient descent often shows fast convergence in practice, but its theoretical convergence guarantees have been challenging to establish. This is true even for the simplest cases that involve concave log-likelihoods and use a Gaussian approximation. We show that the challenge can be circumvented for such cases using a square-root parameterization for the Gaussian covariance. This approach establishes novel convergence guarantees for natural-gradient variational-Gaussian inference and its continuous-time gradient flow. Our experiments demonstrate the effectiveness of natural gradient methods and highlight their advantages over algorithms that use Euclidean or Wasserstein geometries.
Synergistic Localization and Sensing in MIMO-OFDM Systems via Mixed-Integer Bilevel Learning
Zhu, Zelin, Yang, Kai, Zhang, Rui
Wireless localization and sensing technologies are essential in modern wireless networks, supporting applications in smart cities, the Internet of Things (IoT), and autonomous systems. High-performance localization and sensing systems are critical for both network efficiency and emerging intelligent applications. Integrating channel state information (CSI) with deep learning has recently emerged as a promising solution. Recent works have leveraged the spatial diversity of multiple input multiple output (MIMO) systems and the frequency granularity of orthogonal frequency division multiplexing (OFDM) waveforms to improve spatial resolution. Nevertheless, the joint modeling of localization and sensing under the high-dimensional CSI characteristics of MIMO-OFDM systems remains insufficiently investigated. This work aims to jointly model and optimize localization and sensing tasks to harness their potential synergy. We first formulate localization and sensing as a mixed-integer bilevel deep learning problem and then propose a novel stochastic proximal gradient-based mixed-integer bilevel optimization (SPG-MIBO) algorithm. SPG-MIBO is well-suited for high-dimensional and large-scale datasets, leveraging mini-batch training at each step for computational and memory efficiency. The algorithm is also supported by theoretical convergence guarantees. Extensive experiments on multiple datasets validate its effectiveness and highlight the performance gains from joint localization and sensing optimization.
Space-Filling Regularization for Robust and Interpretable Nonlinear State Space Models
Klein, Hermann, Herkersdorf, Max Heinz, Nelles, Oliver
The state space dynamics representation is the most general approach for nonlinear systems and often chosen for system identification. During training, the state trajectory can deform significantly leading to poor data coverage of the state space. This can cause significant issues for space-oriented training algorithms which e.g. rely on grid structures, tree partitioning, or similar. Besides hindering training, significant state trajectory deformations also deteriorate interpretability and robustness properties. This paper proposes a new type of space-filling regularization that ensures a favorable data distribution in state space via introducing a data-distribution-based penalty. This method is demonstrated in local model network architectures where good interpretability is a major concern. The proposed approach integrates ideas from modeling and design of experiments for state space structures. This is why we present two regularization techniques for the data point distributions of the state trajectories for local affine state space models. Beyond that, we demonstrate the results on a widely known system identification benchmark.
Sampling Imbalanced Data with Multi-objective Bilevel Optimization
Medlin, Karen, Leyffer, Sven, Raghavan, Krishnan
Two-class classification problems are often characterized by an imbalance between the number of majority and minority datapoints resulting in poor classification of the minority class in particular. Traditional approaches, such as reweighting the loss function or naïve resampling, risk overfitting and subsequently fail to improve classification because they do not consider the diversity between majority and minority datasets. Such consideration is infeasible because there is no metric that can measure the impact of imbalance on the model. To obviate these challenges, we make two key contributions. First, we introduce MOODS~(Multi-Objective Optimization for Data Sampling), a novel multi-objective bilevel optimization framework that guides both synthetic oversampling and majority undersampling. Second, we introduce a validation metric -- `$ε/ δ$ non-overlapping diversification metric' -- that quantifies the goodness of a sampling method towards model performance. With this metric we experimentally demonstrate state-of-the-art performance with improvement in diversity driving a $1-15 \%$ increase in $F1$ scores.
Not All Preferences are What You Need for Post-Training: Selective Alignment Strategy for Preference Optimization
Post-training alignment of large language models (LLMs) is a critical challenge, as not all tokens contribute equally to model performance. This paper introduces a selective alignment strategy that prioritizes high-impact tokens within preference pairs, leveraging token-level log-probability differences between the current policy and a reference model. By focusing on these informative tokens, our approach reduces computational overhead and enhances alignment fidelity. We further explore the role of reference model quality, demonstrating that stronger reference models significantly improve token selection accuracy and overall optimization effectiveness. Comprehensive experiments on benchmarks such as Arena-Hard and MT-Bench validate the superiority of our Selective-DPO method over standard DPO and distillation-based baselines. Our findings highlight the importance of token-level optimization and reference model selection in advancing preference alignment for LLMs. The code is available at https://github.com/Dongzhijin/SDPO.
Optimal Auction Design in the Joint Advertising
Online advertising is a vital revenue source for major internet platforms. Recently, joint advertising, which assigns a bundle of two advertisers in an ad slot instead of allocating a single advertiser, has emerged as an effective method for enhancing allocation efficiency and revenue. However, existing mechanisms for joint advertising fail to realize the optimality, as they tend to focus on individual advertisers and overlook bundle structures. This paper identifies an optimal mechanism for joint advertising in a single-slot setting. For multi-slot joint advertising, we propose \textbf{BundleNet}, a novel bundle-based neural network approach specifically designed for joint advertising. Our extensive experiments demonstrate that the mechanisms generated by \textbf{BundleNet} approximate the theoretical analysis results in the single-slot setting and achieve state-of-the-art performance in the multi-slot setting. This significantly increases platform revenue while ensuring approximate dominant strategy incentive compatibility and individual rationality.
Optimizing Model Splitting and Device Task Assignment for Deceptive Signal Assisted Private Multi-hop Split Learning
Wei, Dongyu, Xu, Xiaoren, Liu, Yuchen, Poor, H. Vincent, Chen, Mingzhe
In this paper, deceptive signal-assisted private split learning is investigated. In our model, several edge devices jointly perform collaborative training, and some eavesdroppers aim to collect the model and data information from devices. To prevent the eavesdroppers from collecting model and data information, a subset of devices can transmit deceptive signals. Therefore, it is necessary to determine the subset of devices used for deceptive signal transmission, the subset of model training devices, and the models assigned to each model training device. This problem is formulated as an optimization problem whose goal is to minimize the information leaked to eavesdroppers while meeting the model training energy consumption and delay constraints. To solve this problem, we propose a soft actor-critic deep reinforcement learning framework with intrinsic curiosity module and cross-attention (ICM-CA) that enables a centralized agent to determine the model training devices, the deceptive signal transmission devices, the transmit power, and sub-models assigned to each model training device without knowing the position and monitoring probability of eavesdroppers. The proposed method uses an ICM module to encourage the server to explore novel actions and states and a CA module to determine the importance of each historical state-action pair thus improving training efficiency. Simulation results demonstrate that the proposed method improves the convergence rate by up to 3x and reduces the information leaked to eavesdroppers by up to 13% compared to the traditional SAC algorithm.
Optimizing Communication and Device Clustering for Clustered Federated Learning with Differential Privacy
Wei, Dongyu, Xu, Xiaoren, Mao, Shiwen, Chen, Mingzhe
In this paper, a secure and communication-efficient clustered federated learning (CFL) design is proposed. In our model, several base stations (BSs) with heterogeneous task-handling capabilities and multiple users with non-independent and identically distributed (non-IID) data jointly perform CFL training incorporating differential privacy (DP) techniques. Since each BS can process only a subset of the learning tasks and has limited wireless resource blocks (RBs) to allocate to users for federated learning (FL) model parameter transmission, it is necessary to jointly optimize RB allocation and user scheduling for CFL performance optimization. Meanwhile, our considered CFL method requires devices to use their limited data and FL model information to determine their task identities, which may introduce additional communication overhead. We formulate an optimization problem whose goal is to minimize the training loss of all learning tasks while considering device clustering, RB allocation, DP noise, and FL model transmission delay. To solve the problem, we propose a novel dynamic penalty function assisted value decomposed multi-agent reinforcement learning (DPVD-MARL) algorithm that enables distributed BSs to independently determine their connected users, RBs, and DP noise of the connected users but jointly minimize the training loss of all learning tasks across all BSs. Different from the existing MARL methods that assign a large penalty for invalid actions, we propose a novel penalty assignment scheme that assigns penalty depending on the number of devices that cannot meet communication constraints (e.g., delay), which can guide the MARL scheme to quickly find valid actions, thus improving the convergence speed. Simulation results show that the DPVD-MARL can improve the convergence rate by up to 20% and the ultimate accumulated rewards by 15% compared to independent Q-learning.
g2o vs. Ceres: Optimizing Scan Matching in Cartographer SLAM
This article presents a comparative analysis of g2o and Ceres solvers in enhancing scan matching performance within the Cartographer framework. Cartographer, a widely-used library for Simultaneous Localization and Mapping (SLAM), relies on optimization algorithms to refine pose estimates and improve map accuracy. The research aims to evaluate the performance, efficiency, and accuracy of the g2o solver in comparison to the Ceres solver, which is the default in Cartographer. In our experiments comparing Ceres and g2o within Cartographer, Ceres outperformed g2o in terms of speed, convergence efficiency, and overall map clarity. Ceres required fewer iterations and less time to converge, producing more accurate and well-defined maps, especially in real-world mapping scenarios with the AgileX LIMO robot. However, g2o excelled in localized obstacle detection, highlighting its value in specific situations.
Non-Asymptotic Analysis of Online Local Private Learning with SGD
Shi, Enze, Xie, Jinhan, Jiang, Bei, Kong, Linglong, He, Xuming
Differentially Private Stochastic Gradient Descent (DP-SGD) has been widely used for solving optimization problems with privacy guarantees in machine learning and statistics. Despite this, a systematic non-asymptotic convergence analysis for DP-SGD, particularly in the context of online problems and local differential privacy (LDP) models, remains largely elusive. Existing non-asymptotic analyses have focused on non-private optimization methods, and hence are not applicable to privacy-preserving optimization problems. This work initiates the analysis to bridge this gap and opens the door to non-asymptotic convergence analysis of private optimization problems. A general framework is investigated for the online LDP model in stochastic optimization problems. We assume that sensitive information from individuals is collected sequentially and aim to estimate, in real-time, a static parameter that pertains to the population of interest. Most importantly, we conduct a comprehensive non-asymptotic convergence analysis of the proposed estimators in finite-sample situations, which gives their users practical guidelines regarding the effect of various hyperparameters, such as step size, parameter dimensions, and privacy budgets, on convergence rates. Our proposed estimators are validated in the theoretical and practical realms by rigorous mathematical derivations and carefully constructed numerical experiments.