Optimization
HAMMER: Heterogeneous, Multi-Robot Semantic Gaussian Splatting
Yu, Javier, Chen, Timothy, Schwager, Mac
3D Gaussian Splatting offers expressive scene reconstruction, modeling a broad range of visual, geometric, and semantic information. However, efficient real-time map reconstruction with data streamed from multiple robots and devices remains a challenge. To that end, we propose HAMMER, a server-based collaborative Gaussian Splatting method that leverages widely available ROS communication infrastructure to generate 3D, metric-semantic maps from asynchronous robot data-streams with no prior knowledge of initial robot positions and varying on-device pose estimators. HAMMER consists of (i) a frame alignment module that transforms local SLAM poses and image data into a global frame and requires no prior relative pose knowledge, and (ii) an online module for training semantic 3DGS maps from streaming data. HAMMER handles mixed perception modes, adjusts automatically for variations in image pre-processing among different devices, and distills CLIP semantic codes into the 3D scene for open-vocabulary language queries. In our real-world experiments, HAMMER creates higher-fidelity maps (2x) compared to competing baselines and is useful for downstream tasks, such as semantic goal-conditioned navigation (e.g., ``go to the couch"). Accompanying content available at hammer-project.github.io.
FedPref: Federated Learning Across Heterogeneous Multi-objective Preferences
Hartmann, Maria, Danoy, Grรฉgoire, Bouvry, Pascal
Federated Learning (FL) is a distributed machine learning strategy, developed for settings where training data is owned by distributed devices and cannot be shared. FL circumvents this constraint by carrying out model training in distribution. The parameters of these local models are shared intermittently among participants and aggregated to enhance model accuracy. This strategy has been rapidly adopted by the industry in efforts to overcome privacy and resource constraints in model training. However, the application of FL to real-world settings brings additional challenges associated with heterogeneity between participants. Research into mitigating these difficulties in FL has largely focused on only two types of heterogeneity: the unbalanced distribution of training data, and differences in client resources. Yet more types of heterogeneity are becoming relevant as the capability of FL expands to cover more complex problems, from the tuning of LLMs to enabling machine learning on edge devices. In this work, we discuss a novel type of heterogeneity that is likely to become increasingly relevant in future applications: this is preference heterogeneity, emerging when clients learn under multiple objectives, with different importance assigned to each objective on different clients. In this work, we discuss the implications of this type of heterogeneity and propose FedPref, a first algorithm designed to facilitate personalised FL in this setting. We demonstrate the effectiveness of the algorithm across different problems, preference distributions and model architectures. In addition, we introduce a new analytical point of view, based on multi-objective metrics, for evaluating the performance of FL algorithms in this setting beyond the traditional client-focused metrics. We perform a second experimental analysis based in this view, and show that FedPref outperforms compared algorithms.
Ranking with Confidence for Large Scale Comparison Data
Valdeira, Filipa, Soares, Clรกudia
In this work, we leverage a generative data model considering comparison noise to develop a fast, precise, and informative ranking algorithm from pairwise comparisons that produces a measure of confidence on each comparison. The problem of ranking a large number of items from noisy and sparse pairwise comparison data arises in diverse applications, like ranking players in online games, document retrieval or ranking human perceptions. Although different algorithms are available, we need fast, large-scale algorithms whose accuracy degrades gracefully when the number of comparisons is too small. Fitting our proposed model entails solving a non-convex optimization problem, which we tightly approximate by a sum of quasi-convex functions and a regularization term. Resorting to an iterative reweighted minimization and the Primal-Dual Hybrid Gradient method, we obtain PD-Rank, achieving a Kendall tau 0.1 higher than all comparing methods, even for 10\% of wrong comparisons in simulated data matching our data model, and leading in accuracy if data is generated according to the Bradley-Terry model, in both cases faster by one order of magnitude, in seconds. In real data, PD-Rank requires less computational time to achieve the same Kendall tau than active learning methods.
Optimal Multi-Objective Best Arm Identification with Fixed Confidence
Chen, Zhirui, Karthik, P. N., Chee, Yeow Meng, Tan, Vincent Y. F.
We consider a multi-armed bandit setting with finitely many arms, in which each arm yields an $M$-dimensional vector reward upon selection. We assume that the reward of each dimension (a.k.a. {\em objective}) is generated independently of the others. The best arm of any given objective is the arm with the largest component of mean corresponding to the objective. The end goal is to identify the best arm of {\em every} objective in the shortest (expected) time subject to an upper bound on the probability of error (i.e., fixed-confidence regime). We establish a problem-dependent lower bound on the limiting growth rate of the expected stopping time, in the limit of vanishing error probabilities. This lower bound, we show, is characterised by a max-min optimisation problem that is computationally expensive to solve at each time step. We propose an algorithm that uses the novel idea of {\em surrogate proportions} to sample the arms at each time step, eliminating the need to solve the max-min optimisation problem at each step. We demonstrate theoretically that our algorithm is asymptotically optimal. In addition, we provide extensive empirical studies to substantiate the efficiency of our algorithm. While existing works on pure exploration with multi-objective multi-armed bandits predominantly focus on {\em Pareto frontier identification}, our work fills the gap in the literature by conducting a formal investigation of the multi-objective best arm identification problem.
Safety in safe Bayesian optimization and its ramifications for control
Fiedler, Christian, Menn, Johanna, Trimpe, Sebastian
A recurring and important task in control engineering is parameter tuning under constraints, which conceptually amounts to optimization of a blackbox function accessible only through noisy evaluations. For example, in control practice parameters of a pre-designed controller are often tuned online in feedback with a plant, and only safe parameter values should be tried, avoiding for example instability. Recently, machine learning methods have been deployed for this important problem, in particular, Bayesian optimization (BO). To handle safety constraints, algorithms from safe BO have been utilized, especially SafeOpt-type algorithms, which enjoy considerable popularity in learning-based control, robotics, and adjacent fields. However, we identify two significant obstacles to practical safety. First, SafeOpt-type algorithms rely on quantitative uncertainty bounds, and most implementations replace these by theoretically unsupported heuristics. Second, the theoretically valid uncertainty bounds crucially depend on a quantity - the reproducing kernel Hilbert space norm of the target function - that at present is impossible to reliably bound using established prior engineering knowledge. By careful numerical experiments we show that these issues can indeed cause safety violations. To overcome these problems, we propose Lipschitz-only Safe Bayesian Optimization (LoSBO), a safe BO algorithm that relies only on a known Lipschitz bound for its safety. Furthermore, we propose a variant (LoS-GP-UCB) that avoids gridding of the search space and is therefore applicable even for moderately high-dimensional problems.
Reviews: The Implicit Bias of AdaGrad on Separable Data
The main contribution of the paper is to characterize the implicit bias of the adagrad on linear classification problems using logistic loss (and other losses). They show that the adagrad converges to a direction which is a solution of a quadratic optimization problem depending on the initial conditions, hyperparameters and the data itself unlike gradient descent which converges to the max margin direction. They also give a few toy examples to demonstrate the properties of the adagrad solution and the difference as compared to the gradient descent direction. Significance: It is an important problem to understand the implicit bias of various optimization algorithms on the solution and there have been several recent works along this direction. The motivation comes from understanding the generalization abilities of overparametrized deep neural networks.
Reviews: Model Compression with Adversarial Robustness: A Unified Optimization Framework
This paper formulates a new problem and proposed a reasonable algorithm. However, I am not totally convinced the current jointly optimization is significantly better a two steps approach of (1) first do adversarial learning and then (2) compress the model by pruning and compression. As shown in Figure 2, the performance of the proposed method is almost the same as the two step approach (adversarial learning pruning). Moreover, the current paper could be improved in the following aspects: - eq (3): what about the non-conversational layers? I wish to see results of other adversarial learning. Is it possible to provide experiments for large neural networks?
Reviews: A Convex Relaxation Barrier to Tight Robustness Verification of Neural Networks
The paper targets the problem of robustness verification of neural networks. This is a very popular and important problem. One of the prominent ways to deal with it is by formulating it as a nonlinear optimization problem and then relaxing its constraints to form a linear program relaxation. These relaxations are not guaranteed to return the optimal value, but they can be solved in polynomial time and provide bounds on the optimal solution. The main contributions of the paper are as follows: 1. Proposing a unified framework that generalizes all known layerwise LP relaxations, and showing their relationship (i.e., which relaxation is tighter).
Review for NeurIPS paper: Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems
Additional Feedback: To summarize, the authors consider the problem of finding approximate Second Order Stationary Points (SOSPs). Compared with other works, the authors assume that the objective is generally non-convex and constrains are linear. Two methods are designed to solve this optimization problem. Both of these two methods are proved to have polynomial per-iteration complexity and global sublinear rates. In general, the problem is both theoretically and empirically interesting.