global optimizer
Meta-learning characteristics and dynamics of quantum systems
Schorling, Lucas, Vaidhyanathan, Pranav, Schuff, Jonas, Carballido, Miguel J., Zumbühl, Dominik, Milburn, Gerard, Marquardt, Florian, Foerster, Jakob, Osborne, Michael A., Ares, Natalia
While machine learning holds great promise for quantum technologies, most current methods focus on predicting or controlling a specific quantum system. Meta-learning approaches, however, can adapt to new systems for which little data is available by leveraging knowledge obtained from previous data associated with similar systems. In this paper, we meta-learn dynamics and characteristics of closed and open two-level systems, as well as the Heisenberg model. Based on experimental data from a Loss-DiVincenzo spin qubit hosted in a Ge/Si core/shell nanowire under different gate voltage configurations, we use meta-learning to predict qubit characteristics, i.e., the $g$-factor and the Rabi frequency. The algorithm we introduce improves upon previous state-of-the-art meta-learning methods for physics-based systems through novel techniques such as adaptive learning rates and a global optimizer, yielding improved robustness and computational efficiency. We benchmark our method against other meta-learning methods, a vanilla transformer, and a multilayer perceptron, and demonstrate improved performance.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Europe > Germany (0.04)
- (2 more...)
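The abstract describes the method only at a high level. As a purely hypothetical illustration of gradient-based meta-learning of this flavor, here is a minimal Reptile-style sketch on an assumed toy family of linear "systems"; the paper's adaptive learning rates and global optimizer are not reproduced.

```python
import numpy as np

# Minimal first-order meta-learning sketch (Reptile-style) on a toy family
# of linear "systems". Purely illustrative: the task family, model, and
# hyperparameters below are assumptions, not the paper's architecture.
rng = np.random.default_rng(0)

def sample_task(n=20, d=3):
    """One 'system' from the family: y = X @ w_true with a task-specific w_true."""
    w_true = rng.normal(size=d)
    X = rng.normal(size=(n, d))
    return X, X @ w_true

def grad_mse(w, X, y):
    return 2.0 * X.T @ (X @ w - y) / len(y)

w_meta = np.zeros(3)                      # meta-initialization shared across systems
inner_lr, outer_lr, inner_steps = 0.05, 0.1, 5

for _ in range(3000):
    X, y = sample_task()
    w = w_meta.copy()
    for _ in range(inner_steps):          # inner loop: adapt to the sampled system
        w -= inner_lr * grad_mse(w, X, y)
    w_meta += outer_lr * (w - w_meta)     # Reptile outer update toward adapted weights

# After meta-training, a few inner steps suffice to fit an unseen system.
X, y = sample_task()
w = w_meta.copy()
for _ in range(inner_steps):
    w -= inner_lr * grad_mse(w, X, y)
print("post-adaptation MSE:", float(np.mean((X @ w - y) ** 2)))
```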
Finding a sparse vector in a subspace: Linear sparsity using alternating directions
Qu, Qing, Sun, Ju, Wright, John
We consider the problem of recovering the sparsest vector in a subspace $\mathcal{S} \subseteq \mathbb{R}^p$ with $\text{dim}(\mathcal{S}) = n$. This problem can be considered a homogeneous variant of the sparse recovery problem, and finds applications in sparse dictionary learning, sparse PCA, and other problems in signal processing and machine learning. Simple convex heuristics for this problem provably break down when the fraction of nonzero entries in the target sparse vector substantially exceeds $1/\sqrt{n}$. In contrast, we exhibit a relatively simple nonconvex approach based on alternating directions, which provably succeeds even when the fraction of nonzero entries is $\Omega(1)$. To our knowledge, this is the first practical algorithm to achieve this linear scaling. This result assumes a planted sparse model, in which the target sparse vector is embedded in an otherwise random subspace. Empirically, our proposed algorithm also succeeds in more challenging data models arising, e.g., from sparse dictionary learning.
- North America > United States > New York > Richmond County > New York City (0.04)
- North America > United States > New York > Queens County > New York City (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
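A standard alternating-direction scheme for this problem alternates a sparsity-promoting soft-threshold step with a projection back onto the subspace and the sphere. Below is a minimal sketch of that alternation under the planted sparse model; the dimensions, threshold, and restart count are assumed choices.

```python
import numpy as np

# Alternating directions for the planted sparse vector problem: alternate a
# soft-threshold step with projection onto the subspace and the sphere.
# Dimensions, the threshold lam, and the restart count are illustrative.
rng = np.random.default_rng(1)
p, n, k = 400, 10, 40                     # ambient dim, subspace dim, sparsity

# Planted sparse model: one k-sparse vector inside an otherwise random subspace.
x0 = np.zeros(p)
x0[rng.choice(p, k, replace=False)] = rng.normal(size=k)
basis = np.column_stack([x0 / np.linalg.norm(x0), rng.normal(size=(p, n - 1))])
Y, _ = np.linalg.qr(basis)                # orthonormal basis for the subspace

def soft(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

lam = 1.0 / np.sqrt(p)
best_q, best_obj = None, np.inf
for _ in range(10):                       # several random restarts
    q = rng.normal(size=n)
    q /= np.linalg.norm(q)
    for _ in range(200):
        x = soft(Y @ q, lam)              # sparsity-promoting step
        q = Y.T @ x                       # project back onto the subspace...
        q /= np.linalg.norm(q)            # ...and onto the sphere
    obj = np.abs(Y @ q).sum()             # l1 objective on the sphere
    if obj < best_obj:
        best_q, best_obj = q, obj

recovered = Y @ best_q
corr = abs(recovered @ x0) / (np.linalg.norm(recovered) * np.linalg.norm(x0))
print(f"correlation with planted vector: {corr:.3f}")
```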
Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets
We prove rich algebraic structures of the solution space for 2-layer neural networks with quadratic activation and $L_2$ loss, trained on reasoning tasks over Abelian groups (e.g., modular addition). Such rich structure enables the analytical construction of globally optimal solutions from partial solutions that satisfy only part of the loss, despite its high nonlinearity. We coin this framework CoGO (Composing Global Optimizers). Specifically, we show that the weight space over different numbers of hidden nodes of the 2-layer network is equipped with a semi-ring algebraic structure, and that the loss function to be optimized consists of monomial potentials, which are ring homomorphisms, allowing partial solutions to be composed into global ones by ring addition and multiplication. Our experiments show that around $95\%$ of the solutions obtained by gradient descent match our theoretical constructions exactly. Although the constructed global optimizers require only a small number of hidden nodes, our analysis of gradient dynamics shows that over-parameterization asymptotically decouples the training dynamics and is beneficial. We further show that training dynamics favors simpler solutions under weight decay, and thus high-order global optimizers such as perfect memorization are unfavorable. Code can be found at https://github.com/facebookresearch/luckmatters/tree/yuandong3/ssl/real-dataset.
- North America > Canada > Ontario > National Capital Region > Ottawa (0.04)
- Asia > Middle East > Jordan (0.04)
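To make the setting concrete (this sketches the training regime, not the CoGO construction itself), here is a plain gradient-descent run of a 2-layer quadratic-activation network with $L_2$ loss on modular addition; all hyperparameters are assumptions, and convergence to a global optimizer is not guaranteed.

```python
import numpy as np

# 2-layer network with quadratic activation and L2 loss on modular addition
# (a + b mod p), trained by full-batch gradient descent with manual backprop.
# Hyperparameters are assumed; this is the setting, not the CoGO construction.
rng = np.random.default_rng(2)
p, m, lr = 7, 64, 0.05

# All p^2 input pairs as concatenated one-hots; targets one-hot(a + b mod p).
pairs = [(a, b) for a in range(p) for b in range(p)]
X = np.zeros((len(pairs), 2 * p))
T = np.zeros((len(pairs), p))
for i, (a, b) in enumerate(pairs):
    X[i, a] = X[i, p + b] = 1.0
    T[i, (a + b) % p] = 1.0

W = rng.normal(scale=0.3, size=(2 * p, m))
V = rng.normal(scale=0.3, size=(m, p))

for step in range(10000):
    H = X @ W                             # hidden pre-activations
    A = H ** 2                            # quadratic activation
    Y = A @ V                             # network output
    G = 2.0 * (Y - T) / len(X)            # dLoss/dY for the L2 loss
    dV = A.T @ G
    dH = (G @ V.T) * 2.0 * H              # chain rule through the square
    dW = X.T @ dH
    W -= lr * dW
    V -= lr * dV

print("final L2 loss:", float(np.mean((Y - T) ** 2)))
```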
Verification and Synthesis of Robust Control Barrier Functions: Multilevel Polynomial Optimization and Semidefinite Relaxation
Kang, Shucheng, Chen, Yuxiao, Yang, Heng, Pavone, Marco
We study the problem of verification and synthesis of robust control barrier functions (CBFs) for control-affine polynomial systems with bounded additive uncertainty and convex polynomial constraints on the control. We first formulate robust CBF verification and synthesis as multilevel polynomial optimization problems (POPs), where verification optimizes -- in three levels -- the uncertainty, control, and state, while synthesis additionally optimizes the parameters of a chosen parametric CBF candidate. We then show that, by invoking the KKT conditions of the inner optimizations over uncertainty and control, the verification problem can be simplified to a single-level POP, and the synthesis problem reduces to a min-max POP. This reduction leads to multilevel semidefinite relaxations. For the verification problem, we apply Lasserre's hierarchy of moment relaxations. For the synthesis problem, we draw connections to existing relaxation techniques for robust min-max POPs, which first use sum-of-squares programming to find increasingly tight polynomial lower bounds on the unknown value function of the verification POP, and then call Lasserre's hierarchy again to maximize the lower bounds. Both semidefinite relaxations guarantee asymptotic convergence to global optimality. We provide an in-depth study of our framework on the controlled Van der Pol oscillator, both with and without additive uncertainty.
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
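As a concrete (and deliberately crude) stand-in for the paper's semidefinite certificates, the robust CBF inequality can at least be spot-checked by sampling. The sketch below does this for the controlled Van der Pol oscillator; the dynamics parameters, bounds, and candidate $h$ are assumptions for the example, and grid sampling is a swapped-in surrogate, not the paper's method.

```python
import numpy as np

# Grid-sampled check of the robust CBF inequality
#   sup_u inf_w [ grad_h . (f(x) + g(x) u + w e2) ] + alpha * h(x) >= 0
# for the controlled Van der Pol oscillator. This samples instead of
# certifying; the paper's SDP relaxations give actual certificates.
mu, u_max, w_max, alpha = 1.0, 2.0, 0.1, 1.0
h = lambda x: 1.0 - x @ x                 # assumed candidate CBF: unit-disk safe set
grad_h = lambda x: -2.0 * x

def robust_cbf_margin(x):
    f = np.array([x[1], mu * (1 - x[0] ** 2) * x[1] - x[0]])
    g = np.array([0.0, 1.0])              # control enters the second state
    dh = grad_h(x)
    drift = dh @ f
    best_u = abs(dh @ g) * u_max          # best box-bounded control action
    worst_w = -abs(dh[1]) * w_max         # worst-case additive uncertainty on x2
    return drift + best_u + worst_w + alpha * h(x)

# Evaluate the margin on grid points near the safe-set boundary.
grid = np.linspace(-1.2, 1.2, 241)
margins = [robust_cbf_margin(np.array([a, b]))
           for a in grid for b in grid
           if 0.0 <= h(np.array([a, b])) <= 0.2]
print("worst sampled margin near boundary:", min(margins))  # >= 0 is consistent
```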
On Local Optimizers of Acquisition Functions in Bayesian Optimization
Bayesian optimization is a sample-efficient method for finding a global optimum of an expensive-to-evaluate black-box function. A global solution is found by accumulating pairs of query points and corresponding function values while repeating two procedures: (i) learning a surrogate model of the objective function from the data observed so far; (ii) maximizing an acquisition function to determine where next to query the objective function. Convergence guarantees are only valid when the global optimizer of the acquisition function is found and selected as the next query point. In practice, however, local optimizers of acquisition functions are also used, since finding the exact optimizer of the acquisition function is often non-trivial or time-consuming. In this paper we analyze the behavior of local optimizers of acquisition functions in terms of their instantaneous regret relative to global optimizers. We also analyze the performance when multi-started local optimizers are used to find the maximum of the acquisition function. Numerical experiments confirm the validity of our theoretical analysis.
- Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
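The multi-start strategy analyzed here is straightforward to sketch: run a local optimizer from several random starting points on the acquisition surface and keep the best local optimum as the next query. The GP posterior and UCB acquisition below are standard but assumed choices, not the paper's exact setup.

```python
import numpy as np
from scipy.optimize import minimize

# Multi-started local optimization of an acquisition function (GP-UCB here).
# The kernel, data, and beta are assumed choices for a 1-D illustration.
rng = np.random.default_rng(3)
k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / 0.25)

X_obs = np.array([0.1, 0.4, 0.9])          # points queried so far
y_obs = np.sin(6 * X_obs)                  # observed black-box values
K_inv = np.linalg.inv(k(X_obs, X_obs) + 1e-6 * np.eye(len(X_obs)))

def neg_ucb(x, beta=2.0):
    """Negative GP-UCB at x (we minimize, so UCB gets maximized)."""
    kx = k(np.atleast_1d(x[0]), X_obs)[0]  # covariances to observed points
    mu = kx @ K_inv @ y_obs                # GP posterior mean
    var = max(1.0 - kx @ K_inv @ kx, 0.0)  # GP posterior variance
    return -(mu + beta * np.sqrt(var))

# Multi-start: local optimizer from several random initial points; keep the best.
starts = rng.uniform(0.0, 1.0, size=10)
results = [minimize(neg_ucb, x0=[s], bounds=[(0.0, 1.0)], method="L-BFGS-B")
           for s in starts]
best = min(results, key=lambda r: r.fun)
print("next query point:", best.x[0], "UCB value:", -best.fun)
```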
Porcupine Neural Networks: (Almost) All Local Optima are Global
Feizi, Soheil, Javadi, Hamid, Zhang, Jesse, Tse, David
Neural networks have been used prominently in several machine learning and statistics applications. In general, the underlying optimization of neural networks is non-convex, which makes their performance analysis challenging. In this paper, we take a novel approach to this problem by asking whether one can constrain neural network weights so that the optimization landscape has good theoretical properties while, at the same time, the constrained network remains a good approximation of the unconstrained one. For two-layer neural networks, we provide affirmative answers to these questions by introducing Porcupine Neural Networks (PNNs), whose weight vectors are constrained to lie on a finite set of lines. We show that most local optima of PNN optimization are global, and we characterize the regions where bad local optima may exist. Moreover, our theoretical and empirical results suggest that an unconstrained neural network can be approximated using a polynomially-large PNN.
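The PNN constraint itself is simple to state in code: every hidden weight vector must lie on one of finitely many lines through the origin, so after any update each weight vector is projected onto the line it is most aligned with. A minimal sketch, with the line set and dimensions as assumptions:

```python
import numpy as np

# PNN-style constraint: project each hidden weight vector onto the nearest
# line from a fixed finite set of lines through the origin.
# The line set, dimensions, and counts below are illustrative assumptions.
rng = np.random.default_rng(4)
d, r, m = 5, 8, 16                         # input dim, number of lines, hidden units

lines = rng.normal(size=(r, d))            # one unit direction per line
lines /= np.linalg.norm(lines, axis=1, keepdims=True)

def project_to_lines(W):
    """Project each row of W onto the line it is most aligned with."""
    scores = W @ lines.T                   # (m, r) signed alignments
    idx = np.argmax(np.abs(scores), axis=1)        # nearest line per weight vector
    coeff = scores[np.arange(len(W)), idx]         # signed length along that line
    return coeff[:, None] * lines[idx]

W = rng.normal(size=(m, d))                # unconstrained weights
W_pnn = project_to_lines(W)                # PNN-feasible weights

# Sanity check: every projected vector is perfectly aligned with some line.
align = np.abs(W_pnn @ lines.T).max(axis=1) / np.linalg.norm(W_pnn, axis=1)
print(np.allclose(align, 1.0))             # True
```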
Simulated Annealing: Rigorous finite-time guarantees for optimization on continuous domains
Lecchini-visintini, Andrea, Lygeros, John, Maciejowski, Jan
Simulated annealing is a popular method for approaching the solution of a global optimization problem. Existing results on its performance apply to discrete combinatorial optimization where the optimization variables can assume only a finite set of possible values. We introduce a new general formulation of simulated annealing which allows one to guarantee finite-time performance in the optimization of functions of continuous variables. The results hold universally for any optimization problem on a bounded domain and establish a connection between simulated annealing and up-to-date theory of convergence of Markov chain Monte Carlo methods on continuous domains. This work is inspired by the concept of finite-time learning with known accuracy and confidence developed in statistical learning theory.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.29)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > New York (0.05)
- (3 more...)
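For concreteness, a minimal simulated-annealing loop on a bounded continuous domain looks as follows. The objective, Gaussian proposal, and logarithmic cooling schedule are assumed illustrations of the setting, not the paper's specific formulation or its finite-time guarantees.

```python
import numpy as np

# Simulated annealing on the bounded continuous domain [-2, 2]^2 with a
# Metropolis acceptance rule. Objective, proposal width, and schedule are
# illustrative assumptions, not the paper's formulation.
rng = np.random.default_rng(5)
f = lambda x: np.sum(x ** 2) + np.sum(np.sin(5 * x))  # multimodal toy objective
lo, hi = -2.0, 2.0

x = rng.uniform(lo, hi, size=2)
best_x, best_f = x.copy(), f(x)
for t in range(1, 20001):
    T = 1.0 / np.log(t + 1)                      # logarithmic cooling schedule
    prop = np.clip(x + 0.2 * rng.normal(size=2), lo, hi)
    # Always accept improvements; accept uphill moves with Metropolis probability.
    if f(prop) < f(x) or rng.random() < np.exp(-(f(prop) - f(x)) / T):
        x = prop
    if f(x) < best_f:
        best_x, best_f = x.copy(), f(x)

print("best point:", best_x, "best value:", float(best_f))
```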