global optimizer
Meta-learning characteristics and dynamics of quantum systems
Schorling, Lucas, Vaidhyanathan, Pranav, Schuff, Jonas, Carballido, Miguel J., Zumbühl, Dominik, Milburn, Gerard, Marquardt, Florian, Foerster, Jakob, Osborne, Michael A., Ares, Natalia
While machine learning holds great promise for quantum technologies, most current methods focus on predicting or controlling a specific quantum system. Meta-learning approaches, however, can adapt to new systems for which little data is available by leveraging knowledge obtained from previous data associated with similar systems. In this paper, we meta-learn dynamics and characteristics of closed and open two-level systems, as well as the Heisenberg model. Based on experimental data from a Loss-DiVincenzo spin qubit hosted in a Ge/Si core/shell nanowire under different gate voltage configurations, we use meta-learning to predict qubit characteristics, i.e., the $g$-factor and the Rabi frequency. The algorithm we introduce improves upon previous state-of-the-art meta-learning methods for physics-based systems through novel techniques such as adaptive learning rates and a global optimizer, yielding improved robustness and computational efficiency. We benchmark our method against other meta-learning methods, a vanilla transformer, and a multilayer perceptron, and demonstrate improved performance.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Europe > Germany (0.04)
- (2 more...)
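The abstract describes the method only at a high level. As a purely hypothetical illustration of gradient-based meta-learning of this flavor, here is a minimal Reptile-style sketch on an assumed toy family of linear "systems"; the paper's adaptive learning rates and global optimizer are not reproduced.

```python
import numpy as np

# Minimal first-order meta-learning sketch (Reptile-style) on a toy family
# of linear "systems". Purely illustrative: the task family, model, and
# hyperparameters below are assumptions, not the paper's architecture.
rng = np.random.default_rng(0)

def sample_task(n=20, d=3):
    """One 'system' from the family: y = X @ w_true with a task-specific w_true."""
    w_true = rng.normal(size=d)
    X = rng.normal(size=(n, d))
    return X, X @ w_true

def grad_mse(w, X, y):
    return 2.0 * X.T @ (X @ w - y) / len(y)

w_meta = np.zeros(3)                      # meta-initialization shared across systems
inner_lr, outer_lr, inner_steps = 0.05, 0.1, 5

for _ in range(3000):
    X, y = sample_task()
    w = w_meta.copy()
    for _ in range(inner_steps):          # inner loop: adapt to the sampled system
        w -= inner_lr * grad_mse(w, X, y)
    w_meta += outer_lr * (w - w_meta)     # Reptile outer update toward adapted weights

# After meta-training, a few inner steps suffice to fit an unseen system.
X, y = sample_task()
w = w_meta.copy()
for _ in range(inner_steps):
    w -= inner_lr * grad_mse(w, X, y)
print("post-adaptation MSE:", float(np.mean((X @ w - y) ** 2)))
```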
Finding a sparse vector in a subspace: Linear sparsity using alternating directions
Qu, Qing, Sun, Ju, Wright, John
We consider the problem of recovering the sparsest vector in a subspace $\mathcal{S} \subseteq \mathbb{R}^p$ with $\text{dim}(\mathcal{S}) = n$. This problem can be considered a homogeneous variant of the sparse recovery problem, and finds applications in sparse dictionary learning, sparse PCA, and other problems in signal processing and machine learning. Simple convex heuristics for this problem provably break down when the fraction of nonzero entries in the target sparse vector substantially exceeds $1/\sqrt{n}$. In contrast, we exhibit a relatively simple nonconvex approach based on alternating directions, which provably succeeds even when the fraction of nonzero entries is $\Omega(1)$. To our knowledge, this is the first practical algorithm to achieve this linear scaling. This result assumes a planted sparse model, in which the target sparse vector is embedded in an otherwise random subspace. Empirically, our proposed algorithm also succeeds in more challenging data models arising, e.g., from sparse dictionary learning.
- North America > United States > New York > Richmond County > New York City (0.04)
- North America > United States > New York > Queens County > New York City (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
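A standard alternating-direction scheme for this problem alternates a sparsity-promoting soft-threshold step with a projection back onto the subspace and the sphere. Below is a minimal sketch of that alternation under the planted sparse model; the dimensions, threshold, and restart count are assumed choices.

```python
import numpy as np

# Alternating directions for the planted sparse vector problem: alternate a
# soft-threshold step with projection onto the subspace and the sphere.
# Dimensions, the threshold lam, and the restart count are illustrative.
rng = np.random.default_rng(1)
p, n, k = 400, 10, 40                     # ambient dim, subspace dim, sparsity

# Planted sparse model: one k-sparse vector inside an otherwise random subspace.
x0 = np.zeros(p)
x0[rng.choice(p, k, replace=False)] = rng.normal(size=k)
basis = np.column_stack([x0 / np.linalg.norm(x0), rng.normal(size=(p, n - 1))])
Y, _ = np.linalg.qr(basis)                # orthonormal basis for the subspace

def soft(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

lam = 1.0 / np.sqrt(p)
best_q, best_obj = None, np.inf
for _ in range(10):                       # several random restarts
    q = rng.normal(size=n)
    q /= np.linalg.norm(q)
    for _ in range(200):
        x = soft(Y @ q, lam)              # sparsity-promoting step
        q = Y.T @ x                       # project back onto the subspace...
        q /= np.linalg.norm(q)            # ...and onto the sphere
    obj = np.abs(Y @ q).sum()             # l1 objective on the sphere
    if obj < best_obj:
        best_q, best_obj = q, obj

recovered = Y @ best_q
corr = abs(recovered @ x0) / (np.linalg.norm(recovered) * np.linalg.norm(x0))
print(f"correlation with planted vector: {corr:.3f}")
```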
Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets
We prove rich algebraic structures of the solution space for 2-layer neural networks with quadratic activation and $L_2$ loss, trained on reasoning tasks over Abelian groups (e.g., modular addition). Such rich structure enables the analytical construction of globally optimal solutions from partial solutions that satisfy only part of the loss, despite its high nonlinearity. We coin this framework CoGO (Composing Global Optimizers). Specifically, we show that the weight space over different numbers of hidden nodes of the 2-layer network is equipped with a semi-ring algebraic structure, and that the loss function to be optimized consists of monomial potentials, which are ring homomorphisms, allowing partial solutions to be composed into global ones by ring addition and multiplication. Our experiments show that around $95\%$ of the solutions obtained by gradient descent match our theoretical constructions exactly. Although the constructed global optimizers require only a small number of hidden nodes, our analysis of gradient dynamics shows that over-parameterization asymptotically decouples the training dynamics and is beneficial. We further show that training dynamics favors simpler solutions under weight decay, and thus high-order global optimizers such as perfect memorization are unfavorable. Code can be found at https://github.com/facebookresearch/luckmatters/tree/yuandong3/ssl/real-dataset.
- North America > Canada > Ontario > National Capital Region > Ottawa (0.04)
- Asia > Middle East > Jordan (0.04)
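To make the setting concrete (this sketches the training regime, not the CoGO construction itself), here is a plain gradient-descent run of a 2-layer quadratic-activation network with $L_2$ loss on modular addition; all hyperparameters are assumptions, and convergence to a global optimizer is not guaranteed.

```python
import numpy as np

# 2-layer network with quadratic activation and L2 loss on modular addition
# (a + b mod p), trained by full-batch gradient descent with manual backprop.
# Hyperparameters are assumed; this is the setting, not the CoGO construction.
rng = np.random.default_rng(2)
p, m, lr = 7, 64, 0.05

# All p^2 input pairs as concatenated one-hots; targets one-hot(a + b mod p).
pairs = [(a, b) for a in range(p) for b in range(p)]
X = np.zeros((len(pairs), 2 * p))
T = np.zeros((len(pairs), p))
for i, (a, b) in enumerate(pairs):
    X[i, a] = X[i, p + b] = 1.0
    T[i, (a + b) % p] = 1.0

W = rng.normal(scale=0.3, size=(2 * p, m))
V = rng.normal(scale=0.3, size=(m, p))

for step in range(10000):
    H = X @ W                             # hidden pre-activations
    A = H ** 2                            # quadratic activation
    Y = A @ V                             # network output
    G = 2.0 * (Y - T) / len(X)            # dLoss/dY for the L2 loss
    dV = A.T @ G
    dH = (G @ V.T) * 2.0 * H              # chain rule through the square
    dW = X.T @ dH
    W -= lr * dW
    V -= lr * dV

print("final L2 loss:", float(np.mean((Y - T) ** 2)))
```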
Verification and Synthesis of Robust Control Barrier Functions: Multilevel Polynomial Optimization and Semidefinite Relaxation
Kang, Shucheng, Chen, Yuxiao, Yang, Heng, Pavone, Marco
We study the problem of verification and synthesis of robust control barrier functions (CBFs) for control-affine polynomial systems with bounded additive uncertainty and convex polynomial constraints on the control. We first formulate robust CBF verification and synthesis as multilevel polynomial optimization problems (POPs), where verification optimizes -- in three levels -- the uncertainty, control, and state, while synthesis additionally optimizes the parameters of a chosen parametric CBF candidate. We then show that, by invoking the KKT conditions of the inner optimizations over uncertainty and control, the verification problem can be simplified to a single-level POP, and the synthesis problem reduces to a min-max POP. This reduction leads to multilevel semidefinite relaxations. For the verification problem, we apply Lasserre's hierarchy of moment relaxations. For the synthesis problem, we draw connections to existing relaxation techniques for robust min-max POPs, which first use sum-of-squares programming to find increasingly tight polynomial lower bounds on the unknown value function of the verification POP, and then call Lasserre's hierarchy again to maximize the lower bounds. Both semidefinite relaxations guarantee asymptotic convergence to global optimality. We provide an in-depth study of our framework on the controlled Van der Pol oscillator, both with and without additive uncertainty.
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
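As a concrete (and deliberately crude) stand-in for the paper's semidefinite certificates, the robust CBF inequality can at least be spot-checked by sampling. The sketch below does this for the controlled Van der Pol oscillator; the dynamics parameters, bounds, and candidate $h$ are assumptions for the example, and grid sampling is a swapped-in surrogate, not the paper's method.

```python
import numpy as np

# Grid-sampled check of the robust CBF inequality
#   sup_u inf_w [ grad_h . (f(x) + g(x) u + w e2) ] + alpha * h(x) >= 0
# for the controlled Van der Pol oscillator. This samples instead of
# certifying; the paper's SDP relaxations give actual certificates.
mu, u_max, w_max, alpha = 1.0, 2.0, 0.1, 1.0
h = lambda x: 1.0 - x @ x                 # assumed candidate CBF: unit-disk safe set
grad_h = lambda x: -2.0 * x

def robust_cbf_margin(x):
    f = np.array([x[1], mu * (1 - x[0] ** 2) * x[1] - x[0]])
    g = np.array([0.0, 1.0])              # control enters the second state
    dh = grad_h(x)
    drift = dh @ f
    best_u = abs(dh @ g) * u_max          # best box-bounded control action
    worst_w = -abs(dh[1]) * w_max         # worst-case additive uncertainty on x2
    return drift + best_u + worst_w + alpha * h(x)

# Evaluate the margin on grid points near the safe-set boundary.
grid = np.linspace(-1.2, 1.2, 241)
margins = [robust_cbf_margin(np.array([a, b]))
           for a in grid for b in grid
           if 0.0 <= h(np.array([a, b])) <= 0.2]
print("worst sampled margin near boundary:", min(margins))  # >= 0 is consistent
```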
On Local Optimizers of Acquisition Functions in Bayesian Optimization
Bayesian optimization is a sample-efficient method for finding a global optimum of an expensive-to-evaluate black-box function. A global solution is found by accumulating pairs of query points and corresponding function values while repeating two procedures: (i) learning a surrogate model of the objective function from the data observed so far; (ii) maximizing an acquisition function to determine where next to query the objective function. Convergence guarantees are only valid when the global optimizer of the acquisition function is found and selected as the next query point. In practice, however, local optimizers of acquisition functions are also used, since finding the exact optimizer of the acquisition function is often non-trivial or time-consuming. In this paper we analyze the behavior of local optimizers of acquisition functions in terms of their instantaneous regret relative to global optimizers. We also analyze the performance when multi-started local optimizers are used to find the maximum of the acquisition function. Numerical experiments confirm the validity of our theoretical analysis.
- Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
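The multi-start strategy analyzed here is straightforward to sketch: run a local optimizer from several random starting points on the acquisition surface and keep the best local optimum as the next query. The GP posterior and UCB acquisition below are standard but assumed choices, not the paper's exact setup.

```python
import numpy as np
from scipy.optimize import minimize

# Multi-started local optimization of an acquisition function (GP-UCB here).
# The kernel, data, and beta are assumed choices for a 1-D illustration.
rng = np.random.default_rng(3)
k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / 0.25)

X_obs = np.array([0.1, 0.4, 0.9])          # points queried so far
y_obs = np.sin(6 * X_obs)                  # observed black-box values
K_inv = np.linalg.inv(k(X_obs, X_obs) + 1e-6 * np.eye(len(X_obs)))

def neg_ucb(x, beta=2.0):
    """Negative GP-UCB at x (we minimize, so UCB gets maximized)."""
    kx = k(np.atleast_1d(x[0]), X_obs)[0]  # covariances to observed points
    mu = kx @ K_inv @ y_obs                # GP posterior mean
    var = max(1.0 - kx @ K_inv @ kx, 0.0)  # GP posterior variance
    return -(mu + beta * np.sqrt(var))

# Multi-start: local optimizer from several random initial points; keep the best.
starts = rng.uniform(0.0, 1.0, size=10)
results = [minimize(neg_ucb, x0=[s], bounds=[(0.0, 1.0)], method="L-BFGS-B")
           for s in starts]
best = min(results, key=lambda r: r.fun)
print("next query point:", best.x[0], "UCB value:", -best.fun)
```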
Porcupine Neural Networks: (Almost) All Local Optima are Global
Feizi, Soheil, Javadi, Hamid, Zhang, Jesse, Tse, David
Neural networks have been used prominently in several machine learning and statistics applications. In general, the underlying optimization of neural networks is non-convex, which makes their performance analysis challenging. In this paper, we take a novel approach to this problem by asking whether one can constrain neural network weights so that the optimization landscape has good theoretical properties while, at the same time, the constrained network remains a good approximation of the unconstrained one. For two-layer neural networks, we provide affirmative answers to these questions by introducing Porcupine Neural Networks (PNNs), whose weight vectors are constrained to lie on a finite set of lines. We show that most local optima of PNN optimization are global, and we characterize the regions where bad local optima may exist. Moreover, our theoretical and empirical results suggest that an unconstrained neural network can be approximated using a polynomially-large PNN.
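The PNN constraint itself is simple to state in code: every hidden weight vector must lie on one of finitely many lines through the origin, so after any update each weight vector is projected onto the line it is most aligned with. A minimal sketch, with the line set and dimensions as assumptions:

```python
import numpy as np

# PNN-style constraint: project each hidden weight vector onto the nearest
# line from a fixed finite set of lines through the origin.
# The line set, dimensions, and counts below are illustrative assumptions.
rng = np.random.default_rng(4)
d, r, m = 5, 8, 16                         # input dim, number of lines, hidden units

lines = rng.normal(size=(r, d))            # one unit direction per line
lines /= np.linalg.norm(lines, axis=1, keepdims=True)

def project_to_lines(W):
    """Project each row of W onto the line it is most aligned with."""
    scores = W @ lines.T                   # (m, r) signed alignments
    idx = np.argmax(np.abs(scores), axis=1)        # nearest line per weight vector
    coeff = scores[np.arange(len(W)), idx]         # signed length along that line
    return coeff[:, None] * lines[idx]

W = rng.normal(size=(m, d))                # unconstrained weights
W_pnn = project_to_lines(W)                # PNN-feasible weights

# Sanity check: every projected vector is perfectly aligned with some line.
align = np.abs(W_pnn @ lines.T).max(axis=1) / np.linalg.norm(W_pnn, axis=1)
print(np.allclose(align, 1.0))             # True
```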
Simulated Annealing: Rigorous finite-time guarantees for optimization on continuous domains
Lecchini-visintini, Andrea, Lygeros, John, Maciejowski, Jan
Simulated annealing is a popular method for approaching the solution of a global optimization problem. Existing results on its performance apply to discrete combinatorial optimization where the optimization variables can assume only a finite set of possible values. We introduce a new general formulation of simulated annealing which allows one to guarantee finite-time performance in the optimization of functions of continuous variables. The results hold universally for any optimization problem on a bounded domain and establish a connection between simulated annealing and up-to-date theory of convergence of Markov chain Monte Carlo methods on continuous domains. This work is inspired by the concept of finite-time learning with known accuracy and confidence developed in statistical learning theory.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.29)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > New York (0.05)
- (3 more...)
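For concreteness, a minimal simulated-annealing loop on a bounded continuous domain looks as follows. The objective, Gaussian proposal, and logarithmic cooling schedule are assumed illustrations of the setting, not the paper's specific formulation or its finite-time guarantees.

```python
import numpy as np

# Simulated annealing on the bounded continuous domain [-2, 2]^2 with a
# Metropolis acceptance rule. Objective, proposal width, and schedule are
# illustrative assumptions, not the paper's formulation.
rng = np.random.default_rng(5)
f = lambda x: np.sum(x ** 2) + np.sum(np.sin(5 * x))  # multimodal toy objective
lo, hi = -2.0, 2.0

x = rng.uniform(lo, hi, size=2)
best_x, best_f = x.copy(), f(x)
for t in range(1, 20001):
    T = 1.0 / np.log(t + 1)                      # logarithmic cooling schedule
    prop = np.clip(x + 0.2 * rng.normal(size=2), lo, hi)
    # Always accept improvements; accept uphill moves with Metropolis probability.
    if f(prop) < f(x) or rng.random() < np.exp(-(f(prop) - f(x)) / T):
        x = prop
    if f(x) < best_f:
        best_x, best_f = x.copy(), f(x)

print("best point:", best_x, "best value:", float(best_f))
```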