AITopics | hessf

Collaborating Authors

hessf

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Riemannian Hamiltonian methods for min-max optimization on manifolds

Han, Andi, Mishra, Bamdev, Jawanpuria, Pratik, Kumar, Pawan, Gao, Junbin

arXiv.org Artificial IntelligenceAug-24-2023

In this paper, we study min-max optimization problems on Riemannian manifolds. We introduce a Riemannian Hamiltonian function, minimization of which serves as a proxy for solving the original min-max problems. Under the Riemannian Polyak--{\L}ojasiewicz condition on the Hamiltonian function, its minimizer corresponds to the desired min-max saddle point. We also provide cases where this condition is satisfied. For geodesic-bilinear optimization in particular, solving the proxy problem leads to the correct search direction towards global optimality, which becomes challenging with the min-max formulation. To minimize the Hamiltonian function, we propose Riemannian Hamiltonian methods (RHM) and present their convergence analyses. We extend RHM to include consensus regularization and to the stochastic setting. We illustrate the efficacy of the proposed RHM in applications such as subspace robust Wasserstein distance, robust training of neural networks, and generative adversarial networks.

artificial intelligence, machine learning, manifold, (19 more...)

arXiv.org Artificial Intelligence

2204.11418

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
Europe > Russia (0.04)
(3 more...)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)

Add feedback

Riemannian optimization with a preconditioning scheme on the generalized Stiefel manifold

Shustin, Boris, Avron, Haim

arXiv.org Artificial IntelligenceNov-15-2022

Optimization problems on the generalized Stiefel manifold (and products of it) are prevalent across science and engineering. For example, in computational science they arise in symmetric (generalized) eigenvalue problems, in nonlinear eigenvalue problems, and in electronic structures computations, to name a few problems. In statistics and machine learning, they arise, for example, in various dimensionality reduction techniques such as canonical correlation analysis. In deep learning, regularization and improved stability can be obtained by constraining some layers to have parameter matrices that belong to the Stiefel manifold. Solving problems on the generalized Stiefel manifold can be approached via the tools of Riemannian optimization. However, using the standard geometric components for the generalized Stiefel manifold has two possible shortcomings: computing some of the geometric components can be too expensive and convergence can be rather slow in certain cases. Both shortcomings can be addressed using a technique called Riemannian preconditioning, which amounts to using geometric components derived by a precoditioner that defines a Riemannian metric on the constraint manifold. In this paper we develop the geometric components required to perform Riemannian optimization on the generalized Stiefel manifold equipped with a non-standard metric, and illustrate theoretically and numerically the use of those components and the effect of Riemannian preconditioning for solving optimization problems on the generalized Stiefel manifold.

data mining, machine learning, manifold, (18 more...)

arXiv.org Artificial Intelligence

1902.01635

Country:

North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Russia > Siberian Federal District > Tomsk Oblast > Tomsk (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Riemannian accelerated gradient methods via extrapolation

Han, Andi, Mishra, Bamdev, Jawanpuria, Pratik, Gao, Junbin

arXiv.org Artificial IntelligenceAug-13-2022

Optimization on a Riemannian manifold naturally appears in various fields of applications, including principal component analysis [22, 61], matrix completion and factorization [35, 56, 13], dictionary learning [17, 27], optimal transport [49, 40, 26], to name a few. Riemannian optimization [2, 12] provides a universal and efficient framework for problem (1) that respects the intrinsic geometry of the constraint set. In addition, many non-convex problems turns out to be geodesic convex (a generalized notion of convexity) on the manifold, which yields better convergence guarantees for Riemannian optimization methods. One of the most fundamental solvers is the Riemannian gradient descent method [55, 62, 2, 12], which generalizes the classical gradient descent method in the Euclidean space with intrinsic updates on manifolds. There also exist various advanced algorithms for Riemannian optimization that include stochastic and variance reduced methods [11, 61, 34, 24, 25], adaptive gradient methods [8, 33] quasi-Newton methods [30, 43], trust region methods [1], and cubic regularized Newton methods [3], among others. Nevertheless, it remains unclear whether there exists a simple strategy to accelerate firstorder algorithms on Riemannian manifolds. Existing research on accelerated gradient methods focus primarily on generalizing Nesterov acceleration [42] to Riemannian manifolds, including [37, 4, 63, 6, 31, 36]. However, most of the algorithms are theoretic constructs and are usually less favourable in practice.

exp 1, manifold, retr 1, (15 more...)

arXiv.org Artificial Intelligence

2208.06619

Country:

Asia > Middle East > Jordan (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Asia > India (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

Towards Sharp Stochastic Zeroth Order Hessian Estimators over Riemannian Manifolds

Wang, Tianyu

arXiv.org Machine LearningFeb-12-2022

We study Hessian estimators for real-valued functions defined over an $n$-dimensional complete Riemannian manifold. We introduce new stochastic zeroth-order Hessian estimators using $O (1)$ function evaluations. We show that, for a smooth real-valued function $f$ with Lipschitz Hessian (with respect to the Rimannian metric), our estimator achieves a bias bound of order $ O \left( L_2 \delta + \gamma \delta^2 \right) $, where $ L_2 $ is the Lipschitz constant for the Hessian, $ \gamma $ depends on both the Levi-Civita connection and function $f$, and $\delta$ is the finite difference step size. To the best of our knowledge, our results provide the first bias bound for Hessian estimators that explicitly depends on the geometry of the underlying Riemannian manifold. Perhaps more importantly, our bias bound does not increase with dimension $n$. This improves best previously known bias bound for $O(1)$-evaluation Hessian estimators, which increases quadratically with $n$. We also study downstream computations based on our Hessian estimators. The supremacy of our method is evidenced by empirical evaluations.

estimator, hessf, hessian estimator, (14 more...)

arXiv.org Machine Learning

2201.1078

Country:

North America > United States > Indiana (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Convergence rates for the stochastic gradient descent method for non-convex objective functions

Fehrman, Benjamin, Gess, Benjamin, Jentzen, Arnulf

arXiv.org Machine LearningApr-2-2019

We prove the local convergence to minima and estimates on the rate of convergence for the stochastic gradient descent method in the case of not necessarily globally convex nor contracting objective functions. In particular, the results are applicable to simple objective functions arising in machine learning.

artificial intelligence, machine learning, proposition 2, (18 more...)

arXiv.org Machine Learning

1904.01517

Country:

Europe > Switzerland (0.27)
Europe > United Kingdom (0.27)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Solving SDPs for synchronization and MaxCut problems via the Grothendieck inequality

Mei, Song, Misiakiewicz, Theodor, Montanari, Andrea, Oliveira, Roberto I.

arXiv.org Machine LearningMar-29-2017

A number of statistical estimation problems can be addressed by semidefinite programs (SDP). While SDPs are solvable in polynomial time using interior point methods, in practice generic SDP solvers do not scale well to high-dimensional problems. In order to cope with this problem, Burer and Monteiro proposed a non-convex rank-constrained formulation, which has good performance in practice but is still poorly understood theoretically. In this paper we study the rank-constrained version of SDPs arising in MaxCut and in synchronization problems. We establish a Grothendieck-type inequality that proves that all the local maxima and dangerous saddle points are within a small multiplicative gap from the global maximum. We use this structural information to prove that SDPs can be solved within a known accuracy, by applying the Riemannian trust-region method to this non-convex problem, while constraining the rank to be of order one. For the MaxCut problem, our inequality implies that any local maximizer of the rank-constrained SDP provides a $ (1 - 1/(k-1)) \times 0.878$ approximation of the MaxCut, when the rank is fixed to $k$. We then apply our results to data matrices generated according to the Gaussian ${\mathbb Z}_2$ synchronization problem, and the two-groups stochastic block model with large bounded degree. We prove that the error achieved by local maximizers undergoes a phase transition at the same threshold as for information-theoretically optimal methods.

artificial intelligence, optimization problem, sdp, (17 more...)

arXiv.org Machine Learning

1703.08729

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Add feedback

Riemannian Submanifold Tracking on Low-Rank Algebraic Variety

Li, Qian (Chinese Academy of Sciences) | Wang, Zhichao (Tsinghua University)

AAAI ConferencesFeb-14-2017

Matrix recovery aims to learn a low-rank structure from high dimensional data, which arises in numerous learning applications. As a popular heuristic to matrix recovery, convex relaxation involves iterative calling of singular value decomposition (SVD). Riemannian optimization based method can alleviate such expensive cost in SVD for improved scalability, which however is usually degraded by the unknown rank. This paper proposes a novel algorithm RIST that exploits the algebraic variety of low-rank manifold for matrix recovery. Particularly, RIST utilizes an efficient scheme that automatically estimate the potential rank on the real algebraic variety and tracks the favorable Riemannian submanifold. Moreover, RIST utilizes the second-order geometric characterization and achieves provable superlinear convergence, which is superior to the linear convergence of most existing methods. Extensive comparison experiments demonstrate the accuracy and ef- ficiency of RIST algorithm.

algorithm, artificial intelligence, rist, (16 more...)

AAAI Conferences

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback