AITopics | rgd

7a2b33c672ce223b2aa5789171ddde2f-Paper.pdf

Neural Information Processing Systems, Feb-12-2026, 16:31:15 GMT

algorithm, descent, gradient descent, (15 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.45)


c4b108f53550f1d5967305a9a8140ddd-Paper.pdf

Neural Information Processing Systems, Feb-10-2026, 05:43:32 GMT

Here we study structure-preserving discretizations for a certain class of dissipative (conformal) Hamiltonian systems, allowing us to analyze the symplectic structure of both Nesterov and heavy ball, besides providing several new insights into these methods. Moreover, we propose a new algorithm based on a dissipative relativistic system that normalizes the momentum and may result in more stable/faster optimization.

artificial intelligence, integrator, optimization, (18 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Russia (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence (0.69)


c4b108f53550f1d5967305a9a8140ddd-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-10-2026, 05:26:40 GMT

algorithm, optimization, supplement, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Software > Programming Languages (0.35)
Information Technology > Artificial Intelligence (0.30)


7a2b33c672ce223b2aa5789171ddde2f-Paper.pdf

Neural Information Processing Systems, Oct-3-2025, 01:33:11 GMT

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Conformal Symplectic and Relativistic Optimization

Guilherme França

Neural Information Processing Systems, Aug-16-2025, 07:46:58 GMT

Arguably, the two most popular accelerated or momentum-based optimization methods are Nesterov's accelerated gradient and Polyak's heavy ball, both corresponding to different discretizations of a particular second-order differential equation with a friction term. Such connections with continuous-time dynamical systems have been instrumental in demystifying acceleration phenomena in optimization. Here we study structure-preserving discretizations for a certain class of dissipative (conformal) Hamiltonian systems, allowing us to analyze the symplectic structure of both Nesterov and heavy ball, besides providing several new insights into these methods. Moreover, we propose a new algorithm based on a dissipative relativistic system that normalizes the momentum and may result in more stable/faster optimization. Importantly, such a method generalizes both Nesterov and heavy ball, each being recovered as distinct limiting cases, and has potential advantages at no additional cost.
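For orientation, the two classical updates named in this abstract can be written in their common textbook momentum form. The sketch below is a minimal illustration in Python with NumPy. It is not the conformal symplectic integrators or the relativistic method proposed in the paper. The quadratic test function, the step size `eps`, the momentum factor `mu`, and all function names are assumptions chosen for the example.

```python
import numpy as np

def heavy_ball(grad, x0, eps, mu, steps):
    """Polyak's heavy ball: the gradient is evaluated at the current iterate."""
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        v = mu * v - eps * grad(x)
        x = x + v
    return x

def nesterov(grad, x0, eps, mu, steps):
    """Nesterov's method: the gradient is evaluated at the look-ahead point."""
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        v = mu * v - eps * grad(x + mu * v)
        x = x + v
    return x

def gradient_descent(grad, x0, eps, steps):
    """Plain gradient descent, included only as a baseline."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x = x - eps * grad(x)
    return x

# Ill-conditioned quadratic f(x) = 0.5 * x^T A x (illustrative choice).
A = np.diag([1.0, 100.0])
grad_f = lambda x: A @ x
x0 = np.array([1.0, 1.0])

x_hb = heavy_ball(grad_f, x0, eps=0.01, mu=0.9, steps=500)
x_nag = nesterov(grad_f, x0, eps=0.01, mu=0.9, steps=500)
x_gd = gradient_descent(grad_f, x0, eps=0.01, steps=500)
```

The only difference between the two momentum variants in this form is where the gradient is evaluated. On this quadratic, with the same step size, both end much closer to the minimizer at the origin than plain gradient descent does.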

artificial intelligence, machine learning, optimization, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > Canada (0.04)
Europe > Russia (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)


c4b108f53550f1d5967305a9a8140ddd-AuthorFeedback.pdf

Neural Information Processing Systems, Aug-16-2025, 07:46:47 GMT

artificial intelligence, optimization, supplement, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.30)


Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery

Qin, Zhen, Wakin, Michael B., Zhu, Zhihui

arXiv.org Machine Learning, Jan-4-2024

In this paper, we provide the first convergence guarantee for the factorization approach. Specifically, to avoid the scaling ambiguity and to facilitate theoretical analysis, we optimize over the so-called left-orthogonal TT format which enforces orthonormality among most of the factors. To ensure the orthonormal structure, we utilize the Riemannian gradient descent (RGD) for optimizing those factors over the Stiefel manifold. We first delve into the TT factorization problem and establish the local linear convergence of RGD. Notably, the rate of convergence only experiences a linear decline as the tensor order increases. We then study the sensing problem that aims to recover a TT format tensor from linear measurements. Assuming the sensing operator satisfies the restricted isometry property (RIP), we show that with a proper initialization, which could be obtained through spectral initialization, RGD also converges to the ground-truth tensor at a linear rate. Furthermore, we expand our analysis to encompass scenarios involving Gaussian noise in the measurements. We prove that RGD can reliably recover the ground truth at a linear rate, with the recovery error exhibiting only polynomial growth in relation to the tensor order. We conduct various experiments to validate our theoretical findings.
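The core step this abstract refers to, Riemannian gradient descent over the Stiefel manifold, can be sketched on a much simpler problem than tensor train recovery. The example below is a minimal illustration, assuming the embedded (Euclidean) metric and a QR-based retraction. It is not the authors' algorithm, which optimizes TT factors. The objective (finding a dominant eigenspace), the step size, and every function name are assumptions made for the example.

```python
import numpy as np

def sym(M):
    """Symmetric part of a square matrix."""
    return 0.5 * (M + M.T)

def riemannian_gd_stiefel(A, r, step=0.1, iters=500, seed=0):
    """Minimize f(X) = -0.5 * trace(X^T A X) subject to X^T X = I."""
    rng = np.random.default_rng(seed)
    # Random starting point with orthonormal columns.
    X, _ = np.linalg.qr(rng.standard_normal((A.shape[0], r)))
    for _ in range(iters):
        G = -A @ X                              # Euclidean gradient
        rgrad = G - X @ sym(X.T @ G)            # project onto the tangent space at X
        Q, R = np.linalg.qr(X - step * rgrad)   # QR retraction back to the manifold
        signs = np.sign(np.diag(R))
        signs[signs == 0] = 1.0
        X = Q * signs                           # fix column signs so the retraction is well defined
    return X

# Symmetric test matrix with known eigenvalues (illustrative choice).
A = np.diag([5.0, 4.0, 1.0, 0.5, 0.2, 0.1])
X = riemannian_gd_stiefel(A, r=2)
captured = np.trace(X.T @ A @ X)   # approaches 5 + 4, the sum of the two largest eigenvalues
```

Each iterate keeps orthonormal columns by construction, which is the property the left-orthogonal TT format relies on.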

artificial intelligence, machine learning, tensor, (16 more...)

arXiv.org Machine Learning

2401.02592

Country:

North America > United States > Colorado > Jefferson County > Golden (0.14)
North America > United States > Ohio > Franklin County > Columbus (0.04)
Europe > Italy (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)


Rethinking PGD Attack: Is Sign Function Necessary?

Yang, Junjie, Chen, Tianlong, Chen, Xuxi, Wang, Zhangyang, Liang, Yingbin

arXiv.org Machine Learning, Dec-2-2023

Neural networks have demonstrated success in various domains, yet their performance can be significantly degraded by even a small input perturbation. Consequently, the construction of such perturbations, known as adversarial attacks, has gained significant attention, many of which fall within "white-box" scenarios where we have full access to the neural network. Existing attack algorithms, such as the projected gradient descent (PGD), commonly take the sign function on the raw gradient before updating adversarial inputs, thereby neglecting gradient magnitude information. In this paper, we present a theoretical analysis of how such a sign-based update algorithm influences step-wise attack performance, as well as its caveats. We also interpret why previous attempts at directly using raw gradients failed. Based on that, we further propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign. Specifically, we convert the constrained optimization problem into an unconstrained one, by introducing a new hidden variable of non-clipped perturbation that can move beyond the constraint. The effectiveness of the proposed RGD algorithm has been demonstrated extensively in experiments, outperforming PGD and other competitors in various settings, without incurring any additional computational overhead. The code is available at https://github.com/JunjieYang97/RGD.
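The contrast described in this abstract can be illustrated on a toy model. The sketch below compares a sign-based PGD step with a raw-gradient update that keeps a hidden, non-clipped variable and clips only when the model is queried. It is a loose reading of the abstract, not the authors' implementation (the linked repository is the reference for that). The linear logistic classifier, the L-infinity budget `eps`, the step size `alpha`, and all names are assumptions for the example.

```python
import numpy as np

def loss_and_grad(x, y, w):
    """Logistic loss of a linear classifier and its gradient w.r.t. the input x."""
    margin = y * (w @ x)
    loss = np.logaddexp(0.0, -margin)
    grad = -y * w / (1.0 + np.exp(margin))
    return loss, grad

def pgd_sign(x, y, w, eps, alpha, steps):
    """Sign-based PGD: ascend along sign(gradient), then clip to the eps-ball."""
    x_adv = x.copy()
    for _ in range(steps):
        _, g = loss_and_grad(x_adv, y, w)
        x_adv = np.clip(x_adv + alpha * np.sign(g), x - eps, x + eps)
    return x_adv

def raw_gradient_attack(x, y, w, eps, alpha, steps):
    """Raw-gradient ascent on a hidden variable z that may leave the eps-ball."""
    z = x.copy()
    for _ in range(steps):
        x_adv = np.clip(z, x - eps, x + eps)   # clip only to evaluate the model
        _, g = loss_and_grad(x_adv, y, w)
        z = z + alpha * g                      # update keeps gradient magnitude
    return np.clip(z, x - eps, x + eps)

w = np.array([2.0, -1.0, 0.5])   # toy classifier weights
x = np.array([1.0, 0.0, 1.0])    # clean input, correctly classified as y = +1
y = 1.0
eps = 0.3

x_sign = pgd_sign(x, y, w, eps, alpha=0.05, steps=10)
x_raw = raw_gradient_attack(x, y, w, eps, alpha=1.0, steps=10)
loss_clean, _ = loss_and_grad(x, y, w)
loss_sign, _ = loss_and_grad(x_sign, y, w)
loss_raw, _ = loss_and_grad(x_raw, y, w)
```

Both attacks return a point inside the allowed perturbation ball. They differ in whether the update direction retains the gradient's magnitude.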

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Machine Learning

2312.0126

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Ohio (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)


Convergence Analysis for Learning Orthonormal Deep Linear Neural Networks

Qin, Zhen, Tan, Xuwei, Zhu, Zhihui

arXiv.org Artificial Intelligence, Nov-24-2023

Enforcing orthonormal or isometric property for the weight matrices has been shown to enhance the training of deep neural networks by mitigating gradient exploding/vanishing and increasing the robustness of the learned networks. However, despite its practical performance, the theoretical analysis of orthonormality in neural networks is still lacking; for example, how orthonormality affects the convergence of the training process. In this letter, we aim to bridge this gap by providing convergence analysis for training orthonormal deep linear neural networks. Specifically, we show that Riemannian gradient descent with an appropriate initialization converges at a linear rate for training orthonormal deep linear neural networks with a class of loss functions. Unlike existing works that enforce orthonormal weight matrices for all the layers, our approach excludes this requirement for one layer, which is crucial to establish the convergence guarantee. Our results shed light on how increasing the number of hidden layers can impact the convergence speed. Experimental results validate our theoretical analysis.

artificial intelligence, machine learning, neural network, (16 more...)

arXiv.org Artificial Intelligence

2311.14658

Country: North America > United States > Ohio (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)


Robust empirical risk minimization via Newton's method

Ioannou, Eirini, Pydi, Muni Sreenivas, Loh, Po-Ling

arXiv.org Artificial Intelligence, Jul-17-2023

A new variant of Newton's method for empirical risk minimization is studied, where at each iteration of the optimization algorithm, the gradient and Hessian of the objective function are replaced by robust estimators taken from existing literature on robust mean estimation for multivariate data. After proving a general theorem about the convergence of successive iterates to a small ball around the population-level minimizer, consequences of the theory in generalized linear models are studied when data are generated from Huber's epsilon-contamination model and/or heavy-tailed distributions. An algorithm for obtaining robust Newton directions based on the conjugate gradient method is also proposed, which may be more appropriate for high-dimensional settings, and conjectures about the convergence of the resulting algorithm are offered. Compared to robust gradient descent, the proposed algorithm enjoys the faster rates of convergence for successive iterates often achieved by second-order algorithms for convex problems, i.e., quadratic convergence in a neighborhood of the optimum, with a stepsize that may be chosen adaptively via backtracking linesearch.
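The idea of replacing the gradient and Hessian by robust mean estimates can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: a coordinate-wise median-of-means estimator stands in for the robust estimators, the loss is plain squared error rather than a general GLM, and a unit Newton step replaces the backtracking linesearch. All names and the synthetic data are hypothetical.

```python
import numpy as np

def median_of_means(samples, n_blocks):
    """Coordinate-wise median of block means along axis 0."""
    blocks = np.array_split(samples, n_blocks, axis=0)
    means = np.stack([b.mean(axis=0) for b in blocks])
    return np.median(means, axis=0)

def robust_newton_least_squares(X, y, n_blocks=20, iters=20, ridge=1e-8):
    """Newton iterations with robustly aggregated per-sample gradients and Hessians."""
    n, d = X.shape
    w = np.zeros(d)
    # Per-sample Hessians of 0.5 * (x.w - y)^2 are x x^T and do not depend on w.
    H = median_of_means(X[:, :, None] * X[:, None, :], n_blocks)
    H = H + ridge * np.eye(d)   # small ridge as a safeguard (assumption)
    for _ in range(iters):
        residual = X @ w - y
        g = median_of_means(residual[:, None] * X, n_blocks)   # robust gradient
        w = w - np.linalg.solve(H, g)                          # unit Newton step
    return w

# Synthetic regression data with a few gross outliers (illustrative only).
rng = np.random.default_rng(0)
n, d = 2000, 3
w_true = np.array([1.0, -2.0, 0.5])
X = rng.standard_normal((n, d))
y = X @ w_true + 0.1 * rng.standard_normal(n)
X[:5] = 1.0       # five corrupted covariate rows
y[:5] = 1000.0    # with wildly wrong responses

w_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # non-robust baseline
w_rob = robust_newton_least_squares(X, y)
```

In this toy setup the outliers fall into a single block, so the median across blocks ignores them, while ordinary least squares is pulled far from `w_true`.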

artificial intelligence, inequality, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2301.13192

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)
