hessian-vector product
Reducing Reparameterization Gradient Variance
Andrew Miller, Nick Foti, Alexander D'Amour, Ryan P. Adams
Optimization with noisy gradients has become ubiquitous in statistics and machine learning. Reparameterization gradients, or gradient estimates computed via the "reparameterization trick," represent a class of noisy gradients often used in Monte Carlo variational inference (MCVI). However, when these gradient estimators are too noisy, the optimization procedure can be slow or fail to converge. One way to reduce noise is to generate more samples for the gradient estimate, but this can be computationally expensive. Instead, we view the noisy gradient as a random variable and form an inexpensive approximation of the generating procedure for the gradient sample. By construction, this approximation is highly correlated with the noisy gradient, making it a useful control variate for variance reduction. We demonstrate our approach on a non-conjugate hierarchical model and a Bayesian neural network, where our method achieves orders-of-magnitude (20-2,000x) reductions in gradient variance, resulting in faster and more stable optimization.
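To make the control-variate idea concrete, here is a minimal sketch in JAX for the mean parameter of a diagonal-Gaussian variational family. It assumes a first-order (delta-method) linearization of the gradient map around eps = 0 as the cheap approximation, computed with a Hessian-vector product; `log_joint` and all other names below are illustrative stand-ins, and the paper's full construction also handles the scale parameters and differs in its details.

```python
# Minimal sketch (not the paper's exact construction): variance reduction for
# the reparameterization gradient of the mean of a diagonal-Gaussian
# variational family, using a zero-mean linear control variate built from a
# Hessian-vector product.  `log_joint` is an illustrative stand-in target.
import jax
import jax.numpy as jnp

def log_joint(z):
    # Any differentiable log density works here; this one is just a stand-in.
    return -0.5 * jnp.sum(z ** 2) - 0.1 * jnp.sum(jnp.sin(z))

def rep_grad_mu(mu, sigma, eps):
    """Ordinary reparameterization gradient w.r.t. mu: grad_z log_joint(mu + sigma * eps)."""
    return jax.grad(log_joint)(mu + sigma * eps)

def rep_grad_mu_cv(mu, sigma, eps):
    """Variance-reduced estimate: subtract a zero-mean linear control variate.

    Linearizing the gradient map around eps = 0 gives
        g(eps) ~ grad log_joint(mu) + H(mu) (sigma * eps),
    and the second term has expectation zero under eps ~ N(0, I), so
    subtracting it keeps the estimator unbiased while cancelling most of
    the noise when the target is locally close to quadratic.
    """
    hvp = jax.jvp(jax.grad(log_joint), (mu,), (sigma * eps,))[1]  # H(mu) @ (sigma * eps)
    return rep_grad_mu(mu, sigma, eps) - hvp

key = jax.random.PRNGKey(0)
mu, sigma = jnp.zeros(5), 0.3 * jnp.ones(5)
eps = jax.random.normal(key, (5,))
print(rep_grad_mu(mu, sigma, eps))     # noisy estimate
print(rep_grad_mu_cv(mu, sigma, eps))  # same expectation, lower variance
```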
NEON2: Finding Local Minima via First-Order Oracles
We propose a reduction for non-convex optimization that can (1) turn a stationary-point-finding algorithm into a local-minimum-finding one, and (2) replace Hessian-vector product computations with only gradient computations. It works in both the stochastic and deterministic settings without hurting the algorithm's performance. As applications, our reduction turns Natasha2 into a first-order method without hurting its theoretical performance. It also converts SGD, GD, SCSG, and SVRG into algorithms that find approximate local minima, outperforming some of the best known results.
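The first-order primitive behind this kind of reduction is that a Hessian-vector product can be approximated by a finite difference of two gradient evaluations. The sketch below illustrates that identity in JAX with an illustrative objective `f`; NEON2 itself builds a negative-curvature search on top of this primitive, with a careful choice of the perturbation size and an error analysis not shown here.

```python
# Minimal sketch of the gradient-only primitive:
#     H(x) v  ~  (grad f(x + delta * v) - grad f(x)) / delta.
# `f` is an illustrative smooth non-convex objective, not one from the paper.
import jax
import jax.numpy as jnp

def f(x):
    return jnp.sum(jnp.cos(x)) + 0.5 * jnp.sum(x ** 2)

def hvp_from_gradients(x, v, delta=1e-4):
    """Approximate the Hessian-vector product H(x) v with two gradient calls."""
    g = jax.grad(f)
    return (g(x + delta * v) - g(x)) / delta

x = jnp.array([0.3, -1.2, 0.7])
v = jnp.array([1.0, 0.0, -1.0])
print(hvp_from_gradients(x, v))             # gradient-only approximation
print(jax.jvp(jax.grad(f), (x,), (v,))[1])  # exact HVP, for comparison
```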
A Algorithms
Below we include detailed pseudocode for the algorithms described in the main text.

Algorithm 2: Parameter-Free DeltaShift
Input: implicit matrix-vector multiplication access to A

In this section, we give a full proof of Theorem 1.1 with the correct logarithmic dependence. Before doing so, we collect several definitions and results required for proving the theorem. As discussed, a tight analysis of Hutchinson's estimator, and also of our DeltaShift algorithm, relies on these results; finally, from Claim B.2, we immediately have the claimed bound. Although stated for Rademacher random vectors, a similar analysis can be performed for any i.i.d. random vectors. Now we are ready to move on to the main result. The proof is by induction: we claim that the bound holds for all j = 1, ..., m, and then consider the inductive case.
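For reference, the basic primitive the passage above builds on is Hutchinson's trace estimator, which needs only implicit matrix-vector products with A. Below is a minimal sketch in JAX; `matvec`, the example matrix `A`, and the sample count are illustrative assumptions, and this is the plain estimator rather than the parameter-free DeltaShift algorithm itself.

```python
# Minimal sketch of Hutchinson's trace estimator with implicit matvec access:
#     tr(A) ~ (1/m) * sum_i x_i^T A x_i,   x_i i.i.d. Rademacher vectors.
# This is the basic estimator only, not the parameter-free DeltaShift algorithm.
import jax
import jax.numpy as jnp

def hutchinson_trace(matvec, dim, num_samples, key):
    """Average x^T A x over Rademacher vectors x, using only matvec access to A."""
    keys = jax.random.split(key, num_samples)
    def single_estimate(k):
        x = jax.random.rademacher(k, (dim,), dtype=jnp.float32)
        return x @ matvec(x)
    return jnp.mean(jax.vmap(single_estimate)(keys))

A = jnp.diag(jnp.arange(1.0, 11.0))  # illustrative matrix with trace 55
estimate = hutchinson_trace(lambda x: A @ x, dim=10, num_samples=500,
                            key=jax.random.PRNGKey(0))
print(estimate)  # close to 55
```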