second-order information
A Stein variational Newton method
Stein variational gradient descent (SVGD) was recently proposed as a general-purpose nonparametric variational inference algorithm: it minimizes the Kullback-Leibler divergence between the target distribution and its approximation by implementing a form of functional gradient descent on a reproducing kernel Hilbert space [Liu & Wang, NIPS 2016]. In this paper, we accelerate and generalize the SVGD algorithm by including second-order information, thereby approximating a Newton-like iteration in function space. We also show how second-order information can lead to more effective choices of kernel. We observe significant computational gains over the original SVGD algorithm in multiple test cases.
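For context, the baseline SVGD update of Liu & Wang moves each particle along the kernelized Stein direction; the Newton-like variant described above preconditions this direction with (approximate) second-order information. The first-order update, in the notation of Liu & Wang rather than this paper, is

$$x_i \leftarrow x_i + \epsilon\,\hat\phi^*(x_i), \qquad \hat\phi^*(x) = \frac{1}{n}\sum_{j=1}^{n}\Big[k(x_j, x)\,\nabla_{x_j}\log p(x_j) + \nabla_{x_j} k(x_j, x)\Big],$$

where $p$ is the target density, $k$ the reproducing kernel, $n$ the number of particles, and $\epsilon$ a step size.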
Limitations of the empirical Fisher approximation for natural gradient descent
Natural gradient descent, which preconditions a gradient descent update with the Fisher information matrix of the underlying statistical model, is a way to capture partial second-order information. Several highly visible works have advocated an approximation known as the empirical Fisher, drawing connections between approximate second-order methods and heuristics like Adam. We dispute this argument by showing that the empirical Fisher---unlike the Fisher---does not generally capture second-order information. We further argue that the conditions under which the empirical Fisher approaches the Fisher (and the Hessian) are unlikely to be met in practice, and that, even on simple optimization problems, the pathologies of the empirical Fisher can have undesirable effects.
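To make the distinction concrete, consider a conditional model $p_\theta(y \mid x)$ and data $\{(x_n, y_n)\}_{n=1}^N$. In the standard definitions (a paraphrase for orientation, not taken verbatim from the paper), the Fisher averages outer products of score vectors over the model's own predictive distribution, while the empirical Fisher plugs in the observed labels:

$$F(\theta) = \frac{1}{N}\sum_{n=1}^{N} \mathbb{E}_{y\sim p_\theta(\cdot\mid x_n)}\!\left[\nabla_\theta\log p_\theta(y\mid x_n)\,\nabla_\theta\log p_\theta(y\mid x_n)^{\top}\right], \qquad \tilde F(\theta) = \frac{1}{N}\sum_{n=1}^{N} \nabla_\theta\log p_\theta(y_n\mid x_n)\,\nabla_\theta\log p_\theta(y_n\mid x_n)^{\top}.$$

The two quantities approach each other only when the model's predictive distribution matches that of the observed labels, which is the kind of condition the abstract argues is unlikely to hold in practice.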
Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms
When training neural networks with custom objectives, such as ranking losses and shortest-path losses, a common problem is that they are, per se, non-differentiable. A popular approach is to continuously relax the objectives to provide gradients, enabling learning. However, such differentiable relaxations are often non-convex and can exhibit vanishing and exploding gradients, making them (already in isolation) hard to optimize. In this setting, the loss function itself becomes the bottleneck when training a deep neural network.
M-FAC: Efficient Matrix-Free Approximations of Second-Order Information
Efficiently approximating local curvature information of the loss function is a useful tool for the optimization and compression of deep neural networks. Yet, most existing methods to approximate second-order information have high computational or storage costs, limiting their practicality. In this work, we investigate matrix-free approaches for estimating Inverse-Hessian Vector Products (IHVPs) for the case when the Hessian can be approximated as a sum of rank-one matrices, as in the classic approximation of the Hessian by the empirical Fisher matrix. The first algorithm we propose is tailored towards network compression and can compute the IHVP for dimension $d$ given a fixed set of $m$ rank-one matrices using $O(dm^2)$ precomputation, $O(dm)$ cost for computing the IHVP and query cost $O(m)$ for computing any single element of the inverse Hessian approximation. The second algorithm targets an optimization setting, where we wish to compute the product between the inverse Hessian, estimated over a sliding window of optimization steps, and a given gradient direction. We give an algorithm with cost $O(dm + m^2)$ for computing the IHVP and $O(dm + m^3)$ for adding or removing any gradient from the sliding window. We show that both algorithms yield competitive results for network pruning and optimization, respectively, with significantly lower computational overhead relative to existing second-order methods.
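As a rough illustration of the rank-one setting (a minimal sketch under assumed notation, not the paper's M-FAC algorithms): if the Hessian is approximated as $\lambda I + \frac{1}{m}\sum_{i=1}^{m} g_i g_i^\top$ for gradient vectors $g_i$, an IHVP can be built up one rank-one term at a time with the Sherman-Morrison identity, at $O(dm^2)$ total cost.

```python
import numpy as np

def ihvp_sherman_morrison(grads, v, lam=1e-3):
    """Sketch: compute H^{-1} v for H = lam*I + (1/m) * sum_i g_i g_i^T.

    grads: list of m gradient vectors, each of shape (d,).
    v: query vector of shape (d,).
    Not the paper's method; a naive recursive application of Sherman-Morrison.
    """
    m = len(grads)
    coef = 1.0 / m                       # weight of each rank-one term
    inv_v = v / lam                      # (lam*I)^{-1} v
    inv_g = [g / lam for g in grads]     # (lam*I)^{-1} g_j for every j
    for i, g in enumerate(grads):
        # inv_g[i] currently holds H_{i-1}^{-1} g_i, where H_{i-1} includes
        # the first i-1 rank-one terms.
        denom = 1.0 + coef * (g @ inv_g[i])
        # Sherman-Morrison update of H^{-1} v after adding coef * g g^T.
        inv_v = inv_v - coef * (g @ inv_v) / denom * inv_g[i]
        # Keep the remaining inv_g[j] consistent with the updated inverse.
        for j in range(i + 1, m):
            inv_g[j] = inv_g[j] - coef * (g @ inv_g[j]) / denom * inv_g[i]
    return inv_v
```

Caching the intermediate $H_{i-1}^{-1} g_i$ vectors and denominators from one such $O(dm^2)$ pass would let each later IHVP query be answered in $O(dm)$, which is consistent with the costs quoted in the abstract, though the paper's algorithms differ in the details and add support for a sliding window of gradients.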