AITopics | Gradient Descent

Gradient Descent

Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study

Neural Information Processing Systems | Oct-2-2025, 23:38:56 GMT

We revisit this paradigm in arguably the simplest non-trivial setup, and study the implicit bias of Stochastic Gradient Descent (SGD) in the context of Stochastic Convex Optimization.

artificial intelligence, machine learning, regularizer, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Poincaré Recurrence, Cycles and Spurious Equilibria in Gradient-Descent-Ascent for Non-Convex Non-Concave Zero-Sum Games

Emmanouil-Vasileios Vlatakis-Gkaragkounis, Lampros Flokas, Georgios Piliouras

Neural Information Processing Systems | Oct-2-2025, 22:47:10 GMT

We study a wide class of non-convex non-concave min-max games that generalizes over standard bilinear zero-sum games.
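As a point of reference for the bilinear zero-sum games mentioned above, here is a minimal sketch of simultaneous gradient descent-ascent on the textbook game f(x, y) = x * y. The function name, step size and starting point are illustrative assumptions and are not taken from the paper. In continuous time the dynamics x' = -y, y' = x trace a closed circle around the equilibrium (0, 0); with a finite step size the discrete iterates spiral outward instead, and the squared distance to the equilibrium grows by a factor of exactly 1 + lr**2 per step.

```python
import math

def gda_bilinear(x, y, lr=0.1, steps=100):
    """Simultaneous gradient descent-ascent on f(x, y) = x * y.

    x is the minimizing player and y the maximizing player.
    Returns the final point and the distance to (0, 0) after every step.
    """
    radii = [math.hypot(x, y)]
    for _ in range(steps):
        grad_x, grad_y = y, x          # df/dx = y, df/dy = x
        # Both players move at once, using the same (old) iterate.
        x, y = x - lr * grad_x, y + lr * grad_y
        radii.append(math.hypot(x, y))
    return x, y, radii

x, y, radii = gda_bilinear(1.0, 0.0)
print(radii[0], radii[-1])  # the distance to the equilibrium keeps growing
```

Neither player converges here even though the game has a unique equilibrium, which is the kind of non-convergent behavior that motivates studying cycles and recurrence in these dynamics.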

artificial intelligence, machine learning, optimization problem, (19 more...)

Country:

Europe (1.00)
Asia (0.93)
North America > United States > California > Los Angeles County (0.28)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
(2 more...)

A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems

Neural Information Processing Systems | Oct-2-2025, 22:37:34 GMT

Nonconvex-concave min-max problems arise in many machine learning applications, including minimizing a pointwise maximum of a set of nonconvex functions and robust adversarial training of neural networks. A popular approach to solving such problems is the gradient descent-ascent (GDA) algorithm, which unfortunately can oscillate when the objective is nonconvex. In this paper, we introduce a "smoothing" scheme which can be combined with GDA to stabilize the oscillation and ensure convergence to a stationary solution.
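The abstract does not spell out the smoothing scheme, so the following is only a sketch of one plausible form, not the authors' exact algorithm. It assumes an auxiliary anchor variable z, a proximal term (p / 2) * (x - z)**2 added to the objective in the x-update, and a slowly moving average for z. The names smoothed_gda_step, p, c, alpha, beta and y_bounds are all hypothetical, and the scalar setting with a box constraint on y is chosen purely for illustration.

```python
def clip(value, low, high):
    """Project a scalar onto the interval [low, high]."""
    return max(low, min(high, value))

def smoothed_gda_step(x, y, z, grad_x, grad_y,
                      p=1.0, c=0.1, alpha=0.1, beta=0.5,
                      y_bounds=(-1.0, 1.0)):
    """One step of GDA with an assumed proximal smoothing term.

    grad_x(x, y) and grad_y(x, y) return the partial derivatives of f.
    """
    # Descent on f(x, y) + (p / 2) * (x - z)**2, which pulls x toward z.
    x_new = x - c * (grad_x(x, y) + p * (x - z))
    # Projected ascent step for the maximizing variable.
    y_new = clip(y + alpha * grad_y(x_new, y), *y_bounds)
    # The anchor z slowly tracks x (exponential averaging).
    z_new = z + beta * (x_new - z)
    return x_new, y_new, z_new

# Usage on the toy objective f(x, y) = x * y.
state = (1.0, 0.5, 0.0)
state = smoothed_gda_step(*state, grad_x=lambda x, y: y, grad_y=lambda x, y: x)
print(state)
```

The design intent of such a scheme is that the proximal term damps large swings in x between consecutive steps; whether a particular choice of p, c, alpha and beta stabilizes a given problem depends on conditions this sketch does not check.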

algorithm, artificial intelligence, machine learning, (11 more...)

Country:

North America > United States (0.28)
Asia > China > Guangdong Province (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems | Oct-2-2025, 21:08:01 GMT

First provide a summary of the paper, and then address the following criteria: quality, clarity, originality and significance. The paper proposes a supervised learning algorithm. It uses stochastic gradient descent and periodically expands the hypothesis space by introducing new basis functions and adding corresponding components to the weight vector. As a result, it fits more complex models as it processes more data. The hypothesis space considered here consists of polynomials, and higher-order monomials are gradually introduced into the model. The concept of growing the hypothesis space as more data arrives is not new (training kernel methods with SGD exhibits this behavior), but in the proposed method, choosing which monomials to add to the hypothesis space is very cheap.
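The general idea in this review (SGD whose weight vector grows as new basis functions are introduced) can be sketched as follows. This is a simplified illustration under stated assumptions, not the reviewed paper's algorithm: the input is one-dimensional, the loss is squared error, and the "next monomial" rule simply raises the polynomial degree by one on a fixed schedule, whereas the paper's selection rule for monomials is not described in the review. All function and parameter names are hypothetical.

```python
import random

def monomials(x, degree):
    """Feature vector [1, x, x**2, ..., x**degree]."""
    return [x ** k for k in range(degree + 1)]

def predict(w, x):
    """Polynomial prediction; the degree is implied by len(w)."""
    return sum(wk * fk for wk, fk in zip(w, monomials(x, len(w) - 1)))

def sgd_growing_basis(data, n_steps=1000, lr=0.05, expand_every=200,
                      max_degree=3, seed=0):
    """SGD on squared error with a periodically expanded monomial basis."""
    rng = random.Random(seed)
    w = [0.0]  # start with the constant model
    for t in range(n_steps):
        # Periodically add the next monomial with a zero weight, so the
        # current predictions are unchanged at the moment of expansion.
        if t > 0 and t % expand_every == 0 and len(w) <= max_degree:
            w.append(0.0)
        x, y = rng.choice(data)
        err = predict(w, x) - y
        feats = monomials(x, len(w) - 1)
        w = [wk - lr * err * fk for wk, fk in zip(w, feats)]
    return w

# Usage: fit y = x**2 on a grid in [-1, 1]; the model ends at degree 3.
data = [(i / 10, (i / 10) ** 2) for i in range(-10, 11)]
print(sgd_growing_basis(data))
```

Initializing each new weight at zero keeps the model function continuous across an expansion, so earlier progress is not discarded when the hypothesis space grows.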

algorithm, hypothesis space, monomial, (11 more...)

Country: North America > Canada > Quebec > Montreal (0.05)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

between the correctness of autodiff systems and that of applications (e.g., gradient descent) built upon autodiff systems?

Neural Information Processing Systems | Oct-2-2025, 20:46:42 GMT

We thank the reviewers for their constructive and inspiring feedback. As we cannot see R2 (i.e., Reviewer #2), we respond to the reviews by R1, R3, and R4 only. The correctness of autodiff systems defined in the paper could be misleading to practitioners. We agree with the reviewers' points that (i) the correctness of the applications built upon autodiff systems is as important ... Also, we do not claim that our correctness condition is "the" ... Rather, we are just suggesting "a" correctness condition that can serve as a reasonable (possibly minimal) ... We will clarify this limitation in the revised version of the paper. Here are detailed responses to point (ii) on the applications mentioned in the reviews.

autodiff system, correctness, gradient descent, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.45)

Deep Leakage from Gradients

Ligeng Zhu, Zhijian Liu, Song Han

Neural Information Processing Systems | Oct-2-2025, 20:37:06 GMT

Distributed training becomes necessary to speed up training on large-scale datasets.

artificial intelligence, machine learning, natural language, (14 more...)

Country: North America > United States > Massachusetts (0.28)

Industry:

Health & Medicine (0.94)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)

Stein Variational Gradient Descent with Matrix-Valued Kernels

Neural Information Processing Systems | Oct-2-2025, 19:57:47 GMT

On the other hand, standard SVGD uses only first-order gradient information and cannot leverage the advantages of second-order methods, such as Newton's method and natural gradient, to achieve better performance on challenging problems with complex loss landscapes or domains.
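For context, the standard (scalar-kernel) SVGD update that this excerpt contrasts against can be sketched as below. It is a minimal one-dimensional illustration with an RBF kernel of fixed bandwidth; the function name and the step and bandwidth defaults are assumptions, and the matrix-valued kernels proposed in the paper are not shown. Each particle moves along the average of a kernel-weighted score term, which uses only first-order information about the target density, and a kernel-gradient term that pushes particles apart.

```python
import math

def svgd_step(particles, score, step=0.1, bandwidth=1.0):
    """One standard SVGD update for scalar particles.

    score(x) is the derivative of the log target density at x.
    """
    n = len(particles)
    h2 = bandwidth ** 2
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-(xj - xi) ** 2 / (2.0 * h2))  # RBF kernel k(xj, xi)
            grad_k = -(xj - xi) / h2 * k                # d k(xj, xi) / d xj
            # Attraction toward high density plus repulsion between particles.
            phi += k * score(xj) + grad_k
        updated.append(xi + step * phi / n)
    return updated

# Usage: target is a standard normal, whose score is -x.
particles = [-0.1, 0.1]
print(svgd_step(particles, score=lambda x: -x))
```

With a single particle the repulsion term vanishes and the update reduces to plain gradient ascent on the log density, which makes the first-order nature of the method explicit.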

artificial intelligence, machine learning, svgd, (13 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)

Minimal Variance Sampling in Stochastic Gradient Boosting

Bulat Ibragimov, Gleb Gusev

Neural Information Processing Systems | Oct-2-2025, 19:38:49 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, artificial intelligence, machine learning, (18 more...)

Country: Europe > Russia (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.51)

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems | Oct-2-2025, 19:21:31 GMT

First provide a summary of the paper, and then address the following criteria: quality, clarity, originality and significance. This paper seems to essentially combine three published ideas: accelerated gradient, the stochastic gradient variance reduction technique of Johnson and Zhang, and variance reduction via minibatching. Hence, on a conceptual level at least, it's a fairly incremental paper (I don't want to minimize the effort that may have gone into developing the convergence proof). With this said, it's well-done, mostly well-written, and has good theoretical and experimental results. In terms of quality, originality and significance, it's as I said above: they're combining pre-existing ideas, but doing it well, and they include a convergence proof with a slightly improved rate over the competition.
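Two of the three ingredients named in this review, the variance reduction technique of Johnson and Zhang (SVRG) and minibatching, can be sketched together as below. The acceleration component is deliberately left out, so this is not the reviewed paper's method. The setting is an assumed one-dimensional least-squares problem with per-example loss 0.5 * (a_i * w - b_i)**2, and all names and default values are illustrative.

```python
import random

def minibatch_grad(w, a, b, idx):
    """Average gradient of 0.5 * (a_i * w - b_i)**2 over the indices idx."""
    return sum(a[i] * (a[i] * w - b[i]) for i in idx) / len(idx)

def svrg(a, b, w=0.0, lr=0.05, epochs=30, inner_steps=10,
         batch_size=2, seed=0):
    """Minibatch SVRG on a scalar least-squares problem."""
    rng = random.Random(seed)
    n = len(a)
    for _ in range(epochs):
        # Snapshot the iterate and compute the full gradient once per epoch.
        w_snap = w
        full_grad = minibatch_grad(w_snap, a, b, range(n))
        for _ in range(inner_steps):
            idx = rng.sample(range(n), batch_size)
            # Variance-reduced estimate: the minibatch gradient at w,
            # corrected by the same minibatch's gradient at the snapshot.
            g = (minibatch_grad(w, a, b, idx)
                 - minibatch_grad(w_snap, a, b, idx)
                 + full_grad)
            w -= lr * g
    return w

# Usage: the data satisfy b_i = 2 * a_i, so the minimizer is w = 2.
print(svrg([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

The correction term has zero mean, so the estimate stays unbiased, and its variance shrinks as the iterate and the snapshot both approach the minimizer, which is what allows a constant step size.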

algorithm, experiment, originality and significance, (11 more...)

Country: North America > Canada > Quebec > Montreal (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.39)
