AITopics | Gradient Descent

Considering the one-hidden-layer example above, this corresponds to learning linear predictors over a fixed representation (chosen obliviously and randomly at initialization).
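As a rough illustration of the statement above, the following minimal sketch freezes a randomly initialized hidden layer and trains only a linear predictor on top of it with gradient descent. The function name `train_random_feature_model`, the ReLU activation, the squared loss, and the step-size rule are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def train_random_feature_model(X, y, hidden=64, steps=500, seed=0):
    """Fit only the output layer of a one-hidden-layer ReLU network (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Hidden weights are drawn once, independently of the data, and never updated:
    # this is the fixed representation chosen at initialization.
    W = rng.normal(size=(d, hidden)) / np.sqrt(d)
    features = np.maximum(X @ W, 0.0)
    # Step size 1/L, where L is the smoothness constant of the squared loss
    # in the output weights (largest eigenvalue of features^T features / n).
    L = max(np.linalg.norm(features, 2) ** 2 / n, 1e-12)
    lr = 1.0 / L
    a = np.zeros(hidden)  # linear predictor over the fixed features
    losses = []
    for _ in range(steps):
        residual = features @ a - y
        losses.append(0.5 * np.mean(residual ** 2))
        a -= lr * (features.T @ residual) / n  # gradient step on the output layer only
    return W, a, losses

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5))
    y = np.sin(X[:, 0])
    _, _, losses = train_random_feature_model(X, y)
    print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

Because `W` never changes, the optimization over `a` is an ordinary convex least-squares problem, which is what makes this regime a linear method over a fixed feature map.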

artificial intelligence, machine learning, neural network, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.31)

Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization

Neural Information Processing Systems, Oct-2-2025, 18:08:50 GMT

Although some recent studies have proposed stochastic algorithms with fast convergence rates for min-max problems, they require additional assumptions about the problem, e.g.,

artificial intelligence, duality gap, machine learning, (14 more...)

Country: North America > United States (0.68)

Genre: Research Report (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.88)

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems, Oct-2-2025, 17:51:25 GMT

NIPS (Neural Information Processing Systems), 8-11 December 2014, Montreal, Canada. Paper ID: 1527. Title: "Delay-Tolerant Algorithms for Asynchronous Distributed Online Learning". This paper considers asynchronous parallel updates in stochastic gradient descent with delays. This is a very important problem in large-scale distributed data processing. The objective of the problem studied in this paper is to achieve regret bounds similar to the ones obtained by adaptive gradient (i.e. ...). This boils down to keeping track of updates to gradient coordinates.
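To make the notion of delayed updates concrete, here is a small single-process simulation in which each gradient step uses parameters that are a fixed number of iterations stale. It is only a sketch of the staleness effect, not the algorithm from the reviewed paper; the function name `delayed_gradient_descent`, the fixed delay, and the quadratic test objective are all assumptions for illustration.

```python
from collections import deque
import numpy as np

def delayed_gradient_descent(grad_fn, theta0, lr=0.01, delay=5, steps=2000):
    """Gradient descent where each gradient is evaluated at a stale iterate (hypothetical helper)."""
    theta = np.asarray(theta0, dtype=float).copy()
    # The buffer holds the last delay+1 iterates; history[0] is the oldest one.
    history = deque([theta.copy() for _ in range(delay + 1)], maxlen=delay + 1)
    for _ in range(steps):
        stale = history[0]                  # parameters from `delay` steps ago
        theta = theta - lr * grad_fn(stale)  # update applied to the current iterate
        history.append(theta.copy())         # oldest iterate is evicted automatically
    return theta

if __name__ == "__main__":
    curvature = np.array([1.0, 10.0])
    grad = lambda th: curvature * th         # gradient of 0.5 * sum(curvature * th**2)
    print(delayed_gradient_descent(grad, [1.0, -1.0]))
```

With `delay=0` this reduces to plain gradient descent. Larger delays generally require a smaller step size to remain stable, which is one reason delay-aware step-size rules are of interest.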

algorithm, author feedback and meta-review, (9 more...)

Country: North America > Canada > Quebec > Montreal (0.25)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.72)

A Detailed comparisons with related work

Neural Information Processing Systems, Oct-2-2025, 17:04:15 GMT

In Table 1, we compare our agnostic learning results. Our results in this setting come from Theorem 3.3. We note that the sample complexity for Diakonikolas et al. To prove Lemma 3.5, we use the following result of Yehudai and Shamir [35]. We first consider the case when σ satisfies Assumption 3.1.

artificial intelligence, machine learning, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.31)

Agnostic Learning of a Single Neuron with Gradient Descent

Neural Information Processing Systems, Oct-2-2025, 17:04:08 GMT

We assume we have access to a set of i.i.d.

artificial intelligence, gradient descent, machine learning, (15 more...)

Country: North America > United States > California > Los Angeles County > Los Angeles (0.29)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Understanding the Role of Momentum in Stochastic Gradient Methods

Igor Gitman, Hunter Lang, Pengchuan Zhang, Lin Xiao

Neural Information Processing Systems, Oct-2-2025, 17:02:50 GMT

The use of momentum in stochastic gradient methods has become a widespread practice in machine learning. Different variants of momentum, including heavy-ball momentum, Nesterov's accelerated gradient (NAG), and quasi-hyperbolic momentum (QHM), have demonstrated success on various tasks.
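The momentum variants named above can be written as one update rule. The sketch below uses the commonly cited QHM form, g ← β·g + (1−β)·∇f(θ), θ ← θ − α·[(1−ν)·∇f(θ) + ν·g], under which ν = 0 gives plain gradient descent, ν = 1 gives a normalized heavy-ball step, and ν = β gives a normalized form of NAG. The function name `qhm_minimize`, the deterministic gradients, and the quadratic test problem are assumptions for this example; the paper's stochastic setting would replace `grad_fn` with a noisy gradient estimate.

```python
import numpy as np

def qhm_minimize(grad_fn, theta0, lr=0.1, beta=0.9, nu=0.7, steps=200):
    """Quasi-hyperbolic momentum on a differentiable objective (hypothetical helper)."""
    theta = np.asarray(theta0, dtype=float).copy()
    buf = np.zeros_like(theta)  # exponential moving average of gradients
    for _ in range(steps):
        g = grad_fn(theta)
        buf = beta * buf + (1.0 - beta) * g
        # Interpolate between the raw gradient (nu = 0) and the momentum buffer (nu = 1).
        theta = theta - lr * ((1.0 - nu) * g + nu * buf)
    return theta

if __name__ == "__main__":
    curvature = np.array([1.0, 10.0])
    grad = lambda th: curvature * th  # gradient of 0.5 * sum(curvature * th**2)
    for name, nu in [("SGD", 0.0), ("QHM", 0.7), ("NAG", 0.9), ("heavy-ball", 1.0)]:
        final = qhm_minimize(grad, [1.0, -1.0], nu=nu, steps=500)
        print(name, np.linalg.norm(final))
```

Exposing ν and β as separate parameters is what lets a single routine cover all three variants, so their behavior can be compared on the same problem.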

artificial intelligence, convergence rate, machine learning, (17 more...)

Country: