AITopics | online stochastic gradient descent

Distributionally Time-Varying Online Stochastic Optimization under Polyak-Łojasiewicz Condition with Application in Conditional Value-at-Risk Statistical Learning

Pun, Yuen-Man, Farokhi, Farhad, Shames, Iman

arXiv.org Artificial Intelligence, Sep-17-2023

In this work, we consider a sequence of stochastic optimization problems following a time-varying distribution via the lens of online optimization. Assuming that the loss function satisfies the Polyak-Łojasiewicz condition, we apply online stochastic gradient descent and establish its dynamic regret bound that is composed of cumulative distribution drifts and cumulative gradient biases caused by stochasticity. The distribution metric we adopt here is Wasserstein distance, which is well-defined without the absolute continuity assumption or with a time-varying support set. We also establish a regret bound of online stochastic proximal gradient descent when the objective function is regularized. Moreover, we show that the above framework can be applied to the Conditional Value-at-Risk (CVaR) learning problem. Particularly, we improve an existing proof on the discovery of the PL condition of the CVaR problem, resulting in a regret bound of online stochastic gradient descent.
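As an informal illustration of the setting described in this abstract (not the authors' algorithm or experiments), the sketch below runs online stochastic gradient descent on a toy loss 0.5 * (x - w)^2, where w is drawn from a Gaussian whose mean drifts by a fixed amount each round. The expected loss is strongly convex, so it satisfies the Polyak-Łojasiewicz condition. The function name, the Gaussian model, and all parameter values are assumptions chosen for the example.

```python
import random


def online_sgd_drifting_mean(steps=2000, step_size=0.1, drift=0.001,
                             noise_std=0.1, seed=0):
    """Track the minimizer of E[0.5 * (x - w)^2] while the mean of w drifts.

    Hypothetical toy setup: w ~ Normal(mean_t, noise_std^2), and
    mean_t grows by `drift` each round (the distribution drift).
    """
    rng = random.Random(seed)
    x = 0.0        # current iterate
    mean = 5.0     # current minimizer of the expected loss
    dynamic_regret = 0.0
    for _ in range(steps):
        # Gap between the expected loss at x and the round's optimum.
        dynamic_regret += 0.5 * (x - mean) ** 2
        sample = rng.gauss(mean, noise_std)   # one stochastic observation
        grad = x - sample                     # stochastic gradient at x
        x -= step_size * grad                 # online SGD update
        mean += drift                         # distribution moves on
    return x, mean, dynamic_regret


if __name__ == "__main__":
    x, mean, regret = online_sgd_drifting_mean()
    print(f"final iterate {x:.3f}, final target {mean:.3f}, "
          f"dynamic regret {regret:.2f}")
```

In this toy model, raising `drift` increases the accumulated regret, which loosely mirrors the dependence on cumulative distribution drift that the abstract describes.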

gradient descent, online stochastic gradient descent, online stochastic proximal gradient descent, (13 more...)

arXiv: 2309.09411

Country:

North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Europe > Spain > Aragón (0.04)

Genre: Research Report (0.82)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)


Poor starting points in machine learning

Tygert, Mark

arXiv.org Machine Learning, Feb-8-2016

In many settings, the method of Robbins and Monro (online stochastic gradient descent) is known to be optimal for good starting points, but may not be optimal for poor starting points -- indeed, for poor starting points Nesterov acceleration can help during the initial iterations, even though Nesterov methods not designed for stochastic approximation could hurt during later iterations. A good option is to roll off Nesterov acceleration for later iterations. The common practice of training with nontrivial minibatches enhances the advantage of Nesterov acceleration.
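One way to picture the "roll off" idea from this abstract is a momentum coefficient that starts high and decays to zero, after which the iteration reduces to plain stochastic gradient descent with a shrinking step size. The sketch below is a minimal illustration under assumed choices, not the schedule used in the paper: the linear decay in `momentum_schedule`, the step-size rule, the toy objective 0.5 * x^2 with additive Gaussian gradient noise, and all constants are hypothetical.

```python
import random


def momentum_schedule(t, mu0=0.9, rolloff_steps=200):
    """Hypothetical linear roll-off: momentum mu0 at t=0, zero from rolloff_steps on."""
    return mu0 * max(0.0, 1.0 - t / rolloff_steps)


def sgd_with_rolled_off_nesterov(start=100.0, steps=1000, lr=0.05,
                                 noise_std=1.0, seed=0):
    """Minimize 0.5 * x^2 from a poor starting point using noisy gradients."""
    rng = random.Random(seed)
    x, v = start, 0.0
    for t in range(steps):
        mu = momentum_schedule(t)
        lookahead = x + mu * v                        # Nesterov look-ahead point
        grad = lookahead + rng.gauss(0.0, noise_std)  # noisy gradient of 0.5 * x^2
        step = lr / (1.0 + 0.01 * t)                  # slowly decaying step size
        v = mu * v - step * grad                      # velocity update
        x += v                                        # with mu == 0 this is plain SGD
    return x


if __name__ == "__main__":
    print(sgd_with_rolled_off_nesterov())
```

The early iterations benefit from momentum while the iterate is far from the optimum; once the momentum reaches zero, the update matches a Robbins-Monro style step with a decreasing step size.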

artificial intelligence, iteration, machine learning, (16 more...)

arXiv: 1602.02823

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.59)
