AITopics | katyusha

Collaborating Authors

katyusha

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Optimal Black-Box Reductions Between Optimization Objectives

Zeyuan Allen-Zhu, Elad Hazan

Neural Information Processing Systems | Mar-23-2026, 03:36:44 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reduction, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Industry: Transportation > Air (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Nonlinear Acceleration of Stochastic Algorithms

Neural Information Processing Systems | Nov-21-2025, 14:02:54 GMT

artificial intelligence, machine learning, optimization problem, (15 more...)

Neural Information Processing Systems

Country:

Europe > France > Île-de-France > Paris > Paris (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.36)

Boosting First-Order Methods by Shifting Objective: New Schemes with Faster Worst-Case Rates

Kaiwen Zhou, Anthony Man-Cho So, James Cheng

Department of Computer Science and Engineering

Neural Information Processing Systems | Aug-15-2025, 20:22:34 GMT

We then propose an algorithmic template for tackling the shifted objective, which can exploit such a condition.

complexity, interpolation condition, objective, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)
Europe > Russia (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Nonlinear Acceleration of Stochastic Algorithms

Neural Information Processing Systems | Oct-4-2024, 11:17:52 GMT

Extrapolation methods use the last few iterates of an optimization algorithm to produce a better estimate of the optimum. They were shown to achieve optimal convergence rates in a deterministic setting using simple gradient iterates. Here, we study extrapolation methods in a stochastic setting, where the iterates are produced by either a simple or an accelerated stochastic gradient algorithm. We first derive convergence bounds for arbitrary, potentially biased perturbations, then produce asymptotic bounds using the ratio between the variance of the noise and the accuracy of the current point. Finally, we apply this acceleration technique to stochastic algorithms such as SGD, SAGA, SVRG and Katyusha in different settings, and show significant performance gains.
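The extrapolation step described in this abstract can be illustrated with a small sketch. It is not the authors' code: it assumes a regularized least-squares combination of stored iterates, and the function names `extrapolate` and `gradient_descent_iterates`, the default `reg` value, and the quadratic test problem are all hypothetical choices made for illustration.

```python
import numpy as np


def extrapolate(iterates, reg=1e-8):
    """Combine stored iterates into one extrapolated point.

    iterates: sequence of k+2 points x_0, ..., x_{k+1}, each of dimension d.
    reg: relative regularization weight (hypothetical default).
    """
    X = np.asarray(iterates, dtype=float)
    R = X[1:] - X[:-1]                 # residuals r_i = x_{i+1} - x_i
    G = R @ R.T                        # Gram matrix of the residuals
    scale = np.linalg.norm(G, 2)
    if scale == 0.0:                   # all iterates identical: nothing to extrapolate
        return X[-1]
    k = G.shape[0]
    # Minimize ||sum_i c_i r_i||^2 / scale + reg * ||c||^2 subject to sum_i c_i = 1.
    z = np.linalg.solve(G / scale + reg * np.eye(k), np.ones(k))
    c = z / z.sum()
    return c @ X[:-1]                  # weighted combination of x_0, ..., x_k


def gradient_descent_iterates(H, x0, step, num_steps):
    """Plain gradient descent on f(x) = 0.5 * x^T H x, returning all iterates."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(num_steps):
        xs.append(xs[-1] - step * (H @ xs[-1]))
    return xs
```

On a deterministic quadratic, a handful of gradient iterates is typically enough for the combined point to land much closer to the optimum than the last iterate. In a noisy setting a larger `reg` would presumably be needed, since the weights otherwise amplify the perturbations.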

artificial intelligence, machine learning, optimization problem, (15 more...)

Neural Information Processing Systems

Country:

Europe > France > Île-de-France > Paris > Paris (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.36)

Optimal Black-Box Reductions Between Optimization Objectives

Neural Information Processing Systems | Mar-12-2024, 08:32:08 GMT

The diverse world of machine learning applications has given rise to a plethora of algorithms and optimization methods, finely tuned to the specific regression or classification task at hand. We reduce the complexity of algorithm design for machine learning by reductions: we develop reductions that take a method developed for one setting and apply it to the entire spectrum of smoothness and strong-convexity in applications. Furthermore, unlike existing results, our new reductions are optimal and more practical. We show how these new reductions give rise to new and faster running times on training linear classifiers for various families of loss functions, and conclude with experiments showing their successes also in practice.
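To make the idea of a reduction concrete, here is a minimal sketch of one such scheme: a solver meant for strongly convex objectives is applied to a smooth objective that need not be strongly convex, by adding a quadratic regularizer whose weight is halved after every epoch. This is written from the general idea only and is an assumption, not the paper's algorithm as published. The names `regularization_halving` and `gradient_descent`, the use of plain gradient descent as the inner solver, and the fixed `inner_steps` budget are all hypothetical.

```python
import numpy as np


def gradient_descent(grad, x_start, step, num_steps):
    """Stand-in inner solver; any method for strongly convex problems could be used."""
    x = x_start.copy()
    for _ in range(num_steps):
        x = x - step * grad(x)
    return x


def regularization_halving(grad_f, smoothness, x0, sigma0, epochs, inner_steps):
    """Approximately minimize a smooth convex f via a sequence of regularized problems.

    grad_f: gradient oracle of f; smoothness: Lipschitz constant of grad_f.
    """
    x0 = np.asarray(x0, dtype=float)
    x, sigma = x0.copy(), float(sigma0)
    for _ in range(epochs):
        # Gradient of f(x) + (sigma / 2) * ||x - x0||^2, which is sigma-strongly convex.
        def grad_reg(y, sigma=sigma):
            return grad_f(y) + sigma * (y - x0)

        # Warm-start each epoch from the previous epoch's output.
        x = gradient_descent(grad_reg, x, 1.0 / (smoothness + sigma), inner_steps)
        sigma /= 2.0                   # weaker regularization in the next epoch
    return x
```

The design point is that the inner solver is treated as a black box: only the objective it is handed changes between epochs, so a faster strongly convex method could be substituted for `gradient_descent` without touching the outer loop.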

adaptreg, algorithm, reduction, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Transportation > Air (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Boosting First-Order Methods by Shifting Objective: New Schemes with Faster Worst-Case Rates

Zhou, Kaiwen, So, Anthony Man-Cho, Cheng, James

arXiv.org Machine Learning | Oct-21-2020

We propose a new methodology to design first-order methods for unconstrained strongly convex problems. Specifically, instead of tackling the original objective directly, we construct a shifted objective function that has the same minimizer as the original objective and encodes both the smoothness and strong convexity of the original objective in an interpolation condition. We then propose an algorithmic template for tackling the shifted objective, which can exploit such a condition. Following this template, we derive several new accelerated schemes for problems that are equipped with various first-order oracles and show that the interpolation condition allows us to vastly simplify and tighten the analysis of the derived methods. In particular, all the derived methods have faster worst-case convergence rates than their existing counterparts. Experiments on machine learning tasks are conducted to evaluate the new methods.
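As an illustration of what such a shifted objective can look like, the following is one natural construction, stated here as an assumption rather than quoted from the paper. Let $f$ be $L$-smooth and $\mu$-strongly convex with $L > \mu$ and minimizer $x^\star$; the symbol $h$ is introduced only for this sketch.

```latex
\begin{align*}
h(x) &= f(x) - f(x^\star) - \langle \nabla f(x^\star),\, x - x^\star \rangle
        - \frac{\mu}{2}\,\lVert x - x^\star \rVert^2, \\
\nabla h(x) &= \nabla f(x) - \nabla f(x^\star) - \mu\,(x - x^\star),
        \qquad \nabla h(x^\star) = 0, \\
h(x) - h(y) - \langle \nabla h(y),\, x - y \rangle
     &\ge \frac{1}{2(L - \mu)}\,\lVert \nabla h(x) - \nabla h(y) \rVert^2
        \quad \text{for all } x, y.
\end{align*}
```

Strong convexity of $f$ gives $h \ge 0 = h(x^\star)$, so $x^\star$ also minimizes $h$. Subtracting the quadratic leaves $h$ convex and $(L-\mu)$-smooth, and the last line is the standard inequality for such functions, so a single condition carries both constants $L$ and $\mu$.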

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Machine Learning

2005.12061

Country:

Asia > China > Hong Kong > Sha Tin (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Don't Jump Through Hoops and Remove Those Loops: SVRG and Katyusha are Better Without the Outer Loop

Kovalev, Dmitry, Horvath, Samuel, Richtarik, Peter

arXiv.org Machine Learning | Jan-24-2019

The stochastic variance-reduced gradient method (SVRG) and its accelerated variant (Katyusha) have attracted enormous attention in the machine learning community in the last few years due to their superior theoretical properties and empirical behaviour on training supervised machine learning models via the empirical risk minimization paradigm. A key structural element in both of these methods is the inclusion of an outer loop at the beginning of which a full pass over the training data is made in order to compute the exact gradient, which is then used to construct a variance-reduced estimator of the gradient. In this work we design loopless variants of both of these methods. In particular, we remove the outer loop and replace its function by a coin flip performed in each iteration designed to trigger, with a small probability, the computation of the gradient. We prove that the new methods enjoy the same superior theoretical convergence properties as the original methods. However, we demonstrate through numerical experiments that our methods have substantially superior practical behavior.
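The coin-flip mechanism described in this abstract can be sketched in a few lines for the SVRG case. This is an illustrative reading of the abstract, not the authors' implementation: the least-squares objective, the function name `loopless_svrg`, and the choice to refresh the reference point with the pre-step iterate are assumptions.

```python
import numpy as np


def loopless_svrg(A, b, step, p, iters, seed=0):
    """Minimize (1 / 2n) * ||A x - b||^2 with a loopless SVRG-style update.

    p is the per-iteration probability of recomputing the full gradient.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    w = x.copy()                               # reference point
    full_grad = A.T @ (A @ w - b) / n          # exact gradient at the reference point
    for _ in range(iters):
        i = rng.integers(n)                    # sample one data point uniformly
        a_i = A[i]
        # Variance-reduced estimator: grad_i(x) - grad_i(w) + full gradient at w.
        g = a_i * (a_i @ x - b[i]) - a_i * (a_i @ w - b[i]) + full_grad
        x_prev = x
        x = x - step * g
        # Coin flip replaces the outer loop: occasionally refresh the reference point.
        if rng.random() < p:
            w = x_prev.copy()
            full_grad = A.T @ (A @ w - b) / n
    return x
```

With `p` on the order of `1 / n`, a full pass over the data is triggered about once every `n` iterations in expectation, which matches the per-epoch cost of the outer loop without fixing the epoch length in advance.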

gradient, katyusha, svrg and katyusha, (13 more...)

arXiv.org Machine Learning

1901.08689

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

ASVRG: Accelerated Proximal SVRG

Shang, Fanhua, Jiao, Licheng, Zhou, Kaiwen, Cheng, James, Ren, Yan, Jin, Yufei

arXiv.org Artificial Intelligence | Oct-7-2018

This paper proposes an accelerated proximal stochastic variance reduced gradient (ASVRG) method, in which we design a simple and effective momentum acceleration trick. Unlike most existing accelerated stochastic variance reduction methods such as Katyusha, ASVRG has only one additional variable and one momentum parameter. Thus, ASVRG is much simpler than those methods, and has much lower per-iteration complexity. We prove that ASVRG achieves the best known oracle complexities for both strongly convex and non-strongly convex objectives. In addition, we extend ASVRG to mini-batch and non-smooth settings. We also empirically verify our theoretical results and show that the performance of ASVRG is comparable with, and sometimes even better than, that of state-of-the-art stochastic methods.

artificial intelligence, machine learning, oracle complexity, (14 more...)

arXiv.org Artificial Intelligence

1810.03105

Country: Asia > China (0.14)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Dissipativity Theory for Accelerating Stochastic Variance Reduction: A Unified Analysis of SVRG and Katyusha Using Semidefinite Programs

Hu, Bin, Wright, Stephen, Lessard, Laurent

arXiv.org Machine Learning | Jun-10-2018

Techniques for reducing the variance of gradient estimates used in stochastic programming algorithms for convex finite-sum problems have received a great deal of attention in recent years. By leveraging dissipativity theory from control, we provide a new perspective on two important variance-reduction algorithms: SVRG and its direct accelerated variant Katyusha. Our perspective provides a physically intuitive understanding of the behavior of SVRG-like methods via a principle of energy conservation. The tools discussed here allow us to automate the convergence analysis of SVRG-like methods by capturing their essential properties in small semidefinite programs amenable to standard analysis and computational techniques. Our approach recovers existing convergence results for SVRG and Katyusha and generalizes the theory to alternative parameter choices. We also discuss how our approach complements the linear coupling technique. Our combination of perspectives leads to a better understanding of accelerated variance-reduced stochastic methods for finite-sum problems.

artificial intelligence, dissipativity theory, machine learning, (12 more...)

arXiv.org Machine Learning

1806.03677

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > Middle East > Jordan (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
