AITopics

Country: Asia > Middle East > Jordan (0.08)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Neural Information Processing Systems, Nov-21-2025, 11:27:08 GMT

Non-convex Finite-Sum Optimization Via SCSG Methods

Lihua Lei, Cheng Ju, Jianbo Chen, Michael I. Jordan

Moreover, SCSG is never worse than the state-of-the-art methods based on variance reduction and it significantly outperforms them when the target accuracy is low.

artificial intelligence, machine learning, scsg, (15 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.66)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.31)

Neural Information Processing Systems, Oct-8-2024, 08:12:46 GMT

Reviews: A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

This paper focuses on the optimization problem min f(x) + h(x), where f has a finite-sum structure (n functions in the sum) with nonconvex but smooth components, and h is a convex but possibly nonsmooth function. So, this is a nonconvex finite-sum problem with a convex regularizer. Function h is handled with a prox step. The authors propose a small modification to ProxSVRG (called ProxSVRG+), and prove that this small modification has surprisingly interesting consequences. The modification consists of replacing the full gradient computation in the outer loop of ProxSVRG with a subsampled (minibatch) approximation of batch size B.
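The subsampled variant the review describes can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the least-squares components, the L1 choice for h, and every name here (`prox_svrg_subsampled`, `mean_grad`, `soft_threshold`, the minibatch size `m`) are assumptions made for the example. The paper targets nonconvex components; a convex one is used only to keep the sketch short.

```python
import random

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied coordinate-wise."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def mean_grad(x, data, idx):
    """Average gradient of the illustrative components 0.5 * (a.x - y)^2 over idx."""
    g = [0.0] * len(x)
    for i in idx:
        a, y = data[i]
        r = sum(aj * xj for aj, xj in zip(a, x)) - y
        for j, aj in enumerate(a):
            g[j] += r * aj / len(idx)
    return g

def prox_svrg_subsampled(data, x0, step, lam, B, m, inner_steps, epochs, seed=0):
    """Prox-SVRG-style loop whose outer gradient uses a batch of size B.

    Setting B == len(data) gives the usual full-gradient outer step.
    """
    rng = random.Random(seed)
    n = len(data)
    x = list(x0)
    for _ in range(epochs):
        snapshot = list(x)
        # Outer loop: subsampled gradient at the snapshot (the modification).
        g_snap = mean_grad(snapshot, data, rng.sample(range(n), B))
        for _ in range(inner_steps):
            mini = rng.sample(range(n), m)
            g_x = mean_grad(x, data, mini)
            g_s = mean_grad(snapshot, data, mini)
            # Variance-reduced gradient estimate.
            v = [gx - gs + g0 for gx, gs, g0 in zip(g_x, g_s, g_snap)]
            # Gradient step on f followed by the prox step on h = lam * ||.||_1.
            x = soft_threshold([xi - step * vi for xi, vi in zip(x, v)], step * lam)
    return x
```

The only difference from a full-gradient outer loop is the line that computes `g_snap`; the inner loop and the prox step are unchanged.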

artificial intelligence, machine learning, proxsvrg, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.42)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.42)

Neural Information Processing Systems, Oct-3-2024, 18:42:05 GMT

Non-convex Finite-Sum Optimization Via SCSG Methods

Lihua Lei, Cheng Ju, Jianbo Chen, Michael I. Jordan

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, scsg, (15 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.30)

arXiv.org Machine Learning, Jun-9-2019

SVRG for Policy Evaluation with Fewer Gradient Evaluations

Peng, Zilun, Touati, Ahmed, Vincent, Pascal, Precup, Doina

Stochastic variance-reduced gradient (SVRG) is an optimization method originally designed for tackling machine learning problems with a finite sum structure. SVRG was later shown to work for policy evaluation, a problem in reinforcement learning in which one aims to estimate the value function of a given policy. SVRG makes use of gradient estimates at two scales. At the slower scale, SVRG computes a full gradient over the whole dataset, which could lead to prohibitive computation costs. In this work, we show that two variants of SVRG for policy evaluation could significantly diminish the number of gradient calculations while preserving a linear convergence speed. More importantly, our theoretical result implies that one does not need to use the entire dataset in every epoch of SVRG when it is applied to policy evaluation with linear function approximation. Our experiments demonstrate large computational savings provided by the proposed methods.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

1906.03704

Country: North America (0.46)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Lei, Lihua, Jordan, Michael I.

On the Adaptivity of Stochastic Gradient-Based Optimization

arXiv.org Machine Learning, Apr-9-2019

Stochastic-gradient-based optimization has been a core enabling methodology in applications to large-scale problems in machine learning and related areas. Despite the progress, the gap between theory and practice remains significant, with theoreticians pursuing mathematical optimality at a cost of obtaining specialized procedures in different regimes (e.g., modulus of strong convexity, magnitude of target accuracy, signal-to-noise ratio), and with practitioners not readily able to know which regime is appropriate to their problem, and seeking broadly applicable algorithms that are reasonably close to optimality. To bridge these perspectives it is necessary to study algorithms that are adaptive to different regimes. We present the stochastically controlled stochastic gradient (SCSG) method for composite convex finite-sum optimization problems and show that SCSG is adaptive to both strong convexity and target accuracy. The adaptivity is achieved by batch variance reduction with adaptive batch sizes and a novel technique, which we refer to as geometrization, that sets the length of each epoch as a geometric random variable. The algorithm achieves strictly better theoretical complexity than other existing adaptive algorithms, while the tuning parameters of the algorithm only depend on the smoothness parameter of the objective.

algorithm, artificial intelligence, machine learning, (17 more...)

1904.0448

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.91)

Neural Information Processing Systems, Dec-31-2017

Non-convex Finite-Sum Optimization Via SCSG Methods

Lei, Lihua, Ju, Cheng, Chen, Jianbo, Jordan, Michael I.

We develop a class of algorithms, as variants of the stochastically controlled stochastic gradient (SCSG) methods, for the smooth nonconvex finite-sum optimization problem. Only assuming the smoothness of each component, the complexity of SCSG to reach a stationary point with $E \|\nabla f(x)\|^{2}\le \epsilon$ is $O(\min\{\epsilon^{-5/3}, \epsilon^{-1}n^{2/3}\})$, which strictly outperforms stochastic gradient descent. Moreover, SCSG is never worse than the state-of-the-art methods based on variance reduction and it significantly outperforms them when the target accuracy is low. A similar acceleration is also achieved when the functions satisfy the Polyak-Lojasiewicz condition. Empirical experiments demonstrate that SCSG outperforms stochastic gradient methods on training multi-layer neural networks in terms of both training and validation loss.
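As a reading aid, the two terms inside the minimum can be compared directly to see where the bound stops depending on $n$. The worked numbers below ($n = 10^{6}$, $\epsilon = 10^{-3}$) are illustrative assumptions, not figures from the paper, and constants hidden by the $O(\cdot)$ notation are ignored.

```latex
\epsilon^{-5/3} \le \epsilon^{-1} n^{2/3}
\iff \epsilon^{-2/3} \le n^{2/3}
\iff \epsilon \ge \frac{1}{n}

% Example with n = 10^{6} and \epsilon = 10^{-3} (so \epsilon \ge 1/n):
\epsilon^{-5/3} = 10^{5}, \qquad
\epsilon^{-1} n^{2/3} = 10^{3} \cdot 10^{4} = 10^{7}, \qquad
\min\{10^{5}, 10^{7}\} = 10^{5}
```

So for low target accuracy ($\epsilon \ge 1/n$) the first term is the active one and the bound does not grow with $n$; for $\epsilon < 1/n$ the second term takes over.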

artificial intelligence, machine learning, scsg, (15 more...)

Country: North America > United States (0.28)

Genre: Research Report (0.66)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Lei, Lihua, Jordan, Michael I.

Less than a Single Pass: Stochastically Controlled Stochastic Gradient Method

arXiv.org Machine Learning, Jul-2-2017

We develop and analyze a procedure for gradient-based optimization that we refer to as stochastically controlled stochastic gradient (SCSG). As a member of the SVRG family of algorithms, SCSG makes use of gradient estimates at two scales, with the number of updates at the faster scale being governed by a geometric random variable. Unlike most existing algorithms in this family, both the computation cost and the communication cost of SCSG do not necessarily scale linearly with the sample size $n$; indeed, these costs are independent of $n$ when the target accuracy is low. An experimental evaluation on real datasets confirms the effectiveness of SCSG.
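The two-scale structure with a geometrically distributed number of inner updates might look roughly like the sketch below. It is an illustration under stated assumptions rather than the authors' procedure: the scalar least-squares components, the parameterization of the geometric variable (mean `B / b`), sampling the inner minibatch from the whole dataset, and all names (`scsg`, `geometric_steps`, `batch_grad`) are choices made for the example.

```python
import random

def geometric_steps(rng, B, b):
    """Draw N >= 0 with P(N = k) = (1 - p) * p**k for p = B / (B + b), so E[N] = B / b."""
    p = B / (B + b)
    n_steps = 0
    while rng.random() < p:
        n_steps += 1
    return n_steps

def batch_grad(x, data, idx):
    """Average gradient of 0.5 * (a * x - y)^2 over the indices in idx (scalar x)."""
    return sum((data[i][0] * x - data[i][1]) * data[i][0] for i in idx) / len(idx)

def scsg(data, x0, step, B, b, epochs, seed=0):
    """SCSG-style loop: batch gradient at the slow scale, random-length inner loop."""
    rng = random.Random(seed)
    n = len(data)
    x = x0
    for _ in range(epochs):
        snapshot = x
        # Slow scale: gradient on a batch of size B instead of all n samples.
        g_anchor = batch_grad(snapshot, data, rng.sample(range(n), B))
        # Fast scale: the number of updates is a geometric random variable.
        for _ in range(geometric_steps(rng, B, b)):
            mini = rng.sample(range(n), b)
            v = batch_grad(x, data, mini) - batch_grad(snapshot, data, mini) + g_anchor
            x -= step * v
    return x
```

Because each epoch touches about `B` samples for the anchor and about `B` more in expectation for the inner loop, the per-epoch cost in this sketch is set by `B` rather than by `n`.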

artificial intelligence, convex case, machine learning, (17 more...)

1609.03261

Country: North America > United States > California (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.71)

Bolstad, Andrew, Van Veen, Barry, Nowak, Robert

Causal Network Inference via Group Sparse Regularization

arXiv.org Machine Learning, Jun-3-2011

This paper addresses the problem of inferring sparse causal networks modeled by multivariate auto-regressive (MAR) processes. Conditions are derived under which the Group Lasso (gLasso) procedure consistently estimates sparse network structure. The key condition involves a "false connection score." In particular, we show that consistent recovery is possible even when the number of observations of the network is far less than the number of parameters describing the network, provided that the false connection score is less than one. The false connection score is also demonstrated to be a useful metric of recovery in non-asymptotic regimes. The conditions suggest a modified gLasso procedure which tends to improve the false connection score and reduce the chances of reversing the direction of causal influence. Computational experiments and a real network based electrocorticogram (ECoG) simulation study demonstrate the effectiveness of the approach.
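For readers unfamiliar with the Group Lasso penalty, its effect on one group of coefficients can be illustrated with the standard block soft-thresholding operator. This is a generic sketch, not the paper's estimator; treating the lag coefficients of one candidate connection as a group is an assumption made for the example, and `group_soft_threshold` is a hypothetical helper name.

```python
import math

def group_soft_threshold(group, t):
    """Proximal operator of t * ||.||_2 for one coefficient group.

    The whole group is shrunk toward zero, and set exactly to zero when its
    Euclidean norm is at most t, which is what removes a connection entirely.
    """
    norm = math.sqrt(sum(c * c for c in group))
    if norm <= t:
        return [0.0] * len(group)
    scale = 1.0 - t / norm
    return [scale * c for c in group]

# A strong group survives (shrunk); a weak group is zeroed out as a block.
strong = group_soft_threshold([3.0, 4.0], 1.0)
weak = group_soft_threshold([0.3, 0.4], 1.0)
```

Zeroing a group as a unit is what makes the penalty select connections rather than individual coefficients.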

artificial intelligence, machine learning, node, (16 more...)

doi: 10.1109/TSP.2011.2129515

1106.0762

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)