AITopics | Yingbin Liang

Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples

Tengyu Xu, Shaofeng Zou, Yingbin Liang

Gradient-based temporal difference (GTD) algorithms are widely used in off-policy learning scenarios. Among them, the two time-scale TD with gradient correction (TDC) algorithm has been shown to have superior performance. In contrast to previous studies that characterized the non-asymptotic convergence rate of TDC only under independent and identically distributed (i.i.d.) data samples, we provide the first non-asymptotic convergence analysis for two time-scale TDC under a non-i.i.d.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.46)
North America > United States > Ohio (0.40)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Minimax Estimation of Neural Net Distance

Kaiyi Ji, Yingbin Liang

Neural Information Processing Systems, Mar-27-2025, 02:26:37 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, neural network, (13 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Convergence of Cubic Regularization for Nonconvex Optimization under KL Property

Yi Zhou, Zhe Wang, Yingbin Liang

Neural Information Processing Systems, Mar-26-2025, 21:16:35 GMT

Cubic-regularized Newton's method (CR) is a popular algorithm that is guaranteed to produce a second-order stationary solution for nonconvex optimization problems. However, the existing understanding of the convergence rate of CR is conditioned on special types of geometrical properties of the objective function. In this paper, we explore the asymptotic convergence rate of CR by exploiting the ubiquitous Kurdyka-Łojasiewicz (KŁ) property of nonconvex objective functions. Specifically, we characterize the asymptotic convergence rate of various types of optimality measures for CR, including the function value gap, variable distance gap, gradient norm and least eigenvalue of the Hessian matrix. Our results fully characterize the diverse convergence behaviors of these optimality measures in the full parameter regime of the KŁ property. Moreover, we show that the obtained asymptotic convergence rates of CR are order-wise faster than those of first-order gradient descent algorithms under the KŁ property.
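To make the CR iteration concrete, here is a minimal one-dimensional sketch. It is an illustration only, not code from the paper: each step minimizes the standard cubic model grad*s + hess*s^2/2 + M*|s|^3/6, which has a closed-form minimizer in one dimension. The toy objective, the constant M = 10 and the function names are assumptions chosen for the example.

```python
import math

def cubic_step_1d(grad, hess, M):
    """Global minimizer s of the 1-D model grad*s + hess*s**2/2 + M*|s|**3/6."""
    # The step length r >= 0 solves (M/2)*r**2 + hess*r - |grad| = 0.
    r = (-hess + math.sqrt(hess * hess + 2.0 * M * abs(grad))) / M
    # Move against the gradient; when grad == 0 either direction is optimal.
    return -r if grad > 0 else r

def cubic_regularized_newton_1d(df, d2f, x0, M, num_iters=50):
    """Repeat the cubic-regularized step from x0 for a fixed number of iterations."""
    x = x0
    for _ in range(num_iters):
        x += cubic_step_1d(df(x), d2f(x), M)
    return x

# Toy objective (assumed): f(x) = x**4/4 - x**2/2. The point x = 0 is
# stationary with negative curvature; x = 1 and x = -1 are the minimizers.
df = lambda x: x**3 - x
d2f = lambda x: 3.0 * x**2 - 1.0
x_final = cubic_regularized_newton_1d(df, d2f, x0=0.0, M=10.0)
```

Started exactly at the stationary point x = 0, where a plain gradient step would not move, the negative curvature term still produces a nonzero step, and the iterates settle at a point with positive second derivative. This is the second-order stationarity the abstract refers to.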

artificial intelligence, convergence rate, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America (0.46)

Genre: Research Report > New Finding (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.38)

Finite-Sample Analysis for SARSA with Linear Function Approximation

Shaofeng Zou, Tengyu Xu, Yingbin Liang

Neural Information Processing Systems, Mar-26-2025, 09:54:12 GMT

SARSA is an on-policy algorithm for learning a policy for a Markov decision process in reinforcement learning. We investigate the SARSA algorithm with linear function approximation under the non-i.i.d.
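As a rough illustration of what SARSA with linear function approximation does on each transition, the sketch below applies the textbook update to a weight vector theta, with Q(s, a) approximated by the inner product of theta and a feature vector. The function and argument names are hypothetical, and the policy-improvement step (for example, an epsilon-greedy choice of the next action) is left out.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def sarsa_linear_update(theta, phi_sa, reward, phi_next_sa, alpha, gamma):
    """One SARSA update for the linear estimate Q(s, a) ~ theta . phi(s, a).

    phi_sa and phi_next_sa are the feature vectors of the current and the
    next state-action pair, both chosen by the policy being learned.
    """
    td_error = reward + gamma * dot(theta, phi_next_sa) - dot(theta, phi_sa)
    return [t + alpha * td_error * f for t, f in zip(theta, phi_sa)]

# Two updates on the same (assumed) transition with one-hot features.
theta = sarsa_linear_update([0.0, 0.0], [1.0, 0.0], 1.0, [0.0, 1.0],
                            alpha=0.5, gamma=0.9)
theta = sarsa_linear_update(theta, [1.0, 0.0], 1.0, [0.0, 1.0],
                            alpha=0.5, gamma=0.9)
```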

SpiderBoost and Momentum: Faster Variance Reduction Algorithms

Zhe Wang, Kaiyi Ji, Yi Zhou, Yingbin Liang, Vahid Tarokh

Neural Information Processing Systems, Mar-23-2025, 13:26:20 GMT

SARAH and SPIDER are two recently developed stochastic variance-reduced algorithms, and SPIDER has been shown to achieve a near-optimal first-order oracle complexity in smooth nonconvex optimization. However, SPIDER uses an accuracy-dependent stepsize that slows down convergence in practice, and cannot handle objective functions that involve nonsmooth regularizers. In this paper, we propose SpiderBoost as an improved scheme, which allows the use of a much larger constant-level stepsize while maintaining the same near-optimal oracle complexity, and can be extended with proximal mapping to handle composite optimization (which is nonsmooth and nonconvex) with a provable convergence guarantee.
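The sketch below shows the kind of loop the abstract describes, under stated assumptions: a SPIDER-style recursive gradient estimator that is refreshed with a full gradient at the start of every epoch, combined with a constant stepsize of 1/(2L). The scalar least-squares problem, the epoch length and the batch size are hypothetical choices for illustration. The proximal and momentum variants are not shown.

```python
import random

def spiderboost(grad_i, n, x0, stepsize, epoch_len, batch_size, num_iters, rng):
    """Sketch of a SpiderBoost-style loop for f(x) = (1/n) * sum_i f_i(x), scalar x."""
    x_prev = x = x0
    v = 0.0
    for k in range(num_iters):
        if k % epoch_len == 0:
            # Start of an epoch: refresh the estimator with the full gradient.
            v = sum(grad_i(i, x) for i in range(n)) / n
        else:
            # Recursive estimator: correct the previous estimate with minibatch
            # gradient differences between the two most recent iterates.
            batch = [rng.randrange(n) for _ in range(batch_size)]
            v += sum(grad_i(i, x) - grad_i(i, x_prev) for i in batch) / batch_size
        # Constant stepsize, independent of the target accuracy.
        x_prev, x = x, x - stepsize * v
    return x

# Hypothetical toy problem: f_i(x) = 0.5 * (a_i * x - b_i)**2.
rng = random.Random(0)
n = 100
a = [rng.uniform(0.8, 1.2) for _ in range(n)]
b = [2.0 * ai + rng.gauss(0.0, 0.1) for ai in a]
grad_i = lambda i, x: a[i] * (a[i] * x - b[i])

L = max(ai * ai for ai in a)  # smoothness constant of the component functions
x_hat = spiderboost(grad_i, n, x0=0.0, stepsize=1.0 / (2.0 * L),
                    epoch_len=10, batch_size=10, num_iters=1000, rng=rng)
# Closed-form least-squares solution, used only to check the result.
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)
```

On this toy problem `x_hat` ends up close to `x_star`. The point of the example is the structure of the estimator and the accuracy-independent stepsize, not the particular constants.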

artificial intelligence, machine learning, optimization, (13 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.31)

Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples

Tengyu Xu, Shaofeng Zou, Yingbin Liang

Neural Information Processing Systems, Jan-27-2025, 20:00:52 GMT

Gradient-based temporal difference (GTD) algorithms are widely used in off-policy learning scenarios. Among them, the two time-scale TD with gradient correction (TDC) algorithm has been shown to have superior performance. In contrast to previous studies that characterized the non-asymptotic convergence rate of TDC only under independent and identically distributed (i.i.d.) data samples, we provide the first non-asymptotic convergence analysis for two time-scale TDC under a non-i.i.d.
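For readers unfamiliar with TDC, a minimal sketch of one update with linear value-function approximation follows. It uses the commonly stated form of the algorithm with an importance weight rho. The function and argument names are hypothetical, and any projection step used in the analysis is omitted. The two time scales come from running the main weights theta and the auxiliary weights w with different stepsizes, alpha and beta.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def tdc_update(theta, w, phi, phi_next, reward, rho, alpha, beta, gamma):
    """One TDC step for the linear value estimate V(s) ~ theta . phi(s).

    rho is the importance weight of the behavior-policy sample, alpha is the
    stepsize of the main iterate theta, beta that of the auxiliary iterate w.
    """
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)  # TD error
    phi_w = dot(phi, w)
    # TD step plus the gradient-correction term -gamma * (phi . w) * phi_next.
    theta_new = [t + alpha * rho * (delta * f - gamma * phi_w * fn)
                 for t, f, fn in zip(theta, phi, phi_next)]
    # Auxiliary iterate tracks the projection of the TD error onto the features.
    w_new = [wi + beta * (rho * delta - phi_w) * f for wi, f in zip(w, phi)]
    return theta_new, w_new

# Two steps on the same (assumed) transition with one-hot features.
theta, w = tdc_update([0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                      reward=1.0, rho=1.0, alpha=0.1, beta=0.5, gamma=0.9)
theta, w = tdc_update(theta, w, [1.0, 0.0], [0.0, 1.0],
                      reward=1.0, rho=1.0, alpha=0.1, beta=0.5, gamma=0.9)
```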

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Finite-Sample Analysis for SARSA with Linear Function Approximation

Shaofeng Zou, Tengyu Xu, Yingbin Liang

Neural Information Processing Systems, Jan-26-2025, 03:01:54 GMT

SARSA is an on-policy algorithm for learning a policy for a Markov decision process in reinforcement learning. We investigate the SARSA algorithm with linear function approximation under the non-i.i.d.

SpiderBoost and Momentum: Faster Variance Reduction Algorithms

Zhe Wang, Kaiyi Ji, Yi Zhou, Yingbin Liang, Vahid Tarokh

Neural Information Processing Systems, Jan-23-2025, 16:33:06 GMT

SARAH and SPIDER are two recently developed stochastic variance-reduced algorithms, and SPIDER has been shown to achieve a near-optimal first-order oracle complexity in smooth nonconvex optimization. However, SPIDER uses an accuracy-dependent stepsize that slows down convergence in practice, and cannot handle objective functions that involve nonsmooth regularizers. In this paper, we propose SpiderBoost as an improved scheme, which allows the use of a much larger constant-level stepsize while maintaining the same near-optimal oracle complexity, and can be extended with proximal mapping to handle composite optimization (which is nonsmooth and nonconvex) with a provable convergence guarantee.

artificial intelligence, machine learning, optimization, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.31)

Reshaped Wirtinger Flow for Solving Quadratic System of Equations

Huishuai Zhang, Yingbin Liang

Neural Information Processing Systems, Jan-20-2025, 14:45:30 GMT

Our work is along the line of the Wirtinger flow (WF) approach of Candès et al. [2015], which solves the problem by minimizing a nonconvex loss function via a gradient algorithm and can be shown to converge to a global optimal point under good initialization. In contrast to the smooth loss function used in WF, we adopt a nonsmooth but lower-order loss function, and design a gradient-like algorithm (referred to as reshaped-WF). We show that for random Gaussian measurements, reshaped-WF enjoys geometric convergence to a global optimal point as long as the number m of measurements is on the order of O(n), where n is the dimension of the unknown x. This improves the sample complexity of WF, and achieves the same sample complexity as truncated-WF of Chen and Candès [2015] but without truncation in the gradient step. Furthermore, reshaped-WF has a lower computational cost than WF, and runs faster numerically than both WF and truncated-WF. Because reshaped-WF bypasses higher-order variables in the loss function and truncations in the gradient loop, its analysis is simplified.
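A small sketch can make the "nonsmooth but lower-order loss" idea concrete. The abstract does not spell out the loss, so the code assumes one loss of that kind, (1/2m) * sum_i (|a_i . z| - y_i)^2 with magnitude measurements y_i = |a_i . x|, and takes gradient-like steps on it. The stepsize 0.8, the problem sizes and the perturbed-truth initializer are hypothetical stand-ins. In particular, the initializer only imitates the "good initialization" the abstract mentions.

```python
import math
import random

def reshaped_wf(A, y, z0, stepsize=0.8, num_iters=300):
    """Gradient-like iterations on (1/2m) * sum_i (|a_i . z| - y_i)**2."""
    m, n = len(A), len(z0)
    z = list(z0)
    for _ in range(num_iters):
        step = [0.0] * n
        for a_i, y_i in zip(A, y):
            p = sum(aj * zj for aj, zj in zip(a_i, z))
            # Residual a_i.z - y_i * sign(a_i.z); the sign is the nonsmooth part.
            r = p - math.copysign(y_i, p)
            for j in range(n):
                step[j] += r * a_i[j]
        z = [zj - stepsize * sj / m for zj, sj in zip(z, step)]
    return z

rng = random.Random(0)
n, m = 5, 80  # m is a constant multiple of n
raw = [rng.gauss(0.0, 1.0) for _ in range(n)]
norm = math.sqrt(sum(v * v for v in raw))
x_true = [v / norm for v in raw]  # unit-norm ground truth
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]  # Gaussian measurements
y = [abs(sum(aj * xj for aj, xj in zip(a_i, x_true))) for a_i in A]
z0 = [xj + 0.05 for xj in x_true]  # stand-in for a good initializer
z_hat = reshaped_wf(A, y, z0)
err = math.sqrt(sum((zj - xj) ** 2 for zj, xj in zip(z_hat, x_true)))
```

Because the measurements only record magnitudes, x can be recovered at best up to a global sign. Here the initializer sits near +x, so `err` is measured against `x_true` directly.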

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Minimax Estimation of Neural Net Distance

Kaiyi Ji, Yingbin Liang

Neural Information Processing Systems, Oct-8-2024, 06:55:21 GMT

An important class of distance metrics proposed for training generative adversarial networks (GANs) is the integral probability metric (IPM), in which the neural net distance captures the practical GAN training via two neural networks. This paper investigates the minimax estimation problem of the neural net distance based on samples drawn from the distributions. We develop the first known minimax lower bound on the estimation error of the neural net distance, and an upper bound tighter than an existing bound on the estimator error for the empirical neural net distance. Our lower and upper bounds match not only in the order of the sample size but also in terms of the norm of the parameter matrices of neural networks, which justifies the empirical neural net distance as a good approximation of the true neural net distance for training GANs in practice.

artificial intelligence, machine learning, neural network, (13 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
