AITopics | Difan Zou

Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization

Pan Xu, Jinghui Chen, Difan Zou, Quanquan Gu

Neural Information Processing Systems, Mar-26-2025, 16:16:25 GMT

We present a unified framework to analyze the global convergence of Langevin dynamics based algorithms for nonconvex finite-sum optimization with n component functions. At the core of our analysis is a direct analysis of the ergodicity of the numerical approximations to Langevin dynamics, which leads to faster convergence rates.
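As a rough illustration of what a numerical approximation to Langevin dynamics can look like for a finite-sum objective, the sketch below uses a stochastic gradient Langevin dynamics (SGLD) step: a mini-batch gradient step plus Gaussian noise scaled by an inverse temperature. The abstract above does not name specific algorithms, so the choice of SGLD, the tilted double-well objective, and every name and parameter here (`sgld_minimize`, `tilts`, `eta`, `beta`) are assumptions made for illustration only.

```python
import math
import random


def sgld_minimize(tilts, x0=1.0, steps=5000, batch=2,
                  eta=0.01, beta=5.0, seed=0):
    """Toy SGLD on F(x) = (1/n) * sum_i [(x**2 - 1)**2 + s_i * x].

    `tilts` holds the hypothetical per-component slopes s_i. The objective
    is a nonconvex double well; it is not taken from the paper.
    """
    rng = random.Random(seed)
    n = len(tilts)

    def objective(x):
        return sum((x * x - 1.0) ** 2 + s * x for s in tilts) / n

    x = x0
    best_x, best_f = x0, objective(x0)
    for _ in range(steps):
        # Mini-batch estimate of the gradient of F at the current iterate.
        idx = [rng.randrange(n) for _ in range(batch)]
        grad = sum(4.0 * x * (x * x - 1.0) + tilts[i] for i in idx) / batch
        # Euler-Maruyama step of Langevin dynamics at inverse temperature beta.
        x = x - eta * grad + math.sqrt(2.0 * eta / beta) * rng.gauss(0.0, 1.0)
        f = objective(x)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f
```

Calling `sgld_minimize([0.1, 0.3, 0.2, 0.4])` returns the best iterate seen and its objective value. The injected noise is what lets the iterate leave the local well it starts in, which plain gradient descent would not do.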

artificial intelligence, machine learning, optimization problem, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks

Difan Zou, Ziniu Hu, Yewen Wang, Song Jiang, Yizhou Sun, Quanquan Gu

Neural Information Processing Systems, Mar-26-2025, 05:32:26 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, node, (19 more...)

Neural Information Processing Systems

Country:

Europe (0.68)
North America > United States > New York (0.15)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

An Improved Analysis of Training Over-parameterized Deep Neural Networks

Difan Zou, Quanquan Gu

Neural Information Processing Systems, Mar-23-2025, 23:40:34 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, neural network, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction

Difan Zou, Pan Xu, Quanquan Gu

Neural Information Processing Systems, Jan-26-2025, 22:30:37 GMT

Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) algorithms have received increasing attention in both theory and practice. In this paper, we propose a Stochastic Recursive Variance-Reduced gradient HMC (SRVR-HMC) algorithm. It makes use of a semi-stochastic gradient estimator that recursively accumulates the gradient information to reduce the variance of the stochastic gradient. We provide a convergence analysis of SRVR-HMC for sampling from a class of non-log-concave distributions and show that SRVR-HMC converges faster than all existing HMC-type algorithms based on underdamped Langevin dynamics. Thorough experiments on synthetic and real-world datasets validate our theory and demonstrate the superiority of SRVR-HMC.
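The recursive estimator described above can be sketched roughly as follows. This is a minimal toy under stated assumptions, not the paper's algorithm: the estimator is reset to a full gradient every `epoch_len` steps and otherwise corrected by a mini-batch gradient difference between consecutive iterates, and it drives a plain Euler discretization of underdamped Langevin dynamics. The 1-D quadratic components, the discretization, and all names (`srvr_hmc_sketch`, `components`, `epoch_len`, `eta`, `gamma`) are hypothetical.

```python
import math
import random


def srvr_hmc_sketch(components, steps=20000, epoch_len=20, batch=2,
                    eta=0.01, gamma=2.0, seed=0):
    """Toy 1-D sampler for exp(-f), f(x) = sum_i c_i * (x - a_i)**2 / 2.

    `components` is a list of hypothetical (c_i, a_i) pairs.
    """
    rng = random.Random(seed)
    n = len(components)

    def grad_i(x, i):
        c, a = components[i]
        return c * (x - a)

    x, v = 0.0, 0.0
    x_prev, g = x, 0.0
    samples = []
    for k in range(steps):
        if k % epoch_len == 0:
            # Periodic reset: exact gradient over all n components.
            g = sum(grad_i(x, i) for i in range(n))
        else:
            # Recursive update: reuse the previous estimate and add a
            # mini-batch estimate of grad f(x) - grad f(x_prev).
            idx = [rng.randrange(n) for _ in range(batch)]
            diff = sum(grad_i(x, i) - grad_i(x_prev, i) for i in idx)
            g += (n / batch) * diff
        x_prev = x
        # Euler step of underdamped Langevin dynamics (assumed discretization).
        noise = math.sqrt(2.0 * gamma * eta) * rng.gauss(0.0, 1.0)
        v += -eta * (gamma * v + g) + noise
        x += eta * v
        samples.append(x)
    return samples
```

For this toy target the stationary distribution in `x` is Gaussian with mean equal to the `c_i`-weighted average of the `a_i`, so the empirical mean of the later samples gives a quick sanity check.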

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.29)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks

Difan Zou, Ziniu Hu, Yewen Wang, Song Jiang, Yizhou Sun, Quanquan Gu

Neural Information Processing Systems, Jan-25-2025, 17:22:57 GMT

Graph convolutional networks (GCNs) have recently received wide attention due to their successful applications in different graph tasks and domains. Training GCNs on a large graph, however, is still a challenge. The original full-batch GCN training requires calculating the representations of all nodes in the graph at every GCN layer, which incurs high computation and memory costs. To alleviate this issue, several sampling-based methods have been proposed to train GCNs on a subset of nodes. Among them, the node-wise neighbor-sampling method recursively samples a fixed number of neighbor nodes, so its computation cost suffers from an exponentially growing neighbor size, while the layer-wise importance-sampling method discards the neighbor-dependent constraints, so the nodes sampled across layers suffer from sparse connections.
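The growth problem of node-wise neighbor sampling can be seen in a small sketch. It samples up to `fanout` neighbors for every node of the previous layer and keeps one entry per sampled edge, so the per-layer count multiplies by `fanout` whenever nodes have enough neighbors. The function name, the adjacency-dict representation, and the parameters are assumptions for illustration; this is the baseline the abstract criticizes, not the layer-dependent method the paper proposes.

```python
import random


def neighbor_sample(adj, targets, fanout, num_layers, seed=0):
    """Node-wise neighbor sampling over `num_layers` hops.

    `adj` maps each node to a list of its neighbors (hypothetical format).
    Returns one list of sampled nodes per layer, starting with `targets`.
    """
    rng = random.Random(seed)
    layers = [list(targets)]
    for _ in range(num_layers):
        frontier = []
        for node in layers[-1]:
            nbrs = adj[node]
            # Sample without replacement, capped by the node's degree.
            k = min(fanout, len(nbrs))
            frontier.extend(rng.sample(nbrs, k))
        layers.append(frontier)
    return layers
```

On a graph where every node has at least `fanout` neighbors, one target node with `fanout=2` and three layers yields layer sizes 1, 2, 4, 8.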

artificial intelligence, machine learning, node, (19 more...)

Neural Information Processing Systems

Country:

Europe (0.68)
North America > United States > New York (0.15)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Communications (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

An Improved Analysis of Training Over-parameterized Deep Neural Networks

Difan Zou, Quanquan Gu

Neural Information Processing Systems, Jan-24-2025, 11:38:45 GMT

A recent line of research has shown that gradient-based algorithms with random initialization can converge to the global minima of the training loss for overparameterized (i.e., sufficiently wide) deep neural networks. However, the condition on the width of the neural network to ensure the global convergence is very stringent, which is often a high-degree polynomial in the training sample size n (e.g., O(n

artificial intelligence, machine learning, neural network, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization

Pan Xu, Jinghui Chen, Difan Zou, Quanquan Gu

Neural Information Processing Systems, Oct-7-2024, 21:16:42 GMT

We present a unified framework to analyze the global convergence of Langevin dynamics based algorithms for nonconvex finite-sum optimization with n component functions. At the core of our analysis is a direct analysis of the ergodicity of the numerical approximations to Langevin dynamics, which leads to faster convergence rates.

artificial intelligence, machine learning, optimization problem, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)
