AITopics | chebyshev step

Collaborating Authors

chebyshev step

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Accelerating Convergence of Stein Variational Gradient Descent via Deep Unfolding

Kawamura, Yuya, Takabe, Satoshi

arXiv.org Machine LearningFeb-23-2024

Stein variational gradient descent (SVGD) is a prominent particle-based variational inference method used for sampling a target distribution. SVGD has attracted interest for application in machine-learning techniques such as Bayesian inference. In this paper, we propose novel trainable algorithms that incorporate a deep-learning technique called deep unfolding,into SVGD. This approach facilitates the learning of the internal parameters of SVGD, thereby accelerating its convergence speed. To evaluate the proposed trainable SVGD algorithms, we conducted numerical simulations of three tasks: sampling a one-dimensional Gaussian mixture, performing Bayesian logistic regression, and learning Bayesian neural networks. The results show that our proposed algorithms exhibit faster convergence than the conventional variants of SVGD.

c-dusvgd, step size, svgd, (14 more...)

arXiv.org Machine Learning

2402.15125

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (0.55)

Add feedback

Convergence Acceleration via Chebyshev Step: Plausible Interpretation of Deep-Unfolded Gradient Descent

Takabe, Satoshi, Wadayama, Tadashi

arXiv.org Machine LearningOct-26-2020

Deep unfolding is a promising deep-learning technique, whose network architecture is based on expanding the recursive structure of existing iterative algorithms. Although convergence acceleration is a remarkable advantage of deep unfolding, its theoretical aspects have not been revealed yet. The first half of this study details the theoretical analysis of the convergence acceleration in deep-unfolded gradient descent (DUGD) whose trainable parameters are step sizes. We propose a plausible interpretation of the learned step-size parameters in DUGD by introducing the principle of Chebyshev steps derived from Chebyshev polynomials. The use of Chebyshev steps in gradient descent (GD) enables us to bound the spectral radius of a matrix governing the convergence speed of GD, leading to a tight upper bound on the convergence rate. The convergence rate of GD using Chebyshev steps is shown to be asymptotically optimal, although it has no momentum terms. We also show that Chebyshev steps numerically explain the learned step-size parameters in DUGD well. In the second half of the study, %we apply the theory of Chebyshev steps and Chebyshev-periodical successive over-relaxation (Chebyshev-PSOR) is proposed for accelerating linear/nonlinear fixed-point iterations. Theoretical analysis and numerical experiments indicate that Chebyshev-PSOR exhibits significantly faster convergence for various examples such as Jacobi method and proximal gradient methods.

artificial intelligence, chebyshev step, machine learning, (18 more...)

arXiv.org Machine Learning

2010.13335

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Russia (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.90)

Add feedback

Theoretical Interpretation of Learned Step Size in Deep-Unfolded Gradient Descent

Takabe, Satoshi, Wadayama, Tadashi

arXiv.org Machine LearningJan-30-2020

Theoretical Interpretation of Learned Step Size in Deep-Unfolded Gradient Descent Satoshi Takabe † and Tadashi Wadayama Nagoya Institute of Technology, Gokiso, Nagoya, Aichi, 466-8555, Japan, {wadayama, s_takabe}@nitech.ac.jp † RIKEN Center for Advanced Intelligence Project, Chuo-ku, Tokyo, 103-0027, Japan Abstract --Deep unfolding is a promising deep-learning technique in which an iterative algorithm is unrolled to a deep network architecture with trainable parameters. In the case of gradient descent algorithms, as a result of the training process, one often observes the acceleration of the convergence speed with learned non-constant step size parameters whose behavior is not intuitive nor interpretable from conventional theory. In this paper, we provide a theoretical interpretation of the learned step size of deep-unfolded gradient descent (DUGD). We first prove that the training process of DUGD reduces not only the mean squared error loss but also the spectral radius related to the convergence rate. Next, we show that minimizing the upper bound of the spectral radius naturally leads to the Chebyshev step which is a sequence of the step size based on Chebyshev polynomials. The numerical experiments confirm that the Chebyshev steps qualitatively reproduce the learned step size parameters in DUGD, which provides a plausible interpretation of the learned parameters. Additionally, we show that the Chebyshev steps achieve the lower bound of the convergence rate for the first-order method in a specific limit without learning parameters or momentum terms. I NTRODUCTION Deep unfolding [10], [12] is a promising deep learning approach whose architecture is based on existing iterative algorithms with tuning parameters such as step sizes in gradient descent (GD). The recursive structure of the algorithm is unrolled to a deep network and some parameters are embedded into the network. These parameters can be trained using standard deep learning techniques such as back propagation and stochastic GD if all the processes in the algorithm are differentiable. One notable advantage of deep unfolding is the acceleration of the convergence speed that results from tuning parameters compared with the original algorithm. Embedding proper trainable parameters also offers a flexible network structure to the algorithm that is applicable, for example, to inverse problems with/without prior information [26]. Recently, theoretical aspects of deep unfolding have also been investigated [5], [21], [23]. MSE performance (upper) and learned step size parameters {γ t} 24 t 0 (lower) of DUGD (circles) and GD with a constant step size (cross marks) when (n,m) (300, 600).

algorithm, chebyshev step, step size parameter, (16 more...)

arXiv.org Machine Learning

2001.05142

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.24)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(3 more...)

Genre:

Research Report (0.50)
Instructional Material (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback