Convergence of Batch Updating Methods with Approximate Gradients and/or Noisy Measurements: Theory and Computational Results
Tadipatri Uday Kiran Reddy, M. Vidyasagar
In this paper, we present a unified and general framework for analyzing the batch updating approach to nonlinear, high-dimensional optimization. The framework encompasses all the currently used batch updating approaches, and is applicable to nonconvex as well as convex functions. Moreover, the framework permits the use of noise-corrupted gradients, as well as first-order approximations to the gradient (sometimes referred to as "gradient-free" approaches). By viewing the analysis of the iterations as a problem in the convergence of stochastic processes, we are able to establish a very general theorem, which includes most known convergence results for zeroth-order and first-order methods. The analysis of "second-order" or momentum-based methods is not a part of this paper, and will be studied elsewhere. However, numerical experiments indicate that momentum-based methods can fail if the true gradient is replaced by its first-order approximation. This requires further theoretical analysis.
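As a concrete illustration of the kind of scheme the abstract describes, the sketch below runs a batch-updating iteration in which the true gradient is never evaluated: each step uses a one-sided first-order ("gradient-free") approximation built from noisy function measurements. This is a minimal sketch, not the paper's method; the objective, noise level, step size, and perturbation size are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_f(x, noise=1e-3):
    # Hypothetical objective: noise-corrupted measurements of f(x) = ||x||^2.
    return float(x @ x) + rng.normal(scale=noise)

def approx_grad(x, c):
    # One-sided first-order approximation of the gradient, constructed
    # entirely from noisy measurements of f (no true gradient used).
    fx = noisy_f(x)
    g = np.empty_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = c
        g[i] = (noisy_f(x + e) - fx) / c
    return g

def batch_update(x0, steps=200, eta=0.05, c=1e-2):
    # Batch updating: every iteration applies one approximate-gradient step.
    x = x0.copy()
    for _ in range(steps):
        x = x - eta * approx_grad(x, c)
    return x

x_final = batch_update(np.array([1.0, -2.0]))
```

With these illustrative settings the iterate approaches the minimizer at the origin, up to a residual set by the measurement noise and the bias of the one-sided difference.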
Jan-27-2023