Reviews: Stochastic Chebyshev Gradient Descent for Spectral Optimization
–Neural Information Processing Systems
Spectral optimization is defined as finding \theta that minimizes F(A(\theta)) + g(\theta), where A(\theta) is a symmetric matrix and F is typically the trace of an analytic function, i.e. F(A) = tr(p(A)), where p is a polynomial. The authors propose an unbiased estimator of F by randomly truncating the Chebyshev approximation to F and applying importance sampling. Moreover, they derive the optimal distribution for this importance sampling. They demonstrate how the method can be used within SGD and stochastic variance reduced gradient (SVRG).
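To make the randomized-truncation idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a Chebyshev series with coefficients b_0..b_N, draws a random degree n from a distribution over {0, ..., N}, and divides coefficient j by P(degree >= j) so that the expectation over n recovers the full series. The geometric-style distribution, the choice f = exp, the 5x5 test matrix, and the helper names `reweighted_coeffs` and `trace_of_series` are all illustrative assumptions; in particular the distribution is not the optimal one derived in the paper, and the trace is computed exactly from eigenvalues rather than with a stochastic trace estimator.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def reweighted_coeffs(coeffs, probs, n):
    """Importance-weighted coefficients of the degree-n truncation.

    Coefficient j is divided by P(degree >= j), so averaging over a random
    degree n ~ probs recovers the full coefficient vector.
    """
    tail = np.cumsum(probs[::-1])[::-1]  # tail[j] = P(degree >= j)
    out = np.zeros_like(coeffs)
    out[: n + 1] = coeffs[: n + 1] / tail[: n + 1]
    return out


def trace_of_series(eigvals, coeffs):
    """tr(p(A)) for a symmetric A, evaluated through its eigenvalues."""
    return float(np.sum(C.chebval(eigvals, coeffs)))


# Illustrative setup: every value below is an assumption, not from the paper.
N = 20
coeffs = C.chebinterpolate(np.exp, N)   # Chebyshev coefficients of exp on [-1, 1]
probs = 0.7 ** np.arange(N + 1)         # hypothetical degree distribution
probs /= probs.sum()

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                               # symmetric test matrix
A /= np.abs(np.linalg.eigvalsh(A)).max()        # scale the spectrum into [-1, 1]
eigvals = np.linalg.eigvalsh(A)

# Trace of the full degree-N approximation.
full_trace = trace_of_series(eigvals, coeffs)

# Exact expectation of the randomized estimator, summed over all degrees.
expected_trace = sum(
    probs[n] * trace_of_series(eigvals, reweighted_coeffs(coeffs, probs, n))
    for n in range(N + 1)
)

# One random draw of the estimator.
n = rng.choice(N + 1, p=probs)
sample_trace = trace_of_series(eigvals, reweighted_coeffs(coeffs, probs, n))
```

Here `expected_trace` matches `full_trace` up to rounding, which is the unbiasedness property with respect to the degree-N series. A single draw such as `sample_trace` is cheaper on average but noisy, and the choice of `probs` controls that variance.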
Oct-7-2024, 17:15:53 GMT