Zeroth-Order Hard-Thresholding: Gradient Error vs. Expansivity

Jan-17-2025, 17:14:53 GMT–Neural Information Processing Systems

Hard-thresholding gradient descent is a dominant technique for solving \ell_0-constrained optimization problems. However, in many real-world problems the first-order gradients of the objective function are either unavailable or expensive to compute, and zeroth-order (ZO) gradients can be a good surrogate. Unfortunately, whether ZO gradients can work with the hard-thresholding operator is still an open question. To address it, in this paper we focus on \ell_0-constrained black-box stochastic optimization problems and propose a new stochastic zeroth-order gradient hard-thresholding (SZOHT) algorithm, with a general ZO gradient estimator powered by a novel random support sampling. We provide a convergence analysis of SZOHT under standard assumptions. Importantly, we reveal a conflict between the deviation of ZO estimators and the expansivity of the hard-thresholding operator, and we provide a theoretical minimal value for the number of random directions in ZO gradients.
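As a rough illustration of the idea summarized above, the following sketch combines a finite-difference ZO gradient estimate, built from random directions drawn on randomly sampled supports, with a top-k hard-thresholding step. It is a minimal sketch under stated assumptions, not the paper's reference implementation: the function names (`hard_threshold`, `zo_gradient`, `szoht`), the parameter names (`q` random directions, support size `s`, smoothing radius `mu`, step size `step`), and the `d / q` scaling of the estimator are all choices made here for illustration. The objective `f` is also treated as a deterministic black box, whereas the abstract describes a stochastic setting.

```python
import numpy as np


def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    out = np.zeros_like(x)
    if k > 0:
        keep = np.argpartition(np.abs(x), -k)[-k:]
        out[keep] = x[keep]
    return out


def zo_gradient(f, x, q, s, mu, rng):
    """Finite-difference gradient estimate from q random sparse directions.

    Each direction is a unit vector supported on a random set of s
    coordinates (an assumed form of "random support sampling").
    """
    d = x.size
    fx = f(x)
    g = np.zeros(d)
    for _ in range(q):
        support = rng.choice(d, size=s, replace=False)
        v = rng.standard_normal(s)
        u = np.zeros(d)
        u[support] = v / np.linalg.norm(v)
        # Only function values are used; no first-order information.
        g += (f(x + mu * u) - fx) / mu * u
    # Assumed scaling: E[u u^T] = I / d for these directions, so the
    # factor d makes the average approximate the gradient for small mu.
    return (d / q) * g


def szoht(f, x0, k, q, s, step, mu=1e-4, iters=200, seed=0):
    """Alternate a ZO gradient step with top-k hard thresholding."""
    rng = np.random.default_rng(seed)
    x = hard_threshold(np.asarray(x0, dtype=float), k)
    for _ in range(iters):
        g = zo_gradient(f, x, q, s, mu, rng)
        x = hard_threshold(x - step * g, k)
    return x
```

In this sketch, `q` controls how noisy the gradient estimate is; the abstract's point is that this estimation error interacts with the expansivity of the thresholding step, so `q` cannot be chosen arbitrarily small. Every iterate returned by `szoht` has at most `k` nonzero entries by construction.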

expansivity, gradient, zeroth-order hard-thresholding

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Statistical Learning (0.62)
  - Representation & Reasoning > Optimization (0.46)