Zeroth-Order Hard-Thresholding: Gradient Error vs. Expansivity

William de Vazelhes

Neural Information Processing Systems 

Hard-thresholding gradient descent is a dominant technique to solve this problem.
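As a minimal illustration of the technique named above, the following sketch assumes the problem in question is minimizing a smooth objective under a sparsity constraint (at most k nonzero entries). It shows the plain first-order variant with an exact gradient, not the zeroth-order method of the title. The names `hard_threshold` and `iht`, the step size, and the toy quadratic objective are all assumptions made for illustration.

```python
def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the rest to zero."""
    if k <= 0:
        return [0.0] * len(x)
    # Indices of the k entries with the largest absolute value.
    keep = set(sorted(range(len(x)), key=lambda i: -abs(x[i]))[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def iht(grad, x0, k, step, n_iters):
    """Alternate a gradient step with hard thresholding (hypothetical helper)."""
    x = hard_threshold(x0, k)
    for _ in range(n_iters):
        g = grad(x)
        x = hard_threshold([xi - step * gi for xi, gi in zip(x, g)], k)
    return x


# Toy objective, assumed for illustration: f(x) = 0.5 * ||x - b||^2,
# whose gradient is x - b.
b = [3.0, -0.5, 0.2, -4.0]
x_hat = iht(lambda x: [xi - bi for xi, bi in zip(x, b)],
            x0=[0.0] * 4, k=2, step=1.0, n_iters=5)
```

On this toy objective the iterate keeps only the two largest-magnitude coordinates of `b`, which is the best 2-sparse approximation of `b`.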