safe adaptive importance sampling
Safe Adaptive Importance Sampling
Importance sampling has become an indispensable strategy to speed up optimization algorithms for large-scale applications. Improved adaptive variants -- using importance values defined by the complete gradient information which changes during optimization -- enjoy favorable theoretical properties, but are typically computationally infeasible. In this paper we propose an efficient approximation of gradient-based sampling, which is based on safe bounds on the gradient. The proposed sampling distribution is (i) provably the \emph{best sampling} with respect to the given bounds, (ii) always better than uniform sampling and fixed importance sampling and (iii) can efficiently be computed -- in many applications at negligible extra cost. The proposed sampling scheme is generic and can easily be integrated into existing algorithms. In particular, we show that coordinate-descent (CD) and stochastic gradient descent (SGD) can enjoy significant a speed-up under the novel scheme. The proven efficiency of the proposed sampling is verified by extensive numerical testing.
Reviews: Safe Adaptive Importance Sampling
The authors present a "safe" adaptive importance sampling strategy for coordinate descent and stochastic gradient methods. Based on lower and upper bounds on the gradient values, an efficient approximation of gradient based sampling is proposed. The method is proven to be the best strategy with respect to the bounds, always better than uniform or fixed importance sampling and can be computed efficiently for negligible extra cost. Although adaptive importance sampling strategies have been previously proposed, the authors present a novel formulation of selecting the optimal sampling distribution as a convex optimization problem and present an efficient algorithm to solve it. This paper is well written and a nice contribution to the study of importance sampling techniques.
Safe Adaptive Importance Sampling
Stich, Sebastian U., Raj, Anant, Jaggi, Martin
Importance sampling has become an indispensable strategy to speed up optimization algorithms for large-scale applications. Improved adaptive variants -- using importance values defined by the complete gradient information which changes during optimization -- enjoy favorable theoretical properties, but are typically computationally infeasible. In this paper we propose an efficient approximation of gradient-based sampling, which is based on safe bounds on the gradient. The proposed sampling distribution is (i) provably the \emph{best sampling} with respect to the given bounds, (ii) always better than uniform sampling and fixed importance sampling and (iii) can efficiently be computed -- in many applications at negligible extra cost. The proposed sampling scheme is generic and can easily be integrated into existing algorithms.