Unbiased Gradient Estimation for Distributionally Robust Learning

Ghosh, Soumyadip; Squillante, Mark

arXiv.org Machine Learning 

The formulation of distributionally robust optimization problems for machine learning (DRL) has the potential to significantly improve model generalization by reducing the probability of poor performance over samples not encountered in training. The minimum-maximum nature of these formulations, however, creates fundamental difficulties for standard algorithms such as stochastic gradient descent (SGD). To address these difficulties, we consider in this paper a new SGD algorithm where gradient descent is applied to the outer minimization problem. Our main contributions include efficiently estimating the gradient of the inner maximization problem by exploiting a randomization technique called multi-level Monte Carlo, introduced by Giles (2008). We present theoretical results that establish why standard gradient estimators fail and that determine the range of parameter values for the gradient estimator of our proposed DRL approach which balances a fundamental tradeoff between stochastic error and computation time. We also present a set of numerical experiments demonstrating that our DRL approach yields significant computational savings over previous work, while further illustrating the efficacy of our DRL approach for better model generalization. We now provide our formal general setting for the DRL problem, which is consistent with recent work in the literature (see, e.g., Namkoong and Duchi (2016, 2017); Ghosh et al. (2018)).
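To make the idea of removing bias through multi-level randomization concrete, the following is a minimal toy sketch and not the gradient estimator of this paper. It assumes a scalar target g(E[X]) with g(x) = x^2 and Gaussian X, and it uses a randomized single-term multi-level construction with a geometric level distribution; the function names, the rate r = 2^(-1.5), and the test problem are all illustrative assumptions. A plug-in estimate from a finite sample is biased because g is nonlinear, whereas drawing a random level, forming an antithetic fine-minus-coarse difference, and reweighting by the level probability gives an estimate whose expectation telescopes to g(E[X]).

```python
import numpy as np


def g(x):
    # Nonlinear function of the mean (illustrative choice, not from the paper).
    return x ** 2


def naive_plugin(rng, mu, sigma):
    # Plug-in estimate from a single sample: E[X^2] = mu^2 + sigma^2, so it is biased.
    return g(rng.normal(mu, sigma))


def randomized_mlmc(rng, mu, sigma, r=2.0 ** -1.5):
    # Draw a random level n >= 0 with P(N = n) = (1 - r) * r**n.
    n = int(rng.geometric(1.0 - r)) - 1
    p_n = (1.0 - r) * r ** n

    # Level n uses 2**(n+1) samples, split into two halves of 2**n each.
    half = 2 ** n
    x = rng.normal(mu, sigma, size=2 * half)

    # Antithetic difference: fine estimate minus the average of two coarse ones.
    fine = g(x.mean())
    coarse = 0.5 * (g(x[:half].mean()) + g(x[half:].mean()))
    delta = fine - coarse

    # Base term from one independent sample plus the reweighted correction.
    base = g(rng.normal(mu, sigma))
    return base + delta / p_n


def demo(reps=20000, mu=1.0, sigma=1.0, seed=0):
    # Average many independent replications of each estimator.
    rng = np.random.default_rng(seed)
    naive = np.mean([naive_plugin(rng, mu, sigma) for _ in range(reps)])
    mlmc = np.mean([randomized_mlmc(rng, mu, sigma) for _ in range(reps)])
    return naive, mlmc


if __name__ == "__main__":
    naive, mlmc = demo()
    print("target g(mu)      :", g(1.0))
    print("naive plug-in mean:", naive)
    print("randomized MLMC   :", mlmc)
```

In this toy setting the plug-in average concentrates near mu^2 + sigma^2 while the randomized multi-level average concentrates near the target mu^2. The rate r is chosen between 1/4 and 1/2 so that both the variance and the expected number of samples per replication stay finite, which is a small-scale analogue of the tradeoff between stochastic error and computation time described above.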
