A framework for bilevel optimization that enables stochastic and global variance reduction algorithms

Neural Information Processing Systems 

Figure 1: Convergence curves of the two proposed methods on a toy problem.