A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum

Neural Information Processing Systems 

The latter allows us to control the error in the stochastic gradient updates caused by inexact solutions to both subproblems.