Adaptive Training Distributions with Scalable Online Bilevel Optimization