Simulating, Fast and Slow: Learning Policies for Black-Box Optimization