The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure