Large Learning Rates Improve Generalization: But How Large Are We Talking About?