Optimization Methods for Large-Scale Machine Learning