MoMo: Momentum Models for Adaptive Learning Rates