Reviews: Learning values across many orders of magnitude

Neural Information Processing Systems 

I think the problem the authors are trying to solve is a very important one that shows up in many gradient-based learning settings. The ideas are straightforward but (to my knowledge) new and apparently effective. For this reason I can see them becoming widely used. The paper is well written and clear in most places. The related work section seems to contain enough relevant references, but it would be nice if the most closely related works were discussed in a bit more detail.