Reviews: Lookahead Optimizer: k steps forward, 1 step back
Neural Information Processing Systems
Update: I have read the authors' response and have kept my score. Please note that in DeVries and Taylor '17, 'ResNet-18' is not truly the ResNet-18 model: it consists of 4 stages and, because of its wider channels, has more than an order of magnitude more parameters than the original ResNet-18. This should be made clear in the paper so as not to cause further confusion in the community.

Originality: Medium/High. The proposed algorithm is considerably different from recently proposed methods for deep learning, which gravitate towards adaptive gradient methods. It has some similarities to variance reduction algorithms with inner and outer loops; however, Lookahead has a very simple outer loop structure and is easy to implement.
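To illustrate the inner/outer loop structure mentioned above, here is a minimal sketch of a Lookahead-style update wrapped around plain SGD on a scalar toy problem. It is not the authors' implementation. The function name `lookahead_sgd`, the quadratic objective, and all hyperparameter values are assumptions chosen for illustration only.

```python
def lookahead_sgd(grad, theta0, k=5, alpha=0.5, lr=0.1, outer_steps=100):
    """Illustrative Lookahead loop around plain SGD on a scalar parameter.

    grad        : callable returning the gradient at a point (hypothetical helper)
    k           : number of inner (fast-weight) steps per outer step
    alpha       : interpolation coefficient for the slow weights
    lr          : learning rate of the inner SGD optimizer
    """
    slow = float(theta0)                  # slow weights
    for _ in range(outer_steps):
        fast = slow                       # fast weights restart from the slow weights
        for _ in range(k):                # "k steps forward" with the inner optimizer
            fast -= lr * grad(fast)
        slow += alpha * (fast - slow)     # "1 step back": interpolate toward the fast weights
    return slow


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
result = lookahead_sgd(lambda x: 2.0 * (x - 3.0), theta0=0.0)
```

With `alpha=1.0` the outer step simply adopts the fast weights, so the sketch reduces to ordinary SGD. Smaller values of `alpha` pull the slow weights only part of the way toward the result of the inner loop.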
Jan-25-2025, 16:59:59 GMT