Lookahead Optimizer: k steps forward, 1 step back
Michael Zhang, James Lucas, Jimmy Ba, Geoffrey E. Hinton
Neural Information Processing Systems
We find that Lookahead is less sensitive to suboptimal hyperparameters and therefore lessens the need for extensive hyperparameter tuning.
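The "k steps forward, 1 step back" idea in the title can be illustrated with a minimal sketch. It assumes plain gradient descent as the inner optimizer and a toy quadratic objective. The names `lookahead_sgd` and `quadratic_grad` and all default values are hypothetical, chosen for illustration only, and are not taken from the authors' code.

```python
def lookahead_sgd(grad, theta0, k=5, alpha=0.5, lr=0.1, outer_steps=50):
    """Illustrative Lookahead loop wrapped around plain gradient descent.

    grad:   function mapping a list of floats to its gradient (list of floats)
    theta0: initial parameters
    k:      number of fast (inner) steps per slow update
    alpha:  slow-weights step size (interpolation factor)
    lr:     learning rate of the inner optimizer (assumed here to be plain SGD)
    """
    slow = list(theta0)
    for _ in range(outer_steps):
        # Fast weights start from the current slow weights.
        fast = list(slow)
        # k steps forward with the inner optimizer.
        for _ in range(k):
            g = grad(fast)
            fast = [w - lr * gi for w, gi in zip(fast, g)]
        # 1 step back: move the slow weights part of the way toward the fast weights.
        slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
    return slow


def quadratic_grad(w):
    # Gradient of the toy objective f(w) = sum(w_i ** 2).
    return [2.0 * wi for wi in w]


result = lookahead_sgd(quadratic_grad, [3.0, -2.0])
```

With these toy settings the returned weights approach the minimizer at the origin. Setting `alpha=1.0` makes the slow weights copy the fast weights, which reduces the loop to ordinary gradient descent.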
Oct-3-2025, 05:38:44 GMT