Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad
Neural Information Processing Systems
Adaptive methods are extremely popular in machine learning because they make learning-rate tuning less expensive. This paper introduces a novel optimization algorithm named KATE, a scale-invariant adaptation of the well-known AdaGrad algorithm. We prove the scale-invariance of KATE for the case of Generalized Linear Models.
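For context, the abstract's baseline is the standard AdaGrad update, which divides each gradient by the square root of the running sum of squared gradients (the quantity the paper's title proposes to remove). Below is a minimal sketch of that standard AdaGrad step, not of KATE's update itself; the function name `adagrad_step` and the toy quadratic objective are illustrative choices, not from the paper.

```python
import numpy as np

def adagrad_step(x, grad, g2_sum, lr=1.0, eps=1e-8):
    """One step of standard (diagonal) AdaGrad.

    g2_sum accumulates squared gradients coordinate-wise;
    each coordinate's step is scaled by 1 / sqrt(g2_sum).
    """
    g2_sum = g2_sum + grad ** 2
    x = x - lr * grad / (np.sqrt(g2_sum) + eps)
    return x, g2_sum

# Toy usage: minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x = np.array([0.0])
g2 = np.zeros_like(x)
for _ in range(500):
    grad = 2.0 * (x - 3.0)
    x, g2 = adagrad_step(x, grad, g2)
```

Note that rescaling the objective (e.g. multiplying `f` by a constant) changes AdaGrad's trajectory, which is precisely the scale-dependence that a scale-invariant variant like KATE is designed to avoid.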
May 29, 2025, 13:57:06 GMT