Memory Efficient Adaptive Optimization
Rohan Anil, Vineet Gupta, Tomer Koren, Yoram Singer
Neural Information Processing Systems
Our method retains the benefits of per-parameter adaptivity while allowing significantly larger models and batch sizes.