Adam-mini: Use Fewer Learning Rates To Gain More
