Can We Learn Communication-Efficient Optimizers?