Training Transformers with 4-bit Integers
Haocheng Xi²