Understanding and Improving Layer Normalization