Beyond BatchNorm: Towards a Unified Understanding of Normalization in Deep Learning