Reviews: Kalman Normalization: Normalizing Internal Representations Across Network Layers

Neural Information Processing Systems 

This paper addresses the problem of batch normalization with small batch sizes. Traditional batch normalization relies on estimating the batch statistics (mean and variance) over each mini-batch, and it requires a relatively large batch size to obtain reliable estimates of these statistics. However, due to memory constraints, some higher-level tasks (e.g., in computer vision) cannot use a large batch size. Therefore, developing normalization methods that work well with small batch sizes is important for improving the performance of these systems.
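A minimal sketch of the issue the review describes (this is standard batch normalization, not the paper's Kalman Normalization): each batch's mean and variance are estimated from the batch itself, so the estimates become noisy as the batch shrinks, with the standard error of the mean growing roughly as 1/sqrt(batch size).

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(x, eps=1e-5):
    # Standard batch normalization: estimate mean and variance
    # from the current mini-batch and normalize each feature.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

# Illustrate estimator noise: draw many batches of each size from a
# fixed population and measure how much the batch-mean estimate varies.
population = rng.normal(loc=3.0, scale=2.0, size=100_000)
for batch_size in (2, 32, 256):
    means = [population[rng.choice(population.size, batch_size)].mean()
             for _ in range(1000)]
    print(batch_size, np.std(means))
```

The printed spread shrinks as the batch grows, which is why small-batch settings (e.g., detection or segmentation, where memory limits the batch size) make the plain batch statistics unreliable.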