Batch normalization provably avoids ranks collapse for randomly initialised deep networks

Open in new window