Variance-Preserving Initialization Schemes Improve Deep Network Training: But Which Variance is Preserved?