Non-Gaussian Tensor Programs

Neural Information Processing Systems 

However, in general, they do not express any preference on the initial weight distribution beyond iid sampling.