Luck Matters: Understanding Training Dynamics of Deep ReLU Networks