Effects of Data Geometry in Early Deep Learning