Effective Rank and the Staircase Phenomenon: New Insights into Neural Network Training Dynamics