On the Implicit Bias Towards Minimal Depth of Deep Neural Networks