Do Deep Networks Forget Initialization? A Forgetting-Time View of Practical Inductive Bias