Differentially Private Learning Needs Better Model Initialization and Self-Distillation