Knowledge Distillation Under Ideal Joint Classifier Assumption