Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets