Towards a theory of model distillation