Convex Distillation: Efficient Compression of Deep Networks via Convex Optimization