Towards Optimal Trade-offs in Knowledge Distillation for CNNs and Vision Transformers at the Edge