Puzzle: Distillation-Based NAS for Inference-Optimized LLMs