PARD: Accelerating LLM Inference with Low-Cost PARallel Draft Model Adaptation