FastMTP: Accelerating LLM Inference with Enhanced Multi-Token Prediction
