Sandwich: Separating Prefill-Decode Compilation for Efficient CPU LLM Serving
