SPAD: Specialized Prefill and Decode Hardware for Disaggregated LLM Inference
