Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving
