Prism: Unleashing GPU Sharing for Cost-Efficient Multi-LLM Serving
