Quality-of-Service Aware LLM Routing for Edge Computing with Multiple Experts
