Informed Routing in LLMs: Smarter Token-Level Computation for Faster Inference