Uncertainty-Aware Hybrid Inference with On-Device Small and Remote Large Language Models
