Uncertainty-Aware Hybrid Inference with On-Device Small and Remote Large Language Models