Towards Optimal Caching and Model Selection for Large Model Inference