Poodle: Seamlessly Scaling Down Large Language Models with Just-in-Time Model Replacement
