Balcony: A Lightweight Approach to Dynamic Inference of Generative Language Models
