Efficient Large Language Models with Zero-Shot Adjustable Acceleration
