Merino: Entropy-driven Design for Generative Language Models on IoT Devices