TinyLlama: An Open-Source Small Language Model

Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu

arXiv.org, Artificial Intelligence

Building on the architecture and tokenizer of Llama 2 (Touvron et al., 2023b), TinyLlama incorporates several advances contributed by the open-source community (e.g., FlashAttention (Dao, 2023)) to improve computational efficiency. Despite its relatively small size, TinyLlama demonstrates remarkable performance across a range of downstream tasks, significantly outperforming existing open-source language models of comparable size.
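
Because TinyLlama retains Llama 2's architecture and tokenizer, it plugs directly into Llama-compatible tooling. As a minimal usage sketch (not from the paper), the following loads a released checkpoint through the Hugging Face transformers API; the model identifier is assumed to be the one the project publishes on the Hub.

```python
# Minimal sketch: run TinyLlama via Hugging Face transformers.
# The checkpoint name below is an assumption; substitute your preferred release.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed Hub identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)    # Llama 2 tokenizer (32k vocab)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "TinyLlama is"
inputs = tokenizer(prompt, return_tensors="pt")
# Greedy decoding of a short continuation.
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

On recent versions of transformers, passing attn_implementation="flash_attention_2" to from_pretrained enables the FlashAttention kernels the paper credits for part of its efficiency gains (this requires the flash-attn package and a supported GPU).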