Airavata: Introducing Hindi Instruction-tuned LLM
Gala, Jay; Jayakumar, Thanmay; Husain, Jaavid Aktar; M, Aswanth Kumar; Khan, Mohammed Safi Ur Rahman; Kanojia, Diptesh; Puduppully, Ratish; Khapra, Mitesh M.; Dabre, Raj; Murthy, Rudra; Kunchukuttan, Anoop
arXiv.org Artificial Intelligence
The last year has witnessed tremendous interest and activity in the world of Large Language Models (LLMs). LLMs hold the potential to unlock exciting applications in artificial intelligence thanks to their ability to comprehend complex natural language instructions and excel in a broad spectrum of tasks involving language, knowledge, reasoning, and creative generation. To foster research, innovation, and widespread adoption, an open ecosystem is essential. We have observed significant advancements in this area with the launch of models like Llama 2 (Touvron et al., 2023) and Mistral (Jiang et al., 2023), as well as their instruction-tuned variants such as Llama 2 Chat (Touvron et al., 2023), Mistral-Instruct (Jiang et al., 2023), and Zephyr (Tunstall et al., 2023), among others. However, most of these advancements have centered on the English language. Support for Indian languages is limited, and what support exists can be attributed to the incidental inclusion of some Indian language data that slipped through the data filters during the pre-training of these language models.
Jan-26-2024