Airavata: Introducing Hindi Instruction-tuned LLM

Gala, Jay, Jayakumar, Thanmay, Husain, Jaavid Aktar, M, Aswanth Kumar, Khan, Mohammed Safi Ur Rahman, Kanojia, Diptesh, Puduppully, Ratish, Khapra, Mitesh M., Dabre, Raj, Murthy, Rudra, Kunchukuttan, Anoop

Jan-26-2024, arXiv.org Artificial Intelligence

The last year has witnessed tremendous interest and activity in the world of Large Language Models (LLMs). LLMs hold the potential to unlock exciting applications in artificial intelligence thanks to their ability to comprehend complex natural language instructions and excel in a broad spectrum of tasks involving language, knowledge, reasoning, and creative generation. To foster research, innovation, and widespread adoption, an open ecosystem is essential. We have observed significant advancements in this area with the launch of models like Llama 2 (Touvron et al., 2023) and Mistral (Jiang et al., 2023), as well as their instruction-tuned variants such as Llama 2 Chat (Touvron et al., 2023), Mistral-Instruct (Jiang et al., 2023), and Zephyr (Tunstall et al., 2023), among others. However, most of these advancements have centered on the English language. Support for Indian languages is limited, and what support exists can be attributed to the incidental inclusion of some Indian language data that slipped through the data filters during the pre-training of these language models.

Country:
- North America > United States
  - New York > New York County
    - New York City (0.04)
  - Minnesota > Hennepin County
    - Minneapolis (0.14)
- Europe
  - Italy > Tuscany
    - Florence (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)
- Asia
  - Indonesia > Bali (0.04)
  - Singapore (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)