Small Language Models (SLMs) Can Still Pack a Punch: A survey

Subramanian, Shreyas, Elango, Vikram, Gungor, Mecit

arXiv.org Artificial Intelligence 

Large Language Models (LLMs) are Transformer-based language models (from [128]) with billions of parameters, which exhibit surprising abilities not present in smaller models. LLMs have had a far-reaching impact on academic research in language modeling as well as on industry adoption. Several papers and surveys cover traditional LLMs; for example, [153] by Zhao et al. provides a comprehensive review of recent advances in LLMs. That review discusses key techniques for developing LLMs, including scaling laws, emergent abilities, distributed training algorithms, eliciting abilities through prompting, and aligning models to human values. It also covers recent progress in pre-training, adaptation, utilization, and capability evaluation of LLMs. Other recent surveys on LLMs, such as [47], cover similar topics but additionally explore practical applications, productivity tools, prompting techniques, limitations, and future challenges. Surveys such as [153, 47, 96, 158] generally cover models with more than 10B parameters, referred to as Large or Foundational models, with only a cursory mention of smaller models for language modeling. Independently, there has been growing interest in smaller language models.
