Small Language Models (SLMs) Can Still Pack a Punch: A survey

Subramanian, Shreyas, Elango, Vikram, Gungor, Mecit

Jan-3-2025–arXiv.org Artificial Intelligence

Large Language Models (LLMs) are Transformer-based language models (from [128]) with billions of parameters, which exhibit surprising abilities not present in smaller models. LLMs have had a far-reaching impact on academic research in language modeling as well as on industry adoption. Several papers and surveys cover traditional LLMs; for example, Zhao et al. [153] provide a comprehensive review of recent advances in LLMs. Their paper discusses key techniques for developing LLMs, including scaling laws, emergent abilities, distributed training algorithms, eliciting abilities through prompting, and aligning models to human values. The review also covers recent progress in pre-training, adaptation, utilization, and capability evaluation of LLMs. Other recent surveys on LLMs, such as [47], cover similar topics but additionally explore practical applications, productivity tools, prompting techniques, limitations, and future challenges. Surveys such as [153, 47, 96, 158] generally cover models with more than 10B parameters, referred to as Large or Foundational models, with only a cursory mention of smaller models for language modeling. Independently, there has been growing interest in smaller language models.

large language model, machine learning, natural language

Genre:
- Overview (1.00)
- Research Report
  - Promising Solution (0.45)
  - New Finding (0.45)

Industry:
- Information Technology (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
