Understanding Transformers via N-gram Statistics
Neural Information Processing Systems
Transformer-based large language models (LLMs) display extreme proficiency with language, yet a precise understanding of how they work remains elusive. One way of demystifying transformer predictions would be to describe how they depend on their context in terms of simple template functions. This paper takes a first step in this direction by considering families of functions (i.e. rules) formed out of simple N-gram based statistics of the training data.
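To make the idea concrete, here is a minimal sketch of one such template function: an N-gram rule that maps the last N-1 context tokens to an empirical next-token distribution estimated from the training data, which can then be compared against a model's prediction. This is an illustrative reconstruction under stated assumptions, not the paper's code; the helper names (`build_ngram_rule`, `variation_distance`), the toy corpus, and the whitespace tokenization are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def build_ngram_rule(corpus_tokens, n):
    """Build a simple N-gram rule: map each (n-1)-token context to the
    empirical next-token distribution observed in the training data."""
    counts = defaultdict(Counter)
    for i in range(len(corpus_tokens) - n + 1):
        context = tuple(corpus_tokens[i : i + n - 1])
        next_token = corpus_tokens[i + n - 1]
        counts[context][next_token] += 1
    # Normalize raw counts into probability distributions.
    rule = {}
    for context, counter in counts.items():
        total = sum(counter.values())
        rule[context] = {tok: c / total for tok, c in counter.items()}
    return rule

def variation_distance(p, q):
    """Total variation distance between two next-token distributions:
    one simple way to score how closely a rule tracks a model's output."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in support)

# Toy usage with a whitespace-tokenized corpus (assumed for illustration).
tokens = "the cat sat on the mat the cat ran".split()
rule = build_ngram_rule(tokens, n=2)
print(rule[("the",)])  # {'cat': 0.666..., 'mat': 0.333...}

# Hypothetical transformer next-token distribution for the same context.
model_pred = {"cat": 0.7, "mat": 0.2, "dog": 0.1}
print(variation_distance(rule[("the",)], model_pred))
```

The distance between the rule's distribution and the model's output gives a per-context measure of how well the N-gram statistic describes the transformer's prediction.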