Information Gravity: A Field-Theoretic Model for Token Selection in Large Language Models

Vyshnyvetska, Maryna

arXiv.org Artificial Intelligence 

Large language models (LLMs) have revolutionized the field of artificial intelligence, demonstrating text understanding and generation capabilities approaching human levels. Despite these impressive results, however, the internal mechanisms of these models remain largely a "black box." As Amodei [1] notes in his essay "The Urgency of Interpretability," researchers have limited understanding of why LLMs generate specific responses and how they arrive at their conclusions. This lack of transparency becomes increasingly problematic as LLMs take on central roles in economics, technology, and national security. Of particular concern are phenomena such as unpredictable hallucinations, extreme sensitivity to query formulation, and puzzling patterns in the probability distributions of generated tokens. These phenomena not only limit the reliability of LLMs in critical applications but also point to fundamental gaps in our understanding of how they operate.
