Generalization from Starvation: Hints of Universality in LLM Knowledge Graph Learning

David D. Baek, Yuxiao Li, Max Tegmark

arXiv.org Artificial Intelligence 

We investigate how neural networks represent knowledge during graph learning and find hints of universality: equivalent representations emerge across a range of model sizes and training contexts. We show that these attractor representations optimize generalization to unseen examples by exploiting properties of knowledge graph relations. We find experimental support for such universality by showing that LLMs and simpler neural networks can be stitched, i.e., the first part of one model can be joined to the last part of another, mediated only by an affine or almost-affine transformation. We hypothesize that this dynamic toward simplicity and generalization is driven by "intelligence from starvation": overfitting is minimized by pressure to minimize the use of resources that are either scarce or competed for by other tasks.

Large Language Models (LLMs), despite being primarily trained for next-token prediction, have shown impressive reasoning capabilities (Bubeck et al., 2023; Anthropic, 2024; Team et al., 2023). However, despite recent progress reviewed below, it is not well understood what knowledge LLMs represent internally or how they represent it. Improving such understanding could enable valuable progress on transparency, interpretability, fairness, and robustness, for example by discovering and correcting inaccuracies to improve model reliability.
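To make concrete what exploiting a relational property can look like, here is a minimal, self-contained sketch in Python. The triples, relation names, and the choice of symmetry as the exploited property are illustrative assumptions for exposition, not the paper's experimental setup:

```python
# A toy knowledge graph stored as (head, relation, tail) triples.
triples = {
    ("alice", "sibling_of", "bob"),
    ("bob", "friend_of", "carol"),
}

# Which relations are symmetric is an assumption of this toy example;
# a learner that encodes this property can generalize beyond its data.
symmetric_relations = {"sibling_of", "friend_of"}

def holds(head: str, relation: str, tail: str) -> bool:
    """Check a fact, generalizing to unseen triples via symmetry."""
    if (head, relation, tail) in triples:
        return True
    # Symmetry: r(a, b) implies r(b, a), so the reversed edge follows
    # even though it never appeared in the stored (training) set.
    if relation in symmetric_relations and (tail, relation, head) in triples:
        return True
    return False

print(holds("bob", "sibling_of", "alice"))  # True, despite being unseen
```

A model whose internal representation encodes the symmetry of a relation can answer the reversed query correctly even though that triple never appeared during training, which is the flavor of generalization to unseen examples the abstract describes.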
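The stitching experiment can likewise be sketched in code. The following is a hedged illustration in PyTorch, assuming two models split at some intermediate layer; the function names, the least-squares fitting procedure, and the split points are our assumptions, not the authors' released implementation:

```python
import torch
import torch.nn as nn

def fit_affine_stitch(H_a: torch.Tensor, H_b: torch.Tensor) -> nn.Linear:
    """Fit an affine map W h + b sending activations H_a (n, d_a)
    onto target activations H_b (n, d_b) by least squares."""
    n = H_a.shape[0]
    # Append a constant column so the solve also recovers the bias b.
    A = torch.cat([H_a, torch.ones(n, 1)], dim=1)      # (n, d_a + 1)
    sol = torch.linalg.lstsq(A, H_b).solution          # (d_a + 1, d_b)
    stitch = nn.Linear(H_a.shape[1], H_b.shape[1])
    with torch.no_grad():
        stitch.weight.copy_(sol[:-1].T)                # W
        stitch.bias.copy_(sol[-1])                     # b
    return stitch

def stitched_forward(x, model_a_front, stitch, model_b_back):
    """Run the first part of model A, translate its representation
    through the affine stitch, then finish with the last part of model B."""
    h = model_a_front(x)   # activations at the cut layer of model A
    h = stitch(h)          # affine change of basis between the two models
    return model_b_back(h) # remaining layers of model B
```

If the two networks have learned equivalent representations up to an affine change of basis, the stitched composite should recover most of the original models' accuracy; a large drop would count as evidence against the universality hypothesis.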
