Dissociating model architectures from inference computations

Jul-22-2025–arXiv.org Artificial Intelligence

Parr et al. (2025) examine how autoregressive and deep temporal models differ in their treatment of non-Markovian sequence modelling. Building on this, we highlight the need to dissociate model architectures, i.e., how the predictive distribution factorises, from the computations invoked at inference. We demonstrate that autoregressive models can mimic deep temporal computations by structuring context access during iterative inference. Using a transformer trained on next-token prediction, we show that inducing hierarchical temporal factorisation during iterative inference maintains predictive capacity while invoking fewer computations. This emphasises that the processes for constructing and refining predictions are not necessarily bound to the underlying model architecture.
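To make "structuring context access" concrete, the sketch below contrasts a full causal context with one possible hierarchical context pattern and counts how many context reads each needs. It is an illustration only, not the paper's method: the abstract does not specify the access scheme, so the function hierarchical_mask and its window and stride parameters are hypothetical choices made for this example.

```python
def causal_mask(n):
    """Full autoregressive context: position t may read positions 0..t."""
    return [[s <= t for s in range(n)] for t in range(n)]


def hierarchical_mask(n, window=4, stride=4):
    """Hypothetical structured context (an assumption, not from the paper).

    Position t reads its most recent `window` positions at full resolution,
    and only every `stride`-th position further back in the past.
    """
    mask = []
    for t in range(n):
        row = []
        for s in range(n):
            local = t - window < s <= t                   # fine-grained recent past
            coarse = s <= t - window and s % stride == 0  # subsampled distant past
            row.append(local or coarse)
        mask.append(row)
    return mask


def count_reads(mask):
    """Number of (query, context) pairs the mask allows."""
    return sum(sum(row) for row in mask)


n = 16
full = causal_mask(n)
structured = hierarchical_mask(n)
print("full causal reads:", count_reads(full))
print("structured reads: ", count_reads(structured))
```

The structured mask is a subset of the causal one, so the same next-token model could in principle be queried under either pattern. Whether predictive capacity is preserved is the empirical question the abstract addresses, and this sketch does not test it.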

artificial intelligence, model architecture, object-oriented architecture

Country:
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)

Genre:
- Research Report (0.40)

Industry:
- Health & Medicine > Therapeutic Area > Neurology (0.99)

Technology:
- Information Technology > Artificial Intelligence
  - Cognitive Science (0.99)
  - Representation & Reasoning > Object-Oriented Architecture (0.83)
