What can we learn from signals and systems in a transformer? Insights for probabilistic modeling and inference architecture

Aug-29-2025–arXiv.org Artificial Intelligence

In the 1940s, Wiener introduced a linear predictor, in which the prediction of a future value is computed as a linear combination of past data. A transformer generalizes this idea: it is a nonlinear predictor in which the next-token prediction is computed as a nonlinear combination of past tokens. In this essay, we present a probabilistic model that interprets transformer signals as surrogates of conditional measures and layer operations as fixed-point updates. An explicit form of the fixed-point update is described for the special case in which the probabilistic model is a hidden Markov model (HMM). In part, this paper is an attempt to bridge classical nonlinear filtering theory and modern inference architectures.
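To make the HMM special case concrete, the sketch below computes a next-token distribution from past tokens using the textbook HMM forward (nonlinear) filter. It is an illustration of the classical filtering recursion only, not the fixed-point update described in the paper, whose explicit form is not given in this abstract. The function name `hmm_next_token_probs` and the matrices `A`, `C`, and `mu0` are hypothetical choices made for this example.

```python
def hmm_next_token_probs(tokens, A, C, mu0):
    """Next-token distribution given past tokens, via the classical HMM forward filter.

    A[i][j] = P(x_{t+1} = j | x_t = i)   hidden-state transition matrix
    C[i][k] = P(z_t = k | x_t = i)       emission matrix over the vocabulary
    mu0[i]  = P(x_0 = i)                 prior over the initial hidden state
    (All three are illustrative assumptions, not taken from the paper.)
    """
    n = len(mu0)
    pi = list(mu0)  # conditional measure of the current hidden state given past tokens
    for z in tokens:
        # Correction step (Bayes' rule): reweight by the likelihood of the observed token.
        weighted = [pi[i] * C[i][z] for i in range(n)]
        total = sum(weighted)
        posterior = [w / total for w in weighted]
        # Prediction step: propagate the posterior through the transition matrix.
        pi = [sum(posterior[i] * A[i][j] for i in range(n)) for j in range(n)]
    # Map the predicted hidden-state measure to a distribution over the next token.
    vocab_size = len(C[0])
    return [sum(pi[i] * C[i][k] for i in range(n)) for k in range(vocab_size)]


# Hypothetical two-state, two-token model used only for illustration.
A = [[0.9, 0.1], [0.2, 0.8]]
C = [[0.8, 0.2], [0.3, 0.7]]
mu0 = [0.5, 0.5]

probs = hmm_next_token_probs([0, 0, 1], A, C, mu0)
print(probs)
```

The prediction is a nonlinear function of the past tokens because of the normalization in the correction step, in contrast to the linear combination used by a Wiener-type predictor.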

artificial intelligence, machine learning, transformer


Country:
- North America > United States > Illinois (0.28)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty (1.00)
  - Machine Learning
    - Neural Networks (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (1.00)
