Sparse Transformer Architectures via Regularized Wasserstein Proximal Operator with $L_1$ Prior
Fuqun Han, Stanley Osher, Wuchen Li
Modern generative models, such as neural ordinary differential equations (neural ODEs) [4], transformers [25], and diffusion models [22], have demonstrated a remarkable ability to learn and generate samples from complex, high-dimensional probability distributions. These architectures have achieved broad success in scientific computing, image processing, and data science, offering scalable frameworks for data-driven modeling. However, training and sampling in such high-dimensional settings remain expensive and highly sensitive to architectural and optimization choices, and the curse of dimensionality continues to present a fundamental challenge in many real-world applications. Fortunately, numerous problems in scientific computing exhibit intrinsic structure, such as sparsity, low-rank representations, or approximate invariances, that can be interpreted as prior information about the underlying data or operators. Leveraging such priors within generative models offers a promising avenue to improve both computational efficiency and generalization. A classical way to incorporate prior information, such as sparsity or piecewise regularity, is through Bayesian modeling, where the posterior combines a prior distribution encoding structural knowledge with a likelihood function derived from observations.
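To make the last point concrete, one standard sparsity-promoting instance of this Bayesian construction (a generic illustration; the notation $A$, $y$, $\sigma$, $\lambda$ is chosen here and not taken from the paper) combines a Gaussian likelihood with a Laplace, i.e. $L_1$, prior:
$$
\pi(x \mid y) \;\propto\; \exp\!\Big(-\tfrac{1}{2\sigma^2}\|Ax - y\|_2^2\Big)\,\exp\!\big(-\lambda \|x\|_1\big),
$$
so that the log-posterior consists of a smooth data-fidelity term plus the nonsmooth penalty $\lambda\|x\|_1$. Maximizing it recovers classical $L_1$-regularized (lasso-type) estimation, while sampling from $\pi$ must handle the nonsmooth $L_1$ term, which is the kind of structure addressed by proximal-type operators such as the regularized Wasserstein proximal operator with $L_1$ prior named in the title.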
Oct-21-2025