ReSSFormer: A Recursive Sparse Structured Transformer for Scalable and Long-Context Reasoning
While Transformer architectures have demonstrated impressive scalability across domains, they continue to face challenges in long-context reasoning, computational efficiency, and structural generalization, largely due to rigid layer stacking, dense attention, and reliance on positional encodings. We present ReSSFormer, a Recursive Sparse Structured Transformer that integrates three complementary innovations: a Recurrent Reasoning & Memory Unit (R2MU) for iterative reasoning with bounded depth, an Adaptive Sparse Attention Module (ASAM) for efficient and focused context selection, and a Self-Organizing Encoder Structure (SOES) for position-free structure induction. ReSSFormer replaces conventional depth stacking with recurrent inference, substitutes full attention with token- and expert-level sparsity, and models latent token topology directly from content. Across language modeling, multi-hop QA, and structure-sensitive tasks, ReSSFormer consistently outperforms strong baselines under comparable FLOPs and parameter budgets, highlighting its scalability, efficiency, and structural flexibility.
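Two of the ideas named in the abstract, depth recurrence in place of layer stacking and top-k sparsity in place of full attention, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class names `TopKSparseAttention` and `RecurrentBlock` and the parameters `top_k` and `n_steps` are hypothetical, and ASAM's expert-level sparsity and SOES's content-based topology induction are not modeled here.

```python
# Minimal sketch (not the paper's code) of depth-recurrent inference with a
# weight-tied block and top-k sparse attention; no positional encodings added.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKSparseAttention(nn.Module):
    """Single-head attention that keeps only the top-k scores per query."""
    def __init__(self, dim: int, top_k: int = 8):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        self.top_k = top_k
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) * self.scale         # (B, T, T)
        k_eff = min(self.top_k, scores.size(-1))
        thresh = scores.topk(k_eff, dim=-1).values[..., -1:]  # k-th largest per query
        scores = scores.masked_fill(scores < thresh, float("-inf"))
        return self.proj(F.softmax(scores, dim=-1) @ v)

class RecurrentBlock(nn.Module):
    """One weight-tied block applied n_steps times instead of stacking
    n_steps distinct layers, so depth is bounded by the recurrence count."""
    def __init__(self, dim: int, top_k: int = 8):
        super().__init__()
        self.attn = TopKSparseAttention(dim, top_k)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, n_steps: int = 6) -> torch.Tensor:
        for _ in range(n_steps):               # recurrence replaces depth stacking
            x = x + self.attn(self.norm1(x))
            x = x + self.ffn(self.norm2(x))
        return x

if __name__ == "__main__":
    x = torch.randn(2, 128, 64)                # (batch, tokens, dim)
    print(RecurrentBlock(64)(x).shape)         # torch.Size([2, 128, 64])
```

Weight tying is what keeps the parameter budget fixed in such a design: the same block parameters are reused at every recurrence step, so effective depth grows with `n_steps` while the parameter count stays that of a single layer.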
arXiv.org Artificial Intelligence
Oct 3, 2025