Recursive Inference Scaling: A Winning Path to Scalable Inference in Language and Multimodal Systems

Jun-13-2026, 12:33:45 GMT–Neural Information Processing Systems

Inspired by recent findings on the fractal geometry of language, we introduce Recursive INference Scaling (RINS) as a complementary, plug-in recipe for scaling inference time in language and multimodal systems. RINS is a particular form of recursive depth that significantly outperforms more than 55 other variants, including the recent repeat-all-over (RAO) strategy in MobileLLM (Liu et al., 2024) and latent recurrent thinking (Geiping et al., 2025). Unlike prior work, we carry out our comparisons in a compute-matched regime and demonstrate that, for a fixed model size and training compute budget, RINS substantially improves language modeling performance.
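To make "recursive depth" and "compute-matched" concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes a model built from two small residual blocks and a hypothetical schedule that reuses the first block several times before applying the second. The block definition, the schedule `[0, 0, 0, 1]`, and all helper names are illustrative assumptions; the abstract does not specify the exact RINS configuration. The point is only that reusing weights leaves the parameter count unchanged while raising per-pass compute, so a fixed training budget buys fewer training steps.

```python
import math
import random


def make_block(dim, seed):
    """Toy stand-in for a network block: a random dim x dim weight matrix."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def apply_block(weights, x):
    """One residual application of a block: y = x + tanh(W x)."""
    return [
        xi + math.tanh(sum(w * v for w, v in zip(row, x)))
        for row, xi in zip(weights, x)
    ]


def forward(blocks, schedule, x):
    """Apply blocks in schedule order; a repeated index reuses the same weights."""
    for idx in schedule:
        x = apply_block(blocks[idx], x)
    return x


def count_params(blocks):
    """Total number of weights, independent of how often each block is reused."""
    return sum(len(row) for weights in blocks for row in weights)


def forward_cost(blocks, schedule):
    """Multiply-accumulate operations in one forward pass under the schedule."""
    return sum(len(blocks[i]) * len(blocks[i][0]) for i in schedule)


def steps_for_budget(budget, cost_per_step):
    """Training steps affordable under a fixed compute budget (toy accounting)."""
    return budget // cost_per_step


dim = 8
blocks = [make_block(dim, seed=s) for s in range(2)]

baseline = [0, 1]            # each block applied once
recursive = [0, 0, 0, 1]     # hypothetical: first block applied three times

x = [1.0] * dim
y_base = forward(blocks, baseline, x)
y_rec = forward(blocks, recursive, x)

budget = 12_800              # arbitrary illustrative compute budget
base_steps = steps_for_budget(budget, forward_cost(blocks, baseline))
rec_steps = steps_for_budget(budget, forward_cost(blocks, recursive))
```

In this toy accounting both schedules share the same 128 weights, but the recursive schedule costs twice as much per pass, so it receives half as many steps under the same budget. A compute-matched comparison asks whether the recursive model still comes out ahead despite that handicap.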

artificial intelligence, large language model, natural language

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)