AITopics | recurrence

Collaborating Authors

recurrence

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Deep Continuous-Time State-Space Models for Marked Event Sequences

Neural Information Processing SystemsJun-23-2026, 01:56:23 GMT

Marked temporal point processes (MTPPs) model sequences of events occurring at irregular time intervals, with wide-ranging applications in fields such as healthcare, finance and social networks. We propose the state-space point process (S2P2) model, a novel and performant model that leverages techniques derived for modern deep state-space models (SSMs) to overcome limitations of existing MTPP models, while simultaneously imbuing strong inductive biases for continuous-time event sequences that other discrete sequence models (i.e., RNNs, transformers) do not capture. Inspired by the classical linear Hawkes processes, we propose an architecture that interleaves stochastic jump differential equations with nonlinearities to create a highly expressive intensity-based MTPP model, without the need for restrictive parametric assumptions for the intensity. Our approach enables efficient training and inference with a parallel scan, bringing linear complexity and sublinear scaling while retaining expressivity to MTPPs. Empirically, S2P2 achieves state-of-the-art predictive likelihoods across eight real-world datasets, delivering an average improvement of 33% over the best existing approaches.

artificial intelligence, machine learning, sequence, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.67)

Industry:

Health & Medicine (1.00)
Information Technology > Services (0.48)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Add feedback

ABio Inspired Oscillatory State System with Temporal Dynamics

Neural Information Processing SystemsJun-23-2026, 00:38:32 GMT

Today's deep learning architectures are primarily based on perceptron models, which do not capture the oscillatory dynamics characteristic of biological neural activity. Although oscillatory systems have recently gained attention for their closer resemblance to neural behavior, they often lack a structured mechanism to represent rich spatio-temporal dynamics in a controllable and interpretable manner. In this paper, we propose a bio-inspired oscillatory state system (BioOSS), a 2D topographically organized oscillatory state-space model designed to generate diverse oscillation-driven spatio-temporal patterns. BioOSS comprises two coupled state components: punits that represent membrane-potential-like variables inspired by pyramidal-cell activity, and o units that act as velocity-like latent states controlling phase, time scales, and damping. The model incorporates trainable parameters for damping and effective oscillation rates, enabling flexible adaptation to task-specific temporal structures while remaining efficient for long-sequence learning via scanfriendly diagonal dynamics. We evaluate BioOSS on both synthetic and real-world tasks, demonstrating superior performance and enhanced interpretability compared to alternative architectures.

artificial intelligence, deep learning, machine learning, (20 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry:

Energy (0.88)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

From Euler to AI: Unifying Formulas for Mathematical Constants

Neural Information Processing SystemsJun-22-2026, 01:47:21 GMT

The constant πhas fascinated scholars throughout the centuries, inspiring numerous formulas for its evaluation, such as infinite sums and continued fractions. Despite their individual significance, many of the underlying connections among formulas remain unknown, missing unifying theories that could unveil deeper understanding. The absence of a unifying theory reflects a broader challenge across math and science: knowledge is typically accumulated through isolated discoveries, while deeper connections often remain hidden. In this work, we present an automated framework for the unification of mathematical formulas. Our system combines large language models (LLMs) for systematic formula harvesting, an LLM-code feedback loop for validation, and a novel symbolic algorithm for clustering and eventual unification. We demonstrate this methodology on the hallmark case of π, an ideal testing ground for symbolic unification. Applying this approach to 455,050 arXiv papers, we validate 385 distinct formulas for π and prove relations between 360 (94%) of them, of which 166 (43%) can be derived from a single mathematical object--linking canonical formulas by Euler, Gauss, Brouncker, and newer ones from algorithmic discoveries by the Ramanujan Machine. Our method generalizes to other constants, including e, ζ(3), and Catalan's constant, demonstrating the potential of AI-assisted mathematics to uncover hidden structures and unify knowledge across domains.

formula, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Linear Attention for Efficient Bidirectional Sequence Modeling

Neural Information Processing SystemsJun-18-2026, 10:17:14 GMT

Linear Transformers and State Space Models have emerged as efficient alternatives to softmax Transformers for causal sequence modeling, enabling parallel training via matrix multiplication and efficient RNN-style inference. However, despite their success in causal tasks, no unified framework exists for applying Linear Transformers to bidirectional sequence modeling. We introduce LION, the first framework to systematically extend Linear Transformers to the bidirectional setting. LION generalizes three core representations commonly used in the causal case--full Linear Attention, bidirectional RNN, and chunkwise parallel form--to the bidirectional setting. These forms are theoretically equivalent and enable models to exploit the strengths of each during training and inference. We prove that a broad class of Linear Transformers can be extended using LION and validate our framework via three core examples based on the choice of decay type: LION-LIT, the bidirectional extension of [25]; LION-D, based on [44]; and LION-S, a variant using selective decay [34, 13]. Across standard bidirectional tasks, LION enables models to match or exceed the performance of softmax Transformers, while offering significantly faster training and more efficient inference than existing State Space Models.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Add feedback

Fixed-Point RNNs: Interpolating from Diagonal to Dense

Neural Information Processing SystemsJun-16-2026, 18:07:09 GMT

Linear recurrent neural networks (RNNs) and state-space models (SSMs) such as Mamba have become promising alternatives to softmax-attention as sequence mixing layers in Transformer architectures. Current models, however, do not exhibit the full state-tracking expressivity of RNNs because they rely on channel-wise (i.e.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Neural Information Processing SystemsJun-16-2026, 12:51:53 GMT

We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling to arbitrary depth at test-time. This stands in contrast to mainstream reasoning models that scale up compute by producing more tokens. Unlike approaches based on chain-of-thought, our approach does not require any specialized training data, can work with small context windows, and can capture types of reasoning that are not easily represented in words. We train a proof-of-concept model from scratch with 3.5 billion parameters and 800 billion tokens. We show that this model can effortlessly use varying levels of compute, significantly improving with additional compute especially on reasoning tasks, such as math and coding. Further, this architecture naturally reduces compute costs via zero-shot per-token adaptive compute, KV-cache sharing and speculative decoding.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report > Experimental Study (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Education (1.00)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

RAT Bridging and Attention Accuracy via Chunk based Sequence Modeling

Neural Information Processing SystemsJun-16-2026, 02:53:25 GMT

Transformers have become the cornerstone of modern large-scale language models, but their reliance on softmax attention poses a computational bottleneck at both training and inference. Recurrent models offer high efficiency, but compressing the full sequence into a fixed-size and holistic representation can suffer from memory degradation in long contexts and limit fine-grained retrieval. To address this, we propose RAT, an intermediate design that bridges the efficiency of RNNs and capacity of attention. RATpartitions the input into chunks, applies recurrence within each chunk for local dependencies, and softmax-based attention across chunks for longrange interactions. This design mitigates memory degradation and enables direct access to distant tokens, while retaining computational efficiency. Empirically, with a chunk size of 16, the RAT block achieves a 7 improvement in training speed for 100K sequence length and 9 in generation at the 4K position, while maintaining similar performance compared to standard attention. We demonstrate this by training 1.3B parameter models from scratch and performing large-scale evaluations, including short-and long-context benchmarks, as well as supervised finetuning (SFT). We further propose a hybrid architecture that interleaves RATwith local attention. By combining efficient long-range modeling with strong local interactions, this hybrid design not only improves inference speed and reduces cache memory usage, but also consistently enhances performance and shows the overall best results.

Add feedback

Bridging Expressivity and Scalability with Adaptive Unitary SSMs

Neural Information Processing SystemsJun-15-2026, 13:24:44 GMT

Recent work has revealed that state space models (SSMs), while efficient for longsequence processing, are fundamentally limited in their ability to represent formal languages--particularly due to time-invariant and real-valued recurrence structures. In this work, we draw inspiration from adaptive and structured dynamics observed in biological neural systems and introduce the Adaptive Unitary State Space Model (AUSSM): a novel class of SSMs that leverages skew-symmetric, input-dependent recurrence to achieve unitary evolution and high expressive power. Using algebraic automata theory, we prove that AUSSM can perform modulo counting and simulate solvable group automata at precision logarithmically bounded in the input length, enabling SSMs to model a broad class of regular languages out of reach for other SSM architectures. To overcome the practical inefficiencies of adaptive recurrence, we develop a separable convolution formulation and a CUDA implementation that enables scalable parallel training. Empirically, we show that AUSSM and its hybrid variant--interleaved with Mamba--outperform prior SSMs on formal algorithmic tasks such as parity and modular arithmetic, and achieve competent performance on real-world long time-series classification benchmarks. Our results demonstrate that adaptive unitary recurrence provides a powerful and efficient inductive bias for both symbolic and continuous sequence modeling.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.92)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

Enhanced Cyclic Coordinate Descent Methods for Elastic Net Penalized Linear Models

Neural Information Processing SystemsJun-14-2026, 03:57:22 GMT

We present a novel enhanced cyclic coordinate descent (ECCD) framework for solving generalized linear models with elastic net constraints that reduces training time in comparison to existing state-of-the-art methods. We redesign the CD method by performing a Taylor expansion around the current iterate to avoid nonlinear operations arising in the gradient computation. By introducing this approximation we are able to unroll the vector recurrences occurring in the CD method and reformulate the resulting computations into more efficient batched computations. We show empirically that the recurrence can be unrolled by a tunable integer parameter, $s$, such that $s > 1$ yields performance improvements without affecting convergence, whereas $s = 1$ yields the original CD method. A key advantage of ECCD is that it avoids the convergence delay and numerical instability exhibited by block coordinate descent. Finally, we implement our proposed method in C++ using Eigen to accelerate linear algebra computations. Comparison of our method against existing state-of-the-art solvers show consistent performance improvements of $3\times$ in average for regularization path variant on diverse benchmark datasets.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)

Add feedback

Filters

Collaborating Authors

recurrence

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Deep Continuous-Time State-Space Models for Marked Event Sequences

ABio Inspired Oscillatory State System with Temporal Dynamics

From Euler to AI: Unifying Formulas for Mathematical Constants

Linear Attention for Efficient Bidirectional Sequence Modeling

Fixed-Point RNNs: Interpolating from Diagonal to Dense

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

RAT Bridging and Attention Accuracy via Chunk based Sequence Modeling

Bridging Expressivity and Scalability with Adaptive Unitary SSMs

Enhanced Cyclic Coordinate Descent Methods for Elastic Net Penalized Linear Models

ff9783ec29688387d44779d67d06ef66-Paper-Conference.pdf