AITopics | deep state space model


deep state space model


Deep State Space Models for Time Series Forecasting

Neural Information Processing Systems | Nov-20-2025, 22:17:21 GMT

We present a novel approach to probabilistic time series forecasting that combines state space models with deep learning. By parametrizing a per-time-series linear state space model with a jointly-learned recurrent neural network, our method retains desired properties of state space models such as data efficiency and interpretability, while making use of the ability to learn complex patterns from raw data offered by deep learning approaches. Our method scales gracefully from regimes where little training data is available to regimes where data from millions of time series can be leveraged to learn accurate models. We provide qualitative as well as quantitative results with the proposed method, showing that it compares favorably to the state-of-the-art.
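The abstract above describes a linear state space model whose parameters are supplied by a recurrent network. As a minimal sketch of the state space half only, the following evaluates the log-likelihood of a scalar local-level model with a Kalman filter. The function name `local_level_loglik`, the local-level structure, and the hand-written parameter lists are illustrative assumptions; in the setting the abstract describes, the per-step parameters would come from the jointly learned network rather than being fixed by hand.

```python
import math


def local_level_loglik(observations, state_stds, obs_stds, prior_mean, prior_var):
    """Log-likelihood of a scalar local-level state space model (Kalman filter).

    Illustrative model, not taken from the paper:
        level_t = level_{t-1} + state_std_t * noise
        z_t     = level_t     + obs_std_t   * noise
    The prior (prior_mean, prior_var) describes the level before the first step.
    """
    mean, var = prior_mean, prior_var
    loglik = 0.0
    for z, state_std, obs_std in zip(observations, state_stds, obs_stds):
        # Predict: the random-walk transition inflates the state variance.
        var = var + state_std ** 2
        # The one-step-ahead observation is Gaussian: N(mean, innovation_var).
        innovation_var = var + obs_std ** 2
        residual = z - mean
        loglik += -0.5 * (math.log(2.0 * math.pi * innovation_var)
                          + residual ** 2 / innovation_var)
        # Update: condition the level on the new observation.
        gain = var / innovation_var
        mean = mean + gain * residual
        var = (1.0 - gain) * var
    return loglik


# Hypothetical per-step parameters standing in for network outputs.
series = [1.0, 1.0]
print(local_level_loglik(series, [0.0, 0.0], [1.0, 1.0], prior_mean=0.0, prior_var=1.0))
```

Because the likelihood is a closed-form function of the per-step parameters, it can serve as a training objective for whatever model produces those parameters.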

deep state space model, name change, time series forecasting, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Deep State Space Models for Unconditional Word Generation

Neural Information Processing Systems | Nov-20-2025, 21:46:53 GMT

Autoregressive feedback is considered a necessity for successful unconditional text generation using stochastic sequence models. However, such feedback is known to introduce systematic biases into the training process, and it obscures a principle of generation: committing to global information and forgetting local nuances. We show that a non-autoregressive deep state space model with a clear separation of global and local uncertainty can be built from only two ingredients: an independent noise source and a deterministic transition function. Recent advances in flow-based variational inference can be used to train with an evidence lower bound without resorting to annealing, auxiliary losses, or similar measures. The result is a highly interpretable generative model on par with comparable autoregressive models on the task of word generation.

deep state space model, name change, unconditional word generation, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.62)
Information Technology > Artificial Intelligence > Machine Learning (0.42)


Layer-Adaptive State Pruning for Deep State Space Models

Neural Information Processing Systems | May-26-2025, 16:51:43 GMT

Due to the lack of state dimension optimization methods, deep state space models (SSMs) have sacrificed model capacity, training search space, or stability to alleviate computational costs caused by high state dimensions. In this work, we provide a structured pruning method for SSMs, Layer-Adaptive STate pruning (LAST), which reduces the state dimension of each layer while minimizing model-level output energy loss, by extending modal truncation for a single system. LAST scores are evaluated using the $\mathcal{H}_{\infty}$ norms of subsystems and layer-wise energy normalization. The scores serve as global pruning criteria, enabling cross-layer comparison of states and layer-adaptive pruning. Notably, we demonstrate that, on average, pruning 33% of states still maintains performance with 0.52% accuracy loss in multi-input multi-output SSMs without retraining.
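The idea of scoring states by subsystem norm and comparing them across layers can be sketched as follows. This is a simplified illustration under stated assumptions, not the LAST score from the paper: it assumes discrete-time diagonal SSM layers with stable eigenvalues, uses the closed-form H-infinity norm of each one-state subsystem, and normalizes by each layer's total squared norm, which is a stand-in for the paper's layer-wise energy normalization. All function names are hypothetical.

```python
import math


def hinf_norm_per_state(eigenvalues, b_rows, c_cols):
    """H-infinity norm of each one-state subsystem of a diagonal discrete-time SSM.

    State i has transfer function c_i b_i^T / (z - lambda_i); its peak gain on the
    unit circle is ||c_i|| * ||b_i|| / (1 - |lambda_i|) when |lambda_i| < 1.
    """
    norms = []
    for lam, b, c in zip(eigenvalues, b_rows, c_cols):
        if abs(lam) >= 1.0:
            raise ValueError("eigenvalues must lie strictly inside the unit circle")
        b_norm = math.sqrt(sum(abs(x) ** 2 for x in b))
        c_norm = math.sqrt(sum(abs(x) ** 2 for x in c))
        norms.append(b_norm * c_norm / (1.0 - abs(lam)))
    return norms


def normalized_scores(norms):
    """Assumed normalization: each state's share of the layer's total squared norm."""
    total = sum(h * h for h in norms)
    return [h * h / total for h in norms]


def states_to_prune(layers, prune_fraction):
    """Rank all states of all layers together and return the lowest-scoring ones.

    `layers` is a list of (eigenvalues, b_rows, c_cols) tuples, one per layer.
    Returns (layer_index, state_index) pairs.
    """
    scored = []
    for layer_idx, (eigs, b_rows, c_cols) in enumerate(layers):
        scores = normalized_scores(hinf_norm_per_state(eigs, b_rows, c_cols))
        scored.extend((s, layer_idx, i) for i, s in enumerate(scores))
    scored.sort()
    n_prune = int(prune_fraction * len(scored))
    return [(layer_idx, i) for _, layer_idx, i in scored[:n_prune]]


# Two small hypothetical layers with single-input, single-output states.
layer_a = ([0.5, 0.9], [[1.0], [1.0]], [[1.0], [1.0]])
layer_b = ([0.5, 0.5, 0.5], [[1.0], [1.0], [2.0]], [[1.0], [1.0], [1.0]])
print(states_to_prune([layer_a, layer_b], prune_fraction=0.5))
```

Because the scores are normalized per layer before the global sort, the number of states removed can differ from layer to layer, which is the layer-adaptive behavior the abstract refers to.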

artificial intelligence, deep state space model, layer-adaptive state pruning, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)


Exploring Adversarial Robustness of Deep State Space Models

Neural Information Processing Systems | May-26-2025, 16:01:08 GMT

Deep State Space Models (SSMs) have proven effective in numerous task scenarios but face significant security challenges due to Adversarial Perturbations (APs) in real-world deployments. Adversarial Training (AT) is a mainstream approach to enhancing Adversarial Robustness (AR) and has been validated on various traditional DNN architectures. However, its effectiveness in improving the AR of SSMs remains unclear. While many enhancements in SSM components, such as integrating Attention mechanisms and expanding to data-dependent SSM parameterizations, have brought significant gains in Standard Training (ST) settings, their potential benefits in AT remain unexplored. To investigate this, we evaluate existing structural variants of SSMs with AT to assess their AR performance. We observe that pure SSM structures struggle to benefit from AT, whereas incorporating Attention yields a markedly better trade-off between robustness and generalization for SSMs in AT compared to other components.

adversarial robustness, artificial intelligence, deep state space model, (3 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (0.60)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.63)
Information Technology > Security & Privacy (0.60)


Numerical Analysis of HiPPO-LegS ODE for Deep State Space Models

Park, Jaesung R., Suh, Jaewook J., Ryu, Ernest K.

arXiv.org Artificial Intelligence | Dec-11-2024

In deep learning, the recently introduced state space models utilize HiPPO (High-order Polynomial Projection Operators) memory units to approximate continuous-time trajectories of input functions using ordinary differential equations (ODEs), and these techniques have shown empirical success in capturing long-range dependencies in long input sequences. However, the mathematical foundations of these ODEs, particularly the singular HiPPO-LegS (Legendre Scaled) ODE, and their corresponding numerical discretizations remain unexplored. In this work, we fill this gap by establishing that the HiPPO-LegS ODE is well-posed despite its singularity, albeit without the freedom of arbitrary initial conditions, and by establishing convergence of the associated numerical discretization schemes for Riemann-integrable input functions.
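To make the singular ODE concrete, here is a minimal sketch that assumes the commonly used form of HiPPO-LegS, c'(t) = -(1/t) A c(t) + (1/t) B f(t), with A lower triangular (A[n][k] = sqrt((2n+1)(2k+1)) for n > k, A[n][n] = n + 1) and B[n] = sqrt(2n+1). It applies a backward Euler step, one of several possible discretizations and not necessarily the scheme analyzed in the paper. The initial value is set to c(0) = (f(0), 0, ..., 0), which satisfies A c(0) = B f(0); treating this as the admissible initial condition is an assumption of the sketch. Function names are hypothetical.

```python
import math


def hippo_legs_matrices(n):
    """Return (A, B) for the assumed standard HiPPO-LegS form, as nested lists."""
    B = [math.sqrt(2 * i + 1) for i in range(n)]
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            A[i][j] = B[i] * B[j]  # strictly lower-triangular part
        A[i][i] = i + 1.0          # diagonal
    return A, B


def legs_backward_euler(f_samples, n):
    """Integrate the LegS ODE on equally spaced samples f(0), f(dt), f(2 dt), ...

    Backward Euler at t_{k+1} = (k+1) dt gives
        (I + A / (k+1)) c_{k+1} = c_k + B f_{k+1} / (k+1),
    so the step size dt cancels. The system is lower triangular and is solved
    by forward substitution.
    """
    A, B = hippo_legs_matrices(n)
    c = [0.0] * n
    c[0] = f_samples[0]  # assumed consistent initial condition: A c(0) = B f(0)
    for k in range(len(f_samples) - 1):
        s = 1.0 / (k + 1)
        rhs = [c[i] + s * B[i] * f_samples[k + 1] for i in range(n)]
        new_c = [0.0] * n
        for i in range(n):
            acc = rhs[i] - sum(s * A[i][j] * new_c[j] for j in range(i))
            new_c[i] = acc / (1.0 + s * A[i][i])
        c = new_c
    return c


# Input f(t) = t sampled on [0, 2]; the coefficients summarize f over that window.
samples = [0.01 * j for j in range(201)]
print(legs_backward_euler(samples, 4))
```

The 1/t factor is the source of the singularity at t = 0. In this discretization it appears as the 1/(k+1) weight, which is why the first step is well defined even though the continuous-time right-hand side is not defined at t = 0.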

artificial intelligence, deep state space model, numerical analysis, (1 more...)

arXiv.org Artificial Intelligence

2412.08595

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)


Deep State Space Models for Time Series Forecasting

Neural Information Processing Systems | Oct-8-2024, 16:29:42 GMT

deep state space model, regime, time series forecasting

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Reviews: Deep State Space Models for Time Series Forecasting

Neural Information Processing Systems | Oct-7-2024, 10:58:09 GMT

Effective Bayesian Modeling of Groups of Related Count Time Series.

covariate, deep state space model, time series forecasting, (11 more...)

Neural Information Processing Systems

Country: North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.41)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.36)


Reviews: Deep State Space Models for Unconditional Word Generation

Neural Information Processing Systems | Oct-7-2024, 05:00:56 GMT

This paper introduces a probabilistic model for unconditional word generation that uses state space models whose distributions are parameterized with deep neural networks. Normalizing flows are used to define flexible distributions both in the generative model and in the inference network. To improve inference, the inference network uses samples from the prior SSM transitions, borrowing ideas from importance-weighted autoencoders. I enjoyed reading this paper, as it gives many useful insights on deep state space models and, more generally, on probabilistic models for sequential data. It also introduces novel ways of parameterizing the inference network by constructing a variational approximation over the noise term rather than the state.

deep state space model, inference network, unconditional word generation, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)


Towards a theory of learning dynamics in deep state space models

Smékal, Jakub, Smith, Jimmy T. H., Kleinman, Michael, Biderman, Dan, Linderman, Scott W.

arXiv.org Machine Learning | Jul-9-2024

State space models (SSMs) have shown remarkable empirical performance on many long sequence modeling tasks, but a theoretical understanding of these models is still lacking. In this work, we study the learning dynamics of linear SSMs to understand how covariance structure in data, latent state size, and initialization affect the evolution of parameters throughout learning with gradient descent. We show that focusing on the learning dynamics in the frequency domain affords analytical solutions under mild assumptions, and we establish a link between one-dimensional SSMs and the dynamics of deep linear feed-forward networks. Finally, we analyze how latent state over-parameterization affects convergence time and describe future work in extending our results to the study of deep SSMs with nonlinear connections. This work is a step toward a theory of learning dynamics in deep state space models.

frequency domain, international conference, ssm, (14 more...)

arXiv.org Machine Learning

2407.07279

Country: North America > United States > California > Santa Clara County > Palo Alto (0.05)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.97)


Interpretable Latent Variables in Deep State Space Models

Wu, Haoxuan, Matteson, David S., Wells, Martin T.

arXiv.org Machine Learning | May-19-2022

We introduce a new version of deep state-space models (DSSMs) that combines a recurrent neural network with a state-space framework to forecast time series data. The model estimates the observed series as functions of latent variables that evolve non-linearly through time. Due to the complexity and non-linearity inherent in DSSMs, previous works on DSSMs typically produced latent variables that are very difficult to interpret. Our paper focuses on producing interpretable latent parameters with two key modifications. First, we simplify the predictive decoder by restricting the response variables to be a linear transformation of the latent variables plus some noise. Second, we utilize shrinkage priors on the latent variables to reduce redundancy and improve robustness. These changes make the latent variables much easier to understand and allow us to interpret the resulting latent variables as random effects in a linear mixed model. We show on two public benchmark datasets that the resulting model improves forecasting performance.

artificial intelligence, interpretable latent variable, machine learning, (1 more...)

arXiv.org Machine Learning

2203.02057

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)
