stan
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Robust Amortized Bayesian Inference with Self-Consistency Losses on Unlabeled Data
Mishra, Aayush, Habermann, Daniel, Schmitt, Marvin, Radev, Stefan T., Bürkner, Paul-Christian
Neural amortized Bayesian inference (ABI) can solve probabilistic inverse problems orders of magnitude faster than classical methods. However, neural ABI is not yet sufficiently robust for widespread and safe applicability. In particular, when performing inference on observations outside of the scope of the simulated data seen during training, for example, because of model misspecification, the posterior approximations are likely to become highly biased. Due to the bad pre-asymptotic behavior of current neural posterior estimators in the out-of-simulation regime, the resulting estimation biases cannot be fixed in acceptable time by just simulating more training data. In this proof-of-concept paper, we propose a semi-supervised approach that enables training not only on (labeled) simulated data generated from the model, but also on unlabeled data originating from any source, including real-world data. To achieve the latter, we exploit Bayesian self-consistency properties that can be transformed into strictly proper losses without requiring knowledge of true parameter values, that is, without requiring data labels. The results of our initial experiments show remarkable improvements in the robustness of ABI on out-of-simulation data. Even if the observed data is far away from both labeled and unlabeled training data, inference remains highly accurate. If our findings also generalize to other scenarios and model classes, we believe that our new method represents a major breakthrough in neural ABI.
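The Bayesian self-consistency property the abstract alludes to can be turned into a label-free loss: for the exact posterior, log p(y) = log p(theta) + log p(y|theta) - log p(theta|y) is the same constant for every theta, so its variance across parameter draws penalizes inconsistent posterior approximations. A minimal sketch (the function names and toy setup are illustrative, not the paper's implementation):

```python
import numpy as np

def self_consistency_loss(log_prior, log_lik, log_q_posterior, thetas, y):
    """Variance of the implied log marginal likelihood across parameter draws.

    For the exact posterior, log p(theta) + log p(y|theta) - log p(theta|y)
    equals log p(y) for every theta, so the variance across draws is zero;
    deviations from zero penalize an inconsistent posterior approximation,
    without ever needing the true theta that generated y.
    """
    log_marginals = np.array([
        log_prior(th) + log_lik(y, th) - log_q_posterior(th, y)
        for th in thetas
    ])
    return np.var(log_marginals)
```

In a conjugate Gaussian model (theta ~ N(0,1), y|theta ~ N(theta,1)), plugging in the exact posterior N(y/2, 1/2) drives this loss to zero, while a mismatched approximation yields a strictly positive value.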
- North America > United States (0.14)
- Europe > Germany (0.04)
STAN: Smooth Transition Autoregressive Networks
Traditional Smooth Transition Autoregressive (STAR) models offer an effective way to model regime-switching dynamics in time series through smooth regime changes based on specific transition variables. In this paper, we propose a novel approach by drawing an analogy between STAR models and a multilayer neural network architecture. Our proposed neural network architecture mimics the STAR framework, employing multiple layers to simulate the smooth transition between regimes and capturing complex, nonlinear relationships. The network's hidden layers and activation functions are structured to replicate the gradual switching behavior typical of STAR models, allowing for a more flexible and scalable approach to regime-dependent modeling. This research suggests that neural networks can provide a powerful alternative to STAR models, with the potential to enhance predictive accuracy in economic and financial forecasting.
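The STAR mechanism the abstract builds on is compact: a logistic transition function mixes two autoregressive regimes, and its sigmoid shape is exactly what the neural analogy exploits. A minimal sketch of the classical model (parameter names are ours):

```python
import numpy as np

def logistic_transition(s, gamma, c):
    """Smooth transition function G(s; gamma, c) in (0, 1).

    gamma controls the transition speed (gamma -> infinity approaches a
    hard threshold switch, as in TAR models); c is the transition midpoint.
    """
    return 1.0 / (1.0 + np.exp(-gamma * (s - c)))

def star_step(x, phi1, phi2, s, gamma, c):
    """One logistic STAR prediction: a convex mix of two AR regimes.

    x: vector of lagged values; phi1, phi2: AR coefficients of the two
    regimes; s: transition variable (often a lagged value of the series).
    """
    g = logistic_transition(s, gamma, c)
    return (1.0 - g) * np.dot(phi1, x) + g * np.dot(phi2, x)
```

The analogy to a neural network is direct: the logistic transition is a sigmoid activation, so a hidden layer of such units can represent several smooth regime switches at once.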
- North America > United States (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
Transformers Use Causal World Models in Maze-Solving Tasks
Spies, Alex F., Edwards, William, Ivanitskiy, Michael I., Skapars, Adrians, Räuker, Tilman, Inoue, Katsumi, Russo, Alessandra, Shanahan, Murray
Recent studies in interpretability have explored the inner workings of transformer models trained on tasks across various domains, often discovering that these networks naturally develop surprisingly structured representations. When such representations comprehensively reflect the task domain's structure, they are commonly referred to as "World Models" (WMs). In this work, we discover such WMs in transformers trained on maze tasks. In particular, by employing Sparse Autoencoders (SAEs) and analyzing attention patterns, we examine the construction of WMs and demonstrate consistency between the circuit analysis and the SAE feature-based analysis. We intervene upon the isolated features to confirm their causal role and, in doing so, find asymmetries between certain types of interventions. Surprisingly, we find that models are able to reason with respect to a greater number of active features than they see during training, even if attempting to specify these in the input token sequence would lead the model to fail. Furthermore, we observe that varying positional encodings can alter how WMs are encoded in a model's residual stream. By analyzing the causal role of these WMs in a toy domain we hope to make progress toward an understanding of emergent structure in the representations acquired by Transformers, leading to the development of more interpretable and controllable AI systems.
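The SAE technique used above decomposes residual-stream activations into sparse, interpretable feature directions. A generic sketch of the forward pass and training loss (a standard SAE formulation; the weight names and the L1 coefficient are illustrative, not taken from this paper):

```python
import numpy as np

def sae_forward(x, W_enc, b_enc, W_dec, b_dec, l1_coeff=1e-3):
    """One forward pass of a sparse autoencoder on a residual-stream vector.

    Features f = relu(W_enc @ x + b_enc) are kept sparse by an L1 penalty;
    the reconstruction x_hat = W_dec @ f + b_dec is trained to match x, so
    each decoder column can be read as a candidate world-model feature.
    """
    f = np.maximum(W_enc @ x + b_enc, 0.0)   # sparse feature activations
    x_hat = W_dec @ f + b_dec                # reconstruction of the input
    loss = np.sum((x - x_hat) ** 2) + l1_coeff * np.sum(np.abs(f))
    return f, x_hat, loss
```

Interventions of the kind described then amount to editing entries of f (activating or ablating features) and decoding back into the residual stream.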
Automatic Variational Inference in Stan
Variational inference is a scalable technique for approximate Bayesian inference. Deriving variational inference algorithms requires tedious model-specific calculations; this makes it difficult for non-experts to use. We propose an automatic variational inference algorithm, automatic differentiation variational inference (ADVI); we implement it in Stan (code available), a probabilistic programming system. In ADVI the user provides a Bayesian model and a dataset, nothing else. We make no conjugacy assumptions and support a broad class of models.
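The core of ADVI is a mean-field Gaussian approximation on the unconstrained parameter space, with the ELBO estimated by Monte Carlo via the reparameterization trick. A minimal sketch of that estimator (a simplified illustration, not Stan's implementation; the Jacobian of any constraining transform is assumed folded into log_joint):

```python
import numpy as np

def advi_elbo(log_joint, mu, log_sigma, n_draws=100, rng=None):
    """Monte Carlo ELBO for a mean-field Gaussian q over unconstrained params.

    log_joint(zeta) is log p(y, zeta) on the unconstrained space. Draws use
    the reparameterization zeta = mu + sigma * eps with eps ~ N(0, I), which
    is what makes the ELBO differentiable in (mu, log_sigma); the Gaussian
    entropy term is available in closed form.
    """
    rng = np.random.default_rng(rng)
    d = len(mu)
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal((n_draws, d))
    zetas = mu + sigma * eps
    expected_log_joint = np.mean([log_joint(z) for z in zetas])
    entropy = 0.5 * d * (1.0 + np.log(2 * np.pi)) + np.sum(log_sigma)
    return expected_log_joint + entropy
```

Maximizing this quantity over (mu, log_sigma) with stochastic gradients is the automatic part: the user supplies only log_joint, exactly as the abstract describes.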
Scalable Inference for Bayesian Multinomial Logistic-Normal Dynamic Linear Models
Saxena, Manan, Chen, Tinghua, Silverman, Justin D.
Many scientific fields collect longitudinal multivariate count data where the total number of counts is arbitrary (e.g., multinomial observations). These data are often called count compositional as the information in the data relates to the relative frequencies of the categories (Silverman et al., 2018). These data occur frequently in molecular biology (Espinoza et al., 2020), microbiome studies (Silverman et al., 2018; Joseph et al., 2020; Äijö et al., 2018), natural language processing (Linderman et al., 2015), biomedicine (Fokianos and Kedem, 2003), and social sciences (Cargnoni et al., 1997). Although the counting process used to collect these data is often modeled as multinomial, other sources of noise in the system being studied often lead to extra-multinomial variation. While some account for this extra-multinomial variability with multinomial-Dirichlet models (Mosimann, 1962), multinomial logistic-normal models are often superior, as they can account for both positive and negative covariation between multinomial categories (Aitchison and Shen, 1980; Cargnoni et al., 1997; Joseph et al., 2020; Silverman et al., 2018). Moreover, under suitable transformation (i.e., link function), the logistic-normal is multivariate Gaussian.
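The generative structure described above, a multivariate Gaussian pushed through a log-ratio link into a multinomial, can be sketched in a few lines. Here the inverse additive log-ratio (ALR) transform is used, with the last category as reference (a common convention; the function name is ours):

```python
import numpy as np

def sample_multinomial_logistic_normal(n_trials, mu, cov, rng=None):
    """Draw one multinomial logistic-normal observation.

    eta ~ N(mu, cov) lives on R^(D-1); the inverse additive log-ratio (ALR)
    transform maps it to a composition pi on the D-simplex, and the counts
    are multinomial given pi. The logistic-normal layer is what induces the
    extra-multinomial variation and the positive/negative covariation
    between categories noted above.
    """
    rng = np.random.default_rng(rng)
    eta = rng.multivariate_normal(mu, cov)
    expanded = np.concatenate([np.exp(eta), [1.0]])  # last category is reference
    pi = expanded / expanded.sum()
    return rng.multinomial(n_trials, pi)
```

Because eta is multivariate Gaussian under this link, temporal dependence can be added by letting eta evolve as a Gaussian dynamic linear model, which is precisely the model class of the paper.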
- North America > United States > Pennsylvania (0.04)
- North America > United States > New York (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Explaining the (Not So) Obvious: Simple and Fast Explanation of STAN, a Next Point of Interest Recommendation System
Yunus, Fajrian, Abdessalem, Talel
A lot of effort in recent years has been expended to explain machine learning systems. However, some machine learning methods are inherently explainable and thus are not complete black boxes. This enables developers to make sense of the output without developing a complex and expensive explainability technique. Beyond that, explainability should be tailored to the context of the problem. In a recommendation system that relies on collaborative filtering, the recommendation is based on the behaviors of similar users, so the explanation should tell which other users are similar to the current user. Similarly, if the recommendation system is based on sequence prediction, the explanation should also tell which input timesteps are the most influential. We demonstrate this paradigm in STAN (Spatio-Temporal Attention Network for Next Location Recommendation), a next Point of Interest recommendation system based on collaborative filtering and sequence prediction. We also show that the explanation helps to "debug" the output.
- Europe > France > Île-de-France (0.04)
- North America > United States > New York (0.04)
- Asia > Middle East > Jordan (0.04)
Amortized Bayesian Multilevel Models
Habermann, Daniel, Schmitt, Marvin, Kühmichel, Lars, Bulling, Andreas, Radev, Stefan T., Bürkner, Paul-Christian
Obtaining accurate inference and faithful uncertainty quantification in reasonable time is a frontier of today's statistical research (Cranmer et al., 2020). One major difficulty arising in most experimental and almost all observational data is the presence of complex dependency structures, for example, due to natural groupings (e.g., data gathered in different countries) or repeated measurements of the same observational units over time (e.g., particles, bacteria, or people; Gelman and Hill, 2006). To leverage these dependency structures, multilevel models (MLMs), also referred to as latent variable, hierarchical, random, or mixed effects models, have become an integral part of modern Bayesian statistics (Goldstein, 2011; Gelman et al., 2013; McGlothlin and Viele, 2018; Finch et al., 2019; Yao et al., 2022). Despite the wide success of Bayesian MLMs across the quantitative sciences, a major challenge is their limited efficiency and scalability when dealing with large and complex data. This is because estimating the full posterior distribution of all parameters of interest can be very costly (Gelman et al., 2013).
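The grouping structure described above is easiest to see in the simplest MLM, a two-level random-intercept model; a small simulation sketch (variable names are ours):

```python
import numpy as np

def simulate_multilevel(n_groups, n_per_group, mu, tau, sigma, rng=None):
    """Simulate a two-level (random-intercept) multilevel model.

    Group-level means theta_j ~ N(mu, tau^2) encode the natural grouping
    (e.g., countries); observations y_ij ~ N(theta_j, sigma^2) are repeated
    measurements within each group. Full Bayesian inference targets
    p(mu, tau, theta_1..J | y), which is what becomes costly as the number
    of groups and per-group observations grows.
    """
    rng = np.random.default_rng(rng)
    theta = rng.normal(mu, tau, size=n_groups)
    y = rng.normal(theta[:, None], sigma, size=(n_groups, n_per_group))
    return theta, y
```

The dimension of the latent parameter vector grows with the number of groups J, which is one concrete reason the full posterior becomes expensive in exactly the large-data settings the paper targets.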
- North America > United States (0.28)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (14 more...)
- Transportation (0.95)
- Health & Medicine (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Unveiling the Power of Self-supervision for Multi-view Multi-human Association and Tracking
Feng, Wei, Wang, Feifan, Han, Ruize, Qian, Zekun, Wang, Song
Multi-view multi-human association and tracking (MvMHAT) is a new but important problem for multi-person scene video surveillance: it aims to track a group of people over time in each view and to identify the same person across different views at the same time. This differs from previous MOT and multi-camera MOT tasks, which consider only over-time human tracking. As a result, videos for MvMHAT require more complex annotations while containing more information for self-supervised learning. In this work, we tackle this problem with a self-supervised-learning-aware end-to-end network. Specifically, we propose to take advantage of the spatial-temporal self-consistency rationale by considering three properties: reflexivity, symmetry, and transitivity. Besides the reflexivity property, which naturally holds, we design self-supervised learning losses based on the symmetry and transitivity properties, for both appearance feature learning and assignment matrix optimization, to associate multiple humans over time and across views. Furthermore, to promote research on MvMHAT, we build two new large-scale benchmarks for network training and testing of different algorithms. Extensive experiments on the proposed benchmarks verify the effectiveness of our method. We have released the benchmark and code to the public.
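The symmetry and transitivity properties mentioned above can be written directly as losses on soft assignment matrices: matching a->b must agree with matching b->a transposed, and chaining a->b->c must agree with the direct a->c assignment. A minimal sketch of such consistency penalties (a generic illustration with squared-error penalties, not the paper's exact losses):

```python
import numpy as np

def symmetry_loss(S_ab, S_ba):
    """Penalize disagreement between the a->b assignment and the
    transposed b->a assignment; zero for any consistent matching."""
    return np.mean((S_ab - S_ba.T) ** 2)

def transitivity_loss(S_ab, S_bc, S_ac):
    """Penalize disagreement between the chained a->b->c assignment and
    the direct a->c assignment; zero for consistent matchings.

    S_xy[i, j] is the soft probability that person i in view (or frame) x
    corresponds to person j in view y.
    """
    chained = S_ab @ S_bc
    return np.mean((chained - S_ac) ** 2)
```

Both losses vanish for the true (permutation-matrix) assignments, which is why they can supervise the network without any identity labels.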
- North America > United States > South Carolina > Richland County > Columbia (0.14)
- Asia > China > Tianjin Province > Tianjin (0.04)