Dobrushin
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Tyne and Wear > Sunderland (0.04)
- Asia > Middle East > Jordan (0.04)
Transformers and Their Roles as Time Series Foundation Models
Wu, Dennis, He, Yihan, Cao, Yuan, Fan, Jianqing, Liu, Han
We give a comprehensive analysis of transformers as time series foundation models, focusing on their approximation and generalization capabilities. First, we demonstrate that there exist transformers that fit an autoregressive model on input univariate time series via gradient descent. We then analyze MOIRAI, a multivariate time series foundation model capable of handling an arbitrary number of covariates. We prove that it is capable of automatically fitting autoregressive models with an arbitrary number of covariates, offering insights into its design and empirical success. For generalization, we establish bounds for pretraining when the data satisfies Dobrushin's condition. Experiments support our theoretical findings, highlighting the efficacy of transformers as time series foundation models.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Hong Kong (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Data Science (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
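The first result in the abstract above concerns transformers that implicitly run gradient descent on an autoregressive objective over the input series. As a point of reference for what is being emulated, here is a minimal NumPy sketch (not the paper's construction; the AR order, step size, and noise level are arbitrary illustrative choices) of fitting an AR(p) model by gradient descent on the one-step-ahead squared error:

```python
import numpy as np

def fit_ar_by_gd(x, p=2, lr=0.05, steps=300):
    """Fit AR(p) coefficients to a univariate series x by gradient descent
    on the mean squared one-step-ahead prediction error."""
    # Lagged design matrix: row t holds (x[t-1], ..., x[t-p]).
    X = np.stack([x[p - k - 1 : len(x) - k - 1] for k in range(p)], axis=1)
    y = x[p:]
    w = np.zeros(p)
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Simulate a stable AR(2) process with coefficients (0.6, -0.3).
rng = np.random.default_rng(0)
x = np.zeros(500)
for t in range(2, 500):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.standard_normal()

w_hat = fit_ar_by_gd(x, p=2)
```

In the paper's setting, an analogous update is carried out by the transformer's forward pass over the context, rather than by an explicit training loop.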
Emergence of meta-stable clustering in mean-field transformer models
Bruno, Giuseppe, Pasqualotto, Federico, Agazzi, Andrea
We model the evolution of tokens within a deep stack of Transformer layers as a continuous-time flow on the unit sphere, governed by a mean-field interacting particle system, building on the framework introduced in (Geshkovski et al., 2023). Studying the corresponding mean-field Partial Differential Equation (PDE), which can be interpreted as a Wasserstein gradient flow, in this paper we provide a mathematical investigation of the long-term behavior of this system, with a particular focus on the emergence and persistence of meta-stable phases and clustering phenomena, key elements in applications like next-token prediction. More specifically, we perform a perturbative analysis of the mean-field PDE around the iid uniform initialization and prove that, in the limit of large number of tokens, the model remains close to a meta-stable manifold of solutions with a given structure (e.g., periodicity). Further, the structure characterizing the meta-stable manifold is explicitly identified, as a function of the inverse temperature parameter of the model, by the index maximizing a certain rescaling of Gegenbauer polynomials.
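The particle system described above is easy to simulate numerically. A rough sketch (an Euler discretization of an idealized single-head attention flow on the unit circle; the inverse temperature, step size, and token count are arbitrary choices, and this is an illustration rather than the paper's mean-field analysis):

```python
import numpy as np

def token_flow(n=64, beta=4.0, dt=0.05, steps=2000, seed=0):
    """Euler steps of an attention-driven particle flow on the unit circle:
    each token moves along the tangential component of the softmax-weighted
    average of all tokens, then is projected back onto the sphere."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    x = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # iid uniform init
    for _ in range(steps):
        logits = beta * (x @ x.T)                     # beta * <x_i, x_j>
        w = np.exp(logits - logits.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)             # attention weights
        drift = w @ x
        # keep only the component tangent to the sphere at each x_i
        drift -= np.sum(drift * x, axis=1, keepdims=True) * x
        x = x + dt * drift
        x /= np.linalg.norm(x, axis=1, keepdims=True)
    return x

tokens = token_flow()
```

Started from an iid uniform configuration, the tokens coalesce into tight clusters that then persist for a long time before merging further, which is the meta-stable behavior the paper analyzes.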
Reviews: HOGWILD!-Gibbs can be PanAccurate
The authors prove theorems about the accuracy of asynchronous Gibbs sampling in graphical models with discrete variables that satisfy Dobrushin's condition. I am not familiar with this literature, so I'm taking the authors' description of the state of the literature as a given. The authors' results are as follows (let n be the number of variables in the graphical model, let t be the time index, and let tau be the maximum expected read delay in the asynchronous sampler):
- Lemma 2: the asynchronous Gibbs sampler can be coupled to a synchronous Gibbs sampler with the same initial state such that the expected Hamming distance between them is bounded by O(tau*log(n)), uniformly in t.
- Lemma 3: an analogous bound holds for the dth moment of the Hamming distance.
- If a function f is K-Lipschitz with respect to the dth power of the Hamming distance, the bias of the asynchronous Gibbs sampler for the expectation of f is bounded by log^d(n), up to additive and multiplicative constants, for sufficiently large t.
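The coupling behind Lemma 2 can be simulated directly. A toy sketch (a ferromagnetic Ising chain well inside the Dobrushin regime; the coupling strength, delay model, and sizes are arbitrary choices, not the paper's construction), comparing an exact single-site Gibbs sampler with a stale-read HOGWILD!-style sampler driven by the same site choices and uniforms:

```python
import numpy as np

def coupled_gibbs(J=0.15, n=50, tau=3, steps=20000, seed=0):
    """Run an exact Gibbs sampler and a stale-read (HOGWILD!-style) Gibbs
    sampler on an Ising chain with shared randomness, and return the final
    Hamming distance between their states."""
    rng = np.random.default_rng(seed)

    def p_plus(state, i):
        # P(spin_i = +1 | neighbours) on a chain with coupling J
        field = J * (state[i - 1] if i > 0 else 0) + \
                J * (state[i + 1] if i < n - 1 else 0)
        return 1.0 / (1.0 + np.exp(-2.0 * field))

    x = np.ones(n, dtype=int)      # exact chain
    y = np.ones(n, dtype=int)      # delayed-read chain
    hist = [y.copy()]              # recent snapshots of the async chain
    for _ in range(steps):
        i = rng.integers(n)
        u = rng.random()           # shared uniform couples the two updates
        x[i] = 1 if u < p_plus(x, i) else -1
        # async chain reads a snapshot stale by up to tau updates
        stale = hist[-1 - rng.integers(min(tau, len(hist)))]
        y[i] = 1 if u < p_plus(stale, i) else -1
        hist.append(y.copy())
        if len(hist) > tau:
            hist.pop(0)
    return int(np.sum(x != y))

dist = coupled_gibbs()
```

Consistently with the O(tau*log(n)) bound, the two states typically disagree in only a handful of coordinates even after many updates.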
On counterfactual inference with unobserved confounding
Shah, Abhin, Dwivedi, Raaz, Shah, Devavrat, Wornell, Gregory W.
Given an observational study with $n$ independent but heterogeneous units, our goal is to learn the counterfactual distribution for each unit using only one $p$-dimensional sample per unit containing covariates, interventions, and outcomes. Specifically, we allow for unobserved confounding that introduces statistical biases between interventions and outcomes and exacerbates the heterogeneity across units. Modeling the conditional distribution of the outcomes as an exponential family, we reduce learning the unit-level counterfactual distributions to learning $n$ exponential family distributions with heterogeneous parameters and only one sample per distribution. We introduce a convex objective that pools all $n$ samples to jointly learn all $n$ parameter vectors, and provide a unit-wise mean squared error bound that scales linearly with the metric entropy of the parameter space. For example, when the parameters are $s$-sparse linear combinations of $k$ known vectors, the error is $O(s\log k/p)$. En route, we derive sufficient conditions for compactly supported distributions to satisfy the logarithmic Sobolev inequality. As an application of the framework, our results enable consistent imputation of sparsely missing covariates.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
Non asymptotic bounds in asynchronous sum-weight gossip protocols
Picard, David, Fellus, Jérôme, Garnier, Stéphane
This paper focuses on non-asymptotic diffusion time in asynchronous gossip protocols. Asynchronous gossip protocols are designed to perform distributed computation in a network of nodes by randomly exchanging messages on the associated graph. To achieve consensus among nodes, a minimal number of messages has to be exchanged. We provide a probabilistic bound on this number in the general case. We give an explicit formula for fully connected graphs, depending only on the number of nodes, and an approximation for any graph, depending on the spectrum of the graph.
- North America > United States > Texas > Dallas County > Dallas (0.04)
- Europe > France > Île-de-France > Yvelines > Cergy-Pontoise (0.04)
- Europe > France > Île-de-France > Val-d'Oise > Cergy-Pontoise (0.04)
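The protocol whose diffusion time is bounded above can be sketched in a few lines. A toy implementation of asynchronous sum-weight gossip on a fully connected graph (the halving split and the global termination test are illustrative choices; node count and tolerance are arbitrary):

```python
import numpy as np

def sum_weight_gossip(values, eps=1e-6, seed=0):
    """Asynchronous sum-weight gossip on a fully connected graph: each node
    keeps a (sum, weight) pair; at every message a random sender keeps half
    of its pair and sends the other half to a random other node.  Every
    ratio sum/weight converges to the global average.  Returns the number
    of messages needed to reach precision eps."""
    rng = np.random.default_rng(seed)
    n = len(values)
    s = np.array(values, dtype=float)   # running sums
    w = np.ones(n)                      # running weights
    target = np.mean(values)
    messages = 0
    while np.max(np.abs(s / w - target)) > eps:
        i = rng.integers(n)
        j = rng.integers(n - 1)
        j += (j >= i)                   # uniform choice among the other nodes
        s[i] *= 0.5; w[i] *= 0.5        # sender keeps half...
        s[j] += s[i]; w[j] += w[i]      # ...and ships the other half to j
        messages += 1
    return messages

msgs = sum_weight_gossip(np.arange(10.0))
```

The message count returned here is exactly the quantity the paper bounds; on the fully connected graph it depends only on the number of nodes, while general graphs bring in the spectrum.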
Outlier-Robust Learning of Ising Models Under Dobrushin's Condition
Diakonikolas, Ilias, Kane, Daniel M., Stewart, Alistair, Sun, Yuxin
Probabilistic graphical models [KF09] provide a rich and unifying framework to model structured high-dimensional distributions in terms of the local dependencies between the input variables. The problem of inference in graphical models arises in many applications across scientific disciplines, see, e.g., [WJ08]. In this work, we study the inverse problem of learning graphical models from data. Various formalizations of this general learning problem have been studied during the past five decades, see, e.g., [CL68, Das97, AKN06, WRL06, AHHK12, SW12, LW12, BMS13, BGS14, Bre15, KM17], resulting in general theory and algorithms for various settings. In this work, we focus on learning Ising models [Isi25], the prototypical family of binary undirected graphical models with applications in computer vision, computational biology, and statistical physics [Li09, JEMF06, Fel04, Cha05].
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > New York (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Learning from weakly dependent data under Dobrushin's condition
Dagan, Yuval, Daskalakis, Constantinos, Dikkala, Nishanth, Jayanti, Siddhartha
Statistical learning theory has largely focused on learning and generalization given independent and identically distributed (i.i.d.) samples. Motivated by applications involving time-series data, there has been a growing literature on learning and generalization in settings where data is sampled from an ergodic process. This work has also developed complexity measures, which appropriately extend the notion of Rademacher complexity to bound the generalization error and learning rates of hypothesis classes in this setting. Rather than time-series data, our work is motivated by settings where data is sampled on a network or a spatial domain, and thus does not fit well within the framework of prior work. We provide learning and generalization bounds for data that are complexly dependent, yet their distribution satisfies the standard Dobrushin's condition. Indeed, we show that the standard complexity measures of Gaussian and Rademacher complexities and VC dimension are sufficient measures of complexity for the purposes of bounding the generalization error and learning rates of hypothesis classes in our setting. Moreover, our generalization bounds only degrade by constant factors compared to their i.i.d. analogs, and our learnability bounds degrade by log factors in the size of the training set.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
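A concrete instance of the condition in the title: for an Ising model with symmetric coupling matrix J, a standard sufficient criterion is that the total influence on any site is below 1, where the influence of site j on site i is at most tanh(|J_ij|). A small sketch of that check (the chain example and the coupling value 0.3 are arbitrary):

```python
import numpy as np

def dobrushin_alpha(J):
    """Upper-bound the Dobrushin influence coefficient of an Ising model
    with symmetric coupling matrix J (zero diagonal): the influence of
    site j on site i is at most tanh(|J_ij|), so
        alpha <= max_i sum_j tanh(|J_ij|),
    and a value below 1 certifies Dobrushin's condition."""
    J = np.asarray(J, dtype=float)
    infl = np.tanh(np.abs(J))
    np.fill_diagonal(infl, 0.0)
    return infl.sum(axis=1).max()

# A 1-D chain with nearest-neighbour coupling 0.3:
# every interior site has two neighbours, so alpha <= 2*tanh(0.3) < 1.
n = 20
J = np.zeros((n, n))
for t in range(n - 1):
    J[t, t + 1] = J[t + 1, t] = 0.3
alpha = dobrushin_alpha(J)
```

Distributions passing this check fall in the regime where the paper's generalization bounds degrade only by constant factors relative to the i.i.d. case.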