AITopics | hutchinson

hutchinson

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

StAD: Stein Amortized Divergence for Fast Likelihoods with Diffusion and Flow

Jagwani, Gurjeet, Thorp, Stephen, Deger, Sinan, Peiris, Hiranya

arXiv.org Machine Learning, May-19-2026

Diffusion and flow-based models are ubiquitous in generative modelling and density estimation. They admit a deterministic probability flow ordinary differential equation (PF-ODE), analogous to continuous normalizing flows (CNFs), which describes the transport of the probability mass. Obtaining the likelihood from these models is of interest to many workflows, especially Bayesian analysis, and requires computing the trace of the Jacobian to obtain the divergence of the learned PF-ODE, which costs either $\mathcal{O}(D^2)$ to compute exactly or $\mathcal{O}(D)$ with a noisy estimate. We introduce StAD, a new distillation method to predict and learn the divergence of the PF-ODE using the Langevin-Stein operator without ever computing the Jacobian. We show that our method is competitive with the Hutchinson and Hutch++ estimators on CIFAR-10, ImageNet and other density estimation tasks, consistently improving the variance and speed of the likelihood predictions compared to the Hutchinson estimator. We additionally show that our method generalizes to a varied class of generative models, and that under some regularity conditions these learned vector fields can be made to satisfy the Stein class.
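To make the baseline concrete, here is a minimal sketch of the noisy Hutchinson-style divergence estimate that the abstract compares against. It is not StAD itself. The toy vector field `tanh(W x)`, its closed-form Jacobian-vector product, and all function names are assumptions chosen for illustration; a real PF-ODE would obtain the Jacobian-vector product from automatic differentiation.

```python
import numpy as np

def jvp(x, v, W):
    """Jacobian-vector product J(x) v for the toy field f(x) = tanh(W x).

    The Jacobian is diag(1 - tanh(W x)^2) @ W, so J v is available in
    closed form without ever building the D x D matrix.
    """
    return (1.0 - np.tanh(W @ x) ** 2) * (W @ v)

def exact_divergence(x, W):
    """Exact divergence tr(J(x)) of the toy field, used as a reference."""
    return float(np.sum((1.0 - np.tanh(W @ x) ** 2) * np.diag(W)))

def hutchinson_divergence(x, W, num_probes, rng):
    """Average of v^T J(x) v over Rademacher probes v (unbiased for tr J)."""
    total = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=x.shape[0])
        total += v @ jvp(x, v, W)
    return total / num_probes

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8)) / np.sqrt(8)
x = rng.standard_normal(8)
print(exact_divergence(x, W), hutchinson_divergence(x, W, 1000, rng))
```

Each probe costs one Jacobian-vector product, which is where the $\mathcal{O}(D)$ cost of the noisy estimate comes from; the price is the sampling variance that the abstract aims to reduce.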

artificial intelligence, machine learning, stad, (16 more...)

arXiv.org Machine Learning

2605.16486

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre:

Overview (0.92)
Research Report (0.82)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Dynamic Trace Estimation

Neural Information Processing Systems, Feb-19-2026, 12:01:28 GMT

We study a dynamic version of the implicit trace estimation problem.

artificial intelligence, hutchinson, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota (0.05)
North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > New York > New York County > New York City (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)

Optimal Query Complexities for Dynamic Trace Estimation

David P. Woodruff, Carnegie Mellon University, dwoodruf@cs.cmu.edu; Fred Zhang, UC Berkeley, z0@berkeley.edu; Qiuyi (Richard) Zhang, Google Brain, qiuyiz@google.com

Neural Information Processing Systems, Feb-12-2026, 11:18:05 GMT

We consider the problem of minimizing the number of matrix-vector queries needed for accurate trace estimation in the dynamic setting where our underlying matrix is changing slowly, such as during an optimization process.
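The dynamic setting can be illustrated with a simple difference-based scheme: spend many queries on the first matrix, then spend only a few on each update, since the difference between consecutive matrices is small. This is a hedged sketch of the general idea only, not the algorithm of the paper; the function names and the query budgets are assumptions.

```python
import numpy as np

def hutchinson_trace(matvec, dim, num_probes, rng):
    """Hutchinson estimate of a trace from matrix-vector products only."""
    Z = rng.choice([-1.0, 1.0], size=(dim, num_probes))
    return float(np.sum(Z * matvec(Z)) / num_probes)

def dynamic_trace_estimates(matrices, first_probes, update_probes, rng):
    """Estimate tr(A_1), ..., tr(A_m) for a slowly changing sequence.

    The first trace gets a large budget; each later trace is the previous
    estimate plus a cheap estimate of tr(A_t - A_{t-1}).
    """
    dim = matrices[0].shape[0]
    estimate = hutchinson_trace(lambda Z: matrices[0] @ Z, dim, first_probes, rng)
    estimates = [estimate]
    for prev, curr in zip(matrices, matrices[1:]):
        # Each probe is multiplied by both the old and the new matrix.
        delta = hutchinson_trace(lambda Z: curr @ Z - prev @ Z, dim, update_probes, rng)
        estimate += delta
        estimates.append(estimate)
    return estimates
```

Because the variance of each update scales with the size of the difference rather than of the matrix itself, few probes suffice per step. Errors do accumulate over time in this naive form, which is one reason more careful schemes exist.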

artificial intelligence, machine learning, trace estimation, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Optimal Query Complexities for Dynamic Trace Estimation

David P. Woodruff, Carnegie Mellon University, dwoodruf@cs.cmu.edu; Fred Zhang, UC Berkeley, z0@berkeley.edu; Qiuyi (Richard) Zhang, Google Brain, qiuyiz@google.com

Neural Information Processing Systems, Feb-12-2026, 11:18:01 GMT

Inther valued, forsuf"andanyp2[1,2], p log ( 1/ )/" p number isnecessary"kAkp errorwith1 .

artificial intelligence, hutchinson, theorem 5, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Technology: Information Technology > Artificial Intelligence (0.48)

c77bfda61a0204d445185053e6a9a8fe-Paper.pdf

Neural Information Processing Systems, Feb-11-2026, 03:03:18 GMT

Recently, the Hutch++ algorithm was proposed, which reduces the number of matrix-vector queries from $O(1/\epsilon^2)$ to the optimal $O(1/\epsilon)$, and the algorithm succeeds with constant probability.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Texas > Brazos County > College Station (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)

Appendix for Riemannian Continuous Normalizing Flows

Neural Information Processing Systems, Feb-7-2026, 16:06:00 GMT

In the following, we provide a brief overview of Riemannian geometry and constant curvature manifolds, specifically the Poincaré ball and the hypersphere models. Sphere: In the two-dimensional setting $d = 2$, we rely on polar coordinates to parametrize the sphere $\mathbb{S}^2$. In the following subsection we recall that this regularization term can also be motivated from an estimator's variance perspective. D.2 Frobenius norm. Hutchinson's estimator: Hutchinson's estimator (Hutchinson, 1990) is a simple way to obtain a stochastic estimate of the trace of a matrix. The variance of this estimator thus depends on the Frobenius norm of the vector field's Jacobian. Then $γ(t_n)$ is also a Cauchy sequence by Equation 16. So for every sequence $(t_n)$ in $(a,b)$ that converges to $b$, we have that $(γ(t_n))$ converges to $p$.
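The dependence on the Frobenius norm can be made explicit. The identities below are standard results for a square matrix $A$ and a probe vector $z$ with independent, zero-mean, unit-variance entries. The symbols $A$, $z$ and $S$ are notation chosen here and may differ from the appendix.

```latex
\[
\begin{aligned}
\mathbb{E}\left[z^\top A z\right] &= \operatorname{tr}(A),
\qquad S = \tfrac{1}{2}\left(A + A^\top\right), \\
\operatorname{Var}\left[z^\top A z\right] &= 2\,\|S\|_F^2
&& \text{(Gaussian } z\text{)}, \\
\operatorname{Var}\left[z^\top A z\right] &= 2\Big(\|S\|_F^2 - \sum_i S_{ii}^2\Big)
&& \text{(Rademacher } z\text{)}, \\
\|S\|_F &\le \|A\|_F .
\end{aligned}
\]
```

With $A$ taken to be the Jacobian of the vector field, penalizing its Frobenius norm therefore bounds the variance of the stochastic trace estimate.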

artificial intelligence, hutchinson, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.96)

What Trace Powers Reveal About Log-Determinants: Closed-Form Estimators, Certificates, and Failure Modes

Sao, Piyush

arXiv.org Machine Learning, Jan-21-2026

Computing $\log\det(A)$ for large symmetric positive definite matrices arises in Gaussian process inference and Bayesian model comparison. Standard methods combine matrix-vector products with polynomial approximations. We study a different model: access to trace powers $p_k = \operatorname{tr}(A^k)$, natural when matrix powers are available. Classical moment-based approximations Taylor-expand $\log(λ)$ around the arithmetic mean. This requires $|λ - \mathrm{AM}| < \mathrm{AM}$ and diverges when $κ > 4$. We work instead with the moment-generating function $M(t) = \mathbb{E}[X^t]$ for normalized eigenvalues $X = λ/\mathrm{AM}$. Since $M'(0) = \mathbb{E}[\log X]$, the log-determinant becomes $\log\det(A) = n(\log \mathrm{AM} + M'(0))$, so the problem reduces to estimating a derivative at $t = 0$. Trace powers give $M(k)$ at positive integers, but interpolating $M(t)$ directly is ill-conditioned due to exponential growth. The transform $K(t) = \log M(t)$ compresses this range. Normalization by $\mathrm{AM}$ ensures $K(0) = K(1) = 0$. With these anchors fixed, we interpolate $K$ through $m+1$ consecutive integers and differentiate to estimate $K'(0)$. However, this local interpolation cannot capture arbitrary spectral features. We prove a fundamental limit: no continuous estimator using finitely many positive moments can be uniformly accurate over unbounded conditioning. Positive moments downweight the spectral tail; $K'(0) = \mathbb{E}[\log X]$ is tail-sensitive. This motivates guaranteed bounds. From the same traces we derive upper bounds on $(\det A)^{1/n}$. Given a spectral floor $r \leq λ_{\min}$, we obtain moment-constrained lower bounds, yielding a provable interval for $\log\det(A)$. A gap diagnostic indicates when to trust the point estimate and when to report bounds. All estimators and bounds cost $O(m)$, independent of $n$. For $m \in \{4, \ldots, 8\}$, this is effectively constant time.
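One possible reading of the point estimate described above is sketched here. It assumes the interpolation nodes are the integers $0, \ldots, m$, uses a plain degree-$m$ polynomial through $K$, and reads $K'(0)$ off the linear coefficient. The function name and these choices are assumptions; the paper's closed-form estimators, bounds, and gap diagnostic are not reproduced.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def logdet_from_trace_powers(trace_powers, n):
    """Point estimate of log det(A) from p_k = tr(A^k), k = 1..m.

    trace_powers[k - 1] holds tr(A^k); n is the matrix dimension.
    """
    p = np.asarray(trace_powers, dtype=float)
    m = p.size
    am = p[0] / n                      # arithmetic mean of the eigenvalues
    ks = np.arange(1, m + 1)
    # K(k) = log M(k) with M(k) = p_k / (n * AM^k); K(0) = 0 by definition.
    K = np.concatenate(([0.0], np.log(p / n) - ks * np.log(am)))
    nodes = np.arange(m + 1)
    coeffs = P.polyfit(nodes, K, deg=m)   # coefficients, lowest degree first
    k_prime_at_zero = coeffs[1]           # derivative of the interpolant at 0
    return float(n * (np.log(am) + k_prime_at_zero))

eigenvalues = np.linspace(0.9, 1.1, 20)
powers = [np.sum(eigenvalues ** k) for k in range(1, 6)]
print(logdet_from_trace_powers(powers, 20), np.sum(np.log(eigenvalues)))
```

The example uses a deliberately well-conditioned spectrum. As the abstract stresses, no estimator of this kind can be uniformly accurate when the conditioning is unbounded, so the point estimate should not be trusted on its own for widely spread spectra.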

large language model, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2601.12612

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Industry: Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.45)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

Optimal Sketching for Trace Estimation

Neural Information Processing Systems, Dec-24-2025, 21:46:09 GMT

Matrix trace estimation is ubiquitous in machine learning applications and has traditionally relied on Hutchinson's method, which requires $O(\log(1/\delta)/\epsilon^2)$ matrix-vector product queries to achieve a $(1 \pm \epsilon)$-multiplicative approximation to $\text{trace}(A)$ with failure probability $\delta$ on positive-semidefinite input matrices $A$. Recently, the Hutch++ algorithm was proposed, which reduces the number of matrix-vector queries from $O(1/\epsilon^2)$ to the optimal $O(1/\epsilon)$, and the algorithm succeeds with constant probability. However, in the high probability setting, the non-adaptive Hutch++ algorithm suffers an extra $O(\sqrt{\log(1/\delta)})$ multiplicative factor in its query complexity. Non-adaptive methods are important, as they correspond to sketching algorithms, which are mergeable, highly parallelizable, and provide low-memory streaming algorithms as well as low-communication distributed protocols. In this work, we close the gap between non-adaptive and adaptive algorithms, showing that even non-adaptive algorithms can achieve $O(\sqrt{\log(1/\delta)}/\epsilon + \log(1/\delta))$ matrix-vector products. In addition, we prove matching lower bounds demonstrating that, up to a $\log \log(1/\delta)$ factor, no further improvement in the dependence on $\delta$ or $\epsilon$ is possible by any non-adaptive algorithm. Finally, our experiments demonstrate the superior performance of our sketch over the adaptive Hutch++ algorithm, which is less parallelizable, as well as over the non-adaptive Hutchinson's method.
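For orientation, the adaptive Hutch++ baseline mentioned above can be sketched as follows, based on the commonly described form of the algorithm: capture the dominant range of $A$ with a random sketch, compute that part of the trace exactly, and apply Hutchinson's method only to the remainder. This is not the non-adaptive sketch proposed in the paper. The function name and the equal three-way split of the query budget are assumptions, and $A$ is assumed symmetric.

```python
import numpy as np

def hutchpp_trace(A, num_queries, rng):
    """Adaptive Hutch++ style trace estimate for a symmetric matrix A."""
    if num_queries < 3:
        raise ValueError("num_queries must be at least 3")
    n = A.shape[0]
    k = num_queries // 3
    S = rng.choice([-1.0, 1.0], size=(n, k))
    G = rng.choice([-1.0, 1.0], size=(n, k))
    # Orthonormal basis for the sketched range of A (k matrix-vector products).
    Q, _ = np.linalg.qr(A @ S)
    # Exact trace on the captured subspace (k more products).
    low_rank_part = np.trace(Q.T @ (A @ Q))
    # Hutchinson on the part of A orthogonal to Q (k more products).
    G_perp = G - Q @ (Q.T @ G)
    residual_part = np.sum(G_perp * (A @ G_perp)) / k
    return float(low_rank_part + residual_part)

rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50))
A = B @ B.T
print(np.trace(A), hutchpp_trace(A, 30, rng))
```

The second stage depends on the result of the first (the basis `Q`), which is what makes this variant adaptive and harder to parallelize than a single-pass sketch.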

algorithm, delta, optimal sketching, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Matrix Phylogeny: Compact Spectral Fingerprints for Trap-Robust Preconditioner Selection

Baek, Jinwoo

arXiv.org Artificial Intelligence, Nov-4-2025

Matrix Phylogeny introduces compact spectral fingerprints (CSF/ASF) that characterize matrices at the family level. These fingerprints are low-dimensional, eigendecomposition-free descriptors built from Chebyshev trace moments estimated by Hutchinson sketches. A simple affine rescaling to [-1,1] makes them permutation/similarity invariant and robust to global scaling. Across synthetic and real tests, we observe phylogenetic compactness: only a few moments are needed. CSF with K=3-5 already yields perfect clustering (ARI=1.0; silhouettes ~0.89) on four synthetic families and a five-family set including BA vs ER, while ASF adapts the dimension on demand (median K*~9). On a SuiteSparse mini-benchmark (Hutchinson p~100), both CSF-H and ASF-H reach ARI=1.0. Against strong alternatives (eigenvalue histograms + Wasserstein, heat-kernel traces, WL-subtree), CSF-K=5 matches or exceeds accuracy while avoiding eigendecompositions and using far fewer features (K<=10 vs 64/9153). The descriptors are stable to noise (log-log slope ~1.03, R^2~0.993) and support a practical trap->recommend pipeline for automated preconditioner selection. In an adversarial E6+ setting with a probe-and-switch mechanism, our physics-guided recommender attains near-oracle iteration counts (p90 regret=0), whereas a Frobenius 1-NN baseline exhibits large spikes (p90~34-60). CSF/ASF deliver compact (K<=10), fast, invariant fingerprints that enable scalable, structure-aware search and recommendation over large matrix repositories. We recommend CSF with K=5 by default, and ASF when domain-specific adaptivity is desired.
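The raw ingredient of such fingerprints, Chebyshev trace moments estimated with Hutchinson probes, can be sketched as below. This is an illustration under stated assumptions and not the CSF/ASF construction itself: the spectral bounds `lam_min` and `lam_max` are taken as given inputs, the moments are normalized by the dimension, and the function name is hypothetical.

```python
import numpy as np

def chebyshev_trace_moments(A, num_moments, lam_min, lam_max, num_probes, rng):
    """Estimate tr(T_k(A_scaled)) / n for k = 0..num_moments-1.

    A_scaled is A affinely rescaled so its spectrum lies in [-1, 1].
    Only matrix-vector products are used; no eigendecomposition.
    """
    n = A.shape[0]
    center = 0.5 * (lam_max + lam_min)
    half_width = 0.5 * (lam_max - lam_min)

    def scaled_matvec(V):
        return (A @ V - center * V) / half_width

    Z = rng.choice([-1.0, 1.0], size=(n, num_probes))
    T_prev, T_curr = Z, scaled_matvec(Z)          # T_0(A) Z and T_1(A) Z
    moments = [np.sum(Z * T_prev), np.sum(Z * T_curr)]
    for _ in range(2, num_moments):
        # Three-term recurrence: T_{k+1} = 2 x T_k - T_{k-1}.
        T_prev, T_curr = T_curr, 2.0 * scaled_matvec(T_curr) - T_prev
        moments.append(np.sum(Z * T_curr))
    return np.array(moments[:num_moments]) / (n * num_probes)
```

Each additional moment costs one more matrix-vector product per probe, which is consistent with the abstract's emphasis on needing only a few moments.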

artificial intelligence, machine learning, spectral fingerprint, (12 more...)

arXiv.org Artificial Intelligence

2511.00012

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
