AITopics | random vector

Collaborating Authors

random vector

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Stochastic trace estimation with tensor train random vectors

Bujanović, Zvonimir, Kressner, Daniel, Olić, Hrvoje

arXiv.org Machine LearningJun-16-2026

Stochastic trace estimation is a standard tool for approximating the trace of a large-scale matrix available only through matrix-vector products. However, in tensor-structured settings, unstructured Gaussian or Rademacher test vectors may be prohibitively expensive to store and compute with, while cheaper rank-one tensor-product vectors can require sample complexities that grow exponentially with the tensor order. This work studies Gaussian random tensor train vectors as a structured alternative for stochastic trace estimation. We show that, with a suitable choice of the tensor train rank, random tensor train vectors recover dimension-independent guarantees for the Girard--Hutchinson estimator. In particular, a median-of-means variant with tensor train rank $r \geq d-1$ achieves the same dependence on the accuracy $\varepsilon$ and failure probability $δ$ as the classical estimator based on unstructured Gaussian vectors. We further prove an oblivious subspace injection result for sketches formed from independent Gaussian random tensor train vectors: tensor train rank $r\geq d-1$ and $\mathcal{O}(\varepsilon^{-2}(k+\log(1/δ)))$ samples suffice for a $k$-dimensional target subspace. Finally, we investigate the use of such sketches within the Nyström++ framework. We show that the resulting estimator can achieve the desired $\mathcal{O}(\varepsilon^{-1})$ sample complexity under an additional spectral-tail condition. These results provide clarififcation on both the potential and the limitations of random tensor train vectors in stochastic trace estimation.

artificial intelligence, tt vector, vector, (16 more...)

arXiv.org Machine Learning

2606.15679

Country: Europe (0.93)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

Add feedback

artificial intelligence, machine learning, pruning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (1.00)
North America > Canada > British Columbia (0.28)

Genre:

Contests & Prizes (0.50)
Research Report (0.46)

Industry: Leisure & Entertainment > Gambling (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)

Add feedback

Appendix for based Test of Independence for Cluster correlated Data Contents

Neural Information Processing SystemsApr-25-2026, 21:59:29 GMT

In this section, we present some preliminary results that will be useful in proving Theorem 3.2, Theorem 3.3 and Proposition 3.4. We draw upon existing theory on properties of random kernel matrices and extend these properties to cluster-correlated data. Specifically, we show the convergence of eigenvalues and eigenvectors of an empirical kernel matrix based on clustered data. Let (X,F,P) be a probability space and H be a Hilbert space over (X,F,P) with a symmetric kernel function k: X X R. Let H be a compact operator on H, defined by Hg(x) = Z Equivalently, Hn can be viewed as an n nreal matrix whose (i,j)-th entry is {Hn}i,j = 1 n k(Xi,Xj). This is the empirical kernel matrix scaled by a factor of 1/n. Here we restrict our discussion to a reproducing kernel Hilbert space (RKHS) H, where the kernel function k is positive semi-definite. We also assume that the operator H is Hilbert-Schmidt, with E[k2(X,X0)] < . Let λ(T) denote the spectrum of a compact, symmetric operator T. Then λ(H) and λ(Hn) are the sets of eigenvalues for H and Hn, respectively.

artificial intelligence, eigenvalue, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Robustifying Algorithms of Learning Latent Trees with Vector Variables

Neural Information Processing SystemsApr-24-2026, 18:50:09 GMT

We consider learning the structures of Gaussian latent tree models with vector observations when a subset of them are arbitrarily corrupted. First, we present the sample complexities of Recursive Grouping (RG) and Chow-Liu Recursive Grouping (CLRG) without the assumption that the effective depth is bounded in the number of observed nodes, significantly generalizing the results in Choi et al. (2011). We show that Chow-Liu initialization in CLRG greatly reduces the sample complexity of RG from being exponential in the diameter of the tree to only logarithmic in the diameter for the hidden Markov model (HMM).

artificial intelligence, machine learning, sample complexity, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

PRIM-cipal components analysis

Liu, Tianhao, Díaz-Pachón, Daniel Andrés, Rao, J. Sunil

arXiv.org Machine LearningApr-20-2026

EVEN supervised learning is subject to the famous NoFree Lunch Theorems [1]-[3], which say that, in combinatorial optimization, there is no universal algorithm that works better than its competitors for every objective function [4]-[6]. Indeed, David Wolpert has recently proven that, on average, cross-validation performs as well as anti-crossvalidation (choosing among a set of candidate algorithms based on which has the worst out-of-sample behavior) for supervised learning. Still, he acknowledges that "it is hard to imagine any scientist who would not prefer to use [crossvalidation] to using anti-cross-validation" [7]. On the other hand, unsupervised learning has seldom been studied from the perspective of the NFLTs. This may be because the adjective "unsupervised" suggests that no human input is needed, which is misleading as many unsupervised tasks are combinatorial optimization problems that depend on the choice of the objective function. For instance, it is well known that, among the eigenvectors of the covariance matrix, Principal Components Analysis selects those with the largest variances [8]. However, mode-hunting techniques that rely on spectral manipulation aim at the opposite objective: selecting the eigenvectors of the covariance matrix with the smallest variances [9], [10]. Therefore, unlike in supervised learning, where it is difficult to identify reasons to optimize with respect to anti-cross-validation, in unsupervised learning there are strong reasons to reduce dimensionality for variance minimization. D. A. D ıaz-Pach on and T. Liu are with the Division of Biostatistics, University of Miami, Miami, FL, 33136 USA (e-mail: ddiaz3@miami.edu,

artificial intelligence, machine learning, principal component, (18 more...)

arXiv.org Machine Learning

2604.15538

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > Florida > Miami-Dade County > Miami (0.24)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(4 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback

ResNets of All Shapes and Sizes: Convergence of Training Dynamics in the Large-scale Limit

Chaintron, Louis-Pierre, Chizat, Lénaïc, Maass, Javier

arXiv.org Machine LearningMar-23-2026

We establish convergence of the training dynamics of residual neural networks (ResNets) to their joint infinite depth L, hidden width M, and embedding dimension D limit. Specifically, we consider ResNets with two-layer perceptron blocks in the maximal local feature update (MLU) regime and prove that, after a bounded number of training steps, the error between the ResNet and its large-scale limit is O(1/L + sqrt(D/(L M)) + 1/sqrt(D)). This error rate is empirically tight when measured in embedding space. For a budget of P = Theta(L M D) parameters, this yields a convergence rate O(P^(-1/6)) for the scalings of (L, M, D) that minimize the bound. Our analysis exploits in an essential way the depth-two structure of residual blocks and applies formally to a broad class of state-of-the-art architectures, including Transformers with bounded key-query dimension. From a technical viewpoint, this work completes the program initiated in the companion paper [Chi25] where it is proved that for a fixed embedding dimension D, the training dynamics converges to a Mean ODE dynamics at rate O(1/L + sqrt(D)/sqrt(L M)). Here, we study the large-D limit of this Mean ODE model and establish convergence at rate O(1/sqrt(D)), yielding the above bound by a triangle inequality. To handle the rich probabilistic structure of the limit dynamics and obtain one of the first rigorous quantitative convergence for a DMFT-type limit, we combine the cavity method with propagation of chaos arguments at a functional level on so-called skeleton maps, which express the weight updates as functions of CLT-type sums from the past.

artificial intelligence, lemma 5, machine learning, (18 more...)

arXiv.org Machine Learning

2603.18168

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Singapore > Central Region > Singapore (0.04)

Genre: Research Report (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

6d5f304fb4ed0243851e41699dca4287-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-13-2026, 19:52:24 GMT

artificial intelligence, exp, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)

Add feedback

Falconn++: ALocality-sensitiveFilteringApproach forApproximateNearestNeighborSearch

Neural Information Processing SystemsFeb-11-2026, 22:06:30 GMT

Theoretically,Falconn++ asymptotically achieves lower query time complexity than Falconn, an optimal locality-sensitive hashing scheme on angular distance.

artificial intelligence, iprobe, vector, (18 more...)

Neural Information Processing Systems

Country: Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)

Technology: Information Technology > Artificial Intelligence (0.94)

Add feedback