AITopics | George, Anand Jerry

Collaborating Authors

George, Anand Jerry

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Denoising Score Matching with Random Features: Insights on Diffusion Models from Precise Learning Curves

George, Anand Jerry, Veiga, Rodrigo, Macris, Nicolas

arXiv.org Machine LearningFeb-1-2025

We derive asymptotically precise expressions for test and train errors of denoising score matching (DSM) in generative diffusion models. The score function is parameterized by random features neural networks, with the target distribution being $d$-dimensional standard Gaussian. We operate in a regime where the dimension $d$, number of data samples $n$, and number of features $p$ tend to infinity while keeping the ratios $\psi_n=\frac{n}{d}$ and $\psi_p=\frac{p}{d}$ fixed. By characterizing the test and train errors, we identify regimes of generalization and memorization in diffusion models. Furthermore, our work sheds light on the conditions enhancing either generalization or memorization. Consistent with prior empirical observations, our findings indicate that the model complexity ($p$) and the number of noise samples per data sample ($m$) used during DSM significantly influence generalization and memorization behaviors.

artificial intelligence, diffusion model, machine learning, (14 more...)

arXiv.org Machine Learning

2502.00336

Country: North America > United States > Hawaii (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Oil & Gas (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Analysis of Diffusion Models for Manifold Data

George, Anand Jerry, Veiga, Rodrigo, Macris, Nicolas

arXiv.org Artificial IntelligenceFeb-1-2025

We analyze the time reversed dynamics of generative diffusion models. If the exact empirical score function is used in a regime of large dimension and exponentially large number of samples, these models are known to undergo transitions between distinct dynamical regimes. We extend this analysis and compute the transitions for an analytically tractable manifold model where the statistical model for the data is a mixture of lower dimensional Gaussians embedded in higher dimensional space. We compute the so-called speciation and collapse transition times, as a function of the ratio of manifold-to-ambient space dimensions, and other characteristics of the data model. An important tool used in our analysis is the exact formula for the mutual information (or free energy) of Generalized Linear Models.

artificial intelligence, machine learning, manifold, (16 more...)

arXiv.org Artificial Intelligence

2502.04339

Country:

North America > United States (0.14)
Europe > Switzerland (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

Sampling in High-Dimensions using Stochastic Interpolants and Forward-Backward Stochastic Differential Equations

George, Anand Jerry, Macris, Nicolas

arXiv.org Machine LearningFeb-1-2025

We present a class of diffusion-based algorithms to draw samples from high-dimensional probability distributions given their unnormalized densities. Ideally, our methods can transport samples from a Gaussian distribution to a specified target distribution in finite time. Our approach relies on the stochastic interpolants framework to define a time-indexed collection of probability densities that bridge a Gaussian distribution to the target distribution. Subsequently, we derive a diffusion process that obeys the aforementioned probability density at each time instant. Obtaining such a diffusion process involves solving certain Hamilton-Jacobi-Bellman PDEs. We solve these PDEs using the theory of forward-backward stochastic differential equations (FBSDE) together with machine learning-based methods. Through numerical experiments, we demonstrate that our algorithm can effectively draw samples from distributions that conventional methods struggle to handle.

artificial intelligence, interpolant, machine learning, (16 more...)

arXiv.org Machine Learning

2502.00355

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Add feedback

Continual Mean Estimation Under User-Level Privacy

George, Anand Jerry, Ramesh, Lekshmi, Singh, Aditya Vikram, Tyagi, Himanshu

arXiv.org Artificial IntelligenceDec-19-2022

We consider the problem of continually releasing an estimate of the population mean of a stream of samples that is user-level differentially private (DP). At each time instant, a user contributes a sample, and the users can arrive in arbitrary order. Until now these requirements of continual release and user-level privacy were considered in isolation. But, in practice, both these requirements come together as the users often contribute data repeatedly and multiple queries are made. We provide an algorithm that outputs a mean estimate at every time instant $t$ such that the overall release is user-level $\varepsilon$-DP and has the following error guarantee: Denoting by $M_t$ the maximum number of samples contributed by a user, as long as $\tilde{\Omega}(1/\varepsilon)$ users have $M_t/2$ samples each, the error at time $t$ is $\tilde{O}(1/\sqrt{t}+\sqrt{M}_t/t\varepsilon)$. This is a universal error guarantee which is valid for all arrival patterns of the users. Furthermore, it (almost) matches the existing lower bounds for the single-release setting at all time instants when users have contributed equal number of samples.

artificial intelligence, machine learning, noisypartialsum, (18 more...)

arXiv.org Artificial Intelligence

2212.0998

Country:

Europe (0.92)
North America > United States (0.67)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Security & Privacy (0.67)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

Robust Testing in High-Dimensional Sparse Models

George, Anand Jerry, Canonne, Clément L.

arXiv.org Artificial IntelligenceNov-4-2022

We consider the problem of robustly testing the norm of a high-dimensional sparse signal vector under two different observation models. In the first model, we are given $n$ i.i.d. samples from the distribution $\mathcal{N}\left(\theta,I_d\right)$ (with unknown $\theta$), of which a small fraction has been arbitrarily corrupted. Under the promise that $\|\theta\|_0\le s$, we want to correctly distinguish whether $\|\theta\|_2=0$ or $\|\theta\|_2>\gamma$, for some input parameter $\gamma>0$. We show that any algorithm for this task requires $n=\Omega\left(s\log\frac{ed}{s}\right)$ samples, which is tight up to logarithmic factors. We also extend our results to other common notions of sparsity, namely, $\|\theta\|_q\le s$ for any $0 < q < 2$. In the second observation model that we consider, the data is generated according to a sparse linear regression model, where the covariates are i.i.d. Gaussian and the regression coefficient (signal) is known to be $s$-sparse. Here too we assume that an $\epsilon$-fraction of the data is arbitrarily corrupted. We show that any algorithm that reliably tests the norm of the regression coefficient requires at least $n=\Omega\left(\min(s\log d,{1}/{\gamma^4})\right)$ samples. Our results show that the complexity of testing in these two settings significantly increases under robustness constraints. This is in line with the recent observations made in robust mean testing and robust covariance testing.

artificial intelligence, machine learning, sample complexity, (15 more...)

arXiv.org Artificial Intelligence

2205.07488

Genre: Research Report > New Finding (0.74)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.97)

Add feedback

An MCMC Method to Sample from Lattice Distributions

George, Anand Jerry, Kashyap, Navin

arXiv.org Machine LearningJan-16-2021

We introduce a Markov Chain Monte Carlo (MCMC) algorithm to generate samples from probability distributions supported on a $d$-dimensional lattice $\Lambda = \mathbf{B}\mathbb{Z}^d$, where $\mathbf{B}$ is a full-rank matrix. Specifically, we consider lattice distributions $P_\Lambda$ in which the probability at a lattice point is proportional to a given probability density function, $f$, evaluated at that point. To generate samples from $P_\Lambda$, it suffices to draw samples from a pull-back measure $P_{\mathbb{Z}^d}$ defined on the integer lattice. The probability of an integer lattice point under $P_{\mathbb{Z}^d}$ is proportional to the density function $\pi = |\det(\mathbf{B})|f\circ \mathbf{B}$. The algorithm we present in this paper for sampling from $P_{\mathbb{Z}^d}$ is based on the Metropolis-Hastings framework. In particular, we use $\pi$ as the proposal distribution and calculate the Metropolis-Hastings acceptance ratio for a well-chosen target distribution. We can use any method, denoted by ALG, that ideally draws samples from the probability density $\pi$, to generate a proposed state. The target distribution is chosen such that the coordinate-wise rounding of a sample drawn from the target distribution gives a sample from $P_{\mathbb{Z}^d}$. When ALG is ideal, we show that our algorithm is uniformly ergodic if $-\log(\pi)$ satisfies a gradient Lipschitz condition.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2101.06453

Country: Asia > India (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback