AITopics | Farghly, Tyler

Collaborating Authors

Farghly, Tyler

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards a Complete Analysis of Langevin Monte Carlo: Beyond Poincar\'e Inequality

Mousavi-Hosseini, Alireza, Farghly, Tyler, He, Ye, Balasubramanian, Krishnakumar, Erdogdu, Murat A.

arXiv.org Machine LearningJul-20-2023

Langevin diffusions are rapidly convergent under appropriate functional inequality assumptions. Hence, it is natural to expect that with additional smoothness conditions to handle the discretization errors, their discretizations like the Langevin Monte Carlo (LMC) converge in a similar fashion. This research program was initiated by Vempala and Wibisono (2019), who established results under log-Sobolev inequalities. Chewi et al. (2022) extended the results to handle the case of Poincar\'e inequalities. In this paper, we go beyond Poincar\'e inequalities, and push this research program to its limit. We do so by establishing upper and lower bounds for Langevin diffusions and LMC under weak Poincar\'e inequalities that are satisfied by a large class of densities including polynomially-decaying heavy-tailed densities (i.e., Cauchy-type). Our results explicitly quantify the effect of the initializer on the performance of the LMC algorithm. In particular, we show that as the tail goes from sub-Gaussian, to sub-exponential, and finally to Cauchy-like, the dependency on the initial error goes from being logarithmic, to polynomial, and then finally to being exponential. This three-step phase transition is in particular unavoidable as demonstrated by our lower bounds, clearly defining the boundaries of LMC.

artificial intelligence, inequality, machine learning, (15 more...)

arXiv.org Machine Learning

2303.03589

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Generalisation under gradient descent via deterministic PAC-Bayes

Clerico, Eugenio, Farghly, Tyler, Deligiannidis, George, Guedj, Benjamin, Doucet, Arnaud

arXiv.org Artificial IntelligenceApr-4-2023

We establish disintegrated PAC-Bayesian generalisation bounds for models trained with gradient descent methods or continuous gradient flows. Contrary to standard practice in the PAC-Bayesian setting, our result applies to optimisation algorithms that are deterministic, without requiring any de-randomisation step. Our bounds are fully computable, depending on the density of the initial distribution and the Hessian of the training objective over the trajectory. We show that our framework can be applied to a variety of iterative optimisation algorithms, including stochastic gradient descent (SGD), momentum-based schemes, and damped Hamiltonian dynamics.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2209.02525

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Mean-Square Analysis of Discretized It\^o Diffusions for Heavy-tailed Sampling

He, Ye, Farghly, Tyler, Balasubramanian, Krishnakumar, Erdogdu, Murat A.

arXiv.org Artificial IntelligenceMar-1-2023

We analyze the complexity of sampling from a class of heavy-tailed distributions by discretizing a natural class of It\^o diffusions associated with weighted Poincar\'e inequalities. Based on a mean-square analysis, we establish the iteration complexity for obtaining a sample whose distribution is $\epsilon$ close to the target distribution in the Wasserstein-2 metric. In this paper, our results take the mean-square analysis to its limits, i.e., we invariably only require that the target density has finite variance, the minimal requirement for a mean-square analysis. To obtain explicit estimates, we compute upper bounds on certain moments associated with heavy-tailed targets under various assumptions. We also provide similar iteration complexity results for the case where only function evaluations of the unnormalized target density are available by estimating the gradients using a Gaussian smoothing technique. We provide illustrative examples based on the multivariate $t$-distribution.

artificial intelligence, machine learning, null 2, (15 more...)

arXiv.org Artificial Intelligence

2303.0057

Country:

North America > United States (0.28)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Time-independent Generalization Bounds for SGLD in Non-convex Settings

Farghly, Tyler, Rebeschini, Patrick

arXiv.org Machine LearningNov-24-2021

We establish generalization error bounds for stochastic gradient Langevin dynamics (SGLD) with constant learning rate under the assumptions of dissipativity and smoothness, a setting that has received increased attention in the sampling/optimization literature. Unlike existing bounds for SGLD in non-convex settings, ours are time-independent and decay to zero as the sample size increases. Using the framework of uniform stability, we establish time-independent bounds by exploiting the Wasserstein contraction property of the Langevin diffusion, which also allows us to circumvent the need to bound gradients using Lipschitz-like assumptions. Our analysis also supports variants of SGLD that use different discretization methods, incorporate Euclidean projections, or use non-isotropic noise.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2111.12876

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

Add feedback