AITopics | Ghosal, Promit

Collaborating Authors

Ghosal, Promit

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

LoSAM: Local Search in Additive Noise Models with Unmeasured Confounders, a Top-Down Global Discovery Approach

Hiremath, Sujai, Ghosal, Promit, Gan, Kyra

arXiv.org Artificial IntelligenceNov-10-2024

We address the challenge of causal discovery in structural equation models with additive noise without imposing additional assumptions on the underlying data-generating process. We introduce local search in additive noise model (LoSAM), which generalizes an existing nonlinear method that leverages local causal substructures to the general additive noise setting, allowing for both linear and nonlinear causal mechanisms. We show that LoSAM achieves polynomial runtime, and improves runtime and efficiency by exploiting new substructures to minimize the conditioning set at each step. Further, we introduce a variant of LoSAM, LoSAM-UC, that is robust to unmeasured confounding among roots, a property that is often not satisfied by functional-causal-model-based methods. We numerically demonstrate the utility of LoSAM, showing that it outperforms existing benchmarks.

artificial intelligence, machine learning, vertex, (19 more...)

arXiv.org Artificial Intelligence

2410.11759

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Hybrid Global Causal Discovery with Local Search

Hiremath, Sujai, Maasch, Jacqueline R. M. A., Gao, Mengxiao, Ghosal, Promit, Gan, Kyra

arXiv.org Artificial IntelligenceMay-23-2024

Learning the unique directed acyclic graph corresponding to an unknown causal model is a challenging task. Methods based on functional causal models can identify a unique graph, but either suffer from the curse of dimensionality or impose strong parametric assumptions. To address these challenges, we propose a novel hybrid approach for global causal discovery in observational data that leverages local causal substructures. We first present a topological sorting algorithm that leverages ancestral relationships in linear structural equation models to establish a compact top-down hierarchical ordering, encoding more causal information than linear orderings produced by existing methods. We demonstrate that this approach generalizes to nonlinear settings with arbitrary noise. We then introduce a nonparametric constraint-based algorithm that prunes spurious edges by searching for local conditioning sets, achieving greater accuracy than current methods. We provide theoretical guarantees for correctness and worst-case polynomial time complexities, with empirical validation on synthetic data.

artificial intelligence, machine learning, relation, (19 more...)

arXiv.org Artificial Intelligence

2405.14496

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Towards Understanding the Dynamics of Gaussian-Stein Variational Gradient Descent

Liu, Tianle, Ghosal, Promit, Balasubramanian, Krishnakumar, Pillai, Natesh S.

arXiv.org Machine LearningOct-27-2023

Stein Variational Gradient Descent (SVGD) is a nonparametric particle-based deterministic sampling algorithm. Despite its wide usage, understanding the theoretical properties of SVGD has remained a challenging problem. For sampling from a Gaussian target, the SVGD dynamics with a bilinear kernel will remain Gaussian as long as the initializer is Gaussian. Inspired by this fact, we undertake a detailed theoretical study of the Gaussian-SVGD, i.e., SVGD projected to the family of Gaussian distributions via the bilinear kernel, or equivalently Gaussian variational inference (GVI) with SVGD. We present a complete picture by considering both the mean-field PDE and discrete particle systems. When the target is strongly log-concave, the mean-field Gaussian-SVGD dynamics is proven to converge linearly to the Gaussian distribution closest to the target in KL divergence. In the finite-particle setting, there is both uniform in time convergence to the mean-field limit and linear convergence in time to the equilibrium if the target is Gaussian. In the general case, we propose a density-based and a particle-based implementation of the Gaussian-SVGD, and show that several recent algorithms for GVI, proposed from different perspectives, emerge as special cases of our unified framework. Interestingly, one of the new particle-based instance from this framework empirically outperforms existing approaches. Our results make concrete contributions towards obtaining a deeper understanding of both SVGD and GVI.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Machine Learning

2305.14076

Country:

North America > United States > Massachusetts (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.71)

Add feedback

Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable

Ghosal, Promit, Mahankali, Srinath, Sun, Yihang

arXiv.org Machine LearningOct-8-2023

Recently, neural networks have demonstrated remarkable capabilities in mapping two arbitrary sets to two linearly separable sets. The prospect of achieving this with randomly initialized neural networks is particularly appealing due to the computational efficiency compared to fully trained networks. This paper contributes by establishing that, given sufficient width, a randomly initialized one-layer neural network can, with high probability, transform two sets into two linearly separable sets without any training. Moreover, we furnish precise bounds on the necessary width of the neural network for this phenomenon to occur. Our initial bound exhibits exponential dependence on the input dimension while maintaining polynomial dependence on all other parameters. In contrast, our second bound is independent of input dimension, effectively surmounting the curse of dimensionality. The main tools used in our proof heavily relies on a fusion of geometric principles and concentration of random matrices.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Machine Learning

2205.11716

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

From Stability to Chaos: Analyzing Gradient Descent Dynamics in Quadratic Regression

Chen, Xuxing, Balasubramanian, Krishnakumar, Ghosal, Promit, Agrawalla, Bhavya

arXiv.org Machine LearningOct-2-2023

We conduct a comprehensive investigation into the dynamics of gradient descent using large-order constant step-sizes in the context of quadratic regression models. Within this framework, we reveal that the dynamics can be encapsulated by a specific cubic map, naturally parameterized by the step-size. Through a fine-grained bifurcation analysis concerning the step-size parameter, we delineate five distinct training phases: (1) monotonic, (2) catapult, (3) periodic, (4) chaotic, and (5) divergent, precisely demarcating the boundaries of each phase. As illustrations, we provide examples involving phase retrieval and two-layer neural networks employing quadratic activation functions and constant outer-layers, utilizing orthogonal training data. Our simulations indicate that these five phases also manifest with generic non-orthogonal data. We also empirically investigate the generalization performance when training in the various non-monotonic (and non-divergent) phases. In particular, we observe that performing an ergodic trajectory averaging stabilizes the test error in non-monotonic (and non-divergent) phases.

artificial intelligence, log loss 3, machine learning, (9 more...)

arXiv.org Machine Learning

2310.01687

Country:

North America > United States > California (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)

Add feedback

High-dimensional scaling limits and fluctuations of online least-squares SGD with smooth covariance

Balasubramanian, Krishnakumar, Ghosal, Promit, He, Ye

arXiv.org Machine LearningApr-2-2023

We derive high-dimensional scaling limits and fluctuations for the online least-squares Stochastic Gradient Descent (SGD) algorithm by taking the properties of the data generating model explicitly into consideration. Our approach treats the SGD iterates as an interacting particle system, where the expected interaction is characterized by the covariance structure of the input. Assuming smoothness conditions on moments of order up to eight orders, and without explicitly assuming Gaussianity, we establish the high-dimensional scaling limits and fluctuations in the form of infinite-dimensional Ordinary Differential Equations (ODEs) or Stochastic Differential Equations (SDEs). Our results reveal a precise three-step phase transition of the iterates; it goes from being ballistic, to diffusive, and finally to purely random behavior, as the noise variance goes from low, to moderate and finally to very-high noise setting. In the low-noise setting, we further characterize the precise fluctuations of the (scaled) iterates as infinite-dimensional SDEs. We also show the existence and uniqueness of solutions to the derived limiting ODEs and SDEs. Our results have several applications, including characterization of the limiting mean-square estimation or prediction errors and their fluctuations which can be obtained by analytically or numerically solving the limiting equations.

artificial intelligence, assumption 1, machine learning, (16 more...)

arXiv.org Machine Learning

2304.00707

Country:

North America > United States > California (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

High-dimensional Central Limit Theorems for Linear Functionals of Online Least-Squares SGD

Agrawalla, Bhavya, Balasubramanian, Krishnakumar, Ghosal, Promit

arXiv.org Machine LearningFeb-19-2023

Stochastic gradient descent (SGD) has emerged as the quintessential method in a data scientist's toolbox. Much progress has been made in the last two decades toward understanding the iteration complexity of SGD (in expectation and high-probability) in the learning theory and optimization literature. However, using SGD for high-stakes applications requires careful quantification of the associated uncertainty. Toward that end, in this work, we establish high-dimensional Central Limit Theorems (CLTs) for linear functionals of online least-squares SGD iterates under a Gaussian design assumption. Our main result shows that a CLT holds even when the dimensionality is of order exponential in the number of iterations of the online SGD, thereby enabling high-dimensional inference with online SGD. Our proof technique involves leveraging Berry-Esseen bounds developed for martingale difference sequences and carefully evaluating the required moment and quadratic variation terms through recent advances in concentration inequalities for product random matrices. We also provide an online approach for estimating the variance appearing in the CLT (required for constructing confidence intervals in practice) and establish consistency results in the high-dimensional setting.

artificial intelligence, krishnakumar balasubramanian, machine learning, (14 more...)

arXiv.org Machine Learning

2302.09727

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

Rates of Estimation of Optimal Transport Maps using Plug-in Estimators via Barycentric Projections

Deb, Nabarun, Ghosal, Promit, Sen, Bodhisattva

arXiv.org Machine LearningJul-4-2021

Optimal transport maps between two probability distributions $\mu$ and $\nu$ on $\mathbb{R}^d$ have found extensive applications in both machine learning and statistics. In practice, these maps need to be estimated from data sampled according to $\mu$ and $\nu$. Plug-in estimators are perhaps most popular in estimating transport maps in the field of computational optimal transport. In this paper, we provide a comprehensive analysis of the rates of convergences for general plug-in estimators defined via barycentric projections. Our main contribution is a new stability estimate for barycentric projections which proceeds under minimal smoothness assumptions and can be used to analyze general plug-in estimators. We illustrate the usefulness of this stability estimate by first providing rates of convergence for the natural discrete-discrete and semi-discrete estimators of optimal transport maps. We then use the same stability estimate to show that, under additional smoothness assumptions of Besov type or Sobolev type, wavelet based or kernel smoothed plug-in estimators respectively speed up the rates of convergence and significantly mitigate the curse of dimensionality suffered by the natural discrete-discrete/semi-discrete estimators. As a by-product of our analysis, we also obtain faster rates of convergence for plug-in estimators of $W_2(\mu,\nu)$, the Wasserstein distance between $\mu$ and $\nu$, under the aforementioned smoothness assumptions, thereby complementing recent results in Chizat et al. (2020). Finally, we illustrate the applicability of our results in obtaining rates of convergence for Wasserstein barycenters between two probability distributions and obtaining asymptotic detection thresholds for some recent optimal-transport based tests of independence.

artificial intelligence, estimator, machine learning, (15 more...)

arXiv.org Machine Learning

2107.01718

Country:

Europe (0.67)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.34)

Add feedback