
Collaborating Authors

 Verchand, Kabir Aladin


High-dimensional logistic regression with missing data: Imputation, regularization, and universality

arXiv.org Machine Learning

We study high-dimensional, ridge-regularized logistic regression in a setting in which the covariates may be missing or corrupted by additive noise. When both the covariates and the additive corruptions are independent and normally distributed, we provide exact characterizations of both the prediction error and the estimation error. Moreover, we show that these characterizations are universal: as long as the entries of the data matrix satisfy a set of independence and moment conditions, our guarantees continue to hold. Universality, in turn, enables the detailed study of several imputation-based strategies when the covariates are missing completely at random. We ground our study by comparing the performance of these strategies with the conjectured performance -- stemming from replica theory in statistical physics -- of the Bayes optimal procedure. Our analysis yields several insights, including: (i) a distinction between single imputation and a simple variant of multiple imputation, and (ii) that adding a simple ridge regularization term to single-imputed logistic regression can yield an estimator whose prediction error is nearly indistinguishable from the Bayes optimal prediction error. We supplement our findings with extensive numerical experiments.
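As a rough illustration of the single-imputation-plus-ridge pipeline the abstract refers to, the following Python sketch imputes missing-completely-at-random covariates by their column means and then fits an L2-penalized logistic regression with scikit-learn. The problem sizes, missingness rate, and penalty level are assumptions chosen for the example, not values from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d, lam, p_miss = 2000, 200, 0.1, 0.3   # illustrative sizes, not taken from the paper

# Gaussian covariates and logistic responses from a ground-truth direction
theta_star = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ theta_star))).astype(int)

# Covariates missing completely at random
mask = rng.random((n, d)) < p_miss
X_obs = np.where(mask, np.nan, X)

# Single (mean) imputation: replace each missing entry by its column mean
col_means = np.nanmean(X_obs, axis=0)
X_imp = np.where(np.isnan(X_obs), col_means, X_obs)

# Ridge-regularized logistic regression on the imputed design
# (scikit-learn's C is the inverse of the ridge penalty strength)
clf = LogisticRegression(penalty="l2", C=1.0 / lam, fit_intercept=False,
                         solver="lbfgs", max_iter=1000)
clf.fit(X_imp, y)
theta_hat = clf.coef_.ravel()

# Cosine similarity with the truth as a crude proxy for estimation error
cos_sim = theta_hat @ theta_star / (np.linalg.norm(theta_hat) * np.linalg.norm(theta_star))
print(f"cosine similarity with ground truth: {cos_sim:.3f}")
```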


Low-degree phase transitions for detecting a planted clique in sublinear time

arXiv.org Machine Learning

We consider the problem of detecting a planted clique of size $k$ in a random graph on $n$ vertices. When the size of the clique exceeds $\Theta(\sqrt{n})$, polynomial-time algorithms for detection proliferate. We study faster -- namely, sublinear time -- algorithms in the high-signal regime when $k = \Theta(n^{1/2 + \delta})$, for some $\delta > 0$. To this end, we consider algorithms that non-adaptively query a subset $M$ of entries of the adjacency matrix and then compute a low-degree polynomial function of the revealed entries. We prove a computational phase transition for this class of non-adaptive low-degree algorithms: under the scaling $\lvert M \rvert = \Theta(n^{\gamma})$, the clique can be detected when $\gamma > 3(1/2 - \delta)$ but not when $\gamma < 3(1/2 - \delta)$. As a result, the best known runtime for detecting a planted clique, $\widetilde{O}(n^{3(1/2-\delta)})$, cannot be improved without looking beyond the non-adaptive low-degree class. Our proof of the lower bound -- based on bounding the conditional low-degree likelihood ratio -- reveals further structure in non-adaptive detection of a planted clique. Using (a bound on) the conditional low-degree likelihood ratio as a potential function, we show that for every non-adaptive query pattern, there is a highly structured query pattern of the same size that is at least as effective.
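To make the query model concrete, here is a small Python sketch (an assumption-laden illustration, not the paper's algorithm or its optimal query pattern) of a non-adaptive, degree-1 test: reveal all adjacency entries within a fixed vertex subset and compare the edge count against its null distribution. The graph size, clique size, and number of queried vertices below are arbitrary choices for demonstration, not calibrated to the phase-transition thresholds.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_graph(n, k=None):
    """Adjacency matrix of G(n, 1/2), optionally with a planted k-clique."""
    A = np.triu((rng.random((n, n)) < 0.5).astype(np.int8), 1)
    A = A + A.T
    if k is not None:
        C = rng.choice(n, size=k, replace=False)
        A[np.ix_(C, C)] = 1
        A[C, C] = 0                      # keep the diagonal empty
    return A

def edge_count(A, m_vertices):
    """Non-adaptively query all pairs within a random vertex subset and
    return the edge count, a degree-1 polynomial of the revealed entries."""
    S = rng.choice(A.shape[0], size=m_vertices, replace=False)
    return np.triu(A[np.ix_(S, S)], 1).sum()

n, k, m = 2000, 300, 300                 # illustrative sizes only
pairs = m * (m - 1) // 2                 # number of queried entries |M|
null_mean, null_std = pairs / 2, np.sqrt(pairs) / 2
for label, A in [("null", sample_graph(n)), ("planted", sample_graph(n, k))]:
    z = (edge_count(A, m) - null_mean) / null_std
    print(f"{label}: z-score of edge count = {z:.1f}")
```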


Hyperparameter tuning via trajectory predictions: Stochastic prox-linear methods in matrix sensing

arXiv.org Machine Learning

This model finds applications in diverse areas of science and engineering, including astronomy, medical imaging, and communications (Jefferies and Christou, 1993; Wang and Poor, 1998; Campisi and Egiazarian, 2017). For instance, it forms an example of the blind deconvolution problem in statistical signal processing (see, e.g., Recht et al. (2010); Ahmed et al. (2013) and the references therein for several applications of this problem). We are interested in the model-fitting problem, and the natural least squares population objective $L: \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ (corresponding to the scaled negative log-likelihood of our observations under Gaussian noise) can be written as
$$L(\mu, \nu) = \mathbb{E}\Big[\big(y - \langle x, \mu \rangle \langle z, \nu \rangle\big)^2\Big], \qquad (2)$$
where the conditional distribution of $y$ given $x, z$ is as specified by the model (1). Note that $L$ is a jointly nonconvex function of the parameters $(\mu, \nu)$. With the goal of minimizing the population loss $L$, we consider online algorithms which operate on a mini-batch of size $m$ with $1 \leq m \leq d$, for which we draw a fresh set of observations $\{y_i, x_i, z_i\}_{i=1}^{m}$ at each iteration and form the averaged loss
$$L_m(\mu, \nu) = \frac{1}{m} \sum_{i=1}^{m} \big(y_i - \langle x_i, \mu \rangle \langle z_i, \nu \rangle\big)^2.$$
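For concreteness, the following numpy sketch draws a fresh mini-batch at each iteration and takes one stochastic prox-linear step on the bilinear least-squares loss $L_m$ above: the bilinear map $\langle x, \mu \rangle \langle z, \nu \rangle$ is linearized at the current iterate and the resulting ridge-regularized least-squares subproblem is solved in closed form. The dimension, batch size, step size, and noise level are illustrative assumptions, not the tuned values studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, eta, sigma = 50, 25, 1.0, 0.1    # dimension, batch size, step size, noise level (assumed)

mu_star = rng.standard_normal(d) / np.sqrt(d)
nu_star = rng.standard_normal(d) / np.sqrt(d)

def sample_batch(m):
    """Fresh observations y_i = <x_i, mu*><z_i, nu*> + noise, drawn each iteration."""
    X, Z = rng.standard_normal((m, d)), rng.standard_normal((m, d))
    y = (X @ mu_star) * (Z @ nu_star) + sigma * rng.standard_normal(m)
    return X, Z, y

def prox_linear_step(mu, nu, X, Z, y, eta):
    """One stochastic prox-linear update: linearize the residual map
    c_i(mu, nu) = <x_i, mu><z_i, nu> - y_i at the current iterate and
    minimize (1/m)||c + A @ delta||^2 + ||delta||^2 / (2 * eta) in closed form."""
    u, v = X @ mu, Z @ nu                                 # <x_i, mu>, <z_i, nu>
    resid = u * v - y                                     # c_i at the current iterate
    A = np.hstack([X * v[:, None], Z * u[:, None]])       # Jacobian rows of c_i
    m_batch = len(y)
    H = 2.0 * A.T @ A / m_batch + np.eye(2 * d) / eta
    g = 2.0 * A.T @ resid / m_batch
    delta = np.linalg.solve(H, -g)
    return mu + delta[:d], nu + delta[d:]

mu = rng.standard_normal(d) / np.sqrt(d)
nu = rng.standard_normal(d) / np.sqrt(d)
for t in range(300):
    X, Z, y = sample_batch(m)
    mu, nu = prox_linear_step(mu, nu, X, Z, y, eta)
    if t % 50 == 0:
        # mini-batch loss L_m(mu, nu) from the text, evaluated on the current batch
        print(t, np.mean(((X @ mu) * (Z @ nu) - y) ** 2))
```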