Parulekar, Advait
In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness
Collins, Liam, Parulekar, Advait, Mokhtari, Aryan, Sanghavi, Sujay, Shakkottai, Sanjay
A striking property of transformers is their ability to perform in-context learning (ICL), a machine learning framework in which the learner is presented with a novel context during inference, implicitly through some data, and tasked with making a prediction in that context. The learner must therefore adapt to the context without additional training. We explore the role of softmax attention in an ICL setting where each context encodes a regression task. We show that an attention unit learns a window that it uses to implement a nearest-neighbors predictor adapted to the landscape of the pretraining tasks. Specifically, we show that this window widens with decreasing Lipschitzness and increasing label noise in the pretraining tasks. We also show that on low-rank linear problems, the attention unit learns to project onto the appropriate subspace before inference. Further, we show that this adaptivity relies crucially on the softmax activation and thus cannot be replicated by the linear activation often studied in prior theoretical analyses.
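To make the nearest-neighbors picture concrete, here is a minimal NumPy sketch of a single softmax attention unit acting as a kernel smoother over the context. The `width` parameter is a hypothetical stand-in for the scale the paper shows is learned during pretraining, and the distance-kernel form is one common parameterization rather than the paper's exact architecture: a narrow window suits high-Lipschitz tasks, a wide one averages out label noise.

```python
import numpy as np

def attention_predict(x_query, X_ctx, y_ctx, width=0.5):
    """Softmax attention over an in-context regression prompt, viewed
    as a kernel smoother. `width` stands in for the learned attention
    window: narrow for high-Lipschitz pretraining tasks, wide for
    low-Lipschitz or label-noisy ones."""
    logits = -np.sum((X_ctx - x_query) ** 2, axis=1) / width
    logits -= logits.max()                    # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum()                  # softmax attention weights
    return weights @ y_ctx                    # weighted nearest-neighbor average

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(32, 1))                   # context inputs
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=32)    # noisy labels
for width in (0.02, 0.2, 2.0):                         # narrow -> wide window
    print(width, attention_predict(np.array([0.3]), X, y, width))
```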
InfoNCE Loss Provably Learns Cluster-Preserving Representations
Parulekar, Advait, Collins, Liam, Shanmugam, Karthikeyan, Mokhtari, Aryan, Shakkottai, Sanjay
The goal of contrastive learning is to learn a representation that preserves underlying clusters by keeping samples with similar content, e.g. the ``dogness'' of a dog, close to each other in the space generated by the representation. A common and successful approach to this unsupervised learning problem is minimizing the InfoNCE loss associated with the training samples, where each sample is paired with its augmentations (positive samples, such as rotations or crops) and a batch of negative samples (unrelated samples). To the best of our knowledge, it was an open question whether the representation learned by minimizing the InfoNCE loss preserves the underlying data clusters, as the loss only promotes learning a representation that is faithful to augmentations, i.e., an image and its augmentations have the same representation. Our main result shows that the representation learned by InfoNCE with a finite number of negative samples is also consistent with respect to clusters in the data, under the condition that the augmentation sets within clusters may be non-overlapping but are close and intertwined, relative to the complexity of the learning function class.
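For readers who want the objective in code, here is a minimal PyTorch sketch of the InfoNCE loss with a finite batch of negatives. The cosine similarity and temperature are common design choices assumed here, not taken from the paper, and the `encoder`/`augment` names in the usage comment are placeholders.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor, one positive (an augmentation of
    the anchor), and a finite batch of K negatives (unrelated samples).

    anchor:    (d,)   representation of a sample
    positive:  (d,)   representation of an augmentation of that sample
    negatives: (K, d) representations of the negative samples
    """
    pos = F.cosine_similarity(anchor, positive, dim=0) / temperature
    neg = F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=1) / temperature
    logits = torch.cat([pos.unsqueeze(0), neg])
    # cross-entropy against index 0: pull the positive close, push negatives away
    return F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))

# usage: h = encoder(x); loss = info_nce(h, encoder(augment(x)), encoder(batch))
```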
A Theoretical Justification for Image Inpainting using Denoising Diffusion Probabilistic Models
Rout, Litu, Parulekar, Advait, Caramanis, Constantine, Shakkottai, Sanjay
We provide a theoretical justification for sample recovery using diffusion-based image inpainting in a linear model setting. While most inpainting algorithms require retraining with each new mask, we prove that diffusion-based inpainting generalizes well to unseen masks without retraining. We analyze a recently proposed and popular diffusion-based inpainting algorithm called RePaint (Lugmayr et al., 2022), and show that it has a bias due to misalignment that hampers sample recovery even in a two-state diffusion process. Motivated by our analysis, we propose a modified RePaint algorithm, RePaint$^+$, that provably recovers the underlying true sample and enjoys a linear rate of convergence. It achieves this by rectifying the misalignment error present in the drift and dispersion of the reverse process. To the best of our knowledge, this is the first linear convergence result for a diffusion-based image inpainting algorithm.
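For intuition, a sketch of the core RePaint combination step that the analysis concerns: at each reverse step, the observed region is re-noised from the ground truth through the forward process while the missing region follows the learned reverse process, and the two are stitched together by the mask. `denoise_step` is a hypothetical stand-in for the learned reverse transition; RePaint$^+$'s corrected drift and dispersion are not reproduced here.

```python
import numpy as np

def repaint_step(x_t, x0_known, mask, t, alphas_bar, denoise_step, rng):
    """One RePaint-style reverse step (sketch).

    mask == 1 marks observed pixels. The known region is re-noised
    from the ground truth x0_known via the forward process, while the
    unknown region takes an ordinary reverse (denoising) step.
    """
    a_bar = alphas_bar[t]
    noise = rng.normal(size=x_t.shape)
    x_known = np.sqrt(a_bar) * x0_known + np.sqrt(1.0 - a_bar) * noise
    x_unknown = denoise_step(x_t, t)           # learned reverse transition
    return mask * x_known + (1.0 - mask) * x_unknown
```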
Improved Algorithms for Misspecified Linear Markov Decision Processes
Vial, Daniel, Parulekar, Advait, Shakkottai, Sanjay, Srikant, R.
Due to the large (possibly infinite) state spaces of modern reinforcement learning applications, practical algorithms must generalize across states. To understand generalization on a theoretical level, recent work has studied linear Markov decision processes (LMDPs), among other models (see Section 1.2 for related work). The LMDP model assumes the next-state distribution and reward are linear in known $d$-dimensional features, which enables tractable generalization when $d$ is small. Of course, this linearity assumption most likely fails in practice, which motivates the misspecified LMDP (MLMDP) model.
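For concreteness, the $\varepsilon$-misspecified LMDP condition as standardly stated in the linear-MDP literature (notation ours, a sketch rather than the paper's exact definition): for all state-action pairs $(s, a)$,

$$\big\| P(\cdot \mid s, a) - \langle \phi(s, a), \mu(\cdot) \rangle \big\|_{TV} \le \varepsilon, \qquad \big| r(s, a) - \langle \phi(s, a), \theta \rangle \big| \le \varepsilon,$$

where $\phi(s, a) \in \mathbb{R}^d$ are the known features and $\mu$, $\theta$ are unknown; taking $\varepsilon = 0$ recovers the exact LMDP.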
L1 Regression with Lewis Weights Subsampling
Parulekar, Aditya, Parulekar, Advait, Price, Eric
We consider the problem of finding an approximate solution to $\ell_1$ regression while only observing a small number of labels. Given an $n \times d$ unlabeled data matrix $X$, we must choose a small set of $m \ll n$ rows to observe the labels of, then output an estimate $\widehat{\beta}$ whose error on the original problem is within a $1 + \varepsilon$ factor of optimal. We show that sampling from $X$ according to its Lewis weights and outputting the empirical minimizer succeeds with probability $1-\delta$ for $m > O(\frac{1}{\varepsilon^2} d \log \frac{d}{\varepsilon \delta})$. This is analogous to the performance of sampling according to leverage scores for $\ell_2$ regression, but with exponentially better dependence on $\delta$. We also give a corresponding lower bound of $\Omega(\frac{d}{\varepsilon^2} + (d + \frac{1}{\varepsilon^2}) \log\frac{1}{\delta})$.
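For concreteness, a NumPy sketch of the standard $\ell_1$ Lewis-weight fixed-point iteration and the induced sampling distribution, in the style of Cohen and Peng; the paper's exact computation may differ.

```python
import numpy as np

def lewis_weights_l1(X, iters=20):
    """Approximate l1 Lewis weights via the fixed-point iteration
    w_i <- (x_i^T (X^T W^{-1} X)^{-1} x_i)^{1/2}."""
    n, d = X.shape
    w = np.ones(n)
    for _ in range(iters):
        M = np.linalg.inv(X.T @ (X / w[:, None]))   # (X^T W^{-1} X)^{-1}
        lev = np.einsum('ij,jk,ik->i', X, M, X)     # x_i^T M x_i
        w = np.sqrt(np.maximum(lev, 1e-12))
    return w

# sample m rows with probability proportional to the Lewis weights,
# then solve the reweighted l1 problem on the subsample
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
w = lewis_weights_l1(X)
p = w / w.sum()
m = 100
idx = rng.choice(len(X), size=m, replace=True, p=p)
# rescale each sampled row/label by 1/(m * p_i) before minimizing
# sum_i |y_i - x_i^T beta| with any l1 / LP solver
```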
Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation
Vial, Daniel, Parulekar, Advait, Shakkottai, Sanjay, Srikant, R.
We propose two algorithms for episodic stochastic shortest path problems with linear function approximation. The first is computationally expensive but provably obtains $\tilde{O}(\sqrt{B_\star^3 d^3 K / c_{\min}})$ regret, where $B_\star$ is a (known) upper bound on the optimal cost-to-go function, $d$ is the feature dimension, $K$ is the number of episodes, and $c_{\min}$ is the minimal cost of non-goal state-action pairs (assumed to be positive). The second is computationally efficient in practice, and we conjecture that it obtains the same regret bound. Both algorithms are based on an optimistic least-squares version of value iteration, analogous to the finite-horizon backward induction approach of Jin et al. (2020). To the best of our knowledge, these are the first regret bounds for stochastic shortest path that are independent of the size of the state and action spaces.
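A sketch of the optimistic least-squares value-iteration template underlying both algorithms, with an elliptical confidence bonus; since SSP minimizes cost, optimism means subtracting the bonus. The confidence radius `beta` and regularizer `lam` are left abstract, and this illustrates the LSVI template rather than either algorithm verbatim.

```python
import numpy as np

def optimistic_q(phi, costs, next_values, lam=1.0, beta=1.0):
    """Optimistic least-squares value-iteration update (sketch).

    phi:         (n, d) features of visited state-action pairs
    costs:       (n,)   observed costs
    next_values: (n,)   current value estimates at the observed next states
    Returns a Q estimate that is optimistic (a lower confidence bound,
    since SSP minimizes cost) via an elliptical bonus.
    """
    d = phi.shape[1]
    gram = lam * np.eye(d) + phi.T @ phi                # regularized Gram matrix
    theta = np.linalg.solve(gram, phi.T @ (costs + next_values))
    gram_inv = np.linalg.inv(gram)

    def q(feat):
        bonus = beta * np.sqrt(feat @ gram_inv @ feat)  # elliptical confidence bonus
        return float(feat @ theta - bonus)              # subtract: optimism for costs
    return q
```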