AITopics | tilde omega

Embedding Dimension of Contrastive Learning and k -Nearest Neighbors

Neural Information Processing SystemsMar-20-2026, 06:40:59 GMT

We study the embedding dimension of distance comparison data in two settings: contrastive learning and $k$-nearest neighbors ($k$-NN). In both cases, the goal is to find the smallest dimension $d$ of an $\ell_p$-space in which a given dataset can be represented. We show that the arboricity of the associated graphs plays a key role in designing embeddings. Using this approach, for the most frequently used $\ell_2$-distance, we get matching upper and lower bounds in both settings. In contrastive learning, we are given $m$ labeled samples of the form $(x_i, y_i^+, z_i^-)$ representing the fact that the positive example $y_i$ is closer to the anchor $x_i$ than the negative example $z_i$. We show that for representing such dataset in:- $\ell_2$: $d = \Theta(\sqrt{m})$ is necessary and sufficient.-

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Sketching Algorithms for Sparse Dictionary Learning: PTAS and Turnstile Streaming

Neural Information Processing SystemsDec-26-2025, 09:30:10 GMT

Sketching algorithms have recently proven to be a powerful approach both for designing low-space streaming algorithms as well as fast polynomial time approximation schemes (PTAS). In this work, we develop new techniques to extend the applicability of sketching-based approaches to the sparse dictionary learning and the Euclidean $k$-means clustering problems. In particular, we initiate the study of the challenging setting where the dictionary/clustering assignment for each of the $n$ input points must be output, which has surprisingly received little attention in prior work. On the fast algorithms front, we obtain a new approach for designing PTAS's for the $k$-means clustering problem, which generalizes to the first PTAS for the sparse dictionary learning problem. On the streaming algorithms front, we obtain new upper bounds and lower bounds for dictionary learning and $k$-means clustering. In particular, given a design matrix $\mathbf A\in\mathbb R^{n\times d}$ in a turnstile stream, we show an $\tilde O(nr/\epsilon^2 + dk/\epsilon)$ space upper bound for $r$-sparse dictionary learning of size $k$, an $\tilde O(n/\epsilon^2 + dk/\epsilon)$ space upper bound for $k$-means clustering, as well as an $\tilde O(n)$ space upper bound for $k$-means clustering on random order row insertion streams with a natural bounded sensitivity assumption. On the lower bounds side, we obtain a general $\tilde\Omega(n/\epsilon + dk/\epsilon)$ lower bound for $k$-means clustering, as well as an $\tilde\Omega(n/\epsilon^2)$ lower bound for algorithms which can estimate the cost of a single fixed set of candidate centers.

pta and turnstile streaming, sketching algorithm, sparse dictionary learning, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation

Neural Information Processing SystemsDec-25-2025, 04:11:25 GMT

The current paper studies sample-efficient Reinforcement Learning (RL) in settings where only the optimal value function is assumed to be linearly-realizable. It has recently been understood that, even under this seemingly strong assumption and access to a generative model, worst-case sample complexities can be prohibitively (i.e., exponentially) large. We investigate the setting where the learner additionally has access to interactive demonstrations from an expert policy, and we present a statistically and computationally efficient algorithm (Delphi) for blending exploration with expert queries. In particular, Delphi requires $\tilde O(d)$ expert queries and a $\texttt{poly}(d,H,|A|,1/\varepsilon)$ amount of exploratory samples to provably recover an $\varepsilon$-suboptimal policy. Compared to pure RL approaches, this corresponds to an exponential improvement in sample complexity with surprisingly-little expert input. Compared to prior imitation learning (IL) approaches, our required number of expert demonstrations is independent of $H$ and logarithmic in $1/\varepsilon$, whereas all prior work required at least linear factors of both in addition to the same dependence on $d$. Towards establishing the minimal amount of expert queries needed, we show that, in the same setting, any learner whose exploration budget is \textit{polynomially-bounded} (in terms of $d,H,$ and $|A|$) will require \textit{at least} $\tilde\Omega(\sqrt{d})$ oracle calls to recover a policy competing with the expert's value function. Under the weaker assumption that the expert's policy is linear, we show that the lower bound increases to $\tilde\Omega(d)$.

expert query suffice, reset and linear value approximation, sample-efficient rl, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.75)

Add feedback

Embedding Dimension of Contrastive Learning and k -Nearest Neighbors

Neural Information Processing SystemsAug-12-2025, 23:06:09 GMT

We study the embedding dimension of distance comparison data in two settings: contrastive learning and k -nearest neighbors ( k -NN). In both cases, the goal is to find the smallest dimension d of an \ell_p -space in which a given dataset can be represented. We show that the arboricity of the associated graphs plays a key role in designing embeddings. Using this approach, for the most frequently used \ell_2 -distance, we get matching upper and lower bounds in both settings. In contrastive learning, we are given m labeled samples of the form (x_i, y_i, z_i -) representing the fact that the positive example y_i is closer to the anchor x_i than the negative example z_i .

artificial intelligence, machine learning, tilde omega, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Add feedback

Linear Regression using Heterogeneous Data Batches

Neural Information Processing SystemsMay-27-2025, 10:42:20 GMT

In many learning applications, data are collected from multiple sources, each providing a \emph{batch} of samples that by itself is insufficient to learn its input-output relationship. A common approach assumes that the sources fall in one of several unknown subgroups, each with an unknown input distribution and input-output relationship. We consider one of this setup's most fundamental and important manifestations where the output is a noisy linear combination of the inputs, and there are k subgroups, each with its own regression vector. Prior work [KSS 20] showed that with abundant small-batches, the regression vectors can be learned with only few, \tilde\Omega( k {3/2}), batches of medium-size with \tilde\Omega(\sqrt k) samples each. However, the paper requires that the input distribution for all k subgroups be isotropic Gaussian, and states that removing this assumption is an interesting and challenging problem".

heterogeneous data batch, input distribution, linear regression, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Databases (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)

Add feedback

Sketching Algorithms for Sparse Dictionary Learning: PTAS and Turnstile Streaming

Neural Information Processing SystemsJan-19-2025, 16:22:00 GMT

Sketching algorithms have recently proven to be a powerful approach both for designing low-space streaming algorithms as well as fast polynomial time approximation schemes (PTAS). In this work, we develop new techniques to extend the applicability of sketching-based approaches to the sparse dictionary learning and the Euclidean k -means clustering problems. In particular, we initiate the study of the challenging setting where the dictionary/clustering assignment for each of the n input points must be output, which has surprisingly received little attention in prior work. On the fast algorithms front, we obtain a new approach for designing PTAS's for the k -means clustering problem, which generalizes to the first PTAS for the sparse dictionary learning problem. On the streaming algorithms front, we obtain new upper bounds and lower bounds for dictionary learning and k -means clustering.

pta and turnstile streaming, sketching algorithm, sparse dictionary learning, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation

Neural Information Processing SystemsJan-18-2025, 18:23:36 GMT

The current paper studies sample-efficient Reinforcement Learning (RL) in settings where only the optimal value function is assumed to be linearly-realizable. It has recently been understood that, even under this seemingly strong assumption and access to a generative model, worst-case sample complexities can be prohibitively (i.e., exponentially) large. We investigate the setting where the learner additionally has access to interactive demonstrations from an expert policy, and we present a statistically and computationally efficient algorithm (Delphi) for blending exploration with expert queries. In particular, Delphi requires \tilde O(d) expert queries and a \texttt{poly}(d,H, A,1/\varepsilon) amount of exploratory samples to provably recover an \varepsilon -suboptimal policy. Compared to pure RL approaches, this corresponds to an exponential improvement in sample complexity with surprisingly-little expert input.

expert query suffice, reset and linear value approximation, sample-efficient rl, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.58)

Add feedback

Filters

Collaborating Authors

tilde omega

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Embedding Dimension of Contrastive Learning and k -Nearest Neighbors

Sketching Algorithms for Sparse Dictionary Learning: PTAS and Turnstile Streaming

A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation

Embedding Dimension of Contrastive Learning and k -Nearest Neighbors

Linear Regression using Heterogeneous Data Batches

Sketching Algorithms for Sparse Dictionary Learning: PTAS and Turnstile Streaming

A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation