High Dimensional Linear Regression using Lattice Basis Reduction

Neural Information Processing Systems

We consider a high dimensional linear regression problem where the goal is to efficiently recover an unknown vector $\beta^*$ from $n$ noisy linear observations $Y = X\beta^* + W \in \mathbb{R}^n$, for known $X \in \mathbb{R}^{n \times p}$ and unknown $W \in \mathbb{R}^n$. Unlike most of the literature on this model, we make no sparsity assumption on $\beta^*$. Instead, we adopt a regularization based on assuming that the underlying vector $\beta^*$ has rational entries with the same denominator $Q$; we call this the $Q$-rationality assumption. We propose a new polynomial-time algorithm for this task based on the seminal Lenstra-Lenstra-Lovász (LLL) lattice basis reduction algorithm. We establish that under the $Q$-rationality assumption, our algorithm recovers the vector $\beta^*$ exactly for a large class of distributions of the i.i.d. entries of $X$ and non-zero noise $W$. We prove that it succeeds under small noise, even when the learner has access to only one observation ($n=1$). Furthermore, we prove that in the case of Gaussian white noise $W$, with $n = o(p/\log p)$ and $Q$ sufficiently large, our algorithm tolerates a nearly information-theoretically optimal noise level.
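
As a reading aid (and not the authors' algorithm), the sketch below sets up the observation model under the $Q$-rationality assumption: $\beta^* = z/Q$ for an unknown integer vector $z$, so exact recovery amounts to identifying $z$. The Gaussian design, the noise scale sigma, and the specific values of n, p, Q are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, Q = 10, 50, 1000          # few observations, many coefficients, common denominator
sigma = 1e-6                    # "small noise" regime mentioned in the abstract

# Q-rationality: beta* = z / Q for an unknown integer vector z.
z_true = rng.integers(-5, 6, size=p)
beta_true = z_true / Q

X = rng.standard_normal((n, p))         # known design (illustrative Gaussian choice)
W = sigma * rng.standard_normal(n)      # unknown noise
Y = X @ beta_true + W                   # what the learner observes, together with X

# Exact recovery means returning beta_hat with Q * beta_hat equal to z; any candidate
# within 1/(2Q) of beta* in every coordinate can be certified by rounding.
def is_exact(beta_hat):
    return np.array_equal(np.rint(Q * beta_hat).astype(int), z_true)

print(is_exact(beta_true))              # True
print(is_exact(beta_true + 0.6 / Q))    # False: every coordinate is off by more than 1/(2Q)
```

Because $Q\beta^*$ is an integer vector, the recovery problem has an inherently integral structure, which is what makes a lattice basis reduction method such as LLL a natural tool; the paper's specific lattice construction is not reproduced in this sketch.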


Solving 'barren plateaus' is the key to quantum machine learning

#artificialintelligence

"The work solves a key problem of useability for quantum machine learning. We rigorously proved the conditions under which certain architectures of variational quantum algorithms will or will not have barren plateaus as they are scaled up," said Marco Cerezo, lead author on the paper published in Nature Communications today by a Los Alamos National Laboratory team. Cerezo is a post doc researching quantum information theory at Los Alamos. "With our theorems, you can guarantee that the architecture will be scalable to quantum computers with a large number of qubits." "Usually the approach has been to run an optimization and see if it works, and that was leading to fatigue among researchers in the field," said Patrick Coles, a coauthor of the study.


Optimal Low-Degree Hardness of Maximum Independent Set

arXiv.org Machine Learning

We study the algorithmic task of finding a large independent set in a sparse Erdős–Rényi random graph with $n$ vertices and average degree $d$. The maximum independent set is known to have size $(2 \log d / d)n$ in the double limit $n \to \infty$ followed by $d \to \infty$, but the best known polynomial-time algorithms can only find an independent set of half-optimal size $(\log d / d)n$. We show that the class of low-degree polynomial algorithms can find independent sets of half-optimal size but no larger, improving upon a result of Gamarnik, Jagannath, and the author. This generalizes earlier work by Rahman and Virág, which proved the analogous result for the weaker class of local algorithms.
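
For intuition about the half-optimal benchmark, the sketch below runs the plain greedy algorithm on a sparse Erdős–Rényi graph and compares its output to the $(\log d / d)n$ and $(2 \log d / d)n$ scales from the abstract. Greedy is a classical baseline known to reach roughly the half-optimal size; it stands in here only for illustration and is not one of the low-degree polynomial algorithms analyzed in the paper. The values of n and d and the use of networkx are assumptions made for the example.

```python
import math
import random
import networkx as nx

n, d = 20_000, 40                                 # illustrative sizes
G = nx.fast_gnp_random_graph(n, d / n, seed=0)    # sparse Erdos-Renyi graph G(n, d/n)

order = list(G.nodes())
random.seed(0)
random.shuffle(order)

# Greedy: scan vertices in random order, keep v unless a kept vertex blocks it.
independent, blocked = [], set()
for v in order:
    if v not in blocked:
        independent.append(v)
        blocked.add(v)
        blocked.update(G.neighbors(v))

print("greedy size      :", len(independent))
print("(log d / d) n    :", round(math.log(d) / d * n))      # half-optimal benchmark
print("(2 log d / d) n  :", round(2 * math.log(d) / d * n))  # asymptotic maximum
```

On graphs of this size the greedy count lands close to the $(\log d / d)n$ line, which is exactly the threshold the paper proves low-degree polynomial algorithms cannot beat.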


Structure learning of antiferromagnetic Ising models

Neural Information Processing Systems

In this paper we investigate the computational complexity of learning the graph structure underlying a discrete undirected graphical model from i.i.d. samples. Our first result is an unconditional computational lower bound of $\Omega (p^{d/2})$ for learning general graphical models on $p$ nodes of maximum degree $d$, for the class of statistical algorithms recently introduced by Feldman et al. The construction is related to the notoriously difficult learning parities with noise problem in computational learning theory. Our lower bound shows that the $\widetilde O(p^{d+2})$ runtime required by Bresler, Mossel, and Sly's exhaustive-search algorithm cannot be significantly improved without restricting the class of models. Aside from structural assumptions on the graph such as it being a tree, hypertree, tree-like, etc., most recent papers on structure learning assume that the model has the correlation decay property. Indeed, focusing on ferromagnetic Ising models, Bento and Montanari showed that all known low-complexity algorithms fail to learn simple graphs when the interaction strength exceeds a number related to the correlation decay threshold. Our second set of results gives a class of repelling (antiferromagnetic) models that have the opposite behavior: very strong repelling allows efficient learning in time $\widetilde O(p^2)$. We provide an algorithm whose performance interpolates between $\widetilde O(p^2)$ and $\widetilde O(p^{d+2})$ depending on the strength of the repulsion.
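
As a heavily simplified toy (not the authors' algorithm), the sketch below samples a strongly antiferromagnetic Ising model on a small star graph and screens all pairs of nodes for unusually low agreement. On this particular graph, strong repulsion leaves a clear pairwise signature, which gives some intuition for why an $\widetilde O(p^2)$-time pairwise test can suffice; the graph, coupling strength, Gibbs sampler, sample size, and 0.25 threshold are all illustrative assumptions, and the paper's actual algorithm and guarantees are more subtle.

```python
import numpy as np

rng = np.random.default_rng(0)

p, J = 10, -3.0                           # star graph on p nodes, strongly repelling coupling
edges = {(0, j) for j in range(1, p)}     # node 0 is the hub
coupling = np.zeros((p, p))
for i, j in edges:
    coupling[i, j] = coupling[j, i] = J

def gibbs_samples(coupling, n_samples, burn=200, thin=5):
    """Crude Gibbs sampler for P(x) proportional to exp(sum_{i<j} J_ij x_i x_j), x in {-1,+1}^p."""
    q = coupling.shape[0]
    x = rng.choice([-1, 1], size=q)
    out = []
    for t in range(burn + n_samples * thin):
        for i in range(q):
            h = coupling[i] @ x                         # local field from neighbours (J_ii = 0)
            prob_plus = 1.0 / (1.0 + np.exp(-2.0 * h))  # conditional law of x_i given the rest
            x[i] = 1 if rng.random() < prob_plus else -1
        if t >= burn and (t - burn) % thin == 0:
            out.append(x.copy())
    return np.array(out)

samples = gibbs_samples(coupling, n_samples=500)

# O(p^2) screen: flag a pair whenever the two spins agree in fewer than 25% of samples.
agree = (samples[:, :, None] == samples[:, None, :]).mean(axis=0)
recovered = {(i, j) for i in range(p) for j in range(i + 1, p) if agree[i, j] < 0.25}

print("true edges     :", sorted(edges))
print("recovered edges:", sorted(recovered))
```

On the star with this coupling, hub-leaf pairs agree only a fraction of a percent of the time while leaf-leaf pairs agree almost always, so the scan recovers exactly the true edges; on other graphs such a naive screen can fail, which is part of what the paper's analysis addresses.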