Computational lower bounds in latent models: clustering, sparse-clustering, biclustering

Even, Bertrand, Giraud, Christophe, Verzelen, Nicolas

arXiv.org Machine Learning 

In high-dimensional statistics, the primary goal is to derive computationally efficient estimation procedures, achieving the best possible statistical performance. Y et, in many problems, such as sparse PCA, planted clique or clustering, the best known algorithms with polynomial-time complexity are unable to match the performances provably achievable by the best estimators (without computational constraints). This observation has lead to several conjectures on the existence of gaps (called statistical-computational gaps) between the optimal statistical performance, i.e. the best performance achievable without computational constraints, and the best performance achievable by polynomial time algorithms. In particular, to assess the quality of a computationally efficient algorithm for a given task, the theoretical performance should not be compared to the optimal statistical performance (without computational constraints), but to the performance of the best poly-time algorithm. This raises the problem of establishing lower-bounds on the performance of the best poly-time algorithms for a wide range of problems. Since high-dimensional statistics deal with random instances, the classical notions of worst-case hardness, such as P, NP, etc are not suitable for the high-dimensional framework. Instead, lower bounds are obtained for some specific models of computations, such as SoS [38, 10], overlap gap property [32], statistical query [41, 13], and low-degree polynomials [37, 44, 66], possibly combined with reductions between different statistical problems [12, 11, 14].

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found