Computational lower bounds in latent models: clustering, sparse-clustering, biclustering

Even, Bertrand, Giraud, Christophe, Verzelen, Nicolas

Jun-17-2025–arXiv.org Machine Learning

In high-dimensional statistics, the primary goal is to derive computationally efficient estimation procedures, achieving the best possible statistical performance. Y et, in many problems, such as sparse PCA, planted clique or clustering, the best known algorithms with polynomial-time complexity are unable to match the performances provably achievable by the best estimators (without computational constraints). This observation has lead to several conjectures on the existence of gaps (called statistical-computational gaps) between the optimal statistical performance, i.e. the best performance achievable without computational constraints, and the best performance achievable by polynomial time algorithms. In particular, to assess the quality of a computationally efficient algorithm for a given task, the theoretical performance should not be compared to the optimal statistical performance (without computational constraints), but to the performance of the best poly-time algorithm. This raises the problem of establishing lower-bounds on the performance of the best poly-time algorithms for a wide range of problems. Since high-dimensional statistics deal with random instances, the classical notions of worst-case hardness, such as P, NP, etc are not suitable for the high-dimensional framework. Instead, lower bounds are obtained for some specific models of computations, such as SoS [38, 10], overlap gap property [32], statistical query [41, 13], and low-degree polynomials [37, 44, 66], possibly combined with reductions between different statistical problems [12, 11, 14].

artificial intelligence, machine learning, probability, (18 more...)

arXiv.org Machine Learning

Jun-17-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - New Jersey > Mercer County
    - Princeton (0.04)
  - Florida > Palm Beach County
    - Boca Raton (0.04)
- Europe > France
  - Occitanie > Hérault > Montpellier (0.04)

Genre:
- Research Report (1.00)
- Workflow (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning > Statistical Learning
    - Clustering (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found