Recovering Imbalanced Clusters via Gradient-Based Projection Pursuit
Eppert, Martin; Mukherjee, Satyaki; Ghoshdastidar, Debarghya
arXiv.org Artificial Intelligence
Projection Pursuit is a classic exploratory technique for finding interesting projections of a dataset. We propose a method for recovering projections containing either Imbalanced Clusters or a Bernoulli-Rademacher distribution, using a gradient-based technique to optimize the projection index. Because sample complexity is a major limiting factor in Projection Pursuit, we analyze our algorithm's sample complexity in a Planted Vector setting, where we observe that Imbalanced Clusters can be recovered more easily than balanced ones. Additionally, we give a generalized result that holds for a variety of data distributions and projection indices. We compare these results to computational lower bounds in the Low-Degree-Polynomial Framework. Finally, we experimentally evaluate our method's applicability to real-world data using FashionMNIST and the Human Activity Recognition Dataset, where our algorithm outperforms others when only a few samples are available.
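As a rough illustration of the idea in the abstract, the sketch below runs projected gradient ascent on the unit sphere to recover a planted Bernoulli-Rademacher direction hidden in otherwise Gaussian data. It is not the authors' algorithm: the projection index used here (the empirical fourth moment), the sparsity `p`, the step size, the number of restarts, and the helper names `sample_planted` and `pursue` are all assumptions chosen for the example.

```python
import numpy as np


def sample_planted(n, d, p, rng):
    """Gaussian data with one hidden Bernoulli-Rademacher direction u (assumed model)."""
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    # Bernoulli-Rademacher entries: 0 w.p. 1 - p, +-1/sqrt(p) w.p. p (unit variance).
    signs = rng.choice([-1.0, 1.0], size=n)
    mask = rng.random(n) < p
    z = signs * mask / np.sqrt(p)
    g = rng.standard_normal((n, d))
    g -= np.outer(g @ u, u)  # remove the Gaussian component along u
    return g + np.outer(z, u), u


def index(X, w):
    """Illustrative projection index: empirical fourth moment of the projection."""
    return np.mean((X @ w) ** 4)


def pursue(X, steps=300, lr=0.02, restarts=10, rng=None):
    """Projected gradient ascent on the sphere, keeping the best of several restarts."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    best_w, best_val = None, -np.inf
    for _ in range(restarts):
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)
        for _ in range(steps):
            s = X @ w
            grad = 4.0 * (X.T @ s**3) / n  # gradient of mean((X w)^4)
            grad -= (grad @ w) * w         # project onto the tangent space at w
            w = w + lr * grad
            w /= np.linalg.norm(w)         # retract back onto the unit sphere
        val = index(X, w)
        if val > best_val:
            best_w, best_val = w, val
    return best_w


rng = np.random.default_rng(0)
X, u = sample_planted(n=5000, d=10, p=0.1, rng=rng)
w_hat = pursue(X, rng=rng)
print("overlap |<w_hat, u>|:", abs(w_hat @ u))
```

The fourth moment works as an index in this toy setting because a Bernoulli-Rademacher variable with `p < 1/3` has a larger fourth moment (`1/p`) than a standard Gaussian (3), so the planted direction is the population maximizer. Restarts are used because the gradient is weak when the initial overlap with the planted direction is small.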
Feb-4-2025
- Country:
- Europe > Germany (0.28)
- North America > United States (0.28)
- Genre:
- Research Report > New Finding (0.46)