AITopics | subspace selection

Reviews: Subspace Detours: Building Transport Plans that are Optimal on Subspace Projections

Neural Information Processing Systems | Jun-2-2025, 00:34:21 GMT

Post-response comments (from discussion): "I feel the response did a good job of answering points of confusion, and also added an interesting example application of color transfer. This last example/experiment is heartening, because they finally use one of their maps (MK) in an application instead of just using the distances. I would be quite interested to see a more comprehensive exploration of this, as well as further applications of the MI and MK maps (perhaps in domain adaptation or Waddington-OT, which they mention elsewhere?). It's also important to note that they included a new algorithm for subspace selection which performs projected gradient descent on the basis vectors of the subspace, which outperforms their old method in the synthetic cases. This is a nice discovery, but I think will necessitate more than a minor structural change to the paper. One would expect a more complete presentation of the algorithm, including discussion of convergence (to local optima at least?) and runtime. It would also be nice to find a non-synthetic use case for this subspace selection, if possible. In light of their response, I feel that this paper is on the right track, but could use another iteration to better argue for applicability of their maps, and to update their algorithm for subspace selection."
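
The comment above mentions projected gradient descent on the basis vectors of a subspace but does not state the objective or the update rule. The following is a minimal sketch of that general technique only, assuming a stand-in objective, trace(E^T C E), whose maximizer is the top-k PCA subspace; it is written as ascent on that objective, which is descent on its negative. The function names, step size, and objective are illustrative assumptions and are not taken from the reviewed paper.

```python
import numpy as np

def project_to_orthonormal(E):
    """Closest matrix with orthonormal columns (Frobenius norm), via the SVD."""
    U, _, Vt = np.linalg.svd(E, full_matrices=False)
    return U @ Vt

def projected_gradient_ascent(C, k, steps=500, lr=0.1, seed=0):
    """Maximize trace(E^T C E) over d x k matrices E with orthonormal columns.

    The objective is a placeholder chosen for illustration; it is NOT the
    objective used by the authors of the reviewed paper.
    """
    rng = np.random.default_rng(seed)
    d = C.shape[0]
    E = project_to_orthonormal(rng.standard_normal((d, k)))
    for _ in range(steps):
        grad = 2.0 * C @ E                         # gradient for symmetric C
        E = project_to_orthonormal(E + lr * grad)  # step, then project back
    return E

# Toy symmetric matrix with eigenvalues 5, 3, 1, 0.5.
C = np.diag([5.0, 3.0, 1.0, 0.5])
E = projected_gradient_ascent(C, k=2)
captured = np.trace(E.T @ C @ E)
```

The projection step uses the polar factor from the SVD rather than a QR factorization, since the polar factor is the nearest orthonormal-column matrix in Frobenius norm. On this toy matrix the iterate should settle on the subspace spanned by the first two coordinate axes.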

algorithm, building transport plan, subspace projection, (6 more...)

Technology: Information Technology > Artificial Intelligence (0.60)

Adversarial Subspace Generation for Outlier Detection in High-Dimensional Data

Cribeiro-Ramallo, Jose, Matteucci, Federico, Enciu, Paul, Jenke, Alexander, Arzamasov, Vadim, Strufe, Thorsten, Böhm, Klemens

arXiv.org Artificial Intelligence | Apr-11-2025

Outlier detection in high-dimensional tabular data is challenging since data is often distributed across multiple lower-dimensional subspaces -- a phenomenon known as the Multiple Views effect (MV). This effect led to a large body of research focused on mining such subspaces, known as subspace selection. However, as the precise nature of the MV effect was not well understood, traditional methods had to rely on heuristic-driven search schemes that struggle to accurately capture the true structure of the data. Properly identifying these subspaces is critical for unsupervised tasks such as outlier detection or clustering, where misrepresenting the underlying data structure can hinder the performance. We introduce Myopic Subspace Theory (MST), a new theoretical framework that mathematically formulates the Multiple Views effect and writes subspace selection as a stochastic optimization problem. Based on MST, we introduce V-GAN, a generative method trained to solve such an optimization problem. This approach avoids any exhaustive search over the feature space while ensuring that the intrinsic data structure is preserved. Experiments on 42 real-world datasets show that using V-GAN subspaces to build ensemble methods leads to a significant increase in one-class classification performance -- compared to existing subspace selection, feature selection, and embedding methods. Further experiments on synthetic data show that V-GAN identifies subspaces more accurately while scaling better than other relevant subspace selection methods. These results confirm the theoretical guarantees of our approach and also highlight its practical viability in high-dimensional settings.
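
The abstract describes building ensembles from selected subspaces. As a rough illustration of that ensemble step only, the sketch below averages a simple k-nearest-neighbour distance score over a hand-picked list of feature subsets. The subsets, the base detector, and all names here are assumptions made for the example; V-GAN itself (the generative subspace selector) is not implemented.

```python
import numpy as np

def knn_distance_scores(X, k=3):
    """Outlier score = distance to the k-th nearest neighbour (larger = more outlying)."""
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the point's zero distance to itself.
    return np.sort(dist, axis=1)[:, k]

def subspace_ensemble_scores(X, subspaces, k=3):
    """Average the base detector's scores over the given feature subsets."""
    per_subspace = [knn_distance_scores(X[:, list(s)], k) for s in subspaces]
    return np.mean(per_subspace, axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6))
X[0, [1, 4]] = [10.0, -10.0]          # planted outlier, visible only in features 1 and 4
subspaces = [(0, 2), (1, 4), (3, 5)]  # hypothetical; a subspace selector would supply these
scores = subspace_ensemble_scores(X, subspaces)
top = int(np.argmax(scores))          # index of the most outlying point
```

The planted point is unremarkable in two of the three subsets, so its high ensemble score comes entirely from the one subset in which its deviation is visible.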

data mining, err 0, machine learning, (19 more...)

2504.07522

Country: North America > United States (0.67)

Genre:

Research Report > Experimental Study (0.92)
Research Report > New Finding (0.87)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Few-Shot Domain Adaptation for Named-Entity Recognition via Joint Constrained k-Means and Subspace Selection

Hammal, Ayoub, Uthayasooriyar, Benno, Corro, Caio

arXiv.org Artificial Intelligence | Dec-12-2024

Named-entity recognition (NER) is a task that typically requires large annotated datasets, which limits its applicability across domains with varying entity definitions. This paper addresses few-shot NER, aiming to transfer knowledge to new domains with minimal supervision. Unlike previous approaches that rely solely on limited annotated data, we propose a weakly supervised algorithm that combines small labeled datasets with large amounts of unlabeled data. Our method extends the k-means algorithm with label supervision, cluster size constraints and domain-specific discriminative subspace selection. This unified framework achieves state-of-the-art results in few-shot NER on several English datasets.
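
One ingredient named in the abstract, k-means with label supervision, can be sketched as follows. This is a minimal illustration under assumptions: labelled points are clamped to their labelled cluster at every iteration, every cluster has at least one labelled point, and the cluster size constraints and subspace selection from the abstract are left out. The function name and the toy data are hypothetical.

```python
import numpy as np

def seeded_kmeans(X, labels, n_clusters, iters=20):
    """k-means in which labelled points (labels[i] >= 0) stay in their labelled cluster.

    labels[i] == -1 marks an unlabelled point. Assumes each cluster has at
    least one labelled point, so every centroid is always well defined.
    """
    labels = np.asarray(labels)
    known = labels >= 0
    # Initial centroids: mean of the labelled points of each cluster.
    centroids = np.stack([X[labels == c].mean(axis=0) for c in range(n_clusters)])
    assign = labels.copy()
    for _ in range(iters):
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        assign = dist.argmin(axis=1)
        assign[known] = labels[known]   # label supervision: clamp labelled points
        centroids = np.stack([X[assign == c].mean(axis=0) for c in range(n_clusters)])
    return assign, centroids

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(30, 2)),
               rng.normal(5.0, 0.5, size=(30, 2))])
labels = np.full(60, -1)
labels[0], labels[30] = 0, 1            # one labelled example per cluster
assign, centroids = seeded_kmeans(X, labels, n_clusters=2)
```

With one labelled example per group, the labels fix which cluster index each group receives, which plain k-means leaves arbitrary.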

algorithm, constraint, dataset, (13 more...)

2412.00426

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Virginia > Fairfax County > Fairfax (0.04)
(11 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)

Discriminative K-means for Clustering

Neural Information Processing Systems | Apr-6-2023, 14:47:56 GMT

We present a theoretical study on the discriminative clustering framework, recently proposed for simultaneous subspace selection via linear discriminant analysis (LDA) and clustering. Empirical results have shown its favorable performance in comparison with several other popular clustering algorithms. However, the inherent relationship between subspace selection and clustering in this framework is not well understood, due to the iterative nature of the algorithm. We show in this paper that this iterative subspace selection and clustering is equivalent to kernel K-means with a specific kernel Gram matrix. This provides significant and new insights into the nature of this subspace selection procedure.

algorithm, discriminative k-means, subspace selection, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Tightly Robust Optimization via Empirical Domain Reduction

Yabe, Akihiro, Maehara, Takanori

arXiv.org Machine Learning | Feb-29-2020

Data-driven decision-making is performed by solving a parameterized optimization problem, and the optimal decision is given by an optimal solution for unknown true parameters. We often need a solution that satisfies true constraints even though these are unknown. Robust optimization is employed to obtain such a solution, where the uncertainty of the parameter is represented by an ellipsoid, and the scale of robustness is controlled by a coefficient. In this study, we propose an algorithm to determine the scale such that the solution has a good objective value and satisfies the true constraints with a given confidence probability. Under some regularity conditions, the scale obtained by our algorithm is asymptotically $O(1/\sqrt{n})$, whereas the scale obtained by a standard approach is $O(\sqrt{d/n})$. This means that our algorithm is less affected by the dimensionality of the parameters.
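
To make the role of the scale coefficient concrete, the sketch below evaluates a single linear constraint a^T x <= b when a ranges over an ellipsoid centred at an estimate. It uses the standard closed form for the worst case, a_hat^T x + scale * sqrt(x^T Sigma x). The names `a_hat`, `Sigma`, `scale`, and the toy numbers are assumptions for illustration; the paper's procedure for choosing the scale is not reproduced here.

```python
import numpy as np

def robust_lhs(x, a_hat, Sigma, scale):
    """Worst-case value of a^T x over the ellipsoid
    {a : (a - a_hat)^T Sigma^{-1} (a - a_hat) <= scale^2}.

    Closed form: a_hat^T x + scale * sqrt(x^T Sigma x).
    """
    return float(a_hat @ x + scale * np.sqrt(x @ Sigma @ x))

def is_robustly_feasible(x, a_hat, Sigma, scale, b):
    """True if a^T x <= b holds for every a in the ellipsoid."""
    return robust_lhs(x, a_hat, Sigma, scale) <= b

# Toy instance (all values hypothetical).
x = np.array([1.0, 1.0])
a_hat = np.array([1.0, 2.0])
Sigma = np.diag([9.0, 16.0])
b = 4.5

small = is_robustly_feasible(x, a_hat, Sigma, scale=0.2, b=b)
large = is_robustly_feasible(x, a_hat, Sigma, scale=0.4, b=b)
```

The same decision x passes the check at the smaller scale and fails at the larger one, which is why a tighter scale that still covers the true parameter permits less conservative decisions.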

constraint, optimization, probability, (16 more...)

2003.00248

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.55)

Discriminative K-means for Clustering

Ye, Jieping, Zhao, Zheng, Wu, Mingrui

Neural Information Processing Systems | Dec-31-2008

We present a theoretical study on the discriminative clustering framework, recently proposed for simultaneous subspace selection via linear discriminant analysis (LDA) and clustering. Empirical results have shown its favorable performance in comparison with several other popular clustering algorithms. However, the inherent relationship between subspace selection and clustering in this framework is not well understood, due to the iterative nature of the algorithm. We show in this paper that this iterative subspace selection and clustering is equivalent to kernel K-means with a specific kernel Gram matrix. This provides significant and new insights into the nature of this subspace selection procedure. Based on this equivalence relationship, we propose the Discriminative K-means (DisKmeans) algorithm for simultaneous LDA subspace selection and clustering, as well as an automatic parameter estimation procedure. We also present the nonlinear extension of DisKmeans using kernels. We show that the learning of the kernel matrix over a convex set of pre-specified kernel matrices can be incorporated into the clustering formulation. The connection between DisKmeans and several other clustering algorithms is also analyzed. The presented theories and algorithms are evaluated through experiments on a collection of benchmark data sets.
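
The abstract reduces the iterative procedure to kernel K-means on a particular Gram matrix, which it does not spell out. The sketch below shows only generic kernel K-means on a precomputed Gram matrix; the linear kernel in the example is a placeholder assumption, not the matrix derived in the paper, and the function name and initialization are hypothetical.

```python
import numpy as np

def kernel_kmeans(K, n_clusters, init, iters=50):
    """Lloyd-style kernel k-means on a precomputed n x n Gram matrix K.

    Squared feature-space distance from point i to the mean of cluster c:
        K[i, i] - 2 * mean_{j in c} K[i, j] + mean_{j, l in c} K[j, l]
    """
    assign = np.asarray(init).copy()
    n = K.shape[0]
    for _ in range(iters):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = np.flatnonzero(assign == c)
            if members.size == 0:
                continue                       # empty cluster: distance stays inf
            cross = K[:, members].mean(axis=1)
            within = K[np.ix_(members, members)].mean()
            dist[:, c] = np.diag(K) - 2.0 * cross + within
        new_assign = dist.argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break                              # assignments are stable
        assign = new_assign
    return assign

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, size=(30, 2)),
               rng.normal(5.0, 0.5, size=(30, 2))])
K = X @ X.T                  # placeholder: linear kernel, not the paper's Gram matrix
init = np.zeros(60, dtype=int)
init[:10] = 1                # deliberately poor starting assignment
assign = kernel_kmeans(K, n_clusters=2, init=init)
```

Because the algorithm touches the data only through K, replacing the linear kernel with any other positive semidefinite Gram matrix changes the clustering without changing the code.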

algorithm, diskmean, matrix, (15 more...)

Country: