AITopics | Dimensionality Reduction

Collaborating Authors

Dimensionality Reduction

Dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It can be divided into feature selection (find a subset of the original variables) and feature extraction (transform the data in the high-dimensional space to a space of fewer dimensions). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

ESPACE: Dimensionality Reduction of Activations for Model Compression

Neural Information Processing SystemsMay-26-2025, 18:16:40 GMT

We propose ESPACE, an LLM compression technique based on dimensionality reduction of activations. Unlike prior works on weight-centric tensor decomposition, ESPACE projects activations onto a pre-calibrated set of principal components. The activation-centrality of the approach enables retraining LLMs with no loss of expressivity; while at inference, weight decomposition is obtained as a byproduct of matrix multiplication associativity. Theoretical results on the construction of projection matrices with optimal computational accuracy are provided. Experimentally, we find ESPACE enables 50% compression of GPT3, Llama2, and Nemotron4 models with small accuracy degradation, as low as a 0.18 perplexity increase on GPT3-22B.

large language model, machine learning, natural language, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.65)

Add feedback

Navigating the Effect of Parametrization for Dimensionality Reduction

Neural Information Processing SystemsMay-26-2025, 17:19:03 GMT

Parametric dimensionality reduction methods have gained prominence for their ability to generalize to unseen datasets, an advantage that traditional non-parametric approaches typically lack. Despite their growing popularity, there remains a prevalent misconception among practitioners about the equivalence in performance between parametric and non-parametric methods. Here, we show that these methods are not equivalent -- parametric methods retain global structure but lose significant local details. To explain this, we provide evidence that parameterized approaches lack the ability to repulse negative samples, and the choice of loss function also has an impact.Addressing these issues, we developed a new parametric method, ParamRepulsor, that incorporates Hard Negative Mining and a loss function that applies a strong repulsive force. This new method achieves state-of-the-art performance on local structure preservation for parametric methods without sacrificing the fidelity of global structural representation.

artificial intelligence, dimensionality reduction, machine learning, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.66)

Add feedback

Optimization of embeddings storage for RAG systems using quantization and dimensionality reduction techniques

Huerga-Pérez, Naamán, Álvarez, Rubén, Ferrero-Guillén, Rubén, Martínez-Gutiérrez, Alberto, Díez-González, Javier

arXiv.org Artificial IntelligenceMay-2-2025

Retrieval-Augmented Generation enhances language models by retrieving relevant information from external knowledge bases, relying on high-dimensional vector embeddings typically stored in float32 precision. However, storing these embeddings at scale presents significant memory challenges. To address this issue, we systematically investigate on MTEB benchmark two complementary optimization strategies: quantization, evaluating standard formats (float16, int8, binary) and low-bit floating-point types (float8), and dimensionality reduction, assessing methods like PCA, Kernel PCA, UMAP, Random Projections and Autoencoders. Our results show that float8 quantization achieves a 4x storage reduction with minimal performance degradation (<0.3%), significantly outperforming int8 quantization at the same compression level, being simpler to implement. PCA emerges as the most effective dimensionality reduction technique. Crucially, combining moderate PCA (e.g., retaining 50% dimensions) with float8 quantization offers an excellent trade-off, achieving 8x total compression with less performance impact than using int8 alone (which provides only 4x compression). To facilitate practical application, we propose a methodology based on visualizing the performance-storage trade-off space to identify the optimal configuration that maximizes performance within their specific memory constraints.

data mining, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.00105

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.86)
Information Technology > Data Science > Data Mining > Big Data (0.62)

Add feedback

Interpretable non-linear dimensionality reduction using gaussian weighted linear transformation

Bergh, Erik

arXiv.org Artificial IntelligenceApr-25-2025

Dimensionality reduction techniques are fundamental for analyzing and visualizing high-dimensional data. With established methods like t-SNE and PCA presenting a trade-off between representational power and interpretability. This paper introduces a novel approach that bridges this gap by combining the interpretability of linear methods with the expressiveness of non-linear transformations. The proposed algorithm constructs a non-linear mapping between high-dimensional and low-dimensional spaces through a combination of linear transformations, each weighted by Gaussian functions. This architecture enables complex non-linear transformations while preserving the interpretability advantages of linear methods, as each transformation can be analyzed independently. The resulting model provides both powerful dimensionality reduction and transparent insights into the transformed space. Techniques for interpreting the learned transformations are presented, including methods for identifying suppressed dimensions and how space is expanded and contracted. These tools enable practitioners to understand how the algorithm preserves and modifies geometric relationships during dimensionality reduction. To ensure the practical utility of this algorithm, the creation of user-friendly software packages is emphasized, facilitating its adoption in both academia and industry.

artificial intelligence, machine learning, transformation, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.54364/AAIML.2025.51198

2504.17601

Genre: Research Report (0.70)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (1.00)

Add feedback

The Shape of Attraction in UMAP: Exploring the Embedding Forces in Dimensionality Reduction

Islam, Mohammad Tariqul, Fleischer, Jason W.

arXiv.org Artificial IntelligenceMar-18-2025

The current era is characterized by a deluge of high-dimensional data. Dimensionality reduction (DR) techniques have emerged as tools for exploratory analysis of such data by visualizing the underlying structure. The most popular methods, t-distributed stochastic neighbor embedding [1] and uniform manifold approximation and projection (UMAP) [2] are grounded in the attraction-repulsion dynamics that bring similar data points closer while pushing dissimilar ones apart. As unsupervised algorithms, these do not rely on labeled data; instead, they identify and preserve the intrinsic structure of high-dimensional data by leveraging local (attractive) and global (repulsive) relationships (forces). This makes these algorithms particularly well-suited for tasks such as clustering [3], exploratory data analysis [4], anomaly detection in semiconductor manufacturing [5], visual search [6], time series analysis [7], studying representation convergence [8], and outlier image detection [9], where visualizing hidden patterns in unlabeled data is critical and meaningful.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.09101

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.61)

Add feedback

Preserving clusters and correlations: a dimensionality reduction method for exceptionally high global structure preservation

Gildenblat, Jacob, Pahnke, Jens

arXiv.org Artificial IntelligenceMar-17-2025

We present Preserving Clusters and Correlations (PCC), a novel dimensionality reduction (DR) method a novel dimensionality reduction (DR) method that achieves state-of-the-art global structure (GS) preservation while maintaining competitive local structure (LS) preservation. It optimizes two objectives: a GS preservation objective that preserves an approximation of Pearson and Spearman correlations between high- and low-dimensional distances, and an LS preservation objective that ensures clusters in the high-dimensional data are separable in the low-dimensional data. PCC has a state-of-the-art ability to preserve the GS while having competitive LS preservation. In addition, we show the correlation objective can be combined with UMAP to significantly improve its GS preservation with minimal degradation of the LS. We quantitatively benchmark PCC against existing methods and demonstrate its utility in medical imaging, and show PCC is a competitive DR technique that demonstrates superior GS preservation in our benchmarks.

artificial intelligence, machine learning, preservation, (16 more...)

arXiv.org Artificial Intelligence

2503.07609

Country:

Europe > Norway > Eastern Norway > Oslo (0.05)
North America > Canada > Ontario > Toronto (0.04)
Europe > Latvia (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.83)

Add feedback

Median Consensus Embedding for Dimensionality Reduction

Tomo, Yui, Yoneoka, Daisuke

arXiv.org Machine LearningMar-11-2025

This study proposes median consensus embedding (MCE) to address variability in low-dimensional embeddings caused by random initialization in dimensionality reduction techniques such as t-distributed stochastic neighbor embedding. MCE is defined as the geometric median of multiple embeddings. By assuming multiple embeddings as independent and identically distributed random samples and applying large deviation theory, we prove that MCE achieves consistency at an exponential rate. Furthermore, we develop a practical algorithm to implement MCE by constructing a distance function between embeddings based on the Frobenius norm of the pairwise distance matrix of data points. Application to real-world data demonstrates that MCE converges rapidly and significantly reduces instability. These results confirm that MCE effectively mitigates random initialization issues in embedding methods.

initialization, mce, optimization problem, (15 more...)

arXiv.org Machine Learning

2503.08103

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.62)

Add feedback

DiRe-JAX: A JAX based Dimensionality Reduction Algorithm for Large-scale Data

Kolpakov, Alexander, Rivin, Igor

arXiv.org Artificial IntelligenceMar-5-2025

Summary: DiRe - JAX is a new dimensionality reduction toolkit designed to address some of the challenges faced by traditional methods like UMAP and tSNE such as loss of global structure and computational efficiency. Built on the JAX framework, DiRe leverages modern hardware acceleration to provide an efficient, scalable, and interpretable solution for visualizing complex data structures, and for quantitative analysis of lower-dimensional embeddings. The toolkit shows considerable promise in preserving both local and global structures within the data as compared to state-of-the-art UMAP and tSNE implementations. This makes it suitable for a wide range of applications in machine learning, bio-informatics, and data science. Traditional dimensionality reduction techniques such as UMAP and tSNE are widely used for visualizing high-dimensional data in lower-dimensional spaces, usually 2D and sometimes 3D.

dire-jax, persistence homology preservation, preservation, (12 more...)

arXiv.org Artificial Intelligence

2503.03156

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.81)

Add feedback

Topological Autoencoders++: Fast and Accurate Cycle-Aware Dimensionality Reduction

Clémot, Mattéo, Digne, Julie, Tierny, Julien

arXiv.org Artificial IntelligenceFeb-27-2025

This paper presents a novel topology-aware dimensionality reduction approach aiming at accurately visualizing the cyclic patterns present in high dimensional data. To that end, we build on the Topological Autoencoders (TopoAE) formulation. First, we provide a novel theoretical analysis of its associated loss and show that a zero loss indeed induces identical persistence pairs (in high and low dimensions) for the $0$-dimensional persistent homology (PH$^0$) of the Rips filtration. We also provide a counter example showing that this property no longer holds for a naive extension of TopoAE to PH$^d$ for $d\ge 1$. Based on this observation, we introduce a novel generalization of TopoAE to $1$-dimensional persistent homology (PH$^1$), called TopoAE++, for the accurate generation of cycle-aware planar embeddings, addressing the above failure case. This generalization is based on the notion of cascade distortion, a new penalty term favoring an isometric embedding of the $2$-chains filling persistent $1$-cycles, hence resulting in more faithful geometrical reconstructions of the $1$-cycles in the plane. We further introduce a novel, fast algorithm for the exact computation of PH for Rips filtrations in the plane, yielding improved runtimes over previously documented topology-aware methods. Our method also achieves a better balance between the topological accuracy, as measured by the Wasserstein distance, and the visual preservation of the cycles in low dimensions. Our C++ implementation is available at https://github.com/MClemot/TopologicalAutoencodersPlusPlus.

algorithm, point cloud, triangulation, (13 more...)

arXiv.org Artificial Intelligence

2502.20215

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Portugal > Coimbra > Coimbra (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.61)

Add feedback

Beyond Worst-Case Dimensionality Reduction for Sparse Vectors

Silwal, Sandeep, Woodruff, David P., Zhang, Qiuyi

arXiv.org Artificial IntelligenceFeb-27-2025

We study beyond worst-case dimensionality reduction for $s$-sparse vectors. Our work is divided into two parts, each focusing on a different facet of beyond worst-case analysis: We first consider average-case guarantees. A folklore upper bound based on the birthday-paradox states: For any collection $X$ of $s$-sparse vectors in $\mathbb{R}^d$, there exists a linear map to $\mathbb{R}^{O(s^2)}$ which \emph{exactly} preserves the norm of $99\%$ of the vectors in $X$ in any $\ell_p$ norm (as opposed to the usual setting where guarantees hold for all vectors). We give lower bounds showing that this is indeed optimal in many settings: any oblivious linear map satisfying similar average-case guarantees must map to $\Omega(s^2)$ dimensions. The same lower bound also holds for a wide class of smooth maps, including `encoder-decoder schemes', where we compare the norm of the original vector to that of a smooth function of the embedding. These lower bounds reveal a separation result, as an upper bound of $O(s \log(d))$ is possible if we instead use arbitrary (possibly non-smooth) functions, e.g., via compressed sensing algorithms. Given these lower bounds, we specialize to sparse \emph{non-negative} vectors. For a dataset $X$ of non-negative $s$-sparse vectors and any $p \ge 1$, we can non-linearly embed $X$ to $O(s\log(|X|s)/\epsilon^2)$ dimensions while preserving all pairwise distances in $\ell_p$ norm up to $1\pm \epsilon$, with no dependence on $p$. Surprisingly, the non-negativity assumption enables much smaller embeddings than arbitrary sparse vectors, where the best known bounds suffer exponential dependence. Our map also guarantees \emph{exact} dimensionality reduction for $\ell_{\infty}$ by embedding into $O(s\log |X|)$ dimensions, which is tight. We show that both the non-linearity of $f$ and the non-negativity of $X$ are necessary, and provide downstream algorithmic improvements.

dimension, sparse vector, vector, (15 more...)

arXiv.org Artificial Intelligence

2502.19865

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States > Virginia > Alexandria County > Alexandria (0.04)
(8 more...)

Genre: Research Report > New Finding (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.82)

Add feedback