AITopics | Dimensionality Reduction

Collaborating Authors

Dimensionality Reduction

Dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It can be divided into feature selection (find a subset of the original variables) and feature extraction (transform the data in the high-dimensional space to a space of fewer dimensions). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

A Normative Theory of Adaptive Dimensionality Reduction in Neural Networks

Cengiz Pehlevan, Dmitri Chklovskii

Neural Information Processing SystemsOct-2-2025, 10:33:27 GMT

To make sense of the world our brains must analyze high-dimensional datasets streamed by our sensory organs. Because such analysis begins with dimensionality reduction, modeling early sensory processing requires biologically plausible online dimensionality reduction algorithms. Recently, we derived such an algorithm, termed similarity matching, from a Multidimensional Scaling (MDS) objective function. However, in the existing algorithm, the number of output dimensions is set a priori by the number of output neurons and cannot be changed. Because the number of informative dimensions in sensory inputs is variable there is a need for adaptive dimensionality reduction.

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Multi-Criteria Dimensionality Reduction with Applications to Fairness

Uthaipon Tantipongpipat, Samira Samadi, Mohit Singh, Jamie H. Morgenstern, Santosh Vempala

Neural Information Processing SystemsOct-2-2025, 09:07:39 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, constraint, theorem 1, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.43)

Add feedback

Convexity-Driven Projection for Point Cloud Dimensionality Reduction

Sanyal, Suman

arXiv.org Machine LearningSep-29-2025

We propose Convexity-Driven Projection (CDP), a boundary-free linear method for dimensionality reduction of point clouds that targets preserving detour-induced local non-convexity. CDP builds a $k$-NN graph, identifies admissible pairs whose Euclidean-to-shortest-path ratios are below a threshold, and aggregates their normalized directions to form a positive semidefinite non-convexity structure matrix. The projection uses the top-$k$ eigenvectors of the structure matrix. We give two verifiable guarantees. A pairwise a-posteriori certificate that bounds the post-projection distortion for each admissible pair, and an average-case spectral bound that links expected captured direction energy to the spectrum of the structure matrix, yielding quantile statements for typical distortion. Our evaluation protocol reports fixed- and reselected-pairs detour errors and certificate quantiles, enabling practitioners to check guarantees on their data.

admissible pair, convexity-driven projection, graph, (13 more...)

arXiv.org Machine Learning

2509.22043

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > India > Goa (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.62)

Add feedback

Benchmarking Dimensionality Reduction Techniques for Spatial Transcriptomics

Mahmud, Md Ishtyaq, Kochat, Veena, Satpati, Suresh, Dwarampudi, Jagan Mohan Reddy, Rai, Kunal, Banerjee, Tania

arXiv.org Artificial IntelligenceSep-18-2025

We introduce a unified framework for evaluating dimensionality reduction techniques in spatial transcriptomics beyond standard PCA approaches. We benchmark six methods PCA, NMF, autoencoder, VAE, and two hybrid embeddings on a cholangiocarcinoma Xenium dataset, systematically varying latent dimensions ($k$=5-40) and clustering resolutions ($ρ$=0.1-1.2). Each configuration is evaluated using complementary metrics including reconstruction error, explained variance, cluster cohesion, and two novel biologically-motivated measures: Cluster Marker Coherence (CMC) and Marker Exclusion Rate (MER). Our results demonstrate distinct performance profiles: PCA provides a fast baseline, NMF maximizes marker enrichment, VAE balances reconstruction and interpretability, while autoencoders occupy a middle ground. We provide systematic hyperparameter selection using Pareto optimal analysis and demonstrate how MER-guided reassignment improves biological fidelity across all methods, with CMC scores improving by up to 12\% on average. This framework enables principled selection of dimensionality reduction methods tailored to specific spatial transcriptomics analyses.

data mining, machine learning, nmf, (15 more...)

arXiv.org Artificial Intelligence

2509.13344

Country:

North America > United States > Texas > Harris County > Houston (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.05)
Europe > Netherlands > South Holland > Leiden (0.05)
(5 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.48)
Health & Medicine > Therapeutic Area > Oncology (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.85)

Add feedback

Curvature as a tool for evaluating dimensionality reduction and estimating intrinsic dimension

Beylier, Charlotte, Joharinad, Parvaneh, Jost, Jürgen, Torbati, Nahid

arXiv.org Artificial IntelligenceSep-18-2025

Utilizing recently developed abstract notions of sectional curvature, we introduce a method for constructing a curvature-based geometric profile of discrete metric spaces. The curvature concept that we use here captures the metric relations between triples of points and other points. More significantly, based on this curvature profile, we introduce a quantitative measure to evaluate the effectiveness of data representations, such as those produced by dimensionality reduction techniques. Furthermore, Our experiments demonstrate that this curvature-based analysis can be employed to estimate the intrinsic dimensionality of datasets. We use this to explore the large-scale geometry of empirical networks and to evaluate the effectiveness of dimensionality reduction techniques.

curvature, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.13385

Country:

Europe > Germany > Saxony > Leipzig (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > New Mexico (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.82)

Add feedback

Linear Dimensionality Reduction for Word Embeddings in Tabular Data Classification

Ressel, Liam, Gardi, Hamza A. A.

arXiv.org Artificial IntelligenceSep-17-2025

The Engineers' Salary Prediction Challenge requires classifying salary categories into three classes based on tabular data. The job description is represented as a 300-dimensional word embedding incorporated into the tabular features, drastically increasing dimensionality. Additionally, the limited number of training samples makes classification challenging. Linear dimensionality reduction of word embeddings for tabular data classification remains underexplored. This paper studies Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). We show that PCA, with an appropriate subspace dimension, can outperform raw embeddings. LDA without regularization performs poorly due to covariance estimation errors, but applying shrinkage improves performance significantly, even with only two dimensions. We propose Partitioned-LDA, which splits embeddings into equal-sized blocks and performs LDA separately on each, thereby reducing the size of the covariance matrices. Partitioned-LDA outperforms regular LDA and, combined with shrinkage, achieves top-10 accuracy on the competition public leaderboard. This method effectively enhances word embedding performance in tabular data classification with limited training samples.

artificial intelligence, job description, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2509.12346

Country:

Europe > Germany (0.05)
North America > United States (0.04)

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Preserving Vector Space Properties in Dimensionality Reduction: A Relationship Preserving Loss Framework

Weinwurm, Eddi, Kovalenko, Alexander

arXiv.org Artificial IntelligenceSep-3-2025

Dimensionality reduction can distort vector space properties such as orthogonality and linear independence, which are critical for tasks including cross-modal retrieval, clustering, and classification. We propose a Relationship Preserving Loss (RPL), a loss function that preserves these properties by minimizing discrepancies between relationship matrices (e.g., Gram or cosine) of high-dimensional data and their low-dimensional embeddings. RPL trains neural networks for non-linear projections and is supported by error bounds derived from matrix perturbation theory. Initial experiments suggest that RPL reduces embedding dimensions while largely retaining performance on downstream tasks, likely due to its preservation of key vector space properties. While we describe here the use of RPL in dimensionality reduction, this loss can also be applied more broadly, for example to cross-domain alignment and transfer learning, knowledge distillation, fairness and invariance, dehubbing, graph and manifold learning, and federated learning, where distributed embeddings must remain geometrically consistent.

artificial intelligence, machine learning, preservation, (13 more...)

arXiv.org Artificial Intelligence

2509.01198

Country: