AITopics | Dimensionality Reduction

Dimensionality Reduction

Dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It can be divided into feature selection (find a subset of the original variables) and feature extraction (transform the data in the high-dimensional space to a space of fewer dimensions). (Wikipedia)
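
As a rough illustration of the two families named above, the sketch below contrasts feature selection (keeping the highest-variance original columns) with feature extraction (projecting onto a leading principal direction found by power iteration). The variance criterion, the helper names `select_features` and `extract_feature`, and the toy data are assumptions chosen for illustration; the definition itself does not prescribe any particular method.

```python
import math

def column_means(rows):
    """Mean of each column of a list-of-lists matrix."""
    n, d = len(rows), len(rows[0])
    return [sum(r[j] for r in rows) / n for j in range(d)]

def select_features(rows, k):
    """Feature selection: keep the k ORIGINAL columns with the largest variance."""
    n, d = len(rows), len(rows[0])
    means = column_means(rows)
    var = [sum((r[j] - means[j]) ** 2 for r in rows) / n for j in range(d)]
    keep = sorted(sorted(range(d), key=lambda j: -var[j])[:k])
    return keep, [[r[j] for j in keep] for r in rows]

def extract_feature(rows, iters=200):
    """Feature extraction: build ONE new variable by projecting centered data
    onto the leading principal direction (power iteration on the covariance)."""
    n, d = len(rows), len(rows[0])
    means = column_means(rows)
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting vector
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v, [sum(c[j] * v[j] for j in range(d)) for c in centered]

# Toy data (assumed): column 1 is roughly twice column 0, column 2 is near-constant.
data = [
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.5],
    [3.0, 5.9, 0.6],
    [4.0, 8.0, 0.5],
]
kept, selected = select_features(data, k=2)   # subset of the original columns
direction, scores = extract_feature(data)     # one new derived coordinate per row
```

Selection returns columns that keep their original meaning, while extraction returns a new coordinate that mixes all inputs; which is preferable depends on whether interpretability of individual variables matters.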

Splines-Based Feature Importance in Kolmogorov-Arnold Networks: A Framework for Supervised Tabular Data Dimensionality Reduction

Akazan, Ange-Clément, Mbingui, Verlon Roel

arXiv.org Artificial Intelligence, Nov-24-2025

Feature selection is a key step in many tabular prediction problems, where multiple candidate variables may be redundant, noisy, or weakly informative. We investigate feature selection based on Kolmogorov-Arnold Networks (KANs), which parameterize feature transformations with splines and expose per-feature importance scores in a natural way. From this idea we derive four KAN-based selection criteria (coefficient norms, gradient-based saliency, and knockout scores) and compare them with standard methods such as LASSO, Random Forest feature importance, Mutual Information, and SVM-RFE on a suite of real and synthetic classification and regression datasets. Using average F1 and $R^2$ scores across three feature-retention levels (20%, 40%, 60%), we find that KAN-based selectors are generally competitive with, and sometimes superior to, classical baselines. In classification, KAN criteria often match or exceed existing methods on multi-class tasks by removing redundant features and capturing nonlinear interactions. In regression, KAN-based scores provide robust performance on noisy and heterogeneous datasets, closely tracking strong ensemble predictors; we also observe characteristic failure modes, such as overly aggressive pruning with an $\ell_1$ criterion. Stability and redundancy analyses further show that KAN-based selectors yield reproducible feature subsets across folds while avoiding unnecessary correlation inflation, ensuring reliable and non-redundant variable selection. Overall, our findings demonstrate that KAN-based feature selection provides a powerful and interpretable alternative to traditional methods, capable of uncovering nonlinear and multivariate feature relevance beyond sparsity or impurity-based measures.
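
The retention-level protocol described above can be sketched independently of any particular network. The example below is a minimal sketch under stated assumptions: it uses a fixed, hypothetical predictor (`toy_model`) in place of a trained KAN, and it reads a "knockout score" as the increase in squared error when one feature is replaced by its column mean, which may differ from the criterion the authors define. Features are then ranked and the top 20%, 40%, and 60% are retained.

```python
import math

def mse(model, rows, targets):
    """Mean squared error of a callable model on a list of rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def knockout_scores(model, rows, targets):
    """Assumed knockout criterion: error increase when feature j is
    replaced by its mean, one feature at a time."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    base = mse(model, rows, targets)
    scores = []
    for j in range(d):
        knocked = [r[:j] + [means[j]] + r[j + 1:] for r in rows]
        scores.append(mse(model, knocked, targets) - base)
    return scores

def retain(scores, fraction):
    """Indices of the top `fraction` of features by score (at least one)."""
    k = max(1, math.ceil(fraction * len(scores) - 1e-9))
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    return sorted(ranked[:k])

# Hypothetical stand-in for a trained model: it uses only features 0 and 3.
def toy_model(x):
    return 3.0 * x[0] + math.sin(x[3])

rows = [[0.1 * i, (i % 3) * 1.0, (i % 5) * 0.2, 0.7 * i, 1.0] for i in range(20)]
targets = [toy_model(r) for r in rows]

scores = knockout_scores(toy_model, rows, targets)
subsets = {f: retain(scores, f) for f in (0.2, 0.4, 0.6)}
```

In this toy setting the two informative features are ranked first, so the 40% subset contains exactly features 0 and 3; at 60% a third, uninformative feature is admitted only because the retention level forces it.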

artificial intelligence, dataset, machine learning, (19 more...)

2509.23366

Country: North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.41)

Practical Hash Functions for Similarity Estimation and Dimensionality Reduction

Søren Dahlgaard, Mathias Knudsen, Mikkel Thorup

Neural Information Processing Systems, Nov-21-2025, 09:59:06 GMT

Neural Information Processing Systems http://nips.cc/

data mining, hash function, machine learning, (18 more...)

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.41)

Dimensionality Reduction of Massive Sparse Datasets Using Coresets

Dan Feldman, Mikhail Volkov, Daniela Rus

Neural Information Processing Systems, Nov-21-2025, 09:17:39 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, coreset, machine learning, (18 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.42)

Large Margin Discriminant Dimensionality Reduction in Prediction Space

Mohammad Saberian, Jose Costa Pereira, Nuno Vasconcelos, Can Xu

Neural Information Processing Systems, Nov-21-2025, 06:41:37 GMT

In this paper we establish a duality between boosting and SVM, and use this to derive a novel discriminant dimensionality reduction algorithm. In particular, using the multiclass formulation of boosting and SVM we note that both use a combination of mapping and linear classification to maximize the multiclass margin.

codeword, data mining, machine learning, (18 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > San Diego County > San Diego (0.04)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.63)

Model-based targeted dimensionality reduction for neuronal population data

Neural Information Processing Systems, Nov-20-2025, 22:33:56 GMT

Summarizing high-dimensional data using a small number of parameters is a ubiquitous first step in the analysis of neuronal population activity. Recently developed methods use targeted approaches that work by identifying multiple, distinct low-dimensional subspaces of activity that capture the population response to individual experimental task variables, such as the value of a presented stimulus or the behavior of the animal. These methods have gained attention because they decompose total neural activity into what are ostensibly different parts of a neuronal computation. However, existing targeted methods have been developed outside of the confines of probabilistic modeling, making some aspects of the procedures ad hoc, or limited in flexibility or interpretability. Here we propose a new model-based method for targeted dimensionality reduction based on a probabilistic generative model of the population response data.

dimensionality reduction, model-based, name change, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.66)

Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization

Minshuo Chen, Lin Yang, Mengdi Wang, Tuo Zhao

Neural Information Processing Systems, Nov-20-2025, 20:17:15 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, saddle point, (18 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Education (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.40)

Model-based targeted dimensionality reduction for neuronal population data

Mikio Aoi, Jonathan W. Pillow

Neural Information Processing Systems, Nov-20-2025, 18:03:55 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, task variable, (16 more...)

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > New York (0.04)
North America > Canada > Quebec > Montreal (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.42)

Mind the Gaps: Measuring Visual Artifacts in Dimensionality Reduction

Ros, Jaume, Arleo, Alessio, Paulovich, Fernando

arXiv.org Artificial Intelligence, Nov-19-2025

Dimensionality Reduction (DR) techniques are commonly used for the visual exploration and analysis of high-dimensional data due to their ability to project datasets of high-dimensional points onto the 2D plane. However, projecting datasets in lower dimensions often entails some distortion, which is not necessarily easy to recognize but can lead users to misleading conclusions. Several Projection Quality Metrics (PQMs) have been developed as tools to quantify the goodness-of-fit of a DR projection; however, they mostly focus on measuring how well the projection captures the global or local structure of the data, without taking into account the visual distortion of the resulting plots, thus often ignoring the presence of outliers or artifacts that can mislead a visual analysis of the projection. In this work, we introduce the Warping Index (WI), a new metric for measuring the quality of DR projections onto the 2D plane, based on the assumption that the correct preservation of empty regions between points is of crucial importance towards a faithful visual representation of the data.

artificial intelligence, data mining, machine learning, (17 more...)

2511.14544

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.62)

UMATO: Bridging Local and Global Structures for Reliable Visual Analytics with Dimensionality Reduction

Jeon, Hyeon, Ko, Kwon, Lee, Soohyun, Hyun, Jake, Yang, Taehyun, Go, Gyehun, Jo, Jaemin, Seo, Jinwook

arXiv.org Artificial Intelligence, Nov-18-2025

Due to the intrinsic complexity of high-dimensional (HD) data, dimensionality reduction (DR) techniques cannot preserve all the structural characteristics of the original data. Therefore, DR techniques focus on preserving either local neighborhood structures (local techniques) or global structures such as pairwise distances between points (global techniques). However, both approaches can mislead analysts to erroneous conclusions about the overall arrangement of manifolds in HD data. For example, local techniques may exaggerate the compactness of individual manifolds, while global techniques may fail to separate clusters that are well-separated in the original space. In this research, we provide a deeper insight into Uniform Manifold Approximation with Two-phase Optimization (UMATO), a DR technique that addresses this problem by effectively capturing local and global structures. UMATO achieves this by dividing the optimization process of UMAP into two phases. In the first phase, it constructs a skeletal layout using representative points, and in the second phase, it projects the remaining points while preserving the regional characteristics. Quantitative experiments validate that UMATO outperforms widely used DR techniques, including UMAP, in terms of global structure preservation, with a slight loss in local structure. We also confirm that UMATO outperforms baseline techniques in terms of scalability and stability against initialization and subsampling, making it more effective for reliable HD data analysis. Finally, we present a case study and a qualitative demonstration that highlight UMATO's effectiveness in generating faithful projections, enhancing the overall reliability of visual analytics using DR.
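
The distinction between local and global structure preservation drawn above can be made concrete with two generic measures. The sketch below is an assumption-laden illustration, not the evaluation used in the paper: it scores local structure as the average overlap of k-nearest-neighbour sets before and after projection, and global structure as the Pearson correlation between all pairwise distances in the two spaces. The function names and the toy data are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(points, i, k):
    """Indices of the k nearest neighbours of point i (excluding i itself)."""
    order = sorted((j for j in range(len(points)) if j != i),
                   key=lambda j: dist(points[i], points[j]))
    return set(order[:k])

def knn_overlap(high, low, k):
    """Local structure: mean fraction of k nearest neighbours shared
    between the original space and the projection."""
    n = len(high)
    return sum(len(knn(high, i, k) & knn(low, i, k)) / k for i in range(n)) / n

def pairwise_distance_pearson(high, low):
    """Global structure: Pearson correlation of all pairwise distances."""
    n = len(high)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    dh = [dist(high[i], high[j]) for i, j in pairs]
    dl = [dist(low[i], low[j]) for i, j in pairs]
    mh, ml = sum(dh) / len(dh), sum(dl) / len(dl)
    cov = sum((a - mh) * (b - ml) for a, b in zip(dh, dl))
    return cov / math.sqrt(sum((a - mh) ** 2 for a in dh)
                           * sum((b - ml) ** 2 for b in dl))

# Toy data (assumed): two well-separated clusters in 3-D, projected to 2-D
# by dropping the third coordinate.
high = [[0, 0, 0], [1, 0, 0.1], [0, 1, 0.2],
        [10, 10, 0], [11, 10, 0.1], [10, 11, 0.2]]
low = [p[:2] for p in high]

local_score = knn_overlap(high, low, k=2)
global_score = pairwise_distance_pearson(high, low)
```

A projection can score well on one measure and poorly on the other, which is the trade-off the abstract describes: a high neighbour overlap says nothing about whether distances between clusters are faithful, and vice versa.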

data mining, machine learning, natural language, (20 more...)

doi: 10.1109/TVCG.2025.3602735

2508.16227

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.92)

Industry:

Health & Medicine (0.46)
Education > Educational Setting > Higher Education (0.45)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

A general framework for adaptive nonparametric dimensionality reduction

Di Noia, Antonio, Ravenda, Federico, Mira, Antonietta

arXiv.org Machine Learning, Nov-13-2025

Dimensionality reduction is a fundamental task in modern data science. Several projection methods specifically tailored to take into account the non-linearity of the data via local embeddings have been proposed. Such methods are often based on local neighbourhood structures and require tuning the number of neighbours that define this local structure, and the dimensionality of the lower-dimensional space onto which the data are projected. Such choices critically influence the quality of the resulting embedding. In this paper, we exploit a recently proposed intrinsic dimension estimator which also returns the optimal locally adaptive neighbourhood sizes according to some desirable criteria. In principle, this adaptive framework can be employed to perform an optimal hyper-parameter tuning of any dimensionality reduction algorithm that relies on local neighbourhood structures. Numerical experiments on both real-world and simulated datasets show that the proposed method can be used to significantly improve well-known projection methods when employed for various learning tasks, with improvements measurable through both quantitative metrics and the quality of low-dimensional visualizations.

artificial intelligence, data mining, machine learning, (14 more...)

2511.09486

Country: