AITopics | Dimensionality Reduction

Dimensionality Reduction

Dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It can be divided into feature selection (find a subset of the original variables) and feature extraction (transform the data in the high-dimensional space to a space of fewer dimensions). (Wikipedia)
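As a minimal sketch of the two families named above, the following NumPy snippet contrasts them on synthetic data. The helper names `select_by_variance` and `extract_by_projection`, the variance-based selection criterion, and the covariance-eigenvector projection are illustrative assumptions, not part of the quoted definition.

```python
import numpy as np

def select_by_variance(X, k):
    """Feature selection: keep the k original columns with the largest variance."""
    keep = np.sort(np.argsort(X.var(axis=0))[-k:])
    return X[:, keep], keep

def extract_by_projection(X, k):
    """Feature extraction: project centered data onto the top-k
    eigenvectors of its covariance matrix (new, derived variables)."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    return Xc @ top

# Synthetic data (assumption): 100 samples, 5 variables, one nearly constant.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 3] *= 0.01

X_sel, kept = select_by_variance(X, 3)   # a subset of the original columns
X_ext = extract_by_projection(X, 3)      # linear combinations of all columns
```

The selected columns keep their original meaning, while the extracted columns are mixtures of every input variable.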

Universal Feature Selection Tool (UniFeat): An Open-Source Tool for Dimensionality Reduction

Tabakhi, Sina, Moradi, Parham

arXiv.org Artificial Intelligence, Nov-30-2022

The Universal Feature Selection Tool (UniFeat) is an open-source tool developed entirely in Java for performing feature selection in various research areas. It provides a set of well-known and advanced feature selection methods together with its auxiliary tools, which allows users to compare the performance of feature selection methods. Moreover, because UniFeat is open source, researchers can use and modify it in their own research, which facilitates the rapid development of new feature selection algorithms.

artificial intelligence, feature selection method, machine learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.neucom.2023.03.037

2211.16846

Country:

Asia > Middle East > Iraq > Kurdistan Region (0.05)
Europe > United Kingdom > England > South Yorkshire > Sheffield (0.04)
Asia > Middle East > Iran > Kurdistan Province > Sanandaj (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.41)

DimenFix: A novel meta-dimensionality reduction method for feature preservation

Luo, Qiaodan, Christino, Leonardo, Paulovich, Fernando V, Milios, Evangelos

arXiv.org Artificial Intelligence, Nov-30-2022

Dimensionality reduction has become an important research topic as demand for interpreting high-dimensional datasets has increased rapidly in recent years. Many dimensionality reduction methods perform well in preserving the overall relationship among data points when mapping them to a lower-dimensional space. However, these existing methods fail to incorporate the difference in importance among features. To address this problem, we propose a novel meta-method, DimenFix, which can be applied on top of any base dimensionality reduction method that involves a gradient-descent-like process. By allowing users to define the importance of different features, which is then taken into account during dimensionality reduction, DimenFix creates new possibilities to visualize and understand a given dataset. Meanwhile, DimenFix does not increase the time cost or reduce the quality of dimensionality reduction with respect to the base dimensionality reduction method used.

artificial intelligence, dataset, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2211.16752

Country:

North America > Canada (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (1.00)

Identifying Chemicals Through Dimensionality Reduction

Anand, Emile, Steinhardt, Charles, Hansen, Martin

arXiv.org Artificial Intelligence, Nov-26-2022

Civilizations have tried to make drinking water safe to consume for thousands of years. The process of determining water contaminants has evolved with the complexity of the contaminants due to pesticides and heavy metals. The routine procedure to determine water safety is targeted analysis, which searches for specific substances from a known list; however, we do not explicitly know which substances should be on this list. Before experimentally determining which substances are contaminants, how do we answer the sampling problem of identifying all the substances in the water? Here, we present an approach that builds on the work of Jaanus Liigand et al., which used non-targeted analysis (a broader search of the sample) to develop a random-forest regression model, to predict the names of all the substances in a sample, as well as their respective concentrations [1]. This work uses techniques from dimensionality reduction and linear decompositions to present a more accurate model, using data from the European Massbank Metabolome Library, and to produce a global list of chemicals that researchers can then identify and test for when purifying water.

artificial intelligence, basis vector, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2211.14708

Country:

Europe > Denmark > Capital Region > Copenhagen (0.05)
North America > United States > California > Los Angeles County > Pasadena (0.04)

Genre: Research Report (0.40)

Industry:

Water & Waste Management > Water Management (0.68)
Materials > Chemicals (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.63)

Principal Component Analysis for Dimensionality Reduction in Python - MachineLearningMastery.com

#artificialintelligence, Nov-25-2022, 00:23:45 GMT

Reducing the number of input variables for a predictive model is referred to as dimensionality reduction. Fewer input variables can result in a simpler predictive model that may have better performance when making predictions on new data. Perhaps the most popular technique for dimensionality reduction in machine learning is Principal Component Analysis, or PCA for short. This is a technique that comes from the field of linear algebra and can be used as a data preparation technique to create a projection of a dataset prior to fitting a model. In this tutorial, you will discover how to use PCA for dimensionality reduction when developing predictive models.
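The idea of using PCA as a data preparation step before fitting a model can be sketched as follows. This is not the tutorial's own code: it is a NumPy-only illustration under stated assumptions, with synthetic data, hypothetical helper names (`pca_fit`, `pca_transform`), and a simple least-squares linear classifier standing in for the predictive model.

```python
import numpy as np

def pca_fit(X_train, n_components):
    """Learn the projection from training data only."""
    mean = X_train.mean(axis=0)
    _, s, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    explained_ratio = (s ** 2) / np.sum(s ** 2)
    return mean, Vt[:n_components], explained_ratio[:n_components]

def pca_transform(X, mean, components):
    """Apply the learned projection to any data (train or new)."""
    return (X - mean) @ components.T

# Synthetic data (assumption): 10 input variables driven by 2 latent factors.
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 10)) + 0.05 * rng.normal(size=(200, 10))
y = (latent[:, 0] > 0).astype(int)

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Fit the projection on the training split, then project both splits.
mean, components, ratio = pca_fit(X_train, n_components=2)
Z_train = pca_transform(X_train, mean, components)
Z_test = pca_transform(X_test, mean, components)

# Fit a least-squares linear classifier on the 2 projected inputs.
A_train = np.column_stack([Z_train, np.ones(len(Z_train))])
w, *_ = np.linalg.lstsq(A_train, 2 * y_train - 1, rcond=None)
A_test = np.column_stack([Z_test, np.ones(len(Z_test))])
accuracy = np.mean((A_test @ w > 0).astype(int) == y_test)
```

Fitting the projection on the training split alone keeps information from the held-out data out of the preparation step.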

classification accuracy, dimensionality reduction, principal component analysis, (10 more...)

#artificialintelligence

Country: North America > United States (0.15)

Genre: Instructional Material > Course Syllabus & Notes (0.49)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.98)

Comparing Explanation Methods for Traditional Machine Learning Models Part 2: Quantifying Model Explainability Faithfulness and Improvements with Dimensionality Reduction

Flora, Montgomery, Potvin, Corey, McGovern, Amy, Handler, Shawn

arXiv.org Artificial Intelligence, Nov-18-2022

Machine learning (ML) models are becoming increasingly common in the atmospheric science community with a wide range of applications. To enable users to understand what an ML model has learned, ML explainability has become a field of active research. In Part I of this two-part study, we described several explainability methods and demonstrated that feature rankings from different methods can substantially disagree with each other. It is unclear, though, whether the disagreement is overinflated due to some methods being less faithful in assigning importance. Herein, "faithfulness" or "fidelity" refers to the correspondence between the assigned feature importance and the contribution of the feature to model performance. In the present study, we evaluate the faithfulness of feature ranking methods using multiple methods. Given the sensitivity of explanation methods to feature correlations, we also quantify how much explainability faithfulness improves after correlated features are limited. Before dimensionality reduction, the feature relevance methods [e.g., SHAP, LIME, ALE variance, and logistic regression (LR) coefficients] were generally more faithful than the permutation importance methods due to the negative impact of correlated features. Once correlated features were reduced, traditional permutation importance became the most faithful method. In addition, the ranking uncertainty (i.e., the spread in rank assigned to a feature by the different ranking methods) was reduced by a factor of 2-10, and excluding less faithful feature ranking methods reduced it further. This study is one of the first to quantify the improvement in explainability from limiting correlated features and knowing the relative fidelity of different explainability methods.
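For readers unfamiliar with permutation importance, the following is a minimal sketch of the general technique: shuffle one feature at a time and record how much the model's error grows. It is not the authors' evaluation code. The function name `permutation_importance`, the synthetic data, and the least-squares model are assumptions made for illustration.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in mean squared error when each column is shuffled."""
    rng = np.random.default_rng(seed)
    base_error = np.mean((predict(X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the link to y
            scores[j] += np.mean((predict(X_perm) - y) ** 2) - base_error
    return scores / n_repeats

# Synthetic, uncorrelated features (assumption): column 0 matters most,
# column 1 matters a little, column 2 does not enter the target at all.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=500)

# A least-squares linear model stands in for the trained ML model.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
predict = lambda A: A @ w

importance = permutation_importance(predict, X, y)
ranking = np.argsort(importance)[::-1]  # most important feature first
```

With uncorrelated inputs, as here, the ranking follows each feature's true contribution; the abstract's point is that correlated features can weaken this correspondence.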

artificial intelligence, dataset, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2211.10378

Country:

North America > United States > Oklahoma > Cleveland County > Norman (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.61)

Interpretable Dimensionality Reduction by Feature Preserving Manifold Approximation and Projection

Yang, Yang, Sun, Hongjian, Gong, Jialei, Du, Yali, Yu, Di

arXiv.org Artificial Intelligence, Nov-16-2022

Nonlinear dimensionality reduction methods are ubiquitously applied for visualization and preprocessing of high-dimensional data in machine learning [1, 2, 3, 4, 5, 6, 7, 8]. These methods assume that the intrinsic dimension of the underlying manifold is much lower than the ambient dimension of the real-world data [9, 10, 11]. By approximating the manifold with a k-nearest-neighbour (kNN) graph, nonlinear dimensionality reduction projects data from a high-dimensional to a low-dimensional space while retaining the topological structure of the original data. While nonlinear dimensionality reduction is effective for visualizing high-dimensional data, one major weakness is the lack of interpretability of the reduced-dimension results [8]. The reduced dimensions of nonlinear dimensionality reduction have no specific meaning, in contrast to linear methods like Principal Component Analysis (PCA), where the dimensions of the embedding space represent the directions of the largest variance of the original data. In particular, nonlinear dimensionality reduction focuses on preserving distances between observations and thereby loses source feature information in the embedding space, so it cannot provide the feature loadings that linear methods such as PCA use to explain the contribution of each feature to each dimension. In this paper, we seek to improve the interpretability of nonlinear dimensionality reduction. In addition to preserving the local topological structure between observations in the embedding space, we aim to incorporate the source features to devise an interpretable nonlinear dimensionality reduction method. The feature information is encoded in the column space of the data, and we use the tangent space to locally depict the column space [12, 13].
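The PCA feature loadings mentioned in the abstract can be illustrated with a short sketch. It covers only the linear baseline, not the proposed nonlinear method, and the synthetic data and feature names (`f0`, `f1`, `f2`) are hypothetical.

```python
import numpy as np

# Synthetic data (assumption): f0 and f1 are strongly correlated,
# f2 is independent and has low variance.
rng = np.random.default_rng(0)
n = 300
base = rng.normal(size=n)
X = np.column_stack([
    base,
    base + 0.1 * rng.normal(size=n),
    0.3 * rng.normal(size=n),
])
feature_names = ["f0", "f1", "f2"]

# PCA via SVD of the centered data: each row of Vt is one principal
# direction, and its entries are the loadings of the source features.
Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
loadings = Vt

# The features with the largest absolute loadings explain component 1.
order = np.argsort(-np.abs(loadings[0]))
top_two = [feature_names[j] for j in order[:2]]
```

Each embedding dimension here is an explicit weighted sum of source features, which is the interpretability property that distance-preserving nonlinear embeddings do not expose directly.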

artificial intelligence, machine learning, tangent space, (18 more...)

arXiv.org Artificial Intelligence

2211.09321

Country:

Oceania > Australia > Queensland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (1.00)

Supervised Dimensionality Reduction and Image Classification Utilizing Convolutional Autoencoders

Nellas, Ioannis A., Tasoulis, Sotiris K., Plagianakos, Vassilis P., Georgakopoulos, Spiros V.

arXiv.org Artificial Intelligence, Nov-3-2022

The joint optimization of the reconstruction and classification error is a hard non-convex problem, especially when a non-linear mapping is utilized. To overcome this obstacle, a novel optimization strategy is proposed, in which a Convolutional Autoencoder for dimensionality reduction and a classifier composed of a Fully Connected Network are combined to simultaneously produce supervised dimensionality reduction and predictions. This methodology also turns out to be greatly beneficial in enforcing explainability of deep learning architectures. Additionally, the resulting latent space, optimized for the classification task, can be utilized to improve traditional, interpretable classification algorithms. The experimental results show that the proposed methodology achieves competitive results against state-of-the-art deep learning methods while being much more efficient in terms of parameter count. Finally, it is empirically shown that the proposed methodology introduces advanced explainability regarding not only the data structure, through the produced latent space, but also the classification behaviour.

artificial intelligence, latent space, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2208.12152

Country:

Europe > Greece (0.04)
South America > Brazil > São Paulo (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.54)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.50)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity

Kim, You Jin, Heo, Hee-Soo, Jung, Jee-weon, Kwon, Youngki, Lee, Bong-Jin, Chung, Joon Son

arXiv.org Artificial Intelligence, Nov-3-2022

The objective of this work is to train noise-robust speaker embeddings adapted for speaker diarisation. Speaker embeddings play a crucial role in the performance of diarisation systems, but they often capture spurious information such as noise, adversely affecting performance. Our previous work proposed an auto-encoder-based dimensionality reduction module to help remove the redundant information. However, such modules do not explicitly separate this information and have also been found to be sensitive to hyper-parameter values. To overcome these issues, we propose two contributions: (i) a novel dimensionality reduction framework that can disentangle spurious information from the speaker embeddings; (ii) the use of a speech activity vector to prevent the speaker code from representing the background noise. Through a range of experiments conducted on four datasets, our approach consistently demonstrates state-of-the-art performance among models without system fusion.

data mining, dimensionality reduction, machine learning, (3 more...)

arXiv.org Artificial Intelligence

2110.0338

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.80)

Gravitational Dimensionality Reduction Using Newtonian Gravity and Einstein's General Relativity

Ghojogh, Benyamin, Sharma, Smriti

arXiv.org Artificial Intelligence, Oct-30-2022

Machine learning has proven effective in physics, and this use has received increasing attention in the literature. However, the converse notion of applying physics in machine learning has not received much attention. This work is a hybrid of physics and machine learning in which concepts from physics are used in machine learning. We propose the supervised Gravitational Dimensionality Reduction (GDR) algorithm, in which the data points of every class are moved toward each other to reduce intra-class variance and better separate the classes. For every data point, the other points are considered to be gravitational particles, such as stars, and the point is attracted to the points of its class by gravity. The data points are first projected onto a spacetime manifold using principal component analysis. We propose two variants of GDR: one with Newtonian gravity and one with Einstein's general relativity. The former applies Newtonian gravity along a straight line between points, whereas the latter moves data points along the geodesics of the spacetime manifold. For GDR with relativistic gravitation, we use both Schwarzschild and Minkowski metric tensors to cover both general relativity and special relativity. Our simulations show the effectiveness of GDR in discriminating between classes.

algorithm, artificial intelligence, machine learning, (11 more...)

arXiv.org Artificial Intelligence

2211.01369

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Germany > Lower Saxony > Gottingen (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.62)

Towards a machine learning pipeline in reduced order modelling for inverse problems: neural networks for boundary parametrization, dimensionality reduction and solution manifold approximation

Ivagnes, Anna, Demo, Nicola, Rozza, Gianluigi

arXiv.org Artificial Intelligence, Oct-26-2022

In this work, we propose a model order reduction framework to deal with inverse problems in a non-intrusive setting. Inverse problems, especially in a partial differential equation context, require a huge computational load due to the iterative optimization process. To accelerate such a procedure, we apply a numerical pipeline that involves artificial neural networks to parametrize the boundary conditions of the problem at hand, compress the dimensionality of the (full-order) snapshots, and approximate the parametric solution manifold. The pipeline yields a general framework capable of providing an ad-hoc parametrization of the inlet boundary and of quickly converging to the optimal solution thanks to model order reduction. In this contribution, we present the results obtained by applying such methods to two different CFD test cases.

artificial intelligence, machine learning, optimization problem, (20 more...)

arXiv.org Artificial Intelligence

2210.14764

Country:

Europe > Switzerland (0.04)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.41)
