Dimensionality Reduction


Riemannian joint dimensionality reduction and dictionary learning on symmetric positive definite manifold

arXiv.org Machine Learning

Dictionary learning (DL) and dimensionality reduction (DR) are powerful tools for analyzing high-dimensional noisy signals. This paper proposes a novel Riemannian joint dimensionality reduction and dictionary learning (R-JDRDL) method on symmetric positive definite (SPD) manifolds for classification tasks. The joint learning accounts for the interaction between the dimensionality reduction and dictionary learning procedures by coupling them in a unified framework. We exploit a Riemannian optimization framework to solve the DR and DL problems jointly. Finally, we demonstrate that the proposed R-JDRDL outperforms existing state-of-the-art algorithms on image classification tasks.
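
The method operates directly on the manifold of SPD matrices; as a hedged illustration of that geometry (not of R-JDRDL itself), the sketch below computes the affine-invariant Riemannian distance between two SPD matrices with NumPy/SciPy. The matrix sizes and toy data are arbitrary assumptions.

```python
# Minimal sketch of the SPD-manifold geometry that R-JDRDL-style methods rely on:
# the affine-invariant Riemannian distance d(A, B) = ||log(A^{-1/2} B A^{-1/2})||_F.
# This is background geometry only, not the paper's joint DR/DL algorithm.
import numpy as np
from scipy.linalg import logm, sqrtm

def airm_distance(A, B):
    """Affine-invariant Riemannian distance between SPD matrices A and B."""
    A_inv_sqrt = np.linalg.inv(np.real(sqrtm(A)))
    M = A_inv_sqrt @ B @ A_inv_sqrt
    return np.linalg.norm(np.real(logm(M)), "fro")

# Toy SPD matrices built as X @ X.T + eps * I to guarantee positive definiteness.
rng = np.random.default_rng(0)
X1, X2 = rng.standard_normal((5, 20)), rng.standard_normal((5, 20))
A = X1 @ X1.T + 1e-3 * np.eye(5)
B = X2 @ X2.T + 1e-3 * np.eye(5)
print(airm_distance(A, B))
```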


Data Dimensionality Reduction in the Age of Machine Learning

#artificialintelligence

Machine Learning is all the rage as companies try to make sense of the mountains of data they are collecting. Data is everywhere and proliferating at unprecedented speed. But more data is not always better. In fact, large amounts of data can not only considerably slow down system execution but can sometimes even degrade performance in Data Analytics applications. We have found, through years of formal and informal testing, that data dimensionality reduction -- the process of reducing the number of attributes under consideration when running analytics -- is useful not only for speeding up algorithm execution but also for improving overall model performance.
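
As a hedged sketch of the attribute-reduction idea described here (not tied to any specific tool the author uses), the snippet below keeps only enough principal components to explain 95% of the variance before any downstream analytics; the dataset and the 95% threshold are arbitrary choices.

```python
# Sketch: shrink the number of attributes before analytics by keeping only the
# principal components that explain 95% of the variance (threshold is arbitrary).
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_digits(return_X_y=True)           # 64 pixel attributes per sample
X_scaled = StandardScaler().fit_transform(X)  # PCA is scale-sensitive

pca = PCA(n_components=0.95)                  # keep 95% of the variance
X_reduced = pca.fit_transform(X_scaled)
print(f"{X.shape[1]} attributes reduced to {X_reduced.shape[1]} components")
```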


Model-based targeted dimensionality reduction for neuronal population data

Neural Information Processing Systems

Summarizing high-dimensional data using a small number of parameters is a ubiquitous first step in the analysis of neuronal population activity. Recently developed methods use "targeted" approaches that work by identifying multiple, distinct low-dimensional subspaces of activity that capture the population response to individual experimental task variables, such as the value of a presented stimulus or the behavior of the animal. These methods have gained attention because they decompose total neural activity into what are ostensibly different parts of a neuronal computation. However, existing targeted methods have been developed outside of the confines of probabilistic modeling, making some aspects of the procedures ad hoc or limited in flexibility or interpretability. Here we propose a new model-based method for targeted dimensionality reduction based on a probabilistic generative model of the population response data. The low-dimensional structure of our model is expressed as a low-rank factorization of a linear regression model. We perform efficient inference using a combination of expectation maximization and direct maximization of the marginal likelihood. We also develop an efficient method for estimating the dimensionality of each subspace. On simulated data, we show that our approach outperforms alternative methods both in mean squared error of the parameter estimates and in identifying the correct dimensionality of encoding. We also show that our method provides more accurate inference of low-dimensional subspaces of activity than a competing algorithm, demixed PCA.
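
A rough, hedged sketch of the structural idea, a low-rank factorization of a linear regression, is given below; it simply truncates the SVD of the least-squares coefficient matrix on simulated data and is not the paper's probabilistic model or its EM-based inference.

```python
# Sketch: low-rank structure in a linear regression from task variables to neural
# responses. Here we just truncate the SVD of the least-squares coefficients; the
# paper instead fits a probabilistic generative model with EM.
import numpy as np

rng = np.random.default_rng(0)
n, p, q, rank = 500, 12, 30, 3            # trials, task regressors, neurons, latent dim

# Simulate task variables X and responses Y generated through a rank-3 mapping.
X = rng.standard_normal((n, p))
B_true = rng.standard_normal((p, rank)) @ rng.standard_normal((rank, q))
Y = X @ B_true + 0.5 * rng.standard_normal((n, q))

# Ordinary least squares, then keep only the top `rank` singular directions.
B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
U, s, Vt = np.linalg.svd(B_ols, full_matrices=False)
B_low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]

print("rank of the truncated estimate:", np.linalg.matrix_rank(B_low_rank))
```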


Deep Variational Sufficient Dimensionality Reduction

arXiv.org Machine Learning

We consider the problem of sufficient dimensionality reduction (SDR), where the high-dimensional observation is transformed to a low-dimensional subspace in which the information the observations carry about the label variable is preserved. We propose DVSDR, a deep variational approach for sufficient dimensionality reduction. The deep structure in our model has a bottleneck that represents the low-dimensional embedding of the data. We explain the SDR problem using graphical models and use the framework of variational autoencoders to maximize a lower bound of the log-likelihood of the joint distribution of the observation and label. We show that such a maximization problem can be interpreted as solving the SDR problem. DVSDR can be easily adapted to the semi-supervised learning setting. In our experiments, we show that DVSDR performs competitively on classification tasks while being able to generate novel data samples.
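
The sketch below shows one way such a variational bottleneck with a label term could look in PyTorch; the layer sizes, the equal loss weighting, and the linear classifier head are illustrative assumptions rather than the DVSDR architecture specified in the paper.

```python
# Sketch of a variational bottleneck trained with reconstruction, KL, and label
# terms, in the spirit of supervised/sufficient DR with a VAE. The architecture
# and the equal loss weighting are illustrative assumptions, not the DVSDR model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalBottleneck(nn.Module):
    def __init__(self, in_dim=784, latent_dim=8, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)       # mean of q(z|x)
        self.logvar = nn.Linear(256, latent_dim)   # log-variance of q(z|x)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))
        self.classifier = nn.Linear(latent_dim, n_classes)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterization
        return self.decoder(z), self.classifier(z), mu, logvar

def loss_fn(x, y, x_hat, logits, mu, logvar):
    recon = F.mse_loss(x_hat, x)                                   # reconstruction term
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL term
    label = F.cross_entropy(logits, y)                             # label term
    return recon + kl + label     # equal weights: an arbitrary choice in this sketch

model = VariationalBottleneck()
x, y = torch.rand(32, 784), torch.randint(0, 10, (32,))
x_hat, logits, mu, logvar = model(x)
print(loss_fn(x, y, x_hat, logits, mu, logvar))
```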


Optimal terminal dimensionality reduction in Euclidean space

arXiv.org Machine Learning

Let $\varepsilon\in(0,1)$ and $X\subset\mathbb R^d$ be arbitrary with $|X| = n>1$. The Johnson-Lindenstrauss lemma states there exists $f:X\rightarrow\mathbb R^m$ with $m = O(\varepsilon^{-2}\log n)$ such that $$ \forall x\in X\ \forall y\in X, \|x-y\|_2 \le \|f(x)-f(y)\|_2 \le (1+\varepsilon)\|x-y\|_2 . $$ We show that a strictly stronger version of this statement holds, answering one of the main open questions of [MMMR18]: "$\forall y\in X$" in the above statement may be replaced with "$\forall y\in\mathbb R^d$", so that $f$ not only preserves distances within $X$, but also distances to $X$ from the rest of space. Previously this stronger version was only known with the worse bound $m = O(\varepsilon^{-4}\log n)$. Our proof is via a tighter analysis of (a specific instantiation of) the embedding recipe of [MMMR18].
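
For intuition only, the sketch below applies the classical Gaussian Johnson-Lindenstrauss projection and empirically checks pairwise distortion; it is not the terminal embedding construction analyzed in the paper, and the constant in the target dimension is an arbitrary choice.

```python
# Sketch: classical Johnson-Lindenstrauss random projection with an empirical
# distortion check. This is the standard construction, not the paper's
# terminal embedding, and the constant 8 in m is an arbitrary choice.
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 200, 1000, 0.25
m = int(np.ceil(8 * np.log(n) / eps**2))      # m = O(eps^-2 log n)

X = rng.standard_normal((n, d))
G = rng.standard_normal((d, m)) / np.sqrt(m)  # scaled Gaussian projection
Y = X @ G

# Ratio of projected to original distances over a random sample of pairs.
i, j = rng.integers(0, n, 1000), rng.integers(0, n, 1000)
keep = i != j
i, j = i[keep], j[keep]
ratios = np.linalg.norm(Y[i] - Y[j], axis=1) / np.linalg.norm(X[i] - X[j], axis=1)
print(f"m = {m}, empirical distortion range: [{ratios.min():.3f}, {ratios.max():.3f}]")
```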


Dimensionality Reduction For Dummies -- Part 1: Intuition

#artificialintelligence

We need to see in order to believe. When a dataset has more than three dimensions, it becomes impossible to see what's going on with our own eyes. But who said that these extra dimensions are really necessary? Isn't there a way to somehow reduce the data to one, two, or three humanly perceivable dimensions? It turns out there is.


Dimensionality Reduction : Does PCA really improve classification outcome?

#artificialintelligence

I have come across a couple of resources about dimensionality reduction techniques. This topic is definitely one of the most interesting ones, and it is great to think that there are algorithms able to reduce the number of features by choosing the most important ones that still represent the entire dataset. One of the advantages pointed out by authors is that these algorithms can improve the results of a classification task. In this post, I am going to verify that claim by using Principal Component Analysis (PCA) to try to improve the classification performance of a neural network over a dataset. Does PCA really improve classification outcome?
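
A hedged sketch of the kind of experiment the post sets up, the same neural network trained with and without a PCA step, is shown below using scikit-learn; the dataset, network size, and number of components are placeholders, not the post's actual configuration.

```python
# Sketch: does a PCA step change classification accuracy? Same classifier,
# with and without PCA. Dataset and hyperparameters are placeholders, not the
# post's actual experiment.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

raw = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0))
pca = make_pipeline(StandardScaler(), PCA(n_components=20),
                    MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0))

print("without PCA:", raw.fit(X_tr, y_tr).score(X_te, y_te))
print("with PCA   :", pca.fit(X_tr, y_tr).score(X_te, y_te))
```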


Spectral feature scaling method for supervised dimensionality reduction

arXiv.org Machine Learning

Spectral dimensionality reduction methods enable linear separations of complex data with high-dimensional features in a reduced space. However, these methods do not always give the desired results due to irregularities or uncertainties in the data. Thus, we consider aggressively modifying the scales of the features to obtain the desired classification. Using prior knowledge of the labels of a subset of samples to specify the Fiedler vector, we formulate an eigenvalue problem for a linear matrix pencil whose eigenvector contains the feature scaling factors. The resulting factors can modify the features of all samples to form clusters in the reduced space, according to the known labels. In this study, we propose new supervised dimensionality reduction methods that use the feature scaling associated with spectral clustering. Numerical experiments show that the proposed methods outperform well-established supervised methods on toy problems with more samples than features, and are more robust with respect to clustering than existing methods. The proposed methods also outperform existing methods in classification on real-world problems with more features than samples, namely gene expression profiles of cancer diseases. Furthermore, the feature scaling tends to improve the clustering and classification accuracies of existing unsupervised methods as the proportion of training data increases.
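
For background, the sketch below computes the Fiedler vector of a similarity-graph Laplacian, the spectral-clustering ingredient the abstract refers to; it does not implement the proposed feature-scaling matrix pencil, which is defined in the paper.

```python
# Sketch: the Fiedler vector (eigenvector of the second-smallest Laplacian
# eigenvalue) of a similarity graph -- the spectral machinery the method builds
# on. This is background only, not the paper's feature-scaling eigenproblem.
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
# Two well-separated Gaussian blobs in 5 dimensions.
X = np.vstack([rng.standard_normal((30, 5)), rng.standard_normal((30, 5)) + 4.0])

D = squareform(pdist(X))
sigma2 = np.median(D) ** 2                    # median-heuristic bandwidth
W = np.exp(-D**2 / sigma2)                    # Gaussian similarity matrix
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W                # unnormalized graph Laplacian

eigvals, eigvecs = np.linalg.eigh(L)
fiedler = eigvecs[:, 1]                       # Fiedler vector
# The two blobs land on opposite sides of zero in the Fiedler coordinate.
print("mean Fiedler value per blob:", fiedler[:30].mean(), fiedler[30:].mean())
```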


Nonlinear Dimensionality Reduction for Discriminative Analytics of Multiple Datasets

arXiv.org Machine Learning

Principal component analysis (PCA) is widely used for feature extraction and dimensionality reduction, with documented merits in diverse tasks involving high-dimensional data. Standard PCA copes with one dataset at a time, but it is challenged when it comes to analyzing multiple datasets jointly. In certain data science settings, however, one is often interested in extracting the most discriminative information from one dataset of particular interest (a.k.a. target data) relative to the other(s) (a.k.a. background data). To this end, this paper puts forth a novel approach, termed discriminative PCA (dPCA), for such discriminative analytics of multiple datasets. Under certain conditions, dPCA is proved to be least-squares optimal in recovering the component vector unique to the target data relative to background data. To account for nonlinear data correlations, (linear) dPCA models for one or multiple background datasets are generalized through kernel-based learning. Interestingly, all dPCA variants admit an analytical solution obtainable with a single (generalized) eigenvalue decomposition. Finally, corroborating dimensionality reduction tests using both synthetic and real datasets are provided to validate the effectiveness of the proposed methods.
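
The abstract notes that every dPCA variant reduces to a single (generalized) eigenvalue decomposition; the hedged sketch below illustrates that computational pattern with a common target-versus-background covariance formulation, which may differ in detail from the objective derived in the paper.

```python
# Sketch: target-vs-background generalized eigenvalue decomposition, the kind of
# one-shot computation the abstract refers to. This covariance-ratio formulation
# is an assumption; the exact dPCA objective is derived in the paper.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
d, n = 10, 1000

A = rng.standard_normal((d, d))                # mixing shared by both datasets
unique_dir = np.zeros(d)
unique_dir[0] = 1.0                            # direction present only in the target

background = rng.standard_normal((n, d)) @ A
target = rng.standard_normal((n, d)) @ A + 5.0 * rng.standard_normal((n, 1)) * unique_dir

C_t = np.cov(target, rowvar=False)
C_b = np.cov(background, rowvar=False) + 1e-6 * np.eye(d)   # slight regularization

# Generalized eigenproblem C_t v = lambda C_b v; the top eigenvector maximizes
# target variance relative to background variance.
eigvals, _ = eigh(C_t, C_b)
print("largest generalized eigenvalues:", np.round(eigvals[-3:], 2))
# The top eigenvalue stands out: it captures the component unique to the target.
```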