
Dimensionality Reduction with Generalized Linear Models

AAAI Conferences

In this paper, we propose a general dimensionality reduction method, called Generalized Linear Principal Component Analysis (GLPCA), for data generated from a very broad family of distributions and nonlinear functions based on the generalized linear model. Data from different domains often have very different structures and can be modeled by different distributions and reconstruction functions. For example, real-valued data can be modeled by the Gaussian distribution with a linear reconstruction function, whereas binary-valued data may be more appropriately modeled by the Bernoulli distribution with a logit or probit function. Based on generalized linear models, we propose a unified framework for extracting features from data of different domains. A general optimization algorithm based on natural gradient ascent on the distribution manifold is proposed for obtaining the maximum likelihood solutions. We also present some specific algorithms derived from this framework to deal with specific data modeling problems such as document modeling. Experimental results of these algorithms on several data sets are shown to validate GLPCA.
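To make the Bernoulli case in the abstract concrete, here is a minimal sketch of a low-rank model for binary data with a logit link. It is not the GLPCA algorithm from the paper: it uses plain gradient ascent on the log-likelihood rather than natural gradient ascent, and the function names, step size, and synthetic data are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def bernoulli_loglik(X, Z, W):
    """Log-likelihood of binary X under logits theta = Z @ W."""
    theta = Z @ W
    # log p(x | theta) = x * theta - log(1 + exp(theta))
    return float(np.sum(X * theta - np.logaddexp(0.0, theta)))

def fit_logistic_pca(X, k, steps=300, lr=0.01, seed=0):
    """Plain gradient ascent (an assumption, not the paper's natural gradient)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Z = 0.01 * rng.normal(size=(n, k))   # low-dimensional codes
    W = 0.01 * rng.normal(size=(k, d))   # loadings
    history = []
    for _ in range(steps):
        R = X - sigmoid(Z @ W)           # gradient of log-likelihood w.r.t. logits
        grad_Z = R @ W.T
        grad_W = Z.T @ R
        Z += lr * grad_Z
        W += lr * grad_W
        history.append(bernoulli_loglik(X, Z, W))
    return Z, W, history

# Synthetic binary data with a hypothetical rank-2 logit structure.
rng = np.random.default_rng(1)
Z_true = rng.normal(size=(60, 2))
W_true = 2.0 * rng.normal(size=(2, 12))
X = (rng.random((60, 12)) < sigmoid(Z_true @ W_true)).astype(float)

Z, W, history = fit_logistic_pca(X, k=2)
print("log-likelihood: first step", history[0], "last step", history[-1])
```

Swapping the Bernoulli log-likelihood and sigmoid for a Gaussian log-likelihood and the identity link would give the real-valued case the abstract mentions, which is the sense in which the framework is unified.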

Dimensionality Reduction Using the Sparse Linear Model

Neural Information Processing Systems

We propose an approach for linear unsupervised dimensionality reduction, based on the sparse linear model that has been used to probabilistically interpret sparse coding. We formulate an optimization problem for learning a linear projection from the original signal domain to a lower-dimensional one in a way that approximately preserves, in expectation, pairwise inner products in the sparse domain. We derive solutions to the problem, present nonlinear extensions, and discuss relations to compressed sensing. Our experiments using facial images, texture patches, and images of object categories suggest that the approach can improve our ability to recover meaningful structure in many classes of signals.

Guide To Dimensionality Reduction With Recursive Feature Elimination


In statistics and machine learning, feature elimination therefore refers to choosing a subset of relevant features from the dataset to use in further …
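As a rough illustration of the recursive variant named in the title, the sketch below repeatedly fits a least-squares model and drops the feature with the smallest absolute coefficient until the requested number remains. The helper `rfe_linear`, the choice of a linear model as the ranking criterion, and the synthetic data are assumptions; the excerpt itself does not specify them.

```python
import numpy as np

def rfe_linear(X, y, n_keep):
    """Recursive feature elimination with a least-squares ranking (hypothetical helper)."""
    # Standardize so coefficient magnitudes are comparable across features.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        coef, *_ = np.linalg.lstsq(Xs[:, remaining], yc, rcond=None)
        weakest = int(np.argmin(np.abs(coef)))  # position of the least important feature
        remaining.pop(weakest)
    return remaining

# Synthetic data: only columns 0 and 3 influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=300)

selected = rfe_linear(X, y, n_keep=2)
print(selected)
```

Refitting after each removal is what distinguishes the recursive procedure from ranking all features once: the importance of the surviving features is re-estimated every round.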

Training a Machine Learning Model on a Dataset with Highly-Correlated Features


In a previous article, we showed that a covariance matrix plot can be used for feature selection and dimensionality reduction: Feature Selection and Dimensionality Reduction Using Covariance Matrix Plot. This allowed us to reduce the dimension of our feature space from 6 to 4. Now suppose we want to build a model on the new feature space for predicting the crew variable. Looking at the covariance matrix plot, we see that the features (predictor variables) are strongly correlated with one another (see the image above). In this article, we use Principal Component Analysis (PCA) to transform our features into a space where they are uncorrelated. We then train our model on the PCA space.
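The workflow described here can be sketched with NumPy alone: standardize the features, project them onto the leading eigenvectors of their covariance matrix, and fit a linear regression on the resulting scores. The data below are synthetic stand-ins for the article's dataset, and the helper names and the choice of two components are assumptions.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the mean, std, and leading principal axes of standardized X."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    Z = (X - mean) / std
    cov = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    return mean, std, eigvecs[:, order]

def pca_transform(X, mean, std, components):
    return ((X - mean) / std) @ components

# Four highly correlated synthetic features and a target that depends on them.
rng = np.random.default_rng(0)
n = 200
base = rng.normal(size=(n, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(n, 1)) for _ in range(4)])
y = X.sum(axis=1) + 0.1 * rng.normal(size=n)

mean, std, components = pca_fit(X, n_components=2)
scores = pca_transform(X, mean, std, components)

# Ordinary least squares on the PCA scores (with an intercept column).
A = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print("correlation between PC1 and PC2:", np.corrcoef(scores, rowvar=False)[0, 1])
print("R^2 on training data:", r2)
```

The principal component scores are uncorrelated by construction, which removes the multicollinearity among the original predictors; they are not guaranteed to be statistically independent unless the data are jointly Gaussian.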

Dimensionality Reduction via Program Induction

AAAI Conferences

How can techniques drawn from machine learning be applied to the learning of structured, compositional representations? In this work, we adopt functional programs as our representation, and cast the problem of learning symbolic representations as a symbolic analog of dimensionality reduction. By placing program synthesis within a probabilistic machine learning framework, we are able to model the learning of some English inflectional morphology and solve a set of synthetic regression problems.