AITopics | Dimensionality Reduction

Collaborating Authors

Dimensionality Reduction

Dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It can be divided into feature selection (find a subset of the original variables) and feature extraction (transform the data in the high-dimensional space to a space of fewer dimensions). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification

Lacoste-Julien, Simon, Sha, Fei, Jordan, Michael I.

Neural Information Processing SystemsFeb-15-2020, 02:26:03 GMT

Probabilistic topic models (and their extensions) have become popular as models of latent structures in collections of text documents or images. These models are usually treated as generative models and trained using maximum likelihood estimation, an approach which may be suboptimal in the context of an overall classification problem. In this paper, we describe DiscLDA, a discriminative learning framework for such models as Latent Dirichlet Allocation (LDA) in the setting of dimensionality reduction with supervised side information. In DiscLDA, a class-dependent linear transformation is introduced on the topic mixture proportions. This parameter is estimated by maximizing the conditional likelihood using Monte Carlo EM.

Add feedback

Dimensionality Reduction Using the Sparse Linear Model

Gkioulekas, Ioannis A., Zickler, Todd

Neural Information Processing SystemsFeb-14-2020, 21:42:05 GMT

We propose an approach for linear unsupervised dimensionality reduction, based on the sparse linear model that has been used to probabilistically interpret sparse coding. We formulate an optimization problem for learning a linear projection from the original signal domain to a lower-dimensional one in a way that approximately preserves, in expectation, pairwise inner products in the sparse domain. We derive solutions to the problem, present nonlinear extensions, and discuss relations to compressed sensing. Our experiments using facial images, texture patches, and images of object categories suggest that the approach can improve our ability to recover meaningful structure in many classes of signals. Papers published at the Neural Information Processing Systems Conference.

dimensionality reduction, sparse linear model

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.68)

Add feedback

Dimensionality Reduction has Quantifiable Imperfections: Two Geometric Bounds

Lui, Kry, Ding, Gavin Weiguang, Huang, Ruitong, McCann, Robert

Neural Information Processing SystemsFeb-14-2020, 20:13:48 GMT

In this paper, we investigate Dimensionality reduction (DR) maps in an information retrieval setting from a quantitative topology point of view. In particular, we show that no DR maps can achieve perfect precision and perfect recall simultaneously. Thus a continuous DR map must have imperfect precision. We further prove an upper bound on the precision of Lipschitz continuous DR maps. While precision is a natural measure in an information retrieval setting, it does not measure how' wrong the retrieved data is.

dimensionality reduction, geometric bound, quantifiable imperfection, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.66)

Add feedback

Practical Hash Functions for Similarity Estimation and Dimensionality Reduction

Dahlgaard, Søren, Knudsen, Mathias, Thorup, Mikkel

Neural Information Processing SystemsFeb-14-2020, 19:12:16 GMT

Hashing is a basic tool for dimensionality reduction employed in several aspects of machine learning. However, the perfomance analysis is often carried out under the abstract assumption that a truly random unit cost hash function is used, without concern for which concrete hash function is employed. The concrete hash function may work fine on sufficiently random input. The question is if it can be trusted in the real world when faced with more structured input. In this paper we focus on two prominent applications of hashing, namely similarity estimation with the one permutation hashing (OPH) scheme of Li et al. [NIPS'12] and feature hashing (FH) of Weinberger et al. [ICML'09], both of which have found numerous applications, i.e. in approximate near-neighbour search with LSH and large-scale classification with SVM.

application, hash function, similarity estimation and dimensionality reduction, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.63)

Add feedback

Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization

Chen, Minshuo, Yang, Lin, Wang, Mengdi, Zhao, Tuo

Neural Information Processing SystemsFeb-14-2020, 12:41:43 GMT

Stochastic optimization naturally arises in machine learning. Efficient algorithms with provable guarantees, however, are still largely missing, when the objective function is nonconvex and the data points are dependent. Specifically, our goal is to estimate the principle component of time series data with respect to the covariance matrix of the stationary distribution. Computationally, we propose a variant of Oja's algorithm combined with downsampling to control the bias of the stochastic gradient caused by the data dependency. Theoretically, we quantify the uncertainty of our proposed stochastic algorithm based on diffusion approximations.

dimensionality reduction, stationary time sery, stochastic nonconvex optimization, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.40)

Add feedback

Dimensionality Reduction of Massive Sparse Datasets Using Coresets

Feldman, Dan, Volkov, Mikhail, Rus, Daniela

Neural Information Processing SystemsFeb-14-2020, 12:26:55 GMT

In this paper we present a practical solution with performance guarantees to the problem of dimensionality reduction for very large scale sparse matrices. We show applications of our approach to computing the Principle Component Analysis (PCA) of any $n\times d$ matrix, using one pass over the stream of its rows. Our solution uses coresets: a scaled subset of the $n$ rows that approximates their sum of squared distances to \emph{every} $k$-dimensional \emph{affine} subspace. An open theoretical problem has been to compute such a coreset that is independent of both $n$ and $d$. An open practical problem has been to compute a non-trivial approximation to the PCA of very large but sparse databases such as the Wikipedia document-term matrix in a reasonable time.

coreset, dimensionality reduction, massive sparse dataset, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.66)

Add feedback

A Normative Theory of Adaptive Dimensionality Reduction in Neural Networks

Pehlevan, Cengiz, Chklovskii, Dmitri

Neural Information Processing SystemsFeb-14-2020, 11:12:48 GMT

To make sense of the world our brains must analyze high-dimensional datasets streamed by our sensory organs. Because such analysis begins with dimensionality reduction, modelling early sensory processing requires biologically plausible online dimensionality reduction algorithms. Recently, we derived such an algorithm, termed similarity matching, from a Multidimensional Scaling (MDS) objective function. However, in the existing algorithm, the number of output dimensions is set a priori by the number of output neurons and cannot be changed. Because the number of informative dimensions in sensory inputs is variable there is a need for adaptive dimensionality reduction.

adaptive dimensionality reduction, algorithm, covariance matrix, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.74)

Add feedback

Large Margin Discriminant Dimensionality Reduction in Prediction Space

Saberian, Mohammad, Pereira, Jose Costa, Xu, Can, Yang, Jian, Nvasconcelos, Nuno

Neural Information Processing SystemsFeb-14-2020, 08:58:49 GMT

In this paper we establish a duality between boosting and SVM, and use this to derive a novel discriminant dimensionality reduction algorithm. In particular, using the multiclass formulation of boosting and SVM we note that both use a combination of mapping and linear classification to maximize the multiclass margin. In SVM this is implemented using a pre-defined mapping (induced by the kernel) and optimizing the linear classifiers. In boosting the linear classifiers are pre-defined and the mapping (predictor) is learned through combination of weak learners. We argue that the intermediate mapping, e.g.

artificial intelligence, machine learning, margin discriminant dimensionality reduction, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.70)

Add feedback

Dimensionality Reduction with Subspace Structure Preservation

Arpit, Devansh, Nwogu, Ifeoma, Govindaraju, Venu

Neural Information Processing SystemsFeb-14-2020, 06:41:43 GMT

Modeling data as being sampled from a union of independent subspaces has been widely applied to a number of real world applications. However, dimensionality reduction approaches that theoretically preserve this independence assumption have not been well studied. Our key contribution is to show that $2K$ projection vectors are sufficient for the independence preservation of any $K$ class data sampled from a union of independent subspaces. It is this non-trivial observation that we use for designing our dimensionality reduction technique. In this paper, we propose a novel dimensionality reduction algorithm that theoretically preserves this structure for a given dataset.

dimensionality reduction, dimensionality reduction technique, subspace structure preservation, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (1.00)

Add feedback

Stable Sparse Subspace Embedding for Dimensionality Reduction

Chen, Li, Zhou, Shuizheng, Ma, Jiajun

arXiv.org Machine LearningFeb-7-2020

Sparse random projection (RP) is a popular tool for dimensionality reduction that shows promising performance with low computational complexity. However, in the existing sparse RP matrices, the positions of non-zero entries are usually randomly selected. Although they adopt uniform sampling with replacement, due to large sampling variance, the number of non-zeros is uneven among rows of the projection matrix which is generated in one trial, and more data information may be lost after dimension reduction. To break this bottleneck, based on random sampling without replacement in statistics, this paper builds a stable sparse subspace embedded matrix (S-SSE), in which non-zeros are uniformly distributed. It is proved that the S-SSE is stabler than the existing matrix, and it can maintain Euclidean distance between points well after dimension reduction. Our empirical studies corroborate our theoretical findings and demonstrate that our approach can indeed achieve satisfactory performance.

matrix, nonzero entry, s-sse, (14 more...)

arXiv.org Machine Learning

2002.02844

Country:

North America > United States (0.14)
Asia > China > Henan Province > Zhengzhou (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.63)

Add feedback