AITopics | Dimensionality Reduction


Dimensionality Reduction

Dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It can be divided into feature selection (find a subset of the original variables) and feature extraction (transform the data in the high-dimensional space to a space of fewer dimensions). (Wikipedia)
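The distinction between feature selection and feature extraction can be illustrated with a minimal NumPy sketch. The variance-based selection rule, the PCA-via-SVD projection, and the function names `select_features` and `extract_features` are illustrative assumptions, not part of the quoted definition.

```python
import numpy as np

def select_features(X, k):
    """Feature selection: keep the k original columns with the largest variance."""
    variances = X.var(axis=0)
    keep = np.sort(np.argsort(variances)[::-1][:k])
    return X[:, keep], keep

def extract_features(X, k):
    """Feature extraction: project centered data onto the top-k principal directions."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

# Synthetic data (assumption): five columns with very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 0.1, 3.0, 0.2, 1.0])

X_sel, kept = select_features(X, 2)   # a subset of the original variables
X_ext = extract_features(X, 2)        # new variables mixing all original ones
```

Selection returns columns that still carry their original meaning, while extraction returns coordinates that are linear combinations of every input column.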


Dimensionality Reduction has Quantifiable Imperfections: Two Geometric Bounds

Lui, Kry, Ding, Gavin Weiguang, Huang, Ruitong, McCann, Robert

Neural Information Processing Systems, Dec-31-2018

In this paper, we investigate dimensionality reduction (DR) maps in an information retrieval setting from a quantitative topology point of view. In particular, we show that no DR map can achieve perfect precision and perfect recall simultaneously. Thus a continuous DR map must have imperfect precision. We further prove an upper bound on the precision of Lipschitz continuous DR maps. While precision is a natural measure in an information retrieval setting, it does not measure "how" wrong the retrieved data is. We therefore propose a new measure based on Wasserstein distance that comes with a similar theoretical guarantee. A key technical step in our proofs is a particular optimization problem of the $L_2$-Wasserstein distance over a constrained set of distributions. We provide a complete solution to this optimization problem, which can be of independent interest on the technical side.
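To make the information retrieval framing concrete, the sketch below computes precision and recall for a single query point under a DR map. It assumes that "relevant" points are the nearest neighbors in the original space and "retrieved" points are the nearest neighbors in the reduced space. The k-nearest-neighbor definition, the coordinate-dropping map, and all function names are assumptions made for illustration and may differ from the paper's definitions.

```python
import numpy as np

def knn_indices(X, i, k):
    """Indices of the k nearest neighbors of point i, excluding i itself."""
    d = np.linalg.norm(X - X[i], axis=1)
    d[i] = np.inf
    return set(np.argsort(d)[:k].tolist())

def precision_recall(X_high, X_low, i, k_relevant, k_retrieved):
    """Precision and recall of low-dimensional retrieval for query point i."""
    relevant = knn_indices(X_high, i, k_relevant)    # neighbors before DR
    retrieved = knn_indices(X_low, i, k_retrieved)   # neighbors after DR
    hits = len(relevant & retrieved)
    return hits / len(retrieved), hits / len(relevant)

rng = np.random.default_rng(0)
X_high = rng.normal(size=(300, 10))
X_low = X_high[:, :2]  # a crude linear DR map (assumption): keep two coordinates

p, r = precision_recall(X_high, X_low, i=0, k_relevant=10, k_retrieved=20)
```

With the identity map and equal neighborhood sizes both values are 1; any map that discards coordinates will generally lower at least one of them.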

artificial intelligence, machine learning, precision and recall, (16 more...)


Country: North America > Canada (0.46)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.62)


Model-based targeted dimensionality reduction for neuronal population data

Aoi, Mikio, Pillow, Jonathan W.

Neural Information Processing Systems, Dec-31-2018

Summarizing high-dimensional data using a small number of parameters is a ubiquitous first step in the analysis of neuronal population activity. Recently developed methods use "targeted" approaches that work by identifying multiple, distinct low-dimensional subspaces of activity that capture the population response to individual experimental task variables, such as the value of a presented stimulus or the behavior of the animal. These methods have gained attention because they decompose total neural activity into what are ostensibly different parts of a neuronal computation. However, existing targeted methods have been developed outside of the confines of probabilistic modeling, making some aspects of the procedures ad hoc, or limited in flexibility or interpretability. Here we propose a new model-based method for targeted dimensionality reduction based on a probabilistic generative model of the population response data. The low-dimensional structure of our model is expressed as a low-rank factorization of a linear regression model. We perform efficient inference using a combination of expectation maximization and direct maximization of the marginal likelihood. We also develop an efficient method for estimating the dimensionality of each subspace. We show that our approach outperforms alternative methods in both mean squared error of the parameter estimates, and in identifying the correct dimensionality of encoding using simulated data. We also show that our method provides more accurate inference of low-dimensional subspaces of activity than a competing algorithm, demixed PCA.
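The idea of expressing low-dimensional structure as a low-rank factorization of a linear regression model can be sketched with classical reduced-rank regression. This is a swapped-in textbook technique, not the probabilistic model or the inference procedure proposed in the paper; the synthetic data and the name `reduced_rank_regression` are assumptions.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Least-squares coefficient matrix constrained to the given rank."""
    # Unconstrained ordinary least-squares fit.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Top right singular vectors of the fitted values span the response subspace.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V = Vt[:rank].T
    # Projecting onto that subspace yields a coefficient matrix of rank <= `rank`.
    return B_ols @ V @ V.T

# Synthetic "population" (assumption): 30 responses driven by a rank-2 map
# from 4 task variables.
rng = np.random.default_rng(0)
n, p, q, r = 500, 4, 30, 2
B_true = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))
X = rng.normal(size=(n, p))
Y = X @ B_true + 0.1 * rng.normal(size=(n, q))

B_hat = reduced_rank_regression(X, Y, rank=r)
```

The rank constraint is what ties the 30 responses to a shared two-dimensional subspace instead of fitting each response independently.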

artificial intelligence, machine learning, task variable, (16 more...)


Country: North America (0.28)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.62)


Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization

Chen, Minshuo, Yang, Lin, Wang, Mengdi, Zhao, Tuo

Neural Information Processing Systems, Dec-31-2018

Stochastic optimization naturally arises in machine learning. Efficient algorithms with provable guarantees, however, are still largely missing when the objective function is nonconvex and the data points are dependent. This paper studies this fundamental challenge through a streaming PCA problem for stationary time series data. Specifically, our goal is to estimate the principal component of time series data with respect to the covariance matrix of the stationary distribution. Computationally, we propose a variant of Oja's algorithm combined with downsampling to control the bias of the stochastic gradient caused by the data dependency. Theoretically, we quantify the uncertainty of our proposed stochastic algorithm based on diffusion approximations. This allows us to prove the asymptotic rate of convergence and further implies near optimal asymptotic sample complexity. Numerical experiments are provided to support our analysis.
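A minimal sketch of Oja's update combined with downsampling follows. The constant step size, the fixed downsampling gap, the AR(1) test stream, and the function names are all assumptions chosen for illustration; the paper's exact variant and parameter schedule are not given in the abstract.

```python
import numpy as np

def oja_downsampled(stream, dim, step=0.01, gap=5, seed=0):
    """Estimate the leading principal direction from a dependent data stream.

    Only every `gap`-th sample is used, which weakens the dependence
    between consecutive updates.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=dim)
    w /= np.linalg.norm(w)
    for t, x in enumerate(stream):
        if t % gap != 0:
            continue                    # downsampling: skip intermediate samples
        w = w + step * x * (x @ w)      # Oja's stochastic gradient step
        w /= np.linalg.norm(w)          # project back onto the unit sphere
    return w

def ar1_stream(n, dim, rho=0.5, seed=1):
    """Stationary AR(1) series whose dominant variance lies along axis 0."""
    rng = np.random.default_rng(seed)
    scales = np.ones(dim)
    scales[0] = 3.0
    x = np.zeros(dim)
    for _ in range(n):
        x = rho * x + scales * rng.normal(size=dim)
        yield x

w = oja_downsampled(ar1_stream(20000, dim=5), dim=5)
```

In this toy stream the stationary covariance is diagonal with its largest entry on the first coordinate, so the estimate should align with the first axis up to sign.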

artificial intelligence, machine learning, saddle point, (18 more...)


Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)


Visualizing MNIST: An Exploration of Dimensionality Reduction - colah's blog

#artificialintelligence, Dec-28-2018, 22:13:13 GMT

At some fundamental level, no one understands machine learning. It isn't a matter of things being too complicated. Almost everything we do is fundamentally very simple. Unfortunately, an innate human handicap interferes with us understanding these simple things. Humans evolved to reason fluidly about two and three dimensions. With some effort, we may think in four dimensions. Machine learning often demands we work with thousands of dimensions – or tens of thousands, or millions! Even very simple things become hard to understand when you do them in very high numbers of dimensions. Reasoning directly about these high dimensional spaces is just short of hopeless.

artificial intelligence, machine learning, visualization, (17 more...)


Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.41)


Deep Variational Sufficient Dimensionality Reduction

Banijamali, Ershad, Karimi, Amir-Hossein, Ghodsi, Ali

arXiv.org Machine Learning, Dec-18-2018

We consider the problem of sufficient dimensionality reduction (SDR), where the high-dimensional observation is transformed to a low-dimensional subspace in which the information of the observations regarding the label variable is preserved. We propose DVSDR, a deep variational approach for sufficient dimensionality reduction. The deep structure in our model has a bottleneck that represents the low-dimensional embedding of the data. We explain the SDR problem using graphical models and use the framework of variational autoencoders to maximize the lower bound of the log-likelihood of the joint distribution of the observation and label. We show that such a maximization problem can be interpreted as solving the SDR problem. DVSDR can be easily adapted to the semi-supervised learning setting. In our experiments, we show that DVSDR performs competitively on classification tasks while being able to generate novel data samples.

artificial intelligence, latent space, machine learning, (17 more...)


1812.07641

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)


Extending classical surrogate modelling to ultrahigh dimensional problems through supervised dimensionality reduction: a data-driven approach

Lataniotis, C., Marelli, S., Sudret, B.

arXiv.org Machine Learning, Dec-15-2018

Thanks to their versatility, ease of deployment and high performance, surrogate models have become staple tools in the arsenal of uncertainty quantification (UQ). From local interpolants to global spectral decompositions, surrogates are characterised by their ability to efficiently emulate complex computational models based on a small set of model runs used for training. An inherent limitation of many surrogate models is their susceptibility to the curse of dimensionality, which traditionally limits their applicability to a maximum of $\mathcal{O}(10^2)$ input dimensions. We present a novel approach to high-dimensional surrogate modelling that is agnostic (black box) with respect to the model, the dimensionality reduction method and the surrogate model, and can enable the solution of high-dimensional (i.e. up to $\mathcal{O}(10^4)$) problems. After introducing the general algorithm, we demonstrate its performance by combining Kriging and polynomial chaos expansion surrogates with kernel principal component analysis. In particular, we compare the generalisation performance that the resulting surrogates achieve to the classical sequential application of dimensionality reduction followed by surrogate modelling on several benchmark applications, comprising an analytical function and two engineering applications of increasing dimensionality and complexity.

artificial intelligence, data mining, machine learning, (15 more...)


1812.06309

Country:

North America > United States (0.46)
Europe > United Kingdom (0.28)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.82)


Anti-drift in electronic nose via dimensionality reduction: a discriminative subspace projection approach

Yi, Zhengkun, Li, Cheng

arXiv.org Machine Learning, Dec-14-2018

Sensor drift is a well-known issue in the field of sensors and measurement and has plagued the sensor community for many years. In this paper, we propose a sensor drift correction method to deal with the sensor drift problem. Specifically, we propose a discriminative subspace projection approach for sensor drift reduction in electronic noses. The proposed method inherits the merits of the subspace projection method called domain regularized component analysis. Moreover, the proposed method takes the source data label information into consideration, which minimizes the within-class variance of the projected source samples and at the same time maximizes the between-class variance. The label information is exploited to avoid overlapping of samples with different labels in the subspace. Experiments on two sensor drift datasets have shown the effectiveness of the proposed approach. Keywords: Sensor drift; Electronic nose; Subspace projection method; Domain adaptation; Transfer learning.
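The trade-off between within-class and between-class variance can be sketched with a plain Fisher discriminant projection. This is a swapped-in classical technique shown only to illustrate the two scatter terms; it omits the domain regularization described in the abstract, and the synthetic data, the ridge term `reg`, and the function names are assumptions.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (S_w) and between-class (S_b) scatter matrices."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    return S_w, S_b

def fisher_projection(X, y, k, reg=1e-6):
    """Directions that make S_b large relative to S_w after projection."""
    S_w, S_b = scatter_matrices(X, y)
    d = X.shape[1]
    # Small ridge term (assumption) keeps the linear solve well conditioned.
    M = np.linalg.solve(S_w + reg * np.eye(d), S_b)
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(eigvals.real)[::-1][:k]
    return eigvecs[:, order].real

# Two synthetic classes (assumption) separated along the first coordinate.
rng = np.random.default_rng(0)
X0 = rng.normal(size=(100, 3))
X1 = rng.normal(size=(100, 3)) + np.array([4.0, 0.0, 0.0])
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

S_w, S_b = scatter_matrices(X, y)
P = fisher_projection(X, y, k=1)
Z = X @ P   # one-dimensional projected samples
```

The two scatter matrices sum to the total scatter of the data, so shrinking one relative to the other is what pulls same-label samples together and pushes different labels apart in the subspace.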

component analysis, dataset, source domain, (16 more...)


1901.02321

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.50)

Industry:

Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.69)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.41)


Combatting Adversarial Attacks through Denoising and Dimensionality Reduction: A Cascaded Autoencoder Approach

Sahay, Rajeev, Mahfuz, Rehana, Gamal, Aly El

arXiv.org Machine Learning, Dec-7-2018

Machine learning models are vulnerable to adversarial attacks that rely on perturbing the input data. This work proposes a novel strategy using autoencoder deep neural networks to defend a machine learning model against two gradient-based attacks: the Fast Gradient Sign attack and the Fast Gradient attack. First, we use an autoencoder, trained with both clean and corrupted data, to denoise the test data. Then, we reduce the dimension of the denoised data using the hidden layer representation of another autoencoder. We perform this experiment for multiple values of the bound of adversarial perturbations, and consider different numbers of reduced dimensions. When the test data is preprocessed using this cascaded pipeline, the tested deep neural network classifier yields a much higher accuracy, thus mitigating the effect of the adversarial perturbation. I. INTRODUCTION State-of-the-art machine learning algorithms have revolutionized automated classification technologies in various fields like computer vision, natural language processing, and biometric information security [1] [2] [3].
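The two gradient-based perturbations named in the abstract can be sketched on a logistic regression model, where the input gradient has a closed form. The linear model, the example weights, and the assumption that the Fast Gradient attack rescales the raw gradient to L2 norm `eps` are illustrative choices; the paper attacks a deep neural network, and its cascaded autoencoder defense is not shown here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_gradient(x, y, w, b):
    """Gradient of the logistic loss with respect to the input x."""
    p = sigmoid(x @ w + b)
    return (p - y) * w

def fgs_perturb(x, y, w, b, eps):
    """Fast gradient sign step: move eps along the sign of the gradient."""
    return x + eps * np.sign(input_gradient(x, y, w, b))

def fg_perturb(x, y, w, b, eps):
    """Fast gradient step (assumed form): move eps along the normalized gradient."""
    g = input_gradient(x, y, w, b)
    norm = np.linalg.norm(g)
    return x if norm == 0 else x + eps * g / norm

# Hypothetical model and input, chosen only for the example.
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.5, -0.5, 1.0])
y = 1

x_fgs = fgs_perturb(x, y, w, b, eps=0.1)
x_fg = fg_perturb(x, y, w, b, eps=0.1)
```

Both perturbations lower the model's confidence in the true label: the sign variant bounds the change per coordinate, while the normalized variant bounds the Euclidean length of the change.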

accuracy, artificial intelligence, machine learning, (16 more...)


1812.03087

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.45)


SqueezeFit: Label-aware dimensionality reduction by semidefinite programming

McWhirter, Culver, Mixon, Dustin G., Villar, Soledad

arXiv.org Machine Learning, Dec-6-2018

Given labeled points in a high-dimensional vector space, we seek a low-dimensional subspace such that projecting onto this subspace maintains some prescribed distance between points of differing labels. Intended applications include compressive classification. Taking inspiration from large margin nearest neighbor classification, this paper introduces a semidefinite relaxation of this problem. Unlike its predecessors, this relaxation is amenable to theoretical analysis, allowing us to provably recover a planted projection operator from the data.

artificial intelligence, machine learning, sqz, (15 more...)


1812.02768

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)


Spatial-Spectral Regularized Local Scaling Cut for Dimensionality Reduction in Hyperspectral Image Classification

Mohanty, Ramanarayan, Happy, S L, Routray, Aurobinda

arXiv.org Machine Learning, Dec-6-2018

Dimensionality reduction (DR) methods have attracted extensive attention because they provide discriminative information and reduce the computational burden of hyperspectral image (HSI) classification. However, DR methods face many challenges due to limited training samples with high-dimensional spectra. To address this issue, a graph-based spatial and spectral regularized local scaling cut (SSRLSC) for DR of HSI data is proposed. The underlying idea of the proposed method is to utilize the information from both the spectral and spatial domains to achieve better classification accuracy than its spectral domain counterpart. In SSRLSC, a guided filter is initially used to smooth and homogenize the pixels of the HSI data in order to preserve the pixel consistency. This is followed by generation of between-class and within-class dissimilarity matrices in both spectral and spatial domains by regularized local scaling cut (RLSC) and neighboring pixel local scaling cut (NPLSC), respectively. Finally, we obtain the projection matrix by optimizing the updated spatial-spectral between-class and total-class dissimilarity. The effectiveness of the proposed DR algorithm is illustrated with two popular real-world HSI datasets.

data mining, information, machine learning, (16 more...)


doi: 10.1109/LGRS.2018.2885809

1812.08047

Country: Asia > India (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.61)
