AITopics | Principal Component Analysis

Principal Component Analysis

Hull Form Optimization with Principal Component Analysis and Deep Neural Network

arXiv.org Machine Learning, Oct-27-2018

Designing and modifying complex hull forms for optimal vessel performance has been a major challenge for naval architects. In the present study, Principal Component Analysis (PCA) is introduced to compress the geometric representation of a group of existing vessels, and the resulting principal scores are manipulated to generate a large number of derived hull forms, which are evaluated computationally for their calm-water performance. The results are subsequently used to train a Deep Neural Network (DNN) to accurately establish the relation between different hull forms and their associated performance. Then, based on the fast, parallel DNN-based hull-form evaluation, a large-scale search for optimal hull forms is performed.
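
The idea of compressing shapes with PCA and then varying the principal scores to derive new shapes can be illustrated with a small sketch. This is not the authors' pipeline: the four-number "shapes", the names `top_component` and `shape_from_score`, and the use of power iteration for the leading component are all assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_component(centered, iters=100):
    """Leading principal direction via power iteration on X^T X."""
    d = len(centered[0])
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        proj = [dot(x, v) for x in centered]  # X v
        w = [sum(p * x[j] for p, x in zip(proj, centered)) for j in range(d)]  # X^T (X v)
        norm = math.sqrt(dot(w, w))
        v = [c / norm for c in w]
    return v

# Hypothetical toy "hulls": four offsets per shape, varying along one direction.
base = [2.0, 3.0, 3.0, 2.0]
direction = [1.0, 2.0, 2.0, 1.0]
shapes = [[b + t * g for b, g in zip(base, direction)] for t in (-1.0, 0.0, 0.5, 1.5)]

# Center the data, then compress each shape to a single principal score.
mean = [sum(col) / len(shapes) for col in zip(*shapes)]
centered = [[x - m for x, m in zip(s, mean)] for s in shapes]
pc = top_component(centered)
scores = [dot(x, pc) for x in centered]

def shape_from_score(score):
    """Map a principal score back to a full shape vector."""
    return [m + score * c for m, c in zip(mean, pc)]

# A derived shape: push the score beyond the range seen in the data.
derived = shape_from_score(1.25 * max(scores))
```

In this toy setting one component captures all of the variation, so every original shape is recovered exactly from its score. Real hull data would need several components and some bound on how far the scores may be moved.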

deep learning, hull form, marine transportation, (19 more...)

arXiv.org Machine Learning

1810.11701

Country:

North America > United States (0.46)
Europe > United Kingdom > England (0.14)

Genre:

Research Report (0.50)
Workflow (0.46)

Industry:

Transportation > Marine (1.00)
Shipbuilding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Distributionally Robust Reduced Rank Regression and Principal Component Analysis in High Dimensions

Tan, Kean Ming, Sun, Qiang, Witten, Daniela

arXiv.org Machine Learning, Oct-18-2018

We propose robust sparse reduced rank regression and robust sparse principal component analysis for analyzing large and complex high-dimensional data with heavy-tailed random noise. The proposed methods are based on convex relaxations of rank- and sparsity-constrained non-convex optimization problems, which are solved using the alternating direction method of multipliers (ADMM) algorithm. For robust sparse reduced rank regression, we establish non-asymptotic estimation error bounds under both Frobenius and nuclear norms, while existing results focus mostly on rank-selection and prediction consistency. Our theoretical results quantify the tradeoff between heavy-tailedness of the random noise and statistical bias. For random noise with bounded $(1+\delta)$th moment with $\delta \in (0,1)$, the rate of convergence is a function of $\delta$, and is slower than the sub-Gaussian-type deviation bounds; for random noise with bounded second moment, we recover the results obtained under sub-Gaussian noise. Furthermore, the transition between the two regimes is smooth. For robust sparse principal component analysis, we propose to truncate the observed data, and show that this truncation will lead to consistent estimation of the eigenvectors. We then establish theoretical results similar to those of robust sparse reduced rank regression. We illustrate the performance of these methods via extensive numerical studies and two real data applications.
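
The truncation step mentioned for robust sparse PCA can be pictured with a minimal sketch. The abstract does not say how the truncation is done, so the element-wise clipping, the threshold `tau`, and the function names below are assumptions rather than the paper's estimator.

```python
def truncate(value, tau):
    """Clip a single observation to the interval [-tau, tau]."""
    return max(-tau, min(tau, value))

def truncated_second_moment(rows, tau):
    """Second-moment matrix of the data after element-wise truncation (assumed form)."""
    n = len(rows)
    d = len(rows[0])
    clipped = [[truncate(v, tau) for v in row] for row in rows]
    return [[sum(r[i] * r[j] for r in clipped) / n for j in range(d)]
            for i in range(d)]

# Hypothetical data: the third row holds a heavy-tailed outlier in the first coordinate.
rows = [[1.0, -1.0], [-1.0, 1.0], [1000.0, 1.0]]
S = truncated_second_moment(rows, tau=2.0)
```

Without truncation the outlier would dominate the first diagonal entry; with `tau=2.0` it contributes no more than any other clipped value. Choosing the threshold is the delicate part, and nothing here says how the paper does it.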

matrix, oncology, survey article, (17 more...)

arXiv.org Machine Learning

1810.07913

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Graph filtering for data reduction and reconstruction

Schizas, Ioannis D.

arXiv.org Machine Learning, Sep-24-2018

A novel approach is put forth that utilizes data similarity, quantified on a graph, to improve upon the reconstruction performance of principal component analysis. The tasks of data dimensionality reduction and reconstruction are formulated as graph filtering operations that enable the exploitation of data node connectivity in a graph via the adjacency matrix. The unknown reducing and reconstruction filters are determined by optimizing a mean-square error cost that entails the data, as well as their graph adjacency matrix. Working in the graph spectral domain enables the derivation of simple gradient descent recursions used to update the matrix filter taps. Numerical tests in real image datasets demonstrate the better reconstruction performance of the novel method over standard principal component analysis.

artificial intelligence, graph, machine learning, (15 more...)

arXiv.org Machine Learning

1809.09266

Country: North America (0.14)

Genre: Research Report > Promising Solution (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.45)

Diffusion Approximations for Online Principal Component Estimation and Global Convergence

Li, Chris Junchi, Wang, Mengdi, Liu, Han, Zhang, Tong

arXiv.org Machine Learning, Aug-29-2018

In this paper, we propose to adopt diffusion approximation tools to study the dynamics of Oja's iteration, which is an online stochastic gradient descent method for principal component analysis. Oja's iteration maintains a running estimate of the true principal component from streaming data and enjoys lower time and space complexity. We show that Oja's iteration for the top eigenvector generates a continuous-state discrete-time Markov chain over the unit sphere. We characterize Oja's iteration in three phases using diffusion approximation and weak convergence tools. Our three-phase analysis further provides a finite-sample error bound for the running estimate, which matches the minimax information lower bound for principal component analysis under the additional assumption of bounded samples.
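
For readers unfamiliar with Oja's iteration, a minimal sketch of the standard update follows: take a gradient-like step along the sample, then renormalize onto the unit sphere. The constant step size, the synthetic two-dimensional stream, and the function name are assumptions chosen for illustration, not details taken from the paper.

```python
import math
import random

def oja_top_component(stream, dim, step=0.01):
    """Running estimate of the top eigenvector from a stream of samples."""
    w = [1.0 / math.sqrt(dim)] * dim  # start on the unit sphere
    for x in stream:
        xw = sum(a * b for a, b in zip(x, w))
        # Stochastic gradient step: w <- w + step * x * (x^T w)
        w = [w_j + step * xw * x_j for w_j, x_j in zip(w, x)]
        # Project back onto the unit sphere.
        norm = math.sqrt(sum(c * c for c in w))
        w = [c / norm for c in w]
    return w

# Hypothetical stream: variance 9 along the first axis, 0.25 along the second.
rng = random.Random(0)
stream = ([3.0 * rng.gauss(0, 1), 0.5 * rng.gauss(0, 1)] for _ in range(5000))
w = oja_top_component(stream, dim=2)
```

Only the current estimate `w` is stored and each sample is used once, which is where the low memory and per-step cost come from. With this stream the estimate should settle close to the first coordinate axis, up to sign.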

artificial intelligence, iteration, machine learning, (16 more...)

arXiv.org Machine Learning

1808.09645

Country:

North America > United States (0.28)
Asia > China (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)

XPCA: Extending PCA for a Combination of Discrete and Continuous Variables

Anderson-Bergman, Clifford, Kolda, Tamara G., Kincher-Winoto, Kina

arXiv.org Machine Learning, Aug-22-2018

Principal component analysis (PCA) is arguably the most popular tool in multivariate exploratory data analysis. In this paper, we consider the question of how to handle heterogeneous variables that include continuous, binary, and ordinal variables. In the probabilistic interpretation of low-rank PCA, the data has a multivariate normal distribution and, therefore, normal marginal distributions for each column. If some marginals are continuous but not normal, the semiparametric copula-based principal component analysis (COCA) method is an alternative to PCA that combines a Gaussian copula with nonparametric marginals. If some marginals are discrete or semi-continuous, we propose a new extended PCA (XPCA) method that also uses a Gaussian copula and nonparametric marginals and accounts for discrete variables in the likelihood calculation by integrating over appropriate intervals. Like PCA, the factors produced by XPCA can be used to find latent structure in data, build predictive models, and perform dimensionality reduction. We present the new model, its induced likelihood function, and a fitting algorithm which can be applied in the presence of missing data. We demonstrate how to use XPCA to produce an estimated full conditional distribution for each data point, and use this to provide estimates for missing data that are automatically range-respecting. We compare the methods as applied to simulated and real-world data sets that have a mixture of discrete and continuous variables.

artificial intelligence, us government, xpca, (18 more...)

arXiv.org Machine Learning

1808.0751

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy (1.00)
Leisure & Entertainment > Sports (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.45)

Efficient Optimization Algorithms for Robust Principal Component Analysis and Its Variants

Ma, Shiqian, Aybat, Necdet Serhat

arXiv.org Machine Learning, Jun-9-2018

Robust PCA has drawn significant attention in the last decade due to its success in numerous application domains, ranging from bio-informatics, statistics, and machine learning to image and video processing in computer vision. Robust PCA and its variants such as sparse PCA and stable PCA can be formulated as optimization problems with exploitable special structures. Many specialized efficient optimization methods have been proposed to solve robust PCA and related problems. In this paper we review existing optimization methods for solving convex and nonconvex relaxations/variants of robust PCA, discuss their advantages and disadvantages, and elaborate on their convergence behaviors. We also provide some insights for possible future research directions, including new algorithmic frameworks that might be suitable for implementation in multi-processor settings to handle large-scale problems.
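
As a point of reference, the best-known convex relaxation of robust PCA is usually written as principal component pursuit. The symbols below ($M$ for the observed matrix, $L$ for the low-rank part, $S$ for the sparse part, $\lambda > 0$ for the trade-off weight) are notation assumed here and do not appear in the abstract.

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \, \|S\|_{1}
\quad \text{subject to} \quad L + S = M
```

Here $\|L\|_{*}$ is the nuclear norm (sum of singular values), a convex surrogate for rank, and $\|S\|_{1}$ is the entrywise $\ell_1$ norm, a convex surrogate for the number of nonzero entries.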

algorithm, optimization problem, survey article, (17 more...)

arXiv.org Machine Learning

1806.0343

Country: North America > United States > California (0.28)

Genre:

Research Report (0.40)
Overview (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.40)

Tensor Robust Principal Component Analysis with A New Tensor Nuclear Norm

Lu, Canyi, Feng, Jiashi, Chen, Yudong, Liu, Wei, Lin, Zhouchen, Yan, Shuicheng

arXiv.org Machine Learning, Apr-10-2018

In this paper, we consider the Tensor Robust Principal Component Analysis (TRPCA) problem, which aims to exactly recover the low-rank and sparse components from their sum. Our model is based on the recently proposed tensor-tensor product (or t-product) [13]. Induced by the t-product, we first rigorously deduce the tensor spectral norm, tensor nuclear norm, and tensor average rank, and show that the tensor nuclear norm is the convex envelope of the tensor average rank within the unit ball of the tensor spectral norm. These definitions, their relationships and properties are consistent with matrix cases. Equipped with the new tensor nuclear norm, we then solve the TRPCA problem by solving a convex program and provide the theoretical guarantee for the exact recovery. Our TRPCA model and recovery guarantee include matrix RPCA as a special case. Numerical experiments verify our results, and the applications to image recovery and background modeling problems demonstrate the effectiveness of our method.

artificial intelligence, machine learning, tensor, (16 more...)

arXiv.org Machine Learning

1804.03728

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.60)

Top 10 Challenges to Practicing Data Science at Work

@machinelearnbot, Apr-7-2018, 19:55:17 GMT

A recent survey of over 16,000 data professionals showed that the most common challenges to data science included dirty data (36%), lack of data science talent (30%) and lack of management support (27%). Also, data professionals reported experiencing around three challenges in the previous year. A principal component analysis of the 20 challenges studied showed that challenges can be grouped into five categories. Data science is about finding useful insights and putting them to use. Data science, however, doesn't occur in a vacuum.

artificial intelligence, data professional, machine learning, (6 more...)

@machinelearnbot

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.32)

Sparse Principal Component Analysis via Variable Projection

Erichson, N. Benjamin, Zeng, Peng, Manohar, Krithika, Brunton, Steven L., Kutz, J. Nathan, Aravkin, Aleksandr Y.

arXiv.org Machine Learning, Apr-1-2018

Sparse principal component analysis (SPCA) has emerged as a powerful technique for modern data analysis. We discuss a robust and scalable algorithm for computing sparse principal component analysis. Specifically, we model SPCA as a matrix factorization problem with orthogonality constraints, and develop specialized optimization algorithms that partially minimize a subset of the variables (variable projection). The framework incorporates a wide variety of sparsity-inducing regularizers for SPCA. We also extend the variable projection approach to robust SPCA, for any robust loss that can be expressed as the Moreau envelope of a simple function, with the canonical example of the Huber loss. Finally, randomized methods for linear algebra are used to extend the approach to the large-scale (big data) setting. The proposed algorithms are demonstrated using both synthetic and real-world data.
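
The statement that the Huber loss is a Moreau envelope of a simple function can be checked numerically. The sketch below assumes the simple function is the absolute value and uses the scaling in which the envelope equals $r^2/(2\delta)$ near zero; some texts multiply this by $\delta$. The function names are illustrative, not from the paper.

```python
def soft_threshold(r, delta):
    """Proximal operator of delta * |z|: shrink r toward zero by delta."""
    if r > delta:
        return r - delta
    if r < -delta:
        return r + delta
    return 0.0

def moreau_envelope_abs(r, delta):
    """min over z of |z| + (r - z)**2 / (2 * delta), evaluated at its minimizer."""
    z = soft_threshold(r, delta)
    return abs(z) + (r - z) ** 2 / (2.0 * delta)

def huber(r, delta):
    """Closed form: quadratic near zero, linear in the tails."""
    if abs(r) <= delta:
        return r * r / (2.0 * delta)
    return abs(r) - delta / 2.0

# The two expressions agree at every test point.
points = [-3.0, -1.0, -0.2, 0.0, 0.5, 1.0, 4.0]
gaps = [abs(moreau_envelope_abs(r, 1.0) - huber(r, 1.0)) for r in points]
```

The same `soft_threshold` function is also the proximal operator of the $\ell_1$ penalty, the most common sparsity-inducing regularizer, which is why it shows up on both the sparsity side and the robustness side of such formulations.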

artificial intelligence, spca, survey article, (18 more...)

arXiv.org Machine Learning

1804.00341

Genre: Research Report (0.40)

Technology: