Principal Component Analysis
Principal Component Analysis for Machine Learning - Translucent
Analyzing large data sets comes with multiple challenges. One of them is getting the data into the right structure for the analysis. Without preprocessing, your algorithms might have a difficult time converging and/or take a long time to execute. One of the techniques we use at TCinc is Principal Component Analysis (PCA). Wikipedia defines PCA as "a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components."
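As a quick illustration of that definition, here is a minimal numpy sketch (synthetic data, not from the original post) that centers a data set, diagonalizes its covariance matrix, and projects onto the top principal components; the resulting scores are linearly uncorrelated, which is what makes PCA useful as a preprocessing step:

```python
import numpy as np

# Toy data: 200 samples, 5 correlated features (synthetic, for illustration only).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(200, 5))

# Center the data, then diagonalize the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]                 # re-sort descending
components = eigvecs[:, order[:2]]                # keep the top 2 principal components

# Project: the resulting scores are linearly uncorrelated.
scores = Xc @ components
print(np.round(np.cov(scores, rowvar=False), 4))  # ~diagonal covariance
```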
Uncertainty-Aware Principal Component Analysis
Görtler, Jochen, Spinner, Thilo, Streeb, Dirk, Weiskopf, Daniel, Deussen, Oliver
We present a technique to perform dimensionality reduction on data that is subject to uncertainty. Our method is a generalization of traditional principal component analysis (PCA) to multivariate probability distributions. In comparison to non-linear methods, linear dimensionality reduction techniques have the advantage that the characteristics of such probability distributions remain intact after projection. We derive a representation of the covariance matrix that respects potential uncertainty in each of the observations, which forms the mathematical foundation of our new method, uncertainty-aware PCA. In addition to the accuracy and performance gained by our approach over sampling-based strategies, our formulation allows us to perform sensitivity analysis with regard to the uncertainty in the data. For this, we propose factor traces as a novel visualization that enables us to better understand the influence of uncertainty on the chosen principal components. We provide multiple examples of our technique using real-world datasets, and show how to propagate multivariate normal distributions through PCA in closed form. Furthermore, we discuss extensions and limitations of our approach.
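The paper's uncertainty-aware covariance construction is not reproduced here, but the closed-form propagation of a multivariate normal through a linear PCA projection rests on a textbook identity: if x ~ N(mu, Sigma) and y = Wᵀ(x − m), then y ~ N(Wᵀ(mu − m), WᵀSigmaW). A minimal numpy sketch of just that identity:

```python
import numpy as np

# Textbook Gaussian identity (not the authors' full uncertainty-aware method):
# for x ~ N(mu, Sigma) and y = W.T @ (x - m),
# y ~ N(W.T @ (mu - m), W.T @ Sigma @ W) -- exact, no sampling needed.
rng = np.random.default_rng(1)
mu = np.array([1.0, -2.0, 0.5])
A = rng.normal(size=(3, 3))
Sigma = A @ A.T                       # a valid (PSD) input covariance

W = np.linalg.eigh(Sigma)[1][:, -2:]  # top-2 eigenvectors as a stand-in projection
m = mu                                # center at the mean for simplicity

mu_proj = W.T @ (mu - m)              # projected mean (zero here)
Sigma_proj = W.T @ Sigma @ W          # projected covariance, in closed form
print(mu_proj, np.round(Sigma_proj, 4))
```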
Self-Paced Probabilistic Principal Component Analysis for Data with Outliers
Zhao, Bowen, Xiao, Xi, Zhang, Wanpeng, Zhang, Bin, Xia, Shutao
Principal Component Analysis (PCA) is a popular tool for dimensionality reduction and feature extraction in data analysis. There is a probabilistic version of PCA, known as Probabilistic PCA (PPCA). However, standard PCA and PPCA are not robust, as they are sensitive to outliers. To alleviate this problem, this paper introduces the Self-Paced Learning mechanism into PPCA and proposes a novel method called Self-Paced Probabilistic Principal Component Analysis (SP-PPCA). Furthermore, we design the corresponding optimization algorithm based on an alternating search strategy and the expectation-maximization algorithm. SP-PPCA iteratively searches for optimal projection vectors and filters out outliers. Experiments on both synthetic problems and real-world datasets clearly demonstrate that SP-PPCA is able to reduce or eliminate the impact of outliers.
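SP-PPCA itself alternates an EM step for PPCA with a self-paced sample-selection step; the sketch below is a deliberately simplified caricature of the self-paced idea, using plain PCA reconstruction error rather than the authors' probabilistic algorithm:

```python
import numpy as np

# Simplified caricature of self-paced outlier filtering (not the authors' SP-PPCA):
# alternately fit PCA on the currently selected samples and re-select the "easy"
# samples whose reconstruction error falls below a gradually loosened threshold.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5)) @ rng.normal(size=(5, 10))
X[:15] += 20 * rng.normal(size=(15, 10))         # inject 15 gross outliers

selected = np.ones(len(X), dtype=bool)
for lam in [1.0, 2.0, 4.0]:                      # self-paced: threshold grows each round
    Xs = X[selected]
    mean = Xs.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xs - mean, full_matrices=False)
    P = Vt[:5].T                                 # rank-5 projector from current inliers
    resid = X - mean
    err = ((resid - resid @ P @ P.T) ** 2).sum(axis=1)
    selected = err < lam * np.median(err)        # keep currently "easy" samples
print(selected[:15].sum(), "of 15 outliers still selected")
```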
Eigenvalue and Generalized Eigenvalue Problems: Tutorial
Ghojogh, Benyamin, Karray, Fakhri, Crowley, Mark
This paper is a tutorial for eigenvalue and generalized eigenvalue problems. We first introduce the eigenvalue problem, eigen-decomposition (spectral decomposition), and the generalized eigenvalue problem. Then we describe the optimization problems that give rise to eigenvalue and generalized eigenvalue problems. We also provide examples from machine learning, including principal component analysis, kernel supervised principal component analysis, and Fisher discriminant analysis, which result in eigenvalue and generalized eigenvalue problems. Finally, we introduce the solutions to both eigenvalue and generalized eigenvalue problems.
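Both problems the tutorial covers are directly available in numpy/scipy; a minimal sketch of the ordinary problem Av = λv and the generalized problem Av = λBv for symmetric matrices:

```python
import numpy as np
from scipy.linalg import eigh

# Symmetric eigenvalue problem A v = lambda v, and the generalized form
# A v = lambda B v that appears in e.g. Fisher discriminant analysis.
rng = np.random.default_rng(3)
M = rng.normal(size=(4, 4))
A = M @ M.T                      # symmetric positive semidefinite
N = rng.normal(size=(4, 4))
B = N @ N.T + 4 * np.eye(4)      # symmetric positive definite

w, V = eigh(A)                   # ordinary problem: A @ V = V * w
print(np.allclose(A @ V, V * w))

w_g, V_g = eigh(A, B)            # generalized problem: A @ V_g = (B @ V_g) * w_g
print(np.allclose(A @ V_g, B @ V_g * w_g))
```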
PCA and SVD explained with numpy
How exactly are principal component analysis and singular value decomposition related, and how can both be implemented using numpy? Principal component analysis (PCA) and singular value decomposition (SVD) are commonly used dimensionality reduction approaches in exploratory data analysis (EDA) and Machine Learning. They are both classical linear dimensionality reduction methods that attempt to find linear combinations of features in the original high-dimensional data matrix in order to construct a meaningful representation of the dataset. Different fields prefer different methods when it comes to reducing dimensionality: PCA is often used by biologists to analyze and visualize the sources of variance in datasets from population genetics, transcriptomics, proteomics, and microbiome studies. Meanwhile, SVD, particularly its reduced version, truncated SVD, is more popular in natural language processing, where it is used to obtain representations of gigantic yet sparse word-frequency matrices.
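The core relationship the post refers to can be verified in a few lines: for a centered data matrix X = USVᵀ with n rows, the principal axes are the rows of Vᵀ and the covariance eigenvalues equal S²/(n−1). A minimal numpy sketch on synthetic data:

```python
import numpy as np

# The PCA-SVD connection: for centered data X (n samples x p features) with
# X = U S Vt, the principal axes are the rows of Vt and the covariance
# eigenvalues are S**2 / (n - 1).
rng = np.random.default_rng(4)
X = rng.normal(size=(100, 6)) @ rng.normal(size=(6, 6))
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

# Route 1: eigendecomposition of the covariance matrix (classic PCA).
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
eigvals = eigvals[::-1]                       # descending order

# Route 2: SVD of the centered data matrix.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

print(np.allclose(eigvals, S**2 / (n - 1)))   # identical spectra
# Principal component scores either way (columns may differ only by sign):
scores_svd = U * S                            # == Xc @ Vt.T
print(np.allclose(np.abs(scores_svd), np.abs(Xc @ eigvecs[:, ::-1])))
```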
Spherical Principal Component Analysis
Liu, Kai, Li, Qiuwei, Wang, Hua, Tang, Gongguo
In many real-world applications such as text categorization and face recognition, the dimensions of data are usually very high. Dealing with high-dimensional data is computationally expensive, and noise or outliers in the data can increase dramatically as the dimension increases. Dimension reduction is one of the most important and effective methods for handling high-dimensional data [4, 17, 20]. Among dimension reduction methods, Principal Component Analysis (PCA) is one of the most widely used due to its simplicity and effectiveness. PCA is a statistical procedure that uses an orthogonal transformation to convert a set of correlated variables into a set of linearly uncorrelated principal directions. Usually the number of principal directions is less than or equal to the number of original variables. This transformation is defined in such a way that the first principal direction has the largest possible variance (that is, it accounts for as much of the variability in the data as possible), and each succeeding direction has the highest variance possible under the constraint that it is orthogonal to the preceding directions. The resulting vectors form an uncorrelated orthogonal basis set. When data points lie in a low-dimensional manifold that is linear or nearly linear, the low-dimensional structure of the data can be effectively captured by the linear subspace spanned by the principal directions found by PCA.
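The variance-maximization property described above is easy to check numerically: the first principal direction beats any random unit direction in projected variance, and the directions form an orthonormal basis. A small numpy sketch:

```python
import numpy as np

# Checking the defining property of PCA: the first principal direction maximizes
# the variance of the projected data over unit vectors, and the directions form
# an orthonormal basis.
rng = np.random.default_rng(5)
X = (rng.normal(size=(500, 2)) @ rng.normal(size=(2, 8))
     + 0.1 * rng.normal(size=(500, 8)))        # near-linear low-dimensional manifold
Xc = X - X.mean(axis=0)
_, S, Vt = np.linalg.svd(Xc, full_matrices=False)

v1 = Vt[0]                                     # first principal direction
var_v1 = (Xc @ v1).var()
for _ in range(1000):                          # compare against random unit directions
    u = rng.normal(size=8)
    u /= np.linalg.norm(u)
    assert (Xc @ u).var() <= var_v1 + 1e-9
print(np.allclose(Vt @ Vt.T, np.eye(8)))       # orthonormal basis set
```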
Functional Principal Component Analysis for Extrapolating Multi-stream Longitudinal Data
The advance of modern sensor technologies enables collection of multi-stream longitudinal data where multiple signals from different units are collected in real-time. In this article, we present a non-parametric approach to predict the evolution of multi-stream longitudinal data for an in-service unit through borrowing strength from other historical units. Our approach first decomposes each stream into a linear combination of eigenfunctions and their corresponding functional principal component (FPC) scores. A Gaussian process prior for the FPC scores is then established based on a functional semi-metric that measures similarities between streams of historical units and the in-service unit. Finally, an empirical Bayesian updating strategy is derived to update the established prior using real-time stream data obtained from the in-service unit. Experiments on synthetic and real world data show that the proposed framework outperforms state-of-the-art approaches and can effectively account for heterogeneity as well as achieve high predictive accuracy.
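A bare-bones discretized version of the first step, decomposing streams into a mean curve, empirical eigenfunctions, and per-stream FPC scores, can be sketched as follows (synthetic streams on a common grid; the Gaussian-process prior and the Bayesian updating are not reproduced):

```python
import numpy as np

# Discretized functional PCA: mean curve + eigenfunctions + FPC scores.
rng = np.random.default_rng(6)
t = np.linspace(0, 1, 50)
# 40 synthetic historical streams sampled on a common grid.
Y = (np.outer(rng.normal(size=40), np.sin(2 * np.pi * t))
     + np.outer(rng.normal(size=40), t)
     + 0.05 * rng.normal(size=(40, 50)))

mean_curve = Y.mean(axis=0)
Yc = Y - mean_curve
C = Yc.T @ Yc / len(Y)                          # empirical covariance on the grid
vals, vecs = np.linalg.eigh(C)
phi = vecs[:, ::-1][:, :2]                      # top-2 eigenfunctions (discretized)
scores = Yc @ phi                               # FPC scores for each stream

# Each stream is reconstructed from its scores: mean curve + linear combination
# of eigenfunctions, which is the decomposition the abstract describes.
Y_hat = mean_curve + scores @ phi.T
print(np.abs(Y - Y_hat).max())                  # small residual with 2 components
```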
Logistic principal component analysis via non-convex singular value thresholding
Song, Yipeng, Westerhuis, Johan A., Smilde, Age K.
Multivariate binary data is becoming abundant in current biological research. Logistic principal component analysis (PCA) is one of the commonly used tools to explore the relationships inside a multivariate binary data set by exploiting the underlying low rank structure. We re-expressed the logistic PCA model based on the latent variable interpretation of the generalized linear model on binary data. The multivariate binary data set is assumed to be the sign observation of an unobserved quantitative data set, on which a low rank structure is assumed to exist. However, the standard logistic PCA model (using an exact low rank constraint) is prone to overfitting, which can lead to divergence of some estimated parameters towards infinity. We propose to fit the logistic PCA model through non-convex singular value thresholding to alleviate the overfitting issue. An efficient Majorization-Minimization algorithm is implemented to fit the model, and a missing-value-based cross-validation (CV) procedure is introduced for model selection. Our experiments on realistic simulations of imbalanced binary data with low signal-to-noise ratios show that the CV-error-based model selection procedure successfully selects the proposed model. Furthermore, the selected model demonstrates superior performance in recovering the underlying low rank structure compared to models with a convex nuclear norm penalty or an exact low rank constraint. A binary copy number aberration data set is used to illustrate the proposed methodology in practice.
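Singular value thresholding itself is simple to sketch: take the SVD, shrink the spectrum, and rebuild the matrix. The soft-thresholding rule below corresponds to the convex nuclear-norm penalty the paper compares against, with hard thresholding shown as one simple non-convex alternative; the paper's specific non-convex penalty is not reproduced here:

```python
import numpy as np

def svt(Z, tau, hard=False):
    """Singular value thresholding: shrink the spectrum of Z and rebuild."""
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    if hard:
        S_shrunk = np.where(S > tau, S, 0.0)     # hard thresholding (non-convex)
    else:
        S_shrunk = np.maximum(S - tau, 0.0)      # soft thresholding (nuclear norm)
    return (U * S_shrunk) @ Vt

# Low-rank signal plus noise: thresholding suppresses the noise ranks.
rng = np.random.default_rng(7)
Z = rng.normal(size=(30, 3)) @ rng.normal(size=(3, 20)) + 0.1 * rng.normal(size=(30, 20))
print(np.linalg.matrix_rank(svt(Z, tau=2.0)))    # recovers the rank-3 structure
```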
The excluded area of two-dimensional hard particles
Geigenfeind, Thomas, Heras, Daniel de las
The excluded area between a pair of two-dimensional hard particles with given relative orientation is the region in which one particle cannot be located due to the presence of the other particle. The magnitude of the excluded area as a function of the relative particle orientation plays a major role in the determination of the bulk phase behaviour of hard particles. We use principal component analysis to identify the different types of excluded area corresponding to randomly generated two-dimensional hard particles modeled as non-self-intersecting polygons and star lines (line segments radiating from a common origin). Only three principal components are required to obtain an excellent representation of the value of the excluded area as a function of the relative particle orientation. Independently of the particle shape, the minimum value of the excluded area is always achieved when the particles are antiparallel to each other. The property that affects the value of the excluded area most strongly is the elongation of the particle shape. Principal component analysis identifies four limiting cases of excluded areas with one to four global minima at equispaced relative orientations. We study selected particle shapes using Monte Carlo simulations.
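Schematically, the PCA step amounts to sampling each excluded-area curve on a common grid of relative orientations, stacking the curves into a matrix, and keeping three components. The sketch below uses synthetic periodic curves in place of the real polygon and star-line calculations:

```python
import numpy as np

# Synthetic stand-ins for "excluded area vs. relative orientation" curves,
# with one to four minima over the full rotation (not the paper's real data).
rng = np.random.default_rng(8)
theta = np.linspace(0, 2 * np.pi, 180, endpoint=False)
curves = np.stack([1 + 0.5 * rng.random() * np.cos(k * theta)
                   + 0.05 * rng.normal(size=180)
                   for k in rng.integers(1, 5, size=200)])

mean_curve = curves.mean(axis=0)
U, S, Vt = np.linalg.svd(curves - mean_curve, full_matrices=False)
pcs = Vt[:3]                                     # three principal components
coeffs = (curves - mean_curve) @ pcs.T           # 3 coefficients per particle shape
recon = mean_curve + coeffs @ pcs
frac = 1 - ((curves - recon) ** 2).sum() / ((curves - mean_curve) ** 2).sum()
print(f"variance captured by 3 PCs: {frac:.3f}")
```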
Bandit Principal Component Analysis
Kotłowski, Wojciech, Neu, Gergely
We consider a partial-feedback variant of the well-studied online PCA problem where a learner attempts to predict a sequence of $d$-dimensional vectors in terms of a quadratic loss, while only having limited feedback about the environment's choices. We focus on a natural notion of bandit feedback where the learner only observes the loss associated with its own prediction. Based on the classical observation that this decision-making problem can be lifted to the space of density matrices, we propose an algorithm that is shown to achieve a regret of $O(d^{3/2}\sqrt{T})$ after $T$ rounds in the worst case. We also prove data-dependent bounds that improve on the basic result when the loss matrices of the environment have bounded rank or the loss of the best action is bounded. One version of our algorithm runs in $O(d)$ time per trial, which massively improves over every previously known online PCA method. We complement these results with a lower bound of $\Omega(d\sqrt{T})$.