AITopics

2011.05309

Country: North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
(2 more...)

#artificialintelligence Oct-26-2020, 07:11:07 GMT

Principal Component Analysis (PCA) with Scikit-learn

This is the second unsupervised machine learning algorithm that I'm discussing here. This time, the topic is Principal Component Analysis (PCA). At the beginning of the tutorial, I'll explain the dimensionality of a dataset, what dimensionality reduction means, the main approaches to dimensionality reduction, the reasons for using it, and what PCA means. Then, I will go deeper into PCA by implementing the algorithm with the Scikit-learn machine learning library. This will help you apply PCA to a real-world dataset and get results quickly. In a separate article, I will discuss the mathematics behind principal component analysis by manually executing the algorithm using the numpy and pandas libraries.
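As a rough idea of what applying PCA with Scikit-learn can look like, here is a minimal sketch. The data is synthetic and stands in for a real dataset (an assumption; the tutorial's own dataset is not shown here), and the choice of two components is arbitrary.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, not the tutorial's dataset):
# 200 samples with 5 features driven by 2 latent directions plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

# PCA is sensitive to feature scale, so standardize the features first.
X_scaled = StandardScaler().fit_transform(X)

# Keep the first two principal components (an arbitrary choice for this sketch).
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)                # (200, 2)
print(pca.explained_variance_ratio_)  # fraction of variance kept by each component
```

The `explained_variance_ratio_` attribute is a convenient way to judge how many components are worth keeping for a given dataset.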

dataset, health & medicine, oncology, (19 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.82)

#artificialintelligence Oct-25-2020, 00:25:28 GMT

Principal Component Analysis (PCA)

During the data mining process, we are given raw data. Before visualizing or interpreting the data, we have to apply certain refinement methods to make it ready for analysis. This refinement process begins with preprocessing, or cleaning the data, such as removing null or blank values. Next comes feature selection or feature extraction, which is where PCA is used: the least contributing features are neglected or removed as required. The last stage is data transformation, where the user applies normalization techniques to scale all the features to the same range.
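The refinement stages described above can be sketched with NumPy alone. The toy array, the 95% variance threshold, and the variable names are all hypothetical. Note one deliberate difference in ordering: this sketch standardizes the features before running PCA, a common practice because PCA is sensitive to feature scale.

```python
import numpy as np

# Toy raw data with a missing value (hypothetical, for illustration only).
raw = np.array([
    [1.0, 200.0, 0.5],
    [2.0, np.nan, 0.7],   # row containing a null value
    [3.0, 260.0, 0.9],
    [4.0, 310.0, 1.2],
    [5.0, 330.0, 1.4],
])

# 1. Cleaning: drop every row that contains a null (NaN) value.
clean = raw[~np.isnan(raw).any(axis=1)]

# 2. Transformation: standardize each feature to zero mean and unit variance.
scaled = (clean - clean.mean(axis=0)) / clean.std(axis=0)

# 3. Feature extraction: PCA via the SVD of the standardized data.
U, s, Vt = np.linalg.svd(scaled, full_matrices=False)
explained_ratio = s**2 / np.sum(s**2)

# Keep the fewest components that explain at least 95% of the variance
# (the threshold is an assumption chosen for this sketch).
k = int(np.searchsorted(np.cumsum(explained_ratio), 0.95) + 1)
reduced = scaled @ Vt[:k].T

print(clean.shape, reduced.shape)
```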

artificial intelligence, data mining, eigenvector, (12 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.41)

Guo, Yiping, Bondell, Howard D.

On Robust Probabilistic Principal Component Analysis using Multivariate $t$-Distributions

arXiv.org Machine Learning Oct-21-2020

Principal Component Analysis (PCA) is a common multivariate statistical analysis method, and Probabilistic Principal Component Analysis (PPCA) is its probabilistic reformulation under the framework of a Gaussian latent variable model. To improve the robustness of PPCA, it has been proposed to change the underlying Gaussian distributions to multivariate $t$-distributions. Based on the representation of the $t$-distribution as a scale mixture of Gaussians, a hierarchical model is used for implementation. However, although the robust PPCA methods work reasonably well for some simulation studies and real data, the hierarchical model implemented does not yield the equivalent interpretation. In this paper, we present a set of equivalent relationships between those models, and discuss the performance of robust PPCA methods using different multivariate $t$-distributed structures through several simulation studies. In doing so, we clarify a current misrepresentation in the literature, and make connections between a set of hierarchical models for robust PPCA.
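For orientation, the standard textbook forms of the two ingredients named in the abstract are written out below. The notation ($W$, $\mu$, $\sigma^2$, $\nu$, $u$) is an assumption chosen here; the paper's own hierarchical variants may be parameterized differently and are not reproduced.

```latex
% Gaussian PPCA: latent z in R^q, observation x in R^d
z \sim \mathcal{N}(0, I_q), \qquad
x \mid z \sim \mathcal{N}(W z + \mu,\ \sigma^2 I_d)
\quad\Longrightarrow\quad
x \sim \mathcal{N}(\mu,\ W W^{\top} + \sigma^2 I_d)

% Multivariate t as a scale mixture of Gaussians (Gamma in shape-rate form)
u \sim \mathrm{Gamma}\!\left(\tfrac{\nu}{2}, \tfrac{\nu}{2}\right), \qquad
x \mid u \sim \mathcal{N}\!\left(\mu,\ \tfrac{1}{u}\,\Sigma\right)
\quad\Longrightarrow\quad
x \sim t_{\nu}(\mu, \Sigma)
```

Whether the latent scale $u$ is shared between $z$ and the noise or drawn separately for each is exactly the kind of modeling choice that leads to different hierarchical models.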

artificial intelligence, machine learning, multivariate t-distribution, (16 more...)

2010.10786

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.81)

arXiv.org Artificial Intelligence Oct-14-2020

Rapid Robust Principal Component Analysis: CUR Accelerated Inexact Low Rank Estimation

Cai, HanQin, Hamm, Keaton, Huang, Longxiu, Li, Jiaqi, Wang, Tao

Robust principal component analysis (RPCA) is a widely used tool for dimension reduction. In this work, we propose a novel non-convex algorithm, coined Iterated Robust CUR (IRCUR), for solving RPCA problems, which dramatically improves the computational efficiency in comparison with the existing algorithms. IRCUR achieves this acceleration by employing CUR decomposition when updating the low rank component, which allows us to obtain an accurate low rank approximation via only three small submatrices. Consequently, IRCUR is able to process only the small submatrices and avoid expensive computing on the full matrix throughout the entire algorithm. Numerical experiments establish the computational advantage of IRCUR over the state-of-the-art algorithms on both synthetic and real-world datasets.
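The CUR decomposition mentioned in the abstract can be illustrated on its own. The sketch below is a generic CUR reconstruction of an exactly low-rank matrix, not the IRCUR algorithm; the matrix sizes, the uniform sampling, and the oversampling factor are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 60, 50, 3

# An exactly rank-r matrix (synthetic, for illustration).
A = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))

# Sample a few more rows and columns than the rank (factor 4 is arbitrary).
rows = rng.choice(m, size=4 * r, replace=False)
cols = rng.choice(n, size=4 * r, replace=False)

C = A[:, cols]               # selected columns of A
R = A[rows, :]               # selected rows of A
U = A[np.ix_(rows, cols)]    # intersection of the selected rows and columns

# CUR reconstruction: A is approximated by C @ pinv(U) @ R.
# An explicit rcond discards singular values that are pure rounding noise.
A_cur = C @ np.linalg.pinv(U, rcond=1e-10) @ R

rel_err = np.linalg.norm(A - A_cur) / np.linalg.norm(A)
print(rel_err)
```

When the intersection block `U` has the same rank as `A`, the reconstruction is exact up to rounding, which is why only a handful of rows and columns are needed for a low-rank matrix.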

artificial intelligence, decomposition, machine learning, (16 more...)


2010.07422

Country:

North America > United States > California (0.28)
Asia > China > Guangdong Province (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.62)

#artificialintelligence Oct-9-2020, 04:10:29 GMT

Understanding Principal Component Analysis

Machine learning (ML) is a subset of artificial intelligence (AI) that gives systems the ability to automatically learn and improve from experience without being explicitly programmed. The algorithms employed within ML are used to find patterns in data that generate insight and help make data-driven decisions and predictions. These types of algorithms are used every day to make critical decisions in medical diagnosis, stock trading, transportation, legal matters and much more. It is therefore easy to see why data scientists place ML on such a high pedestal: it provides a medium for high-priority decisions that can guide better business and smarter actions in real time without much human intervention. To learn, ML models use computational methods to understand information directly from data without relying on a predetermined equation.

artificial intelligence, machine learning, principal component, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.44)

arXiv.org Machine Learning Oct-9-2020

Machine Learning approach to muon spectroscopy analysis

Tula, T., Möller, G., Quintanilla, J., Giblin, S. R., Hillier, A. D., McCabe, E. E., Ramos, S., Barker, D. S., Gibson, S.

In recent years, artificial intelligence techniques have proved to be very successful when applied to problems in physical sciences. Here we apply an unsupervised machine learning (ML) algorithm called principal component analysis (PCA) as a tool to analyse the data from muon spectroscopy experiments. Specifically, we apply the ML technique to detect phase transitions in various materials. The measured quantity in muon spectroscopy is an asymmetry function, which may hold information about the distribution of the intrinsic magnetic field in combination with the dynamics of the sample. Sharp changes in the shape of asymmetry functions, measured at different temperatures, might indicate a phase transition. Existing methods of processing the muon spectroscopy data are based on regression analysis, but choosing the right fitting function requires knowledge about the underlying physics of the probed material. Conversely, principal component analysis focuses on small differences in the asymmetry curves and works without any prior assumptions about the studied samples. We discovered that the PCA method works well in detecting phase transitions in muon spectroscopy experiments and can serve as an alternative to current analysis, especially if the physics of the studied material is not entirely known. Additionally, we found that our ML technique seems to work best with large numbers of measurements, regardless of whether the algorithm takes data only for a single material or whether the analysis is performed simultaneously for many materials with different physical properties.

artificial intelligence, machine learning, phase transition, (16 more...)

2010.04742

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Upadhyay, Jalaj, Upadhyay, Sarvagya

A Framework for Private Matrix Analysis

arXiv.org Machine Learning Sep-6-2020

We study private matrix analysis in the sliding window model where only the last $W$ updates to matrices are considered useful for analysis. We give the first efficient $o(W)$ space differentially private algorithms for spectral approximation, principal component analysis, and linear regression. We also initiate and show efficient differentially private algorithms for two important variants of principal component analysis: sparse principal component analysis and non-negative principal component analysis. Prior to our work, no such result was known for sparse and non-negative differentially private principal component analysis even in the static data setting. These algorithms are obtained by identifying sufficient conditions on positive semidefinite matrices formed from streamed matrices. We also show a lower bound on the space required to compute a low-rank approximation even if the algorithm gives a multiplicative approximation and incurs additive error. This follows via reduction to a certain communication complexity problem.

banking & finance, null, survey article, (18 more...)

2009.02668

Country: North America > United States > Massachusetts (0.27)

Genre:

Overview (0.67)
Research Report (0.50)

Industry:

Information Technology > Security & Privacy (0.92)
Banking & Finance (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (1.00)

arXiv.org Machine Learning Sep-3-2020

Fast algorithms for robust principal component analysis with an upper bound on the rank

Sha, Ningyu, Shi, Lei, Yan, Ming

Robust principal component analysis (RPCA) decomposes a data matrix into a low-rank part and a sparse part. There are mainly two types of algorithms for RPCA. The first type applies regularization terms to the singular values of a matrix to obtain a low-rank matrix. However, calculating singular values can be very expensive for large matrices. The second type replaces the low-rank matrix with the product of two small matrices. These algorithms are faster than the first type because no singular value decomposition (SVD) is required. However, the rank of the low-rank matrix must be specified, and an accurate rank estimate is needed to obtain a reasonable solution. In this paper, we propose algorithms that combine both types. Our proposed algorithms require an upper bound on the rank and SVD on small matrices. First, they are faster than the first type because the cost of SVD on small matrices is negligible. Second, they are more robust than the second type because only an upper bound on the rank, instead of the exact rank, is required. Furthermore, we apply the Gauss-Newton method to increase the speed of our algorithms. Numerical experiments show the better performance of our proposed algorithms.
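The two families of algorithms can be written schematically as follows. These are common textbook formulations, given here as an assumption for orientation; the symbols ($M$, $L$, $S$, $U$, $V$, $\lambda$, $r$) are chosen here and the paper's exact objectives may differ.

```latex
% Decomposition of the data matrix into low-rank and sparse parts
M = L + S, \qquad L \text{ low-rank}, \quad S \text{ sparse}

% Type 1: regularize the singular values of L (e.g. the nuclear norm)
\min_{L,\,S}\ \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{subject to} \quad L + S = M,
\qquad \|L\|_{*} = \sum_{i} \sigma_i(L)

% Type 2: factor L as a product of two small matrices of known rank r
\min_{U,\,V,\,S}\ \tfrac{1}{2}\,\|M - U V^{\top} - S\|_{F}^{2} + \lambda \|S\|_{1},
\qquad U \in \mathbb{R}^{m \times r},\ V \in \mathbb{R}^{n \times r}
```

The first form needs an SVD of an $m \times n$ matrix at each iteration, while the second avoids it but fixes $r$ in advance, which is the trade-off the abstract describes.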

algorithm, artificial intelligence, machine learning, (17 more...)

2008.07972

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.62)

#artificialintelligence Aug-31-2020, 20:16:16 GMT

Risks and Caution on applying PCA for Supervised Learning Problems

The curse of dimensionality is a crucial problem when dealing with real-life datasets, which are generally high-dimensional. As the dimensionality of the feature space increases, the number of configurations can grow exponentially, and thus the number of configurations covered by an observation decreases. In such a scenario, Principal Component Analysis plays a major part in efficiently reducing the dimensionality of the data while retaining as much as possible of the variation present in the data set. Let us give a very brief introduction to Principal Component Analysis before delving into the actual problem. The central idea of Principal Component Analysis (PCA) is to reduce the dimensionality of a data set consisting of a large number of correlated variables, while retaining the maximum possible variation present in the data set.
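One way such a risk can arise is sketched below. This is an illustration built on synthetic data, not necessarily the article's own argument: because PCA keeps the directions of largest variance without looking at the labels, a low-variance feature that determines the label can end up in a discarded component. All names and scales are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# High-variance feature that is unrelated to the label.
noise_feature = rng.normal(scale=10.0, size=n)
# Low-variance feature that fully determines the label.
signal_feature = rng.normal(scale=1.0, size=n)

X = np.column_stack([noise_feature, signal_feature])
y = (signal_feature > 0).astype(float)

# PCA via SVD of the centered data; rows of Vt are the principal directions.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Xc @ Vt[0]   # largest-variance component (what a 1-component PCA keeps)
pc2 = Xc @ Vt[1]   # smallest-variance component (what it discards)

corr_pc1 = abs(np.corrcoef(pc1, y)[0, 1])
corr_pc2 = abs(np.corrcoef(pc2, y)[0, 1])
print(corr_pc1, corr_pc2)
```

In this construction the retained component is almost uncorrelated with the label while the discarded one carries nearly all of the predictive signal, so a model trained on the reduced data would perform poorly.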

artificial intelligence, machine learning, regression, (17 more...)

Genre: Research Report (0.31)

Industry: Education > Focused Education > Special Education (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.69)