 eigen vector



Dimensionality Reduction: Principal Component Analysis

#artificialintelligence

A dataset is made up of a number of features. As long as these features are related in some way to the target and are optimal in number, a machine learning model will be able to produce decent results after learning from the data. But if the number of features is high and most of them do not contribute to the model's learning, the model's performance goes down and the time taken to output predictions increases. Transforming the original feature space into a lower-dimensional subspace is one way of performing dimensionality reduction, and Principal Component Analysis (PCA) does this. So let's take a look at the concepts PCA is built on.
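To make the idea of projecting onto a subspace concrete, here is a minimal sketch in plain Python. It is not taken from the article: the toy data, the function names, and the use of power iteration to find the leading eigenvector of the covariance matrix are all assumptions made for illustration.

```python
import math

def mean_center(rows):
    """Subtract the per-feature mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance_matrix(centered):
    """Sample covariance matrix (divides by n - 1) of mean-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_eigenvector(matrix, iterations=200):
    """Approximate the leading eigenvector by power iteration."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(centered, direction):
    """Project each centered row onto a single unit direction."""
    return [sum(a * b for a, b in zip(r, direction)) for r in centered]

# Toy data, made up for illustration: the second feature is roughly twice the first.
data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0], [5.0, 10.1]]
centered = mean_center(data)
cov = covariance_matrix(centered)
pc1 = top_eigenvector(cov)         # first principal direction
reduced = project(centered, pc1)   # 2 features reduced to 1
```

Because the two toy features are almost perfectly correlated, the single projected coordinate keeps nearly all of the variance in the data, which is the situation in which dropping dimensions costs little.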


Stochastic Approximation Algorithms for Principal Component Analysis

arXiv.org Machine Learning

Principal Component Analysis (PCA) is a novel way of dimensionality reduction. This problem essentially boils down to finding the top k eigenvectors of the data covariance matrix. A considerable amount of literature exists on algorithms meant to do so, such as an online method by Warmuth and Kuzmin, Matrix Stochastic Gradient by Arora, Oja's method, and many others. In this paper we look at some of these stochastic approaches to the PCA optimization problem and comment on their convergence and runtime to obtain an ε-suboptimal solution. We revisit convex relaxation based methods for stochastic optimization of principal component analysis (PCA). While methods that directly solve the nonconvex problem have been shown to be optimal in terms of statistical and computational efficiency, the methods based on convex relaxation have been shown to enjoy comparable, or even superior, empirical performance; this motivates the need for a deeper formal understanding of the latter.
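As a rough illustration of the stochastic flavor of these methods, the sketch below implements a common textbook form of Oja's update for the top eigenvector (k = 1), renormalizing after every sample. It is an assumption-laden toy, not the algorithm analyzed in the paper: the function name, the fixed learning rate, and the synthetic zero-mean stream are all hypothetical choices.

```python
import math
import random

def oja_top_component(stream, dim, learning_rate=0.01, seed=0):
    """Estimate the top principal direction from a stream of zero-mean samples."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(wi * wi for wi in w))
    w = [wi / norm for wi in w]
    for x in stream:
        # y is the projection of the sample onto the current estimate.
        y = sum(wi * xi for wi, xi in zip(w, x))
        # Stochastic gradient step on the variance objective, then renormalize.
        w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]
    return w

# Synthetic stream (illustrative only): variance 9 along axis 0, 0.25 along axis 1.
data_rng = random.Random(1)
samples = [(3.0 * data_rng.gauss(0.0, 1.0), 0.5 * data_rng.gauss(0.0, 1.0))
           for _ in range(5000)]
w_hat = oja_top_component(samples, dim=2)
```

Each sample is used once and then discarded, so memory stays proportional to the dimension rather than to the dataset size. The estimate should align with axis 0 up to sign, since an eigenvector and its negation describe the same direction.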


Principal Component Analysis: Your Tutorial and Code

#artificialintelligence

Your data is the life-giving fuel to your Machine Learning model. There are always many ML techniques to choose from and apply to a particular problem, but without a lot of good data you won't get very far. Data is often the driver behind most of your performance gains in a Machine Learning application. Sometimes that data can be complicated. You have so much of it that it may be challenging to understand what it all means and which parts are actually important.


Understanding Principal Component Analysis – Hacker Noon

#artificialintelligence

The purpose of this post is to give the reader a detailed understanding of Principal Component Analysis with the necessary mathematical proofs. In real-world data analysis tasks we analyze complex, i.e. multi-dimensional, data. We plot the data and find various patterns in it, or use it to train machine learning models. One way to think about dimensions: suppose you have a data point x. If we consider this data point as a physical object, then dimensions are merely a basis of view, like where the data is located when it is observed from the horizontal axis or the vertical axis. As the number of dimensions increases, the difficulty of visualizing the data and performing computations on it also increases. Variance: a measure of variability; it simply measures how spread out the data set is. Mathematically, it is the average squared deviation from the mean.
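The definition of variance above can be checked with a few lines of Python. This is a small sketch with made-up numbers; it uses the population form (dividing by the number of values), which matches "average squared deviation from the mean".

```python
def variance(values):
    """Population variance: the average squared deviation from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Illustrative values only; their mean is 5.0.
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(variance(scores))  # 4.0
```

The squared deviations here are 9, 1, 1, 1, 0, 0, 4 and 16, which sum to 32; dividing by the 8 values gives 4.0.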


Principal Component Analysis

#artificialintelligence

In this post, we will learn about Principal Component Analysis (PCA) -- a popular dimensionality reduction technique in Machine Learning. Our goal is to form an intuitive understanding of PCA without going into all the mathematical details. At the time of writing this post, the population of the United States is roughly 325 million. You may think millions of people will have a million different ideas, opinions, and thoughts; after all, every person is unique. Let's say you select the top 20 political questions in the United States and ask millions of people to answer them with a yes or a no.


Dimensional Reduction and Principal Component Analysis -- II

@machinelearnbot

In the previous post, we saw why we should be interested in Principal Component Analysis. In this post, we will take a deep dive and see how it is implemented. Now that you have some idea of how to reduce higher dimensions to lower dimensions, we will go through the description below, which is shown in a Jupyter notebook. I have downloaded data from Quandl for three companies listed on the Indian stock market. We will try to understand the Indian ecosystem using this.