Invariance principle of random projection for the norm

Duan, JunTao

arXiv.org Machine Learning 

Driven by the growth of the internet and advances in computing over the last few decades, data collection and storage have grown exponentially. As the demand for mining 'gold' from this enormous amount of data reaches a new level, we face many technical challenges in understanding the information we have collected. In many cases, including text and images, data can be represented as points or vectors in a high-dimensional space. On the one hand, it is very easy to collect more and more information about an object, so the dimensionality grows quickly. On the other hand, it is very difficult to analyze high-dimensional data and build useful models for it, for several reasons, including the computational difficulty caused by the curse of dimensionality and a high noise-to-signal ratio. It is therefore necessary to reduce the dimensionality of the data while preserving the relevant structures. The celebrated Johnson-Lindenstrauss lemma [6] states that random projections can be used as a general dimension reduction technique to embed topological structures in a high-dimensional Euclidean space into a low-dimensional space without distorting their topology. Let us first recall the Johnson-Lindenstrauss lemma [4].