Non-Stationary Streaming PCA

Bienstock, Daniel, Shukla, Apurv, Yun, SeYoung

arXiv.org Machine Learning 

Principal component analysis is one of the most extensively studied methods for constructing linear low-dimensional representation of high-dimensional data. Modern applications such as privacy presevering distributedcomputations (Hardt and Roth (2013)), covariance estimtion of high-frequency data (Chang et al. (2018),Aït-Sahalia et al. (2010)), detecting power grid attacks (Bienstock and Shukla (2018), Escobar et al. (2018)) etc. require design of sub-linear time algorithms with low storage overhead. Existing workon PCA has focused on design and analysis of single-pass (streaming) algorithms with nearoptimal memoryand storage complexity assuming stationarity of the underlying data-generating process. However, physical systems generating data for such applications undergo rapid evolution. For example, dynamic market behaviour leads to time-series data with volatile covariance matrices. Our understanding of such physical system crucially relies on accurate estimation of the data generating space.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found