Non-Stationary Streaming PCA
Bienstock, Daniel, Shukla, Apurv, Yun, SeYoung
Principal component analysis is one of the most extensively studied methods for constructing linear low-dimensional representation of high-dimensional data. Modern applications such as privacy presevering distributedcomputations (Hardt and Roth (2013)), covariance estimtion of high-frequency data (Chang et al. (2018),Aït-Sahalia et al. (2010)), detecting power grid attacks (Bienstock and Shukla (2018), Escobar et al. (2018)) etc. require design of sub-linear time algorithms with low storage overhead. Existing workon PCA has focused on design and analysis of single-pass (streaming) algorithms with nearoptimal memoryand storage complexity assuming stationarity of the underlying data-generating process. However, physical systems generating data for such applications undergo rapid evolution. For example, dynamic market behaviour leads to time-series data with volatile covariance matrices. Our understanding of such physical system crucially relies on accurate estimation of the data generating space.
Feb-8-2019