Reviews: PCA of high dimensional random walks with comparison to neural network training

Neural Information Processing Systems 

Motivated by the problem of visualizing the loss landscape of a deep neural network during training, and by the heuristic of performing PCA on the sequence of parameters produced by stochastic gradient descent, this paper considers a simplified model in which the data come from a simple random walk in Euclidean space instead of a NN. The authors attempt to justify the heuristic by asking what the projection of this walk onto its first few principal components should look like. They perform an asymptotic analysis and conclude that most of the variance of the walk is captured by its first few components. This reasoning is then extended to the case of a discrete Ornstein-Uhlenbeck process. Finally, the authors show that their findings are reasonably accurate when compared to data coming from a NN trained on real-world datasets. The idea of performing PCA on the output of a NN for purposes of visualization is an interesting one, and this paper makes a first step towards understanding this proposal through the analysis of a very simple model.
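The claim that a few components capture most of the variance is easy to check numerically. The following is a minimal sketch and not the authors' code: it assumes i.i.d. Gaussian steps, and the dimension, number of steps, seed, and function names (`random_walk`, `explained_variance_ratio`) are illustrative choices made here.

```python
import numpy as np


def random_walk(n_steps, dim, rng):
    """Simple random walk: cumulative sum of i.i.d. standard normal steps (assumed step law)."""
    steps = rng.standard_normal((n_steps, dim))
    return np.cumsum(steps, axis=0)


def explained_variance_ratio(X):
    """Fraction of total variance along each principal component of the rows of X."""
    # Center each coordinate over time, as PCA requires.
    Xc = X - X.mean(axis=0, keepdims=True)
    # Singular values come back in descending order; their squares are
    # proportional to the variance along each principal direction.
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s ** 2
    return var / var.sum()


rng = np.random.default_rng(0)                   # fixed seed, illustrative only
walk = random_walk(n_steps=500, dim=1000, rng=rng)
ratios = explained_variance_ratio(walk)
top3 = ratios[:3].sum()

print("explained variance, first 5 components:", np.round(ratios[:5], 3))
print("cumulative share of first 3 components:", round(float(top3), 3))
```

Under these assumptions, the printed ratios should decay quickly, with the leading components accounting for the bulk of the variance. Replacing `random_walk` with a mean-reverting update would give a comparable check for the discrete Ornstein-Uhlenbeck case.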