Appendix A: Relation to PCA

Neural Information Processing Systems 

PCA: We show that, under strict conditions, applying PCA to an intermediate layer and using the principal components as concept vectors maximizes the completeness score. We note that the assumptions behind this proposition are extremely stringent and may not hold in general. When the isometry and other assumptions fail, PCA no longer maximizes the completeness score, because the lowest reconstruction error at an intermediate layer does not imply the highest prediction accuracy at the output. In fact, DNNs are known to be very sensitive to small perturbations in the input [Narodytska and Kasiviswanathan, 2017]: they can yield very different outputs even when the difference in the input is small (and often perceptually hard for humans to recognize). Thus, even when the reconstruction loss between two inputs is low at an intermediate layer, subsequent deep nonlinear processing may cause their outputs to diverge significantly.
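The gap between intermediate-layer reconstruction error and output fidelity can be sketched with a toy example. This is not the paper's pipeline; the anisotropic activations and the `later_layers` function are assumptions chosen purely for illustration. PCA's top components capture the high-variance directions and achieve low reconstruction error, yet a later saturating nonlinearity that reads a low-variance coordinate still produces very different outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 10, 3

# Hypothetical intermediate-layer activations: three high-variance
# directions plus several low-variance ones (chosen for illustration).
scales = np.array([10.0, 10.0, 10.0, 0.5] + [0.1] * 6)
H = rng.normal(size=(n, d)) * scales

def later_layers(h):
    # Stand-in for the rest of the network: a saturating nonlinearity
    # that reads a LOW-variance coordinate, so small differences at the
    # intermediate layer are amplified in the output.
    return np.tanh(5.0 * h[:, 3])

# PCA via SVD on centered activations; rows of Vt are principal directions.
mu = H.mean(axis=0)
Hc = H - mu
_, _, Vt = np.linalg.svd(Hc, full_matrices=False)
P = Vt[:k]                              # top-k principal components

H_rec = (Hc @ P.T) @ P + mu             # rank-k PCA reconstruction

recon_mse = np.mean((H - H_rec) ** 2)
output_mse = np.mean((later_layers(H) - later_layers(H_rec)) ** 2)
print(f"intermediate reconstruction MSE: {recon_mse:.4f}")
print(f"output MSE after later layers:   {output_mse:.4f}")
```

In this construction the rank-k reconstruction error at the intermediate layer is small, while the discrepancy after the nonlinear stage is much larger, mirroring the argument above that low intermediate reconstruction loss need not translate into high output fidelity.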