Review for NeurIPS paper: Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction
–Neural Information Processing Systems
Weaknesses:
- My main concern is that I don't see the benefit of modeling the data as a union of subspaces, with each subspace corresponding to a class, when the representation space is *learned*; in particular, on real data these subspaces will not be orthogonal in practice. In an unsupervised setting, recovering the subspaces requires subspace clustering, which is a hard and computationally expensive problem. In stark contrast, a linear head trained with a cross-entropy loss learns a representation space with approximately linearly separable regions for each class. As a consequence, classification is simple (linear) and Lp distances in the representation space are meaningful, which is not necessarily the case when the classes lie on a union of subspaces. Moreover, there are many other methods that can make neural networks with a linear classification head more robust, for example [c].
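To make the contrast concrete, here is a minimal sketch (a hypothetical construction with synthetic data, not taken from the paper under review) of why classification on a union of subspaces is not a single linear decision: one classifies by comparing projection residuals onto each class subspace, a nearest-subspace rule, rather than by thresholding one linear functional.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two illustrative 2-D class subspaces inside R^5 (orthonormal bases via QR).
U1, _ = np.linalg.qr(rng.standard_normal((5, 2)))
U2, _ = np.linalg.qr(rng.standard_normal((5, 2)))

def sample(U, n):
    # Draw n noiseless points lying exactly on the subspace spanned by U.
    return (U @ rng.standard_normal((U.shape[1], n))).T

X = np.vstack([sample(U1, 50), sample(U2, 50)])
y = np.array([0] * 50 + [1] * 50)

def nearest_subspace(x, bases):
    # Classify by the smallest residual after orthogonal projection
    # onto each candidate subspace.
    residuals = [np.linalg.norm(x - U @ (U.T @ x)) for U in bases]
    return int(np.argmin(residuals))

preds = np.array([nearest_subspace(x, [U1, U2]) for x in X])
acc = (preds == y).mean()
```

On this noiseless toy data the nearest-subspace rule is exact, but note that it already requires knowing (or clustering to recover) the subspace bases, which is the cost the review points to in the unsupervised setting.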
Jan-25-2025, 09:19:45 GMT