Appendix for: Data-Aware Low-Rank Compression for Large NLP Models

A Proof of Theorem 1

Theorem 1
