Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models
