When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models
