ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals
