AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models
