MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models
