Mixed-Precision Quantization for Language Models: Techniques and Prospects
