APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models
