Norm Tweaking: High-performance Low-bit Quantization of Large Language Models

Open in new window