Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models
