Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models
