Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models

Neural Information Processing Systems 

We find that a small subset of "cherry" parameters exhibit a