QuIP: 2-Bit Quantization of Large Language Models With Guarantees

Neural Information Processing Systems

We introduce quantization with incoherence processing (QuIP), a new method based on the insight that quantization benefits from incoherent weight and Hessian matrices, i.e., from the weights being even in magnitude and the