QuIP: 2-Bit Quantization of Large Language Models With Guarantees

Neural Information Processing Systems 

We introduce quantization with incoherence processing (QuIP), a new method based on the insight that quantization benefits from incoherent weight and Hessian matrices, i.e., from the weights being even in magnitude and the
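As a rough illustration of what "even in magnitude" can mean for a weight matrix, the sketch below compares the largest entry magnitude with the root-mean-square entry magnitude. The function names, the specific ratio, and the use of random orthogonal matrices drawn via QR are assumptions made for this example; they are not taken from the excerpt above and should not be read as the paper's exact definition or procedure.

```python
import numpy as np


def magnitude_evenness(W):
    """Largest |entry| divided by the RMS entry magnitude of a nonzero matrix.

    Hypothetical helper for illustration. The value is 1.0 when all entries
    share the same magnitude and grows as the mass concentrates in few entries.
    """
    W = np.asarray(W, dtype=float)
    rms = np.linalg.norm(W) / np.sqrt(W.size)  # Frobenius norm / sqrt(m * n)
    return float(np.max(np.abs(W)) / rms)


def random_orthogonal(n, rng):
    """Draw an n x n orthogonal matrix from the QR factorization of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q


rng = np.random.default_rng(0)

# A maximally uneven matrix: all of the mass sits in a single entry.
W_spike = np.zeros((64, 64))
W_spike[3, 7] = 1.0

# Multiplying by orthogonal matrices on both sides preserves the Frobenius
# norm while spreading the mass across many entries.
U = random_orthogonal(64, rng)
V = random_orthogonal(64, rng)
W_rot = U @ W_spike @ V.T

ratio_even = magnitude_evenness(np.ones((4, 4)))
ratio_spike = magnitude_evenness(W_spike)
ratio_rot = magnitude_evenness(W_rot)
print(ratio_even, ratio_spike, ratio_rot)
```

A lower ratio means no single weight dominates, which is the intuition behind a matrix being easier to round to a small set of quantization levels.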
