LLM-FP4: 4-Bit Floating-Point Quantized Transformers
