Q-Palette: Fractional-Bit Quantizers Toward Optimal Bit Allocation for Efficient LLM Deployment
Deokjae Lee1,2 Hyun Oh Song1,2
