SDQ: Sparse Decomposed Quantization for LLM Inference
