QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
