SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
