Tequila: Trapping-free Ternary Quantization for Large Language Models
