The Fourth State: Signed-Zero Ternary for Stable LLM Quantization (and More)
arXiv.org Artificial Intelligence
Quantization is typically viewed as a pragmatic trade-off between model fidelity and computational cost [2, 3, 13]. Aggressive 2-bit ternary schemes are now commonly used to let large language models (LLMs) run on commodity accelerators and edge devices. However, they cause training-time problems in intervals where the quantizer output is numerically zero and the surrogate gradient vanishes. These near-zero intervals are referred to as "dead zones" [11]. We introduce Signed-Zero Ternary (SZT) quantization, which uses the otherwise unused fourth state of the 2-bit ternary encoding to distinguish two zero states (code words). This approach retains the benefits of ternary quantization while adding one bit of gradient information at essentially no cost. It preserves the forward-path behavior of balanced ternary, and the back-propagation rule remains fully deterministic for the straight-through form. We argue that the availability of gradient information in this maximally quantized representation may tend to maximize overall information density rather than approximate it. All analytical results are obtained via changes to the encode/decode logic only, leaving the matrix-multiply datapath untouched, i.e., 1
Aug-11-2025
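To make the idea of two zero code words concrete, the following is a minimal sketch of a signed-zero encode/decode pair. It is not the paper's implementation. The sign-magnitude bit layout, the threshold `delta`, and the function names `szt_encode`, `szt_decode`, and `szt_zero_sign` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

# Assumed 2-bit sign-magnitude layout (hypothetical, not taken from the paper):
#   bit 1 = sign, bit 0 = magnitude
#   0b00 -> +0, 0b01 -> +1, 0b10 -> -0, 0b11 -> -1


def szt_encode(w, delta):
    """Map latent weights to 2-bit codes; `delta` is an assumed dead-zone threshold."""
    w = np.asarray(w, dtype=np.float64)
    sign_bit = (w < 0).astype(np.uint8)           # which side of zero the weight lies on
    mag_bit = (np.abs(w) > delta).astype(np.uint8)  # 1 outside the dead zone, 0 inside
    return (sign_bit << 1) | mag_bit


def szt_decode(codes):
    """Forward-path values in {-1, 0, +1}; both zero codes decode to 0."""
    codes = np.asarray(codes)
    sign = 1 - 2 * ((codes >> 1) & 1).astype(np.int8)
    mag = (codes & 1).astype(np.int8)
    return sign * mag


def szt_zero_sign(codes):
    """Side information carried by the zero states: +1 or -1 for zeros, 0 otherwise."""
    codes = np.asarray(codes)
    sign = 1 - 2 * ((codes >> 1) & 1).astype(np.int8)
    is_zero = (codes & 1) == 0
    return np.where(is_zero, sign, 0).astype(np.int8)


weights = [-0.9, -0.02, 0.03, 0.7]
codes = szt_encode(weights, delta=0.05)
print(codes.tolist())                 # 2-bit codes per weight
print(szt_decode(codes).tolist())     # ternary values seen by the matmul
print(szt_zero_sign(codes).tolist())  # extra sign bit available inside the dead zone
```

Under these assumptions the decoded values are identical to plain balanced ternary, so the matrix-multiply path sees nothing new, while the zero codes still record on which side of zero the latent weight lay. How that bit enters the backward rule is left open here, because the abstract does not describe it.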