Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers
