A Survey of Techniques for Optimizing Transformer Inference
