M2R2: Mixture of Multi-Rate Residuals for Efficient Transformer Inference
