Optimizing Inference Performance of Transformers on CPUs
