Sparse is Enough in Scaling Transformers

Sebastian Jaszczur