Transformers, parallel computation, and logarithmic depth
