Untangling tradeoffs between recurrence and self-attention in neural networks
