Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions
