Optimality and NP-Hardness of Transformers in Learning Markovian Dynamical Functions
