Approximation Rate of the Transformer Architecture for Sequence Modeling
