Improving Adaptivity via Over-Parameterization in Sequence Models
