State-space models can learn in-context by gradient descent

Open in new window