Latent Chain-of-Thought? Decoding the Depth-Recurrent Transformer
