Middle-Out Decoding

Shikib Mehri, Leonid Sigal

Neural Information Processing Systems 

To facilitate information flow and maintain consistent decoding, we introduce a dual self-attention mechanism that models complex dependencies between the outputs. We evaluate our model on video captioning and on a synthetic sequence de-noising task.
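
The abstract does not spell out the form of the dual self-attention mechanism, so the following is only a minimal sketch under stated assumptions. It takes "dual" to mean two separate scaled dot-product attention passes, one over a memory of previously computed decoder hidden states and one over a memory of embedded outputs, with the two context vectors concatenated. The function names `self_attention` and `dual_self_attention`, the use of scaled dot-product scoring, and the shared dimensionality of query and memories are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over the last axis."""
    x = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def self_attention(query, memory):
    """Scaled dot-product attention of one query over a memory of past states.

    query:  shape (d,)   current decoder state
    memory: shape (n, d) previously produced states
    Returns the context vector (d,) and the attention weights (n,).
    """
    d = query.shape[-1]
    scores = memory @ query / np.sqrt(d)  # one score per memory slot
    weights = softmax(scores)             # weights sum to 1
    context = weights @ memory            # weighted average of memory rows
    return context, weights


def dual_self_attention(query, hidden_memory, output_memory):
    """Hypothetical reading of 'dual': attend separately over past hidden
    states and past embedded outputs, then concatenate both contexts."""
    h_ctx, h_w = self_attention(query, hidden_memory)
    o_ctx, o_w = self_attention(query, output_memory)
    return np.concatenate([h_ctx, o_ctx]), h_w, o_w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.normal(size=4)       # current decoder state
    H = rng.normal(size=(5, 4))  # five past hidden states
    E = rng.normal(size=(5, 4))  # five past embedded outputs
    ctx, hw, ow = dual_self_attention(q, H, E)
    print(ctx.shape, hw.sum(), ow.sum())
```

Because every past output contributes to the context vector, each new prediction can condition on all earlier outputs rather than only on the most recent recurrent state, which is one way such a mechanism could model dependencies between the outputs.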