Non-autoregressive Sequence-to-Sequence Vision-Language Models