Differentiable Sampling with Flexible Reference Word Order for Neural Machine Translation