The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation