Improving Scheduled Sampling with Elastic Weight Consolidation for Neural Machine Translation