Parallel Attention Forcing for Machine Translation