Evaluating and Improving the Coreference Capabilities of Machine Translation Models