On the Evaluation of Machine-Generated Reports