An Empirical Evaluation of Evaluation Metrics of Procedurally Generated Mario Levels