A Benchmark for Evaluating Machine Translation Metrics on Dialects Without Standard Orthography