DinerDash Gym: A Benchmark for Policy Learning in High-Dimensional Action Space