A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management