Towards Diverse Behaviors: A Benchmark for Imitation Learning with Human Demonstrations