Why Guided Dialog Policy Learning performs well? Understanding the role of adversarial learning and its alternative