Reevaluating Policy Gradient Methods for Imperfect-Information Games