Learning Classical Planning Strategies with Policy Gradient