Beyond Interpolation: Extrapolative Reasoning with Reinforcement Learning and Graph Neural Networks