Improving the Validity of Automatically Generated Feedback via Reinforcement Learning