Learning Relational Rules from Rewards