Learning Reward Machines: A Study in Partially Observable Reinforcement Learning