Reinforcement Learning for Long-Horizon Unordered Tasks: From Boolean to Coupled Reward Machines
