Diversify & Conquer: Outcome-directed Curriculum RL via Out-of-Distribution Disagreement
–Neural Information Processing Systems
Reinforcement learning (RL) often faces uninformed search problems, in which the agent must explore without access to domain knowledge such as characteristics of the environment or external rewards.
Dec-26-2025, 12:23:06 GMT