Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning

Neural Information Processing Systems 

The main idea is to update only the most accurate prediction head on each trajectory, so that each head specializes in environments with similar dynamics, which in effect clusters the environments.
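A minimal sketch of this winner-takes-all update may make the idea concrete. It assumes linear prediction heads, a mean squared error loss, and plain gradient descent; the names `trajectory_losses` and `mcl_update` are hypothetical and are not taken from the paper, whose actual model and objective may differ.

```python
import numpy as np

def trajectory_losses(heads, states, actions, next_states):
    """Mean squared next-state prediction error of every head over one trajectory.

    heads: (H, ds, ds + da) array, one linear map per head (an assumed model class).
    """
    inputs = np.concatenate([states, actions], axis=1)        # (T, ds + da)
    preds = np.einsum("hij,tj->hti", heads, inputs)           # (H, T, ds)
    return ((preds - next_states[None]) ** 2).mean(axis=(1, 2))  # (H,)

def mcl_update(heads, states, actions, next_states, lr=0.05):
    """Take one gradient step on the single head that fits the trajectory best.

    All other heads are left untouched, so over many trajectories each head
    tends to specialize in the dynamics it already predicts well.
    """
    losses = trajectory_losses(heads, states, actions, next_states)
    best = int(np.argmin(losses))                             # most accurate head
    inputs = np.concatenate([states, actions], axis=1)
    err = inputs @ heads[best].T - next_states                # (T, ds)
    grad = 2.0 * err.T @ inputs / err.size                    # d(mean sq. error)/dW
    heads[best] -= lr * grad                                  # in-place update
    return best, losses

# Usage: three heads, one trajectory of 20 steps from a hypothetical linear environment.
rng = np.random.default_rng(0)
heads = rng.normal(size=(3, 2, 3))
states, actions = rng.normal(size=(20, 2)), rng.normal(size=(20, 1))
true_w = rng.normal(size=(2, 3))
next_states = np.concatenate([states, actions], axis=1) @ true_w.T
best, losses = mcl_update(heads, states, actions, next_states)
```

Selecting the head by its loss over the whole trajectory, rather than per transition, is what keeps every transition from one environment assigned to the same head.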
