Skill-aware Mutual Information Optimisation for Generalisation in Reinforcement Learning

Neural Information Processing Systems 

Reinforcement Learning (RL) agents often learn policies that fail to generalise across tasks whose environmental features and optimal skills differ [des Combes et al., 2018; Garcin et al., 2024].
