Contrastive Reinforcement Learning of Symbolic Reasoning Domains

Neural Information Processing Systems 

Contrastive Policy Learning (ConPoLe) explicitly optimizes the InfoNCE loss, which lower-bounds the mutual information between the current state and the next states that continue on a path to the solution.
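As a rough illustration of the InfoNCE objective mentioned above, the following minimal sketch computes the loss for a single state. It assumes that a scoring function has already produced one compatibility score per candidate next state, with exactly one candidate (the positive) lying on a path to the solution. The function names, the example scores, and the choice of plain Python are illustrative assumptions, not the paper's implementation.

```python
import math


def info_nce_loss(scores, positive_index):
    """InfoNCE loss for one state: the negative log-softmax of the
    positive candidate's score among all candidate scores."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[positive_index]


def mi_lower_bound(loss, num_candidates):
    """Standard InfoNCE bound, I >= log(N) - loss. It holds for the
    expected loss, so a single-sample value is only an estimate."""
    return math.log(num_candidates) - loss


# Hypothetical scores f(state, next_state) for 4 candidate next states;
# index 0 is assumed to be the state that continues toward the solution.
scores = [2.0, 0.5, -1.0, 0.0]
loss = info_nce_loss(scores, positive_index=0)
bound = mi_lower_bound(loss, len(scores))
```

Under these assumptions, raising the positive candidate's score relative to the others lowers the loss and therefore tightens the estimated bound. When all scores are equal, the loss equals log(N) and the bound is zero.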
