Contrastive Reinforcement Learning of Symbolic Reasoning Domains
Neural Information Processing Systems
Contrastive Policy Learning (ConPoLe) is a learning algorithm that explicitly optimizes the InfoNCE loss, which lower-bounds the mutual information between the current state and the next states that continue on a path to the solution.
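To make the InfoNCE objective concrete, here is a minimal sketch of the loss for a single state, not the authors' implementation. It assumes a bilinear score f(s, s') = s^T W s' over already-embedded states, one positive next state (one that continues toward the solution), and a handful of negative next states. The function names, the plain-list vector representation, and the choice of score function are all illustrative assumptions.

```python
import math


def bilinear_score(state, candidate, W):
    """Hypothetical score f(s, s') = s^T W s' over plain Python lists."""
    return sum(
        state[i] * W[i][j] * candidate[j]
        for i in range(len(state))
        for j in range(len(candidate))
    )


def info_nce_loss(state, positive, negatives, W):
    """InfoNCE loss for one state: cross-entropy of picking the positive
    next state out of the set {positive} + negatives."""
    scores = [bilinear_score(state, c, W) for c in [positive] + list(negatives)]
    # Numerically stable log-sum-exp over all candidate scores.
    m = max(scores)
    log_norm = m + math.log(sum(math.exp(x - m) for x in scores))
    # -log softmax(scores)[0], where index 0 is the positive candidate.
    return log_norm - scores[0]
```

With N candidates in total, an uninformative scorer (for example, W set to all zeros) gives a loss of exactly log N, and the loss approaches zero as the positive's score dominates the negatives. In the standard InfoNCE analysis, log N minus the expected loss is the quantity that lower-bounds the mutual information, so driving the loss down tightens that bound.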