Contrastive Reinforcement Learning of Symbolic Reasoning Domains
Neural Information Processing Systems
... Contrastive Policy Learning (ConPoLe) that explicitly optimizes the InfoNCE loss, which lower-bounds the mutual information between the current state and the next states that continue on a path to the solution.
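To make the InfoNCE objective concrete, here is a minimal sketch in plain Python. It assumes each candidate next state has already been given a scalar compatibility score with the current state; how ConPoLe computes those scores is not described in this excerpt, so the function names and the `scores` input are hypothetical. The loss is the cross-entropy of picking the positive candidate (the next state on a solution path) among all candidates, and the standard InfoNCE bound states that mutual information is at least log(N) minus the expected loss, for N candidates.

```python
import math


def info_nce_loss(scores, positive_index):
    """InfoNCE loss for one state: -log softmax(scores)[positive_index].

    scores: hypothetical compatibility scores, one per candidate next state.
    positive_index: position of the candidate that stays on a solution path.
    """
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores)
    log_sum_exp = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    return log_sum_exp - scores[positive_index]


def mutual_information_lower_bound(mean_loss, num_candidates):
    """Standard InfoNCE bound: I(current; next) >= log(N) - mean loss."""
    return math.log(num_candidates) - mean_loss


# Illustrative scores only: the positive candidate (index 0) is scored highest.
example_scores = [2.0, 0.5, -1.0, 0.0]
example_loss = info_nce_loss(example_scores, positive_index=0)
example_bound = mutual_information_lower_bound(example_loss, len(example_scores))
```

With uninformative scores (all equal), the loss equals log(N) and the bound is zero, so lowering the loss is what tightens the bound above zero.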
Aug-15-2025, 15:33:23 GMT
- Country:
  - Europe > Belgium > Wallonia > Namur Province > Namur (0.04)
  - North America > United States > California > Santa Clara County > Palo Alto (0.04)
  - North America > United States > District of Columbia (0.04)