GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis
Neural Information Processing Systems
Figure 2: Illustration of knowledge reuse from DoorKey to BoxKey.
BoxKey: As shown in Figure 1b, unlike in DoorKey, the agent has to open the box to get the key.
Thus the learned program is color-agnostic (i.e., the agent's policy would remain robust no matter ...
The valuation vector representations are fed to all the methods as input.
The reward from the MiniGrid environment is sparse (i.e., only a positive reward will be given after ...
We use a batch size of 256.
The code is available at: https://github.com/caoysh/GALOIS
Aug-16-2025, 09:17:45 GMT