GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis

Neural Information Processing Systems 

Figure 2: Illustration of knowledge reuse from DoorKey to BoxKey.

BoxKey: As shown in Figure 1b, unlike DoorKey, the agent has to open the box to get the key.

Thus the learned program is color-agnostic (i.e., the agent's policy would remain robust no matter ...)

The valuation vector representations are fed to all the methods as input.

The reward from the MiniGrid environment is sparse (i.e., only a positive reward will be given after ...)

We use a batch size of 256.

The code is available at: https://github.com/caoysh/GALOIS
