Reviews: Transfer of Deep Reactive Policies for MDP Planning
The paper proposes a method, termed TransPlan, that uses Graph Convolutional Networks to capture the relations defined by an RDDL description and learn neural network policies that can "transfer" across different MDP planning domain instances. The architecture combines several components, including a state encoder, an action decoder, a transition transfer module and a problem instance classifier. Only the action decoder requires retraining for transfer, and the paper shows how a different component of the architecture (the transition transfer module) can be used to retrain it quickly and obtain substantial gains in transfer to a new domain without any "real" interactions (zero-shot). The authors evaluate their method on benchmark domains from IPPC 2014 and show substantial improvements over standard algorithms that do not leverage the structure offered by an RDDL description of the problem. The authors also perform a few ablation studies to assess the relative importance of the different components of their system.
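To make the transfer argument concrete, here is a minimal sketch of why a graph-convolutional state encoder can be shared across problem instances of different sizes. It is not the authors' implementation: the propagation rule is the commonly used normalized form ReLU(Â H W), and the names `normalize_adjacency`, `gcn_layer`, `shared_weights`, as well as the toy graphs and feature values, are assumptions made purely for illustration.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(adj_norm, features, weights):
    """One graph convolution step: ReLU(adj_norm @ features @ weights)."""
    hidden = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]

# Hypothetical weights: 1 input feature per node -> 2 hidden units.
# Their shape depends only on the feature sizes, not on the graph size.
shared_weights = [[0.5, -0.5]]

# Hypothetical instance A: three state variables connected in a chain 0-1-2.
adj_a = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
features_a = [[1.0], [0.0], [1.0]]
embedding_a = gcn_layer(normalize_adjacency(adj_a), features_a, shared_weights)

# Hypothetical instance B: four state variables, same weights reused as-is.
adj_b = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
features_b = [[1.0], [0.0], [0.0], [1.0]]
embedding_b = gcn_layer(normalize_adjacency(adj_b), features_b, shared_weights)
```

Because the weight matrix is indexed by feature dimension rather than by node, the same encoder produces per-node embeddings for both the three-variable and the four-variable instance. Under this reading, only an instance-specific output stage (the action decoder in the paper's terminology) would need to be refit for a new instance.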