Knowledge Infused Policy Gradients with Upper Confidence Bound for Relational Bandits
