Reviews: Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach

Neural Information Processing Systems 

The paper describes a new algorithm to leverage domain knowledge from several experts in the form of reward shaping. These different reward shaping potentials are combined through a Bayesian learning technique. This is very interesting work. Since domain knowledge might improve or worsen the convergence rate, the online Bayesian learning technique provides an effective way of quickly identifying the best domain knowledge by gradually shifting the posterior belief towards the most accurate domain knowledge. At a high level, the approach makes sense.