Fast Reinforcement Learning with Large Action Sets using Error-Correcting Output Codes for MDP Factorization
