Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control
