SOAP-RL: Sequential Option Advantage Propagation for Reinforcement Learning in POMDP Environments
