Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution
