 proposition



Reward Machines for Deep RL in Noisy and Uncertain Environments

Neural Information Processing Systems

Reward Machines provide an automaton-inspired structure for specifying instructions, safety constraints, and other temporally extended reward-worthy behaviour. By exposing the underlying structure of a reward function, they enable the decomposition of an RL task, leading to impressive gains in sample efficiency.

Active Bipartite Ranking

Neural Information Processing Systems

Various dedicated algorithms have recently been proposed and studied by the machine-learning community. In contrast, active bipartite ranking is poorly documented in the literature. Due to its global nature, a strategy for sequentially labeling data points that are difficult to rank w.r.t. the others is