On Divergence Measures for Training GFlowNets

Neural Information Processing Systems 

In reinforcement learning (RL), a recurring goal is to find a diverse set of high-valued state-action trajectories according to a reward function.
