 Reinforcement Learning



Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation

Emmanuel Bengio

Neural Information Processing Systems

This paper is about the problem of learning a stochastic policy for generating an object (like a molecular graph) from a sequence of actions, such that the probability of generating an object is proportional to a given positive reward for that object.
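
To make the target in this abstract concrete, here is a minimal toy sketch, not the paper's training method. It assumes objects are two-character binary strings built one action at a time, with hypothetical rewards in `REWARD`. Because each object in this toy has exactly one action sequence, the flow through a partial object can be computed exactly as the total reward reachable from it, and choosing each action in proportion to the child's flow yields generation probabilities proportional to reward. All names here (`REWARD`, `flow`, `policy`, `generation_probability`) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Hypothetical positive rewards for objects built from two binary actions.
REWARD = {"00": 1, "01": 2, "10": 3, "11": 4}
LENGTH = 2


def flow(prefix: str) -> int:
    """Total reward of all complete objects reachable from this partial object."""
    if len(prefix) == LENGTH:
        return REWARD[prefix]
    return flow(prefix + "0") + flow(prefix + "1")


def policy(prefix: str) -> dict:
    """Probability of each next action: child flow divided by parent flow."""
    total = flow(prefix)
    return {action: Fraction(flow(prefix + action), total) for action in "01"}


def generation_probability(obj: str) -> Fraction:
    """Probability that following the policy from the empty prefix yields obj."""
    prob = Fraction(1)
    for i, action in enumerate(obj):
        prob *= policy(obj[:i])[action]
    return prob


# Check the defining property: P(obj) equals R(obj) divided by the total reward.
Z = sum(REWARD.values())
for bits in product("01", repeat=LENGTH):
    obj = "".join(bits)
    assert generation_probability(obj) == Fraction(REWARD[obj], Z)
```

In this toy the flows are computed by exhaustive enumeration; the setting the abstract describes is one where such a policy has to be learned instead.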








Supplementary Material: Policy Learning Using Weak Supervision

Neural Information Processing Systems

In this supplementary material, we first provide a theoretical analysis of the convergence rate (Sec. A.1) and sample complexity (Sec. A.2) for Peer