FlowNetworkbasedGenerativeModelsfor Non-IterativeDiverseCandidateGeneration

Neural Information Processing Systems 

This paper is about the problem of learning a stochastic policy for generating an object (like a molecular graph) from a sequence of actions, such that the probability of generating an object isproportional to agiven positivereward for that object.