 Reinforcement Learning





Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation

Neural Information Processing Systems

This paper is about the problem of learning a stochastic policy for generating an object (like a molecular graph) from a sequence of actions, such that the probability of generating an object is proportional to a given positive reward for that object.
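As a minimal illustration of the stated goal only (not the paper's method, which learns a sequential policy), the sketch below normalizes a hypothetical table of positive rewards into the target distribution p(x) = R(x) / Z and samples from it directly. The object names and reward values in `REWARDS` are assumptions made up for this example.

```python
import random
from collections import Counter

# Hypothetical toy rewards; the names and values are illustrative only.
REWARDS = {"A": 1.0, "B": 2.0, "C": 7.0}


def target_distribution(rewards):
    """Return p(x) = R(x) / Z, where Z is the sum of all rewards."""
    total = sum(rewards.values())
    return {x: r / total for x, r in rewards.items()}


def sample_objects(rewards, n, seed=0):
    """Draw n objects with probability proportional to their reward."""
    rng = random.Random(seed)
    objects = list(rewards)
    weights = [rewards[x] for x in objects]
    return rng.choices(objects, weights=weights, k=n)


probs = target_distribution(REWARDS)
counts = Counter(sample_objects(REWARDS, 10_000))
for x in REWARDS:
    # Empirical frequencies should approach the target probabilities.
    print(x, round(probs[x], 3), counts[x] / 10_000)
```

Direct normalization like this is only feasible when every object can be enumerated; for large spaces such as molecular graphs, that is exactly what a learned generative policy has to replace.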



e48e13207341b6bffb7fb1622282247b-Paper.pdf

Neural Information Processing Systems

To overcome the limitation, we propose Latent Dynamics Mixture (LDM) that trains a reinforcement learning agent with imaginary tasks generated from mixtures of learned latent dynamics.


Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

Neural Information Processing Systems

Real-world robotics tasks commonly manifest as control problems over continuous action spaces. When learning to act in such settings, control policies are typically represented as continuous probability distributions that cover all feasible control inputs - often Gaussians. The underlying assumption is that this enables more refined decisions compared to crude policy choices such as discretized controllers, which limit the search space but induce abrupt changes. While switching controls can be undesirable in practice as they may challenge stability and accelerate system wear-down, they are theoretically feasible and even arise as optimal strategies in some settings.
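The contrast drawn above can be sketched for a single action dimension as follows. This is an illustrative assumption, not the authors' implementation: the function names, the action bounds of -1 and 1, and the fixed probability and Gaussian parameters are all hypothetical, and a real policy would predict these quantities from the state.

```python
import random


def bang_bang_action(p_high, a_min, a_max, rng):
    """Bernoulli policy: return a_max with probability p_high, else a_min."""
    return a_max if rng.random() < p_high else a_min


def gaussian_action(mean, std, a_min, a_max, rng):
    """Gaussian policy: sample a continuous action, clipped to the bounds."""
    return min(a_max, max(a_min, rng.gauss(mean, std)))


rng = random.Random(0)
bang = [bang_bang_action(0.3, -1.0, 1.0, rng) for _ in range(1000)]
gauss = [gaussian_action(0.0, 0.5, -1.0, 1.0, rng) for _ in range(1000)]

# The Bernoulli policy only ever emits the two extreme controls,
# while the Gaussian policy covers the interval in between.
print(sorted(set(bang)))
print(min(gauss), max(gauss))
```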