Statistical Learning

A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence

Carlo Alfano, Department of Statistics, University of Oxford

Neural Information Processing Systems

In this work, we introduce a framework for policy optimization based on mirror descent that naturally accommodates general parameterizations. The policy class induced by our scheme recovers known classes, e.g., softmax, and generates new ones depending on the choice of mirror map.
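As a rough illustration of how a mirror map can induce the softmax class, the sketch below takes mirror-descent steps on the probability simplex with the negative-entropy mirror map, for which the update has the well-known closed form "new policy proportional to old policy times exp(step size times Q)". This is only the tabular, single-state special case and not the framework of the paper itself; the function names, the fixed Q-values, and the step size are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    max_logit = max(logits)
    weights = [math.exp(x - max_logit) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def pmd_step_neg_entropy(policy, q_values, step_size):
    """One mirror-descent step on the simplex with the negative-entropy
    mirror map (hypothetical helper; tabular, single state).

    Closed form: new_policy(a) is proportional to
    policy(a) * exp(step_size * q_values[a]).
    Assumes every entry of `policy` is strictly positive.
    """
    logits = [math.log(p) + step_size * q for p, q in zip(policy, q_values)]
    return softmax(logits)


# Illustrative values (assumed, not from the paper).
q_values = [1.0, 0.0, -1.0]
step_size = 0.5
policy = [1.0 / 3.0] * 3  # start from the uniform policy

for _ in range(4):
    policy = pmd_step_neg_entropy(policy, q_values, step_size)

# With Q held fixed and a uniform start, T steps give softmax(T * step_size * Q).
reference = softmax([4 * step_size * q for q in q_values])
```

With a different mirror map the same step would generally have no softmax form, which is the sense in which the choice of mirror map determines the policy class.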

Scalable Laplacian K-modes

Neural Information Processing Systems

Furthermore, we show that the density modes can be obtained as byproducts of the assignment variables via simple maximum-value operations whose additional computational cost is linear in the number of data points.
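One plausible reading of the "maximum-value operations" is sketched below: for each cluster, take the data point with the largest soft-assignment score as that cluster's mode. This is an assumption about the rule, not the paper's stated procedure; the function name and the toy data are hypothetical. For a fixed number of clusters, the scan visits each data point once per cluster, so its cost grows linearly with the number of data points.

```python
def modes_from_assignments(points, assignments):
    """Pick one mode per cluster from soft assignments (hypothetical helper).

    points:      list of N data points (any objects).
    assignments: N rows, each holding L soft-assignment scores.
    Returns a list of L points, where entry l is the point whose
    score for cluster l is largest (first one wins on ties).
    """
    num_clusters = len(assignments[0])
    modes = []
    for cluster in range(num_clusters):
        # Single pass over the N points for this cluster: O(N).
        best_index = max(
            range(len(points)),
            key=lambda p: assignments[p][cluster],
        )
        modes.append(points[best_index])
    return modes


# Toy example (assumed values): four points, two clusters.
points = ["a", "b", "c", "d"]
assignments = [
    [0.9, 0.1],
    [0.6, 0.4],
    [0.2, 0.8],
    [0.3, 0.7],
]
modes = modes_from_assignments(points, assignments)
```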