A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence

Carlo Alfano, Department of Statistics, University of Oxford

Feb-12-2026, 16:30:57 GMT–Neural Information Processing Systems

In this work, we introduce a framework for policy optimization based on mirror descent that naturally accommodates general parameterizations. The policy class induced by our scheme recovers known classes, e.g., softmax, and generates new ones depending on the choice of mirror map.
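As a rough illustration of how the choice of mirror map can determine the policy class, the sketch below performs one mirror descent step over a single state's action distribution using the negative entropy mirror map. This is the standard tabular textbook case, not the paper's algorithm: the function names, the action values `q`, and the step size `eta` are all hypothetical and chosen only for this example. Under these assumptions, one step from the uniform policy yields a softmax over the scaled action values.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_mirror_step(policy, q_values, step_size):
    """One mirror descent (ascent) step with the negative entropy mirror map.

    With this mirror map the update is multiplicative:
        new_policy(a) is proportional to policy(a) * exp(step_size * q(a)),
    and the Bregman projection onto the simplex is a normalization.
    Assumes every entry of `policy` is strictly positive.
    """
    logits = [math.log(p) + step_size * q for p, q in zip(policy, q_values)]
    return softmax(logits)


# Hypothetical single-state example with three actions.
uniform = [1.0 / 3.0] * 3
q = [1.0, 0.0, -1.0]   # assumed action values, for illustration only
eta = 0.5              # assumed step size

updated = entropy_mirror_step(uniform, q, eta)
expected = softmax([eta * v for v in q])
```

Here `updated` and `expected` agree up to floating-point error, which is the sense in which the negative entropy mirror map gives a softmax policy. A different mirror map would replace the `log`/`exp` pair with another map and its inverse, and would generally call for a different projection step.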

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States
  - Illinois > Cook County > Chicago (0.04)
- Europe
  - Russia (0.04)
  - France (0.04)
  - United Kingdom > England
    - Oxfordshire > Oxford (0.50)
- Asia
  - Russia (0.04)
  - Middle East > Jordan (0.04)

Genre:
- Research Report (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (0.92)
  - Machine Learning
    - Reinforcement Learning (0.93)
    - Statistical Learning > Gradient Descent (0.46)
