Dual Policy Iteration
Wen Sun, Geoffrey J. Gordon, Byron Boots, J. Bagnell
Neural Information Processing Systems
We also provide a general convergence analysis to support our empirical findings. Although our analysis is similar to that of Conservative Policy Iteration (CPI), it has a key difference: as long as the model-based optimal control (MBOC) step succeeds, we can provide a larger policy improvement than CPI at each iteration.