SOPE: Spectrum of Off-Policy Estimators
Neural Information Processing Systems
Consequently, if the parameterization is not rich enough, it may not be possible to represent the distribution ratios accurately; and when rich function approximators (such as neural networks) are used, the optimization procedure may get stuck in sub-optimal saddle points.
Aug-16-2025, 10:22:02 GMT
- Country:
  - Asia > Middle East
    - Jordan (0.04)
  - Europe > United Kingdom
    - England > Cambridgeshire > Cambridge (0.04)
  - North America > United States
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
    - Texas > Travis County
      - Austin (0.05)
- Industry:
  - Government (0.46)
- Technology: