Learning Robust Options by Conditional Value at Risk Optimization
Takuya Hiraoka, Takahisa Imagawa, Tatsuya Mori, Takashi Onishi, Yoshimasa Tsuruoka
–Neural Information Processing Systems
Options are generally learned by using an inaccurate environment model (or simulator), which contains uncertain model parameters.
Neural Information Processing Systems
Nov-16-2025, 08:22:24 GMT
- Country:
- Asia
- Japan > Honshū
- Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Middle East > Jordan (0.04)
- Japan > Honshū
- North America > Canada (0.04)
- Asia
- Technology: