Optimizing the CVaR via Sampling

Tamar, Aviv (Technion) | Glassner, Yonatan (Technion) | Mannor, Shie (Technion)

Mar-6-2015–AAAI Conferences

Conditional Value at Risk (CVaR) is a prominent risk measure that is used extensively in various domains. We develop a new formula for the gradient of the CVaR in the form of a conditional expectation. Based on this formula, we propose a novel sampling-based estimator for the gradient of the CVaR, in the spirit of the likelihood-ratio method. We analyze the bias of the estimator, and prove the convergence of a corresponding stochastic gradient descent algorithm to a local CVaR optimum. Our method makes it possible to apply CVaR optimization in new domains. As an example, we consider a reinforcement learning application, and learn a risk-sensitive controller for the game of Tetris.
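The abstract does not spell out the estimator, so the following is only a minimal sketch of what a likelihood-ratio style CVaR gradient estimate can look like, not the authors' implementation. It assumes a toy setting that is not from the paper: Z is drawn from a Gaussian with mean theta and unit variance, CVaR is taken as the mean of the lower alpha-tail, and the gradient is estimated by averaging score * (z - VaR) over the samples that fall in that tail. The function name `cvar_gradient_estimate` and all parameters are hypothetical.

```python
import numpy as np


def cvar_gradient_estimate(theta, alpha, n_samples, rng):
    """Sketch of a sampling-based CVaR gradient estimate (assumed toy setup).

    Assumptions (not from the source): Z ~ Normal(theta, 1), and CVaR is the
    expected value of Z over its lower alpha-tail.
    """
    # Draw samples from the parameterized distribution.
    z = rng.normal(loc=theta, scale=1.0, size=n_samples)

    # Empirical alpha-quantile, used as the Value at Risk estimate.
    var_hat = np.quantile(z, alpha)

    # Score function d/dtheta log f(z; theta) for a unit-variance Gaussian.
    score = z - theta

    # Only samples in the lower tail contribute to the estimate.
    in_tail = z <= var_hat

    # Likelihood-ratio style estimate of the CVaR gradient.
    grad_hat = np.sum(score * (z - var_hat) * in_tail) / (alpha * n_samples)

    # Empirical CVaR: mean of the tail samples.
    cvar_hat = z[in_tail].mean()
    return grad_hat, cvar_hat
```

In this toy case, shifting theta shifts the whole distribution, so the exact CVaR gradient with respect to theta is 1, which gives a simple sanity check for the estimate. A stochastic gradient method would repeatedly call such an estimator and step theta along the returned gradient.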

algorithm, computer game, optimization problem

Country:
- Asia > Middle East > Israel (0.14)

Industry:
- Leisure & Entertainment > Games > Computer Games (0.36)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Reinforcement Learning (0.67)
    - Statistical Learning > Gradient Descent (0.55)
  - Representation & Reasoning > Optimization (0.94)
