cvar-pg
A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization
Yudong Luo, Yangchen Pan, Han Wang, Philip Torr, Pascal Poupart
The sample inefficiency of CVaR optimization stems from two main facts: a focus on tail-end performance that discards most sampled trajectories, and the potential for vanishing gradients when the lower tail of the return distribution is overly flat. To address these challenges, we propose a simple mixture policy parameterization. This method integrates a risk-neutral policy with an adjustable policy to form a risk-averse policy. With this strategy, all collected trajectories can be used for policy updates, and the vanishing-gradient issue is counteracted because the risk-neutral component stimulates higher returns, lifting the tail and preventing flatness. Our empirical study shows that this mixture parameterization is uniquely effective across a variety of benchmark domains. In particular, it succeeds in identifying risk-averse CVaR policies in some Mujoco environments where traditional CVaR-PG fails to learn a reasonable policy.
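A minimal numpy sketch of the sample-efficiency problem the abstract describes: vanilla CVaR-PG weights only the trajectories in the lower alpha-tail, while a mixture parameterization lets the risk-neutral component learn from every trajectory. The batch size, the 10% risk level, and all variable names here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.1  # CVaR level: optimize over the worst 10% of returns (assumed)

# Simulated returns from a batch of 200 rollout trajectories.
returns = rng.normal(loc=100.0, scale=10.0, size=200)

# --- Vanilla CVaR-PG ---
# Only trajectories whose return falls at or below the empirical
# alpha-quantile (VaR) carry a nonzero policy-gradient weight,
# so roughly (1 - alpha) of the batch is discarded each update.
var = np.quantile(returns, alpha)
tail_mask = returns <= var
print(f"vanilla CVaR-PG uses {tail_mask.sum()} of {len(returns)} trajectories")

# --- Mixture parameterization (rough sketch of the idea) ---
# The risk-averse policy mixes a risk-neutral component with an
# adjustable component. The risk-neutral part is trained with the
# ordinary (risk-neutral) policy gradient, so every trajectory
# contributes signal, while the adjustable part still targets the tail.
neutral_weights = np.ones_like(returns)    # all trajectories used
adjust_weights = tail_mask.astype(float)   # tail trajectories only
print(f"mixture: risk-neutral component uses {int(neutral_weights.sum())}")
```

The point of the contrast is the update counts: at alpha = 0.1 the tail mask keeps only about 10% of the batch, whereas the risk-neutral component's weights cover the full batch.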
Efficient Risk-Averse Reinforcement Learning
In this post I present our recent NeurIPS 2022 paper (co-authored with Yinlam Chow, Mohammad Ghavamzadeh and Shie Mannor) on risk-averse reinforcement learning (RL). I discuss why and how risk aversion is applied to RL, what its limitations are, and how we propose to overcome them. An application to accident prevention in autonomous driving is demonstrated, and our code is available on GitHub. Risk-averse RL is crucial when applying RL to risk-sensitive real-world problems.