Conditionally Risk-Averse Contextual Bandits
Farsang, Mónika, Mineiro, Paul, Zhang, Wangda
arXiv.org Artificial Intelligence
Contextual bandits [Auer et al., 2002, Langford and Zhang, 2007] are a mature technology with numerous applications; however, adoption has been most aggressive in recommendation scenarios [Bouneffouf and Rish, 2019], where the worst-case outcome is user annoyance. At the other extreme are medical and defense scenarios, where worst-case outcomes are literally fatal. In between are scenarios of interest where bad outcomes are tolerable but should be avoided, e.g., logistics, finance, and self-tuning software, where the term "tail catastrophe" highlights the inadequacy of average-case performance guarantees in real-world applications [Marcus et al., 2021]. These scenarios demand risk aversion, i.e., decisions should sacrifice average performance in order to avoid worst-case outcomes, and incorporating risk aversion into contextual bandits would facilitate adoption. More generally, risk aversion is essential for making informed decisions that align with the risk preferences of the decision maker by balancing the potential benefits and risks of a particular action.
Jul-8-2023
- Country:
- Asia > Middle East
- Palestine (0.14)
- North America > United States
- New York (0.14)
- Genre:
- Research Report (0.64)
- Industry:
- Health & Medicine (0.68)
- Information Technology > Security & Privacy (0.46)