 propagating uncertainty


Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters

Neural Information Processing Systems

How does the uncertainty of the value function propagate when performing temporal difference learning? In this paper, we address this question by proposing a Bayesian framework in which we employ approximate posterior distributions to model the uncertainty of the value function and Wasserstein barycenters to propagate it across state-action pairs. Leveraging these tools, we present an algorithm, Wasserstein Q-Learning (WQL), starting in the tabular case, and then show how it can be extended to continuous domains. Furthermore, we prove that, under mild assumptions, a slight variation of WQL enjoys desirable theoretical properties in the tabular setting. Finally, we present an experimental campaign showing the effectiveness of WQL on finite problems, compared to several RL algorithms, some of which are specifically designed for exploration, along with some preliminary results on Atari games.


Reviews: Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters

Neural Information Processing Systems

This paper proposes a mechanism for maintaining distributions over Q-values (called Q-posteriors) by defining the value function (the V-posterior) to be a Wasserstein barycenter of Q-posteriors and defining the TD update to be a Wasserstein barycenter of the current Q-posterior with an estimated posterior based on the value function. These distributions are intended to represent uncertainty about the Q-function and they enable more nuanced definitions of the "optimal" (w.r.t. ... Contributions seem to be: 1. A means of propagating uncertainty about Q-values via Wasserstein barycenters (Equations 2 & 3). 2. A proof that a modified version of the proposed algorithm is PAC-MDP in the average loss setting (Theorems 5.1 and 5.2). ... The paper is fairly clearly written and easy enough to understand. 2. The idea of propagating uncertainty via Wasserstein barycenters is interesting and suggests several concrete realizations.
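The two barycenter steps described in this review can be sketched for one simple case. This is a minimal illustration, not the paper's algorithm: it assumes every posterior is a one-dimensional Gaussian, for which the 2-Wasserstein barycenter is again Gaussian with the weighted average of the means and of the standard deviations. The class `Gaussian`, the functions `w2_barycenter` and `td_update`, the choice of weights, and the shifted-and-scaled target `reward + gamma * V` are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gaussian:
    """Hypothetical 1-D Gaussian posterior over a value estimate."""
    mean: float
    std: float


def w2_barycenter(posteriors, weights):
    """2-Wasserstein barycenter of 1-D Gaussians.

    In one dimension the barycenter averages quantile functions, so for
    Gaussians both the means and the standard deviations are averaged.
    """
    total = sum(weights)
    normalized = [w / total for w in weights]
    mean = sum(w * p.mean for w, p in zip(normalized, posteriors))
    std = sum(w * p.std for w, p in zip(normalized, posteriors))
    return Gaussian(mean, std)


def td_update(q, v_next, reward, gamma, alpha):
    """Assumed TD-style step: blend the current Q-posterior with a target.

    The target is taken to be the distribution of reward + gamma * V,
    which for a Gaussian V shifts the mean and scales the std by gamma.
    """
    target = Gaussian(reward + gamma * v_next.mean, gamma * v_next.std)
    return w2_barycenter([q, target], [1.0 - alpha, alpha])


# V-posterior of the next state as a barycenter of its Q-posteriors
# (equal weights are an arbitrary choice for this sketch).
v_next = w2_barycenter([Gaussian(1.0, 2.0), Gaussian(3.0, 4.0)], [0.5, 0.5])

# One update of a wide prior Q-posterior toward the target.
q_new = td_update(Gaussian(0.0, 10.0), v_next, reward=1.0, gamma=0.5, alpha=0.5)
print(v_next, q_new)
```

In this sketch the standard deviation of the updated Q-posterior is a weighted average of the current and target standard deviations, so uncertainty at the next state carries over to the current state-action pair instead of being discarded as it would be by a point-estimate TD update.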



Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters

Metelli, Alberto Maria, Likmeta, Amarildo, Restelli, Marcello

Neural Information Processing Systems

How does the uncertainty of the value function propagate when performing temporal difference learning? In this paper, we address this question by proposing a Bayesian framework in which we employ approximate posterior distributions to model the uncertainty of the value function and Wasserstein barycenters to propagate it across state-action pairs. Leveraging these tools, we present an algorithm, Wasserstein Q-Learning (WQL), starting in the tabular case, and then show how it can be extended to continuous domains. Furthermore, we prove that, under mild assumptions, a slight variation of WQL enjoys desirable theoretical properties in the tabular setting. Finally, we present an experimental campaign showing the effectiveness of WQL on finite problems, compared to several RL algorithms, some of which are specifically designed for exploration, along with some preliminary results on Atari games.


Pulcinella: A General Tool for Propagating Uncertainty in Valuation Networks

Saffiotti, Alessandro, Umkehrer, Elisabeth

arXiv.org Artificial Intelligence

We present PULCinella and its use in comparing uncertainty theories. PULCinella is a general tool for Propagating Uncertainty based on the Local Computation technique of Shafer and Shenoy. It may be specialized to different uncertainty theories: at the moment, Pulcinella can propagate probabilities, belief functions, Boolean values, and possibilities. Moreover, Pulcinella allows the user to easily define his own specializations. To illustrate Pulcinella, we analyze two examples by using each of the four theories above. In the first one, we mainly focus on intrinsic differences between theories. In the second one, we take a knowledge engineer viewpoint, and check the adequacy of each theory to a given problem.


Propagating Uncertainty in Solar Panel Performance for Life Cycle Modeling in Early Stage Design

Honda, Tomonori (Massachusetts Institute of Technology) | Chen, Heidi Q. (Massachusetts Institute of Technology) | Chan, Kennis Y. (ATAC Corporation) | Yang, Maria C. (Massachusetts Institute of Technology)

AAAI Conferences

One of the challenges in accurately applying metrics for life cycle assessment lies in accounting for both irreducible and inherent uncertainties in how a design will perform under real world conditions. This paper presents a preliminary study that compares two strategies, one simulation-based and one set-based, for propagating uncertainty in a system. These strategies for uncertainty propagation are then aggregated. This work is conducted in the context of an amorphous photovoltaic (PV) panel, using data gathered from the National Solar Radiation Database, as well as realistic data collected from an experimental hardware setup specifically for this study. Results show that the influence of various sources of uncertainty can vary widely, and in particular that solar radiation intensity is a more significant source of uncertainty than the efficiency of a PV panel. This work also shows both set-based and simulation-based approaches have limitations and must be applied thoughtfully to prevent unrealistic results. Finally, it was found that aggregation of the two uncertainty propagation methods provided faster results than either method alone.
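The contrast between simulation-based and set-based propagation in this abstract can be sketched on a toy model. This is only an illustration under stated assumptions, not the study's method or data: the power model `irradiance * efficiency * area`, the input ranges, the uniform sampling, and the function names `panel_power`, `simulate`, and `interval_bounds` are all hypothetical, and interval arithmetic stands in for "set-based" propagation.

```python
import random


def panel_power(irradiance, efficiency, area):
    """Toy model (assumption): electrical power in watts."""
    return irradiance * efficiency * area


def simulate(n, irr_range, eff_range, area, seed=0):
    """Simulation-based propagation: sample inputs, record output spread."""
    rng = random.Random(seed)
    samples = [
        panel_power(rng.uniform(*irr_range), rng.uniform(*eff_range), area)
        for _ in range(n)
    ]
    return min(samples), max(samples)


def interval_bounds(irr_range, eff_range, area):
    """Set-based propagation via interval arithmetic.

    Valid here because the toy model is increasing in each
    non-negative input, so the extremes occur at the range endpoints.
    """
    low = panel_power(irr_range[0], eff_range[0], area)
    high = panel_power(irr_range[1], eff_range[1], area)
    return low, high


# Hypothetical input ranges: irradiance in W/m^2, efficiency as a fraction.
irr = (200.0, 1000.0)
eff = (0.05, 0.08)

set_low, set_high = interval_bounds(irr, eff, area=1.0)
sim_low, sim_high = simulate(10_000, irr, eff, area=1.0)
print(set_low, set_high)
print(sim_low, sim_high)
```

In this sketch the interval bounds always enclose the sampled range, while sampling additionally shows how outputs are distributed inside those bounds; this is one way to read the abstract's remark that each approach has limitations when used alone.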