Uncertainty
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The contribution of this paper is Augur, a probabilistic programming language that supports parallel inference for graphical models (specifically Bayes nets). Probabilistic programming languages are powerful tools because they allow rapid development of new models without having to derive or implement new inference algorithms. Unlike most existing probabilistic programming languages, Augur produces massively parallel code that can run on a GPU (using CUDA). A unique feature of Augur is that it compiles the model (specified in Scala) into an intermediate representation, which is then compiled into a CUDA inference algorithm for parallel execution.
Weighted importance sampling for off-policy learning with linear function approximation
Importance sampling is an essential component of off-policy model-free reinforcement learning algorithms. However, its most effective variant, weighted importance sampling, does not carry over easily to function approximation and, because of this, it is not utilized in existing off-policy learning algorithms. In this paper, we take two steps toward bridging this gap. First, we show that weighted importance sampling can be viewed as a special case of weighting the error of individual training samples, and that this weighting has theoretical and empirical benefits similar to those of weighted importance sampling. Second, we show that these benefits extend to a new weighted-importance-sampling version of off-policy LSTD(lambda). We show empirically that our new WIS-LSTD(lambda) algorithm can result in much more rapid and reliable convergence than conventional off-policy LSTD(lambda) (Yu 2010, Bertsekas & Yu 2009).
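For readers unfamiliar with the distinction the abstract draws, the following is a minimal sketch of the ordinary and weighted importance sampling estimators in the simplest tabular setting. It is not the paper's WIS-LSTD(lambda) algorithm and involves no function approximation. The two-action problem, the policies, the rewards, and all function names are hypothetical and chosen only for illustration.

```python
import random


def ordinary_is(returns, ratios):
    """Ordinary importance sampling: mean of ratio-scaled returns."""
    return sum(rho * g for rho, g in zip(ratios, returns)) / len(returns)


def weighted_is(returns, ratios):
    """Weighted importance sampling: normalize by the sum of the ratios."""
    total = sum(ratios)
    if total == 0.0:
        return 0.0  # convention chosen here when no sample carries any weight
    return sum(rho * g for rho, g in zip(ratios, returns)) / total


def sample_episodes(n, seed=0):
    """Draw n one-step episodes from a hypothetical two-action problem."""
    rng = random.Random(seed)
    target = {0: 0.1, 1: 0.9}    # assumed target policy pi(a)
    behavior = {0: 0.5, 1: 0.5}  # assumed behavior policy b(a)
    reward = {0: 0.0, 1: 10.0}   # assumed deterministic return per action
    returns, ratios = [], []
    for _ in range(n):
        a = 1 if rng.random() < behavior[1] else 0
        returns.append(reward[a])
        ratios.append(target[a] / behavior[a])  # importance ratio pi(a) / b(a)
    return returns, ratios


if __name__ == "__main__":
    g, rho = sample_episodes(20)
    print("ordinary IS:", ordinary_is(g, rho))
    print("weighted IS:", weighted_is(g, rho))
```

In this toy problem the true value under the target policy is 0.9 × 10 = 9. Because the weighted estimate is a convex combination of the observed returns, it always stays within their range, whereas the ordinary estimate can fall outside it when the sampled ratios happen to be large.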
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. In this article, the authors propose a framework for performing model comparison of Bayesian models on behavioral data. To do so, they summarize the Bayesian Decision Theory framework, pinpoint areas of non-identifiability, and outline the types of constraints that can be used to make each term in the Bayesian framework identifiable. They then make assumptions to constrain each term in the Bayesian framework, explore how differentiable parameter values are in their model, and apply the technique to two studies that use Bayesian decision theory to explain behavioral responses: time interval estimation and motion perception. The identifiability of internal representations and processes has been a prominent issue within cognitive science and psychology for decades.