Reviews: Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning

Jan-24-2025, 15:42:45 GMT–Neural Information Processing Systems

This paper presents a novel methodology that, in combination with automatic differentiation, yields unbiased and low-variance estimators of derivatives of any order. It appears to be potentially widely useful, and the exposition is clear. The reviewers and I seem to be in general agreement in liking the paper. Reviewer 1 wrote a thorough review touching on many aspects of the paper. The overall score was 7, and his bottom-line positives were: "This paper is well executed: it is well written, technically sound and potentially impactful."

any-order score function gradient estimator, loaded dice, reinforcement learning

Genre:
- Summary/Review (0.40)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)