Counterfactual Evaluation of Peer-Review Assignment Policies (Supplemental Material)
Martin Saveski, Steven Jecmen, Nihar B. Shah, Johan Ugander
A Linear Programs for Peer-Review Assignment

Neural Information Processing Systems

Our estimators assume that there is no interference between the units, i.e., that the treatment of one unit does not affect the outcomes of the others. The first assumption is quite realistic, as in most peer-review systems the reviewers cannot see other reviews until they submit their own. The second assumption is important to understand, as there could be "batch effects". We use Monte Carlo methods to tightly estimate these covariances: for the AAAI datasets, we sampled 1 million assignments and computed the empirical covariance. In our setting, small amounts of attrition (relative to the number of policy-induced positivity violations) mean that the fraction of data that is missing is not exactly known before assignment, but almost. To get more robust estimates of the performance, we repeat this process 10 times.
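The Monte Carlo covariance step can be sketched as follows. This is our own toy illustration, not the authors' code: the problem sizes, the reviewer load, and the uniform sampling of assignments are all placeholder assumptions standing in for the actual randomized assignment policy.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_assignment(n_papers, n_reviewers, load=2):
    """Sample one random assignment: each paper gets `load` distinct
    reviewers, chosen uniformly (a toy stand-in for the real policy)."""
    A = np.zeros((n_papers, n_reviewers))
    for p in range(n_papers):
        revs = rng.choice(n_reviewers, size=load, replace=False)
        A[p, revs] = 1.0
    return A

def empirical_covariance(n_samples, n_papers=4, n_reviewers=6):
    """Empirical covariance of the flattened paper-reviewer assignment
    indicators, estimated from `n_samples` Monte Carlo draws."""
    X = np.stack([sample_assignment(n_papers, n_reviewers).ravel()
                  for _ in range(n_samples)])
    return np.cov(X, rowvar=False)

cov = empirical_covariance(10_000)
print(cov.shape)  # (24, 24): one row/column per (paper, reviewer) pair
```

With more samples (the paper uses 1 million), the entries of `cov` converge to the true covariances of the assignment indicators under the sampling policy.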


Extracting Reward Functions from Diffusion Models

Neural Information Processing Systems

We consider the problem of extracting a reward function by comparing a decision-making diffusion model that models low-reward behavior with one that models high-reward behavior, a setting related to inverse reinforcement learning. We first define the notion of a relative reward function of two diffusion models and show conditions under which it exists and is unique.
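To make the idea concrete, one natural form such a relative reward could take (our illustrative assumption, not necessarily the paper's exact definition) is a log-density ratio between the two models:

```latex
r(\tau) \;\propto\; \log p_{\text{high}}(\tau) \;-\; \log p_{\text{low}}(\tau),
```

where $p_{\text{high}}$ and $p_{\text{low}}$ are the trajectory densities induced by the high-reward and low-reward diffusion models: trajectories that the high-reward model finds much more likely than the low-reward model receive high relative reward.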






2 Background

Neural Information Processing Systems

In principle, one can design Lipschitz-constrained architectures using the composition property of Lipschitz functions, but Anil et al. [2] recently identified a key obstacle to this approach: gradient norm attenuation.
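The phenomenon can be seen in a small numerical sketch. This is our own toy illustration, not Anil et al.'s construction: each layer below (orthogonal weights followed by ReLU) is exactly 1-Lipschitz, so by the composition property the whole network is 1-Lipschitz, yet the norm of a directional derivative shrinks with depth rather than staying near 1, because ReLU's Jacobian zeroes roughly half the coordinates at each layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def orthogonal(n):
    """Random orthogonal matrix: an exactly 1-Lipschitz linear layer."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

n, depth = 64, 10
x = rng.standard_normal(n)          # input point
v = rng.standard_normal(n)
v /= np.linalg.norm(v)              # unit direction for the derivative

norms = [np.linalg.norm(v)]         # starts at 1.0
for _ in range(depth):
    W = orthogonal(n)
    pre = W @ x
    mask = (pre > 0).astype(float)  # ReLU's Jacobian is diag(mask)
    v = mask * (W @ v)              # Jacobian-vector product through the layer
    x = mask * pre                  # forward pass: ReLU(Wx)
    norms.append(np.linalg.norm(v))

print(norms[0], norms[-1])          # the derivative norm decays well below 1
```

Each orthogonal multiply preserves the norm of `v`, but each ReLU mask discards roughly half its squared norm, so the directional-derivative norm decays roughly geometrically with depth even though every layer, and hence the composition, is 1-Lipschitz.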