Reinforcement Learning

Percentile Criterion Optimization in Offline Reinforcement Learning

Neural Information Processing Systems

In reinforcement learning, robust policies for high-stakes decision-making problems with limited data are usually computed by optimizing the percentile criterion.



Temporally-Consistent Survival Analysis

Neural Information Processing Systems

We model the event of interest as a special terminal state, and we seek to estimate the survival distribution (i.e., the distribution of the hitting time for that terminal state) from any other state.