A Appendix

Neural Information Processing Systems

For out-of-distribution (OOD) inference, the model should assign higher epistemic uncertainty to OOD regions than to their in-distribution (ID) counterparts.

A.2 Policy Gradient based Reward Maximization for Segmentation Backbone

This approach lets us efficiently reach the optimal solution for reward maximization. We present examples of generated OOD inputs in Figure 1(a); the corresponding results appear in Figure 1(b)-(d). Table 1 reports the results of our uncertainty estimation framework on the Cityscapes dataset.
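As background for the policy-gradient reward maximization mentioned above, here is a minimal, hedged sketch of gradient ascent on expected reward for a softmax policy. The toy three-action problem, reward values, and learning rate are illustrative assumptions, not the paper's segmentation setup; it uses the standard identity that for logits theta, the gradient of E[r] in component k is p_k (r_k - E[r]).

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    z = np.exp(x - x.max())
    return z / z.sum()

rewards = np.array([0.1, 0.9, 0.4])  # toy per-action rewards (illustrative)
theta = np.zeros(3)                   # policy logits
lr = 0.5

for _ in range(500):
    p = softmax(theta)
    expected_r = p @ rewards
    # exact policy gradient for a softmax policy: p_k * (r_k - E[r])
    theta += lr * p * (rewards - expected_r)

p = softmax(theta)
best = int(np.argmax(p))  # probability mass concentrates on the best action
```

Gradient ascent here drives the policy toward the highest-reward action; in practice (and in methods like the one this appendix describes) the expectation is replaced by sampled score-function (REINFORCE-style) estimates.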






Deep Recurrent Optimal Stopping

Neural Information Processing Systems

Deep neural networks (DNNs) have recently emerged as a powerful paradigm for solving Markovian optimal stopping problems. However, readily extending DNN-based methods to non-Markovian settings requires a significant expansion of the state and parameter spaces, manifesting the curse of dimensionality.
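To make the Markovian baseline concrete, here is a hedged sketch (not the paper's DNN method) of classical backward induction for an optimal stopping problem: the right to stop a binomial price walk and collect the payoff max(K - S, 0), an American-put-style example. All parameter values are illustrative. In non-Markovian settings the value at each node would depend on the whole path, which is exactly the state-space blow-up the abstract refers to.

```python
import numpy as np

# Illustrative parameters: horizon, start price, up/down factors,
# up-probability, strike.
T, S0, u, d, p, K = 4, 100.0, 1.1, 0.9, 0.5, 100.0

# Terminal node prices (index i = number of down moves) and payoff.
prices = S0 * u ** np.arange(T, -1, -1) * d ** np.arange(0, T + 1)
value = np.maximum(K - prices, 0.0)

# Backward induction: value = max(stop now, expected continuation).
for t in range(T - 1, -1, -1):
    prices = S0 * u ** np.arange(t, -1, -1) * d ** np.arange(0, t + 1)
    cont = p * value[:-1] + (1 - p) * value[1:]
    value = np.maximum(np.maximum(K - prices, 0.0), cont)

v0 = float(value[0])  # value of stopping optimally from t = 0
```

Each node's value depends only on the current price (the Markov property), so the table stays linear in the tree depth; a path-dependent payoff would instead force one value per path.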