A Appendix
Neural Information Processing Systems
For out-of-distribution (OOD) inference, the model should assign higher epistemic uncertainty to OOD regions than to their in-distribution (ID) counterparts.

A.2 Policy Gradient based Reward Maximization for Segmentation Backbone

This approach enables us to efficiently reach the optimal solution for reward maximization.

We present some generated OOD examples in Figure 1(a). The results are presented in Figure 1(b)-(d). In Table 1, we present the results of our uncertainty estimation framework when applied to the Cityscapes dataset.
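To make the policy-gradient reward maximization named in Section A.2 more concrete, the following is a generic, minimal sketch of a REINFORCE-style update on a toy three-action problem. It is not the segmentation-backbone procedure of the paper, whose details are not given in this excerpt. The function name `reinforce_bandit`, the fixed reward table, the learning rate, the step count, and the running-mean baseline are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(rewards, steps=5000, lr=0.1, seed=0):
    """Maximize expected reward with a REINFORCE-style update.

    `rewards` is a hypothetical fixed reward per action (an assumption);
    a real system would compute the reward from model outputs instead.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)  # policy parameters, uniform policy at start
    baseline = 0.0                 # running mean reward, used to reduce variance
    for t in range(1, steps + 1):
        probs = softmax(logits)
        # Sample an action from the current policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        advantage = reward - baseline
        # Gradient of log pi(action) w.r.t. logit k is 1[k == action] - probs[k].
        for k in range(len(logits)):
            indicator = 1.0 if k == action else 0.0
            logits[k] += lr * advantage * (indicator - probs[k])
        # Update the running mean of observed rewards.
        baseline += (reward - baseline) / t
    return softmax(logits)


probs = reinforce_bandit([0.1, 0.5, 0.9])
print(probs)  # probability mass should concentrate on the highest-reward action
```

The baseline subtraction is a common variance-reduction choice and does not bias the gradient estimate; whether the paper uses a baseline is not stated in this excerpt.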
Feb-14-2026, 06:22:29 GMT