A Omitted Statements and Proofs

Neural Information Processing Systems 

To obtain a conservative value estimate, we follow Fujimoto et al. (2019) and Liu et al. (2020) and prune the unseen state-action pairs in