A comprehensive study on the prediction reliability of graph neural networks for virtual screening