E-Scores for (In)Correctness Assessment of Generative Model Outputs