Comparing Plausibility Estimates in Base and Instruction-Tuned Large Language Models
