A Scalable Framework for Evaluating Health Language Models

Open in new window