MedGPTEval: A Dataset and Benchmark to Evaluate Responses of Large Language Models in Medicine
