HealthBench: Evaluating Large Language Models Towards Improved Human Health