Evaluating Large Language Models for Medical Calculations