FEABench: Evaluating Language Models on Multiphysics Reasoning Ability