Evaluating Large Language Models for Real-World Engineering Tasks