MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering