MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
