R&D-Agent: An LLM-Agent Framework Towards Autonomous Data Science

Xu Yang, Xiao Yang, Shikai Fang, Yifei Zhang, Jian Wang, Bowen Xian, Qizheng Li, Jingyuan Li, Minrui Xu, Yuante Li, Haoran Pan, Yuge Zhang, Weiqing Liu, Yelong Shen, Weizhu Chen, Jiang Bian

Oct-2-2025–arXiv.org Artificial Intelligence

Recent advances in AI and ML have transformed data science, yet growing complexity and expertise requirements continue to hinder progress. Although crowd-sourcing platforms alleviate some of these challenges, high-level machine learning engineering (MLE) tasks remain labor-intensive and iterative. We introduce R&D-Agent, a comprehensive, decoupled, and extensible framework that formalizes the MLE process. R&D-Agent structures the MLE workflow into two phases and six components, turning agent design for MLE from ad-hoc craftsmanship into a principled, testable process. Although several existing agents report promising gains on their chosen components, most of them can be summarized as partial optimizations of our framework's simple baseline. Inspired by human experts, we design efficient and effective agents within this framework that achieve state-of-the-art performance. Evaluated on MLE-Bench, the agent built on R&D-Agent ranks as the top-performing machine learning engineering agent, achieving a 35.1% any-medal rate, which demonstrates the framework's ability to speed up innovation and improve accuracy across a wide range of data science applications. We have open-sourced R&D-Agent on GitHub: https://github.com/microsoft/RD-Agent.

large language model, machine learning, natural language

Genre:
- Research Report > New Finding (0.46)

Industry:
- Education (1.00)
- Health & Medicine
  - Therapeutic Area (1.00)
  - Diagnostic Medicine (0.93)

Technology:
- Information Technology
  - Data Science (1.00)
  - Artificial Intelligence
    - Representation & Reasoning > Agents (1.00)
    - Natural Language > Large Language Model (1.00)
    - Cognitive Science (0.92)
    - Machine Learning > Neural Networks
      - Deep Learning (0.95)
