Remote Labor Index: Measuring AI Automation of Remote Work

Mazeika, Mantas; Gatti, Alice; Menghini, Cristina; Sehwag, Udari Madhushani; Singhal, Shivam; Orlovskiy, Yury; Basart, Steven; Sharma, Manasi; Peskoff, Denis; Lau, Elaine; Lim, Jaehyuk; Carroll, Lachlan; Blair, Alice; Sivakumar, Vinaya; Basu, Sumana; Kenstler, Brad; Ma, Yuntao; Michael, Julian; Li, Xiaoke; Ingebretsen, Oliver; Mehta, Aditya; Mottola, Jean; Teichmann, John; Yu, Kevin; Shaik, Zaina; Khoja, Adam; Ren, Richard; Hausenloy, Jason; Phan, Long; Htet, Ye; Aich, Ankit; Rabbani, Tahseen; Shah, Vivswan; Novykov, Andriy; Binder, Felix; Chugunov, Kirill; Ramirez, Luis; Geralnik, Matias; Mesura, Hernán; Lee, Dean; Cardona, Ed-Yeremai Hernandez; Diamond, Annette; Yue, Summer; Wang, Alexandr; Liu, Bing; Hernandez, Ernesto; Hendrycks, Dan

Oct-31-2025–arXiv.org Artificial Intelligence

AIs have made rapid progress on research-oriented benchmarks of knowledge and reasoning, but it remains unclear how these gains translate into economic value and automation. To measure this, we introduce the Remote Labor Index (RLI), a broadly multi-sector benchmark comprising real-world, economically valuable projects designed to evaluate end-to-end agent performance in practical settings. AI agents perform near the floor on RLI, with the highest-performing agent achieving an automation rate of 2.5%. These results help ground discussions of AI automation in empirical evidence, setting a common basis for tracking AI impacts and enabling stakeholders to proactively navigate AI-driven labor automation.

large language model, machine learning, natural language

Genre:
- Research Report > New Finding (0.93)

Industry:
- Information Technology (1.00)
- Banking & Finance > Economy (0.93)
- Law (0.68)
- Government (0.68)
- Leisure & Entertainment > Games
  - Computer Games (0.46)
- Health & Medicine > Therapeutic Area
  - Psychiatry/Psychology > Mental Health (0.41)

Technology:
- Information Technology
  - Data Science (1.00)
  - Communications (1.00)
  - Artificial Intelligence
    - Applied AI (0.93)
    - Representation & Reasoning > Agents (0.90)
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (1.00)
    - Machine Learning
      - Neural Networks > Deep Learning (0.71)
      - Performance Analysis > Accuracy (0.68)
