Performance Prediction for Large Systems via Text-to-Text Regression

Akhauri, Yash; Lewandowski, Bryan; Lin, Cheng-Hsi; Reyes, Adrian N.; Forbes, Grant C.; Wongpanich, Arissa; Yang, Bangding; Abdelfattah, Mohamed S.; Perel, Sagi; Song, Xingyou

Jun-30-2025–arXiv.org Artificial Intelligence

In many industries, predicting metric outcomes of large systems is a fundamental problem, traditionally addressed with tabular regression. However, such methods struggle on complex real-world systems data, such as configuration files or system logs, where feature engineering is often infeasible. We propose text-to-text regression as a general, scalable alternative. For predicting resource efficiency on Borg, Google's massive compute cluster scheduling system, a 60M-parameter encoder-decoder, trained from random initialization, achieves up to a near-perfect 0.99 (0.9 average) rank correlation across the entire fleet, and 100x lower MSE than tabular approaches. The model also adapts easily to new tasks with only 500 few-shot examples and captures the densities of complex outcome distributions. Ablation studies highlight the importance of using encoders, increasing sequence length, and the model's inherent uncertainty quantification. These findings pave the way for universal simulators of real-world outcomes.
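To make the idea of text-to-text regression concrete, the following is a minimal sketch of the two data-side steps it implies: turning a structured input into plain text, and turning a numeric target into a short token sequence that a decoder could emit. The abstract does not specify either format, so everything here is an assumption for illustration: the function names `serialize`, `encode_target`, and `decode_target`, the `key.path=value` layout, and the sign/mantissa/exponent token scheme are all hypothetical.

```python
import math

# Hypothetical helpers; the abstract does not describe the actual formats used.

def serialize(config, prefix=""):
    """Flatten a nested dict (string keys assumed) into 'key.path=value' lines."""
    lines = []
    for key in sorted(config):
        path = f"{prefix}.{key}" if prefix else str(key)
        value = config[key]
        if isinstance(value, dict):
            lines.append(serialize(value, path))
        else:
            lines.append(f"{path}={value}")
    return "\n".join(lines)

def encode_target(y, digits=4):
    """Encode a float as one sign token, `digits` mantissa tokens, one exponent token."""
    if y == 0:
        return ["<+>"] + ["<0>"] * digits + ["<E0>"]
    sign = "<+>" if y > 0 else "<->"
    exponent = math.floor(math.log10(abs(y))) - (digits - 1)
    mantissa = round(abs(y) / 10 ** exponent)
    if mantissa == 10 ** digits:  # rounding spilled into an extra digit
        mantissa //= 10
        exponent += 1
    digit_tokens = [f"<{d}>" for d in str(mantissa).zfill(digits)]
    return [sign] + digit_tokens + [f"<E{exponent}>"]

def decode_target(tokens):
    """Invert encode_target, up to the precision kept in the mantissa."""
    sign = 1.0 if tokens[0] == "<+>" else -1.0
    mantissa = int("".join(t[1:-1] for t in tokens[1:-1]))
    exponent = int(tokens[-1][2:-1])
    return sign * mantissa * 10.0 ** exponent

if __name__ == "__main__":
    example = {"job": {"cpu": 4, "tier": "prod"}, "cell": "a"}
    print(serialize(example))
    print(encode_target(0.9))
    print(decode_target(encode_target(-123.456)))
```

Under a scheme like this, the regression target becomes an ordinary output sequence, so a model can be trained with a standard next-token loss, and sampling several decoded sequences for one input gives an empirical distribution over outcomes rather than a single point estimate.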

machine learning, natural language, performance prediction

Country:
- South America > Brazil (0.04)
- Asia > Singapore (0.04)
- Oceania > Australia
  - New South Wales > Sydney (0.04)
- North America
  - United States
    - North Carolina (0.04)
    - Texas > Travis County
      - Austin (0.04)
    - New York > New York County
      - New York City (0.04)
    - California > Santa Clara County
      - Santa Clara (0.04)
  - Canada > Nova Scotia
    - Halifax Regional Municipality > Halifax (0.04)
- Europe > France
  - Nouvelle-Aquitaine > Gironde > Bordeaux (0.04)

Genre:
- Research Report > New Finding (0.67)

Industry:
- Information Technology (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language (1.00)
  - Machine Learning
    - Statistical Learning (0.68)
    - Neural Networks (0.46)
