REASONINGCOMPILER: LLM-Guided Optimizations for Efficient Model Serving

Jun-20-2026, 09:27:15 GMT–Neural Information Processing Systems

While model serving has unlocked unprecedented capabilities, the high cost of serving large-scale models remains a significant barrier to widespread accessibility and rapid innovation. Compiler optimizations have long driven substantial performance improvements, but existing compilers struggle with neural workloads because the space of possible transformations is exponentially large and highly interdependent. Existing stochastic search techniques can be effective, but they are often sample-inefficient and fail to leverage the structural context underlying compilation decisions. We investigate whether reasoning with large language models (LLMs), without any retraining, can exploit the context-aware decision space of compiler optimizations to significantly improve sample efficiency.

large language model, machine learning, natural language

Country:
- North America > United States > California (0.15)

Genre:
- Research Report
  - New Finding (0.48)
  - Experimental Study (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.55)
