

GraphMend: Code Transformations for Fixing Graph Breaks in PyTorch 2

Savini Kashmira, Jayanaka Dantanarayana, Thamirawaran Sathiyalogeswaran, Yichao Yuan, Nishil Talati, Krisztian Flautner, Lingjia Tang, Jason Mars

arXiv.org Artificial Intelligence

This paper presents GraphMend, a high-level compiler that eliminates FX graph breaks in PyTorch 2 programs. Although PyTorch 2 introduced TorchDynamo and TorchInductor to enable just-in-time graph compilation, unresolved dynamic control flow and unsupported Python constructs often fragment models into multiple FX graphs. These fragments force frequent fallbacks to eager mode, incur costly CPU-to-GPU synchronizations, and reduce optimization opportunities. GraphMend addresses this limitation by analyzing and transforming source code before execution. Built on the Jac compilation framework, GraphMend introduces two code transformations that remove graph breaks due to dynamic control flow and Python I/O functions. This design allows PyTorch's compilation pipeline to capture larger, uninterrupted FX graphs without requiring manual refactoring by developers. Evaluation across eight Hugging Face models shows that GraphMend removes all fixable graph breaks due to dynamic control flow and Python I/O functions, driving the break count to 0 in 6 models and reducing it from 5 to 2 in another model. On NVIDIA RTX 3090 and A40 GPUs, GraphMend achieves up to 75% latency reductions and up to 8% higher end-to-end throughput. These results demonstrate that high-level code transformation is an effective complement to PyTorch's dynamic JIT compilation pipeline, substantially improving both usability and performance.
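To make the failure mode concrete, consider a minimal sketch (illustrative only; the paper's actual transformations are not reproduced here). A data-dependent Python branch forces TorchDynamo to split the FX graph, while a source-level rewrite, shown here with torch.where, keeps the whole function in one graph:

```python
import torch

# A data-dependent branch: evaluating `x.sum() > 0` needs the tensor's value,
# so TorchDynamo stops tracing, falls back to eager, and splits the FX graph.
def forward_with_break(x):
    if x.sum() > 0:
        return x * 2
    return x - 1

# One possible source-level rewrite (an illustrative assumption, not the
# paper's exact transformation): compute both branches and select with
# torch.where, so the whole function stays inside a single FX graph.
def forward_no_break(x):
    return torch.where(x.sum() > 0, x * 2, x - 1)

compiled = torch.compile(forward_no_break)
print(compiled(torch.randn(4)))
```

Note that torch.where evaluates both branches, so the rewrite trades a graph break for some redundant computation; deciding when such a trade is profitable is exactly the kind of judgment a source-level tool can automate.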


PyTorch 2.0

#artificialintelligence

Both DistributedDataParallel (DDP) and FullyShardedDataParallel (FSDP) work in compiled mode and provide improved performance and memory utilization relative to eager mode, with some caveats and limitations. External launcher scripts and wrappers that simply apply DDP under the hood generally should work out of the box. Hugging Face Accelerate, Lightning, torchrun, and Ray Train have all been tested and verified working. DeepSpeed and Horovod have not been tested and we expect to enable them soon.
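As a concrete illustration, here is a minimal sketch of compiling a DDP-wrapped model (assuming a `torchrun --nproc_per_node=N` launch; the model and sizes are placeholders):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Assumes launch via torchrun, which sets RANK/LOCAL_RANK/WORLD_SIZE.
dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(512, 512).to(local_rank)  # placeholder model
ddp_model = DDP(model, device_ids=[local_rank])

# Compile the DDP-wrapped module. TorchDynamo's DDP optimizer splits the
# graph at gradient-bucket boundaries so all-reduce still overlaps backward.
compiled = torch.compile(ddp_model)

loss = compiled(torch.randn(8, 512, device=local_rank)).sum()
loss.backward()
dist.destroy_process_group()
```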


GitHub - pytorch/torchdynamo: A Python-level JIT compiler designed to make unmodified PyTorch programs faster.

#artificialintelligence

TorchDynamo is a Python-level JIT compiler designed to make unmodified PyTorch programs faster. It hooks into the frame evaluation API in CPython (PEP 523) to dynamically modify Python bytecode right before it is executed, rewriting it to extract sequences of PyTorch operations into an FX graph that is then just-in-time compiled with an ensemble of different backends and autotuning. By mixing Python execution with compiled backends in this way, TorchDynamo aims to deliver the best of both worlds: usability and performance. TorchDynamo is experimental and under active development.
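To make that description concrete, the sketch below uses the torch.compile entry point that ships TorchDynamo in PyTorch 2 and registers a trivial custom backend (inspect_backend is our hypothetical name) that prints the captured FX graph and then runs it eagerly:

```python
import torch

# A custom backend receives each FX graph TorchDynamo captures, plus
# example inputs, and must return a callable that executes the graph.
def inspect_backend(gm: torch.fx.GraphModule, example_inputs):
    print(gm.graph)    # dump the sequence of PyTorch ops that was extracted
    return gm.forward  # no real compilation: just run the graph eagerly

@torch.compile(backend=inspect_backend)
def fn(x):
    return torch.relu(x) + 1

fn(torch.randn(8))  # first call triggers bytecode rewriting and graph capture
```

Subsequent calls with matching tensor properties pass TorchDynamo's guard checks and reuse the captured graph rather than re-tracing; a real backend such as TorchInductor would compile gm instead of returning it unchanged.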