Thank you for the insightful comments and the opportunity to follow up

Neural Information Processing Systems

Thank you for the insightful comments and the opportunity to follow up. We compare additional baselines (including PyTorch's native implementation) to Nimble and measure their performance. Note that TensorRT and TVM do not currently support training. Figure 1: Speedup over TensorRT on inference workloads (batch size 1) using a V100. Figure 2: Speedup over PyTorch on training using a V100.


Single-GPU GNN Systems: Traps and Pitfalls

Gong, Yidong, Tarafder, Arnab, Afrin, Saima, Kumar, Pradeep

arXiv.org Artificial Intelligence

Current graph neural network (GNN) systems have established a clear trend of not reporting training accuracy results and of relying, directly or indirectly, largely on smaller datasets for evaluation. Our in-depth analysis shows that this leads to a chain of pitfalls in the system design and evaluation process, calling into question the practicality of many of the proposed system optimizations and affecting conclusions and lessons learned. We analyze many single-GPU systems and show the fundamental impact of these pitfalls. We further develop hypotheses, recommendations, and evaluation methodologies, and provide future directions. Finally, we develop a new reference system to establish a new line of optimizations rooted in solving the system-design pitfalls efficiently and practically. The proposed design can be productively integrated into prior works, thereby truly advancing the state of the art.


The Framework Tax: Disparities Between Inference Efficiency in NLP Research and Deployment

Fernandez, Jared, Kahn, Jacob, Na, Clara, Bisk, Yonatan, Strubell, Emma

arXiv.org Artificial Intelligence

Increased focus on the computational efficiency of NLP systems has motivated the design of efficient model architectures and improvements to underlying hardware accelerators. However, the resulting increases in computational throughput and reductions in floating point operations have not directly translated to improvements in wall-clock inference latency. We demonstrate that these discrepancies can be largely attributed to bottlenecks introduced by deep learning frameworks. We denote this phenomenon the "framework tax", and observe that the disparity is growing as hardware speeds increase over time. In this work, we examine this phenomenon through a series of case studies analyzing the effects of model design decisions, framework paradigms, and hardware platforms on total model latency. Code is available at https://github.com/JaredFern/Framework-Tax.
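The core observation above, that latency has a fixed per-call component independent of the workload's FLOPs, can be illustrated with a minimal timing sketch. The snippet below is not from the paper's codebase; it fits the simple linear model latency(n) = a + b·n at two problem sizes, where the intercept `a` stands in for the fixed per-call ("framework tax") overhead and the slope `b` for per-element compute cost. The toy kernel and all names here are hypothetical illustrations.

```python
import time

def estimate_fixed_overhead(fn, small_n, large_n, reps=200):
    """Fit latency(n) = a + b * n from average timings at two sizes.

    Intercept `a` approximates the fixed per-call overhead (the
    "framework tax"); slope `b` approximates per-element compute cost.
    `fn` is a hypothetical stand-in for a framework-dispatched kernel.
    """
    def avg_latency(n):
        start = time.perf_counter()
        for _ in range(reps):
            fn(n)
        return (time.perf_counter() - start) / reps

    t_small = avg_latency(small_n)
    t_large = avg_latency(large_n)
    b = (t_large - t_small) / (large_n - small_n)  # per-element cost
    a = t_small - b * small_n                      # fixed per-call cost
    return a, b

# Toy "kernel" whose cost grows linearly with n, plus Python call overhead.
overhead, per_elem = estimate_fixed_overhead(
    lambda n: sum(range(n)), small_n=1_000, large_n=100_000)
print(f"fixed overhead ~ {overhead * 1e6:.1f} us/call, "
      f"per-element ~ {per_elem * 1e9:.2f} ns")
```

When the workload is small, the fixed term `a` dominates total latency, which is why shrinking FLOPs alone does not proportionally shrink wall-clock time.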


PyTorch, a year in....

#artificialintelligence

Today marks 1 year since PyTorch was released publicly. It's been a wild ride: a year in our quest to build a flexible deep learning research platform. Over the last year, we've seen an amazing community of people using, contributing to, and evangelizing PyTorch -- thank you for the love. Looking back, we wanted to summarize PyTorch over the past year: the progress, the news, and highlights from the community. We've been blessed with a strong organic community of researchers and engineers who fell in love with PyTorch.