Deep Learning Workloads


HammingMesh: A Network Topology for Large-Scale Deep Learning

Communications of the ACM

Numerous microarchitectural optimizations unlocked tremendous processing power for deep neural networks that in turn fueled the AI revolution. With the exhaustion of such optimizations, the growth of modern AI is now gated by the performance of training systems, especially their data movement. Instead of focusing on single accelerators, we investigate data-movement characteristics of large-scale training at full system scale. Based on our workload analysis, we design HammingMesh, a novel network topology that provides high bandwidth at low cost with high job scheduling flexibility. Specifically, HammingMesh can provide full bandwidth and isolation for deep learning training jobs with two dimensions of parallelism. Furthermore, it also supports high global bandwidth for generic traffic.
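The "two dimensions of parallelism" mentioned in the abstract (typically data parallelism and pipeline parallelism) can be made concrete by mapping accelerator ranks onto a 2D grid. The sketch below is illustrative only; the grid shape, names, and layout are assumptions for exposition, not the HammingMesh construction from the paper.

```python
# Map a flat accelerator rank onto a 2D grid so that one axis carries
# data-parallel traffic (gradient allreduce) and the other carries
# pipeline-parallel traffic (point-to-point activations).

def rank_to_coords(rank, dp_size, pp_size):
    """Return (data_parallel_index, pipeline_stage) for a rank."""
    assert 0 <= rank < dp_size * pp_size
    return rank // pp_size, rank % pp_size

def allreduce_group(rank, dp_size, pp_size):
    """Ranks that hold the same pipeline stage: they allreduce gradients."""
    _, stage = rank_to_coords(rank, dp_size, pp_size)
    return [d * pp_size + stage for d in range(dp_size)]

def pipeline_neighbors(rank, dp_size, pp_size):
    """Previous/next pipeline stage within the same data-parallel replica."""
    dp, stage = rank_to_coords(rank, dp_size, pp_size)
    prev = dp * pp_size + stage - 1 if stage > 0 else None
    nxt = dp * pp_size + stage + 1 if stage < pp_size - 1 else None
    return prev, nxt

# 8 accelerators arranged as 4 data-parallel replicas x 2 pipeline stages
print(rank_to_coords(5, 4, 2))      # (2, 1)
print(allreduce_group(5, 4, 2))     # [1, 3, 5, 7]
print(pipeline_neighbors(5, 4, 2))  # (4, None)
```

A topology can then provide high bandwidth along each grid axis while isolating jobs that occupy disjoint sub-grids.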


DeepContext: A Context-aware, Cross-platform, and Cross-framework Tool for Performance Profiling and Analysis of Deep Learning Workloads

Zhao, Qidong, Wu, Hao, Hao, Yuming, Ye, Zilingfeng, Li, Jiajia, Liu, Xu, Zhou, Keren

arXiv.org Artificial Intelligence

Effective performance profiling and analysis are essential for optimizing training and inference of deep learning models, especially given the growing complexity of heterogeneous computing environments. However, existing tools often lack the capability to provide comprehensive program context information and performance optimization insights for sophisticated interactions between CPUs and GPUs. This paper introduces DeepContext, a novel profiler that links program contexts across high-level Python code, deep learning frameworks, underlying libraries written in C/C++, and device code executed on GPUs. DeepContext incorporates measurements of both coarse- and fine-grained performance metrics for major deep learning frameworks, such as PyTorch and JAX, and is compatible with GPUs from both Nvidia and AMD, as well as various CPU architectures, including x86 and ARM. In addition, DeepContext integrates a novel GUI that allows users to quickly identify hotspots, and an automated performance analyzer that suggests potential optimizations based on performance metrics and program context. Through detailed use cases, we demonstrate how DeepContext can help users identify and analyze performance issues to enable quick and effective optimization of deep learning workloads. We believe DeepContext is a valuable tool for users seeking to optimize complex deep learning workflows across multiple compute environments.
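The Python-level slice of the program context that profilers like DeepContext capture can be illustrated with the standard library's cProfile. This is a generic sketch of hotspot attribution by call site, not DeepContext's actual API; the function names are invented for illustration.

```python
import cProfile
import io
import pstats

def dense_layer(x, w):
    # Naive matrix-vector product standing in for a model's hot loop.
    return [sum(xi * wij for xi, wij in zip(x, row)) for row in w]

def train_step():
    x = [1.0] * 256
    w = [[0.5] * 256 for _ in range(256)]
    for _ in range(20):
        dense_layer(x, w)

profiler = cProfile.Profile()
profiler.enable()
train_step()
profiler.disable()

# Report cumulative time per call site -- the Python "program context".
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print("dense_layer" in report)  # the hotspot shows up by function name
```

Tools like DeepContext extend this idea by stitching such Python call paths together with framework internals, C/C++ library frames, and GPU kernel launches.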


Microsoft's 'Singularity' to Enable Global Accelerator Network for AI Training

#artificialintelligence

In science fiction and future studies, the word "singularity" is invoked in reference to a rapidly snowballing artificial intelligence that, repeatedly iterating on itself, eclipses all human knowledge and ability. It is this word that Microsoft--perhaps ambitiously--has invoked for its new AI project, a "globally distributed scheduling service for highly efficient and reliable execution of deep learning training and inference workloads." Microsoft's Singularity is a response to the computational costs of training deep learning workloads--costs that have quickly spiraled as those workloads have grown in size, complexity and number. It is also an attempt to maximize the use of idle time, which has increasingly become a focus of discussions of how to minimize the costs and environmental footprints of high-performance computing systems and AI model training on such systems. "Singularity is built with one key goal," explains the preprint paper, which was written by a team of more than two dozen Microsoft researchers and published on arXiv, "driving down the cost of AI by maximizing the aggregate useful throughput on a given fixed pool of capacity of accelerators on a planet scale, while providing stringent [service-level agreements] for multiple pricing tiers."


HPTMT Parallel Operators for High Performance Data Science & Data Engineering

Abeykoon, Vibhatha, Kamburugamuve, Supun, Widanage, Chathura, Perera, Niranda, Uyar, Ahmet, Kanewala, Thejaka Amila, von Laszewski, Gregor, Fox, Geoffrey

arXiv.org Artificial Intelligence

Data-intensive applications are becoming commonplace in all science disciplines. They comprise a rich set of sub-domains such as data engineering, deep learning, and machine learning. These applications are built around efficient data abstractions and operators suited to the applications of different domains. The lack of a clear definition of data structures and operators in the field has often led to implementations that do not work well together. The HPTMT architecture that we proposed recently identifies a set of data structures, operators, and an execution model for creating rich data applications that link all aspects of data engineering and data science together efficiently. This paper elaborates and illustrates this architecture using an end-to-end application with deep learning and data engineering parts working together.


Cloud GPU Instances: What Are the Options? - DATAVERSITY

#artificialintelligence

If you're running demanding machine learning and deep learning models on your laptop or on GPU-equipped machines owned by your organization, there is a new and compelling alternative. All major cloud providers offer cloud GPUs – compute instances with powerful hardware acceleration, which you can rent per hour, letting you run deep learning workloads on the cloud. Let's review the concept of cloud GPUs and the offerings by the big three cloud providers – Amazon, Azure, and Google Cloud. A cloud graphics processing unit (GPU) provides hardware acceleration for an application, without requiring that a GPU be deployed on the user's local device.


Value Function Based Performance Optimization of Deep Learning Workloads

Steiner, Benoit, Cummins, Chris, He, Horace, Leather, Hugh

arXiv.org Artificial Intelligence

As machine learning techniques become ubiquitous, the efficiency of neural network implementations is becoming correspondingly paramount. Frameworks such as Halide and TVM separate the algorithmic representation of the network from the schedule that determines its implementation. Finding good schedules, however, remains extremely challenging. We model this scheduling problem as a sequence of optimization choices, and present a new technique to accurately predict the expected performance of a partial schedule. By leveraging these predictions we can make these optimization decisions greedily and rapidly identify an efficient schedule. This enables us to find schedules that improve the throughput of deep neural networks by 2.6x over Halide and 1.5x over TVM. Moreover, our technique is two to three orders of magnitude faster than these tools, completing in seconds instead of hours.
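The idea of greedily extending a schedule using a value estimate for partial schedules can be sketched as follows. The candidate optimizations and the toy multiplicative cost model are invented for illustration; the paper learns its performance predictor rather than hard-coding one.

```python
# Greedy schedule search guided by a value estimate of partial schedules.
# Each choice multiplies estimated runtime by a factor (< 1 is a win).
CHOICES = {
    "tile": 0.6,
    "vectorize": 0.5,
    "unroll": 0.9,
    "parallelize": 0.4,
}

def value(partial_schedule, base_runtime=100.0):
    """Predicted runtime of a partial schedule.
    Here a stand-in for a learned value function: apply chosen factors."""
    runtime = base_runtime
    for choice in partial_schedule:
        runtime *= CHOICES[choice]
    return runtime

def greedy_schedule(max_steps=3):
    schedule = []
    for _ in range(max_steps):
        remaining = [c for c in CHOICES if c not in schedule]
        # Extend with the choice whose predicted runtime is lowest.
        best = min(remaining, key=lambda c: value(schedule + [c]))
        if value(schedule + [best]) >= value(schedule):
            break  # no predicted improvement; stop early
        schedule.append(best)
    return schedule, value(schedule)

sched, runtime = greedy_schedule()
print(sched, runtime)  # largest multiplicative wins are picked first
```

Because each step only scores a handful of one-step extensions instead of enumerating complete schedules, the search is fast, which mirrors why the paper's approach completes in seconds rather than hours.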


Deep Learning: What You Need To Know

#artificialintelligence

During the past decade, deep learning has seen groundbreaking developments in the field of AI (Artificial Intelligence). But what is this technology? And why is it so important? Well, let's first get a definition of deep learning. Here's how Kalyan Kumar, who is the Corporate Vice President & Chief Technology Officer of IT Services at HCL Technologies, describes it: "Have you ever wondered how our brain can recognize the face of a friend whom you had met years ago or can recognize the voice of your mother among so many other voices in a crowded marketplace or how our brain can learn, plan and execute complex day-to-day activities? The human brain has around 100 billion cells called neurons. These build massively parallel and distributed networks, through which we learn and carry out complex activities. Inspired from these biological neural networks, scientists started building artificial neural networks so that computers could eventually learn and exhibit intelligence like humans."
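The biological analogy in the quote above can be made concrete with a single artificial neuron: a weighted sum of inputs passed through a nonlinearity. This is a minimal sketch; the weights, inputs, and bias are arbitrary illustrative values.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum + sigmoid 'firing rate'."""
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-activation))

# Two inputs, e.g. pixel intensities; output near 1 means the neuron "fires".
out = neuron([0.5, 0.8], [1.2, -0.4], bias=0.1)
print(round(out, 3))
```

Deep learning stacks many layers of such units and adjusts the weights from data, which is how networks come to recognize faces or voices as the quote describes.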


Managing GPU workloads with Univa Grid Engine - Univa Corporation

#artificialintelligence

For almost two decades, GPUs (Graphics Processing Units) have been steadily revolutionizing high-performance computing (HPC) and AI. Originally designed for graphics-intensive applications such as gaming and image processing, it didn't take long for HPC professionals to see the potential of low-cost, massively parallel processors able to handle then billions (and now trillions) of floating-point operations per second. In this two-part article, I'll discuss GPU workloads and how they are managed with Univa Grid Engine. First, I'll provide a short primer on GPUs, explain how they are used in HPC and AI, and cover some of the specific challenges when running GPU applications on shared clusters. In part II, I'll focus on some of the specific innovations in Univa Grid Engine that help make GPU applications much easier to deploy and manage at scale.


Fueling AI innovation with a new breed of accelerated computing

#artificialintelligence

The new HPE Apollo 6500 Gen10, announced today, is a groundbreaking server designed to tackle the most compute-intensive high performance computing (HPC) and deep learning workloads. With superior speed, density, and performance, HPE is reinventing what it means to compute. A major transformation is happening now, as technological advancements and escalating volumes of diverse data drive change across all industries. Cutting-edge innovations are fueling digital transformation on a global scale, and organizations are leveraging faster, more powerful machines to operate more intelligently and effectively than ever.


New White Paper: High-Performance Virtualized Spark Clusters on Kubernetes for Deep Learning - VMware VROOM! Blog

#artificialintelligence

A new white paper is available showing the advantages of running virtualized Spark Deep Learning workloads on Kubernetes. Recent versions of Spark include support for Kubernetes. For Spark on Kubernetes, the Kubernetes scheduler provides the cluster manager capability provided by Yet Another Resource Negotiator (YARN) in typical Spark on Hadoop clusters. Upon receiving a spark-submit command to start an application, Kubernetes instantiates the requested number of Spark executor pods, each with one or more Spark executors. The benefits of running Spark on Kubernetes are many: ease of deployment, resource sharing, simplifying the coordination between developer and cluster administrator, and enhanced security.