AITopics | infiniband

infiniband

Quiver: Supporting GPUs for Low-Latency, High-Throughput GNN Serving with Workload Awareness

Tan, Zeyuan, Yuan, Xiulong, He, Congjie, Sit, Man-Kit, Li, Guo, Liu, Xiaoze, Ai, Baole, Zeng, Kai, Pietzuch, Peter, Mai, Luo

arXiv.org Artificial Intelligence May-18-2023

Systems for serving inference requests on graph neural networks (GNN) must combine low latency with high throughput, but they face irregular computation due to skew in the number of sampled graph nodes and aggregated GNN features. This makes it challenging to exploit GPUs effectively: using GPUs to sample only a few graph nodes yields lower performance than CPU-based sampling, and aggregating many features incurs high data movement costs between GPUs and CPUs. Therefore, current GNN serving systems use CPUs for graph sampling and feature aggregation, limiting throughput. We describe Quiver, a distributed GPU-based GNN serving system with low latency and high throughput. Quiver's key idea is to exploit workload metrics for predicting the irregular computation of GNN requests and for governing the use of GPUs for graph sampling and feature aggregation: (1) for graph sampling, Quiver calculates the probabilistic sampled graph size, a metric that predicts the degree of parallelism in graph sampling. Quiver uses this metric to assign sampling tasks to GPUs only when the performance gains surpass CPU-based sampling; and (2) for feature aggregation, Quiver relies on the feature access probability to decide which features to partition and replicate across a distributed GPU NUMA topology. We show that Quiver achieves up to 35 times lower latency with 8 times higher throughput compared to state-of-the-art GNN approaches (DGL and PyG).
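The routing idea in point (1) of the abstract can be illustrated with a small sketch. This is not Quiver's implementation or its exact metric: the function names, the uniform-sampling assumption, and the `gpu_threshold` value are all hypothetical, chosen only to show how a predicted sample size could steer a request to a CPU or GPU sampler.

```python
from typing import Dict, List, Sequence

Graph = Dict[int, List[int]]  # adjacency list: node -> neighbours


def expected_sampled_size(graph: Graph, seed: int, fanouts: Sequence[int]) -> float:
    """Estimate how many node visits a multi-hop neighbour sampler makes from `seed`.

    Simplifying assumptions (not taken from the paper): at each hop a node
    contributes min(degree, fanout) neighbours, picked uniformly, and visits
    are counted with multiplicity rather than deduplicated.
    """
    # frontier[v] = expected number of times v appears in the current hop
    frontier: Dict[int, float] = {seed: 1.0}
    total = 1.0  # the seed itself
    for fanout in fanouts:
        next_frontier: Dict[int, float] = {}
        for node, weight in frontier.items():
            neighbours = graph.get(node, [])
            if not neighbours:
                continue
            picks = min(len(neighbours), fanout)
            share = weight * picks / len(neighbours)
            for nb in neighbours:
                next_frontier[nb] = next_frontier.get(nb, 0.0) + share
        total += sum(next_frontier.values())
        frontier = next_frontier
    return total


def choose_device(size_estimate: float, gpu_threshold: float) -> str:
    """Route to the GPU only when the predicted work is large enough (hypothetical rule)."""
    return "gpu" if size_estimate >= gpu_threshold else "cpu"


if __name__ == "__main__":
    star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
    for seed in (0, 1):
        size = expected_sampled_size(star, seed, fanouts=[2, 2])
        print(seed, round(size, 3), choose_device(size, gpu_threshold=4.0))
```

In a real system the threshold would presumably be calibrated by measuring CPU and GPU sampling latency on the target hardware rather than fixed by hand.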

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2305.10863

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry: Information Technology (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Nvidia Doubles Down on AI Supercomputing

#artificialintelligence Dec-13-2020, 13:45:27 GMT

Nvidia has outpaced itself with so many new GPUs for large-scale computing over the last several years that its strategy now seems to be to hold some near-term capability back so that big releases land at the expected time of year. In today's case that timing is the annual Supercomputing Conference (SC20), and while there is nothing entirely new to marvel at GPU-wise, there is a definite doubling of capacity and capability. The GPU maker announced that its A100 GPUs are capable of double the memory and performance with the addition of 80GB HBM2e devices, which have already shipped to some of its biggest HPC "Superpod" customers and in its DGX systems, with wider availability via its partner network beginning in January 2021. For those partners the transition is simple: the capability and capacity jump adds another option without any overhead beyond the 400W of the 40GB GPUs, and for the early Superpod customers it's a simple tray shift, according to Paresh Kharya, Senior Director of Product Management at Nvidia. Having something new to announce is one thing, but it is likely that, without some delays, the original A100 would already have had the 80GB of memory.

nvidia, platform, simulation, (14 more...)

#artificialintelligence

Industry: Information Technology > Hardware (0.92)

Technology: Information Technology > Artificial Intelligence (1.00)

Lenovo, Nvidia partnership bridges HPC and enterprise AI with switches for optimized networking - SiliconANGLE

#artificialintelligence Sep-16-2020, 18:25:36 GMT

Artificial intelligence is fast becoming a part of the everyday enterprise workflow, but the computing infrastructure to support such a data-intense task must modernize. As businesses transform to better leverage data intelligence and become more agile through cloud-native processes, high-performance networking becomes a priority. But investing in the InfiniBand standard for high-performance computing network switches has been a hard sell for information-technology departments with an existing Ethernet fabric in place. Helping enterprises catch the fast train to intelligent business operations are long-time partners Nvidia Corp. and Lenovo Group Ltd. "We love, from an HPC perspective, to use InfiniBand," said Scott Tease, general manager of HPC and AI at Lenovo. "But most enterprise clients are using Ethernet. We go to a partner that we've trusted for a very long time. And we selected the Nvidia Mellanox Ethernet switch family."

artificial intelligence, siliconangle, workload, (12 more...)

#artificialintelligence

Country: North America > United States > California (0.05)

Industry: Information Technology > Hardware (1.00)

Technology: Information Technology > Artificial Intelligence (1.00)

WekaIO CEO says focus will stay on AI, life sciences

#artificialintelligence Apr-26-2018, 16:16:16 GMT

WekaIO CEO Liran Zvibel has a two-pronged plan for launching the parallel-file-system startup to success: He intends...

artificial intelligence, cloud computing, zvibel, (17 more...)

#artificialintelligence

Country: Asia > Middle East > Israel (0.15)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence (1.00)

Designing HPC, Deep Learning, and Cloud Middleware for Exascale Systems - insideHPC

#artificialintelligence Feb-25-2018, 15:54:02 GMT

In this video from the Stanford HPC Conference, DK Panda from Ohio State University presents: Designing HPC, Deep Learning, and Cloud Middleware for Exascale Systems. "This talk will focus on challenges in designing HPC, Deep Learning, and HPC Cloud middleware for Exascale systems with millions of processors and accelerators. For the HPC domain, we will discuss the challenges in designing runtime environments for MPI+X (PGAS-OpenSHMEM/UPC/CAF/UPC++, OpenMP and CUDA) programming models by taking into account support for multi-core systems (KNL and OpenPower), high-performance networks, GPGPUs (including GPUDirect RDMA) and energy awareness. Features and sample performance numbers from MVAPICH2 libraries will be presented. For the Deep Learning domain, we will focus on popular Deep Learning frameworks (Caffe, CNTK, and TensorFlow) to extract performance and scalability with the MVAPICH2-GDR MPI library and RDMA-enabled Big Data stacks. Finally, we will outline the challenges in moving these middleware to the Cloud environments."

artificial intelligence, deep learning, machine learning, (13 more...)

#artificialintelligence

Country: North America > United States > Ohio (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mellanox Technologies (MLNX) Q1 2017 Results - Earnings Call Transcript

#artificialintelligence Apr-27-2017, 02:24:12 GMT

At this time, all participants have been placed in a listen-only mode. And the floor will be open for your questions following the presentation. As a reminder, this conference is being recorded. And now I would like to turn the conference over to Mellanox. Leading the call today will be Eyal Waldman, President and CEO of Mellanox Technologies; and Jacob Shulman, Chief Financial Officer. By now, you've seen our press release and associated financial information that we furnished to the SEC on Form 8-K this afternoon. If not, you may access them on our website at ir.mellanox.com. As a reminder, today's discussion includes predictions, expectations, estimates and other information, all of which we consider to be forward-looking statements. Throughout today's discussion, we present important factors relating to our business that may potentially affect these forward-looking statements. These forward-looking statements are also subject to risks and uncertainties that may cause actual results to differ materially from statements made today. As a result, we caution you against placing undue reliance on these forward-looking statements. And we encourage you to review our most recent SEC reports, including our 10-K and 10-Q, for a complete discussion of these factors and other risks that may affect our future results or the market price of our ordinary shares.

customer, infiniband, revenue, (17 more...)

#artificialintelligence

Country:

North America > United States (0.48)
Europe (0.04)
Asia > Middle East > Israel (0.04)

Genre:

Personal > Interview (1.00)
Financial News (1.00)

Industry:

Law > Business Law (0.34)
Government > Regional Government > North America Government > United States Government (0.34)
Banking & Finance > Trading (0.34)

Technology:

Information Technology > Cloud Computing (0.47)
Information Technology > Artificial Intelligence (0.46)
Information Technology > Communications (0.46)

InfiniBand will reach 200-gigabit speed next year

PCWorld Nov-10-2016, 15:10:50 GMT

InfiniBand is set to hit 200Gbps (bits per second) in products that were announced Thursday, potentially accelerating machine-learning platforms as well as HPC (high-performance computing) systems. The massive computing performance of new servers equipped with GPUs calls for high network speeds, and these systems are quickly being deployed to handle machine-learning tasks, Dell'Oro Group analyst Sameh Boujelbene said. So-called HDR InfiniBand, which will be generally available next year in three sets of products from Mellanox Technologies, will double the top speed of InfiniBand. It will also have twice the top speed of Ethernet. But the high-performance crowd that's likely to adopt this new interconnect is a small one, Boujelbene said. Look for the top 10 percent of InfiniBand users, who already use 100Gbps InfiniBand, to jump on the new stuff, she said.

artificial intelligence, infiniband, machine learning, (9 more...)

PCWorld

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
