Distributed Machine Learning on VMware vSphere with GPUs and Kubernetes: a Webinar - Virtualize Applications

Oct-12-2019, 22:49:21 GMT–#artificialintelligence

This article directs you to a recent webinar that VMware produced on the topic of executing distributed machine learning with TensorFlow and Horovod running on a set of VMs on multiple vSphere host servers. Many machine learning problems are tackled using a single host server today (with a collection of VMs on that host). However, when your ML model or data grows too large for one host to handle, or your GPU power happens to be dispersed across several physical host servers/VMs, then distribution is the mechanism used to tackle that scenario. The VMware webinar introduces the concepts of machine learning in general first. It then gives a short description of Horovod for distributed training and explains the importance of low latency networking between the nodes in the distributed model, based here on Mellanox RDMA over Converged Ethernet (RoCE) technology.

gpus and kubernetes, virtualize application, vsphere, (13 more...)

#artificialintelligence

Oct-12-2019, 22:49:21 GMT

News Web Page

Add feedback

Genre:
- Instructional Material > Course Syllabus & Notes (0.37)

Industry:
- Information Technology > Software (0.87)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found