virtualization


Council Post: The Next Step In Digital Transformation Is Software-Defined X

#artificialintelligence

Today's cloud was made possible by virtualization technology, which creates a software-based representation of hardware equipment. Virtual machines, such as those popularized by VMware, and the hypervisor technology that manages VM execution make it possible to run different software on the same physical machine. This concept is now expanding beyond the cloud to the physical world through software that controls autonomous robots. I call this software-defined X: any physical task (X), from cleaning the floor at an airport terminal to delivering an item from one end of a warehouse to the other, can now be controlled through software. This takes "digital transformation" to its logical conclusion.


Running AI workloads is coming to a virtual machine near you, powered by GPUs and Kubernetes

ZDNet

Run:AI offers a virtualization layer for AI workloads, aiming to streamline the use of AI infrastructure. It is seeing a lot of traction and has just raised a $75M Series C funding round. Here's how the evolution of the AI landscape has shaped its growth.


How Virtual GPUs Enhance Sharing in Kubernetes for Machine Learning on VMware vSphere

#artificialintelligence

This approach optimizes the use of the GPU hardware, which can then serve more than one user, reducing costs. A basic level of familiarity with the core concepts in Kubernetes and in GPU acceleration will be useful to the reader of this article. We first look more closely at pods in Kubernetes and how they relate to a GPU. A pod is the lowest-level unit of deployment in Kubernetes. A pod can have one or more containers within it. The lifetimes of the containers within a pod tend to be about the same, although one container may start before the others, as the "init" container. You can deploy higher-level objects, such as Kubernetes services and deployments, that contain many pods. We focus on pods and their use of GPUs in this article. Given access rights to a Tanzu Kubernetes cluster (TKC) running on the VMware vSphere with Tanzu environment (i.e. a set of host servers running the ESXi hypervisor, managed by VMware vCenter), a user can issue the command:
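A minimal sketch of that kind of GPU request, using the Kubernetes Python client rather than kubectl, might look like the following. The pod name, container image, and namespace are illustrative assumptions, the nvidia.com/gpu resource name assumes the standard NVIDIA device plugin is installed on the cluster's nodes, and this is not the exact command from the article.

```python
# Minimal sketch: request one NVIDIA GPU for a pod using the Kubernetes Python client.
# Assumes a reachable cluster (e.g. a Tanzu Kubernetes cluster), a valid kubeconfig,
# and the NVIDIA device plugin exposing the "nvidia.com/gpu" resource on the nodes.
from kubernetes import client, config

def create_gpu_pod(namespace: str = "default") -> None:
    config.load_kube_config()  # read the user's kubeconfig for the target cluster

    container = client.V1Container(
        name="cuda-check",                                    # illustrative name
        image="nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04",  # illustrative image
        command=["nvidia-smi"],                               # print the GPU visible to the pod
        resources=client.V1ResourceRequirements(
            limits={"nvidia.com/gpu": "1"}                    # ask the scheduler for one GPU/vGPU
        ),
    )
    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="gpu-demo-pod"),    # illustrative pod name
        spec=client.V1PodSpec(containers=[container], restart_policy="Never"),
    )
    client.CoreV1Api().create_namespaced_pod(namespace=namespace, body=pod)

if __name__ == "__main__":
    create_gpu_pod()
```

Submitting the request through the API is equivalent to applying the same pod specification with kubectl; either way, the Kubernetes scheduler will only place the pod on a node that advertises an available nvidia.com/gpu resource.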


Determining GPU Memory for Machine Learning Applications on VMware vSphere with Tanzu

#artificialintelligence

VMware vSphere with Tanzu provides users with the ability to easily construct a Kubernetes cluster on demand for model development/test or deployment work in machine learning applications. These on-demand clusters are called Tanzu Kubernetes clusters (TKC) and their participating nodes, just like VMs, can be sized as required using a YAML specification. In a TKC running on vSphere with Tanzu, each Kubernetes node is implemented as a virtual machine. Kubernetes pods are scheduled onto these nodes or VMs by the Kubernetes scheduler running in the Control Plane VMs in that cluster. To accelerate machine learning training or inference code, one or more of these pods require a GPU or virtual GPU (vGPU) to be associated with them.
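For a concrete sense of what "determining GPU memory" means once a GPU or vGPU has been attached to a pod, the sketch below queries the memory visible to a process running on such a node. It assumes PyTorch with CUDA support, which is an illustrative choice rather than anything prescribed by the article.

```python
# Rough sketch: inspect the GPU (or vGPU) memory visible to an ML workload,
# e.g. from inside a pod scheduled onto a GPU-enabled TKC node.
# Assumes PyTorch with CUDA support is installed; the article itself may use
# different tooling to size GPU memory.
import torch

def report_gpu_memory() -> None:
    if not torch.cuda.is_available():
        print("No GPU/vGPU is visible to this process.")
        return
    for idx in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(idx)
        total_gib = props.total_memory / 1024**3
        allocated_gib = torch.cuda.memory_allocated(idx) / 1024**3
        print(f"device {idx}: {props.name}, "
              f"total {total_gib:.1f} GiB, allocated {allocated_gib:.1f} GiB")

if __name__ == "__main__":
    report_gpu_memory()
```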


Nvidia adds container support into AI Enterprise suite

#artificialintelligence

Nvidia has rolled out the latest version of its AI Enterprise suite for GPU-accelerated workloads, adding integration with VMware's vSphere with Tanzu to enable organisations to run workloads both in containers and inside virtual machines. Available now, Nvidia AI Enterprise 1.1 is an updated release of the suite that Nvidia delivered last year in collaboration with VMware. It is essentially a collection of enterprise-grade AI tools and frameworks certified and supported by Nvidia to help organisations develop and operate a range of AI applications. That is, so long as those organisations are running VMware, which a great many enterprises still use to manage virtual machines across their environments, though many others do not. However, as noted by Gary Chen, research director for Software Defined Compute at IDC, deploying AI workloads is a complex task requiring orchestration across many layers of infrastructure.


Exploring the Impact of Virtualization on the Usability of the Deep Learning Applications

arXiv.org Artificial Intelligence

Deep Learning-based (DL) applications are becoming increasingly popular and advancing at an unprecedented pace. While much research is being undertaken to enhance Deep Neural Networks (DNNs) -- the centerpiece of DL applications -- the practical deployment challenges of these applications in Cloud and Edge systems, and their impact on the usability of the applications, have not been sufficiently investigated. In particular, the impact of deploying different virtualization platforms, offered by the Cloud and Edge, on the usability of DL applications (in terms of the End-to-End (E2E) inference time) has remained an open question. Importantly, resource elasticity (by means of scale-up), CPU pinning, and processor type (CPU vs GPU) configurations have been shown to influence the virtualization overhead. Accordingly, the goal of this research is to study the impact of these potentially decisive deployment options on the E2E performance, and thus the usability, of DL applications. To that end, we measure the impact of four popular execution platforms (namely, bare-metal, virtual machine (VM), container, and container in VM) on the E2E inference time of four types of DL applications, while changing the processor configuration (scale-up, CPU pinning) and processor type. This study reveals a set of interesting and sometimes counter-intuitive findings that can be used as best practices by Cloud solution architects to efficiently deploy DL applications in various systems. The notable finding is that solution architects must be aware of the DL application characteristics, particularly their pre- and post-processing requirements, to be able to optimally choose and configure an execution platform, determine the use of a GPU, and decide on the efficient scale-up range.
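To make the E2E notion concrete, the sketch below times the pre-processing, inference, and post-processing stages of a request separately. The functions are placeholders standing in for the study's DL applications, not a reproduction of its measurement harness or workloads.

```python
# Illustrative sketch of E2E timing: split end-to-end inference time into
# pre-processing, model inference, and post-processing. The processing steps
# are placeholders, not the applications used in the paper.
import time

def preprocess(raw_input):
    # stand-in for image decoding, resizing, tokenisation, etc.
    return [x / 255.0 for x in raw_input]

def run_model(features):
    # stand-in for the DNN forward pass (on CPU or GPU)
    return sum(features)

def postprocess(output):
    # stand-in for thresholding, decoding, and formatting results
    return {"score": output}

def timed_e2e(raw_input):
    t0 = time.perf_counter()
    features = preprocess(raw_input)
    t1 = time.perf_counter()
    output = run_model(features)
    t2 = time.perf_counter()
    result = postprocess(output)
    t3 = time.perf_counter()
    timings = {"pre_s": t1 - t0, "infer_s": t2 - t1,
               "post_s": t3 - t2, "e2e_s": t3 - t0}
    return result, timings

if __name__ == "__main__":
    _, timings = timed_e2e(list(range(256)))
    print(timings)
```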


The 5 Biggest Technology Trends In 2022

#artificialintelligence

In 2022, the COVID-19 pandemic will continue to impact our lives in many ways. This means that we will continue to see an accelerated rate of digitization and virtualization of business and society. However, as we move into a new year, the need for sustainability, ever-increasing data volumes, and increasing compute and network speeds will begin to regain their status as the most important drivers of digital transformation. For many individuals and organizations, the most important lesson of the last two years or so has been that truly transformative change isn't as difficult to implement as might once have been thought, if the motivation is there. As a society, we will undoubtedly continue to harness this newfound openness to flexibility, agility, and innovative thinking, as the focus shifts from merely attempting to survive in a changing world to thriving in it. With that in mind, here are my predictions for the specific trends that are likely to have the biggest impact in 2022.


Multi-Cloud For Modern Enterprises - Why And Why Not?

#artificialintelligence

Cloud adoption is accelerating fast in enterprises surging towards modernity. But are there better ways to utilize the full potential of cloud computing? Leaving behind the constraints of a single cloud computing platform, you will find various other arrangements, such as hybrid and multi-cloud computing. The annual RightScale State of the Cloud Report suggests that 90% of respondents believe multi-cloud is already the most common pattern among businesses and enterprises. So, let's delve into understanding more about multi-cloud for modern enterprises.


Ceph as a Secret Weapon for HPC

#artificialintelligence

Ceph is open-source software-defined storage (SDS) designed to provide highly scalable object-, block- and file-based storage under a unified system, setting it apart from other SDS solutions. It decouples data from the physical storage hardware through software abstraction layers, providing scaling and fault-management capabilities. As a distributed storage framework, Ceph has typically been used for high-bandwidth, medium-latency applications such as content delivery, archive storage, or block storage for virtualization. Its inherent scale-out support allows an organization to build large systems as demand increases. Additionally, it supports enterprise-grade features such as erasure coding, thin provisioning, cloning, load balancing, automated tiering between flash and hard drives, and simplified maintenance and debugging.
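As a small illustration of Ceph's object interface, the sketch below writes and reads one object through the librados Python bindings. The configuration file path, pool name, and object name are assumptions for the example, and a reachable Ceph cluster with that pool must already exist.

```python
# Minimal sketch of Ceph's object interface via the librados Python bindings
# (the "rados" module, usually installed as python3-rados). Config path, pool
# name, and object name below are illustrative assumptions.
import rados

def write_and_read_object() -> bytes:
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")  # assumed config path
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx("demo-pool")  # pool must already exist
        try:
            ioctx.write_full("hello-object", b"stored via librados")
            return ioctx.read("hello-object")
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

if __name__ == "__main__":
    print(write_and_read_object())
```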

