If you’ve ever wished you could do everything a hypervisor and regular virtual machines let you do, but on a GPU cluster – in your own data center or in the cloud – Nvidia and VMware are now saying your wish is about to come true.
Monday morning, in conjunction with the start of VMworld in San Francisco, the two companies announced that VMware Cloud on AWS, the VMware-operated cloud service running on bare-metal infrastructure in AWS data centers, will soon feature virtualized GPUs you’ll be able to provision and manage using the same vSphere tools you use with regular VM infrastructure. You’ll be able to share a single physical GPU among multiple VMs, but you’ll also be able to aggregate the power of many GPUs to train a machine-learning model at massive scale, the companies said.
The play here is to get VMware into the infrastructure mix for the emerging set of enterprise computing workloads that benefit from GPU acceleration, such as AI and machine learning, as well as more traditional Big Data analytics. Also on Monday, VMware announced a broad strategy for tackling the hybrid cloud opportunity, which is essentially to provide a single set of tools for managing all enterprise infrastructure, on premises and/or in any public cloud, in a uniform way.
Nvidia benefits from adding yet another way for companies to consume its GPUs in data centers, but it’s also making RAPIDS, its libraries of AI and analytics software tools for data scientists, compatible with this virtual GPU infrastructure. The containerized software packages enable popular data science technologies like TensorFlow and PyTorch to take advantage of GPU-accelerated hardware.
The key enabling technology is Nvidia’s Virtual Compute Server (or vComputeServer) software, which the GPU giant also announced Monday. It is based on the company’s previously existing virtual GPU technology, but until now it supported only open source KVM-based hypervisors, such as Red Hat’s and Nutanix’s. vGPU now supports vSphere, and vComputeServer is the product manifestation of that support.
VMware also has its own big hardware-accelerator virtualization effort. In July, the company announced the acquisition of Bitfusion, a specialist in this area, saying the goal was to bring GPU virtualization to vSphere. VMware also said the platform could be extended to support other types of hardware accelerators, such as FPGAs and custom ASICs.
As with CPU virtualization, Nvidia’s vComputeServer puts a performance “tax” on GPU infrastructure. The exact size of the tax will vary from workload to workload, but the general impact on performance Nvidia engineers have observed in their labs has been less than 5 percent, John Fanelli, a VP of product at Nvidia, said on a call with reporters last week.
Today, vComputeServer is supported by hardware from Cisco, Dell, Hewlett Packard Enterprise, Lenovo, and Supermicro, Fanelli said.
The companies promise that customers will be able to run vComputeServer on premises, training their machine learning models while managing their GPU infrastructure – which today is typically bare-metal, living in its own silo – with the same VMware vCenter tools they use to manage the rest of their data centers. If they need to scale up a model, they’ll be able to easily run it in VMware Cloud on AWS, using those same tools to manage it.
The virtual cloud GPUs will run on Nvidia’s latest T4 physical GPUs, which AWS announced it was deploying in its data centers earlier this year. AWS hadn’t launched this type of cloud GPU instance, called G4, at the time of writing. Fanelli said Amazon will share details about their availability “very shortly,” implying that VMware will make its GPU cloud service available around the same time.
VMware also said Monday that it has quadrupled the number of VMware Cloud on AWS availability regions since the same time last year, going from four to 16.