VMware systems have been able to use GPUs for years, but VM suspend/resume functionality doesn’t work when a session is mated with a shared GPU--a gap with broad security implications. NVIDIA’s GPU virtualization work with VMware has led to further CPU/GPU/VM session refinement, including session persistence when using GPUs attached to a VM.
Without session re-instantiation (resume/suspend), users would have to leave a session live until it was completed. That’s not always a big deal, but sometimes it is: Think about BI (business intelligence), data analytics, AI, CAD, rendering, and other heavily GPU-bound sessions that can last for hours as data is sorted through. If jobs aren’t being monitored, a session could be hijacked in a lab, data center or VDI session. With session re-instantiation, these sorts of jobs (and more) no longer need extra layers of security while the running sessions are unattended.
“Now we don’t have to worry about the large Linux VMs that researchers use to run jobs overnight getting interrupted,” said Jon Kelley, associate director of enterprise innovation at University of Arkansas, which uses VMware provisioned with NVIDIA GPU cards. “Students can now work on an application, go to another class, and get back to the state where they were even if we had to perform maintenance. With VMware suspend/resume for NVIDIA GPU-accelerated VMs, we can tighten the maintenance window for our quarterly and monthly updates or do more inside the same window.”
Technology providers have evolved session virtualization by abstracting and then optimizing the various functions of internal server elements. Initially, this included CPU-with-memory, storage, networking and KVM. VMware, among other virtualization software vendors, slowly abstracted each of these elements into discrete objects, permitting them to be grouped and optimized while still logically isolated in “sandboxes” of varying degrees. It has taken longer for GPU virtualization--and GPUs as a shared but dedicated resource--to take hold as a virtualized object within hypervisor constructs.
GPU cards in VMware hosts are treated as shared resources, but aren’t totally abstracted in the virtualization model. Due to the nature of GPUs versus traditional CPU-based system architectures, maintaining VM session state in relationship with GPU resources has heretofore been a “holy grail.”
Sharing a GPU card as a resource in a hypervisor host server involves issuing sessions that use virtual CPU allocations (vCPUs), which are allocated in fractions of a CPU core or in as many CPU cores (and therefore CPU states) as an administrator assigns to a VM session. GPUs, on the other hand, may have thousands of much smaller cores and states, each of which must be stored in the correct state at suspension, then readied for resumption when a suspended VM is re-instantiated.
Resumption of a suspended session involves a VM’s memory, CPU, storage and application state. When GPUs are joined to a session, resumption also includes GPU memory and the states of flows to the GPU compute resources.
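To make the bookkeeping concrete, here is a toy Python sketch of what a suspend/resume cycle has to capture once a GPU is attached. It is an illustration only: `ToyVM`, `suspend` and `resume` are hypothetical names invented for this example and are not VMware or NVIDIA interfaces, and a real hypervisor works with device state rather than Python lists.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyVM:
    """Hypothetical stand-in for a VM; not a VMware or NVIDIA API."""
    vcpu_registers: list = field(default_factory=list)   # one entry per vCPU
    memory: bytearray = field(default_factory=bytearray)
    gpu_memory: bytearray = field(default_factory=bytearray)
    gpu_core_states: list = field(default_factory=list)  # one entry per GPU core


def suspend(vm):
    """Capture CPU, memory and GPU state together in one snapshot."""
    return copy.deepcopy(vm)


def resume(snapshot):
    """Rebuild a VM from a snapshot, leaving the snapshot itself reusable."""
    return copy.deepcopy(snapshot)


vm = ToyVM(
    vcpu_registers=[0, 0],            # two vCPUs
    memory=bytearray(b"app"),
    gpu_memory=bytearray(b"frame"),
    gpu_core_states=[0] * 2048,       # thousands of small GPU cores
)
vm.gpu_core_states[7] = 42            # work in progress on one GPU core
snapshot = suspend(vm)
vm.gpu_core_states[7] = -1            # live state changes after suspension
restored = resume(snapshot)           # the in-progress value is back
```

The point of the sketch is scale: the snapshot holds two vCPU entries but 2,048 GPU core entries, and every one of them has to come back exactly as it was.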
Over time it has become increasingly easy for hypervisor software to orchestrate VMs, but not with the added burden of keeping GPUs in sync. The difference can be likened to that between conducting a six-person baroque ensemble and conducting a full symphony orchestra with choir.
Session management in prior editions was actually more a function of cutting off the video and user authentication. Added is state virtualization of the components involved, which include the video raster-to-vector rendering as well as the state of the (sometimes) several thousand processors within the GPU card that can be allocated to a user work session. What’s new is the ability to do the entire job of heavy lifting, which will soon lead, according to NVIDIA and VMware, to full session virtualization.
Applications that need number-crunching power--including business analytics, video processing and AI/neural networks--benefit from GPU compute power. With these capabilities increasingly making their way to the desktop and mobile devices, NVIDIA’s and VMware’s work to evolve GPU virtualization--and CPU/GPU/VM session security and performance--is worth watching.