Public cloud requires huge capital expenditure, not just because of the massive scale of millions of servers in data centers, but also because those servers need to be replaced regularly as workloads become more demanding.
The variety of hardware in Microsoft Azure data centers, for example, is far from what you would see in cloud data centers of the past, which were filled with tens of thousands of identical, and fairly average, servers. The changing nature of workloads has pushed hyper-scale data centers to become much more heterogeneous, and the model of scaling out using low-cost commodity boxes is now increasingly being augmented with powerful scale-up server architectures.
The rate of change in cloud hardware is accelerating, especially as more and more specialized hardware is designed for neural-network training and inferencing. To date, Microsoft has chosen to use FPGAs to accelerate these workloads. But in an interview with Data Center Knowledge, Azure CTO Mark Russinovich didn’t discount the possibility of using more custom processors for neural nets in the future.
Azure’s rival, Google Cloud Platform, is now on the third generation of its own custom machine-learning chips, called TPUs, or Tensor Processing Units. Another rival, Alibaba, revealed recently that it also has a custom AI processor, and Facebook is said to be in the early stages of developing a chip of its own for the same purpose.
“We’re at the very beginning of the innovation here; hardware for inference and training in the cloud will evolve,” Russinovich said. “What some accelerators provide even in short term is enough differentiated value in terms of solving a problem more quickly.”
First Azure Servers
An ever-changing hardware fleet appears to be the reality in which hyper-scale data center operators now live. In his keynote at Microsoft’s Build conference in Seattle last week, Russinovich walked through the evolution of Azure hardware over the years.
The first servers in Azure data centers had two sockets and 8 cores, 32GB of RAM, 1Gb network switches, and hard drives, not SSDs. Just a couple of years later, Azure was putting in servers with 128GB of RAM and 10Gb switches.
At that time, it also started to introduce specialized servers. “Our HPC SKUs had SSDs and InfiniBand, so you get network latency between VMs in the same cluster of just a few microseconds [which is] great for scientific workloads that need time synchronization between different computers,” he said.
Three years ago, standard Azure servers moved to SSDs, 40Gb switches, and 192GB of RAM. By the following year the team was building “pretty beefy servers,” Russinovich said, each with two 16-core sockets and 512GB of RAM.
They were nicknamed Godzilla and were designed to run the G-series cloud VMs. A single G5 instance, for example, has up to 32 virtual processors, 448GB of memory, and 6.5TB of SSD.
The next generation of general-purpose servers was only a little less powerful than Godzilla, with two 20-core sockets and 256GB of RAM. These were the first Azure servers with FPGA boards, and they were followed by models with multiple GPU types (for rendering, machine learning, and scientific workloads).
After that came a server spec Russinovich called “the Beast.” It had four sockets, each with 18 hyperthreaded cores, for a total of 144 logical cores. “And the amazing thing is it’s got 4TB of RAM.”
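The Beast's core count follows from simple multiplication. The sketch below is only an illustration of that arithmetic: it assumes two hardware threads per core, which is typical for hyperthreading but was not stated in the keynote, and the function name is hypothetical.

```python
def logical_cores(sockets: int, cores_per_socket: int, threads_per_core: int = 2) -> int:
    """Return the number of logical cores an operating system would see.

    threads_per_core=2 is an assumption (typical for hyperthreading),
    not a figure quoted in the article.
    """
    return sockets * cores_per_socket * threads_per_core


# Figures quoted for "the Beast": four sockets, 18 cores apiece.
beast_physical = logical_cores(sockets=4, cores_per_socket=18, threads_per_core=1)
beast_logical = logical_cores(sockets=4, cores_per_socket=18)

print(beast_physical)  # 72 physical cores
print(beast_logical)   # 144 logical cores, matching the quoted total
```

Under that assumption, the 144 figure counts hardware threads rather than physical cores, of which there would be 72.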
The M series machines that run on the Beast are expensive, designed for very specific workloads. “The reason we had to do this is that SAP HANA scale-up workloads are very memory-intensive, and we’ve got many customers, including Microsoft, running them in production,” he explained.
The Next Gen
The next-generation servers Microsoft is currently building for Azure have 768GB of RAM, and they’re the last servers that will use 40Gb networking. “We’ll jump to 50Gb in the next generation,” Russinovich said.
These sixth-generation servers are being built using Microsoft’s OCP Project Olympus design, which accommodates a range of CPUs and accelerators, such as GPUs and FPGAs, and even Cavium and Qualcomm ARM CPUs for storage systems.
“They can support very high-performance GPUs and many GPUs in the extended chassis,” Russinovich said. “They can support high-density disks; they can support custom-designed SSDs that we're making; they can go onto the 50Gb networks that are coming.”
A custom PCIe fabric allows Microsoft to interconnect four chassis, each packed with eight Nvidia GPUs, in a way that essentially creates a 32-GPU server. The fabric’s latency is as low as you would get with all the GPUs sitting on the same PCIe bus, he said. “That’s an amazing machine for doing deep-learning algorithm training.”
Hot-swappable 14TB Shingled Magnetic Recording (SMR) hard drives can pack 1.2PB of storage into a Project Olympus chassis and 8.6PB into a single data center rack. With high-density SSDs, a single server can have 256TB of NVMe flash.
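The quoted capacities give a rough sense of the physical density involved. The sketch below back-calculates drive and chassis counts from the rounded figures above; it assumes decimal units (1PB = 1,000TB) and is an estimate only, since the article does not state the actual drive count per chassis or chassis count per rack.

```python
import math

# Figures quoted in the article, expressed in terabytes.
DRIVE_TB = 14         # capacity of one hard drive
CHASSIS_TB = 1_200    # 1.2PB per Project Olympus chassis (assumes 1PB = 1,000TB)
RACK_TB = 8_600       # 8.6PB per data center rack

# Minimum number of 14TB drives needed to reach the quoted chassis capacity.
min_drives_per_chassis = math.ceil(CHASSIS_TB / DRIVE_TB)

# Ratio of rack capacity to chassis capacity; not a whole number because
# both quoted figures are rounded.
chassis_per_rack = RACK_TB / CHASSIS_TB

print(min_drives_per_chassis)        # 86
print(round(chassis_per_rack, 1))    # 7.2
```

By this estimate, a chassis holds on the order of 86 or more drives and a rack holds roughly seven such chassis.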
“Our general-purpose SKUs are now larger than Godzilla was just two years ago; this is the rate of scale up in the public cloud,” Russinovich said. “When we started, a slow, cheap commodity server was scale-out, but what we see now is very mission critical workloads like SAP HANA demanding a lot more power.”