Cloudflare, which operates a content delivery network it also uses to provide DDoS protection services for websites, is in the middle of a push to vastly expand its global data center network. CDNs are usually made up of small-footprint nodes, but those nodes need to be in many places around the world.
As it expands, the company is making a big bet on ARM, the emerging alternative to Intel’s x86 processor architecture, which has dominated the data center market for decades.
Cloudflare just launched a new data center in Macau. And in Riyadh. And in Reykjavik. It’s launched data centers in Istanbul, Kathmandu, Phnom Penh, Beirut, and in CEO Matthew Prince’s hometown, Salt Lake City. “Our goal in March is 31 new sites in 31 days, and maybe 22 of those already have equipment racked and loaded and ready to go,” he told Data Center Knowledge.
It’s an ambitious target that he admits the company might not quite manage – in 2012 a similar project launched ten data centers in 30 days – but he expects to at least get close. “It will be somewhere between 27 and 31. And what’s fun is that it’s starting to be cities where I don’t know where they are!”
Apart from giving Prince the pleasure of testing reporters’ knowledge of Eastern European capitals, adding so many new locations is how the company keeps building out a network that can shrug off the kind of DDoS attack that targeted GitHub last month. It was the largest of its kind ever recorded, but Cloudflare sees large-scale attacks against its customers fairly regularly, Prince told us. “It was an exciting week, because someone found a new toy, but it was also a boring week, because we’ve graduated to a scale where even terabyte-scale attacks don’t cause us a lot of trouble.”
The key is having a network that’s distributed enough to absorb enormous floods of malicious traffic. One way Cloudflare keeps its costs down is by connecting the nodes of its network with local ISPs. “It’s slightly more expensive to get gear to these places, so you effectively have slightly … higher CapEx, but the bandwidth costs just by far trump that. And if you go directly into these regions, you're able to directly connect with the ISPs there and exchange traffic at no cost.”
It also keeps costs down by pre-configuring routers in new data centers (usually in existing colocation facilities) so that its engineers can set up and run servers remotely, PXE-booting across a VPN from the nearest Cloudflare site. Many of the facilities the company uses have never been visited by any of its employees.
By the end of 2018, Prince plans to have Cloudflare data centers in 200 cities around the world (at the beginning of March the number was over 125). At that point, he said, around 95 percent of the world’s population will live in a city with a Cloudflare data center. “We constantly have to be moving our gear closer to where people are, so we’re looking for anywhere there’s a major population center and moving in.”
The arrival of 5G wireless networks will make this even more important, as next-generation applications will push the limits of what a mobile device can do. “It becomes a question of how much you can take the devices we’re walking around with and have them intelligently use the thing that’s right next to them (computing infrastructure) to augment what it is that they’re doing,” Prince said.
To keep further expansion affordable, Cloudflare is planning a massive architectural change. The company will soon start deploying ARM servers in its data centers, Prince said. He expects the x86 alternative to be cheaper to buy and cheaper to run, thanks to its lower power requirements and the nature of Cloudflare’s workload.
The company has already moved away from Intel SSDs after finding the performance wasn’t enough for its needs. (All Cloudflare servers use SSDs to cache data from the web sites it protects.) “I’d give better-than-even odds that by Q4 this year we will no longer spend any money with Intel,” Prince told us.
“We think we're now at a point where we can go one hundred percent to ARM. In our analysis, we found that even if Intel gave us the chips for free, it would still make sense to switch to ARM, because the power efficiency is so much better.”
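The logic behind the “free chips” claim can be sketched as simple arithmetic: a processor’s cost over its service life is the purchase price plus the electricity it burns. The sketch below is illustrative only. Every figure in it (wattages, chip price, electricity rate, facility overhead, service life) is a hypothetical placeholder, not a number from Cloudflare, Intel, or Qualcomm, and it assumes both servers handle the same request load.

```python
HOURS_PER_YEAR = 365 * 24


def lifetime_power_cost(watts, years, usd_per_kwh, overhead=1.0):
    """Electricity cost of a constant load over `years` years.

    `overhead` is a multiplier for cooling and power delivery losses
    (an assumption; 1.0 means no overhead).
    """
    kwh = watts / 1000 * years * HOURS_PER_YEAR
    return kwh * usd_per_kwh * overhead


def total_cost(chip_price, watts, years, usd_per_kwh, overhead=1.0):
    """Purchase price plus lifetime electricity cost."""
    return chip_price + lifetime_power_cost(watts, years, usd_per_kwh, overhead)


# Hypothetical inputs: a free chip in a server drawing 400 W versus a
# paid chip in a server drawing 200 W, both doing the same work.
free_but_hungry = total_cost(chip_price=0, watts=400, years=4,
                             usd_per_kwh=0.15, overhead=1.5)
paid_but_frugal = total_cost(chip_price=1500, watts=200, years=4,
                             usd_per_kwh=0.15, overhead=1.5)

print(f"free chip, 400 W: ${free_but_hungry:,.2f}")
print(f"paid chip, 200 W: ${paid_but_frugal:,.2f}")
print("paid chip is cheaper overall:", paid_but_frugal < free_but_hungry)
```

With these made-up inputs the paid, lower-power chip comes out ahead. With a shorter service life or cheaper electricity it would not, which is why the result depends on each operator’s own numbers.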
That may not be true for every data center, but ARM is a good fit for Cloudflare’s workload. Lower-power chips are also good to have in the cities where its servers may be deployed in older and less efficient facilities.
“Every request that comes in to Cloudflare is independent of every other request, so what we really need is as many cores per Watt as we can possibly get,” Prince explained. “The only metric we spend time thinking about is cores per Watt and requests per Watt.” The ARM-based Qualcomm Centriq processors perform very well by that measure. “They’ve got very high core counts at very low power utilization in Gen 1, and in Gen 2 they’re just going to widen their lead.”
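As a rough illustration of the two metrics Prince names, the sketch below computes cores per Watt and requests per Watt for two server profiles and picks the better one. The profiles, their names, and all of their figures are invented for the example; they are not measurements of Centriq or of any Intel part.

```python
from dataclasses import dataclass


@dataclass
class ServerProfile:
    """A hypothetical server configuration (all values are placeholders)."""
    name: str
    cores: int
    watts: float
    requests_per_second: float

    @property
    def cores_per_watt(self) -> float:
        return self.cores / self.watts

    @property
    def requests_per_watt(self) -> float:
        # Requests served per second for each Watt drawn.
        return self.requests_per_second / self.watts


candidates = [
    ServerProfile("profile-a", cores=24, watts=150.0, requests_per_second=30_000.0),
    ServerProfile("profile-b", cores=48, watts=120.0, requests_per_second=36_000.0),
]

for s in candidates:
    print(f"{s.name}: {s.cores_per_watt:.2f} cores/W, "
          f"{s.requests_per_watt:.0f} req/s per W")

# Because requests are independent, the ranking reduces to efficiency alone.
best = max(candidates, key=lambda s: s.requests_per_watt)
print("most efficient:", best.name)
```

Ranking on efficiency alone only makes sense for a workload like the one described here, where requests are independent and can be spread across many modest cores.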
Qualcomm started shipping Centriq, aimed at hyper-scale cloud platforms, last November. The hyper-scalers, who together make up a huge portion of the server-chip market, have been open to ARM. Microsoft, for example, said last year that the architecture could one day power half of its entire workload.
The Cloudflare team (which includes former Intel engineers) didn’t initially expect ARM processors to perform so well for its needs, especially with the open source software the service is built on. “They started testing the workloads, and they were just blown away. It was so much easier than we thought it would be. Performance is already better, and the power performance is just incredible.”