By now you've probably heard that the open core code repository GitLab is in the midst of a migration away from Microsoft Azure to Google Cloud Platform. There's actually nothing new here, as this migration has been underway for a while. What made it news was that on Monday, Andrew Newdigate, the project lead of GitLab's cloud migration team, published a blog post explaining the migration, primarily as a way of reassuring users that the company is taking steps to ensure the migration goes off with little to no disruption.
Earlier this month I spoke with GitLab's co-founder and CEO, Sid Sijbrandij, for an article for our sister site Data Center Knowledge. He told me then that the migration was already underway and that much of GitLab's infrastructure had already been moved to Google Cloud Platform, but that its customers' accounts were still on Azure.
That would be the customers who are hosted by GitLab. In an email after the article was published, Sijbrandij explained that "GitLab.com has 200 terabyte of git repositories. Self-managed installations have petabytes of storage in total." In other words, most of GitLab's customers are hosting themselves.
To old-timers in the open source game, it might come as a surprise that a company like GitLab that's proud of its open source roots would be using Azure to begin with. After all, wasn't distrust of Microsoft's ownership of GitHub the reason behind the mass exodus to GitLab earlier this month? While a "new" and more open source friendly Microsoft was undoubtedly one of the reasons why GitLab would even consider the move to Redmond's cloud, the motivating factor was money.
"We got a big credit, being a startup, to move there," Sijbrandij told me.
In other words, as a cash-strapped startup (all startups are cash-strapped to one degree or another), Azure made them a deal they couldn't refuse. That was important for a company that runs absolutely everything in the cloud. GitLab has no servers of its own, and before the migration to Google Cloud Platform began, everything it offers was hosted on Azure. The company also doesn't have a brick-and-mortar office; everyone works remotely.
So why the move from Azure? Simple: containers.
GitLab already offers its customers containers as part of its service, which goes a bit beyond the scope of a plain vanilla git repository to offer tools for the entire development life cycle. But while GitLab has been offering its customers the convenience, speed, and flexibility of containers, its own infrastructure has been running the old-fashioned way, using virtual machines.
"We're switching towards a cloud native architecture," Sijbrandij said. "We're big proponents of Kubernetes. We think the whole world is going to move from virtual machines to containers with Kubernetes. We're moving GitLab.com to a setup like that.
"We're already allowing people to attach a cluster, a Kubernetes cluster, to their projects on GitLab. If you attach your cluster, all your tests, all your review apps, all your production apps, will all happen on that cluster. I think that's a really neat feature that makes it a lot easier to have that whole DevOps cycle but have it happen automatically. You just push your code and GitLab takes care of the rest."
While all of the big cloud players offer spin-it-up containerization services, Sijbrandij and his team settled on Google Cloud Platform as being the best suited to their needs.
"We think Google, the authors of Kubernetes, has the best service for Kubernetes, so we wanted to move to Google."
He makes a good point. Although users have long been able to install and run their own instances of Kubernetes on both AWS and Azure, neither cloud offered it natively in easy-to-spin-up mode until this year. Google, on the other hand, as the original developer of the orchestration engine, has been offering Kubernetes on GCP for about as long as there has been Kubernetes.
Earlier this month, as GitLab was dealing with the anti-Microsoft mini-exodus from GitHub that saw a temporary 7x increase in orders, Sijbrandij treated me to a glimpse of a pane-of-glass view of some aspects of the GitLab infrastructure that were already up and running on GCP. At that moment, there were about 1,000 machines running.
"These are runners, so these are processing the builds on GitLab," he explained. "These are not to host GitLab.com, but to process all the CI [continuous integration] and CD [continuous deployment] work that's happening on GitLab.com. And this is all auto scaling; it's scaling up and down according to need."
Now GitLab seems to be preparing for the final stages of the migration. According to Newdigate, his migration team is using a homegrown application, Geo, which allows full, read-only mirrors of GitLab instances, to move GitLab.com from Microsoft Azure to GCP. For several months, he said, GitLab has been using Geo to keep an up-to-date synchronized copy of GitLab at GCP's us-east1 data center in South Carolina, which will evidently be GitLab's new home. In addition, the company is moving file artifacts that have been kept on NFS servers to Google Cloud Storage to take advantage of its built-in redundancy and multi-region capabilities.
The current plan is for the migration to be complete and for GitLab to be operating from its new GCP home on July 28, 2018. Newdigate cautioned, however, that a smooth transition for its customers is more important than meeting the deadline.