Liberty Mutual is one of the largest property and casualty insurers in the world, with more than 45,000 employees globally. As the company grew, so did the mixture of overlapping systems, making it more and more difficult to keep costs under control while ensuring that both internal and external customers had access to the resources they needed. The solution was a major shift to the public cloud for most workloads, along with a consolidation and modernization of the compute and storage platforms in its on-premises private cloud.
One of the biggest drivers has been the fast growth of the company's data stores, which reached the 40-petabyte mark a few years ago. It required a team of 100 infrastructure engineers and systems administrators to manage more than 60,000 servers and VDI instances, along with a massive amount of data stored in storage-area networks (SANs), network-attached storage (NAS) and, more recently, object storage.
During a period of about 10 years, the IT group added several different solutions to its storage approach to address various problems. Because Liberty Mutual had three data centers across the country, it had to replicate different solutions and types of storage across data centers. To make matters even more challenging, the company would add caching appliances and other solutions to bolster the speed and performance of aging, low-performance storage.
All of this came to a head in 2018 when Liberty Mutual's internal customers began complaining that they couldn't get the storage they needed fast enough.
"About 30% of the operational team was playing a shell game, taking storage from one location and one product and moving it to another to free up space to do something else," said Chris Lund, director of hybrid compute and storage at Liberty Mutual. "It was still a safe place to store customer documents because it continued to meet SLAs in a lot of ways, but managing it was really tough."
At the same time, storage requirements continued to grow by 5% to 10% every year. All of this convinced Lund's team to make some serious changes.
One of the features Lund insisted on was cloud-based storage, which could be used for off-site replication. That way, the company could keep a production copy of data as close as possible to the applications it served, along with a replica for disaster recovery. That arrangement was far more efficient and cost-effective than the approach Liberty Mutual was using at the time, which required all data to be stored in triplicate.
The solution also had to be simple. Instead of having multiple tiers and types, the team wanted low-cost, high-performance solid-state disk, something that didn't exist several years earlier. By the time the team was looking for a replacement, it realized that the company could get very high-performance solid-state disk at a reasonable cost, and that it was much more reliable than spinning disk.
That settled the matter. "At that point, it was a no-brainer," Lund said. "We could get the performance so we would have it when we need it, and if we don't use it, it's not like it's costing us that much, and we'll make up for it on the operational side through drastic simplification of the environment."
Lund also insisted on API-driven interfaces. The previous storage had been very hands-on, with changes done manually by logging onto a storage console. Newer generations of solutions now provide APIs that allow administrators to perform tasks such as creating storage, allocating storage to a system or application, and adding or removing storage all through infrastructure-as-code. This process reduces the labor required.
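As a rough illustration of what API-driven, infrastructure-as-code storage management can look like, the sketch below declares the desired volumes as data and lets a small routine create, resize or remove volumes to match. The `FakeStorageAPI` class, its methods and the volume names are hypothetical stand-ins; the article does not describe the actual vendor APIs or Liberty Mutual's code.

```python
from dataclasses import dataclass, field


@dataclass
class FakeStorageAPI:
    """Hypothetical in-memory stand-in for a vendor storage API."""
    volumes: dict = field(default_factory=dict)  # volume name -> size in GB

    def create_volume(self, name: str, size_gb: int) -> None:
        self.volumes[name] = size_gb

    def resize_volume(self, name: str, size_gb: int) -> None:
        self.volumes[name] = size_gb

    def delete_volume(self, name: str) -> None:
        del self.volumes[name]


def reconcile(api: FakeStorageAPI, desired: dict) -> list:
    """Bring the storage in line with the declared state; return the actions taken."""
    actions = []
    for name, size_gb in sorted(desired.items()):
        if name not in api.volumes:
            api.create_volume(name, size_gb)
            actions.append(f"create {name} ({size_gb} GB)")
        elif api.volumes[name] != size_gb:
            api.resize_volume(name, size_gb)
            actions.append(f"resize {name} to {size_gb} GB")
    # Anything present on the storage but absent from the declaration is removed.
    for name in sorted(set(api.volumes) - set(desired)):
        api.delete_volume(name)
        actions.append(f"delete {name}")
    return actions


# Declared state, as it might live in version control (names are made up).
desired_state = {"claims-db": 500, "policy-docs": 2000}

api = FakeStorageAPI(volumes={"claims-db": 250, "old-scratch": 100})
actions = reconcile(api, desired_state)
print(actions)
```

Because the declaration is the source of truth, running `reconcile` a second time against the same state takes no further action, which is what makes this style of change repeatable without someone logging onto a console.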
In addition, the team was looking for a consumption model with no capital outlay. Under that model, Liberty Mutual would agree on volume and pricing tiers up front. That way, even if the company used less storage than expected and paid a bit more per unit as a result, it could be assured that the more storage it used, the more that storage would be discounted.
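To make the tiered arrangement concrete, here is a minimal sketch of graduated pricing, in which each slice of usage is charged at its own tier's rate. The tier boundaries and prices are invented for illustration, and graduated billing itself is an assumption; the article does not disclose the actual contract terms.

```python
# Hypothetical graduated tiers: (upper bound in TB, price per TB per month in USD).
# None marks the open-ended top tier. All figures are invented for illustration.
TIERS = [(100, 25), (500, 20), (None, 15)]


def monthly_cost(used_tb: int) -> int:
    """Charge each slice of usage at the rate of the tier it falls into."""
    cost = 0
    lower = 0
    for upper, price in TIERS:
        if upper is None or used_tb < upper:
            # Usage ends inside this tier: bill the remainder and stop.
            cost += (used_tb - lower) * price
            break
        # Usage fills this tier completely: bill the whole slice and move on.
        cost += (upper - lower) * price
        lower = upper
    return cost


def effective_rate(used_tb: int) -> float:
    """Average price per TB (for used_tb > 0), which falls as usage grows."""
    return monthly_cost(used_tb) / used_tb


for used in (50, 300, 1000):
    print(used, monthly_cost(used), round(effective_rate(used), 2))
```

Under these made-up tiers, a low-usage month is billed entirely at the top rate, while heavier usage pulls the average price per terabyte down.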
Finally, Lund wanted a solution that would help the team simplify capacity and lifecycle management. "We spent a lot of time moving things around, thinking about whether we needed to buy more storage, and how much we needed to buy. We didn't want to do that anymore," he said. "We wanted to get to the point where we could say, 'Application X used $16 of storage last month and $17.50 this month,' and analyze whether this is a problem or a trend."
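The kind of per-application comparison Lund describes can be sketched in a few lines. The figures for "Application X" come from his quote; the second application and the 5% review threshold are hypothetical, added only to show how a change might be flagged for a closer look.

```python
def percent_change(previous: float, current: float) -> float:
    """Month-over-month change in storage spend, as a percentage."""
    return (current - previous) / previous * 100


# Monthly spend per application in USD. "Application X" uses the figures from
# the quote; the other application and the 5% threshold are made up.
spend = {
    "Application X": (16.00, 17.50),
    "Application Y": (40.00, 40.50),
}
REVIEW_THRESHOLD_PCT = 5.0

for app, (last_month, this_month) in spend.items():
    change = percent_change(last_month, this_month)
    status = "review" if change > REVIEW_THRESHOLD_PCT else "ok"
    print(f"{app}: {change:+.2f}% ({status})")
```

A single month's increase does not say whether the change is a one-off or a trend; that judgment needs several months of the same metric.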
And the Winner Is …
Using these criteria, Liberty Mutual chose a combination of technologies from vendors it already used in some capacity, including Dell EMC, Hewlett Packard Enterprise and VMware. For the HPE SAN storage, Lund's team wrote its own Ansible automation pipeline to zone, mask and allocate raw storage to its VMware host servers. From there, the storage is presented in the form of datastores, and VMware Distributed Resource Scheduler manages capacity across virtual machines.
Now, when storage is required, it is provisioned through an API or user interface via both VMware and CloudBolt. Depending on the requirements, the storage will be either an all-flash disk embedded in Dell's VxRail hyperconverged server system or HPE's XP8 SAN storage.
The cloud-based replication model allowed the company to replicate its mission-critical data off-site to Amazon S3. Not only did that reduce storage costs, because the team no longer had to manage replicated data on-site, but it also allowed the data protection team to expand its infrastructure engineering skills in the public cloud.
"We're trying to make storage something you don't have to think about," Lund said. "We set it up so the provisioning automatically asks a series of questions around the size of the application and the speed of storage needed. It will provision the right server, storage and anything else the [internal] customer needs in minutes."
With this system in place, the IT team can also evaluate consumption and usage metrics associated with specific business owners and applications to understand the right cost model going forward.
When object storage is needed, Liberty Mutual uses the software-defined IBM Cloud Object Storage, as well as Amazon S3 and S3 Standard-Infrequent Access for applications that can use object storage directly through the public cloud.
Upgrading storage is just one part of Liberty Mutual's overall data modernization project.
"Data modernization is a very business-centric topic for us in terms of the future of the business, but a lot of our customer data is still on premises. As a result, we may be missing out on opportunities when it comes to machine learning and enhanced data visualization," Lund said.
The goal, he said, is to find a way to unlock sensitive data safely and responsibly by moving as much of it to the cloud as possible.