Understanding Azure Virtual Machine Scale Sets (Part 1)

Paul Robichaux

June 5, 2017

One of the biggest benefits of cloud computing is that it can be elastic. The idea behind elastic computing is that, like a rubber band, your cloud service usage can stretch or shrink to accommodate changes in workload. Amazon, Microsoft, and other cloud providers have tried to make elasticity a key selling point of their offerings, promising that you can seamlessly add or remove virtual servers, storage, network services, and so on at any time, paying only for what you use. Depending on your cloud service needs, elastic computing can be a really powerful technology, although it doesn’t work well for scaling conventional application workloads such as SQL Server or Exchange. However, elastic computing shines for applications where you can divide the work to be done among a number of identical applications or services running on different machines. For example, the Xbox Live gaming service uses elastic computing in its back end to seamlessly bring up or shut down gaming servers for popular games such as Titanfall: whenever you start a new multiplayer Titanfall match, an Azure virtual machine is created to host it, and when all players leave the lobby, the VM goes away.

Azure includes several services with elastic features. One of the newest, and least understood, is Azure Virtual Machine Scale Sets (which I’ll call “scale sets” from now on). A scale set is a pool of identical virtual machines running some application you control. Azure provides tools for you to build and configure the VM the way you want it, then create or remove instances of it until you have as many, or as few, as you need at any point in time. By deploying a scale set, you can have an on-demand army of VMs doing whatever work you need done, and the army grows or shrinks depending on your explicit controls, on user demand, or on other parameters you measure.

Scale set fundamentals

Former Microsoft architect Bill Baker is often credited with inventing the famous “cattle versus pets” metaphor of cloud scaling. If your servers are like pets, each one is lovingly raised, individually named, and carefully tended; if a pet gets sick, you nurse it back to health. If your servers are like cattle, they are all interchangeable, and you don’t treat individuals any differently by giving them cute names. If one gets sick, you shoot it and get another one. To extend this metaphor, scale sets give you a way to clone a herd of cattle, of a size you choose and with whatever breed of cow you like, on demand, as long as you’re willing to have every herd member be identical.

The first key to understanding Azure scale sets is that they are sets of identical VMs. You can customize the first cow in the herd, but all the others will be exactly like the first one. You define the scale set itself in one of three ways: through the portal, manually through PowerShell or the Azure command-line tools, or through an Azure Resource Manager (ARM) template. This definition tells Azure what size VM instance you want to use, what the scale set should be named, how many machines will be in it, and so on.

You can customize the VM used by the scale set to include your application in three ways: by creating a completely customized VM image and supplying it to Azure, by taking a prebuilt Windows or Linux image and installing your application when the scale set is started, or by customizing the image to include container software and then loading the application container when the scale set is started. Each of these approaches has its benefits and drawbacks, but as far as the scale set is concerned they are equal. Because the scale set is an array of identical VMs, you only have to configure the instance once, and then the scale set will handle creating and removing the machines. However, this also means that your application should be able to run with minimal on-machine configuration; if you need things like registry changes or configuration files unique to a given machine, that will be harder to automate.

When you start the scale set, Azure creates a number of objects for you: load balancers, network addresses, and so on. All of this infrastructure is contained in the Azure resource group you specify, and it’s shared between all the scale set VMs. After the required objects in the resource group have been instantiated, the scale set system will start creating VM instances up to the initial limit you set in the scale set definition. You can’t assume anything about the ordering or speed at which these VM instances will be set up—Microsoft doesn’t guarantee anything about launch time or latency.

A single scale set may contain up to 1,000 instances, but there are a number of restrictions, which are described at https://docs.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-placement-groups. In short, you have to use multiple Azure placement groups if you want to have more than 100 VMs in a scale set, and this requires you to be mindful of the limitations on Azure load balancing (layer 4 load balancing doesn’t work across placement groups), storage (you must use Azure Managed Disks), and VM sources (you can’t use a customized VM image that you upload if you have more than 100 instances).
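The restrictions summarized above can be written down as a small pre-flight check. This is only a sketch of the rules as this article states them, not an Azure API: the ScaleSetPlan class, its field names, and the check_plan function are all hypothetical, and Microsoft's documentation remains the authority on the actual limits.

```python
from dataclasses import dataclass
from typing import List

MAX_SINGLE_PLACEMENT_GROUP = 100   # per-placement-group limit, as stated in the article
MAX_SCALE_SET_INSTANCES = 1000     # overall scale set limit, as stated in the article


@dataclass
class ScaleSetPlan:
    """Hypothetical description of a planned scale set (not an Azure type)."""
    instance_count: int
    single_placement_group: bool
    uses_managed_disks: bool
    uses_custom_image: bool
    uses_layer4_load_balancer: bool


def check_plan(plan: ScaleSetPlan) -> List[str]:
    """Return a list of problems; an empty list means the plan fits the stated rules."""
    problems = []
    if plan.instance_count > MAX_SCALE_SET_INSTANCES:
        problems.append("a scale set may contain at most 1,000 instances")
    # The remaining rules only apply to "large" scale sets (more than 100 VMs).
    if plan.instance_count > MAX_SINGLE_PLACEMENT_GROUP:
        if plan.single_placement_group:
            problems.append("more than 100 VMs requires multiple placement groups")
        if not plan.uses_managed_disks:
            problems.append("large scale sets must use managed disks")
        if plan.uses_custom_image:
            problems.append("large scale sets cannot use an uploaded custom image")
        if plan.uses_layer4_load_balancer:
            problems.append("layer 4 load balancing does not span placement groups")
    return problems


# Example: a 500-VM plan that still has small-scale-set settings.
for problem in check_plan(ScaleSetPlan(500, True, False, True, True)):
    print("Problem:", problem)
```

A plan with 100 or fewer instances passes regardless of the other settings, which matches the point that these restrictions only matter once you cross the placement group boundary.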

Once the specified number of VM instances has been created, the scale set is considered to be running. (Actually, Azure may create some extra VMs as part of the scale set, but you’re not charged for those because they’re only used for redundancy.) What happens next depends on the way you defined the scale set. A scale set can autoscale, in which case Azure will add or remove virtual machines depending on some monitored parameter. (Microsoft’s name for its autoscaling feature is “Azure Insights Autoscale,” in case you were wondering.) If you don’t enable autoscale, you can add or remove instances either through the portal or with Azure PowerShell. Keep in mind that you’ll be charged for each instance you run for as long as you run it: the scale set itself is a convenient way to manage a fleet of VMs, but the individual VMs are billed and managed just like any other longer-lived VM you might create. A large scale set is therefore a good way to rack up high Azure charges; a fleet of 100 or more VMs can drive your bill up faster than you might like.
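To make the idea of autoscaling on a monitored parameter concrete, here is a toy threshold rule. It is a sketch for illustration only and does not reproduce how Azure's autoscale feature evaluates its rules: the function name, the choice of CPU percentage as the metric, and every threshold and limit below are assumptions.

```python
def next_instance_count(current, cpu_percent, minimum=2, maximum=10,
                        scale_out_above=75.0, scale_in_below=25.0, step=1):
    """Toy autoscale rule: decide how many instances to run next.

    All thresholds and limits are illustrative assumptions, not Azure defaults.
    """
    if cpu_percent > scale_out_above:
        target = current + step      # busy: add capacity
    elif cpu_percent < scale_in_below:
        target = current - step      # idle: remove capacity
    else:
        target = current             # within the comfortable band: do nothing
    # Never go below the minimum or above the maximum instance count.
    return max(minimum, min(maximum, target))


# Walk through a series of hypothetical average-CPU samples.
count = 2
for cpu in [80, 85, 90, 40, 10, 10]:
    count = next_instance_count(count, cpu)
    print(f"cpu={cpu}% -> {count} instances")
```

The clamp at the end is the part that matters for your bill: a maximum instance count caps how far a runaway metric can grow the fleet.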

A simple scale set application

Let’s say you want to create a scale set that you can use for load-testing an on-premises application, which is exactly what I wanted to use one for. In my case, I had a small simulator that sends data to a test server, and I wanted to run several hundred copies of it, which would be painfully tedious to set up with most other approaches. My plan was to create a scale set using a basic Azure VM image, which I would customize using the PowerShell script extension that Microsoft provides as one of the three supported customization methods. This PowerShell extension is registered on the VM (by Microsoft; you don’t have to do it, although you can register the same extension in your own individual VMs), and it runs after the VM is created and added to the scale set. There are some limitations on what this PowerShell script can do; for example, you don’t get access to the Azure Storage PowerShell extensions, so you have to plan ahead for the customizations you want the extension to apply for you. My extension script needed to copy the application and its data files from Azure blob storage, create some configuration data on the VM, and then install a script that would run the application.

Creating a scale set in the portal 

As I mentioned earlier, you can easily create scale sets in the portal, at least if you’re using the “new” portal (portal.azure.com). Log in to the portal, click the “+” icon, navigate to the “Compute” category, and choose “Virtual machine scale set” from the list of supported workloads. You’ll see a page with some information about placement group limitations (which I mentioned earlier). For now, you can ignore that since we’re going to create a small scale set, with just a handful of VMs. The important part of this portal blade is the section at the bottom, where the “Create” button lives. Once you click it, two new blades will appear in the portal. The Basics blade, shown on the right side of Figure 1, is where you specify the key parameters for this scale set:

  • The scale set name will be used throughout the portal. Case doesn’t matter, but you can’t include special characters.

  • OS type is self-explanatory.

  • The user name and password fields specify the credentials you want the scale set VMs to use. Remember, every VM in the scale set will have identical credentials. Azure enforces a number of password length and strength restrictions to try to keep people from choosing easily guessed passwords: 12 to 123 characters, with at least one upper-case, one lower-case, and one numeric character.

  • The “Limit to a single placement group” control indicates whether you want this scale set to be able to grow past 100 VMs by allowing Azure to spread it across multiple placement groups. Leave it set to “True” unless you want to create a large scale set and understand the limitations of doing so.

  • The “Subscription” pulldown is used to associate your scale set with a particular Azure subscription. This is useful for billing and access control.

  • Every scale set has to live inside an Azure Resource Manager resource group. You can create a new resource group just for the scale set, or choose an existing one, with the Resource group controls.

  • The location you choose for the scale set governs the Azure billing rates used, as well as where your scale set VMs and network objects will be physically homed. While you can’t move a scale set to a new region, you can easily remove it and recreate it in another region if need be. 
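If you script scale set creation, it can save a failed deployment to check the password against the rule quoted in the list above before you submit it. The function below is a minimal sketch that encodes only that rule as this article states it; the function name is hypothetical, and Azure's actual validation may include further checks not covered here.

```python
def meets_stated_password_rules(password: str) -> bool:
    """Check the rule as stated in the article: 12 to 123 characters,
    with at least one upper-case, one lower-case, and one numeric character."""
    return (
        12 <= len(password) <= 123
        and any(ch.isupper() for ch in password)
        and any(ch.islower() for ch in password)
        and any(ch.isdigit() for ch in password)
    )


# A few sample candidates (obviously, do not use these as real passwords).
for candidate in ["Short1a", "alllowercase1234", "ThisIsLongEnough1"]:
    print(candidate, meets_stated_password_rules(candidate))
```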

After you’ve filled out the controls on the Basics blade, you can click OK to go to the scale set configuration blade (Figure 2).

In the scale set configuration blade, you tell Azure what you want in the scale set: how many instances, of what machine type, and running what operating system. You must also specify what the Internet-accessible FQDN for the scale set load balancer should be, whether or not you want autoscale enabled, and whether you want managed or unmanaged storage. This last choice deserves an article of its own: unmanaged disks allow you to see the individual storage objects and customize how storage is allocated to storage accounts, and managed disks don’t. Managed disks are logically and administratively simpler to work with but not as flexible.

Most of these fields are self-explanatory, but the instance sizing field deserves a bit of explanation (it’s on the right-hand side of Figure 2). When you first click the “Scale set virtual machine size” field, the sizing pane appears, with a single VM instance size selected. This is the size that Microsoft recommends for your scale set deployment; in my experience, it’s usually the D1_V2 size (with 1 CPU core, 3.5GB of RAM, and 2 small data disks). Click the “View all” link and you’ll see the full catalog of VM instance sizes, so you can choose one that offers the right amount of compute power for your application. Once you’ve selected a machine size and clicked “Select,” you can click OK to dismiss the configuration blade. You’ll then see a summary blade that allows you to review your configuration choices; once you finalize the configuration, your scale set will be created and will appear in the portal. (Note that the “Virtual machine scale sets” category doesn’t appear in the default list of services; you can use the “+” icon in the left-hand navigation bar to pin that category to the list for easier access.)

Accessing your scale set

Before you can do much of anything with your scale set, you’ll need to access it. Any time you create a VM in Azure, you have several ways to access it. For scale sets, the choices are these:

  • You can give each individual VM in the scale set its own public IP. This is more expensive, and more complex, than most applications warrant.

  • You can use a load balancer with NAT, in which case you’d probably want to set each instance of the scale set to use a different TCP port. This is the default configuration for a new scale set, starting with TCP port 50000. If you have 10 VMs in the scale set, you’ll have one public IP (for the load balancer) and connect to port 50000 for the first instance, port 50001 for the second, and so on.

  • You can set up a single machine with a public IP, then connect to it using RDP or SSH and use it as a launching point to connect to the scale set VMs on their private IP addresses. 

  • For inbound traffic, you can simply create an Azure Application Gateway or load balancer instance and use it to distribute incoming traffic.

All of these configurations are outside the scope of this article, but Microsoft has documented them thoroughly so you can find what you need to set up whichever type of connection makes sense for you.
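The default NAT arrangement described in the list above (one public IP, with port 50000 for the first instance, 50001 for the second, and so on) can be sketched as a simple mapping. This is an illustration of the numbering scheme only: the function name is hypothetical, and the zero-based position used as the key is an assumption, since the instance identifiers Azure assigns may not be a neat contiguous sequence.

```python
def nat_port_map(instance_count, first_port=50000):
    """Map each instance's position (0-based, an assumption) to its
    front-end TCP port on the load balancer's single public IP."""
    return {position: first_port + position for position in range(instance_count)}


# Ten VMs behind one public IP, as in the example from the text.
for position, port in nat_port_map(10).items():
    print(f"instance #{position + 1}: connect to <load balancer address>:{port}")
```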

Customizing the VM using templates, extensions, and PowerShell

The good news: you’ve created a scale set. The bad news: it isn’t very useful, since all it is at this point is a collection of identical but unconfigured VM instances. This is where the configuration choices I mentioned earlier come into play: you can set up your application in a container and then load it into the scale set VMs, install it directly on the scale set VMs after creating the scale set, or install and configure it as part of the creation process by using a template. I prefer to use the template approach because it allows you the most flexibility. However, it is also the most time-consuming to learn and use. If you’re already comfortable with containerization technologies such as Docker, and your application is easily containerized, that might be a better way to get started quickly.

A scale set template is nothing more than a JSON file that specifies how you want the scale set configured: how many instances, what the administrator password is, and so on. In fact, if you use the GUI as described earlier in the article, the confirmation page displays a small link that lets you download a template file that can be used to implement the configuration you chose in the GUI. Using templates has several benefits, including the ability to keep your templates in a source control system such as Git. The big reason to use templates, however, is that they make it easy to do post-install customizations because you can use a template extension that runs on the VM after the VM has been instantiated. In part 2 of this series, I’ll discuss how to use templates and customization to define a scale set using PowerShell, instantiate it, customize the VM image by downloading and installing software, and start and stop the scale set when you need it to perform actual work.
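To give a feel for what such a JSON file contains, the sketch below builds a heavily trimmed template skeleton as a Python dictionary and prints it as JSON. Treat it as an outline rather than something you can deploy: the property names follow the ARM template schema as I understand it, but the storage and network profiles that a real scale set needs are left out, and the scale set name, VM size, user name, parameter names, and API version shown are placeholder assumptions. The template you download from the portal is the authoritative starting point.

```python
import json


def build_template_skeleton(scale_set_name, vm_size, admin_username):
    """Return a partial, non-deployable ARM template outline as a dict."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            # How many instances to create; supplied at deployment time.
            "instanceCount": {"type": "int", "defaultValue": 3},
            # Kept as a secure parameter so it is not stored in the template.
            "adminPassword": {"type": "securestring"},
        },
        "resources": [
            {
                "type": "Microsoft.Compute/virtualMachineScaleSets",
                "name": scale_set_name,
                "apiVersion": "2017-03-30",  # assumed version; check current docs
                "location": "[resourceGroup().location]",
                "sku": {
                    "name": vm_size,
                    "tier": "Standard",
                    "capacity": "[parameters('instanceCount')]",
                },
                "properties": {
                    "upgradePolicy": {"mode": "Manual"},
                    "virtualMachineProfile": {
                        "osProfile": {
                            "computerNamePrefix": scale_set_name[:9],
                            "adminUsername": admin_username,
                            "adminPassword": "[parameters('adminPassword')]",
                        }
                        # storageProfile and networkProfile omitted in this sketch
                    },
                },
            }
        ],
    }


template = build_template_skeleton("loadtestvmss", "Standard_D1_v2", "azureuser")
print(json.dumps(template, indent=2))
```

Because the result is plain JSON text, it can be checked into source control and compared between versions like any other file, which is one of the benefits of templates mentioned above.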


Scalability is one of the key reasons that cloud computing has become popular, and the ability to quickly scale up a particular workload, on demand, is useful in a wide range of contexts. Microsoft has provided a simple GUI for creating scale sets in the Azure portal, so you can easily and quickly create a scale set to test applications or handle high-demand workloads.
