From One Web Server to Two: Making the Leap to a Web Farm

Grow your own web farm by adding hardware, setting up load balancing, and putting the servers on a cluster

Richard Campbell

October 30, 2009



As web developers building applications in ASP.NET, we've been told that scaling our website will be easy: you just add more web servers. But the reality is not that simple. There are quite a few steps involved in actually getting multiple web servers working on the same website. Also, there are key design decisions in your application that can significantly impact how your web farm works. However, in the 24 x 7 web world, there is no substitute for building a web farm; you just have to know when and how.

When to Add a Web Server

So when should you add a second web server? There are really only three reasons to do so. By far the most common reason is reliability: When you have only one web server, having it die on you means your website is down. Adding a second web server gives you redundancy.

If reliability is your motivating reason, you should be looking at your entire infrastructure as well. You may need to significantly re-architect your infrastructure for redundancy. What about your database server? Can your website stay up if it fails? Can you seamlessly redirect web traffic from one server to the next? Once you head down the path of guaranteed uptime, you're going to need more than one of everything. Part of that goal is the web server, but it's important to remember that it is only part of a larger system that needs to be reliable.

The second reason for adding a web server is performance. It's easy to believe that if you double the number of web servers (say, from one to two), you'll double the performance of the website. The reality is not that nice. Although it's possible to improve the average response time of your website by adding more servers, there is certainly no guarantee of that. It is entirely possible to add a web server to a site and have no appreciable improvement in performance.

To know for sure that you'll get a performance benefit from adding web servers, you need to know two things. The first is that as the number of users goes up, your website's performance goes down. Performance Monitor can tell you this through the requests per second counter (part of ASP.NET) and the average response time counter (part of ASP.NET Applications). When average response time rises as the requests per second counter goes up, that's a sign that you're performance bound. But there's more: The second key factor in knowing you'll benefit from adding a web server is being able to show that the web server is buried, that is, that it cannot serve any more users than it's currently serving.

Again, Performance Monitor can help you determine this. Look for pinned processors (CPU %), maxed-out memory (.NET Memory Heap), and request queues growing out of control (part of ASP.NET). It is easy to fool yourself looking at stats like this. For example, you could have a web application that's completely dependent on retrieving data from the database. Adding more web servers won't help you; you'll just have more servers waiting around for the database to deliver data. In fact, the additional database connections might make things even slower! However, in that scenario, it's not likely that your CPU will be pinned, but it is possible that you'll have significant request queues.
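The two checks described above can be sketched as a small classifier over load-test samples. This is only an illustration in Python, not a Performance Monitor API: the `Sample` fields, the 90 percent CPU threshold, and the 1.5x slowdown factor are all assumptions chosen for the example, and you would feed it numbers exported from your own counters.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """One reading taken during a load test (field names are hypothetical)."""
    requests_per_sec: float
    avg_response_ms: float
    cpu_percent: float
    queued_requests: int


def diagnose(samples, cpu_pinned=90.0, slowdown=1.5):
    """Apply the two checks: slower under load, and web server saturated."""
    ordered = sorted(samples, key=lambda s: s.requests_per_sec)
    half = max(1, len(ordered) // 2)
    low, high = ordered[:half], ordered[-half:]

    # Check 1: does response time get worse as requests per second rise?
    low_ms = mean(s.avg_response_ms for s in low)
    high_ms = mean(s.avg_response_ms for s in high)
    if high_ms < slowdown * low_ms:
        return "not performance bound"

    # Check 2: is the web server itself the thing that is buried?
    if mean(s.cpu_percent for s in high) >= cpu_pinned:
        return "web server saturated: another web server may help"
    if high[-1].queued_requests > low[0].queued_requests:
        return "queues grow but CPU is idle: look at the database first"
    return "slow under load, cause unclear"
```

The second branch captures the database-bound case: response times climb and queues grow, but the processors are not pinned, so more web servers would mostly wait on the same database.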

There are more impediments to getting performance benefits from web servers than just database bottlenecks. I'll go into more of these issues later in the article, but it is well worth your time to think hard about how your application functions before presuming that adding servers will improve performance. It can be a career-limiting mistake to spend thousands of dollars on new computer equipment that provides no benefit.

Finally, the third reason to go from one web server to two is seamless website updating. If you have a 24 x 7 website, taking the site down to do updates is not an option. By having two or more web servers, you can pull servers out of the pool, update them with the latest version of the web application, and place them back into the pool without the site ever being down. Actually making this process work is tricky, but it's a highly valued skill in the modern web world.

Buying Hardware

When you're getting ready to build your web farm, especially for the first time, leave your existing gear alone. It's time to buy all new gear; you'll need to do quite a bit of testing before you'll be ready to take over for the existing equipment. Trying to buy just enough gear to turn the existing setup into a web farm is a recipe for disaster.

If this is your first web farm, you likely have only three sets of working equipment now: your development gear, QA gear, and the production environment. The development and QA gear likely are very similar to each other, whereas the production system will be different, possibly with more redundancy and performance.

Once you enter the world of web farms, you need a pre-production environment. This equipment needs to reflect the production environment. It doesn't have to be identical, but it should be close. The pre-production environment is where you can run load tests and make sure that your application is going to behave properly in the web farm. When you're shopping for your first web farm, buy the pre-production environment first. It will help teach you what you'll really need for production.

Ideally, your web servers should be symmetrical: all identical machines. Although it's possible to load balance between asymmetrical machines, doing so will very likely cause more problems than it's worth. Any perceived savings in cost will be quickly wiped out by the additional cost of diagnosing problems with the asymmetrical configuration. The web servers don't need any internal redundancies, like multiple hard drives or power supplies. After all, if you've set up your farm correctly, you should be able to have a web server fail with no significant impact on the application at all. Web servers in a web farm should be inexpensive and plentiful.

Load Balancing

One of the most challenging decisions to make when setting up your first web farm is how you're going to load balance between your servers. Microsoft offers a few ways to do load balancing. The simplest is Network Load Balancing (NLB), which comes with every copy of Windows Server. If you're running Microsoft IIS 7, you also have Application Request Routing (ARR), which will do load balancing in addition to request routing (e.g., separating image requests from ASPX requests). Finally, Microsoft ISA Server also has a load balancing feature.

There are numerous third-party load balancing hardware solutions as well. The biggest (and most expensive on the market) come from companies such as F5 Networks, Citrix Systems, and Cisco Systems. There are also lower-cost solutions from companies such as Zeus Technology, Coyote Point Systems, and Barracuda Networks. Hardware load balancers offer a large number of options for load balancing as well as other features such as SSL offloading. If you're considering the third-party load balancer option, remember that you may need to buy two for redundancy's sake, and consider the costs of training and/or consulting for configuration. You'll also need a load balancer in your pre-production environment. It's smart to get to know your load balancer well; it's an expensive piece of equipment with many features that can help your website.

Setting Up NLB

You can't argue with the price of NLB: It's included with Windows. It also requires no additional hardware, since it runs on the web servers themselves. There's no central point of control; every web server knows what every other web server is doing, so there is no single point of failure. NLB is an algorithmic load balancer, splitting the workload between the web servers in a round-robin style by IP address. You configure rules for what NLB load balances and how the servers are balanced.

Every web server in an NLB cluster listens on a common virtual IP address. Because all the servers have the same algorithm in them, when a request comes in, they know which server should respond to that request, and only that server will respond to a given request. The web servers keep in touch with each other via a status packet that's sent every few seconds on the virtual IP address. When a web server stops sending that packet, the server is dropped from the cluster automatically. The remaining servers recompute the algorithm to load balance without the failed server.
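The idea that every server runs the same algorithm and independently reaches the same answer can be sketched in a few lines of Python. This is not NLB's actual algorithm or packet format: the SHA-256 hash, the five-second timeout, and the function names `owner`, `should_respond`, and `live_hosts` are all assumptions made for illustration.

```python
import hashlib


def owner(client_ip, hosts):
    """Return the host that should answer client_ip.

    Every host runs this same function over the same host list,
    so they all agree without asking a central coordinator.
    """
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(hosts)
    return sorted(hosts)[index]


def should_respond(me, client_ip, hosts):
    """Each host sees every request but answers only the ones it owns."""
    return owner(client_ip, hosts) == me


def live_hosts(last_heartbeat, now, timeout=5.0):
    """Drop any host whose status packet has not been seen recently."""
    return sorted(h for h, seen in last_heartbeat.items() if now - seen <= timeout)
```

When a host falls out of `live_hosts`, the survivors simply call `owner` with the shorter list, which is the "recompute the algorithm" step in miniature.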

You have to install NLB on your web servers; it isn't installed by default. NLB is part of the Windows Networking features. It's important to note that NLB is not a web-server-specific feature; it can load balance anything and is used with Microsoft Exchange Server, SQL Server, and other Windows services.

Once you've installed NLB, you can start creating a cluster. The first step in creating a new cluster is selecting the first host to be in the cluster, that is, the first web server of the cluster. In Figure 1, I've entered the IP address of the first web server for the cluster. All my web servers have NLB installed on them, but you don't have to run the NLB administrator from a web server; you can run it on any computer that can connect to the web servers.

Figure 1: Creating a new cluster

The next step in setting up the cluster is specifying host parameters, as Figure 2 shows. The priority of the host matters only when a request comes into the virtual IP that is not covered by the port rules you've set. When a request comes in that isn't covered by port rules, the server with the lowest priority number handles the request. Another host parameter is the initial host state. By default this is set to Started, meaning that every time the web server starts up, it joins the load balancing cluster immediately. You can choose to have freshly started servers not join the load balancing cluster until you tell them to.

Figure 2: Setting the host parameters for the first host in the cluster

Once the host parameters for the first host in the cluster are set, it's time to specify the cluster IP address. This is the virtual IP address that all servers in the cluster will listen to. It's important to make sure this IP is not used by anything not part of the cluster. As you see in Figure 3, you also set the cluster operation mode, which handles how MAC addresses work in the cluster. The default is Unicast mode, which effectively makes every web server in the cluster have the same MAC address. Multicast mode lets the servers have their own MAC addresses. IGMP multicast is used only on networks taking advantage of the IGMP protocol.

Figure 3: Configuring the cluster IP address

As your cluster grows larger (beyond two servers), you'll find that the amount of network traffic generated is hard on your network, both in test and in production. It's important to isolate that traffic from the rest of your network, either by using isolated switches or Virtual LANs (VLANs).

Your next step is to set port rules; Figure 4 shows the dialog box where you do so. Since I'm building a web server cluster, I need to create a rule only for port 80 using the TCP protocol. The filtering mode is the key part of this dialog box: It decides the affinity of the web farm. Setting affinity to None means there's no affinity; a given request IP address can go to a different server every time. Setting affinity to Single means a given IP address always goes to the same server. Network affinity is used when you have tiers of clusters, something you'll need when you have many (more than 10) web servers. Affinity is a key issue in making web farms work well. I'll discuss this in more detail later.
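The difference between the None and Single settings comes down to what identifies a request when a server is chosen. The Python sketch below is an illustration under assumptions, not NLB's implementation: it assumes that Single keys on the client IP alone, that None also mixes in the client's source port, and the `pick_server` name and SHA-256 hash are invented for the example.

```python
import hashlib


def pick_server(client_ip, client_port, servers, affinity="single"):
    """Choose a server for one incoming connection.

    affinity="single": key on the client IP only, so an IP sticks to one server.
    affinity="none":   key on IP and source port, so connections from the
                       same IP can land on different servers.
    """
    if affinity == "single":
        key = client_ip
    elif affinity == "none":
        key = f"{client_ip}:{client_port}"
    else:
        raise ValueError(f"unsupported affinity: {affinity}")
    digest = hashlib.sha256(key.encode("ascii")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

With `affinity="single"`, a hundred connections from one address all reach the same server; with `affinity="none"`, the same hundred connections spread across the farm, which is only safe once no server holds user-specific state.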

Figure 4: Creating port rules for the cluster

After the port rules are set up, the cluster will start to set itself up. This takes some time, since NLB is now reprogramming the NIC in the first web server, setting it to listen on the virtual IP address. When this process is finished, you have a cluster of one server set up. The next stage is to add a second host to the cluster, as Figure 5 shows. The process is identical to adding the first host to the cluster, except that the cluster IP address is already set, as are the rules.

Figure 5: Adding the second host to the cluster

Since in this example there are only two servers, once the second server is added, the cluster is finished. Figure 6 shows NLB Manager after the second server has been added but is not yet fully configured in the cluster. The status of the servers will cycle through three phases: Pending, Converging, and Converged. It can take several seconds for this to happen.

Figure 6: Configuration changes underway in NLB Manager

Once the cluster is set up, you should be able to hit your web application on the virtual IP address. The old IP addresses of the servers are still there, but they are specific to each server. You will want your users to only ever reference the web farm by the virtual IP; that means changing some DNS entries. The original IPs still work, and administrators will find them very useful for checking individual servers, but the users themselves shouldn't ever use them.

So What's the Big Deal About Affinity?

Affinity is the term used to indicate that a given user on your website is storing information specific to that user on a given web server. The user is bound to the web server because they need to keep coming back to that server to use the information stored there. In ASP.NET, typically the bound data is in-process session data. If your user is loading items into a shopping cart on your website and the shopping cart is stored in an in-process session, sending that user to a different server will cause the items in their cart to seem to disappear or, worse, the user will get an Object Not Found error.

If you store your session data in process but don't set your load balancing to bind the user to the server, you'll confuse yourself and your users. It isn't always obvious that the problem is your load balancing configuration. Sometimes your site will work fine, sometimes things will disappear, and sometimes you'll get errors.

Software- and hardware-based load balancers alike offer some sort of affinity. The simplest affinity sticks a given IP address to a given server. This is what NLB supports. The downside to that sort of affinity is that you can actually have a large number of users coming from the same IP address, so the server that gets stuck with that IP address can be overwhelmed.

A better form of load balancing affinity is cookie-based: using the ASP.NET session cookie as the affinity identifier. With cookie-based load balancing, a given cookie is always sent back to the same server. Most third-party hardware load balancers support this technique, referring to it as sticky sessions, as does ARR. Whichever technique you use, getting the configuration right is important. An even better solution, though, is to get rid of affinity.
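To see why cookie-based affinity spreads load better than IP-based affinity when many users share one address, consider this rough Python comparison. It is a simulation under assumptions, not how any particular load balancer works: the `bucket` hash, the `(ip, cookie)` visitor tuples, and the function names are all made up for the example.

```python
import hashlib
from collections import Counter


def bucket(key, servers):
    """Map an affinity key (an IP or a session cookie) to one server."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def load_by_ip(visitors, servers):
    """Count visitors per server when the client IP is the affinity key."""
    return Counter(bucket(ip, servers) for ip, _cookie in visitors)


def load_by_cookie(visitors, servers):
    """Count visitors per server when the session cookie is the affinity key."""
    return Counter(bucket(cookie, servers) for _ip, cookie in visitors)
```

For example, a thousand visitors behind one corporate proxy could be modeled as `[("198.51.100.7", f"session-{n}") for n in range(1000)]`. Keyed by IP, they all land on a single server; keyed by cookie, they divide between the servers because each session hashes separately.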

Recall that there are only three reasons to go to multiple web servers: reliability, performance, and seamless updates. Affinity impairs all three of these goals. If you need reliability, storing your session data in a web server means that session data vanishes when the server fails, which leads to unhappy users. If you want performance, affinity adds overhead; it takes extra work for NLB or any other load balancing system to keep a given user stuck to a given server. And if that server gets overloaded with work, there's no way to get away from that busy server; you're stuck to it. Finally, seamless updating means getting all the users off the server when it's time to update. If you've got users stuck to that server, it can take hours or more to get all the users off of it without negatively impacting them.

Clearly then, the goal of any web farm is to get rid of affinity. So what does it take?

Moving Session Out-of-Process

Microsoft offers two options for moving session out-of-process: State Server and SQL Server. State Server is a free bit of software included with ASP.NET for storing state data. It's a fast, simple server with no redundancy options. Typically you'd have a dedicated server running behind your web servers for storing state data, and all your web servers would fetch state data from it.

Unfortunately, in this configuration if the state server fails, you've lost all the session data from all users. And since there's no redundancy option, there isn't much you can do when it fails except set up a new state server and get your users to start over.

SQL Server is a popular option for out-of-process session data for a couple of reasons. The first is that most of the time, you're already running SQL Server, so there's no additional licensing or hardware needed. Also, there are redundancy options for SQL Server. Unfortunately, SQL Server is the slowest of the out-of-process options, although the actual amount of time it takes to store and retrieve session data from SQL Server is not significant if the amount of data stored in session isn't too large.

To switch your session to out-of-process, you modify web.config to specify how and where you want to store session data. But before you do that, have you marked all objects going into the session object as serializable?

When you switch to out-of-process session, you're switching from storing session data in the memory of your web server to a more complex process. At the beginning of each web request that references a web page using session data, ASP.NET will make a call to the out-of-process session store to retrieve the session. This data is in a serialized form, that is, in a form that transmits over the network properly. Once it arrives back at the web server, the web page processing continues. When something is referenced from the session object, it is de-serialized back into memory. Note that only when an object in the session object is referenced is it actually placed into memory. When the page is finished processing, the session object with any changes is serialized again and sent back to the session store for the next request.
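The request cycle described above can be modeled in a few lines. This Python sketch is an analogy, not ASP.NET's implementation: `pickle` stands in for .NET serialization, the in-memory `SessionStore` stands in for State Server or SQL Server, and every name here is hypothetical. It also deserializes the whole session at once, whereas the article notes that ASP.NET materializes objects only when they are referenced.

```python
import pickle


class SessionStore:
    """Stand-in for the out-of-process store; it holds only serialized bytes."""

    def __init__(self):
        self._blobs = {}

    def load(self, session_id):
        return self._blobs.get(session_id)

    def save(self, session_id, blob):
        self._blobs[session_id] = blob


def handle_request(store, session_id, page):
    """Fetch the session, run the page against it, then write it back."""
    blob = store.load(session_id)
    session = pickle.loads(blob) if blob is not None else {}
    response = page(session)
    store.save(session_id, pickle.dumps(session))
    return response


def add_to_cart(item):
    """Build a page handler that appends an item and returns the cart size."""
    def page(session):
        session.setdefault("cart", []).append(item)
        return len(session["cart"])
    return page
```

Because nothing about the session lives in the web server between requests, any server in the farm can call `handle_request` for a given session ID and see the same cart.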

To mark your objects as serializable, you apply the Serializable attribute to their classes. These days, virtually every object can be made serializable, but since classes are not marked serializable by default, if you move your session out-of-process without adding the attribute, you'll get a serialization error when ASP.NET attempts to serialize your object in the session object. It's a vague error and takes time to debug.

In an existing application, you'll have to do a search for every reference to the session object and check exactly what you're putting in it. Make sure each class is marked with the Serializable attribute. This is also an excellent time to think hard about exactly what you're stuffing into the session object. More is definitely not better; now that you're sending your session data out of your web server with every web request, you want to keep your session object small.
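The same kind of audit can be automated once you know what ends up in the session. The Python sketch below is an analogy only: `pickle` plays the role of .NET serialization, and `audit_session` is a hypothetical helper, not part of ASP.NET. It reports the serialized size of each session entry and flags the entries that cannot be serialized at all.

```python
import pickle


def audit_session(session):
    """Return {key: serialized size in bytes, or None if not serializable}."""
    report = {}
    for key, value in session.items():
        try:
            report[key] = len(pickle.dumps(value))
        except (pickle.PicklingError, TypeError, AttributeError):
            # This entry would fail when the session leaves the web server.
            report[key] = None
    return report
```

Running a check like this in pre-production, before switching session modes, surfaces both problems the article warns about: objects that will not serialize, and entries large enough to slow down every request.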

There's another significant advantage to going out-of-process with session: You stop using .NET memory for session on your web server. Most really busy web servers are running low on memory most of the time. Getting session out of that memory can really help with performance and reliability.

Getting the Real Benefits of Web Farms

The real benefits of web farms are reliability, performance, and seamless updates. Once your application doesn't require affinity, you can easily add more servers on demand and recover gracefully from failed servers. When you want to update your servers, you drain them of connections (even NLB has this feature), remove them from the pool, update the server with the latest version of the application, then add it back into the pool. This process can be scripted for rapid, seamless website updating.
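The drain, remove, update, and restore cycle lends itself to a script. The Python sketch below shows only the control flow: `drain` and `deploy` are hypothetical callbacks that you would implement with your load balancer's and your deployment tool's own commands, and the pool is modeled as a plain list of server names.

```python
def rolling_update(pool, drain, deploy, min_in_rotation=1):
    """Update every server in the pool one at a time without emptying it.

    pool   -- mutable list of server names currently in rotation
    drain  -- callback that waits for a server's connections to finish
    deploy -- callback that installs the new application version
    """
    log = []
    for server in list(pool):  # iterate over a copy; the pool changes below
        if len(pool) - 1 < min_in_rotation:
            raise RuntimeError("not enough servers left to keep the site up")
        drain(server)
        pool.remove(server)
        log.append(("removed", server, len(pool)))
        deploy(server)
        pool.append(server)
        log.append(("restored", server, len(pool)))
    return log
```

The `min_in_rotation` guard is the important design choice: the script refuses to pull a server if doing so would leave too few in the pool, so a single-server "farm" fails fast instead of taking the site down.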

To get these benefits, you need to do lots of testing and practice. Use new, separate hardware for this; ultimately you'll need to build a pre-production environment. Get good at load testing; you want to test your web farm and see what benefits you're getting each time you add a web server. Practice failures, too: Pull the plug on your servers during a load test and see what failure looks like.

A web farm is more than just going from one server to two; it's significantly changing the architecture of your application and its infrastructure. With some effort and practice, those changes will take your web application to new levels of success.

Richard Campbell ([email protected]) is a co-founder of Strangeloop Networks. He has more than 30 years of high-tech experience and is both a Microsoft Regional Director and Microsoft MVP. In addition to speaking at conferences around the world, Richard is co-host of .NET Rocks! (www.dotnetrocks.com) and host of RunAsRadio (www.runasradio.com).
