So your Web site is up and running and people are flocking to it. Time to file your IPO and retire to Jamaica, right? Wrong. Handling your new customers is the challenge of every successful Web site. With Windows 2000 or Windows NT as your host, you have several options to scale your Web site. But how do you know which options are right for your Web site? Knowing how to handle large numbers of users and incorporate high availability and robust handling of hardware failures are the keys to making scalability work for you.
Scaling Web sites is not a simple task, but it's not daunting either. Each Web site is based on several key components that work together to serve your unique customer base, and it's important to understand how to scale each piece, including your Web and database servers. (This article assumes that your Web site consists of Web servers on the front end and database servers on the back end.)
Change Your Thinking
PC hardware used to be expensive and unreliable. You typically relied on a few powerful (albeit expensive) servers to handle the breadth of work on your network, an approach that was a throwback to the mainframe days. Today, PC hardware is relatively inexpensive, reliable, and small, which allows for the specialization and redundancy necessary to solve the reliability and high-availability problems.
Despite these advances, hardware still fails. Those little tenuous pieces of plastic, silicon, and copper aren't resilient enough. Cards die, hard disks crash, and power supplies fail. To achieve high availability, you must change your thinking and admit that hardware fails. With this shift in thinking you can anticipate these failures by incorporating redundant systems.
Redundancy means having replacements available for every critical piece of the puzzle, right? Yes and no. I worked for a major oil company in the late 1980s, and the company incorporated redundancy for every desktop computer in our department. The onsite technician had access to a room of replacement parts and could fix a problem within 6 hours of a hardware failure. Unfortunately, 6 hours today may as well be 4 years in Internet time. So by today's standards, redundancy means having machines available to take over for failed systems immediately: no downtime, no customer complaints, and no 404 errors.
So how do you incorporate redundancy to satisfy all these requirements? By using server farms—banks of identical systems that can handle the same job. This approach is best suited for Web servers, but the same principle can apply to many different types of tasks. By creating farms of servers with identical roles, you can achieve high availability and handle large amounts of traffic.
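The effect of identical, redundant servers on availability can be estimated with simple arithmetic. The following Python sketch is illustrative only: the 99 percent per-server figure is an assumed number, not a measurement, and the formula assumes that servers fail independently, which real farms only approximate because machines often share power, network gear, and software faults.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def farm_availability(server_availability: float, servers: int) -> float:
    """Probability that at least one of `servers` identical machines is up.

    Assumes failures are independent of one another.
    """
    return 1.0 - (1.0 - server_availability) ** servers


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # 0.99 is an assumed per-server availability, chosen for illustration.
    for count in (1, 2, 3):
        combined = farm_availability(0.99, count)
        hours = downtime_hours_per_year(combined)
        print(f"{count} server(s): {combined:.6f} available, "
              f"about {hours:.2f} hours down per year")
```

Under these assumptions, each additional server shrinks the expected downtime sharply, which is the practical argument for a farm over one large machine.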
Before you jump into creating a server farm, you should understand how to manage the servers. To make your redundant banks of servers work, you need to manage distributing requests across the farm. This process is called load balancing. Many load-balancing solutions exist, and they can best be broken down into two categories: software and hardware.
So what does load balancing do? Let's look at an example that uses Web servers. When you type http://www.mywebsite.com into your browser, several processes occur. First, the browser attempts to resolve the user-friendly domain name (www.mywebsite.com) into an Internet TCP/IP address (e.g., 192.168.0.1). The browser then attempts to connect to a port (usually port 80 for Web sites). Load balancing comes in at one of two places, either during this resolution (the simpler approach) or during the connection. Load balancing represents your site as one address and routes work (e.g., a user request for a Web page) to one of many servers.
Engineers have spent a lot of time developing sophisticated algorithms for balancing requests across multiple servers, but for most Web applications, a simple round-robin method works best. A round-robin approach routes the work to the first machine, then to the second, and so on until all machines have had their turn. The process then starts over again with the first machine. The short-lived time users spend at a site (typically less than 30 minutes) provides few opportunities for servers to become unbalanced. When you're deciding which load-balancing system to deploy, keep in mind that although many load-balancing algorithms make for interesting product literature, few are in use on real-world Web sites.
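To make the round-robin idea concrete, here is a minimal Python sketch. It is not how WLBS or any hardware load balancer is implemented; it only shows the rotation itself. The server names are hypothetical.

```python
from itertools import cycle
from typing import Iterator, Sequence


def round_robin(servers: Sequence[str]) -> Iterator[str]:
    """Return an endless iterator that visits each server in turn."""
    if not servers:
        raise ValueError("at least one server is required")
    return cycle(servers)


if __name__ == "__main__":
    farm = ["web1", "web2", "web3"]  # hypothetical server names
    picker = round_robin(farm)
    for request_id in range(1, 8):
        # Each request goes to the next machine; after web3 the
        # rotation starts over with web1.
        print(f"request {request_id} -> {next(picker)}")
```

The rotation keeps no record of how busy each machine is, which is why it is simple to operate and why it relies on visits being short.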
One well-known load-balancing software solution is Microsoft's Windows NT Load Balancing Service (WLBS). WLBS is available for Windows NT Server 4.0, Enterprise Edition (NTS/E); you can also download the software from Microsoft's Web site. Win2K repackaged this code into Network Load Balancing (NLB), which comes with the Win2K Advanced Server (Win2K AS) license. For more on NLB, see Microsoft’s Win2K Web site.
WLBS can determine server load and the availability of your hardware pool. You can configure a pool of up to 32 machines to use with WLBS. The software does an admirable job for such a solution, but performance might be an issue for large-scale sites.
Hardware-based load-balancing solutions consist of a proprietary piece of hardware located between the Internet and your Web servers. These systems do the same job as the software-based solutions, but they are optimized to route traffic to your Web servers. These hardware solutions can also perform tasks at a lower level in TCP/IP than the software solutions because they can change packets on the fly. Let's look at two hardware-based solutions: F5 Networks' Big-IP Load Balancer and ArrowPoint's CS-800.
F5 Networks' Big-IP Load Balancer device is a relatively low-cost, high-performance solution for small Web sites. BigIP Squared is limited to 45Mbps throughput, so larger Web sites can run out of bandwidth quickly. The company has also introduced the BigIP HA+ device, which provides 280Mbps of throughput—large enough for most Web sites.
ArrowPoint's CS-800 provides significant throughput (20Gbps) and an integrated firewall. It is a good solution if you expect your site to break into the top 100 Web sites one day.
So which solution is best, software or hardware? As a big fan of hardware specialization, I tend to lean toward black-box solutions, but don't overlook the benefit of using familiar Windows-based tools to configure and manage load balancing. Take time to talk with hardware and software vendors to learn the complexities of managing your server farm. Now that you understand how to handle load balancing, you can turn your attention to setting up servers to handle client requests.
Web farms, as I've defined for this article, consist of identically configured servers that handle a certain number of users visiting a Web site. These users don't care which machine they connect to, as long as the Web site works as it would if they connected to one server. A Web farm lets you remove a failed machine from the list of machines available to the load balancer and continue until the machine is available again, regardless of whether that failure means rebooting or replacing the hardware. As your site grows, you can add more machines to handle additional capacity.
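The behavior described above (remove a failed machine from rotation, carry on, and restore or add machines later) can be sketched as a small pool class. This Python sketch is illustrative: the class name, method names, and server names are all hypothetical and do not correspond to any vendor's management interface.

```python
class WebFarm:
    """Round-robin pool that skips servers marked as failed."""

    def __init__(self, servers):
        self._servers = list(servers)   # fixed rotation order
        self._available = set(servers)  # servers currently in service
        self._next = 0                  # index of the next candidate

    def add(self, server):
        """Add capacity: a new server joins the rotation."""
        if server not in self._servers:
            self._servers.append(server)
        self._available.add(server)

    def mark_failed(self, server):
        """Take a server out of rotation (reboot, repair, or replacement)."""
        self._available.discard(server)

    def mark_restored(self, server):
        """Return a known server to rotation."""
        if server in self._servers:
            self._available.add(server)

    def pick(self):
        """Return the next available server in round-robin order."""
        for _ in range(len(self._servers)):
            candidate = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if candidate in self._available:
                return candidate
        raise RuntimeError("no servers available")


if __name__ == "__main__":
    farm = WebFarm(["web1", "web2", "web3"])  # hypothetical names
    farm.mark_failed("web2")                  # web2 is skipped from now on
    print([farm.pick() for _ in range(4)])
    farm.mark_restored("web2")                # web2 rejoins the rotation
    farm.add("web4")                          # extra capacity as the site grows
    print([farm.pick() for _ in range(4)])
```

Users never see which machine answered, so taking web2 out of the pool changes only how the remaining requests are distributed.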
When deciding on a load-balancing solution, be aware that different hardware vendors have different management interfaces to let you add and remove machines from the list of available servers. Some provide for automatic testing for availability and others do not. As a result, you should make sure you understand the amount of work your group requires to manage this process. Also, most solutions limit how many machines you can include in a group. As I mentioned previously, WLBS limits you to 32 machines. This number might seem huge, until you need more. Don't base your decision solely on price; take time to study the price versus risk decision to ensure the long-term viability of any scaling solution.
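If your load balancer doesn't test availability automatically, a simple probe can show how much work such a test involves. The Python sketch below is one possible approach, not a vendor feature: it treats a server as available if it accepts a TCP connection on the Web port. The function names, the 2-second timeout, and port 80 as the default are assumptions for illustration, and a successful connection doesn't prove that the site is serving correct pages.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False


def available_servers(servers, port=80, probe=is_listening):
    """Filter a list of hosts down to those that pass the probe."""
    return [host for host in servers if probe(host, port)]
```

Passing the probe as a parameter lets you swap in a stricter check later, such as requesting a known page and verifying the response, without changing the filtering logic.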
Unfortunately, scaling your Web site isn't as simple as setting up a load balancer and purchasing a dozen machines. Hosting your Web site on multiple machines requires that you change your expectations about how users will use your Web site. Just because a user connects to your site on a particular server doesn't mean that the user will continue to use that same server. The benefits of machine failover and decreased downtime of your Web servers are tempered by the fact that you must manage your user information in a more distributed manner.
User data management (or state management) is complicated because often you must keep that information on separate systems in case a Web server fails during a user visit. So where should you store this data? You have two choices when it comes to storing state information about your users: databases and cookies. Databases are logical places for long-lived data (e.g., user IDs, passwords, demographics), and cookies are useful for short-lived information (e.g., shopping baskets, search criteria). Although I won't go into the development details of each, you can find information in Ted Pattison’s article "Working on a Web Farm," June 1999 Microsoft Internet Developer Magazine.
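As a rough illustration of keeping short-lived state in a cookie, the following Python sketch stores a shopping basket in the browser so that any server in the farm can read it back. The cookie name, the basket layout, and the 30-minute lifetime are assumptions chosen for the example. A production site would also need to validate or sign the value, because users can edit their own cookies.

```python
import json
from http.cookies import SimpleCookie
from urllib.parse import quote, unquote


def basket_to_set_cookie(basket: dict) -> str:
    """Build the value of a Set-Cookie header that carries the basket."""
    cookie = SimpleCookie()
    # JSON is percent-encoded so the value contains only cookie-safe characters.
    cookie["basket"] = quote(json.dumps(basket, sort_keys=True))
    cookie["basket"]["path"] = "/"
    cookie["basket"]["max-age"] = 1800  # 30 minutes, an assumed visit length
    return cookie["basket"].OutputString()


def basket_from_cookie_header(header: str) -> dict:
    """Recover the basket from the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(header)
    if "basket" not in cookie:
        return {}
    return json.loads(unquote(cookie["basket"].value))


if __name__ == "__main__":
    basket = {"widget-42": 2, "widget-7": 1}  # hypothetical item IDs
    set_cookie = basket_to_set_cookie(basket)
    print("Set-Cookie:", set_cookie)
    # The browser returns only the name=value pair on later requests,
    # possibly to a different server in the farm.
    returned = set_cookie.split(";")[0]
    print(basket_from_cookie_header(returned))
```

Because the state travels with each request, no Web server has to remember the user, and a server failure in the middle of a visit loses nothing.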
Remember that scalability is only as good as the weakest link, so creating a robust Web farm that has one database server to handle its load is of little use. Once you've scaled the Web servers, you need to look at scaling your database servers.
Scaling SQL Server
Microsoft SQL Server can handle many transactions. However, there are limits to the amount of work one machine can accomplish. Machines will always be overwhelmed at some point, which means that no matter how many Web servers you have in your Web farm, your site will eventually slow to a crawl. To help prevent this slowdown, it's critical to ensure that you scale your database servers as part of the scalability solution for your Web site.
When you scale your database servers, you must address two problems: server load and high availability. No single solution can solve both problems. Clustering with Microsoft Cluster Server (MSCS), aided by Microsoft SQL Server Failover Support, addresses high availability: using these tools, you can make two machines appear as one machine and provide for failover in case of catastrophic failure. Segmentation addresses server load: it places separate sets of data on separate machines to distribute the work.
Clustering SQL Server
The term clustering is overused and might be confusing to some readers. In SQL Server terms, clustering refers to a type of indexing, which has nothing to do with clustering servers and is not the clustering I refer to in this article. In WLBS terms, a cluster is the group of servers that the load balancer uses as a pool of machines, which is also not what I mean here. In Win2K and NT, and in this article, clustering refers to the combination of two core technologies: MSCS and SQL Server Failover Support. For information about configuring MSCS, see the Microsoft article "Clustering Support for Microsoft SQL Server 6.5," and for information about SQL Server Failover Support, see the Microsoft article "Using SQL Server Failover Support."
MSCS for NTS/E, Win2K AS, or Windows 2000 Datacenter Server (Datacenter) lets you configure two machines as one virtual server. MSCS also lets you configure the first machine so that the OS will defer all operations to the second machine if the first one fails. You must install NTS/E, Win2K AS, or Datacenter and MSCS separately on a local disk for each machine. Depending on the exact configuration, you then install SQL Server on one or more shared disks and configure one or more SQL Server virtual servers.
A SQL Server virtual server is a composite entity to which all clients connect. This virtual server is what lets failover occur without any client noticing. In this way, the virtual server is analogous to the virtual Web server address that the load balancers use.
You can configure the virtual server in two ways: Active/Passive and Active/Active. In the Active/Passive configuration, you install one SQL Server system and use the second machine only as a failover system. In the Active/Active configuration, both machines run SQL Server and actively take database requests. If either system fails, the surviving machine takes on the load of both servers. Be aware that the two SQL Server systems in an Active/Active configuration gain scalability simply by segmenting the data; the two servers are not identical. Failover simply means that each server can handle the other's load, if necessary, in time of catastrophic failure.
When setting up your virtual servers, you must weigh the benefits and risks of each configuration. In the Active/Passive configuration, you have the maximum benefits of high availability, but the high cost of duplicate machines. Conversely, in the Active/Active configuration, you get more for your money, but in the case of failure, you will suffer a severe performance penalty. For information on Active/Passive configuration, see the Microsoft article "Active/Passive Failover Configuration." For information on Active/Active configuration, see the Microsoft article "Active/Active Failover Configuration."
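A little arithmetic shows where the Active/Active performance penalty comes from. The Python sketch below is a simplified model with assumed numbers: it treats both machines as having equal capacity, expresses each machine's normal load as a fraction of that capacity, and assumes that the survivor simply inherits the failed machine's work.

```python
def load_after_failover(load_a: float, load_b: float) -> float:
    """Utilization of the surviving node once it carries both workloads.

    Loads are fractions of one machine's capacity (0.6 means 60 percent).
    """
    return load_a + load_b


def describe(config: str, load_a: float, load_b: float) -> str:
    """Summarize what a failover would do to the surviving machine."""
    combined = load_after_failover(load_a, load_b)
    status = "overloaded" if combined > 1.0 else "within capacity"
    return f"{config}: survivor runs at {combined:.0%} ({status})"


if __name__ == "__main__":
    # Assumed loads for illustration: the passive node does no work,
    # while each active node runs at 60 percent of capacity.
    print(describe("Active/Passive", 0.6, 0.0))
    print(describe("Active/Active", 0.6, 0.6))
```

In this model, an Active/Active pair survives a failure gracefully only if the two machines' normal loads add up to no more than one machine's capacity.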
Segmenting SQL Server
Segmenting your database requests involves dividing up your databases into distinct data areas across servers, either single machines (i.e., a single server or a virtual server) or clusters of machines. In this way, you can more easily scale your site to allow for more users. So, for example, by placing your user and demographic data on one server and placing your content on another server, you can divide the workload. This approach gives you more room to grow. I don't suggest starting out with a dozen SQL Server systems for a small site; instead, start with several SQL Server databases. Because SQL Server databases are discrete islands of data, you can move them to other servers with little difficulty. Then, as you increase the server load, you can simply move these databases onto separate servers and scale your database servers.
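One way to keep later moves cheap is to have your application look up each database's location in one place instead of hard-coding server names. The following Python sketch illustrates the idea; the database names and server names are hypothetical, and how you actually move a SQL Server database between machines is outside its scope.

```python
# Hypothetical starting point: several databases on one server.
DATABASE_LOCATIONS = {
    "Users": "sql1",
    "Content": "sql1",
    "Advertising": "sql1",
}


def server_for(database, locations=DATABASE_LOCATIONS):
    """Return the server that currently hosts the named database."""
    try:
        return locations[database]
    except KeyError:
        raise LookupError(f"no server configured for {database!r}") from None


def move_database(database, new_server, locations=DATABASE_LOCATIONS):
    """Return a new location map with one database reassigned."""
    server_for(database, locations)  # fail early on an unknown database
    updated = dict(locations)
    updated[database] = new_server
    return updated


if __name__ == "__main__":
    print(server_for("Content"))
    # As load grows, Content moves to its own machine; code that calls
    # server_for() needs no other change.
    later = move_database("Content", "sql2")
    print(server_for("Content", later), server_for("Users", later))
```

Starting with separate databases on one machine costs little, and the lookup table means that splitting them across servers later is a configuration change instead of a code change.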
This approach is not without limitations. For starters, SQL Server is a relational database. By locating the data discretely, you run into the problem of how to relate data in one database with data in another. SQL Server 7.0 does allow cross-database and cross-machine joins, but I don't suggest using these features because you will pay a significant performance penalty for day-to-day use of cross-database joins.
The best solution to this problem is proper planning. Consider the following example: Imagine you have a Web site that has all the news about widgets. You let anonymous users view all the news about widgets and join your user database to receive email updates about the latest widget news. You store all the news on the site in a database so you can change the look and feel of the site without modifying all the news. You also maintain an advertising database that lets you target banner ads on the site for non-anonymous users. This database lets you show banner ads based on demographic and geographic user information.
In this example, you can easily separate the user data from the content data because you never need to commingle the two data types. You can show news about widgets with disregard to any information about the user. What you can't do in this example is separate the advertising data from the user data because you will be doing joins between the two to show the appropriate ad banner. However, you can get around using these joins by storing information about the user in a cookie and using it for your advertising queries.
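The cookie workaround can be sketched as follows. This Python example uses an in-memory SQLite database as a stand-in for the advertising database; the table layout, column names, banner file names, and cookie fields are all invented for illustration. The point is that the ad query takes its demographic values from the cookie as parameters, so it never has to join against the user database.

```python
import sqlite3


def build_ad_database() -> sqlite3.Connection:
    """Create a tiny stand-in advertising database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ads (banner TEXT, region TEXT, age_group TEXT)")
    conn.executemany(
        "INSERT INTO ads VALUES (?, ?, ?)",
        [
            ("widgets-west.gif", "west", "18-34"),
            ("widgets-east.gif", "east", "18-34"),
            ("widgets-generic.gif", "any", "any"),  # fallback banner
        ],
    )
    return conn


def pick_banner(ads: sqlite3.Connection, cookie_values: dict) -> str:
    """Choose a banner using only values carried in the user's cookie."""
    region = cookie_values.get("region", "any")
    age_group = cookie_values.get("age_group", "any")
    row = ads.execute(
        "SELECT banner FROM ads "
        "WHERE region IN (?, 'any') AND age_group IN (?, 'any') "
        # Prefer the row that matches the most cookie values exactly.
        "ORDER BY (region = ?) + (age_group = ?) DESC LIMIT 1",
        (region, age_group, region, age_group),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    ads = build_ad_database()
    # Values that would have been written to the cookie when the user
    # registered; anonymous users have no such values.
    print(pick_banner(ads, {"region": "west", "age_group": "18-34"}))
    print(pick_banner(ads, {}))
```

With the demographics in the cookie, the advertising database can live on its own server, and the user database is consulted only when a user registers or signs in.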
All Together Now
By bringing clustering together with segmentation, you can provide load balancing and high availability to your database servers. Coordinating the two is not as simple as planning your Web farm, but with good database design from the beginning, you can create databases that will scale as your Web site grows. With a mix of hardware support, software changes, and a new way of thinking about hardware failure, you can have a Web site that delivers 99.9 percent availability. So, don't cancel your trip to Jamaica; just postpone it until you've had a chance to get ready for the onslaught of traffic.