How can one OS scale from home-user desktops to SMP clusters with millions of terabytes of online storage and hundreds of gigabytes of RAM? Microsoft's answer is to offer different flavors of Windows 2000 (Win2K) to meet different needs. Windows NT Server 4.0 and earlier versions offered some scalability—you could purchase regular NT Server 4.0 or, when you needed to cluster systems, you could purchase NT Server, Enterprise Edition (NTS/E). These products are identical at the core, but Microsoft designed them for use in different situations. To take this scalability concept a step further, Microsoft is offering several varieties of Win2K. Microsoft specifically designed each Win2K version for a target market. For example, Win2K Professional (Win2K Pro) is for business desktop use. Win2K Server is for departmental and workgroup use. Win2K Advanced Server (Win2K AS) is for businesses that require high availability. Win2K Datacenter Server (Datacenter) is for businesses with the most demanding environments.
To offer so many versions of Win2K, Microsoft had to improve NT's scalability in key areas: memory, storage subsystems, SMP, and directory services. In this article, I discuss each of these areas in regard to the Win2K versions and the Win2K features. I also explain the benefits that the Win2K scalability features provide. To read more about the enhancements in Win2K that make the OS more scalable than NT, see Mark Russinovich, "Inside Win2K Scalability Enhancements, Part 1," page 51.
Remember the days when 4GB was a large amount of storage for a hard disk? In enterprise and data center computing, 4GB of addressable RAM isn't enough for some applications. The primary limiter of RAM for NT 4.0 and earlier OSs was the processor.
In the world of 32-bit processors, a unique 32-bit address distinguishes every byte in memory. Because each bit in the address can be either 1 or 0, the CPU can address up to 2^32 (i.e., 4,294,967,296) bytes of RAM, or 4GB of RAM. For many applications, 4GB of RAM provides sufficient server memory. But larger realtime applications, such as online transaction processing (OLTP) applications and e-commerce applications, require a short response time, and one way to dramatically improve response time is to keep data in RAM. So, 4GB of RAM might not provide enough capacity for a realtime application.
To remove the 4GB memory limit, Intel Pentium II Xeon processors use the new Physical Address Extension (PAE) feature that lets the processor use 36 bits when addressing physical memory. (Compaq Alpha processors provide the same enhancement using another technology.) The net result of this subtle change is that with 36 bits, a processor can address up to 2^36 (i.e., 68,719,476,736) bytes of RAM, or 64GB of RAM. This amount of supported memory is an increase over the 32-bit limit by a factor of 16.
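The arithmetic behind these two limits is easy to check. The following short Python sketch is mine, not Microsoft's or Intel's; the function name is illustrative, and it assumes that "GB" here means the binary gigabyte (2^30 bytes).

```python
def addressable_bytes(address_bits: int) -> int:
    """Return how many distinct byte addresses fit in the given address width."""
    return 2 ** address_bits


GB = 2 ** 30  # assumption: the article's "GB" is the binary gigabyte

limit_32 = addressable_bytes(32)
limit_36 = addressable_bytes(36)

print(limit_32, limit_32 // GB)   # 4294967296 4
print(limit_36, limit_36 // GB)   # 68719476736 64
print(limit_36 // limit_32)       # 16
```

Each extra address bit doubles the reachable memory, so the four extra bits that PAE adds account for the factor of 16.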
Although 36-bit addressing is technically a change in processor technology, an OS must support 36-bit addressing to take advantage of this capability. Microsoft has built 36-bit addressing support into Win2K's kernel. However, Microsoft has stated that only selected Win2K versions will support 36-bit addressing. At press time, Win2K AS and Datacenter are the only Win2K versions that provide this support.
For application compatibility, developers won't need to change existing code to benefit from PAE. Because Win2K supports PAE at the kernel level, the OS can back existing applications with physical memory above 4GB without any updates, although each 32-bit process still works within its own 4GB virtual address space. Microsoft has also implemented a new set of APIs, Address Windowing Extensions (AWE), that let developers take explicit control of the additional memory by mapping it into a window in a process's address space.
One storage truth is that an organization's storage requirements will regularly outgrow the available storage. No matter what your disk space requirements are now, the odds are good that in 1 or 2 years your disk space requirements will be even greater. Whether you're looking for an OS that can smoothly scale with your growing needs or you're looking to build a large storage subsystem, Microsoft has improved Win2K to support your needs.
One of the key factors of the increased storage scalability in Win2K is the addition of reparse points to NTFS. Reparse points are special NTFS objects that instruct Win2K to execute extended file system functionality when the OS encounters the reparse points. A request can return NTFS objects either directly or through a series of file system drivers, as Figure 1 shows. Administrators can implement this extended functionality via an installable file system driver that has a tag matching the reparse point. Microsoft has included reparse points so that third-party software vendors can extend Win2K functionality. Microsoft has also included two new scalability features in Win2K that utilize reparse points—NTFS Volume Mount Point and Remote Storage Service (RSS).
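To make the tag-matching idea concrete, here is a minimal conceptual model in Python. It isn't the NTFS implementation or any Win2K API; the class names, the "HSM" tag, and the handler are all hypothetical, and the sketch assumes only what the text states: an ordinary object comes back directly, and an object that carries a reparse point goes to the driver whose tag matches.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class FileObject:
    """A simplified stand-in for an NTFS object (hypothetical model)."""
    name: str
    data: str = ""
    reparse_tag: Optional[str] = None  # None means an ordinary object
    reparse_data: str = ""             # instructions stored in the reparse point


class FileSystem:
    """Routes requests to the installed driver whose tag matches the reparse point."""

    def __init__(self) -> None:
        self.drivers: Dict[str, Callable[[FileObject], str]] = {}

    def register_driver(self, tag: str, handler: Callable[[FileObject], str]) -> None:
        self.drivers[tag] = handler

    def open(self, obj: FileObject) -> str:
        if obj.reparse_tag is None:
            return obj.data  # no reparse point: return the object directly
        handler = self.drivers.get(obj.reparse_tag)
        if handler is None:
            raise LookupError(f"no driver installed for tag {obj.reparse_tag!r}")
        return handler(obj)  # extended functionality supplied by the driver


fs = FileSystem()
# "HSM" is a made-up tag for illustration, not a real NTFS tag value.
fs.register_driver("HSM", lambda obj: f"recalled from {obj.reparse_data}")

plain = FileObject("report.txt", data="quarterly numbers")
stub = FileObject("archive.dat", reparse_tag="HSM", reparse_data="tape-17")

print(fs.open(plain))  # quarterly numbers
print(fs.open(stub))   # recalled from tape-17
```

The point of the design is that the file system itself needs to know nothing about what a given tag means; a vendor adds behavior by installing a driver for a new tag.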
Have you ever wondered what happens to an NT 4.0 computer when you try to add more disks or volumes to the system than the alphabet has letters? Each volume on an NT 4.0 server must have a unique drive letter, and after you've reached Z, you're out of luck. I hope you haven't run out of drive letters, but in enterprise systems, some large storage implementations might need a few dozen more drive letters. Microsoft developed the NTFS Volume Mount Point to solve this problem.
An NTFS Volume Mount Point is an NTFS reparse point object that instructs the file system to mount a volume in response to a user's request at a specific point in the file system structure. Suppose you have a server that has six CD-ROM drives installed on it. Each CD-ROM drive represents a volume and, with NT, takes up a drive letter. To share these volumes with your user population, you need to create six separate shares: one for each of the drive letters assigned to the CD-ROMs. With Win2K, you can define a directory on an existing NTFS volume called C:\cdroms, then define subdirectories such as cdrom1, cdrom2, and so on. Within each of those subdirectories, you place an NTFS Volume Mount Point that instructs the OS to mount the appropriate CD-ROM volume for the user and return the requested result, as Figure 2 shows. This capability is a welcome feature for administrators who are nearing the drive-letter limit on an NT server and need to add more storage capacity.
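The CD-ROM example can be modeled as a simple path lookup. The Python sketch below is a conceptual illustration only, not how NTFS resolves mount points; the resolve function, the mount table, and the volume names are hypothetical. It assumes that the deepest mount point along the requested path decides which volume serves the request.

```python
from pathlib import PureWindowsPath
from typing import Dict, Tuple

# Hypothetical mount table: directory on the host volume -> mounted volume name.
MOUNT_POINTS: Dict[str, str] = {
    r"C:\cdroms\cdrom1": "cdrom-volume-1",
    r"C:\cdroms\cdrom2": "cdrom-volume-2",
}


def resolve(path: str, mount_points: Dict[str, str]) -> Tuple[str, str]:
    """Return (volume, path relative to that volume's root) for a request."""
    requested = PureWindowsPath(path)
    best = None
    for mount_dir, volume in mount_points.items():
        mount = PureWindowsPath(mount_dir)
        if requested == mount or mount in requested.parents:
            # Prefer the deepest mount point that contains the requested path.
            if best is None or len(mount.parts) > len(best[0].parts):
                best = (mount, volume)
    if best is None:
        # No mount point on the way: the drive-letter volume serves the request.
        return requested.drive, str(requested.relative_to(requested.anchor))
    mount, volume = best
    return volume, str(requested.relative_to(mount))


print(resolve(r"C:\cdroms\cdrom2\setup.exe", MOUNT_POINTS))
print(resolve(r"C:\data\report.txt", MOUNT_POINTS))
```

With this arrangement, one share on C:\cdroms exposes every CD-ROM, and adding a seventh drive means adding one more entry rather than one more drive letter.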
You can also add additional storage capacity by creating an NTFS Volume Mount Point and directing it to a new volume, rather than adding a new drive to the system and potentially adding another drive letter. Because extending a volume set in NT 4.0 has the same net effect as adding a new drive, extending volume sets might not be the best option for large-scale storage systems.
The second scalability feature that reparse points make possible is RSS. RSS is Microsoft's implementation of Hierarchical Storage Management (HSM). RSS lets Win2K automatically move old data off the primary storage volumes (i.e., online media) and onto secondary media, such as optical media or tape media (i.e., near-line media).
You can set parameters (e.g., time intervals) that define when Win2K needs to move data, and Win2K will automatically find data outside those parameters and move that data from the primary storage area to a predefined near-line storage area. Then, to mark the removed files and directories, Win2K will place a remote storage reparse point in the files' former location that directs Win2K to look to near-line storage for the data.
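The migrate-and-stub cycle can be sketched in a few lines of Python. This is a toy simulation, not the RSS implementation: the dictionaries stand in for primary and near-line storage, the class and function names are hypothetical, and I assume a single parameter (time since last access) decides what moves.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Union


@dataclass
class OnlineFile:
    data: bytes
    last_access: datetime


@dataclass
class RemoteStub:
    """Stands in for the remote storage reparse point left behind."""
    nearline_key: str


Entry = Union[OnlineFile, RemoteStub]


def migrate(primary: Dict[str, Entry], nearline: Dict[str, bytes],
            now: datetime, max_idle: timedelta) -> List[str]:
    """Move files idle longer than max_idle to near-line storage, leaving stubs."""
    moved = []
    for name, entry in list(primary.items()):
        if isinstance(entry, OnlineFile) and now - entry.last_access > max_idle:
            nearline[name] = entry.data
            primary[name] = RemoteStub(nearline_key=name)
            moved.append(name)
    return moved


def read(primary: Dict[str, Entry], nearline: Dict[str, bytes], name: str) -> bytes:
    """Return file data, following a stub to near-line storage when needed."""
    entry = primary[name]
    if isinstance(entry, RemoteStub):
        return nearline[entry.nearline_key]
    return entry.data


now = datetime(2000, 1, 1)
primary: Dict[str, Entry] = {
    "old.dat": OnlineFile(b"old", now - timedelta(days=400)),
    "new.dat": OnlineFile(b"new", now - timedelta(days=1)),
}
nearline: Dict[str, bytes] = {}

print(migrate(primary, nearline, now, timedelta(days=365)))  # ['old.dat']
print(read(primary, nearline, "old.dat"))                    # b'old'
```

The caller of read never needs to know whether the data came from primary or near-line storage, which is the transparency that the reparse point provides.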
Suppose you manage a Win2K network for a stock brokerage firm. For compliance with Securities and Exchange Commission (SEC) regulations, you must keep copies of all transactional data for 7 years. You can use RSS to have Win2K migrate data older than a specific date off the primary disk arrays and onto secondary media, which keeps the entire 7 years' worth of data accessible. Then, when users need to access data from 5 years ago, the users simply navigate through the directory structures of their system. When Win2K hits a remote storage reparse point, Win2K obeys the instructions in the reparse point to retrieve the file from the secondary media and return the data to the users. This process is transparent to users and occurs without the need to direct support personnel to mount an old transaction backup tape. For large storage applications, RSS is a welcome addition to Win2K.
For increased scalability, Microsoft has improved SMP in Win2K. However, in the Win2K betas, Microsoft scaled back the default capabilities of the base OS from its predecessor, NT Server 4.0. As you might recall, the standard out-of-the-box implementation of NT Server supports up to four processors (i.e., 4-way SMP), and NTS/E supports up to eight processors (i.e., 8-way SMP). Because of a marketing misjudgment on Microsoft's part, the company had planned to support only 2-way SMP on Win2K Server and 4-way SMP on Win2K AS, which would have removed capabilities that NT previously had. Customers upgrading to Win2K would have been able to maintain the same level of SMP support. However, users who performed a clean installation of Win2K and wanted 4-way SMP would have needed to purchase Win2K AS (and would have paid for additional functionality that they might not have used, such as clustering support).
Recently, Microsoft reversed its SMP position and announced Win2K's final packaging. Now Microsoft plans to support 4-way SMP on Win2K Server, 8-way SMP on Win2K AS, and 32-way SMP on Datacenter out of the box. Although 32-way SMP might seem like a mind-boggling amount of processing power for one server, this power is necessary for Microsoft to position Datacenter as a competitor for big iron.
In addition to the increase to 32-way SMP support in Datacenter, Microsoft has improved SMP scalability and performance. To accomplish these improvements, Microsoft has better tuned memory allocation and memory locking to reduce resource contention across processors.
A widely touted Win2K feature is Active Directory (AD), which lets you catalog all user, computer, and relevant network configuration data across an enterprise. Although AD isn't generally associated with scalability, it's indeed a scalability enhancement.
In NT 4.0 and earlier versions, domains could store only 30,000 to 40,000 objects (depending on what Microsoft materials you read). For large multinational organizations, the object limit didn't provide enough capacity and often required administrators to implement multidomain structures and complex trust relationships to handle the necessary objects.
AD removes the object limit; domains can now support more than 1 million objects. Compaq tested an AD structure that contains more than 16 million objects on a system with 2GB of RAM. Although the object database grew to more than 60GB of data, AD continued to service client requests in a timely fashion. An object database in the 60GB range shows how AD can scale to fit the demands of even the largest organizations.
Win2K also uses transitive trusts, which is a scalability enhancement. With NT 4.0 domains, even if domain A trusted domain B and domain B trusted domain C, domain A wouldn't trust domain C, unless you manually defined such a trust. Defining specific trust relationships isn't a major problem if you have only a handful of domains to manage, but it can be a major problem in a large organization with hundreds of geographically dispersed domains. Adding a new NT 4.0 domain in a large organization can require administrators to define hundreds of new trust relationships. Win2K solves this problem through transitive trusts (i.e., if domain A trusts domain B and domain B trusts domain C, then domain A trusts domain C) and lets a subdomain in the AD tree trust all the domains that its parent domain trusts. Transitive trusts simplify the job of adding new domains to a growing Win2K infrastructure.
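A short Python sketch shows both the difference in behavior and the bookkeeping cost. It is a model for illustration, not an AD or NT API; the function names are mine, and the count assumes that every domain needs an explicit one-way trust to every other domain when trusts aren't transitive.

```python
from collections import deque
from typing import Dict, Set


def full_mesh_one_way_trusts(domains: int) -> int:
    """One-way trusts needed so every domain explicitly trusts every other."""
    return domains * (domains - 1)


def trusts_directly(direct: Dict[str, Set[str]], truster: str, trusted: str) -> bool:
    """NT 4.0-style check: only a manually defined trust counts."""
    return trusted in direct.get(truster, set())


def trusts_transitively(direct: Dict[str, Set[str]], truster: str, trusted: str) -> bool:
    """Win2K-style check: follow chains of trusts with a breadth-first search."""
    seen = {truster}
    queue = deque([truster])
    while queue:
        current = queue.popleft()
        for nxt in direct.get(current, set()):
            if nxt == trusted:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


direct_trusts = {"A": {"B"}, "B": {"C"}}

print(trusts_directly(direct_trusts, "A", "C"))      # False
print(trusts_transitively(direct_trusts, "A", "C"))  # True
print(full_mesh_one_way_trusts(100))                 # 9900
print(full_mesh_one_way_trusts(101) - full_mesh_one_way_trusts(100))  # 200
```

Under that assumption, adding the 101st domain to a fully trusting NT 4.0 environment means defining 200 new one-way trusts, whereas with transitive trusts the new domain needs only the trust relationship with its parent.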
How Much Is Enough?
Depending on the type of scalability you're looking for in a server OS, Win2K Server might be the right product for you. If your business requires high availability, Win2K AS might be the right choice. Microsoft has designed Datacenter to meet the needs of the most demanding environments; the top end for a Datacenter server is a 32-way SMP clustered server with 64GB of RAM and an unlimited amount of online storage. But regardless of your specific needs, if you need more RAM, more processing power, or more online storage, Win2K's scalability features will simplify your job.