Over the past 2 weeks, I've been speaking on the Microsoft TechEd tour in Australia, New Zealand, and Asia. One of the best parts of traveling around the world with Microsoft's technical leaders is learning about the way Microsoft runs its business from the managers who actually run it. I was privileged to hang out with Casey Jacobs, group operations manager of Microsoft.com. Casey and I discussed how Microsoft.com runs, and I have to admit that the operation's technical efficiency blew me away.
On the day Casey and I met, I happened to be working on a presentation about Windows SharePoint Services beta 2. When Casey saw that I was working on this presentation, he mentioned that Microsoft.com had just started using SharePoint beta 2 for searches and explained that when Microsoft.com begins to use a new Microsoft product, its systems engineers and DBAs work directly with the product's development team. Those engineers and administrators relay information about the new product's performance, functionality, and operational bugs to the product team. They also help the product team by running proposed bug fixes. Such partnerships between operations and product teams have helped Microsoft raise the quality of Windows, SQL Server, IIS, Microsoft Media Server (MMS), and now SharePoint products.
I'm a bit sad yet relieved about Microsoft.com's use of the Windows SharePoint Services search. I'm sad because Microsoft.com had been using the Site Server 3.0 Commerce Edition search for many years, and I had worked on that product team. However, I'm relieved because Microsoft.com was able to come up with a search replacement pronto when Site Server 3.0 was officially mothballed.
Here's how the SharePoint search works. When a user performs a search from one of the many search UIs on Microsoft.com, the search executes at Search.Microsoft.com, which consists of a set of Web servers that run various software, such as Windows SharePoint Services, an ASP.NET application that provides the search UI, and XML Web services. After the user enters and submits a search term, the ASP.NET UI application passes the search term to XML Web services, which performs several operations, such as compiling customized lists of Best Bets and Related Links, performing a spell check, and creating a "Did you mean?" helpful hint. XML Web services then takes all potentially relevant terms, or "nTerms," from the original string input and passes them to Windows SharePoint Services, which performs a database lookup. The ASP.NET UI application uses ASP.NET caching liberally for both the search UI and XML Web services layers.
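To make that flow a little more concrete, here's a minimal sketch in Python of the three layers as I understood them from Casey's description. Everything in it is a stand-in that I made up for illustration: the function names (`term_services`, `catalog_lookup`, `search`), the tiny dictionaries, the toy spell check built on `difflib`, and the use of `functools.lru_cache` in place of ASP.NET caching. I'm also assuming that the nTerms are simply the distinct words of the original input. It isn't Microsoft.com's code; it only mirrors the order of the steps.

```python
from difflib import get_close_matches
from functools import lru_cache
from typing import NamedTuple, Optional, Tuple

# Hypothetical stand-in data; none of it comes from Microsoft.com.
BEST_BETS = {"sharepoint": ("SharePoint product page",)}
KNOWN_WORDS = ("search", "services", "sharepoint", "windows")
CATALOG = {
    "sharepoint": ("doc-1", "doc-2"),
    "services": ("doc-2", "doc-3"),
}


class SearchPage(NamedTuple):
    best_bets: Tuple[str, ...]
    did_you_mean: Optional[str]
    results: Tuple[str, ...]


def term_services(query):
    """Stand-in for the XML Web services layer: Best Bets, spell check, nTerms."""
    words = query.lower().split()
    best_bets = tuple(bet for w in words for bet in BEST_BETS.get(w, ()))
    # Toy spell check: swap each word for its closest known word, if there is one.
    corrected = [(get_close_matches(w, KNOWN_WORDS, n=1) or [w])[0] for w in words]
    did_you_mean = " ".join(corrected) if corrected != words else None
    # Assumption: the nTerms are the distinct words of the original input.
    n_terms = tuple(dict.fromkeys(words))
    return best_bets, did_you_mean, n_terms


def catalog_lookup(n_terms):
    """Stand-in for the Windows SharePoint Services database lookup."""
    hits = []
    for term in n_terms:
        for doc in CATALOG.get(term, ()):
            if doc not in hits:
                hits.append(doc)
    return tuple(hits)


@lru_cache(maxsize=1024)  # stand-in for the ASP.NET cache in front of the UI
def search(query):
    """Stand-in for the ASP.NET UI application that ties the layers together."""
    best_bets, did_you_mean, n_terms = term_services(query)
    return SearchPage(best_bets, did_you_mean, catalog_lookup(n_terms))
```

Calling `search("SharePoint services")` walks the term-services layer first and the catalog lookup second; calling it again with the same string is answered from the cache without touching either layer, which is the point of caching so liberally at the front.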
The scalability and performance that Microsoft.com is achieving are impressive. The Search.Microsoft.com ASP.NET UI application handles about 1.5 million searches daily. The ASP.NET UI application takes the search term and runs searches eight ways (sort of like a multi-threaded search), which means that Windows SharePoint Services has to resolve about 12 million search components daily.
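The arithmetic behind those figures is simple enough to check. Here's a small Python sketch that multiplies the daily searches by the eight-way fan-out and then spreads the totals evenly across a 24-hour day. The constant and function names are mine, and the per-second numbers are flat averages that I derived; Casey didn't give me peak figures, so real load at busy hours will be higher.

```python
SEARCHES_PER_DAY = 1_500_000   # from the figure quoted above
COMPONENTS_PER_SEARCH = 8      # each search is run eight ways
SECONDS_PER_DAY = 24 * 60 * 60


def daily_components(searches=SEARCHES_PER_DAY, fan_out=COMPONENTS_PER_SEARCH):
    """Search components that the back end must resolve per day."""
    return searches * fan_out


def average_per_second(daily_total):
    """Flat daily average; ignores peaks, which the article doesn't report."""
    return daily_total / SECONDS_PER_DAY


if __name__ == "__main__":
    components = daily_components()
    print(f"components per day: {components:,}")
    print(f"searches per second (avg): {average_per_second(SEARCHES_PER_DAY):.1f}")
    print(f"components per second (avg): {average_per_second(components):.1f}")
```

On a flat average, that works out to roughly 17 searches and roughly 139 search components every second.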
Microsoft.com has 16 unique SharePoint catalogs, the largest of which is 650MB. Windows SharePoint Services crawls through 100GB of Web content from numerous Microsoft.com sites. A full crawl through all 16 catalogs takes about 10 hours. A full propagation of all 16 catalogs takes 45 minutes. But despite the size and number of catalogs, the average SharePoint query time on Microsoft.com is an impressive 1.5 seconds.
The physical hardware infrastructure behind Microsoft.com's SharePoint search isn't as daunting as you might expect for one of the Internet's most searched sites. Microsoft.com has a Web farm of 34 servers. Here's a breakdown of those servers:
- Ten load-balanced servers (Compaq ProLiant DL380 G3 servers with 4GB of memory and two processors) house Search.Microsoft.com and the ASP.NET UI application.
- Eight servers (ProLiant DL380 G3 servers with 4GB of memory and two processors) host the XML Web services' editorialized term relevance function.
- Ten servers (ProLiant ML570 G2 servers with 2GB of memory and four processors) host Windows SharePoint Services.
- Two servers (ProLiant ML570 G2 servers with 4GB of memory and four processors) host the SQL Server 2000 databases that Windows SharePoint Services uses.
- Four servers (ProLiant ML570 G2 servers with 4GB of memory and four processors) host the SharePoint Build/Index Services.
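As a quick sanity check on that breakdown, here's a short Python sketch that records each tier and adds up the farm. The `Tier` structure, the role labels, and the `totals` function are my own shorthand; the counts, memory sizes, and processor counts are the ones listed above.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    role: str
    servers: int
    memory_gb: int    # memory per server
    processors: int   # processors per server


FARM = [
    Tier("Search UI (ASP.NET)", 10, 4, 2),
    Tier("XML Web services", 8, 4, 2),
    Tier("Windows SharePoint Services", 10, 2, 4),
    Tier("SQL Server 2000", 2, 4, 4),
    Tier("SharePoint Build/Index Services", 4, 4, 4),
]


def totals(farm):
    """Return (servers, total memory in GB, total processors) for the farm."""
    servers = sum(t.servers for t in farm)
    memory_gb = sum(t.servers * t.memory_gb for t in farm)
    processors = sum(t.servers * t.processors for t in farm)
    return servers, memory_gb, processors


if __name__ == "__main__":
    servers, memory_gb, processors = totals(FARM)
    print(f"{servers} servers, {memory_gb}GB of memory, {processors} processors")
```

The five tiers add up to the 34 servers Casey quoted, with 116GB of memory and 100 processors across the whole farm.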
Microsoft houses XML Web services on a separate set of servers from the ASP.NET UI application and Windows SharePoint Services because other Web sites also leverage XML Web services. This type of service-oriented architecture is Microsoft's promise of the future of distributed computing.
Getting a peek under the hood at the way Microsoft.com uses its Windows SharePoint Services search was quite interesting. Casey had more to say about the way Microsoft.com runs, which I'll share in future commentaries.