Federated Databases and Flying Pink Elephants - 02 Mar 2000

Last week, I pronounced SQL Server the "fastest database in the known universe," at least according to the industry-standard TPC-C benchmark. In case you haven't heard, SQL Server 2000 posted a TPC-C score of 227,079 transactions per minute (tpmC), topping the next-fastest non-Microsoft score (running on Oracle) of 135,815. (See last week's column for details, or visit http://www.tpc.org for information about the TPC-C test.)

SQL Server 2000 achieved its impressive scores using a new feature called distributed partitioned views. This feature let Microsoft physically distribute slices of a single table across multiple servers but logically access the table as an integrated whole. Microsoft calls this configuration a "federated database," a precursor to the 100 percent shared-nothing cluster architecture planned for Yukon (the version after SQL Server 2000). Microsoft spread the federated database across 12 Compaq servers, each with 8 CPUs, for a combined total of 96 CPUs. The best Oracle score came on a single SMP-based IBM machine with 24 CPUs.
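The idea of slicing one table and reading it back as a whole can be sketched in a few lines. This is a single-process analogy using Python's built-in sqlite3 module, not SQL Server syntax: the table names, key ranges, and the insert_customer helper are all hypothetical, and both "member" tables live in one in-memory database rather than on separate servers.

```python
import sqlite3

# Hypothetical schema: customers split by id range into two member tables.
# In a real federated setup each member table would sit on a different
# server; here both live in one in-memory SQLite database for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers_1 (
        id   INTEGER PRIMARY KEY CHECK (id BETWEEN 1 AND 1000),
        name TEXT NOT NULL
    );
    CREATE TABLE customers_2 (
        id   INTEGER PRIMARY KEY CHECK (id BETWEEN 1001 AND 2000),
        name TEXT NOT NULL
    );
    -- The view presents the two slices as one logical table.
    CREATE VIEW customers AS
        SELECT id, name FROM customers_1
        UNION ALL
        SELECT id, name FROM customers_2;
""")

def insert_customer(conn, cust_id, name):
    """Route a row to the member table that owns its key range."""
    table = "customers_1" if cust_id <= 1000 else "customers_2"
    conn.execute(f"INSERT INTO {table} (id, name) VALUES (?, ?)", (cust_id, name))

insert_customer(conn, 42, "Ada")
insert_customer(conn, 1500, "Grace")

# Readers query the view and never name a member table.
rows = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
print(rows)
```

The CHECK constraints record which slice owns which key range, and the routing helper stands in for whatever mechanism the database uses to send each row to its owning server.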

Is it fair to compare a 96-CPU system to a 24-CPU system? Certainly. We're talking about database benchmarks, so performance and price are the primary concerns. And the 96-CPU SQL Server system was 67 percent faster and 60 percent cheaper than the 24-CPU non-Microsoft solution. Of course, any sane person would rather administer a single SMP machine than a federated system of 12 nodes, and Microsoft still has to prove that its federated database technology will be practical to implement.
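The "67 percent faster" figure follows directly from the two published scores, and the same numbers give a rough per-CPU view of the comparison. A quick check, using only the scores and CPU counts quoted above (the variable names are mine):

```python
# Published TPC-C scores and CPU counts quoted in this column.
sql_server_tpmc = 227_079   # 12 servers x 8 CPUs
sql_server_cpus = 12 * 8
oracle_tpmc = 135_815       # one SMP machine
oracle_cpus = 24

# How much faster the federated system was, as a percentage.
speedup_pct = (sql_server_tpmc / oracle_tpmc - 1) * 100

# Throughput divided evenly across CPUs (a crude measure, since the
# CPUs in the two systems are not the same hardware).
sql_server_per_cpu = sql_server_tpmc / sql_server_cpus
oracle_per_cpu = oracle_tpmc / oracle_cpus

print(f"{speedup_pct:.0f} percent faster")
print(f"{sql_server_per_cpu:.0f} vs. {oracle_per_cpu:.0f} tpmC per CPU")
```

Per CPU, the single SMP machine comes out ahead, which is why the benchmark argument rests on total throughput and price rather than on efficiency per processor.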

Consider a reader's response to last week's column: "I see too many benchmarks from Oracle, IBM, Microsoft, Informix, etc. Everyone tells me to buy their product because the latest TPC benchmark brings them to the top, but I've learned to be skeptical. It's the real world, man. Don't bombard us with these fantastic benchmarks that cost millions of dollars to build when most of us are supporting $10,000 servers with one or two processors."

Although we might run systems with more than two CPUs, most of us will never have a business problem that requires the processing power offered by these TPC-special behemoths. And most of us will never need to scale a single application beyond the limits of today's 8-way Intel-based systems. Microsoft's recent TPC-C scores on a single 8-CPU SMP node hover in the mid-40,000 tpmC range, which is more than 12 million OLTP transactions per day. Last year, the NYSE executed about 1 million transactions a day, while Visa, Citicorp, Bank of America, and Wal-Mart ranged from 10 million to 40 million transactions a day. In other words, you can do a heck of a lot of work on a single SQL Server 8-CPU SMP node, and we'll soon have 16- and 36-CPU SMP nodes to work with.
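To put the per-minute score in daily terms, here is the conversion as a rough sketch. The 45,000 figure is my stand-in for "mid-40,000" and is an assumption, as is the idea of running at the benchmark rate around the clock.

```python
MINUTES_PER_DAY = 24 * 60

# Assumption: 45,000 tpmC as a stand-in for "mid-40,000".
tpmc = 45_000

# Transactions in a full day of sustained load at the benchmark rate.
per_day = tpmc * MINUTES_PER_DAY

# Hours of sustained load needed to reach 12 million transactions.
hours_to_12m = 12_000_000 / tpmc / 60

print(f"{per_day:,} transactions per day at a sustained {tpmc:,} per minute")
print(f"{hours_to_12m:.2f} hours to reach 12 million")
```

Under those assumptions, 12 million transactions takes less than five hours of sustained load, so the "more than 12 million" figure is a conservative one.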

If few people will need to scale beyond a single-node SMP system, why are distributed partitioned views so important to SQL Server's success as an enterprise-class database? The answer has to do with flying pink elephants.

Many of my conversations with potential customers over the years have gone something like this: "It's clear that the shipping version of SQL Server can absolutely meet all of my current and foreseeable performance needs. But what if flying pink elephants attack from outer space, and I suddenly need LOTS more processing power than I could possibly imagine under any conceivable business scenario? I know Oracle costs a lot more than SQL Server, but those pink elephants can be scary. I'd better use Oracle because I know it can scale to meet my needs."

Many organizations (small and large) will justify SQL Server deployment based on the mere existence of federated database technology, which will boost the overall adoption of SQL Server in the enterprise. Although we may never need to build a federated database, it's nice to know we can scale that high if flying pink elephants from outer space ever attack.
