IBM's TPC-H Benchmark on Linux—What Does It Really Mean?

Last week, I discussed IBM's acquisition of Informix and its declared intention to come out on top in the database wars. This week, I explore the same issue from a different angle. On May 15, IBM DB2 (interestingly, running on Linux) posted a new world record for the Transaction Processing Performance Council's (TPC's) 100GB TPC-H benchmark, beating Microsoft's old world record. First the facts, then some thoughts.

  • TPC-H is a decision-support benchmark, unlike TPC-C, which is an online transaction processing (OLTP) benchmark. You can find more information about the TPC benchmarks on the TPC's Web site.
  • IBM's new world record is 2733 queries per hour (QphH), at a price/performance of $347/QphH.
  • Microsoft's old world record was 1699 QphH at $161/QphH.
  • Microsoft achieved its score by using eight PIII 700MHz Intel processors on a single server; IBM used 16 PIII 700MHz processors spread across a four-node cluster.
  • IBM's score was about 60 percent faster than Microsoft's but was more than twice as expensive on a per-query basis and roughly 20 percent less efficient on a per-CPU basis (put the other way, Microsoft got about 24 percent more throughput out of each processor).
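
For readers who want to check the arithmetic behind that last comparison, here is a minimal sketch that uses only the figures quoted above. The dictionary keys and the `compare` helper are illustrative names of my own, not anything defined by the TPC, and per-CPU throughput is simply QphH divided by processor count, which is a rough yardstick rather than an official metric.

```python
# Figures as quoted in the list above; key names are illustrative only.
ibm = {"qphh": 2733, "usd_per_qphh": 347, "cpus": 16}
microsoft = {"qphh": 1699, "usd_per_qphh": 161, "cpus": 8}


def compare(new, old):
    """Compare two published results on throughput, price, and per-CPU output."""
    per_cpu_new = new["qphh"] / new["cpus"]
    per_cpu_old = old["qphh"] / old["cpus"]
    return {
        # How much higher the new throughput is, in percent.
        "throughput_gain_pct": (new["qphh"] / old["qphh"] - 1) * 100,
        # How many times more each unit of throughput costs.
        "price_ratio": new["usd_per_qphh"] / old["usd_per_qphh"],
        # Rough per-processor throughput (QphH divided by CPU count).
        "per_cpu_new": per_cpu_new,
        "per_cpu_old": per_cpu_old,
        # How far the new result falls short per CPU, in percent.
        "per_cpu_deficit_pct": (1 - per_cpu_new / per_cpu_old) * 100,
        # The same gap seen from the old result's side, in percent.
        "per_cpu_old_advantage_pct": (per_cpu_old / per_cpu_new - 1) * 100,
    }


result = compare(ibm, microsoft)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

Run as written, this works out to a throughput gain of about 61 percent and a price ratio of about 2.2. The per-CPU gap depends on which direction you measure it: IBM delivers about 20 percent less per processor than Microsoft, or equivalently, Microsoft delivers about 24 percent more per processor than IBM.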

I could turn this column into a classic benchmarking discussion, weighing the pros and cons of clustered versus nonclustered approaches. But that would be too easy. I could focus on the fact that Microsoft hasn't posted a new TPC-H score in almost a year and could probably top the new IBM score with little effort by using the same distributed partitioned views scale-out technology that gave Microsoft the world record in the TPC-C battle. That's also too easy. I could point out that Oracle has never posted a TPC-H 100GB score, presumably because its price or performance simply isn't competitive. But again, that would be too easy. Instead, I'd like to point out that this is the first time anyone has posted any kind of TPC score on the Linux platform. Is that important? Absolutely! TPC benchmarks are important from a technology perspective. But in many ways, the TPC game is nothing more than a carefully crafted game of cat and mouse.

Engineers don't publish TPC scores; marketing folks do. I'm sure IBM could have posted higher and less expensive numbers using a mainstream UNIX or Windows system. After all, DB2 achieved its fastest TPC-C score on Windows 2000. So why did IBM use Linux? "We are interested in helping bring Linux into the mainstream," said Berni Schiefer, IBM's distinguished engineer and manager of performance and advanced technology for the Data Management Solutions Group. "There has been no TPC benchmark published on Linux to date. No TPC-C, TPC-H, TPC-R, or TPC-W."

Why does IBM care about bringing Linux into the mainstream? I think it's because Microsoft is afraid of Linux, although Microsoft will never admit it. I don't believe open-source OSs will ever become enterprise-class commodities unless a) you live in a socialist state that strips intellectual property rights from computer companies, or b) you live in a fantasy world where computer companies voluntarily give up their intellectual property rights and the lucrative revenue streams that go along with them. But I'm the first to admit that many people disagree with me and that many people who hate Microsoft love Linux and the open-source movement in general. Would some database customers show a preference for a vendor, such as IBM, that's "interested in helping bring Linux into the mainstream"—even if those same customers have no intention of running their mission-critical applications on Linux? I bet some marketing folks think so.

If IBM is serious about winning the database wars, the company is smart to market to those people. What do you think?
