Watch out, Arnold! T3 is here with a vengeance, but this T3 is much more than a sissy, metal-morphing android from the future. Last week, Microsoft laid claim to yet another world record in database technology when the company published benchmark results for the world's largest Multidimensional OLAP- (MOLAP-) based data warehouse. The project that set the record is called T3 (short for TerraByte cube). T3's sheer magnitude is an amazing leap forward for the OLAP world and serves as powerful proof that Microsoft-based data-management platforms can scale to meet enterprise needs.
According to WinterCorp's public audit of the benchmark results, Microsoft based T3 on 40GB of production-quality data obtained from a commercial data provider. The data covered 716,252 products from 133,003 brands in 71 markets during a 5-year period. The data provider supplied instructions for programmatically expanding the initial 40GB data load to 1.2TB of realistic business data spread across 7.7 billion rows. Microsoft scrubbed the data to protect privacy, then used SQL Server 2000 Analysis Services to create a 471GB MOLAP cube.
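To put those sizes in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the figures quoted above; the decimal unit convention (1TB = 1,000GB) and the function names are my own assumptions, not part of the audit.

```python
# Figures quoted in the article. Decimal units are an assumption:
# 1 TB = 1,000 GB and 1 GB = 1,000,000,000 bytes.
SEED_GB = 40               # initial production-quality data load
EXPANDED_GB = 1200         # 1.2TB of expanded business data
CUBE_GB = 471              # size of the resulting MOLAP cube
ROWS = 7_700_000_000       # rows in the expanded data set
BYTES_PER_GB = 1_000_000_000


def expansion_factor(seed_gb, expanded_gb):
    """How many times larger the expanded data is than the seed data."""
    return expanded_gb / seed_gb


def cube_to_raw_ratio(cube_gb, raw_gb):
    """Cube size as a fraction of the raw data size."""
    return cube_gb / raw_gb


def average_bytes_per_row(raw_gb, rows):
    """Average raw bytes per row under the decimal-unit assumption."""
    return raw_gb * BYTES_PER_GB / rows


if __name__ == "__main__":
    print(f"Expansion factor: {expansion_factor(SEED_GB, EXPANDED_GB):.0f}x")
    print(f"Cube / raw data:  {cube_to_raw_ratio(CUBE_GB, EXPANDED_GB):.1%}")
    print(f"Bytes per row:    {average_bytes_per_row(EXPANDED_GB, ROWS):.0f}")
```

Under these assumptions, the data was expanded thirtyfold, and the finished cube, aggregations and indexes included, takes up well under half the space of the raw data that feeds it.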
T3 pushes the envelope of OLAP scalability when it comes to MOLAP cube size, but the project's processing and query performance are equally impressive. According to Microsoft, processing the 7.7 billion rows took 53 hours and included retrieving the rows from the data warehouse server, creating aggregations and indexes, and populating the MOLAP cube. That processing equates to roughly 145 million rows per hour or 40,000 rows per second. Microsoft designed the query test to mimic realistic use by 50 concurrent users: Each user submitted 27 queries with an average wait time between queries of 30 seconds. No two queries were the same, for a total of 1,350 distinct queries per test run. The median query response time was only 0.2 seconds for warm cache queries and 0.8 seconds for cold cache queries. Think about that: 7.7 billion rows and 1.2TB of raw data queried on a cold cache with a median response time of 0.8 seconds! The word impressive doesn't do T3 justice.
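If you want to check the throughput and workload arithmetic yourself, the following short Python sketch reproduces it from the figures quoted above. The constant and function names are hypothetical, chosen only for illustration.

```python
# Figures quoted in the article.
ROWS = 7_700_000_000       # rows processed into the cube
PROCESSING_HOURS = 53      # total processing time
USERS = 50                 # concurrent simulated users
QUERIES_PER_USER = 27      # distinct queries submitted by each user


def rows_per_hour(rows, hours):
    """Average processing throughput in rows per hour."""
    return rows / hours


def rows_per_second(rows, hours):
    """Average processing throughput in rows per second."""
    return rows / (hours * 3600)


def total_queries(users, queries_per_user):
    """Distinct queries in one test run, given that no query repeats."""
    return users * queries_per_user


if __name__ == "__main__":
    print(f"{rows_per_hour(ROWS, PROCESSING_HOURS):,.0f} rows per hour")
    print(f"{rows_per_second(ROWS, PROCESSING_HOURS):,.0f} rows per second")
    print(f"{total_queries(USERS, QUERIES_PER_USER):,} queries per test run")
```

The results land slightly above the rounded figures of 145 million rows per hour and 40,000 rows per second, and the query count comes out to exactly 1,350.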
Unquestionably, T3 raises the bar several notches in the world of data warehousing. But T3 intrigues me for another important reason. To the best of my knowledge, Microsoft has released the first public database benchmark that runs on a 32-CPU Windows 2000 Datacenter Server. Microsoft ran the test on a Unisys ES7000 with 32 PIII Xeon 700MHz processors.
If you have experience with OLAP solutions, you probably find these numbers impressive. However, the numbers are important even if you aren't currently working with OLAP. First and foremost, this project is another example of Microsoft's very real ability to compete with true enterprise-class data systems. Terabyte cubes might not sound that large when compared to pure relational systems, but this is the first public audit of a 1TB MOLAP system that I'm aware of. Second, you might not be working with OLAP today, but there's a good chance you will be sometime in the very near future. Fundamentally, OLAP is designed to help you make better decisions. And you do want to make better decisions, don't you? You can read the full T3 technical report, audit, and supporting information on the T3 home page.