Microsoft’s David DeWitt on SQL Server Query Optimization

A look at the Jim Gray Systems Lab, Denali’s columnar index feature, and more

Have you considered how the features found in the CTP of the latest version of SQL Server, such as SQL Server “Denali,” came to be? Someone has to come up with the ideas, design the features, and test them until they reach a high level of quality and can be considered for inclusion in the next version of SQL Server. Some of this research is done at Microsoft’s Jim Gray Systems Lab in Madison, Wisconsin. SQL Server Magazine’s Senior Technical Director Michael Otey and SQL Server Content Manager Megan Keller spoke with David DeWitt, a Technical Fellow in Microsoft’s Data and Storage Platform Division, who leads the lab. We discussed his work at the lab, query optimization, and how he sees SQL Server progressing in upcoming years.

SQL Server Magazine: David, can you tell our readers a bit about your technical background?

David DeWitt: I received my PhD from Michigan in 1976, joined the faculty at the University of Wisconsin-Madison that fall, and was a faculty member for almost 32 years before I retired and joined Microsoft in the spring of 2008. Along the way I built three different parallel data warehouse systems, including the first one that actually ran queries in the early 1980s, a second one called Gamma in the mid-to-late 1980s, and a third one called Paradise in the 1990s that we sold to NCR. I’ve been on the board of a number of startups, most recently Vertica. I joined Microsoft in March 2008 and I am part of the SQL Server organization, reporting into Quentin Clark, who reports into Ted \\[Kummert\\] who reports into Bob \\[Muglia\\]. So, that’s sort of the high-level background.

When I joined Microsoft, I was not given a specific charter. I was just told to go set up a lab and do interesting work. So far I have hired seven full-time employees, plus an admin. I have a small lab in Madison, Wisconsin, named after Jim Gray, a former Microsoft Technical Fellow who was lost sailing off the coast of San Francisco in January 2007, that’s affiliated with the university. There are also eight graduate students and three faculty members from the university’s database group involved in the activities of the lab. The lab is a collaboration between the university and Microsoft. So far it has proven really interesting.

SQL Server Magazine: What kind of things do you and your team do at the lab?

DeWitt: We’re involved in a number of different projects. We worked on some features for version 1 of the parallel data warehousing product, SQL Server 2008 R2 PDW \\[Parallel Data Warehouse\\], including analytics and semi-joins, and we are currently doing technology explorations for future versions of PDW. Even though I’m part of the development organization, my mission in the lab is to really work on features that may or may not be part of future releases of SQL Server. We’ve done some work on buffer pool extensions to SQL Server using solid state devices and are exploring how novel memory technologies, such as phase-change memory, might affect the design of SQL Server down the road. We are also exploring a variety of parallel database technologies.

Our mission is similar to a typical research lab, but with a slightly shorter-term focus. The graduate students involved pretty much do whatever they want to do under the supervision of their advisors. They are not Microsoft employees. They’re exploring a variety of ideas including doing query optimization with power constraints, text retrieval algorithms for parallel database systems, and advanced designs for highly scalable transaction processing systems. So the graduate students are being typical graduate students. The staff is focused on issues more directly aligned with the mission of SQL Server and technologies that can be used to improve SQL Server.

SQL Server Magazine
It’s pretty interesting that you use graduate students in the lab.

DeWitt: It’s an unusual situation; totally unique for Microsoft, and probably anywhere, in that the students aren’t Microsoft employees as I said previously. So Microsoft gives a grant to the university just as the National Science Foundation would give a grant. The graduate students are free to work on whatever interests them with their advisors. There is an IP \\[intellectual property\\] agreement in place between Microsoft and the university such that both sides benefit. The students benefit because they have funding that is not constrained by what their advisor proposed in a grant application to the NSF or DARPA. The students also benefit because, if they want to (and not all do, but about half do), they can actually take advantage of their ability to access Microsoft source code and actually do their research projects inside SQL Server. So that gives them access to a real commercial database solution. Microsoft benefits because we do get tiered rights to the patents, we get to look at the graduate students early on, and get early exposure to the creative new ideas that the graduate students come up with. So we’re looking over their shoulders, seeing what they’re working on, and giving them guidance as they go.

Pretty much everybody in my lab has a PhD, and some have experience beyond the PhD, so we get the privilege to act as collaborators and advisors, and we get the opportunity of seeing where their minds are taking them. All the best ideas come from graduate students—well, I shouldn’t say that, but more so than their faculty advisors. It is an unusual situation—it’s been an operation going on three years now, and both sides (we had a review recently) are very happy with the arrangement. We think it’ll give us a little commercial advantage, obviously. It pays back because a lot of people from Wisconsin have gone on to work at Microsoft, but we see a business opportunity, and the graduate students really benefit by having this sort of unlimited source of funding without any constraints.

SQL Server Magazine
Do any of the graduate students in the lab go on to work for Microsoft?

DeWitt: I think some will. Four of my current employees were graduate students at \\[the University of\\] Wisconsin. I’d say some will become professors, some will work for Microsoft, and some will work for our competitors. There’s no quid pro quo—no guarantee that when they take one of these \\[graduate student\\] positions that they will become a Microsoft employee. We encourage them to do that with internships here and in Redmond, but we also encourage them to look around and make sure that Microsoft is the right place for them.

The SQL Server world is looking for people that are very strong developers, and some grad students aren’t great developers and it wouldn’t be appropriate for them to join the SQL world. They might be great researchers, and Microsoft Research or Google Research might be a better fit. We’re trying to find the best fit for the students. Five years from now we’ll probably have a better perspective on what percentage actually end up going to work for Microsoft.

SQL Server Magazine
It sounds like you’ve done a lot of work in parallelism. How has your research helped the Parallel Data Warehouse, and is there other research you’ve done in the lab that is influencing the current versions of SQL Server?

DeWitt: As I said earlier, in PDW version 1 we actually delivered two features. Semi-joins was one feature, which is a way of executing parallel joins that reduce the amount of data that must be redistributed when the tables being joined are not both partitioned on their joining attributes. We also delivered analytics for PDW—the whole window aggregate feature. Since I’ve built several parallel database systems, I do have a fair amount of experience in this area, and I have 20 years of ideas that have never made it into a product. You can be sure that I will try very hard to get some of my favorites into future versions of the product. But again, it’s like any process, and there are always more good ideas than cycles to actually build them into products. There are a bunch of interesting problems and technologies that I’d like to see broaden the product in years to come. We’re helping out on version 2 of PDW in a way I can’t really talk about right now, but my lab staff is actively engaged with the Microsoft team located in Aliso Viejo, California, that has the main responsibility for all future versions of PDW.

And we’re very interested in how changes in storage technologies will influence database systems down the road; things as simple as putting the log on a solid state disk or putting whole databases on solid state disks, which we’ve seen some customers do. It is very expensive storage, but it’s high performance storage, and there are query optimization implications when you do that because certain kinds of indices, like non-clustered indices, don’t always work very well on a standard moving head drive, but would work very well on a solid state drive. We’ve done work extending the SQL Server buffer pool to store pages ejected from the buffer pool to a solid state device, sort of extending the size of the buffer pool, and have observed really significant performance improvements that can be obtained with such an approach.

SQL Server Magazine
Your keynote speeches at the PASS Summit the past three years have been incredibly popular. The past two years you mentioned the columnar index feature. Can you provide more insight into that feature?

DeWitt: The idea of a columnar index goes back to the very early days of the relational database—late 70s, early 80s. People at IBM had the idea of storing tables as columns rather than tables as rows. The idea has kind of come and gone—there are many commercial products now that support this—and we’ve announced it as part of SQL Server Denali. The key idea is that tables have gotten increasingly wider and wider. For certain types of queries it doesn’t make sense to read all the columns of the table from disk into memory. So storing tables as columns instead of rows allows the query optimizer to formulate an execution plan that brings only the necessary columns in from disk.

So, say a row is 500 bytes wide, but only three or four columns are needed to execute a query. You might only need 30 bytes out of the 500 bytes that would constitute the normal row layout. So, database systems that offer a column storage layout (Microsoft calls them columnar indexes) allow the database system to read just the necessary columns. So you save a lot of I/O. The storage format is especially interesting for decision-support queries, but not so interesting for OLTP applications. The I/O savings can be a factor of 10, and typically customers will see somewhere between a factor of three and a factor of 50 improvement in performance for certain classes of queries. By sorting and compressing things appropriately, you can not only save reading unnecessary columns, but the database system can also compress data very aggressively. This technology can work really well for certain classes of decision-support queries.

We’ve announced \\[the columnar index feature\\] as part of SQL Server Denali, and we’ll continue to enhance its capabilities in future releases. It’s complex because sometimes there are both column and row representations of a table and the query optimizer must pick which is the most efficient one to use as well as ensuring that both copies get updated. So it brings challenges—I know my friends at Vertica are still working on their query optimizer. Query optimization is hard; query optimization for column stores is even harder. It’s an area that will be challenging us for years to come.

SQL Server Magazine
You mentioned that the data is compressed with the columnar index feature—are these the same data compression techniques that are being used in PowerPivot? And with the columnar index feature, what types of situations would customers use this feature in? Is it best for very wide tables?

DeWitt:: First of all, several of the compression techniques are very similar to what’s used in PowerPivot, which are themselves similar to techniques used in other column store implementations, such as run-length encoding, dictionary encoding, and a whole variety of compression techniques depending on what the data looks like. You can use bit-level encoding where the database system uses the minimum number of bits for each value in each column. Everybody’s compression techniques, if you squint at them, are about the same because they’re well-developed techniques used by people doing compression even outside the database area. Most are not database-specific compression techniques.

What’s a customer to do? I think this is an example of the need for tools like SQL Server’s Index Tuning Wizard, which allows people to get advice from the database system about what indices should be built based on the workload. I think there is an opportunity to build a similar kind of wizard for columnar indices. It will be a challenge for the DBA to decide whether a columnar index would be beneficial and what compression techniques should be used. If it’s a very wide table and most of the workload touches only a few of the columns, it’s a no-brainer. But as the number of columns that are typically accessed by the majority of queries starts to approach 30 to 50 percent of the overall number of columns in the table, it’s probably not worthwhile. There is some cost—you get big I/O savings by only reading the column that the query needs, but you pay a CPU price because eventually you have to formulate rows out of these columns to return to the user. So this technique mainly affects selection operators at the bottom of the query tree, and at some point the database has to glue these columns back together to form rows. That can be a costly operation if it’s not done efficiently. There’s a bunch of research out there about how to do it efficiently, but it’s still not always going to be the right technology for every customer to use. And I think we need better tools to help a customer understand when it is and when it’s not going to be a useful feature.

SQL Server Magazine
What are some of the main factors that influence query optimization?

DeWitt: There are a bunch of different things. One of the hardest things to get right is what we call estimating the selectivity factor of a predicate. Anytime a query applies a predicate to a table, the optimizer must estimate the number of rows from that table that will satisfy the predicate. This estimate is important because it can influence whether the database system should scan the table or whether it should use a clustered or non-clustered index. If it’s a very highly selective predicate on a column that has a non-clustered index, it can be useful to use that index. However, if the optimizer misestimates the selectivity factor of the predicate, it could be a really terrible plan. So selectivity estimation is problem number one. And, as I talked about previously, we use a variety of techniques (especially histograms) to summarize the data so that the query optimizer can do a good job of estimating the selectivity factor of each predicate.

The selectivity factor predicate not only affects what technique will be used to access the data on disk, it also impacts the order in which the operators in the query get executed and which algorithm is used for each operator. So if a selection is followed by a join, the query optimizer must pick an algorithm for doing the join—whether a hash join, index nested loops join, or sort merge join. Every SQL operator has a set of algorithms that can be used to execute the operator, and each algorithm has a cost model. The selectivity factor affects the estimate of the sizes of the inputs to the operator, how many rows the operator will produce, and the cost of the operator in terms of both CPU and I/O time to execute operator. Estimating the CPU time and number of I/O operations is also hard because the optimizer generally does not have an accurate model of the hardware that will be used to run the query or what other queries will be running concurrently.

So those are a couple of the challenges, and when you run queries in parallel on a system like PDW, it gets even harder because you have to deal with estimating the cost of redistributing data between the nodes of the appliance for certain kinds of queries.

SQL Server Magazine
How do you see SQL Server advancing as a data platform in the future?

DeWitt: With the PDW product line, I think we’re going to do a really good job of scaling out. We have a great single box product. In fact, I don’t think anybody’s single box product is any better than ours, and I think we have the start of a great offering in the parallel space. I think over the years to come we will be increasingly credible on massive, petabyte-scale databases. That’s certainly my intention and one of my goals as a Microsoft employee.
I also think SQL Azure is a truly exciting product. I manage a little database for a local swimming club, and I moved it into the cloud last week. I was truly impressed with how easy it was to take the application, move the data to the cloud, and get the application running again. I think it’s an amazing paradigm shift, and for thousands of small businesses that don’t want to maintain their own servers and deal with all the headaches associated with doing so, the move to the cloud is going to be truly compelling. So I think we’re going to have a great story to tell, and I don’t want to get into trouble here, but I don’t think our competitors are going to be able to touch us.

Oracle obviously has a great product offering; they have some really smart people and have executed amazingly well. But Microsoft has presence from Bing, Windows Live, and Hotmail—we’re used to running data centers with tens of thousands of computers all over the world, and that’s going to give us a leg up on everybody else.

Amazon doesn’t own a database product—yes, they know how to host web services. Google obviously is great at operating data centers at massive scale, but they don’t have a real database product. Oracle has no real cloud expertise at all, so I think the SQL Azure product line is going to be a huge success down the road for thousands and thousands of businesses. I think as we’re learning from PDW how to operate queries at scale, we will undoubtedly think about taking those technologies and moving them into the cloud, too. With the solid product offering we have today, the expertise we have in managing web data centers, I see great opportunities for SQL Azure going forward.

SQL Server Magazine
So what do you see as the future for the Jim Gray Systems Lab?

DeWitt: I think the future for the lab is to double or triple in size over the next couple years—I don’t think we’ll ever grow to 50 employees, at least not while I’m around, but I think we’ll double from 8 to 16 advanced developers/researchers. I think the relationship with the graduate students and the energy they bring to the lab is really super. Almost every company has summer interns, but these are summer interns that in some ways never go away. No, we don’t get to tell them what to do, but we get to interact with them on a week in, week out, month in, month out, year in, year out basis. Half our lab is full-time employees currently and half is grad students. So, every two to three years, these people go away and new people show up because they graduate, get their degrees, and the new people bring new ideas and different perspectives on hard problems.

SQL Server Magazine
Well I can certainly see how the lab benefits both Microsoft and the students and resulted in some really cool products.

DeWitt: I’ll be surprised if other companies don’t look at the success \\[of the lab\\] and say, in other situations where there’s a close proximity, one could repeat this model. Both Microsoft and the University of Wisconsin are very protective of their intellectual property rights, and the fact that they could work out an agreement acceptable to both indicates to me that almost any \\[organization\\] could work it out.

Hide comments


  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.