Distinguished engineer Peter Spiro, leader of Microsoft's core SQL Server engine team and an 8-year veteran of the database group, talks with SQL Server MVP Brian Moran about stomping bugs, expanding programmability, adding function, and simplifying administration on the way to a better, faster, stronger data-management platform.
Brian Moran: The next version of SQL Server—code-named Yukon—and other Microsoft products have been delayed because of comprehensive debugging and security-checking activities that froze development companywide for several months earlier this year. What did you learn from this exercise that will help you ensure the quality and security of the Yukon release and any service packs, SQLXML releases, and so on that the company rolls out in the meantime?
Peter Spiro: The full SQL Server team—developers, testers, program managers, and management—spent 3 months working on security, and only security. We identified different pieces of code and how vulnerable they might be, and we went line by line through the entire code base with teams of developers. Even though we were looking for security, we also found places where the code wasn't as efficient as it could be, little nits here and there. So we did a lot of code cleanup beyond the security check, improving the quality as well as the security of the product.
In the process, we also developed some interesting tools, which we plan to reuse in the development of future versions, that automate the checking of buffer overruns and other types of coding errors that might create vulnerabilities. Security vulnerabilities often occur because different people are working on different sections of the code and have different assumptions about what's being checked or passed in. Some of our new tools validate those assumptions from the top of the stack to the bottom, automating more internal checks. As we improved the security that was already in place, we also identified some security features that need to be added in different parts of the product. We went beyond cleanup to a whole new mind-set about security and quality.
BM: Liberty, the 64-bit version of SQL Server that's in beta testing right now and that's due out this fall, has been generating a lot of buzz. Some people think a 64-bit database is automagically going to solve all their performance problems, but that isn't the case for many customers. What customers are going to see the most benefit from 64-bit?
PS: You're exactly right, Liberty isn't a panacea for all performance problems. Generally, applications that benefit from more memory will do better on the 64-bit machines. We did some interesting things in SQL Server that are going to make it perform nicely on 64-bit machines. Most databases' internal components use a lot of memory. Generally, they use the memory in different ways. So, for example, a procedure cache might use memory to roll back transactions. If you're doing hash joins, you need memory. The log manager needs memory, the buffer pool needs memory, and so on. Most subsystems have their own algorithms and data structures that use memory and, in fact, compete with each other because they've evolved over time. So, one thing we did in SQL Server 7.0 was put a lot of thought into integrated memory management and usage. Rather than have competing subsystems, we built a system where all the different subsystems that use memory cooperate with each other. We have one mechanism in the buffer pool that controls all the memory. We have different policies for different memory spaces and hand out memory on the fly.
We put a lot of work into this integrated memory model, and when you go to large amounts of memory, this architecture will pay off even more. The buffer pool's going to get a huge amount of memory with 64-bit, but we don't have to change data structures. We just need a few query processor algorithms or procedure cache algorithms because the subsystems have already hooked in and integrated with each other. Beyond just getting a bigger buffer pool with 64-bit, which is what a large system will give you, applications will get a kind of multiplicative performance win by having all the different subsystems using this larger memory space without having to change their algorithms.
BM: So this subsystem work has enabled you to make SQL Server more self-tuning—capable of monitoring current system behavior and of responding appropriately?
PS: Absolutely. In fact, as we designed SQL Server 7.0, we drew on 20 years of research and knowledge. And when we finally re-implemented a lot of the core server internals, we took best-of-breed thought from previous products and previous research—particularly about integrated memory management—to make SQL Server 7.0 extremely easy to use and self-managing. This architecture gives SQL Server cleaner internals for using memory, and lets you monitor different behaviors without having to tweak a bunch of knobs to manage the subsystems' different memory branches.
Also remember that as we built SQL Server 7.0, we knew 64-bit was coming. So we designed a lot of the data systems, data structures, and algorithms—for example, the buffer pool and the hashing tables—with 64-bit in mind. To prepare most products to run on 64-bit hardware, you'd normally need lots of people sifting through the entire product, changing data structures and algorithms. We assigned just a couple of people to prepare for 64-bit because we did all the work up front to make sure both the data structures and the algorithms were able to handle large amounts of memory. What we've had to concentrate on as we use more CPUs is eliminating the bottlenecks. If you have plenty of CPUs and memory, you have plenty of processing power. You just have to watch for internal synchronization points.
BM: Although SQL Server does a great job of responding to the environment and tuning itself, a lot of professional DBAs still think SQL Server is unsophisticated and less tunable than other database management systems (DBMSs) because it doesn't have all those knobs to tweak. What's your response to such criticisms?
PS: I've been working with databases for almost 20 years, and as we built these engines in the mid-'80s, we added functionality and added more knobs so that we could tune that functionality for more efficient performance. In the process, we developed products that could solve any problem, but only half a dozen people in the world could tune the system so that it operated correctly and efficiently. When I came to Microsoft, we made a big push early on to rebuild algorithms and data structures to make the system more self-tuning. If we needed to build a log manager, we asked how much memory we needed to enter transactions and what we needed to do as we grouped log records to write them to the log. We knew all the steps that we skipped in previous systems, and instead of saying, "We'll set this parameter, and we'll set that parameter," we incorporated the 10 to 15 years of experience into taking care of, say, 80 percent of the tuning right off the bat.
BM: Even if you get perfect ability to tune the knobs exactly, if the workload changes, your tuning work is suddenly obsolete. For TPC-C benchmark tests, for example, many elements are static, but these elements aren't static in the real world.
PS: Exactly. The dot-com whirlwind forced people to get their solutions to market immediately. The industry pressed companies to get these systems online before they fully knew the type or volume of transactions they'd see or how many users they'd have to support. The system could be—and often was—completely different six months down the line. Even as we look at some of Microsoft's own internal business applications, such as accounting, the load changes from morning to night. So, you can add enough memory so that you can manage the three to four different workloads during peak system usage, but that memory will be wasted at night. You could probably get a few percentage points improvement if you wanted, but you'd have to pay a lot for extra hardware, consulting, and so on. And you'd need to be tweaking the system in the morning, at midday, and at night. We don't think customers want to play that game. With SQL Server, they can save a lot of money and still solve their business problems and meet their business needs.
BM: Using distributed partitioned views, Microsoft has posted fantastic TPC-C benchmark scores on scale-out hardware. But customers find distributed partitioned views difficult to manage, and performance can be a problem. What are Microsoft's plans for helping customers manage scale-out implementations?
PS: At the beginning of development for Yukon, we had to set our priorities. We looked at what our customers were doing, how they were solving their business problems, and decided to focus on programmability and scale-up solutions rather than on the scale-out architecture. We also looked at the hardware coming down the line, which was going to make scaling up more feasible and attractive to customers. With today's hardware, you don't see many applications that can't effectively run on an 8-way system. And if you look further out, when Yukon will be available (the beta is slated for the first half of 2003), we're looking at 64-bit machines and 64-way machines. So, we just couldn't see much demand for scale-out and clustering solutions.
That said, not pursuing advances in distributed partitioned views and clustering was a difficult choice for us because lots of people in my group, including myself, have strong clustering backgrounds. And we recognize that distributed partitioned views are difficult for customers to use. Ease of use is what Microsoft is about. So if we're going to pursue clusters—and we definitely have that goal for the future—we want to do it right. We'll guarantee and automate the moving of data from one machine to another when one cluster node becomes overcrowded or when the response time becomes poor. Until then, distributed partitioned views are a stopgap measure for people who are using multiple machines.
BM: Adding the .NET Common Language Runtime (CLR) to the SQL Server engine in the Yukon release will add tremendous value and programmability to the database. But I fear it will also allow talented developers who don't necessarily understand database-centric issues to write inefficient code. Today, if a Visual Basic (VB) or C++ developer writes code or a stored procedure that's not very efficient, a DBA can fix the problem because they can read the language. How will DBAs be able to debug, troubleshoot, and performance-tune inefficient procedural code written in languages they aren't familiar with?
PS: Microsoft is going to provide great tool support for logging stored procedures and so on. We'll also have white papers and other documentation that can help identify possible problem areas and shorten the learning curve. DBAs are going to have a learning curve for a couple of years, which is a natural outcome of making the database more accessible to more people who don't have 20 years of experience writing database applications.
BM: SQL Server Profiler is my best friend when I'm tuning, letting me see what's happening at the engine level and identify exactly what's consuming the most resources. In Yukon, are we going to see an enhanced Profiler that can monitor very low-level activity that's happening in a non-T-SQL language, or will Profiler remain mostly limited to monitoring server classes for T-SQL?
PS: We'll be sorting different trace events in the CLR, whether it's garbage collection or invoking functions or other activity. I can't give exact details because we're still working them out, but our goal is to expand Profiler's reach because we understand the importance of making sure DBAs have the information they need to manage performance and other aspects of the database.
BM: Another Yukon question: Will Microsoft expand our ability to construct XML queries, and will SQL Server's query processor and storage engine be able to store and manipulate XML natively so that we can index and search XML?
PS: Microsoft saw the importance of XML early on and put a lot of work into designing deeper XML support within the SQL Server engine, both in the query processor and the storage engine. In Yukon, we'll be implementing some sort of XML data type, along with a deep knowledge of that data type in terms of indexing or querying using X-Query or Data Manipulation Language (DML). We're taking a lot of what we learned in implementing XML on the middle tier and pushing those capabilities that make sense down into the database engine. We're not building on the top of or on the side of—we're building this integrated functionality deep into the engine. Because of this integrated approach, you'll be able to load XML into SQL Server, back up the XML, and query and manipulate it. And it will be lined up correctly with the metadata and DDL. Essentially, we're making an engine that lets you do relational access and XML access.
We're also including native support for HTTP and Simple Object Access Protocol (SOAP) in the product, which lets us directly expose XML Web services. So, you'll be able to talk HTTP as an alternative to Tabular Data Stream (TDS). We think this will let us play in the Web services space directly as a better component server.
BM: Let's talk about T-SQL for a moment. T-SQL is a very logical language, and programmers should be able to concentrate on queries being logically correct without having to worry about performance. The SQL Server team has done a lot of work to make the query optimizer more sophisticated with every release. However, different queries that perform the same task and generate the same result set still sometimes run through different execution plans and perform differently—sometimes in orders of magnitude. How far are we from having an optimizer that can always tell that two different queries that perform the same task are actually the same, then produce the same execution plan for both?
PS: I believe the optimizer gets smarter with every release, and we're getting a lot smarter at how to test it by using tools against generated and real workloads. But when you're trying to generate the correct query, it's hard to come up with just one right answer. Optimizing queries is sometimes more of an art than a science. As you noted, we built a new optimizer in SQL Server 7.0, thinking a lot about stability of plans and so on. But invariably, when you release a new optimizer version, 90 percent of the queries get faster, a few remain the same, and a few get worse. We use automatic SQL generators that help us test, compare, and make sure that the plans SQL Server generates are the proper plans. We also have workloads and queries from real customers, and we study the query plans being generated for those. We think we have the right framework for continually making progress, and we think our automated tools will help us work on those 2-3 percent of queries that get slower after a new release.
BM: Depending on who you talk to, .NET might mean the CLR, Web services, Visual Basic .NET, or a number of other technologies and directions. What does—and will—. NET mean to the SQL Server community?
PS: Here's how I think about .NET. A major portion of .NET is about storage—a storage engine for distributed programs. So a lot of features that we have in SQL Server, such as XML support, are really setting the product up to be something beyond the traditional database that we've built over the past 20 years. For example, the new Notification Services for SQL Server 2000, which lets you develop and deploy applications that generate and send personalized notifications, goes beyond the functionality of a traditional database system. We're seeing a big market for Notification Services from people who want to be notified when, say, their stock price changes or their plane is late. We put a lot of value on the dynamic nature of things, not just static behavior. So, for example, we're doing work in query notification—working with ASP.NET so that when things happen in the database, we can notify different products up the stack to take appropriate action.
Every indicator we look at says we'll be building more and more distributed applications in the future. And so much of what we're developing for Yukon and beyond—in terms of programmability, notification, XML, and so on—makes us the right platform for building these distributed apps. We're looking at SQL Server as a sort of subsystem—the structure under the covers that lets you build distributed applications. The relationship between SQL Server and .NET is really about storing information in multiple ways, from supporting large databases, to facilitating the dynamic nature of databases, to supporting XML and HTTP access.
BM: You touched on the idea of SQL Server being a storage engine for things above and beyond what we typically think of as belonging to a database. What kind of work is Microsoft doing around that concept?
PS: You're right, the technology we're building is becoming more and more important for Microsoft. First, we built a great database. That success opened a lot of eyes at Microsoft and opened a lot of opportunities to use this technology with a lot of different products. Probably 90 percent of my people are core database people—that's what we live and breathe for and think about. That being said, we've built a valuable technology, so why should we continue to build one-off technologies that try to do similar things for different products?
If we hadn't made the early push on SQL Server's serviceability, ease-of-use, self-monitoring tools, and other manageability functions, it would be difficult to pick up the traditional database and steer it toward these other uses. But because we did focus on serviceability, we're finding that it's not that complex to take the technology, talk to different customers both internally and externally, and understand what they need to do and how much we need to change the product. Again, Notification Services is a good example. We didn't have to change SQL Server much at all because it already had the functionality to solve this kind of messaging problem. Yes, it's stretching us a bit to have a good number of people thinking about problems in different dimensions and different domains. But we find it fitting very naturally with what we've already built. And XML is helping because we already have a lot of data structures and access patterns lined up with the way the company's thinking about using this technology in the future.