What the heck is the NoSQL movement all about, and how does it affect SQL Server developers and DBAs? At TechEd 2010, I got the chance to ask Kevin Kline, strategy manager for SQL Server at Quest Software, and Brent Ozar, a SQL Server DBA expert for Quest Software, about this fast-growing trend. Here’s what NoSQL means for you and the scenarios in which this option could be more efficient than SQL Server. (Be sure to check out the rest of my interview with Kevin and Brent on the Database Administration blog.)
Megan Keller: Brent, there seems to be a NoSQL movement taking place. What do our readers need to know about it?
Brent Ozar: There are two common scenarios for why you would consider using something other than your typical relational database. One is data that is not worth very much money. For example, if you’re tracking web usage amongst all your employees, you really don’t care if somebody clicked on Farmville, MySpace, or whatever. You care, but there’s not a million dollar value to that data, and there’s a ton of it, so we want to be able to stash it somewhere. But we don’t really need to query it that quickly, and we’re going to develop reports against it once and then walk away.
The other use is frequently reloaded analytical data. So I might need a whole lot of sales data and a whole lot of customer data, and I’m going to pay some rocket scientist PhD to slice and dice that with his own tools, but at the end of that we’re going to throw that [data] away. We’re just going to take whatever we learned and use that to make business decisions. Both of those are things that need peak loads or that are very cheap, and SQL Server doesn’t really work very well in those situations. It costs a whole lot of money and it’s pretty inflexible.
Kevin Kline: I would even go further and say relational databases. If you had Oracle or MySQL, even those would not be great at that.
Ozar: People aren’t really rebelling against the SQL language itself or any particular properties of the database. It’s just that they needed something cheaper that can handle these crazy loads, these high loads, very quickly, and that is very cheap and easy to scale. So developers are building tools to make it easier for them to store data that way, and much like the cloud, they’re not coming to DBAs asking questions. They’re just going to build their own very specialized tools in order to get the job done.
Keller: So we should expect this movement to grow a lot in the next year?
Kline: Absolutely. In fact, the writing is already on the wall. At the [PASS] keynote last year, Dave DeWitt spent a great deal of time talking about what column stores are. And NoSQL doesn’t stand for “no SQL,” it stands for “not only SQL.” So we’re not limiting ourselves to only relational databases. We’re going to look at other ways to look up data: key value stores, what’s the other one called?
Ozar: XML columnar storage, XML property bags.
Kline: Right, so there are several different ways that you can get to this data that are better at certain types of use cases. So another example might be like with Facebook (they use Cassandra, don’t they?), which is one of the best known. So each one of them has different virtues, but the idea is that I need to get scaling up to millions of people. That’s where I need to be. And I don’t need the ACID properties of a relational database: you know, Atomic, Consistent, Isolated, and Durable. What if it’s eventually consistent so that when you post “I’ve got the flu today” it gets up there sometime in the next five minutes, but it doesn’t have to be there right now? We can take all kinds of shortcuts compared to if we did that with Oracle or SQL Server or one of the relational databases. So part of it is the high-end scalability.
And one of the things that Dave DeWitt pointed out, as Brent had mentioned earlier, is we have CPUs that are getting so much faster. Intel has already announced I think a 64- or 32-core CPU, so that is still continuing to accelerate. But our hard disks are not getting any faster; really, they kind of peaked out there. So we do have some intermediate technologies, in-memory data storage, we have SSD, that helps, but it’s still not enough to keep up with the huge advances we’re making in CPUs. So what if we turn that relational model on its head, we use a different model, and we can get to this stuff faster? That’s again another opportunity that the hardware presents for us, just like what we have with VMs, like what we are having now with the cloud. And Microsoft is already thinking about how we are going to be able to incorporate key value storage, XML storage, and so forth as elements of SQL Server in future releases. Now, DeWitt said up and down, “This is not future looking.” It is absolutely future looking. He just meant, “Don’t call me on a date.” But it will be in there eventually. That’s Kevin Kline talking, that’s not Dave DeWitt.
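To make Kline's "eventually consistent" status-update example concrete, here is a minimal Python sketch of a key value store whose replica lags behind its primary. It is a toy model only, not how Cassandra or any real NoSQL product works; the class name `EventuallyConsistentStore`, its methods, and the key `status:kevin` are all hypothetical names invented for illustration.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy key value store: one primary, one lagging replica (hypothetical)."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # writes not yet applied to the replica

    def put(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary[key] = value
        self.pending.append((key, value))

    def get_from_replica(self, key):
        # Reads served by the replica may be stale until replication runs.
        return self.replica.get(key)

    def replicate(self):
        # Apply queued writes in order; afterwards the replica has caught up.
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value


store = EventuallyConsistentStore()
store.put("status:kevin", "I've got the flu today")
stale = store.get_from_replica("status:kevin")  # replica has not seen the write yet
store.replicate()                               # "sometime in the next five minutes"
fresh = store.get_from_replica("status:kevin")  # now the post is visible
```

The write returns without waiting for the replica, which is the shortcut an ACID relational database would not take: a reader may briefly miss the post, in exchange for writes that are cheaper and easier to spread across many machines.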