The Honesty of RDBMS, NoSQL, and ACID

RDBMS or NoSQL: the decision rests on performance

Very few developers today remember a time before the relational database; it was a very long time ago, indeed. E.F. (Ted) Codd published a paper in 1970 that defined the concept of the relational database, but it would be nearly a decade before commercial products really came to market.

One of the core tenets of the relational database was ACID: Atomicity, Consistency, Isolation, and Durability. Atomicity focused on the "all or nothing" rule: the idea that either a whole transaction was completed or none of it was. Consistency meant that every transaction took the data from one valid state to another, so a transaction that would break the database's rules (a constraint, for example) was rolled back and left the data as it was before. Isolation focused on the idea that concurrent transactions did not interfere with one another; the classic implementation is locking, which forces transactions affecting the same data to wait until the previous transaction has completed. Finally, durability was all about the idea that once a transaction was committed, its changes survived no matter what kind of failure followed. There's more to ACID than this, but you get the idea.
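To make the "all or nothing" rule concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names in it, and the transfer function are all hypothetical, invented for illustration. The credit is deliberately applied first so that when the debit fails a CHECK constraint, the rollback has something to undo.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        # Used as a context manager, the connection commits if the block
        # finishes and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint when funds are short.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

overdraft_ok = transfer(conn, "alice", "bob", 150)  # debit fails, credit is undone
after_overdraft = balances(conn)
normal_ok = transfer(conn, "alice", "bob", 40)      # both updates commit together
after_normal = balances(conn)
```

After the failed transfer, bob's balance is back where it started even though his credit had already been applied inside the transaction. That is atomicity doing its job, and the CHECK constraint is a small taste of consistency.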

The principles of ACID have become so deeply ingrained in databases now that most developers are barely even aware of them. They just expect their databases to work, never lose data, never make a mistake, no matter how egregiously we abuse them.

The recent NoSQL movement has postulated the idea that all this protection comes at too high a performance cost: that we can "bend" some data integrity rules in the name of performance. Most NoSQL solutions are not ACID compliant. Some are; the question is, do you know the difference? Do you care?

We've been undermining ACID for some time now in the name of performance, using various caching techniques to pull data out of the database and into something faster—effectively "taking a chance" with data to improve performance.

Ever tried buying a flight on Expedia just to find out that the price you were quoted no longer existed? You could still get a ticket, but at a higher price. You just experienced a caching failure, and a violation of database consistency. Expedia "took a chance" by keeping a copy of the ticket prices in a cache, away from the actual database of the airline that had the real prices. In most cases, probably well over 90 percent of the time, this is no problem. But once in a while, they offer a price that is no longer available, and you get that happy message about being able to pay more for the same flight.
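The stale-price scenario can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the CachedPrices class, the dictionary standing in for the airline's database, and the injectable clock are hypothetical, and this is not a description of how Expedia or any airline actually works.

```python
import time


class CachedPrices:
    """Serve quotes from a time-limited cache in front of a slower source."""

    def __init__(self, source, ttl_seconds, clock=time.monotonic):
        self._source = source      # dict standing in for the system of record
        self._ttl = ttl_seconds    # how long a cached price is trusted
        self._clock = clock        # injectable so the demo can control time
        self._cache = {}           # flight -> (price, fetched_at)

    def quote(self, flight):
        """Fast path: may return a price the source no longer offers."""
        now = self._clock()
        hit = self._cache.get(flight)
        if hit is not None and now - hit[1] < self._ttl:
            return hit[0]
        price = self._source[flight]
        self._cache[flight] = (price, now)
        return price

    def purchase(self, flight, quoted_price):
        """Slow path: check the quote against the source of record."""
        actual = self._source[flight]
        return actual == quoted_price, actual


airline = {"SEA-JFK": 300}
fake_now = [0.0]
prices = CachedPrices(airline, ttl_seconds=60, clock=lambda: fake_now[0])

first_quote = prices.quote("SEA-JFK")    # fetched from the source and cached
airline["SEA-JFK"] = 350                 # the airline raises the price
stale_quote = prices.quote("SEA-JFK")    # cache still answers with the old price
honored, actual_price = prices.purchase("SEA-JFK", stale_quote)

fake_now[0] = 61.0                       # the cached entry has now expired
fresh_quote = prices.quote("SEA-JFK")    # re-fetched from the source
```

The ttl_seconds value is the business decision in miniature: a longer window means fewer trips to the real database and more customers who see a price that no longer exists.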

The point I'm making here is that the principles of NoSQL aren't that exotic; they are based on something many of us have actually been doing for some time now: sacrificing data integrity for performance. Ultimately, it is a business decision that manifests itself in coding.

The question is: Does it make more sense to subvert a highly reliable system like an RDBMS with technologies like caching, or is it better to use a lighter-weight, less-reliable-by-design system like one of the NoSQL products to do the same thing? After all, you've already been subverting ACID, so why not be honest about it?

Richard Campbell is technical director of DevProConnections.
