NoSQL: Eventual Consistency Yields Major Flaws

NoSQL and I don't get along.

That's not really NoSQL's fault. Nor, frankly, is it mine.

Instead, I blame our rocky relationship on NoSQL's fan club: the developers who blindly assume that NoSQL is the answer to all of the world's problems. It's not. NoSQL is a viable technology for a subset of specialized needs and use cases, and it comes with significant tradeoffs (as does every other technology available).


Yet what I commonly hear from developers high on NoSQL catnip is that relational databases don't scale, or that working with them is hard or tricky. Honestly, every time I hear that I want to reply with "Booohooo. Cry me a river."

Relational Databases Scale Just Fine

First of all, relational databases can and do scale. Just fine. Yes, maybe your CRAPPY database is having major problems, but that's potentially because you bit off way more than you could chew and failed to architect things correctly. But, make no mistake, relational databases can and do scale. It might take some effort and you might have to learn and master some things, but relational databases do scale. Just fine.

Second, lots of aspects of development are hard—or cumbersome.

Test-driven development (TDD) is or can be hard to get right—but you'll write better code with tests than without.
Version control is cumbersome—but you're stupid not to use it.
Multi-threading is non-trivial to implement properly but you're going to need it sometimes.
Etc.

But, when's the last time you heard a developer say (without being laughed to scorn): "Hey, I keep getting burned by these pesky off-by-one errors—is there a language that will infer what I mean here since I'm too stupid to explicitly and correctly define my logic here?"

Learn How Relational Storage Engines Work

Yet, while gobs of developers seemingly have no problem learning a brand new JavaScript framework from one week to the next, too many of them can’t be bothered to learn how relational storage engines work. They will, however, also spend hours, days, months, and years learning the ins-and-outs of newly emerging (and frequently very poorly documented) NoSQL platforms all because "relational databases are hard." (So it's not like developers can't learn or won't spend effort—which is why I get cranky when they act like sissies and assume that NoSQL will solve ALL of their problems—magically.)

And, don't even get me going on the idiotic pursuit of web scale by developers working in established brick-and-mortar small businesses (that are likely never going to get or need web scale), or every other startup on the planet that still hasn't even figured out a business model yet. Scale is NOT the only concern.

But I digress. Because my point here isn't to call NoSQL names. (I just get sick of the trend of NoSQL being seen as a factory for magical, powerful, frolicking unicorns—that shoot gold coins out their backsides and can slay all of their enemies with fire or their horns.)

Instead, while I believe that NoSQL can, does, and will have some places where it legitimately makes sense (as long as you understand the trade-offs), I also believe that NoSQL is the PATENTLY wrong choice for a number of different types of applications or use cases.

NoSQL is the Patently Wrong Choice for Many Solutions

Again, I have no real beef with NoSQL. I personally don't like or subscribe to what many think are some of its biggest virtues—but I'm not so petty or narrow-minded that I can’t see its honest strengths and benefits. Likewise, in many cases, I think that while a relational database would be the right choice for me, I can concede that a NoSQL implementation would be the patently RIGHT choice for a developer with other strengths, backgrounds, and needs.

All of that said, there are also plenty of cases where NoSQL is not only less desirable to use, but ends up being the patently worst thing to use.

Eventual Consistency Is Not Enough When Dealing with Financial Transactions

Enter this sad tale about how the architectural decision to use NoSQL was the patently wrong choice for a Bitcoin exchange.

Note, too, how imperative it is to call out that the security vulnerability in the failure detailed above was NOT merely a question of developers doing something stupid by using NoSQL incorrectly. (SQL injection, for example, is a known vulnerability that DEVELOPERS run into by doing something stupid. It's not an explicit weakness of relational database engines; it is, instead, a weakness of developers.)

Instead, the failure here was a simple, fundamental oversight of a major limitation of NoSQL in general: its inability to handle truly transactional operations (i.e., transactions covered by that pesky ACID stuff that's so hard for developers to understand and that is, therefore, one of the big reasons why they flee to NoSQL).

Or, in other words, the way that NoSQL manages web-scale capabilities is by letting multiple operations work against the same object or data at the same time, and then letting the true state of that data eventually become realized. So (in overly simplified terms), if I can debit my own account for $500, but manage to do that 20,000 times in a single second, then I'll manage to pull out $10,000,000 from my account (that only has $501 in it) and the state of my account will eventually become consistent with a balance of -$9,999,499. Or, as the article above succinctly calls out:

Any computer scientist worth her salt would immediately repeat this process all day, at web scale, until she emptied out all the cash at the exchange. And that's exactly what the attackers did. [Emphasis added]

The irony, of course, is that the attackers were able to take advantage of the developers' overzealous focus on web-scale throughput to mount this exact attack.
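
For the skeptics, here is a minimal sketch of the overly simplified debit scenario above, written in Python. It is an illustration under stated assumptions, not a model of any particular NoSQL product or of the exchange's actual code. The function names are hypothetical, the "eventually consistent" side is just a toy in which every request checks funds against the same stale snapshot, and Python's built-in sqlite3 module stands in for a relational engine on the transactional side.

```python
import sqlite3

OPENING_BALANCE = 501
DEBIT = 500
ATTEMPTS = 20_000


def naive_eventual_debits(balance, amount, attempts):
    """Toy model (an assumption, not a real NoSQL API): every concurrent
    request validates against the same stale balance, and all approved
    debits are merged into the account afterwards."""
    snapshot = balance  # what each in-flight request believes the balance is
    approved = sum(1 for _ in range(attempts) if snapshot >= amount)
    return approved, balance - approved * amount


def transactional_debits(balance, amount, attempts):
    """Same workload against a relational engine: the funds check and the
    debit happen in one atomic statement inside a transaction."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts (id, balance) VALUES (1, ?)", (balance,))
    conn.commit()

    approved = 0
    for _ in range(attempts):
        with conn:  # commits on success, rolls back if an exception is raised
            cur = conn.execute(
                "UPDATE accounts SET balance = balance - ? "
                "WHERE id = 1 AND balance >= ?",
                (amount, amount),
            )
            approved += cur.rowcount  # 1 if the debit applied, 0 if refused

    (final_balance,) = conn.execute(
        "SELECT balance FROM accounts WHERE id = 1"
    ).fetchone()
    conn.close()
    return approved, final_balance


if __name__ == "__main__":
    print(naive_eventual_debits(OPENING_BALANCE, DEBIT, ATTEMPTS))
    print(transactional_debits(OPENING_BALANCE, DEBIT, ATTEMPTS))
```

In the toy model all 20,000 debits are approved and the account lands at -9,999,499. In the transactional version exactly one debit goes through, the other 19,999 are refused, and the account is left with $1. The only design choice that matters here is that the balance check and the write are a single atomic operation, so no request can act on a balance that another request has already spent.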

