Big data dudes and dudettes, and those folks who are starting to assemble big data stores: Are you faced with issues of how to wrangle database design and data modeling in your big data world?
If you've been a follower for any length of time, you know that my forte is data modeling and database design. Lately, I've been encountering discussions about big data, NoSQL (or Not Only SQL, as some call it), and how, in a big data world, data modeling is not required.
In my dim and distant past, I was heavily involved in big data modeling and processing with oil and gas exploration. By today's standards, the volume and velocity of data gathered would be considered small, but for the computing power of the day, it qualified as "big data." We definitely modeled the data; otherwise, there would have been no way to analyze and use what was being gathered, and we would never have been able to intelligently spot the next drill site.
No Data Modeling Paradigm?
Which is why I don't understand the NoSQL concept of the "no data modeling" paradigm . . . what exactly does that mean? Does that mean no data modeling at all and all of the data is just dumped into a big pile (heap structure) so that when you retrieve data you have to start at the beginning and search through, reading all the data to the end until you find what you’re looking for (table scan)? Probably not. Obviously, there’s got to be data modeling or data organization going on somewhere.
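To make the "big pile" scenario concrete, here is a minimal Python sketch of my own, not anything taken from a particular NoSQL product. The records, the field names (well_id, depth_m, reading), and the helper functions are all hypothetical. It contrasts reading every record from start to end with looking records up through a key that was chosen ahead of time, which is already a small act of data modeling.

```python
from collections import defaultdict

# Hypothetical sample records; field names are invented for illustration.
records = [
    {"well_id": "W-001", "depth_m": 1200, "reading": 0.82},
    {"well_id": "W-002", "depth_m": 900, "reading": 0.47},
    {"well_id": "W-001", "depth_m": 1500, "reading": 0.91},
    {"well_id": "W-003", "depth_m": 2000, "reading": 0.33},
    {"well_id": "W-002", "depth_m": 1100, "reading": 0.58},
]


def scan_for_well(rows, well_id):
    """Heap-style retrieval: read every row from the first to the last."""
    matches = []
    rows_read = 0
    for row in rows:
        rows_read += 1
        if row["well_id"] == well_id:
            matches.append(row)
    return matches, rows_read


def build_index(rows, key):
    """Organize rows by a chosen key so later lookups can skip the scan."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def lookup_for_well(index, well_id):
    """Keyed retrieval: go straight to the rows stored under well_id."""
    return index.get(well_id, [])


if __name__ == "__main__":
    scanned, rows_read = scan_for_well(records, "W-002")
    print(f"scan read {rows_read} rows to find {len(scanned)} matches")

    by_well = build_index(records, "well_id")
    print(f"keyed lookup found {len(lookup_for_well(by_well, 'W-002'))} matches")
```

Both approaches return the same rows. The difference is that the scan touches every record on every query, while the keyed version pays once to organize the data around a question someone expected to ask.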
In the business world, transactional data is modeled via a set of rules and methods so that updates, inserts, and deletes don't invalidate the integrity of the data, and select operations are optimized (we hope!).
But what do you do for big data? How do you handle data modeling in a business big data world? How are you addressing organization within the massive scale that big data presents? What is your approach?
I'd really like to hear from you and get your input on this subject. I'm very curious!
See also: Big Data, To Model or Not to Model?