Words are important. My undergraduate degree is in Linguistics, and I have always been interested in words and language. Just last night, I was rereading one of my favorite classic Science Fiction novels, and the main character was discussing the difficulty that a person raised by aliens would have in learning English because it has such a huge number of synonyms. That’s great when you’re trying to express yourself, but not so great when you’re a non-native speaker who’s trying to understand what someone else is saying. Someone might know what “beautiful” means, but when they read “pulchritudinous,” they might be completely lost. Thesaurus.com gives several dozen synonyms for beautiful.
Sometimes it matters if you choose one synonym over another because there are different shades of meaning that can be necessary for complete clarity. But it often doesn’t matter which word you use if the meaning is understood.
In some cases, such as in a programming language, synonyms can’t be used arbitrarily, at least not for language keywords. For example, some implementations of SQL-based languages use RETRIEVE instead of SELECT, but in any particular implementation, these two words aren’t interchangeable. (I’m intentionally not mentioning the ability to create a synonym for a user-created object reference.)
But what about the terms we use when talking about a relational database? A frequent contributor to the Microsoft SQL Server public help forums, who is a SQL language expert (but not a SQL Server expert) and shall remain nameless in this article, insists that using the precise words is crucial to understanding and using SQL Server (or any relational database system). He becomes very insistent, and chastises new users in no uncertain terms, when they use the word “record” instead of “row” or “field” instead of “column.” He insists that the word “record” applies only to sequential tape files and should never be used for a relational database. As to the difference between “field” and “column,” he’s never very specific, other than to say that we shouldn’t use the word “field.” However, many SQL Server users have never had their hands on any kind of sequential tape system, and in their mind there’s no confusion. When they say “record,” they mean “row,” and everybody who reads their questions on the forums knows exactly what they mean.
So why does it matter? Every time I see the above referenced SQL expert rant about proper usage of these terms, I ask that question. SQL Server’s own documentation occasionally uses the word “record” instead of “row,” so it’s no surprise that some users will use that word also. For example, take a look at the documentation for the Dynamic Management Function sys.dm_db_index_physical_stats. Most of column names dealing with row information use the word “record,” such as min_record_size_in_bytes, max_record_size_in_bytes, and avg_record_size_in_bytes. But we all know what we mean when we talk about record size.
This issue came up for me again last week at PASS Summit 2010 in Seattle. Thursday’s keynote speaker was David DeWitt, from the Microsoft Jim Gray Systems Lab in Madison, Wisconsin. I would love to talk about his great content and delightful explanations for why optimization is such a complex process, but there’s not enough time or space right now. (You can watch the whole keynote on the PASS site.) However, I did want to point out that Dr. DeWitt used completely different terminology. When referring to tables, rows and columns, he used the words “relations,” “tuples,” “keys,” and “attributes.” I have a feeling no one listening to this talk cared at all which specific words he used. His meaning was crystal clear. In fact, according to Wikipedia, there’s no difference at all between all these terms:
“In the context of a relational database, a row—also called a record or tuple—represents a single, implicitly structured data item in a table. In simple terms, a database table can be thought of as consisting of rows and columns or fields.”
And if you can’t believe Wikipedia, who can you believe? Dave DeWitt makes it clear that it’s the content, not the specific word usage, that provides the value.