What's in a Name? - 24 Apr 2003 | ITPro Today: IT News, How-Tos, Trends, Case Studies, Career Tips, More

In Shakespeare, "That which we call a rose by any other name would smell as sweet." But although people might still be able to recognize an object when its name changes, SQL Server can't. A name, in the database sense of the word, is a string of characters that identifies an object—such as a table, a column, or a stored procedure—to the database and to users. When that string changes, your database thinks it's looking at a different object. And in the context of your database, users also might think they're looking at a different object. So it's important to establish and stick to a naming convention for your database objects.

In "Seven Deadly Sins," November 2002, InstantDoc ID 26727, I told you about some of the dangers you face when you don't use a naming convention for your databases. Databases evolve, and the names of database objects evolve too. Over time, naming inconsistencies can become so prevalent that they inhibit productivity and cause confusion when users try to access data. This month, I give you examples of my favorite naming conventions and hints about how to handle global name changes in your databases and in the programs that touch those databases.

A good, clear naming convention makes database objects easier for you and SQL Server to recognize and manage. If you have a good naming convention, you won't find yourself struggling to understand what a table named ILOrderMod, for example, is doing in your database. In addition, good names help you avoid confusion when you or members of your staff are writing programs. Did you mean to write a stored procedure for table prof_svc, prof_svc_src, or svc_src_prof? If you have a good naming convention, the table names will be clear, concise, and meaningful to everyone and everything that uses them.

Within your environment, all objects of the same type (e.g., all tables, all views, all stored procedures) must have unique names. SQL Server 2000 won't let you duplicate names among tables, views, and other objects such as constraints, defaults, and primary keys.

SQL Server uses a multipart naming convention for each object within each database. When a DBA creates a table named MyTable on a database called MYDB, the full name of that table is MYDB.dbo.MyTable. If the table is a user table and you don't use its full name in a query, SQL Server resolves the name by checking the local database for a table that has that name. If user Sam creates a table named MyTable in the MYDB database, his table's full name is MYDB.Sam.MyTable. To SQL Server, these two tables have two different names—the full names differentiate the tables.

Two instances of SQL Server can't have the same name if they're on the same network. If you deploy a distributed database solution on linked servers, you need to ensure that no two tables (or a table and a view) on the linked servers have the same name. If you find that you have duplicate object names, you have to use a four-level, fully qualified name for each object (e.g., AccountsSvr.Accounting.dbo.Expenses). On the positive side, using a fully qualified name for each object clarifies what you're talking about; you can think of this practice as a method of code self-documentation. Also on the positive side, a fully qualified name reduces performance overhead somewhat because SQL Server doesn't have to resolve names when it processes a query. But on the negative side, fully qualified names involve a lot of typing and an increased opportunity for typing mistakes.

Any naming convention you choose should produce object names that are descriptive, concise, and easy to manage. In SQL Server, you can create names as long as 128 characters. If you're working for a company that uses a heterogeneous database environment—that is, you integrate data and applications from different database platforms such as SQL Server, Oracle, and DB2—you have to use a more restrictive naming scheme than you do in a SQL Serveronly environment. For example, no platform other than SQL Server (and the Microsoft Access Jet Engine) supports spaces in object names. So in a heterogeneous environment, you'd choose a naming convention that excludes spaces so that your names are compatible across all your platforms. If you're faced with implementing cross-platform compatibility, you can find information about the lexical attributes for each of the three major database platforms in Peter Gulutzan's article "SQL Naming Conventions" at http://www.dbazine.com/gulutzan5.html. One of the article's tables lists the maximum lengths of names for some of the most common database objects, and another table lists the characters that various platforms allow for their database object names.

Gulutzan's article also lists some naming conventions for different database platforms. For some of these objects, I suggest that you add the letter u to the prefix to distinguish your user functions or user stored procedures from those that are part of the database management system. For example, Microsoft suggests using the prefix fn_ for function names, sp_ for procedure names, and trig_ for trigger names. But I suggest using ufn_, usp_, and utr_, respectively, because if you create a user stored procedure with a name such as sp_myproc, SQL Server will think that sp_myproc is a system stored procedure and search the master database first. The erroneous search will result in diminished performance, and SQL Server might even execute the wrong stored procedure if you've given your stored procedure the same name as a system stored procedure.

My favorite naming convention is a derivation of the Hungarian notation that Access programmers use. This naming convention uses no spaces, minimal special characters, mixed case for readability, and a prefix to identify the type of object. Some examples of this scheme are sysFullTextCatalog (a system table), tblTitleAuthor (a user table), colRoyaltyPer (a column), usp_ByRoyalty (a user stored procedure), utr_EmployeeInsurUPD (a user trigger), def_EmployeeJobID (a default constraint), ck_Employee_EmpID (a check constraint), pk_Author_AuID (a primary key), and fk_Sales2Titles (a foreign key reference). You can easily adapt this scheme to Visual Basic (VB) or .NET objects. The prefix makes it easy to identify what kind of object you're dealing with and helps avoid duplication of object names. Although elimination of blank spaces isn't necessary in SQL Server 2000, the practice makes your code compatible with code and objects from other database platforms.

After you've decided to adopt a naming convention and you've accounted for any object-name length limits to accommodate the various database platforms in your organization, you can implement tables and views and write code against the database objects. But what happens if you need to modify an object name? For example, the company president might decide to stop using the word customer and replace it with the word client. This is no minor request. The word customer and variations of it—such as CustID and CustName—are present throughout the database in tables, columns, stored procedures, triggers, views, functions, and constraints. The word is also in all the programs and applications that touch the database. To make the change to client, you have three options:

You can manually search through the database objects and the program code, hope that you find all instances of customer, and convert them to client.
You can use the tools that the software packages in your system provide to perform a global find-and-replace operation. For example, applications such as Query Analyzer, AppsChannel's Visual Workbench, and Microsoft FrontPage include embedded find or find-and-replace routines.
You can use a third-party utility for the find-and-replace operation.

The first option, manually searching, can consume an enormous amount of time and still leave you with unresolved objects and behavior problems that are hard to trace. For example, in a recent project, my clients couldn't understand why their system was no longer inserting login values into a log file. It turned out that in the process of upgrading SQL Server 6.5 to 2000, someone forgot to modify one instance of the suser_name() function. That instance happened to be in the trigger that caused the insertions into the log file. (In SQL Server 2000, suser_name() returns NULL and is included for backward compatibility only.) The manual-search technique is the most costly and the least reliable of the three options.

The second option, using the embedded find-and-replace routines, works as long as you remember to perform the find-and-replace operation for every type of software you're using and for every variation of the string value you want to change. Traditional find-and-replace routines have other limitations that I find too restrictive. For example, they force me to work in a sequential fashion. Each time the routine finds a match, I have to decide whether to replace the occurrence without knowing about any references that might follow. If I make a mistake, it's difficult to undo anything but the last change. In addition, my scope is usually limited to the current module, I can't control the order in which the routine presents occurrences, I can't view or print a list of occurrences, and I can't run the find-and-replace operation in test mode to see what the outcome will be like. Some of the products I use have an embedded find routine with no accompanying replace, so I have to generate scripts for the objects, then use another software package to find and replace terms. These embedded find-and-replace functions are better than manual-search techniques, but they're still time-consuming and error-prone.

The third option, using a third-party utility, can save time and help you organize your work. The effectiveness and efficiency of a third-party utility depends on its feature set. A good package for those of us who work in a SQL Server environment is typically a visual utility that you can use with SQL Server 2000, 7.0, and 6.5; Access 2000 and 97 databases and projects; and VB 6.0 and 5.0 projects. A good product also makes it easy for you to view all your active projects, employs a simple find-and-replace operation, and lets you search for multiple strings in one operation or even search across multiple projects, analyzing dependencies and determining the correct replacement sequence for dependent objects. You should be able to analyze and filter results, save the results for future analysis, and apply the results later. A good product also makes it easy to roll back some or all of the changes.

My favorite third-party product for performing find-and-replace operations is Black Moshannon Systems' Speed Ferret, which includes all of these features. Figure 1, page 39, shows the product's interface. One of the product features I like is that after I save my project and search, I can exit Speed Ferret and analyze the effect of the changes on my production environment. If I'm not satisfied with something, I can reenter Speed Ferret and roll back part or all of the changes that I made. Speed Ferret can also modify database objects such as table and column names and all references to these objects. This feature lets you establish naming standards on an existing database. As you can tell, I prefer to use a third-party product when I need to change object names, but your organization's needs might differ.

Naming conventions are about organization. The naming convention you choose isn't as important as choosing one and sticking with it. You need to be consistent with your naming convention and document it. The naming convention becomes part of your organization's documentation. Then, when your company president asks you to make a name change that affects a database object, you know what options you have for making the change. Managing your database environment just became a little easier.

Comments

Plain text