A Dependency Checker You Can Depend On

Downloads
95428.zip

Dependencies between database objects dictate the order of deployment in code releases. The order of object deployment is key to the success of database projects, especially when code is moved between different environments. Therefore, the dependency information viewed by developers and DBAs must be accurate. Although SQL Server 2005 and SQL Server 2000 have many great features, they still don't track object dependencies in a reliable fashion. Interestingly, nearly all third-party dependency tools rely on the limited information recorded by SQL Server, which can lead to functional and deployment errors.

To overcome these deficiencies, I created a dependency checker that returns the dependency and deployment order of objects. This checker uses a simple algorithm and some T-SQL code. Before I show you how the dependency checker works and how to use it, let's look at why SQL Server's built-in dependency tracking functionality doesn't work well and how the dependency checker avoids the same problems.

SQL Server's Deficiencies in Detecting Dependencies

From the application developers' view, SQL Server contains four standard object types: tables, views, stored procedures, and functions. All other objects are either prerequisites or attributes of objects in these groups. For example, logins, users, rules, and full-text catalogs are prerequisites, whereas triggers, foreign keys, and synonyms are attributes because they're typically created together with the parent object. When database projects are deployed, all prerequisites are created first. Hence, the main challenge is to identify dependencies and determine the deployment order of standard objects. With this goal in mind, let's concentrate on finding all possible dependencies between tables, views, stored procedures, and functions, potentially through their child objects.

Dependencies are expressed in many ways. A stored procedure that selects records from a table or view depends on underlying objects. Similarly, a view that selects data from a table or function depends on underlying objects. Another example is a trigger that manipulates data in table1 when data updates are made to table2; table1 therefore depends on table2.

SQL Server 2000 uses the system table sysdepends to store object dependencies. In SQL Server 2005, dependency information is recorded in the system view sys.sql _dependencies. Both of them operate in a similar manner and map an object's ID to each dependent's ID. Explicitly, sysdepends contains the columns \[id\] and \[depid\], whereas sys.sql_dependencies uses \[object_id\] and \[referenced_major_id\] to denote the relationship between an object and its dependent.

Two problems arise with the built-in dependency tracking functionality in SQL Server. First, dependencies are recorded only when the dependent item already exists in the underlying database. However, in most cases, nothing prevents users from creating an object that depends on another object when the latter doesn't yet exist, in which case SQL Server can't detect the dependency. Second, if a child of one object (e.g., a trigger called TRG_table1_AfterUpdate that is invoked after data updates to table1) depends on another object (e.g., the trigger manipulates data in table2), the parent object table1 depends on table2.

SQL Server's dependency tracking functionality doesn't account for these scenarios. Thus, if a user generates scripts that contain the schema of such objects by using popular auto-scripting tools, the order of operations in the scripts is incorrect. Consider the following examples:

Example 1: A stored procedure that references a table is created before the table, as Listing 1. In this case, each call to execute the stored procedure before the table is created results in errors.

Example 2: A trigger updates a target table when data changes are made to the trigger's parent table, but the trigger is created before the target table, as Listing 2 shows. Because the target table isn't available in the database (i.e., it hasn't been created yet), all data updates to the parent table incur errors.

Example 3: A view selects data from another view that doesn't exist in the underlying database, as Listing 3 shows. All attempts to select data from the non-existent view will result in binding errors.

To avoid such problems, the dependency checker analyzes databases for the following dependencies:

Tables that depend on other tables. Cross-table dependencies can occur when a table with a foreign-key constraint depends on the referenced primary table. Cross-table dependencies can also occur when the child of one table depends on a second table, as illustrated by Listing 2.
Tables that depend on stored procedures, functions, or views. If a trigger selects records from views or functions, or if the trigger code executes functions or stored procedures, then the parent table depends on the referenced objects.
Views, stored procedures, or functions that depend on other objects. This type of dependency can occur when views select rows from tables, functions, or other views. Similarly, it can occur when stored procedures or functions select data from the same set of objects or call other stored procedures or functions.
Objects that depend on themselves. A table that has a self-referencing foreign key constraint is considered to be self dependent. Other self-dependencies include a trigger that updates its parent table and a stored procedure or function that calls itself. For the purpose of project deployment, self-dependencies are of little interest, although they're still taken under consideration by the dependency checker's algorithm.

How the Checker Works

Foreign-key constraints are stored appropriately in SQL Server. However, dependency problems arise in all other cases in which relationships are embedded in the object text or in the code of child objects. Consequently, it seems natural to use a text-based dependency checker that parses T-SQL code to search for object references.

For the purposes of the dependency checker, let's define a text-based object as any single view, stored procedure, function, or trigger. One text-based object depends on another text-based object when the dependent object's name is referenced in the T-SQL code. Moreover, when an object is referenced, the first character before and after the object's name must be one of the values listed in Table 1. It's also worth noting that if a left square bracket appears before the object's name, the object name is most likely appended with a right square bracket.

The dependency checker operates by collecting all text-based objects in the database as requested by the user. Input and configuration parameters let the user process all database objects or just specify a collection of items. Next, the dependency checker retrieves the CREATE statements for all objects by querying sys.sql_definitions in SQL Server 2005 or syscomments in SQL Server 2000. The checker stores the statements in a temporary table.

To find references to other objects in the text, all T-SQL code comments are stripped out. The dependency checker does this by finding instances of strings that start with the -- or /* comment character, deleting each comment, and storing the resulting code in the same temporary table.

At this point, the functional raw code is at hand and the search process can begin. The search relies on the set of characters in Table 1. Explicitly, the algorithm uses a table called #tblTextBasedObjects that contains the columns ObjectID (INT) and ObjectTextWithoutComments (NTEXT or NVARCHAR(MAX)), which store the ID and the comment-free CREATE statement for each object, respectively. Furthermore, the algorithm uses a temporary table named #tblAllDBRoutinesTablesViews to store all standard objects in the database. This table contains four columns:

ObjectName of type NVARCHAR(128)
ObjectID of type INT
ObjectNameForLikeSearches1 of type NVARCHAR(200)
ObjectNameForLikeSearches2 of type NVARCHAR(200)

The last two columns are used for string-matching purposes and are populated using the code that Listing 4 shows. The dependency checker can now search for object references in a straightforward manner by applying the LIKE function on the patterns in #tblAllDBRoutinesTablesViews, as Listing 5 shows.

The code in Listings 4 and 5 illustrate the basic concept behind the dependency checker; the complete algorithm and code are a lot longer. You can download the complete algorithm and code, which includes additional details, from the SQL Server Magazine Web site.

How to Use the Checker

When you download the code, you'll find two files: DependencyViewer2K5.sql, which is the dependency checker for SQL Server 2005, and DependencyViewer2K.sql, which is the dependency checker for SQL Server 2000. Here are the main parameters for both scripts:

@IncludeAllDBObjects. The @ IncludeAllDBObjects option tells the script whether to consider all objects in the database (value of 1) or only user-provided objects (value of 0). If you set this parameter to 0, you must populate the temporary table #tblRequestedObjects with the names of all the objects you want examined.
@IncludeDependencies. If you set the @IncludeAllDBObjects parameter to 0 and provide the object names in the #tblRequestedObjects table, you can set the @IncludeDependencies option. Setting this option to 1 tells the script to find objects that the items specified in #tblRequestedObjects rely on.
@IncludeDependants. Like the @IncludeDependencies option, the @IncludeDependants option is available if you set @IncludeAllDBObjects to 0 and provide the object names in #tblRequestedObjects. Setting this option to 1 tells the script to find objects that depend on those items listed in #tblRequestedObjects.

There are additional options that you can set. The scripts' usage instructions describe them in detail. When you use the scripts, keep the following considerations in mind:

Circular dependencies. Circular dependencies can occur when each item in a group of objects depend on each other. When checking for dependencies, it's important for you to recognize circular relationships. However, for deployment purposes, there's no right or wrong ordering; items that fall under this category can be deployed in an arbitrary order.

False positives and false negatives. False positives refer to instances in which the algorithm reports dependencies that don't actually exist. These events occur when objects' names contain the special characters listed in Table 1. For example, if a database contains the \[sales orders customers\] and \[orders\] tables and a stored procedure references the \[orders\] table, the algorithm will falsely report a dependency on the \[sales orders customers\] table as well. In spite of the possibility of false positives, they don't affect the correctness of the deployment order and therefore are insignificant. But this isn't the case with false negatives.

False negatives refer to instances when dependencies remain undetected. This occurs in several situations, such as when a text-based object is created with the ENCRYPTION option enabled (in which case, the dependency checker doesn't process the body text) or when the referenced object's name is noncontiguous. False negatives can also occur when several objects have the same name yet belong to different schemas, in which case the DBA responsible for the database will run into much bigger problems than identifying dependencies. To conclude, false negatives can happen in rare scenarios, but SQL Server and other tools also overlook these dependencies.

Size limitation. The dependency checker works well for analyzing dependencies and deployment orders for dozens or even hundreds of objects. However, because it's written in T-SQL, which isn't too efficient in string manipulations, it won't work well for databases with large numbers of objects. These performance issues can be resolved by translating the T-SQL code to C# or by using full-text indexes and searches inside the T-SQL code.

A Reliable Alternative

SQL Server 2005, SQL Server 2000, and many third-party products don't track object dependencies in a reliable fashion. The dependency checker is a much more reliable alternative.